Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Failed to load latest commit information.|
CRAMTools is a set of Java tools and APIs for efficient compression of sequence read data. Although this is intended as a stable version the code is released as early access. Parts of the CRAMTools are experimental and may not be supported in the future. http://www.ebi.ac.uk/ena/about/cram_toolkit Version 0.3 Input files: Reference sequence in fasta format <fasta file> Reference sequence index file <fasta file>.fai created using samtools (samtools faidx <fasta file>) Input BAM file <BAM file> sorted by reference coordinates BAM index file <BAM file>.bai created using samtools (samtools index <BAM file>) Download and run the program: Download the prebuilt runnable jar file from: https://github.com/vadimzalunin/crammer/blob/master/cramtools.jar?raw=true Execute the command line program: java -jar cramtools.jar Usage is printed if no arguments were given To convert a BAM file to CRAM: java -jar cramtools.jar cram --input-bam-file <bam file> --reference-fasta-file <reference fasta file> [--output-cram-file <output cram file>] To convert a CRAM file to BAM: java -jar cramtools.jar bam --input-cram-file <input cram file> --reference-fasta-file <reference fasta file> --output-bam-file <output bam file> To build the program from source: To check out the source code from github you will need git client: http://git-scm.com/ Make sure you have java 1.6 or higher: http://openjdk.java.net/ or http://www.oracle.com/us/technologies/java/index.html Make sure you have ant version 1.7 or higher: http://ant.apache.org/ git clone git://github.com/vadimzalunin/crammer.git ant -f build/build.xml runnable java -jar cramtools.jar To run unit tests: ant -f build/build.xml test Read quality masking (RQM). By default in this version quality scores are not stored unless a special file containing read quality masks is provided in the input. Two RQM formats are provisionally supported: 1. Each line is a combination of 'on' and 'off' symbols. 2. Each line consists of decimal read positions delimited by space. Examples can be found in the test datasets provided. Known issues - BamRoundTripTests fails on one of the test datasets (set1). This is due to the design defect in dealing with long-distance pairing information. The issue has high priority and will be fixed in the next release. - BAM->CRAM->BAM produces BAM file which is much smaller than the original BAM file. Therefore when comparing BAM vs CRAM this fact should be taken into account. - Performance is heavily affected by file IO, namely by reading BAM and reference files.