Reference-based compression of SRA data

README

CRAMTools is a set of Java tools and APIs for efficient compression of sequence read data. Although this is intended as a stable version, the code is released as early access. Parts of CRAMTools are experimental and may not be supported in the future.
http://www.ebi.ac.uk/ena/about/cram_toolkit

Version 0.3
 
Input files:
Reference sequence in fasta format <fasta file>
Reference sequence index file <fasta file>.fai created using samtools (samtools faidx <fasta file>)
Input BAM file <BAM file> sorted by reference coordinates
BAM index file <BAM file>.bai created using samtools (samtools index <BAM file>)
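
The four inputs listed above can be checked before a run. The following is a minimal sketch and not part of CRAMTools: the function name check_cram_inputs is hypothetical, and it assumes the two index files sit next to the files they index, under the names given above.

```shell
# check_cram_inputs <fasta file> <BAM file>
# Prints one "missing: <path>" line per absent input and returns non-zero
# if anything is missing. The body runs in a subshell, so its variables
# do not leak into the calling shell.
check_cram_inputs() (
  fasta="$1"; bam="$2"; status=0
  for f in "$fasta" "$fasta.fai" "$bam" "$bam.bai"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f"
      status=1
    fi
  done
  exit "$status"
)
```

A missing .fai or .bai file can be created with the samtools commands listed above.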
Download and run the program:
Download the prebuilt runnable jar file from: https://github.com/vadimzalunin/crammer/blob/master/cramtools.jar?raw=true
Execute the command line program: java -jar cramtools.jar
Usage is printed if no arguments are given.
To convert a BAM file to CRAM:
java -jar cramtools.jar cram --input-bam-file <bam file> --reference-fasta-file <reference fasta file> [--output-cram-file <output cram file>]
To convert a CRAM file to BAM:
java -jar cramtools.jar bam --input-cram-file <input cram file> --reference-fasta-file <reference fasta file> --output-bam-file <output bam file>
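
The two conversions can be chained into a round trip (BAM to CRAM and back). This is a minimal sketch that uses only the options shown above. The names cram_round_trip and run, and the variables CRAMTOOLS_JAR and DRY_RUN, are assumptions of this example and not CRAMTools features.

```shell
# CRAMTOOLS_JAR and DRY_RUN are hypothetical knobs of this sketch,
# not cramtools options.
CRAMTOOLS_JAR="${CRAMTOOLS_JAR:-cramtools.jar}"

# Run a command, or only print it when DRY_RUN is non-empty.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# cram_round_trip <bam file> <reference fasta file> <cram file> <output bam file>
# Stops after the first step if the BAM to CRAM conversion fails.
cram_round_trip() (
  bam="$1"; fasta="$2"; cram="$3"; out_bam="$4"
  run java -jar "$CRAMTOOLS_JAR" cram \
    --input-bam-file "$bam" \
    --reference-fasta-file "$fasta" \
    --output-cram-file "$cram" || exit 1
  run java -jar "$CRAMTOOLS_JAR" bam \
    --input-cram-file "$cram" \
    --reference-fasta-file "$fasta" \
    --output-bam-file "$out_bam"
)
```

Setting DRY_RUN=1 before calling cram_round_trip prints the two command lines without executing them, which is a cheap way to check the arguments first.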
To build the program from source:
 
To check out the source code from GitHub you will need a git client: http://git-scm.com/
Make sure you have Java 1.6 or higher: http://openjdk.java.net/ or http://www.oracle.com/us/technologies/java/index.html
Make sure you have Ant version 1.7 or higher: http://ant.apache.org/
git clone git://github.com/vadimzalunin/crammer.git
ant -f build/build.xml runnable
java -jar cramtools.jar
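
Before running the build steps above, it may help to confirm that the prerequisites are on the PATH. This is a minimal sketch: the function name missing_tools is hypothetical, and it only checks that each tool can be found, not that it meets the minimum version (Java 1.6, Ant 1.7).

```shell
# missing_tools <tool> [<tool> ...]
# Prints one "not found: <tool>" line per tool that is not on PATH and
# returns non-zero if any tool is missing. Versions are not checked.
missing_tools() (
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "not found: $tool"
      status=1
    fi
  done
  exit "$status"
)
```

For this project the call would be: missing_tools git java ant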
To run unit tests:
 ant -f build/build.xml test
 
Read quality masking (RQM).
By default, this version does not store quality scores unless a file containing read quality masks is provided as input. Two RQM formats are provisionally supported:
1. Each line is a combination of 'on' and 'off' symbols. 
2. Each line consists of decimal read positions delimited by space. 

Examples can be found in the test datasets provided. 
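
To illustrate how the two formats relate, the sketch below turns a list of read positions (format 2) into an on/off line (format 1). It rests on assumptions that are not stated in this README: positions are taken to be 1-based, '1' stands for 'on' (quality score kept) and '0' for 'off'. The function name positions_to_mask is hypothetical; check the test datasets for the symbols and numbering CRAMTools actually expects.

```shell
# positions_to_mask <read length> [<position> ...]
# ASSUMPTIONS (not from the CRAMTools documentation):
#   - positions are 1-based
#   - '1' means the quality score at that position is kept, '0' means dropped
positions_to_mask() (
  len="$1"; shift
  mask=""
  i=1
  while [ "$i" -le "$len" ]; do
    sym="0"
    for p in "$@"; do
      if [ "$p" -eq "$i" ]; then
        sym="1"
      fi
    done
    mask="$mask$sym"
    i=$((i + 1))
  done
  printf '%s\n' "$mask"
)
```

Under these assumptions, the positions line "2 5 6" for an 8-base read corresponds to the mask line "01001100".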

Known issues
- BamRoundTripTests fails on one of the test datasets (set1). This is due to a design defect in the handling of long-distance pairing information. The issue has high priority and will be fixed in the next release.
- BAM->CRAM->BAM produces a BAM file that is much smaller than the original BAM file. Take this into account when comparing BAM and CRAM file sizes.
- Performance is heavily affected by file IO, namely by reading BAM and reference files.
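
Because the round-tripped BAM file is smaller than the original, a size comparison is more informative when it lists all three files side by side. This is a minimal sketch; the function name file_sizes and the file names in the usage line are placeholders.

```shell
# file_sizes <file> [<file> ...]
# Prints "<bytes><TAB><file>" for each file.
file_sizes() (
  for f in "$@"; do
    # tr removes the padding that some wc implementations add
    bytes=$(wc -c < "$f" | tr -d ' ')
    printf '%s\t%s\n' "$bytes" "$f"
  done
)
```

Usage: file_sizes original.bam output.cram roundtrip.bam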