compress-alignments

You can find the full manual as well as a simple tutorial at http://boiler.readthedocs.io/.

'boiler.py' is the main script for both compression and decompression. Boiler requires Python 3. The input SAM file must be sorted by read start position. To compress, run the following:

./boiler.py compress [--frag-len-z-cutoff 0.125] [--split-discordant] [--split-diff-strands] [--preprocess tophat | stringtie] path/to/alignments.sam path/to/compressed.bl
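Because compression expects input sorted by read start position, it can be worth checking a SAM file before running the command above. The following is only a minimal sketch and is not part of Boiler: `is_sorted_by_start` is a hypothetical helper, and it assumes that "sorted" means each reference sequence appears in one contiguous block with non-decreasing start positions inside it.

```python
def is_sorted_by_start(sam_lines):
    """Return True if alignment records are ordered by start position.

    Hypothetical helper, not part of Boiler. Header lines (starting with '@')
    and blank lines are skipped. Each reference is assumed to occupy one
    contiguous block of records.
    """
    seen_refs = set()
    prev_ref, prev_pos = None, 0
    for line in sam_lines:
        if line.startswith('@') or not line.strip():
            continue
        fields = line.rstrip('\n').split('\t')
        ref, pos = fields[2], int(fields[3])  # RNAME and 1-based POS columns
        if ref != prev_ref:
            if ref in seen_refs:
                return False  # a reference reappears after another one
            seen_refs.add(ref)
            prev_ref, prev_pos = ref, pos
        elif pos < prev_pos:
            return False  # start position went backwards within a reference
        else:
            prev_pos = pos
    return True
```

An open file handle can be passed directly, for example `is_sorted_by_start(open('path/to/alignments.sam'))`.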

--frag-len-z-cutoff sets the z-score of the paired-end fragment length at which mates are placed in different bundles; 0.125 is a reasonable starting value. Alternatively, use --frag-len-cutoff to set the length cutoff directly. If --split-discordant is present, discordant reads are treated as unpaired reads. If --split-diff-strands is present, reads with contradicting XS values are treated as unpaired reads.
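To make the relationship between the two cutoff options concrete, here is a rough sketch of how a z-score could be turned into a fragment length. It assumes the conventional definition (cutoff = mean + z * standard deviation, using the population standard deviation); Boiler's actual computation may differ, and `frag_len_cutoff` is a hypothetical name used only for illustration.

```python
from statistics import mean, pstdev

def frag_len_cutoff(fragment_lengths, z_cutoff=0.125):
    """Convert a z-score cutoff into a fragment length cutoff.

    Assumption: cutoff = mean + z * (population standard deviation).
    This is the textbook z-score definition, not necessarily Boiler's.
    """
    mu = mean(fragment_lengths)
    sigma = pstdev(fragment_lengths)
    return mu + z_cutoff * sigma

# Made-up fragment lengths, purely for illustration.
lengths = [200, 220, 180, 240, 160]
cutoff = frag_len_cutoff(lengths)
```

Under this assumption, a small z-score such as 0.125 places the cutoff only slightly above the mean fragment length, so pairs with noticeably longer fragments would be split across bundles.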

To decompress, run the following:

./boiler.py decompress [--force-xs] path/to/compressed.bl path/to/expanded.sam

--force-xs assigns XS tags to all spliced reads, as required by Cufflinks. Spliced reads found without XS tags are assigned one at random. The decompressed SAM file is written to the given output path (expanded.sam in the example above).
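As an illustration of the --force-xs idea, the sketch below adds a random strand tag to a spliced SAM record. It is not Boiler's code: `force_xs` is a hypothetical function, and it assumes that a read counts as spliced when its CIGAR string contains an `N` operation and that only spliced reads lacking an `XS:A:` tag receive a random one.

```python
import random

def force_xs(sam_line, rng=random):
    """Append a random XS:A tag to a spliced record that lacks one.

    Hypothetical illustration. Unspliced reads and reads that already carry
    an XS:A tag are returned unchanged (minus any trailing newline).
    """
    fields = sam_line.rstrip('\n').split('\t')
    cigar = fields[5]
    is_spliced = 'N' in cigar  # 'N' marks a skipped region (intron)
    has_xs = any(tag.startswith('XS:A:') for tag in fields[11:])
    if is_spliced and not has_xs:
        fields.append('XS:A:' + rng.choice('+-'))
    return '\t'.join(fields)
```

Passing a seeded generator such as `random.Random(0)` as `rng` makes the assignment reproducible between runs.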

To sort and convert to BAM, run:

samtools view -bS expanded.sam | samtools sort - expanded

To compare two Cufflinks GTF files, run:

./compareGTFs.py transcripts1.gtf transcripts2.gtf

To query a compressed file for bundles, coverage, or reads:

./boiler.py query [--bundles | --coverage | --reads] --chrom c [--start s] [--end e] path/to/compressed.bl path/to/output

If no output argument is provided, standard output will be used. If --start or --end is absent, Boiler will use the beginning or end of the chromosome as the corresponding bound on the query.
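When queries are scripted, it can help to assemble the argument list programmatically. The sketch below uses only the flags shown in the query command above; `build_query_command` is a hypothetical helper and not part of Boiler.

```python
def build_query_command(mode, chrom, compressed, output=None, start=None, end=None):
    """Build the argument list for a './boiler.py query' call.

    Hypothetical helper. 'mode' is one of 'bundles', 'coverage' or 'reads'.
    Omitting start or end leaves that bound to Boiler's default (the
    beginning or end of the chromosome); omitting output means the result
    goes to standard output.
    """
    if mode not in ('bundles', 'coverage', 'reads'):
        raise ValueError("mode must be 'bundles', 'coverage' or 'reads'")
    cmd = ['./boiler.py', 'query', '--' + mode, '--chrom', str(chrom)]
    if start is not None:
        cmd += ['--start', str(start)]
    if end is not None:
        cmd += ['--end', str(end)]
    cmd.append(compressed)
    if output is not None:
        cmd.append(output)
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the directory that contains boiler.py.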
