Feature/smaller chromfiles #2458

Merged
merged 23 commits into OpenMS:develop from hroest:feature/smaller_chromfiles on Mar 22, 2017

3 participants
@hroest
Contributor

commented Mar 20, 2017

Makes chromatogram files produced by OpenSWATH substantially smaller.

On a sample dataset [1] processed with OpenSwathWorkflow [2], this PR reduces the output chromatogram file size from ca. 1.9 GB (486 MB as .gz) to 1.2 GB (196 MB as .gz) by removing duplicate information and enabling compression of the chromatogram data. The raw .mzML file shrinks to roughly 63% of its original size, and the .mzML.gz file becomes almost 2.5-fold smaller [3]. This is achieved by enabling numpress compression by default for all chromatogram output files written by OpenSwathWorkflow. The PR also includes some changes to the mzML code that reduce duplicate or non-informative CVTerms.

Overall, this allows much more compact storage of chromatogram data.
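
For anyone who wants to check the ratios quoted above, here is a minimal sketch of the arithmetic. It assumes the .gz figures are in MB (a gzipped file cannot be hundreds of times larger than its 1.9 GB source), and the variable and function names are made up for illustration only.

```python
# Sizes quoted in the PR description, expressed in MB.
# Assumption: the .gz figures are MB, not GB.
raw_before_mb = 1900.0  # ca. 1.9 GB raw .mzML before this PR
raw_after_mb = 1200.0   # 1.2 GB raw .mzML after this PR
gz_before_mb = 486.0    # .mzML.gz before this PR
gz_after_mb = 196.0     # .mzML.gz after this PR


def remaining_fraction(before, after):
    """Fraction of the original size that is left after the change."""
    return after / before


def fold_reduction(before, after):
    """How many times smaller the new file is compared to the old one."""
    return before / after


raw_remaining = remaining_fraction(raw_before_mb, raw_after_mb)
gz_fold = fold_reduction(gz_before_mb, gz_after_mb)

print(f"raw .mzML: {raw_remaining:.0%} of original size "
      f"({1 - raw_remaining:.0%} smaller)")
print(f".mzML.gz:  {gz_fold:.2f}-fold smaller")
```

With the numbers above this gives about 63% of the original size for the raw file and a 2.48-fold reduction for the gzipped file.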

Ref 1: http://www.biorxiv.org/content/early/2016/03/18/044552
Ref 2:

OpenSwathWorkflow -in olgas_K121026_001_SW_Wayne_R1_d00.wiff.mzML \
  -tr Mtb_TubercuList-R27_iRT_UPS_newDecoy.csv \
  -sort_swath_maps \
  -swath_windows_file SWATHwindows_analysis.tsv \
  -tr_irt /tmp/iRTassays.TraML \
  -out_tsv /tmp/olgas_K121026_001_SW_Wayne_R1_d00.wiff.profile.csv \
  -readOptions normal \
  -threads 4 \
  -out_chrom /tmp/olgas_K121026_001_SW_Wayne_R1_d00.chrom.osw_output.mzML

Ref 3: Note that these values were obtained by creating numpress+zlib mzML files, which are technically not valid mzML.

@hroest hroest force-pushed the hroest:feature/smaller_chromfiles branch from a814de4 to 10d69bc Mar 21, 2017

hroest added some commits Mar 21, 2017

[FIX] bugfix for indexed files
- wow, never been so happy that we have good tests
@grosenberger

Member

commented Mar 22, 2017

Looks great and I know a few users and admins who will be very happy about the changes. 👍

@timosachsenberg timosachsenberg merged commit adc0372 into OpenMS:develop Mar 22, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@hroest hroest deleted the hroest:feature/smaller_chromfiles branch Mar 22, 2017

@hroest hroest added this to the Release 2.2 milestone May 29, 2017

@hroest hroest referenced this pull request Dec 8, 2017

Merged

Feature/fix comet #3082
