MSFTBX

MSFTBX stands for Mass Spectrometry File Toolbox. It is a Java library for accessing common mass-spectrometry and proteomics data formats:

mzML
mzXML
pepXML/pep.xml
protXML/prot.xml
mzIdentML
cef (Agilent)
GPMdb XML

This library is what drives BatMass.

Citing

Please cite the following paper if you use MSFTBX or BatMass in your work:
Avtonomov D.M. et al: J. Proteome Res. June 16, 2016. DOI: 10.1021/acs.jproteome.6b00021

Maven dependency

Latest version on Maven Central

<dependency>
    <groupId>com.github.chhh</groupId>
    <artifactId>msftbx</artifactId>
    <version>1.8.8</version>
</dependency>

How to use

To get started, follow the tutorial: http://www.batmass.org/tutorial/data-access-layer/#parsing-lc-ms-data-mzml-mzxml-files
Check out a fully working example repo: https://github.com/chhh/msftbx-examples
- The example compiles and runs with a single command and requires only Java to be installed.

Features

Parsers for mzML/mzXML with unified API
- Very fast, multi-threaded
- Rich standardized API for contents of those files (scan and run meta-info, not just spectra).
- msNumpress compression support for mzML
- Automated LC/MS run structure determination:
  - Data structures for parent-child relationship between spectra
  - Indexes for scans based on scan numbers and retention times, both globally and for each MS level separately
  - Convenient methods to get next-previous scans at the same MS level
- Tolerant to malformed data
  - Can handle MS2 scan tags nested inside MS1 scans
  - Tolerant to missing or broken file index
  - Reindexing on the fly
- Memory management
  - Automated spectra parsing on demand
    - You can parse just the structure of an LC/MS run without the spectral data, in which case the memory footprint is very small. Spectra are parsed only when they are requested.
    - Soft referencing of spectral data for GC
  - Tracking of loaded data that is no longer used by any component, with automated unloading.
Upcoming support for Thermo RAW files on Windows
pepXML parser and writer
protXML parser and writer
mzIdentML parser
GPMdb XML files parser
Agilent .cef files parser
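
The indexing and memory-management features listed above can be pictured with a small stand-alone sketch. It is not MSFTBX code and does not use the MSFTBX API: the classes Scan and RunIndex, their fields, and the lambda "parsers" are all hypothetical names invented for illustration. The sketch assumes sorted maps for the scan-number and retention-time indexes (one extra map per MS level) and a SoftReference cache so that peak data is parsed on first request and may be reclaimed by the GC.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

// All names here are hypothetical illustrations, not MSFTBX classes.
class RunIndexSketch {
    public static void main(String[] args) {
        RunIndex index = new RunIndex();
        // The lambdas stand in for "parse this scan's peaks from the file".
        index.add(new Scan(1, 1, 0.50, () -> new double[] {100.0, 200.0}));
        index.add(new Scan(2, 2, 0.51, () -> new double[] {150.0}));
        index.add(new Scan(3, 1, 1.02, () -> new double[] {100.1, 200.2}));

        Scan next = index.nextAtSameLevel(index.byNumber(1));
        System.out.println("Next MS1 scan after #1: #" + next.num);
        System.out.println("Peaks parsed on demand: " + next.spectrum().length);
    }
}

/** Scan meta-info is always in memory; peak data is loaded lazily. */
class Scan {
    final int num;
    final int msLevel;
    final double rtMinutes;
    private final Supplier<double[]> parser;
    private SoftReference<double[]> cache = new SoftReference<double[]>(null);

    Scan(int num, int msLevel, double rtMinutes, Supplier<double[]> parser) {
        this.num = num;
        this.msLevel = msLevel;
        this.rtMinutes = rtMinutes;
        this.parser = parser;
    }

    /** Returns the peaks, re-parsing if the GC has cleared the soft reference. */
    synchronized double[] spectrum() {
        double[] data = cache.get();
        if (data == null) {
            data = parser.get();
            cache = new SoftReference<double[]>(data);
        }
        return data;
    }
}

/** Global indexes plus one scan-number index per MS level. */
class RunIndex {
    private final TreeMap<Integer, Scan> byNum = new TreeMap<>();
    private final TreeMap<Double, Scan> byRt = new TreeMap<>();
    private final Map<Integer, TreeMap<Integer, Scan>> byLevel = new HashMap<>();

    void add(Scan s) {
        byNum.put(s.num, s);
        byRt.put(s.rtMinutes, s);
        byLevel.computeIfAbsent(s.msLevel, k -> new TreeMap<>()).put(s.num, s);
    }

    Scan byNumber(int num) {
        return byNum.get(num);
    }

    /** Scan with the largest retention time not exceeding rt, or null. */
    Scan atOrBeforeRt(double rt) {
        Map.Entry<Double, Scan> e = byRt.floorEntry(rt);
        return e == null ? null : e.getValue();
    }

    Scan nextAtSameLevel(Scan s) {
        TreeMap<Integer, Scan> level = byLevel.get(s.msLevel);
        Map.Entry<Integer, Scan> e = level == null ? null : level.higherEntry(s.num);
        return e == null ? null : e.getValue();
    }

    Scan prevAtSameLevel(Scan s) {
        TreeMap<Integer, Scan> level = byLevel.get(s.msLevel);
        Map.Entry<Integer, Scan> e = level == null ? null : level.lowerEntry(s.num);
        return e == null ? null : e.getValue();
    }
}
```

In this sketch, asking for the next MS1 scan after scan 1 skips the MS2 scan 2 and returns scan 3, and no peak array exists until spectrum() is called. For the library's real classes and method names, see the tutorial linked under How to use.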

Binary distribution

Get pre-built jars from Maven Central.

Building with Maven (preferred)

cd ./MSFileToolbox && mvn clean package
This produces the library jar msftbx-X.X.X.jar as well as one large jar, msftbx-X.X.X-jar-with-dependencies.jar. The latter includes all the needed dependencies and can be used as is.

Building a NetBeans Platform module

NetBeans Module: Open the root directory in NetBeans as a project. You will see the MSFTBX module suite, which consists of 3 modules: MSFileToolbox Module (this is the main thing), MSFileToolbox Libs (these are the dependencies), and Auto Update (MSFTBX), the update center for NetBeans Platform projects (you definitely don't need this).

Dependencies

SLF4J
Google Guava
Apache Commons Pool 2
OboParser from Biojava's submodule Ontology
Javolution Core (slightly modified, sources are here, this modified dependency is published on Maven Central)

Notes

When dealing with mzIdentML files (.mzid) you will encounter AbstractParamType. In the definition of mzIdentML both cvParam and userParam inherit from it and both cvParam and userParam can be stored in the same list. Thus, when you get such a list, you'll need to cast manually to the concrete type like so:

// `blabla` is a placeholder for any mzIdentML object that exposes a param group
List<AbstractParamType> paramGroup = blabla.getParamGroup();
for (AbstractParamType param : paramGroup) {
    if (param instanceof CVParamType) {
        CVParamType p = (CVParamType) param;
        // do something with cvParam
    } else if (param instanceof UserParamType) {
        UserParamType p = (UserParamType) param;
        // do something with userParam
    }
}

Release notes

v1.8.2

Make MSFTBX Java 9 compatible. JAXB dependencies included.

v1.8.0

This release is incompatible with previous versions. The PepXml, ProtXml and MzIdentMl parsers now use Doubles instead of Floats everywhere. Old code that relies on Float properties might break.
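
As a rough illustration of what such a migration can look like in calling code, here is a minimal sketch. HitSketch and its getMass()/setMass() accessors are hypothetical stand-ins, not MSFTBX classes; the sketch only assumes that a property which used to be a boxed Float is now a boxed Double and may be null when the attribute is absent.

```java
// Hypothetical names throughout; these are not MSFTBX classes.
class FloatToDoubleMigration {
    /** Null-safe read of a boxed Double property; NaN marks an absent value. */
    static double massOrNaN(HitSketch hit) {
        Double m = hit.getMass();
        return m == null ? Double.NaN : m;
    }

    /** Explicit narrowing for downstream code that still expects a float. */
    static float toLegacyFloat(HitSketch hit) {
        return (float) massOrNaN(hit);
    }

    public static void main(String[] args) {
        HitSketch hit = new HitSketch();
        hit.setMass(1234.5);
        // Float old = hit.getMass();  // fine while the getter returned Float; a type error once it returns Double
        double mass = massOrNaN(hit);
        float legacy = toLegacyFloat(hit);
        System.out.println(mass + " " + legacy);
    }
}

/** Stand-in for a parser-generated bean whose property changed from Float to Double. */
class HitSketch {
    private Double mass;

    Double getMass() {
        return mass;
    }

    void setMass(Double mass) {
        this.mass = mass;
    }
}
```

Code that assigned such properties to Float or float variables needs either a type change to Double/double or an explicit narrowing cast like the one in toLegacyFloat.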
