Skip to content

jtprince/ms-error_rate

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

63 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

An Mspire library for calculating or dealing with error rates. These may be from target-decoy searches, sample bias validation, or other sources.

Examples

Target-Decoy with Mascot

Generate q-values (right now only with Mascot and MascotPercolator):

require 'ms/error_rate/qvalue'
target_hits = Ms::ErrorRate::Qvalue::Mascot.qvalues(target_files, decoy_files)
# target_hit is a PeptideHit Struct (:filename, :query_title, :charge, :sequence, :mowse, :qvalue)

# or on the commandline:
% qvalues.rb <target>.dat <decoy>.dat

The same output can be produced from Mascot-Percolator output:

require 'ms/error_rate/qvalue'
target_hits = Ms::ErrorRate::Qvalue::Mascot::Percolator.qvalues(datp_files, tab_dot_text_files)
# or commandline:
% qvalues.rb <target>.datp <target>.tab.txt

Sample Bias Validation

Sample Bias Validation allows error rate determination based on expected biases in sample composition. Here is an example using transmembrane sequence content. We will assume a fasta file called ‘proteins.fasta`:

# create a peptide-centric database
fasta_to_peptide_centric_db.rb proteins.fasta  # defaults 2 missed cleavages, min aaseq 4
   # generates a file: proteins.msd_clvg2.min_aaseq4.yml

# create a transmembrane sequence prediction file
fasta_to_phobius.rb proteins.fasta     # => generates proteins.phobius

generate_sbv_input_hashes.rb proteins.msd_clvg2.min_aaseq4.yml --tm proteins.phobius,1
   # creates two files:
   # proteins.msd_clvg2.min_aaseq4.tm_min1.by_aaseq.yml
   # proteins.msd_clvg2.min_aaseq4.tm_min1.freq_by_length.yml

# cytosolic fraction (transmembrane sequences not expected):
error_rate qvalues.yml --fp-sbv proteins.msd_clvg2.min_aaseq4.tm_min1.by_aaseq.yml,\
    proteins.msd_clvg2.min_aaseq4.tm_min1.freq_by_length.yml,0.05

Installation

gem install ms-error_rate

See LICENSE

About

an mspire library for calculating error rates (false discovery rates) in mass spec proteomics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages