# JP09 Fragmentation for peptide identification

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/CSBiology/BIO-BTE-06-L-7/gh-pages?filepath=JP09_Fragmentation_for_peptide_identification.ipynb)

1. [Understanding MS2 spectra: From peptide to fragment](#Understanding-MS2-spectra:-From-peptide-to-fragment)
2. [Simulate MS2 Fragmentation](#Simulate-MS2-Fragmentation)
3. [References](#References)

## Understanding MS2 spectra: From peptide to fragment
<a href="#Fragmentation-for-peptide-identification" style="display: inline-block"><sup>&#8593;back</sup></a><br>

<div class="container">
<div class="container">
The currency of information for identification in MS-based proteomics is the fragment ion spectrum (MS/MS spectrum), which is typically 
derived from the fragmentation of a specific peptide in the collision cell of a mass spectrometer. Peptides produce fragments that provide 
information on their amino acid sequence. The correct assignment of such a spectrum to a peptide sequence is the central step to link 
m/z values and ion intensities to biology<sup><a href="#31">31</a></sup>. 

<div Id="figure4" Style="float: right ; display: inline-block ; color: #44546a ; width: 60% ; padding: 15px">
    <img src="img/FragmentIonNomenclature.png" Style="width: 100%">
    <div Style="padding-left: 1rem ; padding-right: 1rem ; margin-top: 1rem ; text-align: justify ; font-size: 0.8rem">
        <b>Figure 4: The Roepstorff-Fohlman-Biemann nomenclature of fragment ions.</b>
        N-terminal and C-terminal peptide fragments result from the dissociation of bonds along the peptide backbone.
    </div>
</div>    

During the unimolecular peptide ion dissociation processes, different chemical reactions can lead to different types 
of product ions. The types of ions observed in MS/MS experiments depend on the physicochemical properties of the amino 
acids and their sequence, on the amount of internal energy, and on the peptide’s charge state. In addition, product ion formation 
is strongly influenced by the fragmentation method<sup><a href="#32">32</a></sup>. The most widely used fragmentation methods today 
are low-energy collision-induced dissociation (CID)<sup><a href="#33">33</a></sup> and electron transfer dissociation 
(ETD)<sup><a href="#34">34</a></sup>. These methods favor fragmentation along the peptide backbone and result in an N-terminal prefix 
fragment and a C-terminal suffix fragment. The standard nomenclature for the C-terminal fragments is x, y and z, whereas the corresponding 
N-terminal fragments are denoted a, b and c, depending on the position at which the peptide backbone breaks. Fragments are numbered 
from the N-terminus for the a, b and c series and from the C-terminus for the x, y and z series (<a href="#figure4">Figure 4</a>). 
Keep in mind that parent ion selection isolates many copies of the same peptide ion. These copies dissociate into different fragments, so the 
resulting fragment ions differ in relative abundance according to the preferred fragmentation reaction. In addition to 
fragmentation along the peptide backbone, fragment ions containing the amino acids R, K, N, or Q can lose ammonia (-17 Da) and are then 
denoted a*, b* and y*. Fragments containing the amino acids S, T, E, or D may lose water (-18 Da) and are then denoted a°, b° and y°. 
These losses do not change the charge of the ions and are observable as neutral losses<sup><a href="#35">35</a>,<a href="#36">36</a></sup>.

</div>
</div>
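
The neutral-loss rule described above can be sketched in a few lines of F#. This is only an illustration under stated assumptions: the function `neutralLosses` and its parameters are hypothetical and not part of BioFSharp, and the monoisotopic masses of ammonia and water are standard values standing in for the rounded -17 Da and -18 Da given in the text. The fragment masses in the usage lines are the values printed for the peptide DTDILAAFR in the simulation further down.

```fsharp
// Monoisotopic masses (Da) of the neutral molecules that can be lost.
// Assumption: standard values, more precise than the rounded 17 and 18 Da in the text.
let ammonia = 17.026549
let water = 18.010565

/// Hypothetical helper (not a BioFSharp function): lists the neutral-loss
/// variants a fragment may show, judged only by the residues it contains.
let neutralLosses (label: string) (fragmentSequence: string) (mass: float) =
    let containsAny (residues: string) =
        fragmentSequence |> Seq.exists (fun aa -> residues.IndexOf(aa) >= 0)
    [
        // R, K, N or Q present: ammonia loss, marked with '*'
        if containsAny "RKNQ" then yield (label + "*", mass - ammonia)
        // S, T, E or D present: water loss, marked with '°'
        if containsAny "STED" then yield (label + "°", mass - water)
    ]

// y1 of DTDILAAFR is "R": only an ammonia loss is expected (about 157.0851 Da).
let y1Losses = neutralLosses "y1" "R" 174.1116757
// b2 of DTDILAAFR is "DT": only a water loss is expected (about 198.0641 Da).
let b2Losses = neutralLosses "b2" "DT" 216.0746215
```

The sketch only applies the residue rule from the text. It says nothing about how likely a loss is to be observed in a real spectrum.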



In [1]:
#r "nuget: BioFSharp, 2.0.0-beta5"
#r "nuget: BioFSharp.IO, 2.0.0-beta5"
#r "nuget: Plotly.NET, 2.0.0-beta6"
#r "nuget: BioFSharp.Mz, 0.1.5-beta"

#r "nuget: Plotly.NET.Interactive, 2.0.0-beta6"

open Plotly.NET
open BioFSharp


## Simulate MS2 Fragmentation
<a href="#Fragmentation-for-peptide-identification" style="display: inline-block"><sup>&#8593;back</sup></a><br>


For the simulation, we first define a short peptide. The peptide used in this example is from rbcL, the large subunit of RuBisCO.




In [2]:
// Code-Block 1

let peptide = 
    "DTDILAAFR"
    |> BioList.ofAminoAcidString

peptide


[Asp; Thr; Asp; Ile; Leu; Ala; Ala; Phe; Arg]

<div class="container">
In the <code>Mz</code> namespace of <a href="https://csbiology.github.io/BioFSharp/">BioFSharp</a>, we can find a function that 
generates the theoretical series of y-ions for a given peptide. This function provides a lot of information, but we are only interested 
in the mass. Note that we do not know the intensity of the fragment ions, so we simply use <code>1.</code> as a placeholder for the simulation.
</div>



In [3]:
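// Code-Block 2

// Theoretical y-ion series of the peptide: keep only the mass of each fragment
// and pair it with a placeholder intensity of 1.0.
let ionSeriesY =
    peptide
    |> Mz.Fragmentation.Series.yOfBioList BioItem.initMonoisoMassWithMemP
    |> List.map (fun aac -> aac.MainPeak.Mass,1.)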
    
ionSeriesY


[(1020.52401, 1.0); (905.4970666, 1.0); (804.4493882, 1.0); (689.4224451, 1.0); (576.3383812, 1.0); (463.2543172, 1.0); (392.2172034, 1.0); (321.1800896, 1.0); (174.1116757, 1.0)]

Similarly, we can simulate the b-ion series.



In [4]:
// Code-Block 3

// Theoretical b-ion series of the peptide, again with a placeholder intensity of 1.0.
let ionSeriesB =
    peptide
    |> Mz.Fragmentation.Series.bOfBioList BioItem.initMonoisoMassWithMemP
    |> List.map (fun aac -> aac.MainPeak.Mass,1.)

ionSeriesB


[(115.026943, 1.0); (216.0746215, 1.0); (331.1015645, 1.0); (444.1856285, 1.0); (557.2696925, 1.0); (628.3068062, 1.0); (699.34392, 1.0); (846.4123339, 1.0); (1002.513445, 1.0)]
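
To see where these numbers come from, both series can be rebuilt by hand with running sums. The sketch below is not how BioFSharp computes them internally. It assumes standard monoisotopic residue masses (the `residueMass` table is written out here and not taken from the library) and treats a b fragment as the plain sum of its residues and a y fragment as the sum of its residues plus one water. Under these assumptions the results agree with the printed values to within about 0.001 Da.

```fsharp
// Monoisotopic residue masses (Da) for the amino acids in DTDILAAFR.
// Assumption: standard values, written out by hand for this sketch.
let residueMass =
    Map.ofList [
        ('D', 115.026943); ('T', 101.047679); ('I', 113.084064)
        ('L', 113.084064); ('A', 71.037114); ('F', 147.068414)
        ('R', 156.101111)
    ]

// Monoisotopic mass of water (Da), added once to every C-terminal fragment.
let waterMass = 18.010565

let masses =
    "DTDILAAFR" |> Seq.map (fun aa -> residueMass.[aa]) |> List.ofSeq

// b series: running sum from the N-terminus (b1 first, b9 last).
let bSeries = masses |> List.scan (+) 0.0 |> List.tail

// y series: running sum from the C-terminus plus water,
// reversed so that the longest fragment comes first, as in the output above.
let ySeries =
    masses |> List.rev |> List.scan (+) waterMass |> List.tail |> List.rev
```

`List.scan` keeps every intermediate sum, which is exactly the prefix (b) or suffix (y) ladder. The matching values also suggest that the printed masses are neutral fragment masses without an added proton.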

Now, we can just plot the simulated data and look at our theoretical spectrum.



In [5]:
// Code-Block 4

// Overlay both series as column charts: x is the fragment mass, y the placeholder intensity.
let ionChart =
    [    
        Chart.Column (ionSeriesB, Name="b ions")
        Chart.Column (ionSeriesY, Name="y ions")
    ]
    |> Chart.Combine
ionChart
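
The chart uses the masses exactly as printed for the two series. Judging by those values (174.1117 for the shortest y fragment equals an arginine residue plus water), they look like neutral masses, whereas a mass spectrometer records m/z. A minimal sketch of the conversion follows. The helper `toMz` is hypothetical and not part of BioFSharp, and it assumes that each charge is carried by one proton of 1.007276 Da.

```fsharp
// Mass of a proton in Da (assumption: each charge comes from one added proton).
let protonMass = 1.007276

/// Hypothetical helper: m/z of an ion formed by adding `charge` protons
/// to a fragment of the given neutral mass.
let toMz (charge: int) (neutralMass: float) =
    (neutralMass + float charge * protonMass) / float charge

// Shortest y fragment from the printed series, singly charged: about 175.119
let y1Singly = toMz 1 174.1116757

// Longest y fragment from the printed series, doubly charged: about 511.269
let y9Doubly = toMz 2 1020.52401
```

Mapping `toMz 1` over the mass component of each series would shift the whole theoretical spectrum onto the m/z axis of a singly charged measurement.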


<hr>
<nav class="level is-mobile">
    <div class="level-left">
        <div class="level-item">
            <button class="button is-primary is-outlined" onclick="location.href='/JP08_Centroidisation.html';">&#171; JP08</button>
        </div>
    </div>
    <div class="level-right">
        <div class="level-item">
            <button class="button is-primary is-outlined" onclick="location.href='/JP10_Peptide_Identification.html';">JP10 &#187;</button>
        </div>
    </div>
</nav>

## References
<a href="#Fragmentation-for-peptide-identification" style="display: inline-block"><sup>&#8593;back</sup></a><br>

<ol>
<li Value="31" Id="31"> Nesvizhskii, A. I., Vitek, O. & Aebersold, R. Analysis and validation of proteomic data generated by tandem mass spectrometry. Nat. Methods 4, 787–797; 10.1038/nmeth1088 (2007).
<li Id="32"> Medzihradszky, K. F. Peptide sequence analysis. Methods Enzymol. 402, 209–244; 10.1016/S0076-6879(05)02007-0 (2005).
<li Id="33"> Johnson, R. S., Martin, S. A., Biemann, K., Stults, J. T. & Watson, J. T. Novel fragmentation process of peptides by collision-induced decomposition in a tandem mass spectrometer: differentiation of leucine and isoleucine. Anal. Chem. 59, 2621–2625; 10.1021/Ac00148a019 (1987).
<li Id="34"> Mikesh, L. M. et al. The utility of ETD mass spectrometry in proteomic analysis. Biochim. Biophys. Acta 1764, 1811–1822; 10.1016/j.bbapap.2006.10.003 (2006).
<li Id="35"> Forner, F., Foster, L. J. & Toppo, S. Mass spectrometry data analysis in the proteomics era. Curr. Bioinform. 2, 63–93; 10.2174/157489307779314285 (2007).
<li Id="36"> Steen, H. & Mann, M. The ABC's (and XYZ's) of peptide sequencing. Nat. Rev. Mol. Cell Biol. 5, 699–711; 10.1038/nrm1468 (2004).
</ol>

