# Peak Picking

One of the powerful features of *MassDash* is the ability to perform peak picking directly on chromatograms.

The peak pickers covered in this notebook are `MRMTransitionGroupPicker` and `pyMRMTransitionGroupPicker`.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
# Run this cell before executing any other cell
import os
os.chdir("../../test/test_data/")  # Path to the tutorial data; change this to point to your own data

In [3]:
from massdash.loaders import SqMassLoader
pep = "NKESPT(UniMod:21)KAIVR(UniMod:267)"
charge = 3
loader = SqMassLoader(dataFiles=["xics/test_chrom_1.sqMass"], rsltsFile="osw/test_data.osw")
transitionGroup = list(loader.loadTransitionGroups(pep, charge).values())[0]
transitionGroupFeatures = loader.loadTransitionGroupFeaturesDf(pep, charge)

If the above code does not look familiar, please refer to the previous notebooks.

## MRMTransitionGroupPicker

### 1. Initiate MRMTransitionGroupPicker Object

In [4]:
from massdash.peakPickers import MRMTransitionGroupPicker

noSmoothing = MRMTransitionGroupPicker("original")  # No smoothing
gaussSmoothing = MRMTransitionGroupPicker("gauss", gauss_width=50.0)  # Gaussian smoothing
sgolaySmoothing = MRMTransitionGroupPicker("sgolay", sgolay_frame_length=11, sgolay_polynomial_order=3)  # Savitzky-Golay smoothing

For the following example we will use the Savitzky-Golay (`sgolay`) smoother, as this is the default for OpenSwath.
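To give a feel for what smoothing does before peaks are picked, here is a minimal pure-Python sketch of Gaussian smoothing on a synthetic trace. It is not the implementation used by `MRMTransitionGroupPicker`; the function names, the synthetic data, and the `sigma` parameter (expressed in data points, unlike `gauss_width` above) are assumptions made for illustration only.

```python
import math

def gaussian_smooth(intensities, sigma=2.0):
    """Smooth a 1-D intensity trace with a Gaussian kernel (illustrative only)."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    n = len(intensities)
    smoothed = []
    for i in range(n):
        acc = 0.0
        weight = 0.0
        for offset, w in zip(offsets, kernel):
            j = i + offset
            if 0 <= j < n:  # renormalize at the edges instead of padding
                acc += w * intensities[j]
                weight += w
        smoothed.append(acc / weight)
    return smoothed

def roughness(trace):
    """Sum of absolute point-to-point differences; lower means smoother."""
    return sum(abs(b - a) for a, b in zip(trace, trace[1:]))

# Hypothetical noisy trace: one Gaussian-shaped peak plus alternating noise
raw = [100 * math.exp(-0.5 * ((i - 20) / 4) ** 2) + (5 if i % 2 else -5)
       for i in range(41)]
smooth = gaussian_smooth(raw, sigma=2.0)

print("roughness raw:", round(roughness(raw), 1))
print("roughness smoothed:", round(roughness(smooth), 1))
```

The smoothed trace keeps its apex at roughly the same position while the point-to-point jitter is reduced, which is why smoothing is usually applied before peak boundaries are determined.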

### 2. Customize Parameters

To cap the output at a reasonable number of features, we will set `stop_after_feature` to 5. Since we know that the precursor signal is reasonable for this example, we will also turn on the `use_precursors` parameter. Finally, because this precursor is quite high in intensity, we can adjust the `signal_to_noise` cutoff.

In [5]:
sgolaySmoothing.setGeneralParameters(stop_after_feature=5, signal_to_noise=0.001, use_precursors='true')

### 3. Pick Precursor

In [6]:
features = sgolaySmoothing.pick(transitionGroup)

For easier inspection, the features can be converted to a pandas DataFrame:

In [7]:
from massdash.structs import TransitionGroupFeature
TransitionGroupFeature.toPandasDf(features)

| | leftBoundary | rightBoundary | areaIntensity | qvalue | consensusApex | consensusApexIntensity |
|---|---|---|---|---|---|---|
| 0 | 843.5 | 901.700012 | 223735.421875 | | 865.628726 | |
| 1 | 818.099976 | 843.900024 | 65524.886719 | | 839.389013 | |
| 2 | 1177.900024 | 1207.0 | 32929.136719 | | 1196.055723 | |
| 3 | 1051.099976 | 1087.5 | 48799.734375 | | 1069.299988 | |
| 4 | 978.400024 | 1011.099976 | 19260.796875 | | 995.780439 | |
| 5 | 1112.5 | 1152.5 | 36867.34375 | | 1133.06705 | |
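As one way of inspecting such a table, the sketch below ranks the picked features by area and adds a peak width. It uses plain Python dictionaries so that it runs without MassDash; the boundary and area values are copied from the output above, while the `rank_by_area` helper and the `width` column are hypothetical additions for illustration.

```python
# Boundaries and areas copied from the feature table above
features_table = [
    {"leftBoundary": 843.5, "rightBoundary": 901.700012, "areaIntensity": 223735.421875},
    {"leftBoundary": 818.099976, "rightBoundary": 843.900024, "areaIntensity": 65524.886719},
    {"leftBoundary": 1177.900024, "rightBoundary": 1207.0, "areaIntensity": 32929.136719},
    {"leftBoundary": 1051.099976, "rightBoundary": 1087.5, "areaIntensity": 48799.734375},
    {"leftBoundary": 978.400024, "rightBoundary": 1011.099976, "areaIntensity": 19260.796875},
    {"leftBoundary": 1112.5, "rightBoundary": 1152.5, "areaIntensity": 36867.34375},
]

def rank_by_area(rows):
    """Return a copy of the rows with a 'width' field, sorted by decreasing area."""
    with_width = [
        {**row, "width": row["rightBoundary"] - row["leftBoundary"]}
        for row in rows
    ]
    return sorted(with_width, key=lambda r: r["areaIntensity"], reverse=True)

ranked = rank_by_area(features_table)
for row in ranked:
    print(f'{row["leftBoundary"]:10.2f} {row["rightBoundary"]:10.2f} '
          f'width={row["width"]:6.2f} area={row["areaIntensity"]:12.1f}')
```

With an actual pandas DataFrame, `DataFrame.sort_values("areaIntensity", ascending=False)` would give the same ordering.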


### 4. Visualize Results

As shown in the plotting1D notebook, the chromatogram can easily be visualized directly from the transitionGroup object. Here instead of linking `OpenSwath` or `DIA-NN` found features, we can use the features that we just computed. 

In [8]:
transitionGroup.plot(transitionGroupFeatures=features)

## PyMRMPeakPicker

### 1. Initiate the Peak Picker

This peak picker requires no arguments when initiating the peak picking object.

In [9]:
from massdash.peakPickers.pyMRMTransitionGroupPicker import pyMRMTransitionGroupPicker
picker = pyMRMTransitionGroupPicker()

### 2. Customize Parameters

For this peak picker, no parameters can be customized.

### 3. Pick Precursor

In [10]:
features = picker.pick(transitionGroup)
TransitionGroupFeature.toPandasDf(features)

| | leftBoundary | rightBoundary | areaIntensity | qvalue | consensusApex | consensusApexIntensity |
|---|---|---|---|---|---|---|
| 0 | 843.5 | 901.700012 | 1537205.913208 | | | 235346.499471 |
| 1 | 818.099976 | 843.5 | 425795.080078 | | | 78247.752761 |
| 2 | 1189.199951 | 1203.800049 | 2428.022644 | | | 64678.458586 |
| 3 | 1167.400024 | 1189.199951 | 226734.0625 | | | 28407.271623 |
| 4 | 1131.099976 | 1156.5 | 2091.005585 | | | 10654.984463 |


As above, the features can be inspected as a pandas DataFrame. They differ slightly from those found by `MRMTransitionGroupPicker`; however, the two most intense features span essentially the same boundaries.
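The agreement between the two pickers can be quantified with a simple boundary-overlap score. The sketch below is not part of MassDash: `interval_iou` and `best_matches` are hypothetical helpers, and the boundaries are copied from the two output tables in this notebook.

```python
def interval_iou(a, b):
    """Intersection-over-union of two (left, right) boundary pairs."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

# (leftBoundary, rightBoundary) pairs copied from the two tables in this notebook
mrm_features = [
    (843.5, 901.700012), (818.099976, 843.900024), (1177.900024, 1207.0),
    (1051.099976, 1087.5), (978.400024, 1011.099976), (1112.5, 1152.5),
]
pymrm_features = [
    (843.5, 901.700012), (818.099976, 843.5), (1189.199951, 1203.800049),
    (1167.400024, 1189.199951), (1131.099976, 1156.5),
]

def best_matches(reference, candidates):
    """For each reference feature, find the candidate with the highest overlap."""
    matches = []
    for ref in reference:
        score, cand = max((interval_iou(ref, c), c) for c in candidates)
        matches.append((ref, cand, score))
    return matches

matches = best_matches(pymrm_features, mrm_features)
for ref, cand, score in matches:
    print(f"{ref} best matches {cand} with overlap {score:.3f}")
```

An overlap close to 1 means both pickers placed the boundaries in nearly the same location; lower values indicate that the pickers split or merged the signal differently.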

### 4. Visualize Results

In [11]:
transitionGroup.plot(transitionGroupFeatures=features)

Note: The intensities of some features are so small that they cannot be seen in the plot.

### Implementing Your Own Peak Picker