# DDA - Label-Free (with FlashLFQ)

This tutorial shows how to analyze DDA label-free quantification (LFQ) data by combining a database (DB) search tool with FlashLFQ for quantification.

DDA label-free analysis sometimes calls for a stand-alone label-free quantification tool such as [FlashLFQ](https://github.com/smith-chem-wisc/FlashLFQ), which offers more detailed quantification options (e.g. match-between-runs, MBR).

To use FlashLFQ, see the tutorials and installation guides in the FlashLFQ documentation. FlashLFQ can be run through Docker, a GUI, or a conda environment.

## Data Preparation
### Load Required Packages

In [2]:
import msmu as mm
import pandas as pd

### Read Data and PSM Filtering
In this tutorial, we use the [PXD012986](https://www.ebi.ac.uk/pride/archive/projects/PXD012986) dataset (Uszkoreit _et al_., 2022), which is also described in the [DDA-LFQ](../dda-lfq) tutorial.

To combine FlashLFQ quantification results with msmu, we first read the PSM result file from the DB search tool and filter the PSMs by q-value or other criteria, because FlashLFQ assumes that the input PSMs are already filtered.

In [3]:
base_dir = "https://raw.githubusercontent.com/bertis-informatics/msmu/refs/heads/main/data/sage_lfq"
sage_idents = f"{base_dir}/sage/results.sage.tsv"
meta = f"{base_dir}/meta.csv"

mdata = mm.read_sage(identification_file=sage_idents, label="label_free")

meta_df = pd.read_csv(meta)
meta_df = meta_df.set_index("sample_id")

mdata.obs = mdata.obs.join(meta_df)
mdata.push_obs()
mdata.obs

mdata = mm.pp.add_filter(mdata, modality="psm", column="q_value", keep="lt", value=0.01)
mdata = mm.pp.apply_filter(mdata, modality="psm")

mdata

INFO - Identification file loaded: (5000, 40)
INFO - Decoy entries separated: (345, 15)
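
To make the q-value filter above concrete, here is a minimal pandas sketch of the same idea on a toy table. The column names and values are hypothetical and do not reflect msmu's internal schema, and reading `keep="lt"` as a strict "less than" comparison is an assumption.

```python
import pandas as pd

# Hypothetical PSM table; names and values are illustrative only.
psms = pd.DataFrame(
    {
        "peptide": ["PEPTIDEK", "SAMPLER", "ANOTHERK", "DECOYLIKER"],
        "q_value": [0.001, 0.009, 0.010, 0.200],
    }
)

# Assumption: keep="lt" with value=0.01 means a strict "less than" test,
# so a PSM sitting exactly at the threshold is dropped.
filtered = psms[psms["q_value"] < 0.01].reset_index(drop=True)
print(filtered)
```

Under this reading, only PSMs with a q-value strictly below 0.01 are passed on to FlashLFQ.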


## Export FlashLFQ Input File
After filtering the PSMs, we can export them in FlashLFQ input format with the `mm.io.write_flashlfq_input` function.

In [4]:
mm.io.write_flashlfq_input(mdata, "flashlfq_input.tsv")
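
Before handing the exported file to FlashLFQ, it can be useful to check its header. The sketch below assumes the column names of FlashLFQ's generic identification format. That list is an assumption here and should be verified against the FlashLFQ documentation for your version. `missing_columns` is a hypothetical helper, not part of msmu.

```python
import io

import pandas as pd

# Assumed column names of FlashLFQ's generic identification format;
# check the FlashLFQ documentation before relying on this list.
REQUIRED_COLUMNS = [
    "File Name",
    "Base Sequence",
    "Full Sequence",
    "Peptide Monoisotopic Mass",
    "Scan Retention Time",
    "Precursor Charge",
    "Protein Accession",
]


def missing_columns(tsv_source):
    """Return the required columns that are absent from a TSV header."""
    # nrows=0 reads only the header line, so large files stay cheap to check.
    header = pd.read_csv(tsv_source, sep="\t", nrows=0).columns
    return [name for name in REQUIRED_COLUMNS if name not in header]


# In-memory stand-in for the exported file, deliberately lacking one column.
demo = io.StringIO("\t".join(REQUIRED_COLUMNS[:-1]) + "\n")
print(missing_columns(demo))
```

In practice, the path of the exported file can be passed instead of the in-memory stand-in, e.g. `missing_columns("flashlfq_input.tsv")`.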

## (Optional) Run FlashLFQ

After exporting the FlashLFQ input file, we can run FlashLFQ with appropriate parameters (e.g. MBR) to quantify peptides.

The command-line example below shows how to run FlashLFQ on Linux. Adjust the parameters to your experimental design, following the FlashLFQ documentation.

You can skip this step in this tutorial and use the provided FlashLFQ quantification result file directly.

In [None]:
# bash

# dotnet CMD.dll --idt "flashlfq_input.tsv" --rep "/path/to/spectra/directory/" --ppm 5 --chg

# or using Docker

# docker run --rm -v /path/to/local/directory:/data smithchemwisc/flashlfq:1.0.3 \
#     --idt "/data/flashlfq_input.tsv" \
#     --rep "/data/spectra/" \
#     --ppm 5 \
#     --chg
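
For readers who prefer to stay inside the notebook, the Docker invocation above can also be assembled from Python. This is only a sketch: `flashlfq_docker_command` is a hypothetical helper, the flags are copied verbatim from the commented example, and the placeholder path must be replaced with a real directory before anything is executed.

```python
import shlex


def flashlfq_docker_command(local_dir, image="smithchemwisc/flashlfq:1.0.3"):
    """Build the argument list for the Docker example (hypothetical helper)."""
    return [
        "docker", "run", "--rm",
        "-v", f"{local_dir}:/data",  # mount the local directory at /data
        image,
        "--idt", "/data/flashlfq_input.tsv",
        "--rep", "/data/spectra/",
        "--ppm", "5",
        "--chg",
    ]


cmd = flashlfq_docker_command("/path/to/local/directory")

# Print the command for inspection; nothing is executed here.
print(shlex.join(cmd))

# To actually run it (requires Docker and real paths):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems when paths contain spaces.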

## Attach FlashLFQ Result to mdata
The peptide quantification result from FlashLFQ can be attached to `mdata` with the `mm.utils.add_quant` function by setting `quant_tool="flashlfq"` and passing the "QuantifiedPeptides.tsv" file, which contains peptide-level quantification values and evidence.

In [5]:
flashlfq_dir = "https://raw.githubusercontent.com/bertis-informatics/msmu/refs/heads/main/data/flashlfq"
flashlfq_peptides = f"{flashlfq_dir}/QuantifiedPeptides.tsv"

mdata = mm.pp.to_peptide(mdata)

mdata = mm.utils.add_quant(mdata, quant_data=flashlfq_peptides, quant_tool="flashlfq")

mdata = mm.pp.log2_transform(mdata, modality="peptide")

mdata

INFO - Peptide-level identifications: 3683 (3664 at 1% FDR)


Building new peptide quantification data.


INFO - Added quantification modality 'peptide' using flashlfq data.
INFO - Quantification data shape: (3547, 6)
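
As a rough illustration of what a log2 transform of peptide intensities involves, the sketch below works on a toy table. It assumes that unquantified peptides are reported with an intensity of 0 and masks them as missing before the transform. Whether `mm.pp.log2_transform` treats zeros this way is not confirmed here, and the sample and peptide names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical peptide-by-sample intensities; 0 stands for "not quantified".
intensities = pd.DataFrame(
    {
        "sample_1": [1024.0, 0.0, 4096.0],
        "sample_2": [2048.0, 512.0, 0.0],
    },
    index=["PEPTIDEK", "SAMPLER", "ANOTHERK"],
)

# Mask non-positive values as NaN first, so log2(0) = -inf never appears.
log2_intensities = np.log2(intensities.where(intensities > 0))
print(log2_intensities)
```

Keeping missing values as NaN rather than -inf lets downstream statistics skip them instead of being distorted by infinities.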

