This repository provides the code and examples for the analyses in the paper:
ModiFinder: Tandem Mass Spectral Alignment Enables Structural Modification Site Localization
Mohammad Reza Zare Shahneh, Michael Strobel, Giovanni Andrea Vitale, Christian Geibel, Yasin El Abiead, Berenike C Wagner, Karl Forchhammer, Neha Garg, Allegra T Aron, Vanessa V Phelan, Daniel Petras, Mingxun Wang
After cloning the repository, set up the project as follows:

- Add the ModiFinder submodule:

  ```
  git submodule update --init --recursive
  ```

- Install the conda environment. We recommend using mamba instead of conda for a faster install (e.g., `mamba env create -f environment.yml`):

  ```
  conda env create -f environment.yml
  ```

- Install `nextflow`.

- Activate the environment:

  ```
  conda activate modi-finder-analysis
  ```
First, set the data directory in the `run_config` file. Then you can either download the data or create it from scratch:
- You can download the files used in this project from Zenodo and put them in the data directory defined earlier. The final layout should look like this:

  ```
  your_data_directory/
  ├── matches/
  ├── helpers/
  ├── SIRIUS/
  └── cfmid_exp/
  ```

  Please note that obtaining the data this way requires requesting information for each individual compound in real time, so it is essential to restrict the number of concurrent processes to avoid exceeding the server's request limits.
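The concurrency caveat above can be handled, for example, by capping the size of a worker pool. A minimal sketch, in which `fetch_compound` and the compound IDs are hypothetical placeholders for the real per-compound requests (not part of this repository):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_compound(compound_id):
    # Placeholder for a real-time request for one compound's information.
    return {"id": compound_id, "status": "ok"}

compound_ids = list(range(10))

# Cap concurrency so the server's request limits are not exceeded.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch_compound, compound_ids))
```

Tuning `max_workers` down trades throughput for staying under the rate limit.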
- Alternatively, you can create the data used in this project from scratch by running `data_prepare_main.py`:

  ```
  conda activate modi-finder-analysis
  python ./data_preparation/data_prepare_main.py
  ```

  Please note that the SIRIUS data must either be downloaded from the link in the previous section or generated by running the workflow on `gnps2`.
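Whichever route you take, a quick sanity check that the result matches the directory tree above can look like this sketch (`your_data_directory` is a placeholder; point it at the directory set in `run_config`):

```python
from pathlib import Path

# Placeholder path; replace with the data directory set in run_config.
data_dir = Path("your_data_directory")
expected = ["matches", "helpers", "SIRIUS", "cfmid_exp"]

# List any expected subdirectories that are not present yet.
missing = [name for name in expected if not (data_dir / name).is_dir()]
print("missing subdirectories:", missing)
```

An empty `missing` list means the layout is complete.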
- You can download the random forest model and then load it using:

  ```python
  import joblib

  trained_model = joblib.load(trained_model_path)
  inputs = trained_model['input']
  model = trained_model['model']
  ```

  given that `scikit-learn==1.3.2` is installed.
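As a rough sketch of how the loaded dictionary can be used: the file stores the expected feature names under `'input'` and the estimator under `'model'`. The tiny stand-in model, feature names, and file path below are made-up placeholders, not the repository's actual model:

```python
import os
import tempfile

import joblib
from sklearn.ensemble import RandomForestClassifier

# Build a hypothetical stand-in for the downloaded file, mirroring its
# dict layout: 'input' (feature names) and 'model' (the estimator).
clf = RandomForestClassifier(n_estimators=3, bootstrap=False, random_state=0)
clf.fit([[0, 0], [1, 1]], [0, 1])

trained_model_path = os.path.join(tempfile.mkdtemp(), "trained_model.joblib")
joblib.dump({"input": ["feat_a", "feat_b"], "model": clf}, trained_model_path)

# Loading follows the snippet above.
trained_model = joblib.load(trained_model_path)
inputs = trained_model["input"]   # feature names the model expects
model = trained_model["model"]    # the fitted random forest
prediction = model.predict([[1, 1]])
```

The `inputs` list tells you the order in which feature columns must be arranged before calling `model.predict`.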
To run our experiments, use the following commands:

```
conda activate modi-finder-analysis
python ./experiments_runners/experiments_runner.py './experiments_settings/all_experiments_settings.csv'
```
The following notebooks reproduce the paper's figures:

- `paper_figures/performance_results.ipynb`: performance result illustrations.
- `paper_figures/how_much_helpers_help.ipynb`: contribution of helper spectra.
- `paper_figures/evaluation_score_illustration.ipynb`: evaluation score illustration.
- `paper_figures/datasets.ipynb`: dataset statistics.