This repository contains research and experimental results for a project that uses recent advances in deep learning for computer vision to detect off-sample images in mass spectrometry imaging data with high accuracy.
The results presented here have been used to implement new features in the Metaspace project that help its users automatically filter out molecular search results representing the area outside the biological sample.
The model is used in Metaspace as a service with a simple API. Details of the model inference service can be found here.
Examples of on-sample ion images:
Examples of off-sample ion images:
A preprint of the paper describing the methods developed in this repository, as well as other methods, can be found on bioRxiv:
Katja Ovchinnikova, Vitaly Kovalev, Lachlan Stuart, Theodore Alexandrov. Recognizing off-sample mass spectrometry images with machine and deep learning.
Final model performance, estimated by 5-fold group cross-validation:
Class | F1 | Precision | Recall | Accuracy |
---|---|---|---|---|
off | 0.988 | 0.986 | 0.990 | 0.990 |
on | 0.992 | 0.993 | 0.990 | 0.990 |
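Group cross-validation keeps all samples that share a group on the same side of each split, so that related images never appear in both the training and the test fold. The following is a minimal sketch in plain Python, not the evaluation code used for the table above. It assumes that a "group" is the dataset an ion image comes from; the function name `group_k_fold` and the toy dataset labels are hypothetical. scikit-learn's `GroupKFold` offers comparable behaviour.

```python
from collections import defaultdict


def group_k_fold(groups, n_splits=5):
    """Yield (train_idx, test_idx) pairs in which no group spans both sides."""
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    if len(by_group) < n_splits:
        raise ValueError("need at least as many groups as folds")

    # Greedy balancing: place the largest groups first,
    # each into the fold that currently holds the fewest samples.
    folds = [[] for _ in range(n_splits)]
    for group in sorted(by_group, key=lambda g: (-len(by_group[g]), str(g))):
        smallest = min(range(n_splits), key=lambda i: len(folds[i]))
        folds[smallest].extend(by_group[group])

    for i in range(n_splits):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for k in range(n_splits) if k != i for j in folds[k])
        yield train_idx, test_idx


# Toy example: 12 images from 6 hypothetical datasets (assumed grouping key).
groups = ["ds1", "ds1", "ds1", "ds2", "ds2", "ds3",
          "ds3", "ds4", "ds5", "ds5", "ds6", "ds6"]
splits = list(group_k_fold(groups, n_splits=5))
for train_idx, test_idx in splits:
    train_groups = {groups[i] for i in train_idx}
    test_groups = {groups[i] for i in test_idx}
    print(sorted(test_groups), "held out;", len(train_idx), "training images")
```

Per-fold metrics computed on splits like these are typically averaged to obtain a single estimate.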
The dataset used for model training can be downloaded from S3:
wget https://s3-eu-west-1.amazonaws.com/sm-off-sample/GS.tar.gz
tar -xf GS.tar.gz
It includes 23238 manually tagged ion images, of which 13325 belong to the "off-sample" class and 9913 to the "on-sample" class.
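To put the reported accuracy in context, the class counts above imply the class balance and majority-class baseline computed below. This is a small sketch that uses only the numbers stated in this README; the variable names are illustrative.

```python
# Class counts as stated in this README.
n_off = 13325
n_on = 9913
n_total = n_off + n_on

off_fraction = n_off / n_total
on_fraction = n_on / n_total

# Accuracy of a trivial classifier that always predicts the larger class.
majority_baseline = max(off_fraction, on_fraction)

print(f"total images: {n_total}")
print(f"off-sample: {off_fraction:.1%}, on-sample: {on_fraction:.1%}")
print(f"majority-class baseline accuracy: {majority_baseline:.1%}")
```

The classes are moderately imbalanced, which is one reason to look at per-class precision and recall in addition to accuracy.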
- Ubuntu >= 14.04
- fastai==2.1.5
- pytorch>=1.0
- Nvidia GPU for fast training
All dependencies can be installed using the provided conda environment files.
- Clone the repository and install the dependencies
pip install -r requirements.txt
- If Jupyter is already installed, add a new kernel
python -m ipykernel install --user --name fastai --display-name fastai
- Otherwise, install Jupyter into the environment
pip install jupyter
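After installation, a quick check can confirm that the listed packages are visible and whether a GPU can be used. The following is a minimal sketch using only the standard library's `importlib.metadata` plus an optional `torch` import; the helper names `installed_version` and `check_environment` are hypothetical and not part of this repository.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Collect versions of the listed requirements and CUDA availability."""
    report = {
        "fastai": installed_version("fastai"),
        "torch": installed_version("torch"),
        "cuda_available": None,  # stays None when torch cannot be imported
    }
    if report["torch"] is not None:
        try:
            import torch
            report["cuda_available"] = torch.cuda.is_available()
        except Exception:
            # A broken install can fail at import time; leave the value as None.
            pass
    return report


report = check_environment()
for name, value in report.items():
    print(f"{name}: {value}")
```

A `fastai` value other than `2.1.5`, or `cuda_available` being `False`, suggests the environment differs from the requirements listed above.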