thbuerg/MetabolomicsCommonDiseases

Metabolomic profiles predict individual multi-disease outcomes

Description

Code accompanying the paper "Metabolomic profiles predict individual multi-disease outcomes in the UK Biobank cohort". This repo is a Python package for preprocessing UK Biobank data and for training and evaluating the proposed MetabolomicStateModel score.

[Figure: Workflow]

Methods

The MetabolomicStateModel is based on DeepSurv (see the original DeepSurv implementation). Using a residual neural network, it learns a shared representation of the NMR metabolomics data to predict log partial hazards for common disease endpoints.
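To make "log partial hazards" concrete: DeepSurv-style models are trained by minimizing the negative Cox log partial likelihood of the predicted log hazards. The following is a minimal pure-Python sketch of that loss for a single endpoint, not the repo's implementation; the function name and the Breslow-style treatment of tied times are assumptions made for illustration. For several endpoints, one plausible approach is to compute this loss per endpoint and sum the results.

```python
import math


def cox_neg_log_partial_likelihood(log_hazards, durations, events):
    """Average negative Cox log partial likelihood for one endpoint.

    log_hazards: predicted log partial hazards, one per individual
    durations:   observed follow-up times
    events:      1 if the event was observed, 0 if censored
    Hypothetical helper for illustration; ties are handled Breslow-style.
    """
    n_events = sum(events)
    if n_events == 0:
        return 0.0  # no observed events, nothing to rank
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(durations, events)):
        if not e_i:
            continue  # censored individuals contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = [h for h, t in zip(log_hazards, durations) if t >= t_i]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(risk)
        log_risk_sum = m + math.log(sum(math.exp(h - m) for h in risk))
        total += log_hazards[i] - log_risk_sum
    return -total / n_events


# Predictions that rank earlier events as higher risk yield a lower loss.
good = cox_neg_log_partial_likelihood([2.0, 1.0, 0.0], [1, 2, 3], [1, 1, 1])
bad = cox_neg_log_partial_likelihood([0.0, 1.0, 2.0], [1, 2, 3], [1, 1, 1])
print(good < bad)
```

The loss depends only on differences between log hazards within each risk set, which is why the model output is a relative risk score and not an absolute probability.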

[Figure: Architecture]

Assets

This repo contains code to preprocess UK Biobank data, train the MetabolomicStateModel and analyze/evaluate its performance.

  • Preprocessing parses primary care records for the desired diagnoses.
  • Training specifies the model via pytorch-lightning and hydra.
  • Evaluation runs extensive benchmarks against linear models and calculates bootstrapped metrics.
  • Visualization contains the code to generate the figures displayed in the paper.
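As an illustration of what a bootstrapped metric looks like, the sketch below computes Harrell's concordance index and a percentile bootstrap confidence interval in pure Python. The choice of the C-index, the function names, and the defaults (`n_boot`, `alpha`, `seed`) are assumptions for this example and are not taken from the repo's evaluation code.

```python
import random


def concordance_index(risk_scores, durations, events):
    """Harrell's C: share of comparable pairs that the risk score ranks correctly.

    Returns None when no comparable pair exists.
    """
    concordant, comparable = 0.0, 0
    n = len(durations)
    for i in range(n):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(n):
            if durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    return concordant / comparable if comparable else None


def bootstrap_c_index(risk_scores, durations, events, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C-index (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(durations)
    estimates = []
    for _ in range(n_boot):
        # Resample individuals with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        c = concordance_index(
            [risk_scores[k] for k in idx],
            [durations[k] for k in idx],
            [events[k] for k in idx],
        )
        if c is not None:
            estimates.append(c)
    if not estimates:
        raise ValueError("no resample contained a comparable pair")
    estimates.sort()
    lower = estimates[int(alpha / 2 * len(estimates))]
    upper = estimates[min(len(estimates) - 1, int((1 - alpha / 2) * len(estimates)))]
    return lower, upper
```

The pairwise loop is quadratic in the number of individuals, so at biobank scale an optimized implementation from a survival analysis library would be the practical choice.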

Use the MetabolomicStateModel on your data

We provide a ready-to-use Google Colab notebook with a trained version of our MetabolomicStateModel. Upload your dataset of Nightingale NMR metabolomics and run the model!

NOTE: Data must be provided in this format.

DISCLAIMER: This model is intended for research use only. We provide the NMR normalization pipeline as fitted on UK Biobank. Cohort-specific rescaling might be advisable.
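One possible reading of "cohort-specific rescaling" is to standardize each metabolite using the mean and standard deviation of your own cohort. The sketch below shows that idea in pure Python; it is an assumption about what rescaling could look like, not the normalization pipeline shipped with the model, and the column names are hypothetical.

```python
import statistics


def fit_scaler(rows):
    """Estimate per-column mean and standard deviation from a list of dicts."""
    params = {}
    for col in rows[0].keys():
        values = [r[col] for r in rows if r[col] is not None]
        params[col] = (statistics.fmean(values), statistics.pstdev(values))
    return params


def transform(rows, params):
    """Z-score every column; missing values stay None, constant columns map to 0."""
    scaled_rows = []
    for r in rows:
        scaled = {}
        for col, (mean, sd) in params.items():
            value = r[col]
            if value is None:
                scaled[col] = None
            else:
                scaled[col] = (value - mean) / sd if sd > 0 else 0.0
        scaled_rows.append(scaled)
    return scaled_rows


# Hypothetical cohort with two made-up metabolite columns.
cohort = [
    {"metabolite_a": 1.0, "metabolite_b": 5.0},
    {"metabolite_a": 3.0, "metabolite_b": 5.0},
]
print(transform(cohort, fit_scaler(cohort)))
```

Whether such a step should run before, after, or in place of the provided normalization depends on how your cohort was measured, so treat this as a starting point for your own checks.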

How to train the MetabolomicStateModel

  1. Install the dependencies:
# clone project
git clone https://github.com/thbuerg/MetabolomicsCommonDiseases

# install project
cd MetabolomicsCommonDiseases
pip install -e .
pip install -r requirements.txt

  2. Download the UK Biobank data and execute the preprocessing notebooks on it.

  3. Set up Neptune.ai.

  4. Edit the config.yaml in metabolomicstatemodel/run/config/:

data_dir: /path/to/data
code_dir: /path/to/repo_base
setup:
  project: <YourNeptuneWorkspace>/<YourProject>
experiment:
  tabular_filepath: /path/to/processed/data

  5. Train the MetabolomicStateModel (make sure you are on a machine with a GPU):
# module folder
cd source

# run training
bash run/run_MSM.sh
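Before launching a training run, it can help to confirm that none of the template placeholders are left in the config. The helper below is a hypothetical sketch and not part of the repo: it takes the config as a plain dictionary (for example, one loaded with a YAML parser) and reports keys that are missing or still hold a template value. The key names mirror the config.yaml snippet above; the placeholder heuristics are assumptions.

```python
# Keys shown in the config.yaml snippet, as paths into the nested dictionary.
REQUIRED_KEYS = [
    ("data_dir",),
    ("code_dir",),
    ("setup", "project"),
    ("experiment", "tabular_filepath"),
]


def find_unset_config_values(cfg):
    """Return dotted key names that are missing or still look like placeholders."""
    problems = []
    for path in REQUIRED_KEYS:
        node = cfg
        for key in path:
            # Walk down the nested dict; anything unexpected becomes None.
            node = node.get(key) if isinstance(node, dict) else None
        value = "" if node is None else str(node)
        # Heuristic: template values start with /path/to or contain <...>.
        if not value or value.startswith("/path/to") or "<" in value:
            problems.append(".".join(path))
    return problems


# The unedited template is flagged on every key.
template = {
    "data_dir": "/path/to/data",
    "code_dir": "/path/to/repo_base",
    "setup": {"project": "<YourNeptuneWorkspace>/<YourProject>"},
    "experiment": {"tabular_filepath": "/path/to/processed/data"},
}
print(find_unset_config_values(template))
```

An empty list means every required key has been filled in; it does not verify that the paths exist or that the Neptune project is reachable.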

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Citation

@article{buergel2022metabolomic,
  title={Metabolomic profiles predict individual multidisease outcomes},
  author={Buergel, Thore and Steinfeldt, Jakob and Ruyoga, Greg and Pietzner, Maik and Bizzarri, Daniele and Vojinovic, Dina and Upmeier zu Belzen, Julius and Loock, Lukas and Kittner, Paul and Christmann, Lara and others},
  journal={Nature Medicine},
  pages={1--12},
  year={2022},
  publisher={Nature Publishing Group}
}