<img src="materials/images/introduction-metabolomics-cover.png"/>

# Introduction to Metabolomics

`🕒 This module should take 30 minutes to complete.`

`✍️ This notebook is written using Python.`

<div class="alert alert-block alert-info">
<h3>⌨️ Keyboard shortcuts</h3>

These common shortcuts can save you time as you work through this notebook:
- Run the current cell: **`Shift + Enter`**.
- Add a cell above the current cell: Press **`A`**.
- Add a cell below the current cell: Press **`B`**.
- Change a code cell to a Markdown cell: Select the cell, and then press **`M`**.
- Delete a cell: Press **`D`** twice.

Need more help with keyboard shortcuts? Press **`H`** to look them up.
</div>

---

## What is metabolomics?

Metabolomics is the scientific study of the set of metabolites present within an organism, cell, or tissue. It is useful because it reveals unique “chemical fingerprints” that help scientists track the biological processes happening in the body [1].

For example, suppose you donate one blood sample after eating a healthy, high-fiber lunch (e.g., a colorful salad with spinach, red pepper, and black beans) and another after a high-fat, high-carb lunch (e.g., a bowl of spaghetti with meatballs). A scientist would see different metabolite compositions in the two samples, because your body breaks down the two meals differently.


> #### Apply what you learned:
> 
> The next time you hear the phrase “You are what you eat”, you will know it is true from a metabolomics perspective [2]. A healthier diet that contains diverse fibers and proteins produces metabolites that help your body function more efficiently, for example through better attention, athletic performance, and immune responses. Adding something healthy to your meals each day is one of the most underrated, economical, and effective ways to improve your quality of life. It is also an exercise in building discipline. The earlier you start, the more you will see the compounding benefits in your studies, your work, and your relationships with yourself and other people.


---

### What are metabolites?

Metabolites are the substances made or used when the body breaks down food, drugs, or chemicals [3]. By identifying and measuring the metabolites in a biological sample, we can tell which biological processes are happening in it.


Metabolite profiles (i.e., metabolite compositions) vary between individuals because they are the product of microbial, genetic, and environmental factors. This is why metabolomics is important: it helps us understand the underlying biochemical activity and state of cells or tissues [4] under the influence of those factors.

<img src="materials/images/metabolism.png"/>

Metabolites can be grouped by the four major sources that shape the metabolome [5]:
1. Xenobiotics
2. Genome
3. Gut microflora
4. Environment



<img src="materials/images/types-of-metabolites.png"/>

### What is the metabolome?

The metabolome is the total collection of metabolites present within an organism, cell, or tissue. It is the product of gene expression and protein activity.
To distinguish the two terms: “metabolomics” is the academic field, while the “metabolome” is what scientists in that field study.


---

## How do we quantify the metabolome?

As in proteomics, we use mass spectrometry in metabolomics to measure the mass-to-charge ratio (m/z) of one or more molecules present in a sample. The m/z value is key information for identifying a metabolic feature (e.g., a metabolite), while the signal intensity recorded at that m/z reflects how abundant it is.
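
To make the idea of a mass-to-charge ratio concrete, here is a minimal sketch that computes m/z for a protonated ion. It assumes positive-mode ionization in which the molecule picks up one proton per unit of charge; the notebook does not specify an ionization mode, and the helper name `mz_protonated` is hypothetical.

```python
# Mass of a proton in daltons (Da).
PROTON_MASS = 1.007276


def mz_protonated(neutral_mass, charge=1):
    """m/z of an [M + zH]^z+ ion: add z protons, then divide by the charge z.

    Assumption: ionization by protonation only (no sodium or other adducts).
    """
    return (neutral_mass + charge * PROTON_MASS) / charge


# Monoisotopic mass of glucose (C6H12O6) in Da.
glucose_mass = 180.06339

print(round(mz_protonated(glucose_mass), 4))     # singly protonated, [M+H]+
print(round(mz_protonated(glucose_mass, 2), 4))  # doubly protonated, [M+2H]2+
```

The same molecule appears at a different m/z depending on its charge, which is one reason m/z alone is usually combined with other evidence to identify a metabolite.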


<img src="materials/images/quantify-metabolomics.png"/>

---

## What are the major applications of metabolomics?

Metabolomics covers a wide range of applications, such as human diseases, plant biotechnology, and pharmacology.

The human metabolome probably consists of far more than a million endogenous and exogenous compounds. Blood is the matrix of choice for studying human disease because it is easy to collect and gives an integrated picture of all tissues in the body.

Studies show that the metabolic profile is linked to the progression of Alzheimer's disease (AD) [6]. One example is oxidative stress, a condition in which antioxidant levels in the body are too low to balance free radicals. Free radicals react easily with other molecules, and when they outnumber antioxidants they can start damaging DNA and proteins [7].


---

#### References:

[1] Metabolomics. Wikipedia. https://en.wikipedia.org/wiki/Metabolomics, last accessed Aug 26, 2022.

[2] https://www.youtube.com/watch?v=kWch9J0EGG8, last accessed Aug 26, 2022.

[3] https://www.cancer.gov/publications/dictionaries/cancer-terms/def/metabolite, last accessed Aug 23, 2022.

[4] https://www.ebi.ac.uk/training/online/courses/metabolomics-introduction/what-is/, last accessed Aug 23, 2022.

[5] Johnson, C. H., Patterson, A. D., Idle, J. R., & Gonzalez, F. J. (2012). Xenobiotic metabolomics: major impact on the metabolome. Annual review of pharmacology and toxicology, 52, 37.

[6] Orešič, M., Hyötyläinen, T., Herukka, S. K., Sysi-Aho, M., Mattila, I., Seppänen-Laakso, T., ... & Soininen, H. (2011). Metabolome in progression to Alzheimer's disease. Translational psychiatry, 1(12), e57-e57.

[7] https://www.healthline.com/health/oxidative-stress, last accessed Aug 26, 2022.

---

In [None]:
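import pandas as pd

# Load the tab-separated abundance table; the first column holds the sample IDs.
data = pd.read_csv('data/metabolome_abundance.txt', sep='\t', index_col=0)
# Preview the first 8 rows and 8 columns.
data.iloc[0:8, 0:8]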

Here, we see that the metabolite names are coded. The codes can be looked up in the Excel file `iPOP_Metablolite_Annotation.xlsx`.

In [None]:
metabolites = pd.read_excel('data/iPOP_Metablolite_Annotation.xlsx')
# Preview the first 10 annotation rows.
metabolites.head(10)

In [None]:
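# The last 5 columns hold sample metadata; the rest are metabolite abundances.
# Run this cell only once: re-running it would strip another 5 columns from `data`.
(nRows, nCols) = data.shape
metadata = data.iloc[:, nCols-5:]
data = data.iloc[:, :nCols-5]
(nRows, nCols) = data.shape
print('Our dataset contains the abundance data for', nCols, 'metabolites from', nRows, 'samples.')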
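
The split between abundance and metadata columns can be illustrated on a small table. This is a sketch on made-up data: the column names, values, and the `toy` variables are hypothetical, and it uses 2 metadata columns instead of the 5 in the real dataset.

```python
import pandas as pd

# Hypothetical toy table: 3 samples, 4 metabolite columns, 2 metadata columns.
toy = pd.DataFrame(
    {
        "M1": [1.0, 2.0, 3.0],
        "M2": [4.0, 5.0, 6.0],
        "M3": [7.0, 8.0, 9.0],
        "M4": [0.5, 0.6, 0.7],
        "SubjectID": ["A", "A", "B"],
        "Visit": [1, 2, 1],
    },
    index=["A-01", "A-02", "B-01"],
)

n_meta = 2  # number of trailing metadata columns in this toy table
toy_metadata = toy.iloc[:, -n_meta:]   # the last n_meta columns
toy_abundance = toy.iloc[:, :-n_meta]  # everything before them

print(toy_abundance.shape)
print(list(toy_metadata.columns))
```

Writing the results to new variables, rather than overwriting the original table, keeps the cell safe to re-run.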

In [None]:
# Keep only the rows (samples) whose index label contains this patient ID.
patientID = 'ZOZOW1T'
patientData = data.filter(like=patientID, axis=0)
patientData.iloc[0:8, 0:8]
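
Note that `filter(like=...)` matches a substring anywhere in the row label. The sketch below, on hypothetical sample labels, shows how that differs from a stricter prefix match; whether the difference matters for the real dataset depends on how its sample IDs are formed.

```python
import pandas as pd

# Hypothetical sample labels; "XAB1-01" contains "AB1" but belongs to another ID.
toy_samples = pd.DataFrame(
    {"M1": [1.0, 2.0, 3.0, 4.0]},
    index=["AB1-01", "AB1-02", "XAB1-01", "CD2-01"],
)

# filter(like=...) keeps every row whose label CONTAINS the substring.
by_substring = toy_samples.filter(like="AB1", axis=0)

# A stricter alternative: keep only labels that START with the ID.
by_prefix = toy_samples[toy_samples.index.str.startswith("AB1-")]

print(list(by_substring.index))
print(list(by_prefix.index))
```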

In [None]:
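import numpy as np

# Position of the largest abundance in the patient's first sample.
i = np.argmax(patientData.iloc[0, :].to_numpy())
# Assumes the annotation rows are in the same order as the columns of `data`.
print(f"{metabolites['Metabolite'].iloc[i]} is the metabolite with the highest abundance in sample {patientData.index[0]}.")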
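
An alternative that avoids relying on positional order is to look up the metabolite by its column label. The following is a minimal sketch with made-up codes and names; the `toy_names` dictionary stands in for an annotation lookup and is hypothetical.

```python
import pandas as pd

# Hypothetical abundances for one sample, keyed by coded metabolite IDs.
toy_row = pd.Series({"M1": 2.0, "M2": 9.5, "M3": 4.1})

# Hypothetical annotation lookup from code to readable name.
toy_names = {"M1": "Glucose", "M2": "Lactate", "M3": "Alanine"}

top_code = toy_row.idxmax()                      # label of the largest value
top_position = int(toy_row.to_numpy().argmax())  # its integer position

print(toy_names[top_code], "has the highest abundance.")
```

`idxmax` returns the label and `argmax` on the underlying array returns the position, so the two can be cross-checked when the order of the annotation table is in doubt.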

In [None]:
# Alternative that preserves the row and column layout. Appending the relative abundances to a list (as in the Proteomics module) may be easier; see the next cell.

# relativeAbundance = np.empty_like(patientData)
# for metabolite in range(patientData.shape[1]):
#   maxAbundance = max(patientData.iloc[:, metabolite])
#   relativeAbundance[:, metabolite] = patientData.iloc[:, metabolite] / maxAbundance
# relativeAbundance = pd.DataFrame(relativeAbundance, columns=list(metabolites['Metabolite']), index=patientData.index)

In [None]:
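# Scale each metabolite by its maximum abundance across this patient's samples,
# so the largest value for each metabolite becomes 1.
relativeAbundance = []
for metabolite in range(patientData.shape[1]):
    maxAbundance = patientData.iloc[:, metabolite].max()
    relativeAbundance.append(patientData.iloc[:, metabolite] / maxAbundance)
# Rows are metabolites (named from the annotation table); columns are samples.
relativeAbundance = pd.DataFrame(relativeAbundance, index=list(metabolites['Metabolite']))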
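
The same per-metabolite scaling can also be written without an explicit loop. This is a sketch on hypothetical data, not a replacement for the cell above; the `toy_patient` table and its values are made up.

```python
import pandas as pd

# Hypothetical abundances: rows are samples, columns are metabolites.
toy_patient = pd.DataFrame(
    {"M1": [1.0, 2.0, 4.0], "M2": [10.0, 5.0, 2.5]},
    index=["A-01", "A-02", "A-03"],
)

# Divide each column by its own maximum, then transpose so that
# rows are metabolites and columns are samples.
toy_relative = toy_patient.div(toy_patient.max(axis=0), axis=1).T

print(toy_relative)
```

Each row of `toy_relative` now peaks at 1, which is what makes a shared color scale from 0 to 1 meaningful in a heatmap.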

In [None]:
relativeAbundance

In [None]:
import seaborn as sns

# Plot the first 20 metabolites (rows) across the first 5 samples (columns).
df = relativeAbundance.iloc[0:20, 0:5]
ax = sns.heatmap(df, vmin=0, vmax=1)
ax.figure

---

## Contributions & acknowledgements

- **Module Content:** Ryan Park
- **Engineering:** Amit Dixit
- **UX/UI Design & Illustration:** Kexin Cha
- **Video Production:** Francesca Goncalves
- **Project Management:** Amir Bahmani, Kexin Cha

---

Copyright (c) 2022 Stanford Data Ocean (SDO)

All rights reserved.