<a href="https://colab.research.google.com/github/NICE-MSI/NPL-Academy/blob/main/NPL_NiCE-MSI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Using Mass Spectrometry Imaging to Map Molecules**

In this notebook, you will investigate the mean spectra of several tissues of interest. You will overplot spectra from the different tissues, study the intensity ratios of different compounds of interest, as well as .....

In [None]:
![alt text](NPL-Academy/tissues-image.png "Title")

First, we import the Python packages needed to read and plot the data (numpy, matplotlib, pandas):

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

We clone the repository that contains the Mass Spectrometry Imaging data we are going to study in this notebook.

In [None]:
!git clone https://github.com/NICE-MSI/NPL-Academy.git

We use "pandas" (a Python package) to read the file with the mean spectra of the different tissues.
We can print the file in the next cell to see its structure.

In [None]:
df = pd.read_csv("NPL-Academy/spectra.csv", index_col=None)  # read the data file
print(df)  # print the data file

As you can see, there are 6 columns in the file. The first column corresponds to the m/z values (X-axis of the spectrum). Columns 2 to 6 correspond to the intensities of the spectra for the different tissues (Y-axis).
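A quick way to confirm this layout is to list the column names and the table shape. The sketch below uses a small made-up table rather than the real file: only the column names "m/z", "A_APCKRAS" and "D_APCKRAS" come from this notebook, and `example_df` and its values are invented for illustration.

```python
import pandas as pd

# Made-up stand-in for spectra.csv (values are illustrative, not real data).
example_df = pd.DataFrame({
    "m/z": [300.0, 300.5, 301.0, 301.5],
    "A_APCKRAS": [1.2, 15.0, 0.8, 2.1],
    "D_APCKRAS": [0.9, 11.5, 1.1, 3.4],
})

column_names = list(example_df.columns)  # the first entry is the X-axis column
n_rows, n_cols = example_df.shape        # (number of m/z values, number of columns)

print(column_names)
print(n_rows, "rows,", n_cols, "columns")
print(example_df.head())                 # first rows only, shorter than print(df)
```

Running the same three calls on `df` should show the m/z column followed by one intensity column per tissue.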

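Note: You can save your figures by uncommenting the last line of a plotting cell (remove the #). You also need to comment out plt.show() (add a # in front of it) for the figure to be saved, because plt.show() closes the figure and a later plt.savefig() call would write an empty image. Your plots will be saved in the "outputs" folder.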

In [None]:
plt.plot(df["m/z"],df["A_APCKRAS"], color='blue', label='tissue A-APCKRAS')
plt.plot(df["m/z"],df["D_APCKRAS"], color='red', label='tissue D-APCKRAS')
plt.legend()
plt.show()
#plt.savefig("C:/Users/ag12/PycharmProjects/NPL-Academy/outputs/tissue_spectra.png")
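The note above asks you to comment out plt.show() before saving. An alternative, sketched here under stated assumptions, is to call savefig before show so that you get both the file and the displayed plot. The relative folder name `outputs`, the file name `example_plot.png` and the three plotted points are placeholders, not part of the original notebook.

```python
import os
import matplotlib.pyplot as plt

output_dir = "outputs"                  # assumed relative folder; adjust to your setup
os.makedirs(output_dir, exist_ok=True)  # create the folder if it does not exist yet

fig, ax = plt.subplots()
# Made-up points standing in for a real spectrum.
ax.plot([300, 310, 320], [1.0, 5.0, 2.0], color="blue", label="example spectrum")
ax.legend()

output_path = os.path.join(output_dir, "example_plot.png")  # hypothetical file name
fig.savefig(output_path)                # save first ...
plt.show()                              # ... then display
```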


Can you overplot all the spectra in "spectra.csv"? Remember to include the labels on your plot.
You can customize your plot in many different ways (colors, line styles, line widths, ...). For the full list of options, see:
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html

You can also zoom in on specific areas of the spectrum for better visualisation. For example:

In [None]:
plt.plot(df["m/z"],df["A_APCKRAS"], color='blue', label='tissue A-APCKRAS')
plt.xlim((300,320))
plt.show()
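Setting plt.xlim only changes the view; the points outside the window are still in the DataFrame. If you also want the numbers inside the window (for example, to find the tallest peak there), one option is to select the rows with a boolean mask. This is a minimal sketch on made-up values: `example_df`, `mz_min` and `mz_max` are placeholder names.

```python
import pandas as pd

# Made-up values standing in for one tissue column of the real data.
example_df = pd.DataFrame({
    "m/z": [299.0, 305.0, 310.0, 315.0, 325.0],
    "A_APCKRAS": [4.0, 12.0, 30.0, 7.0, 50.0],
})

mz_min, mz_max = 300, 320  # same window as the xlim example above

# True for rows whose m/z lies inside the window, False elsewhere.
in_window = (example_df["m/z"] >= mz_min) & (example_df["m/z"] <= mz_max)
window = example_df[in_window]

# Row with the highest intensity inside the window.
peak_row = window.loc[window["A_APCKRAS"].idxmax()]

print(window)
print("tallest peak in window at m/z =", peak_row["m/z"])
```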

Noise determination:
One of the first problems when analysing MSI data is telling signal apart from noise. MSI datasets are usually large, so it is important to keep only the compounds of interest and remove the noise. In this example, we estimate the noise level in a basic way, by taking the standard deviation of the mean spectrum.

In the next cell, we obtain the standard deviation of two of the tissues (A and D) using the "std" function from the numpy package (np.std).

In [None]:
print('standard deviation for tissue A =', np.std(df["A_APCKRAS"]))
print('standard deviation for tissue D =', np.std(df["D_APCKRAS"]))
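To see what np.std computes, the sketch below reproduces it by hand on a short made-up array: the square root of the mean squared deviation from the mean. The array `intensities` is invented for illustration and is not taken from the data file.

```python
import numpy as np

intensities = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # made-up values

mean_intensity = intensities.mean()
# Standard deviation by hand: sqrt of the average squared distance from the mean.
manual_std = np.sqrt(np.mean((intensities - mean_intensity) ** 2))
numpy_std = np.std(intensities)

print("mean =", mean_intensity)
print("manual std =", manual_std)
print("np.std     =", numpy_std)
```

Every point enters this calculation, including the peaks themselves, which is worth keeping in mind when the value is used as a noise estimate.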

We can use this standard deviation as a threshold to separate signal from noise.
In the cell below, we plot the noise in red and the signal in blue.

In [None]:
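threshold = np.std(df["A_APCKRAS"])  # noise threshold: standard deviation of the spectrum
# .where() keeps values where the condition holds and blanks out the rest
plt.plot(df["m/z"],df["A_APCKRAS"].where(df["A_APCKRAS"]<threshold), color='red', label='noise A-APCKRAS')
plt.plot(df["m/z"],df["A_APCKRAS"].where(df["A_APCKRAS"]>=threshold), color='blue', label='signal A-APCKRAS')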
plt.legend()
plt.show()
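The plotting cell above relies on the pandas method `Series.where`, which keeps the values where the condition is True and replaces the others with NaN; matplotlib then leaves gaps at the NaN positions. A minimal sketch on made-up numbers, where `intensity` and `example_threshold` are placeholders chosen for illustration:

```python
import pandas as pd

intensity = pd.Series([0.5, 8.0, 1.0, 12.0, 0.2])  # made-up values
example_threshold = 2.0                            # arbitrary cut-off for this sketch

# Values below the threshold are kept, the rest become NaN.
noise = intensity.where(intensity < example_threshold)
# Values at or above the threshold are kept, the rest become NaN.
signal = intensity.where(intensity >= example_threshold)

print(noise.tolist())
print(signal.tolist())
```

Each point ends up in exactly one of the two series, which is why the red and blue curves never overlap.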

Do you think this noise level is correct?
Let's zoom in on one specific area for a better view.

In [None]:
threshold = np.std(df["A_APCKRAS"])

# same colours as in the previous cell: noise in red, signal in blue
plt.plot(df["m/z"],df["A_APCKRAS"].where(df["A_APCKRAS"]<threshold), color='red', label='noise A-APCKRAS')
plt.plot(df["m/z"],df["A_APCKRAS"].where(df["A_APCKRAS"]>=threshold), color='blue', label='signal A-APCKRAS')
plt.xlim((410,420))
plt.ylim((-2,40))
plt.legend()
plt.show()
#plt.savefig("C:/Users/ag12/PycharmProjects/NPL-Academy/outputs/Noise_level.png")

Can you lower the noise level for this spectrum? Use the cell above to determine the noise level that you think works best in this case.

Can you investigate what happens in other areas of the spectrum when you use different thresholds? Can you use the same threshold across the whole spectrum to remove noise properly?
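One way to start exploring these questions is to count how many points are kept as signal for several candidate thresholds. The sketch below does this on a synthetic spectrum (a random baseline plus three tall peaks), not on the real data; the scaling factors 0.25, 0.5 and 1.0 are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic spectrum: random baseline between 0 and 5, plus three tall peaks.
intensities = rng.uniform(0.0, 5.0, size=1000)
intensities[[100, 500, 900]] = [80.0, 120.0, 60.0]

sigma = np.std(intensities)  # same basic noise estimate as in the cells above

counts = {}
for factor in (0.25, 0.5, 1.0):
    candidate = factor * sigma
    # Number of points at or above this candidate threshold.
    counts[factor] = int(np.sum(intensities >= candidate))
    print(f"threshold = {factor} x std = {candidate:.2f}: "
          f"{counts[factor]} points kept as signal")
```

To try it on the real data, replace `intensities` with a tissue column such as `df["A_APCKRAS"]`, optionally restricted to one m/z range at a time.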

Remember to save some plots for your presentation on Friday.