# Exercise set 14

> As you near the end of TKJ4175 for 2023, it's time to put your newly acquired skills to the test! In this final exercise, you will analyze NMR spectra and identify unknown oils using the knowledge you have gained in this course.

The data file [Data/nmr_oil.csv](./Data/nmr_oil.csv) contains ¹H NMR spectra measured for 
six edible oils: sesame, olive, peanut, sunflower, canola, and corn. For each oil, we have five spectra, and each spectrum is recorded at 1100 chemical shifts. We also have three spectra of unknown oils in the data file [Data/nmr_unknown_oil.csv](./Data/nmr_unknown_oil.csv). 

We have only limited information about the unknown samples. Each one is one of the six oil types we have measured, but the three unknowns may all be of the same type, or they may all be different.


**Use your chemometrics skills and identify the three oils!**

## Plotting example spectra

To get you started, here is some code to plot example spectra:

In [None]:
import pandas as pd
from matplotlib import pyplot as plt
import numpy as np
import seaborn as sns

%matplotlib inline
sns.set_context("notebook")

In [None]:
data = pd.read_csv("Data/nmr_oil.csv")
data_unknown = pd.read_csv("Data/nmr_unknown_oil.csv")
data.head()
# The column oil contains the oil type, and the other
# columns contain the intensity at the shift value given
# by the column name.
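
The layout described in the comment above (one label column followed by one intensity column per shift) can be split into a feature matrix and a label vector. The following is a minimal sketch on a small synthetic table, not on the real file: the column names such as `0.5ppm`, the oil labels, and the random intensities are assumptions made up for illustration, and the names `toy`, `X`, `y`, and `ppm_values` are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in with the layout described above: an "oil" column
# followed by one intensity column per chemical shift.
# The column names and values are invented for this illustration.
rng = np.random.default_rng(0)
shift_columns = [f"{ppm:.1f}ppm" for ppm in (0.5, 1.0, 1.5, 2.0)]
toy = pd.DataFrame(rng.random((6, 4)), columns=shift_columns)
toy.insert(0, "oil", ["olive", "olive", "corn", "corn", "peanut", "peanut"])

# Select the spectral columns by name rather than by position.
ppm_columns = [name for name in toy.columns if "ppm" in name]
X = toy[ppm_columns].to_numpy(dtype=float)  # shape: (samples, shifts)
y = toy["oil"].to_numpy()                   # one label per spectrum
ppm_values = np.array([float(name.split("ppm")[0]) for name in ppm_columns])

print(X.shape, y.shape, ppm_values)
```

Selecting columns by name avoids relying on the label column being first, which may matter if the two data files are not laid out identically.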

In [None]:
fig, axes = plt.subplots(constrained_layout=True, nrows=6, sharex=True, figsize=(9, 12))
# Extract the ppm values from the column names:
ppms = np.array([float(i.split("ppm")[0]) for i in data.columns if "ppm" in i])
# Loop over oil types and plot one example of each:
for i, oil_type in enumerate(data["oil"].unique()):
    intensity = data[data["oil"] == oil_type].to_numpy()[0, 1:]
    # Note: The selection [0, 1:] above picks the first of
    # the five spectra for the selected oil type and skips
    # the first column (index 0), since this is the oil
    # column.
    axes[i].plot(ppms, intensity)
    axes[i].set(ylabel="Intensity")
    axes[i].set_title(f"Oil: {oil_type}", loc="left")
axes[-1].invert_xaxis()
axes[-1].set_xlabel("ppm")
sns.despine(fig=fig)

In [None]:
fig, axes = plt.subplots(constrained_layout=True, figsize=(9, 3))
# Extract the ppm values from the column names:
ppms = np.array([float(i.split("ppm")[0]) for i in data_unknown.columns if "ppm" in i])
# Show all the unknowns
spectra = data_unknown.to_numpy()[:, 1:]
for i, intensity in enumerate(spectra):
    axes.plot(ppms, intensity, label=f"Unknown oil {i+1}")
axes.set(ylabel="Intensity")
axes.invert_xaxis()
axes.set_xlabel("ppm")
axes.legend(loc="upper left")
sns.despine(fig=fig)
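
One possible way to approach the identification is to project the known spectra onto a few principal components and assign each unknown to the nearest class centroid in score space. The sketch below illustrates that idea on synthetic data only: the Gaussian "spectra", the noise level, the choice of two components, and the nearest-centroid rule are all assumptions for illustration, and other methods from the course may well suit the real data better.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the real data: 3 classes, 5 "spectra" each,
# 50 "shifts". Each class has a single peak at its own position.
n_classes, n_per_class, n_shifts = 3, 5, 50
axis = np.arange(n_shifts)
templates = np.array(
    [np.exp(-0.5 * ((axis - center) / 2.0) ** 2) for center in (10, 25, 40)]
)
labels = np.repeat(np.arange(n_classes), n_per_class)
X_known = templates[labels] + 0.05 * rng.standard_normal((labels.size, n_shifts))

# Three synthetic "unknowns" whose true classes we know here by construction.
true_unknown = np.array([2, 0, 2])
X_unknown = templates[true_unknown] + 0.05 * rng.standard_normal((3, n_shifts))

# PCA via SVD of the mean-centered known spectra.
mean = X_known.mean(axis=0)
_, _, vt = np.linalg.svd(X_known - mean, full_matrices=False)
loadings = vt[:2].T  # keep two principal components
scores_known = (X_known - mean) @ loadings
# Project the unknowns with the same centering and loadings.
scores_unknown = (X_unknown - mean) @ loadings

# Assign each unknown to the nearest class centroid in score space.
centroids = np.array(
    [scores_known[labels == k].mean(axis=0) for k in range(n_classes)]
)
distances = np.linalg.norm(
    scores_unknown[:, None, :] - centroids[None, :, :], axis=2
)
predicted = distances.argmin(axis=1)
print(predicted)
```

With the real data, it is worth plotting the scores of the known and unknown samples together before trusting any automatic assignment, since the six oils may not separate as cleanly as these synthetic classes do.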