## Plot PAM50 spine-generic normative data 

To start this Jupyter notebook, run the following commands in your terminal:

```
conda create --name jupyter
conda activate jupyter
conda install jupyterlab matplotlib seaborn pandas plotly
pip install ipynb
jupyter lab
```

In [None]:
# Import libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

from ipynb.fs.full.utils_plotly import create_subplot, create_subplot_hue, create_regplot
from ipynb.fs.full.read_source_data_from_github import read_csv_files_from_github

%matplotlib inline

You can read the source CSV files directly from the PAM50-normalized-metrics repository (https://raw.githubusercontent.com/spinalcordtoolbox/PAM50-normalized-metrics/r20230222/).
However, fetching the source CSV files takes ~30 s.

In [None]:
# df = read_csv_files_from_github()
# number_of_subjects = len(df['participant_id'].unique().tolist())
# print(f'Number of subjects: {number_of_subjects}')

To avoid this delay, we aggregated all single-subject CSV files into a single CSV file.
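
For reference, this kind of aggregation can be sketched with pandas as shown below. This is a minimal sketch, not the script actually used to build the aggregated file. The helper name `aggregate_csv_files`, the `*.csv` glob pattern, and the flat directory layout are assumptions made for illustration.

```python
import glob
import os

import pandas as pd


def aggregate_csv_files(input_dir, output_csv):
    """Concatenate every CSV file in input_dir into one CSV file (hypothetical helper)."""
    # Sort the paths so the row order is reproducible across runs
    csv_paths = sorted(glob.glob(os.path.join(input_dir, '*.csv')))
    if not csv_paths:
        raise FileNotFoundError(f'No CSV files found in {input_dir}')
    # Read each per-subject file and stack the rows into one DataFrame
    df_all = pd.concat((pd.read_csv(path) for path in csv_paths), ignore_index=True)
    # Write the aggregated table without the pandas row index
    df_all.to_csv(output_csv, index=False)
    return df_all
```

With this sketch, write the output file outside `input_dir`; otherwise a second run would pick up the aggregated file as one of its inputs.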

In [None]:
# Define path
file_spine_generic_aggregated_metrics = 'data/spine-generic_subjects_aggregated_metrics.csv'
# Note: HC_metrics.csv was generated by the following lines:
# https://github.com/spinalcordtoolbox/PAM50-normalized-metrics/blob/dad996831a0b98c2850ffa288981b8b12008e46e/statistics/generate_figures.py#L601-L602

# Read the CSV file with aggregated metrics for all subjects 
df = pd.read_csv(file_spine_generic_aggregated_metrics)

In [None]:
# Count distinct participant IDs (nunique() ignores missing values)
number_of_subjects = df['participant_id'].nunique()
print(f'Number of subjects: {number_of_subjects}')
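
Before splitting the plots by a grouping variable, it can help to check how many subjects fall into each group. The following is a minimal sketch, assuming the grouping variable (for example `sex`) is stored as a column next to `participant_id`. The helper `count_subjects_per_group` and the toy data are hypothetical and are not part of the notebook's utilities.

```python
import pandas as pd


def count_subjects_per_group(df, column):
    """Count unique subjects for each value of `column` (hypothetical helper)."""
    # Keep one row per subject so repeated rows are not double-counted
    per_subject = df.drop_duplicates(subset='participant_id')
    return per_subject[column].value_counts()


# Toy data for illustration only; these are not real spine-generic values
toy_df = pd.DataFrame({
    'participant_id': ['sub-01', 'sub-01', 'sub-02', 'sub-03'],
    'sex': ['F', 'F', 'M', 'F'],
})
print(count_subjects_per_group(toy_df, 'sex'))
```

On the real data, the same call would be `count_subjects_per_group(df, 'sex')`, provided the column exists under that name.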

In [None]:
create_regplot(df, show_cv=True)

In [None]:
create_subplot(df)

In [None]:
create_subplot_hue(df, hue='sex')

In [None]:
create_subplot_hue(df, hue='manufacturer')

In [None]:
create_subplot_hue(df, hue='age')