## PAM50-normalized-metrics - figures

To start this Jupyter notebook, run the following commands in your terminal:

```
conda create --name jupyter
conda activate jupyter
conda install jupyterlab matplotlib seaborn pandas plotly
pip install ipynb
# To save images, also install kaleido
pip install -U kaleido
jupyter lab
```

In [1]:
# Import libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

from utils import fetch_data, read_metrics, create_subplot, create_subplot_hue, create_regplot

%matplotlib inline

You can read the source CSV files directly from the PAM50-normalized-metrics repository (https://github.com/spinalcordtoolbox/PAM50-normalized-metrics/tree/r20230222).

However, fetching the source CSV files this way takes ~30 s, so the cell below is left commented out.

In [2]:
# from read_source_data_from_github import read_csv_files_from_github
# df = read_csv_files_from_github()
# number_of_subjects = len(df['participant_id'].unique().tolist())
# print(f'Number of subjects: {number_of_subjects}')

This is why we aggregated all individual single-subject CSV files into a single CSV file.
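
For readers who want to see what such an aggregation can look like, here is a minimal sketch using pandas. It is not the repository's actual script. The function name `aggregate_subject_csvs`, the flat directory of per-subject CSV files, and the assumption that all files share the same columns are hypothetical choices made for illustration.

```python
import glob
import os

import pandas as pd


def aggregate_subject_csvs(input_dir, output_csv):
    """Concatenate all per-subject CSV files in input_dir into one CSV file.

    Hypothetical helper: assumes every CSV in input_dir has the same columns.
    """
    # Sort the file list so the row order is reproducible across runs
    csv_files = sorted(glob.glob(os.path.join(input_dir, '*.csv')))
    if not csv_files:
        raise FileNotFoundError(f'No CSV files found in {input_dir}')

    # Read each per-subject file and stack the rows into a single DataFrame
    frames = [pd.read_csv(csv_file) for csv_file in csv_files]
    df_all = pd.concat(frames, ignore_index=True)

    # Write the aggregated table without the pandas index column
    df_all.to_csv(output_csv, index=False)
    return df_all
```

Write the output file outside `input_dir`; otherwise a second run would pick up the aggregated CSV as one more input and duplicate its rows.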

In [3]:
# Fetch data using repo2data
fetch_data()

---- repo2data starting ----
/Users/valosek/miniconda3/envs/jupyter/lib/python3.11/site-packages/repo2data
Config from file :
data_requirement.json
Destination:
./data/PAM50-normalized-metrics-neurolibre

Info : ./data/PAM50-normalized-metrics-neurolibre already downloaded


In [4]:
# Define path
file_spine_generic_aggregated_metrics = 'data/PAM50-normalized-metrics-neurolibre/PAM50-normalized-metrics-neurolibre/spine-generic_subjects_aggregated_metrics.csv'
# Note: data/spine-generic_subjects_aggregated_metrics.csv was generated by the following lines:
# https://github.com/spinalcordtoolbox/PAM50-normalized-metrics/blob/dad996831a0b98c2850ffa288981b8b12008e46e/statistics/generate_figures.py#L601-L602

# Read the CSV file with aggregated metrics for all subjects 
df = read_metrics(file_spine_generic_aggregated_metrics)

In [5]:
number_of_subjects = len(df['participant_id'].unique().tolist())
print(f'Number of subjects: {number_of_subjects}')

Number of subjects: 203


In [6]:
create_subplot(df, output='save', output_fname='figure2.png')

Created: figures/figure2.png.


In [7]:
create_regplot(df, show_cv=True, output='save', output_fname='figure3.png')

Created: figures/figure3.png.


In [8]:
create_subplot_hue(df, hue='sex', output='save', output_fname='figure4.png')

Created: figures/figure4.png.


In [9]:
create_subplot_hue(df, hue='vendor', output='save', output_fname='figure5.png')

Created: figures/figure5.png.


In [10]:
create_subplot_hue(df, hue='age', output='save', output_fname='figure6.png')

Created: figures/figure6.png.
