
4. Interpret Model

In this module, we interpret the ML models.

After training the final and baseline models in 2.train_model, we load the coefficients of these models from models/. We interpret these coefficients with the following plots:

We use seaborn.heatmap to display the coefficient values for each phenotypic class/feature.
We use seaborn.clustermap to display a hierarchically clustered heatmap of coefficient values for each phenotypic class/feature.
We use seaborn.kdeplot to display a density plot of coefficient values for each phenotypic class.
We use seaborn.barplot to display a bar plot of average coefficient values per phenotypic class.

In model_coefficient_correlations.ipynb, we compare the coefficients from the multi-class and single-class models. The coefficient matrix of a multi-class model has shape (# phenotypic classes, # features), while the coefficient matrix of a single-class model has shape (1, # features). Each row of the multi-class matrix can therefore be compared with the coefficient vector of the single-class model for the same phenotypic class.
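A minimal sketch of this pairing, assuming the coefficients are available as NumPy arrays. The array names, sizes, and random values below are hypothetical and only illustrate the shapes involved.

```python
import numpy as np

n_classes, n_features = 3, 5
rng = np.random.default_rng(0)

# Hypothetical multi-class coefficients: one row per phenotypic class.
multi_class_coefs = rng.normal(size=(n_classes, n_features))

# Hypothetical single-class coefficients: one (1, n_features) array per class.
single_class_coefs = [rng.normal(size=(1, n_features)) for _ in range(n_classes)]

# Pair the row for each class with the flattened single-class vector.
pairs = []
for class_index in range(n_classes):
    multi_vector = multi_class_coefs[class_index]            # shape (n_features,)
    single_vector = single_class_coefs[class_index].ravel()  # (1, n_features) -> (n_features,)
    pairs.append((multi_vector, single_vector))
```

Each entry of `pairs` holds two vectors of equal length, so they can be plotted against each other or passed to a correlation function.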

We graph these coefficient vectors in a scatterplot in which each point is the coordinate pair (multi-class model coefficient value, single-class model coefficient value) for a particular feature. For each pair of multi-class and single-class coefficient vectors, we compute the Pearson correlation coefficient with numpy.corrcoef to measure how linearly correlated the vectors are. We also compute the Clustermatch Correlation Coefficient (CCC) introduced in Pividori et al., 2022. This is a not-only-linear coefficient based on machine learning models and gives an idea of how correlated the feature coefficients are (where 0 is no relationship and 1 is a perfect relationship). The correlations for each pair of coefficient vectors are displayed above their scatterplots.
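As a rough illustration of the Pearson step, the sketch below applies numpy.corrcoef to two synthetic vectors. The vectors are hypothetical, not real model coefficients, and the CCC computation is left out because its API is not described here. The second half shows a perfectly dependent but non-linear pair with a Pearson correlation near zero, which is the kind of relationship a not-only-linear coefficient is meant to capture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical coefficient vectors for one phenotypic class (one value per feature).
multi_vector = rng.normal(size=50)
single_vector = 0.8 * multi_vector + rng.normal(scale=0.1, size=50)

# numpy.corrcoef returns a 2x2 correlation matrix for two 1-D inputs;
# the off-diagonal entry is the Pearson correlation between them.
correlation_matrix = np.corrcoef(multi_vector, single_vector)
pearson = correlation_matrix[0, 1]

# A purely quadratic relationship on a symmetric range: y is fully
# determined by x, yet the linear (Pearson) correlation is close to zero.
x = np.linspace(-1.0, 1.0, 101)
y = x**2
pearson_nonlinear = np.corrcoef(x, y)[0, 1]
```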

Results

Each model's interpretations are located in interpret_model_coefficients.ipynb.

Notes:

Intermediate .tsv data are stored in tidy format, a standardized data structure (see Tidy Data by Hadley Wickham for more details).
SCM stands for "single-class model(s)" and is used as an abbreviation for the binary, single-class models throughout this module.
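For the tidy format mentioned in the notes above, the following sketch reshapes a small coefficient matrix into one row per (class, feature) observation and round-trips it through tab-separated text. The column names phenotypic_class, feature, and coefficient are assumptions chosen for illustration and may not match the columns in this module's .tsv files.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical wide coefficient matrix: rows are classes, columns are features.
coefs = pd.DataFrame(
    np.arange(6, dtype=float).reshape(2, 3),
    index=["class_A", "class_B"],
    columns=["feature_0", "feature_1", "feature_2"],
)

# Tidy (long) layout: each row is one observation, each column one variable.
tidy = (
    coefs.rename_axis("phenotypic_class")
    .reset_index()
    .melt(id_vars="phenotypic_class", var_name="feature", value_name="coefficient")
)

# Write as tab-separated text to an in-memory buffer and read it back.
buffer = io.StringIO()
tidy.to_csv(buffer, sep="\t", index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer, sep="\t")
```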

Step 1: Interpret Model

Use the commands below to interpret the ML models:

# Make sure you are located in 4.interpret_model
cd 4.interpret_model

# Activate phenotypic_profiling conda environment
conda activate phenotypic_profiling

# Interpret model
bash interpret_model.sh
