# Cellular Deconvolution Notebook

author: Simon Lee (slee@celsiustx.com)

You will need to install the following methods to use this notebook:

- [cellanneal](https://github.com/LiBuchauer/cellanneal)
- [Kassandra](https://github.com/BostonGene/Kassandra)

Clone both repositories into the `src` folder so the notebook can import them. The expected layout is:

src<br>
|_ cellanneal <br>
|_ Kassandra <br>
|_ master_deconvolution.ipynb<br>



# Table of Contents
* [Motivation](#0)
* [Imports & Loading functions](#1)
* [cellanneal](#2)
    * [GSE107572](#2_1)
    * [GSE1479433](#2_2)
* [Kassandra](#3)
    * [GSE107572](#3_1)
    * [GSE1479433](#3_2)
* [SVR](#4)
    * [GSE107572](#4_1)
    * [GSE1479433](#4_2)



# Motivation <a class="anchor" id="0"></a>

Cellular deconvolution (also called cell type composition or cell proportion estimation) refers to computational techniques that estimate the proportions of different cell types in samples collected from a tissue. Over the past few years many methods have been implemented using a wide range of machine learning approaches, and several have been considered state of the art. However, the paper [Clustering-independent estimation of cell abundances in bulk tissues using single-cell RNA-seq data](https://www.biorxiv.org/content/10.1101/2023.02.06.527318v1.full.pdf) showed us that the accuracy of many deconvolution methods is driven largely by the gene expression signature they typically require as input to estimate cell proportions. Since then, more methods have been developed that do not require a gene signature set but instead need some form of single-cell reference to infer the cellular proportions.

This notebook therefore looks at widely different methods from the literature and provides an easy-to-use interface for running them. Part of the challenge of working with open source codebases is that reproducibility takes a lot of work: files may be missing, datasets involving real patients may be restricted, and so on. The emphasis here is to give the user everything needed to benchmark different deconvolution methods for themselves. In this repository you will find two PBMC-related datasets in the `data` folder (`GSE107572` and `GSE1479433`), each with bulk samples paired with an orthogonal flow cytometry matrix (the "ground truth") to benchmark against. If you would like to retrain these models from scratch, you will need to provide a training set with a gene signature set from the same tissue. For validation, some form of orthogonal quantification is required.

If you are interested in including your own models, you can do so by following the pipeline demonstrated for the three methods in this notebook: **cellanneal (annealing/rank coefficients minimization function), Kassandra (Light Gradient Boosting Decision Tree Model), & SVR (support vector regression)**.

# TODO Update Notebook with new deconv

## Imports & Loading Functions <a class="anchor" id="1"></a>

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import random


# statistical tests
from scipy.stats import pearsonr
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_squared_error as mse
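from sklearn.metrics import r2_score

# scaling and regression used in the SVR section
from sklearn.preprocessing import StandardScaler
from sklearn.svm import NuSVR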

# stats & plot specific from .py files
from plot import Plot
from stats import statsTest
from helper import flatten, gene_intersection
import project_configs as project_configs
from deconv import Deconvolution

from tqdm import tqdm



In [2]:
# plot parameters, all taken from project_configs
plt.style.use(project_configs.style)
plt.rcParams['font.family'] = project_configs.font_family
plt.rcParams['font.serif'] = project_configs.font_serif
plt.rcParams['font.monospace'] = project_configs.font_monospace
plt.rcParams['font.size'] = project_configs.font_size
plt.rcParams['axes.labelsize'] = project_configs.axes_label_size
plt.rcParams['axes.labelweight'] = project_configs.axes_label_weight
plt.rcParams['axes.titlesize'] = project_configs.axes_title_size
plt.rcParams['xtick.labelsize'] = project_configs.xtick_label_size
plt.rcParams['ytick.labelsize'] = project_configs.ytick_label_size
plt.rcParams['legend.fontsize'] = project_configs.legend_font_size 
plt.rcParams['figure.titlesize'] = project_configs.figure_title_size
plt.rcParams['image.cmap'] = project_configs.image_cmap 
plt.rcParams['image.interpolation'] = project_configs.image_interpolation 
plt.rcParams['figure.figsize'] = project_configs.figure_size
plt.rcParams['axes.grid']=project_configs.axes_grid
plt.rcParams['lines.linewidth'] = project_configs.lines_line_width
plt.rcParams['lines.markersize'] = project_configs.lines_marker_size
cells_p = project_configs.cells_p

### other helpful methods

In [None]:
def gene_sig_builder(df, ref):
    '''
    Gene Signature constructor
    '''
    

### Model Fitting

In [3]:
signature = pd.read_csv('./cellanneal/example_data/sc_PBMC_gene_signature.csv',index_col=0)
bulk = pd.read_csv('../data/GSE107572_expr.tsv.tar.gz',sep='\t',index_col=0)
cytof1 = pd.read_csv('../data/GSE107572_cytof.tsv.tar.gz', sep='\t', index_col=0)
bulk2 = pd.read_csv('../data/GSE1479433.tsv', sep='\t', index_col=0)
cytof2 = pd.read_csv('../data/GSE1479433_cytometry_df.tsv', sep='\t', index_col=0)


In [4]:
signature_match, bulk_match = gene_intersection(signature, bulk)
print(signature_match.shape[0],"==", bulk_match.shape[0])

12998 == 12998


In [5]:
# prepare training data
training_data = [bulk, bulk2]

# import models for training and fitting
deconv = Deconvolution()
models = deconv.train(signature, training_data)

Preparing Cell Anneal Model
3862 highly variable genes identified in cell type
        reference.
	1951 of these are within thresholds for sample SRR6337113
	1723 of these are within thresholds for sample SRR6337114
	2159 of these are within thresholds for sample SRR6337115
	2199 of these are within thresholds for sample SRR6337116
	1912 of these are within thresholds for sample SRR6337117
	1876 of these are within thresholds for sample SRR6337118
	1835 of these are within thresholds for sample SRR6337119
	1856 of these are within thresholds for sample SRR6337120
	1926 of these are within thresholds for sample SRR6337121
3862 highly variable genes identified in cell type
        reference.
	2475 of these are within thresholds for sample F0222
	2469 of these are within thresholds for sample F0223
	2507 of these are within thresholds for sample F0224
	2343 of these are within thresholds for sample F0229
	2477 of these are within thresholds for sample F0230
	2412 of these are within thres

**!!! Deconvolution can take a long time depending on the number of samples, and the cellanneal and SVR methods in particular take significant time !!!**

In [6]:
# prepare for fitting
test_data = [signature, bulk, bulk2]
predictions = deconv.deconvolution(models, test_data)

Performing cellanneal deconvolution 

Deconvolving sample 1 of 9 (SRR6337113) ...
Deconvolving sample 2 of 9 (SRR6337114) ...
Deconvolving sample 3 of 9 (SRR6337115) ...
Deconvolving sample 4 of 9 (SRR6337116) ...
Deconvolving sample 5 of 9 (SRR6337117) ...
Deconvolving sample 6 of 9 (SRR6337118) ...
Deconvolving sample 7 of 9 (SRR6337119) ...
Deconvolving sample 8 of 9 (SRR6337120) ...
Deconvolving sample 9 of 9 (SRR6337121) ...
Deconvolving sample 1 of 45 (F0222) ...
Deconvolving sample 2 of 45 (F0223) ...
Deconvolving sample 3 of 45 (F0224) ...
Deconvolving sample 4 of 45 (F0229) ...
Deconvolving sample 5 of 45 (F0230) ...
Deconvolving sample 6 of 45 (F0231) ...
Deconvolving sample 7 of 45 (F0232) ...
Deconvolving sample 8 of 45 (F0274) ...
Deconvolving sample 9 of 45 (F0303) ...
Deconvolving sample 10 of 45 (F0304) ...
Deconvolving sample 11 of 45 (F0305) ...
Deconvolving sample 12 of 45 (F0306) ...
Deconvolving sample 13 of 45 (F0307) ...
Deconvolving sample 14 of 45 (F0308) ...
D

 56%|█████▌    | 5/9 [21:08<16:42, 250.64s/it]

## cellanneal <a class="anchor" id="2"></a>

## GSE107572 <a class="anchor" id="2_1"></a>

Multiply the predictions by 100 so they are cell percentages rather than fractions that sum to 1.
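
As a minimal sketch of that conversion, assuming the cellanneal output is a samples × cell types DataFrame of fractions. The `fractions` frame and its values below are made up for illustration; `ca_preds_100` in the cells that follow is assumed to be the real equivalent.

```python
import pandas as pd

# Hypothetical fractions: rows are samples, columns are cell types,
# and each row sums to 1.
fractions = pd.DataFrame(
    {"B": [0.10, 0.25], "CD4": [0.60, 0.50], "NK": [0.30, 0.25]},
    index=["sample_1", "sample_2"],
)

# Scale to percentages so each row sums to 100 instead of 1.
percentages = fractions * 100
row_totals = percentages.sum(axis=1)
```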

In [None]:
# plotting object
plotter = Plot()

In [None]:
plotter.stack_plot(ca_preds_100)
plotter.heat_map(ca_preds_100)

The cell type names need to match those in the CyTOF data to produce the correlation plots.

In [None]:
ca_preds_100

In [None]:
# aggregate and rename cell types to match the cytof "ground truth" labels
ca_preds_copy = ca_preds_100.T.copy()
ca_preds_copy.loc['B_cells'] = ca_preds_copy.loc[['B', 'B-naive']].sum()
ca_preds_copy.loc['CD4_T_cells'] = ca_preds_copy.loc[['CD4', 'CD4-naive']].sum()
ca_preds_copy.loc['CD8_T_cells'] = ca_preds_copy.loc[['CD8']].sum()
ca_preds_copy.loc['NK_cells'] = ca_preds_copy.loc[['NK']].sum()
ca_preds_copy.loc['Tregs'] = ca_preds_copy.loc[['Treg']].sum()
ca_preds_copy.loc['T_cells'] = ca_preds_copy.loc[['CD8_T_cells', 'CD4_T_cells', 'Tregs', 'T_undef']].sum()
ca_preds_copy.loc['Lymphocytes'] = ca_preds_copy.loc[['B_cells', 'T_cells', 'NK_cells']].sum()

In [None]:
cytof1 = pd.read_csv('../data/GSE107572_cytof.tsv.tar.gz', sep='\t', index_col=0)

In [None]:
plotter.plot_cell(ca_preds_copy, cytof1, pallete=cells_p)

#### The statsTest class takes the statistics of the whole mixture
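
What "whole mixture" means here can be sketched as pooling every (cell type, sample) pair into one vector before computing a single Pearson r and RMSE. This is an illustration with made-up numbers using `scipy.stats.pearsonr`; it is not the actual `statsTest` implementation, whose internals are not shown in this notebook.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

# Made-up percentages: rows are cell types, columns are samples.
cell_types = ["B_cells", "T_cells", "NK_cells"]
predicted = pd.DataFrame(
    {"s1": [20.0, 50.0, 30.0], "s2": [10.0, 60.0, 30.0]}, index=cell_types
)
measured = pd.DataFrame(
    {"s1": [25.0, 45.0, 30.0], "s2": [15.0, 55.0, 30.0]}, index=cell_types
)

# Keep only the labels present in both frames, in the same order.
shared_rows = predicted.index.intersection(measured.index)
shared_cols = predicted.columns.intersection(measured.columns)

# Flatten both frames so every (cell type, sample) pair is one data point.
pred_flat = predicted.loc[shared_rows, shared_cols].to_numpy().ravel()
true_flat = measured.loc[shared_rows, shared_cols].to_numpy().ravel()

r, p_value = pearsonr(pred_flat, true_flat)
rmse = float(np.sqrt(np.mean((pred_flat - true_flat) ** 2)))
```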

In [None]:
title = 'GSE107572'
plotter.plot_whole(ca_preds_copy, cytof1, pallete = cells_p, title=title, stat=True)

In [None]:
plotter.bland_altman(ca_preds_copy, cytof1, pallete = cells_p)

In [None]:
# get a color palette to color samples
ind_names = ca_preds_100.index.intersection(cytof1.index)
sample_color = plotter.get_cmap(len(ind_names))


In [None]:
plotter.plot_sample(ca_preds_copy, cytof1, pallete = sample_color)

In [None]:
plotter.bland_altman_v2(ca_preds_copy, cytof1, pallete = sample_color)

## GSE1479433 <a class="anchor" id="2_2"></a>

In [None]:
title = 'GSE1479433'
bulk2 = pd.read_csv('../data/GSE1479433.tsv', sep='\t', index_col=0)
cytof2 = pd.read_csv('../data/GSE1479433_cytometry_df.tsv', sep='\t', index_col=0)

In [None]:
plotter.stack_plot(ca_preds2_100)
plotter.heat_map(ca_preds2_100)

In [None]:
# aggregate cell types to match the cytof "ground truth" labels
ca_preds_copy2 = ca_preds2_100.T.copy()
ca_preds_copy2.loc['T_cells'] = ca_preds_copy2.loc[['CD8_T_cells', 'CD4_T_cells', 'Tregs']].sum()
ca_preds_copy2.loc['Lymphocytes'] = ca_preds_copy2.loc[['B_cells', 'T_cells', 'NK_cells']].sum()

In [None]:
flatten(ca_preds_copy2, cytof2)

In [None]:
plotter.plot_cell(ca_preds_copy2, cytof2, pallete=cells_p)

In [None]:
plotter.plot_whole(ca_preds_copy2, cytof2, pallete = cells_p, title=title, stat=True)

These results make sense, as a lot of cell subtypes are being incorrectly identified compared to the CyTOF data.

In [None]:
plotter.bland_altman(ca_preds_copy2, cytof2, pallete = cells_p)

## Kassandra <a class="anchor" id="3"></a>

## GSE107572 <a class="anchor" id="3_1"></a>

In [None]:
k_preds = model.predict(bulk) 
k_preds.loc['Lymphocytes'] = k_preds.loc[['B_cells', 'T_cells', 'NK_cells']].sum()
k_preds_100 = k_preds * 100

# drop parent nodes so we can plot child nodes stack plots
parent_nodes = ['Non_plasma_B_cells', 'Monocytes', 'Granulocytes', 'B_cells', 'T_cells', 'NK_cells', 'Myeloid_cells', 'Lymphoid_cells', 'Lymphocytes', 'CD8_T_cells', 'Cytotoxic_NK_cells', 'CD4_T_cells', 'Memory_T_helpers', 'Memory_CD8_T_cells']
k_preds_child = k_preds_100.drop(parent_nodes)

In [None]:
# cell proportion plots
plotter.stack_plot(k_preds_child.T)
plotter.heat_map(k_preds_child.T)

In [None]:
# check for intersections
flatten(k_preds_100, cytof1)

In [None]:
plotter.corr_plot(k_preds_100, cytof1, pallete=cells_p)

In [None]:
title = 'GSE107572'  # reset: title was last set to 'GSE1479433' in the cellanneal section
plotter.print_cell_whole(k_preds_100, cytof1, pallete = cells_p, title=title, stat=True)

In [None]:
plotter.bland_altman(k_preds_100, cytof1, pallete = cells_p)

In [None]:
# get a color palette to color samples
ind_names = k_preds_100.index.intersection(cytof1.index)
sample_color = plotter.get_cmap(len(ind_names))
plotter.plot_sample(k_preds_100, cytof1, pallete = sample_color)
plotter.bland_altman_v2(k_preds_100, cytof1, pallete = sample_color)

## GSE1479433 <a class="anchor" id="3_2"></a>

In [None]:
k_preds2 = model.predict(bulk2) 
k_preds2.loc['Lymphocytes'] = k_preds2.loc[['B_cells', 'T_cells', 'NK_cells']].sum()
k_preds2_100 = k_preds2 * 100

# drop parent nodes so we can plot child nodes stack plots
parent_nodes = ['Non_plasma_B_cells', 'Monocytes', 'Granulocytes', 'B_cells', 'T_cells', 'NK_cells', 'Myeloid_cells', 'Lymphoid_cells', 'Lymphocytes', 'CD8_T_cells', 'Cytotoxic_NK_cells', 'CD4_T_cells', 'Memory_T_helpers', 'Memory_CD8_T_cells']
k_preds2_child = k_preds2_100.drop(parent_nodes)

In [None]:
# cell proportion plots
plotter.stack_plot(k_preds2_child.T)
plotter.heat_map(k_preds2_child.T)

In [None]:
# check for intersections
flatten(k_preds2_100, cytof2)

In [None]:
plotter.corr_plot(k_preds2_100, cytof2, pallete=cells_p)

In [None]:
title = 'GSE1479433'
plotter.print_cell_whole(k_preds2_100, cytof2, pallete = cells_p, title=title, stat=True)

In [None]:
plotter.bland_altman(k_preds2_100, cytof2, pallete = cells_p)

In [None]:
# get a color palette to color samples
ind_names = k_preds2_100.index.intersection(cytof2.index)
sample_color = plotter.get_cmap(len(ind_names))
plotter.plot_sample(k_preds2_100, cytof2, pallete = sample_color, specific_col='F0598')
plotter.bland_altman_v2(k_preds2_100, cytof2, pallete = sample_color,specific_col='F0598')

## SVR<a class="anchor" id="4"></a>

In [None]:
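# NOTE: in this section the variable names are swapped relative to the cells above:
# bulk2 holds GSE107572 and bulk holds GSE1479433
signature = pd.read_csv('./cellanneal/example_data/sc_PBMC_gene_mean_signature.csv',index_col=0)
bulk2 = pd.read_csv('../data/GSE107572_expr.tsv.tar.gz',sep='\t',index_col=0)
bulk = pd.read_csv('../data/GSE1479433.tsv', sep='\t', index_col=0)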

In [None]:
set1 = set(bulk.index)
set2 = set(bulk2.index)
set3 = set(signature.index)
intersection = (set1.intersection(set2)).intersection(set3)
inter = list(intersection)

In [None]:
signature = signature.filter(items=inter,axis=0)
bulk = bulk.filter(items=inter,axis=0)
bulk2 = bulk2.filter(items=inter,axis=0)

In [None]:
print(signature.shape, bulk.shape, bulk2.shape)

In [None]:
scaler = StandardScaler()
  
# transform data
train  = scaler.fit_transform(signature)
test_data = scaler.fit_transform(bulk2)
ind = bulk2.columns

## GSE107572 <a class="anchor" id="4_1"></a>

In [None]:
genes = bulk.index
ind = bulk2.columns
Nus=[0.25, 0.5, 0.75]

SVRcoef = np.zeros((signature.shape[1], bulk2.shape[1]))
Selcoef = np.zeros((bulk.shape[0], bulk2.shape[1]))

for i in tqdm(range(bulk2.shape[1])):
    sols = [NuSVR(kernel='linear', nu=nu).fit(train,test_data[:,i]) for nu in Nus]
    im_name = signature.columns
    RMSE = [mse(sol.predict(train), test_data[:,i]) for sol in sols]
    Selcoef[sols[np.argmin(RMSE)].support_, i] = 1
    SVRcoef[:,i] = np.maximum(sols[np.argmin(RMSE)].coef_,0)
    SVRcoef[:,i] = SVRcoef[:,i]/np.sum(SVRcoef[:,i])
svr_preds = pd.DataFrame(SVRcoef,index=im_name, columns=ind)
svr_preds = svr_preds.reindex(sorted(svr_preds.columns), axis=1)
svr_preds
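
For readers who want to check the loop above on data with a known answer, here is a minimal synthetic sketch of the same idea: fit a linear `NuSVR` with the signature as features and one bulk sample as target, clip negative coefficients to zero, and renormalise so they sum to 1. The signature matrix, true proportions, and noise level are all invented for the example, and the scaling step is omitted for brevity.

```python
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.default_rng(0)

n_genes, n_cell_types = 200, 3
# Invented signature: genes x cell types, non-negative expression values.
signature = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_cell_types))
true_props = np.array([0.5, 0.3, 0.2])

# A bulk sample is modelled as a weighted mix of the signature columns plus noise.
bulk_sample = signature @ true_props + rng.normal(0.0, 0.01, size=n_genes)

# Each gene is one training example; the fitted weights act as raw proportions.
model = NuSVR(kernel="linear", nu=0.5).fit(signature, bulk_sample)

# Negative weights are not valid proportions, so clip them and renormalise.
coef = np.maximum(model.coef_.ravel(), 0)
est_props = coef / coef.sum()
```

Comparing `est_props` with `true_props` gives a quick sanity check before running the slower loop on real samples.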

In [None]:
svr_preds_100 = svr_preds * 100

In [None]:
svr_preds_copy = svr_preds_100.copy()
svr_preds_copy.loc['B_cells'] = svr_preds_copy.loc[['B', 'B-naive']].sum()
svr_preds_copy.loc['CD4_T_cells'] = svr_preds_copy.loc[['CD4', 'CD4-naive']].sum()
svr_preds_copy.loc['CD8_T_cells'] = svr_preds_copy.loc[['CD8']].sum()
svr_preds_copy.loc['NK_cells'] = svr_preds_copy.loc[['NK']].sum()
svr_preds_copy.loc['Tregs'] = svr_preds_copy.loc[['Treg']].sum()
svr_preds_copy.loc['T_cells'] = svr_preds_copy.loc[['CD8_T_cells', 'CD4_T_cells', 'Tregs', 'T_undef']].sum()
svr_preds_copy.loc['Lymphocytes'] = svr_preds_copy.loc[['B_cells', 'T_cells', 'NK_cells']].sum()

In [None]:
svr_preds_copy

In [None]:
flatten(svr_preds_copy, cytof1)

In [None]:
plotter.corr_plot(svr_preds_copy, cytof1, pallete=cells_p)

In [None]:
title = 'GSE107572'
plotter.print_cell_whole(svr_preds_copy, cytof1, pallete = cells_p, title=title, stat=True)

In [None]:
plotter.bland_altman(svr_preds_copy, cytof1, pallete = cells_p)

## GSE1479433 <a class="anchor" id="4_2"></a>

In [None]:
scaler = StandardScaler()
  
# transform data
train  = scaler.fit_transform(signature)
test_data = scaler.fit_transform(bulk)
ind = bulk.columns

In [None]:
test_data

In [None]:
genes = bulk.index
ind = bulk.columns
Nus=[0.5]

SVRcoef = np.zeros((signature.shape[1], bulk.shape[1]))
Selcoef = np.zeros((bulk.shape[0], bulk.shape[1]))

for i in tqdm(range(bulk.shape[1])):
    sols = [NuSVR(kernel='linear', nu=nu).fit(train,test_data[:,i]) for nu in Nus]
    im_name = signature.columns
    RMSE = [mse(sol.predict(train), test_data[:,i]) for sol in sols]
    Selcoef[sols[np.argmin(RMSE)].support_, i] = 1
    SVRcoef[:,i] = np.maximum(sols[np.argmin(RMSE)].coef_,0)
    SVRcoef[:,i] = SVRcoef[:,i]/np.sum(SVRcoef[:,i])
svr_preds_2 = pd.DataFrame(SVRcoef,index=im_name, columns=ind)
svr_preds_2 = svr_preds_2.reindex(sorted(svr_preds_2.columns), axis=1)
svr_preds_2

In [None]:
svr_preds2_100 = svr_preds_2 * 100
svr_preds_copy2 = svr_preds2_100.copy()
svr_preds_copy2.loc['B_cells'] = svr_preds_copy2.loc[['B', 'B-naive']].sum()
svr_preds_copy2.loc['CD4_T_cells'] = svr_preds_copy2.loc[['CD4', 'CD4-naive']].sum()
svr_preds_copy2.loc['CD8_T_cells'] = svr_preds_copy2.loc[['CD8']].sum()
svr_preds_copy2.loc['NK_cells'] = svr_preds_copy2.loc[['NK']].sum()
svr_preds_copy2.loc['Tregs'] = svr_preds_copy2.loc[['Treg']].sum()
svr_preds_copy2.loc['T_cells'] = svr_preds_copy2.loc[['CD8_T_cells', 'CD4_T_cells', 'Tregs', 'T_undef']].sum()
svr_preds_copy2.loc['Lymphocytes'] = svr_preds_copy2.loc[['B_cells', 'T_cells', 'NK_cells']].sum()


# Methods to score which method performed the best

#### TODO
- Take the difference between true and predicted values for each cell type. The method with the lowest difference wins.
- Make a benchmarking plot that takes a list of dataframes as input, computes the RMSE and Pearson correlation for every sample, and forms a plot where each data point is the RMSE and correlation.
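
One possible shape for the second item, sketched under assumptions: each prediction frame and the reference frame are cell types × samples, and the helper name `benchmark_table` and its long-format output are hypothetical, not part of this repository. Each row of the result holds the RMSE and Pearson r for one (method, cell type) pair, which could then be drawn as a scatter plot.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr


def benchmark_table(df_list, truth, name_list):
    """Return one row per (method, cell type) with RMSE and Pearson r."""
    rows = []
    for name, df in zip(name_list, df_list):
        # Compare only the cell types and samples present in both frames.
        cells = df.index.intersection(truth.index)
        samples = df.columns.intersection(truth.columns)
        for cell in cells:
            pred = df.loc[cell, samples].astype(float).to_numpy()
            true = truth.loc[cell, samples].astype(float).to_numpy()
            rmse = float(np.sqrt(np.mean((pred - true) ** 2)))
            r = float(pearsonr(pred, true)[0])
            rows.append(
                {"method": name, "cell_type": cell, "rmse": rmse, "pearson_r": r}
            )
    return pd.DataFrame(rows)


# Toy data to exercise the helper (values are made up).
truth = pd.DataFrame(
    [[10.0, 20.0, 30.0], [60.0, 50.0, 40.0]],
    index=["B_cells", "T_cells"],
    columns=["s1", "s2", "s3"],
)
method_a = truth + 1.0  # constant offset from the reference
method_b = pd.DataFrame(  # same values with the sample order reversed
    truth.to_numpy()[:, ::-1], index=truth.index, columns=truth.columns
)
table = benchmark_table([method_a, method_b], truth, ["A", "B"])
```

A long table like this can be passed to `sns.scatterplot(data=table, x="rmse", y="pearson_r", hue="method")` to get one point per method and cell type.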

In [None]:
def benchmark_rmse(df_list, cytof, name_list):

    df_final = pd.DataFrame()
    length = 999
    for i, df in enumerate(df_list):
        ind_names = df.dropna().index.intersection(cytof.dropna().index)
        col_names = df.dropna().columns.intersection(cytof.dropna().columns)
        predicted_values = df.loc[ind_names, col_names]
        true_values = cytof.loc[ind_names, col_names]
        predicted_values = predicted_values.T
        true_values = true_values.T
        cells = true_values.columns
        stat = statsTest()

        temp2 = predicted_values.shape[1]
        if temp2 < length:
            length = temp2

        rmse_list = []
        for x, cell in enumerate(cells):
            val = stat.rmse(predicted_values[cell], true_values[cell])
            rmse_list.append(val)
        rmse_list = rmse_list[0:length]
        df_final[name_list[i]] = rmse_list
        if i == 0:
            index_name = predicted_values.columns
            df_final.index = index_name
    
    sns.lineplot(data=df_final)
    sns.scatterplot(data=df_final, legend=False)
    plt.xlabel("Cell Types")
    plt.ylabel("RMSE")
    plt.title("RMSE across different Methods at Cell type level")
            

def benchmark_correlation(df_list, cytof, name_list):
    df_final = pd.DataFrame()
    length = 999
    for i, df in enumerate(df_list):
        ind_names = df.index.intersection(cytof.index)
        col_names = df.columns.intersection(cytof.columns)
        predicted_values = df.loc[ind_names, col_names]
        true_values = cytof.loc[ind_names, col_names].astype(float)
        predicted_values = predicted_values.T
        true_values = true_values.T
        cells = true_values.columns
        stat = statsTest()

        temp2 = predicted_values.shape[1]
        if temp2 < length:
            length = temp2

        r_list = []
        for cell in cells:
            val = stat.correlation(predicted_values[cell], true_values[cell])
            r_list.append(val)
        r_list = r_list[0:length]
        df_final[name_list[i]] = r_list
        if i == 0:
            index_name = predicted_values.columns
            df_final.index = index_name
    
    display(df_final)
    
    #sns.lineplot(data=df_final)
    # sns.scatterplot(data=df_final, x='ind', y='cellanneal', legend=False)
    # sns.scatterplot(data=df_final, x='ind', y='Kassandra', legend=False)
    # sns.scatterplot(data=df_final, x='ind', y='SVR', legend=False)
    sns.lineplot(data=df_final)
    # sns.lineplot(data=df_final['cellanneal'])
    # sns.lineplot(data=df_final['Kassandra'], style='-')
    # sns.lineplot(data=df_final['SVR'], style= '.')
    #plt.ylim([-1, 1])
    plt.xlabel("Cell Types")
    plt.ylabel("Pearson Correlation (r)")
    plt.title("Pearson Correlation across different Methods at Cell type level")

def correlation_matrix(predicted, true):
    ind_names = predicted.index.intersection(true.index)
    col_names = predicted.columns.intersection(true.columns)
    predicted_values = predicted.loc[ind_names, col_names]
    true_values = true.loc[ind_names, col_names]
    true_values = true_values.add_suffix('_x')
    result = predicted_values.corrwith(true_values, axis = 0)
    display(result)

    

In [None]:
df_list = [ca_preds_copy, k_preds_100, svr_preds_copy]

In [None]:
names = ['cellanneal', 'Kassandra', 'SVR']

In [None]:
benchmark_rmse(df_list, cytof1, names)

In [None]:
benchmark_correlation(df_list, cytof1, names)

In [None]:
df_list2 = [ca_preds_copy2, k_preds2_100, svr_preds_copy2]

In [None]:
cytof2 = cytof2.fillna(0)