**Goal**: this notebook prepares the data for DEG (differentially expressed genes) analysis.

We want to build four different datasets:

1. RNA-seq data (PR/CR)
2. RNA-seq data (SD/PD)
3. Normalized RNA-seq data (PR/CR)
4. Normalized RNA-seq data (SD/PD)
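As a rough illustration of what "building a dataset per response group" means, the sketch below splits a toy genes-by-samples matrix into a PR/CR part and an SD/PD part. The names `expr`, `clinical`, `sample_id`, `bor` and `split_by_response` are hypothetical and the values are made up; the real column names are introduced later in the notebook.

```python
import pandas as pd

# Toy expression matrix: genes as rows, samples as columns (made-up values).
expr = pd.DataFrame(
    {"S1": [1.0, 5.0], "S2": [2.0, 6.0], "S3": [3.0, 7.0], "S4": [4.0, 8.0]},
    index=["gene_a", "gene_b"],
)

# Toy clinical table with one response label per sample.
clinical = pd.DataFrame(
    {"sample_id": ["S1", "S2", "S3", "S4"], "bor": ["PR", "PD", "CR", "SD"]}
)

def split_by_response(expr, clinical):
    """Return the (PR/CR, SD/PD) column subsets of the expression matrix."""
    response_ids = clinical.loc[clinical["bor"].isin(["PR", "CR"]), "sample_id"]
    resistance_ids = clinical.loc[clinical["bor"].isin(["SD", "PD"]), "sample_id"]
    return expr[response_ids.tolist()], expr[resistance_ids.tolist()]

expr_pr_cr, expr_sd_pd = split_by_response(expr, clinical)
print(expr_pr_cr.columns.tolist())
print(expr_sd_pd.columns.tolist())
```

The same split can be applied to raw counts and to normalized values, which would give the four datasets listed above.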

The aim of the downstream DEG analysis is to identify the genes that are differentially expressed with respect to treatment response. This information is stored in the variable **Harmonized_Confirmed_BOR** (best overall response), which has four levels of interest:

1. **CR**: Complete Response
2. **PR**: Partial Response
3. **SD**: Stable Disease
4. **PD**: Progressive Disease

A fifth level, **NE** (Not Evaluable), is excluded from the analysis, as are samples with a missing value.

### Import Utils and Setup

In [None]:
## Communication drive-colab
from google.colab import drive
import warnings
import os

## Data Structure and Data Analysis
import pandas as pd
import numpy as np

In [None]:
## Mount drive
drive.mount("/content/drive", force_remount = True)

Mounted at /content/drive


In [None]:
## Toggle as needed
warnings.filterwarnings("ignore")

## To see the maximum number of columns
pd.set_option("display.max_columns", None)

## Source path
source_path_data = "/content/drive/MyDrive/Tesi/Code/Personal_Code/Saved/Data/"

## Save path
save_path_data   = "/content/drive/MyDrive/Tesi/Code/Personal_Code/Saved/Data/"

### Read Clinical Data (Shape: n = 152, p = 38)

We first work on the clinical data to extract the sample IDs of the patients in each of the four response categories, and then use these IDs to filter the RNA-seq data.

In [None]:
## Load Data
master_annotations_df_hq = pd.read_csv(source_path_data + "master_annotations_df_hq.csv", sep = ",", usecols = lambda x: x != "Unnamed: 0")

In [None]:
## Check
print("Shape is: ", master_annotations_df_hq.shape)
display(master_annotations_df_hq.head())

Shape is:  (152, 38)


Unnamed: 0,WES_Cohort_1,WES_Cohort_2,WES_All,RNA_Cohort_1,RNA_Cohort_2,RNA_All,Institution,Harmonized_SU2C_Participant_ID_v2,Harmonized_SU2C_WES_Tumor_Sample_ID_v2,Harmonized_SU2C_WES_Normal_Sample_ID_v2,Harmonized_SU2C_RNA_Tumor_Sample_ID_v2,Pre-treatment_RNA_Sample_QC,Patient_Age_at_Diagnosis,Patient_Sex,Patient_Race,Patient_Smoking_Status,Patient_Smoking_Pack_Years_Harmonized,Histology_Harmonized,Histology_Detail,Initial_Stage,Initial_Stage_Substage,PDL1_TPS,PDL1_TPS_Description,Local_Antibody_Clone,Clinical_Driver,Sequencing_Platform,Advanced_Diagnosis_Date,Line_of_Therapy,Agent_PD1,Agent_PD1_Category,Prior_Platinum,Prior_TKI,Harmonized_PFS_Event,Harmonized_PFS_Days,Harmonized_Confirmed_BOR,Harmonized_BOR_RECIST,Harmonized_OS_Event,Harmonized_OS_Days
0,,,,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO10,,,SU2CLC-CLE-NIVO10-T1,Flag,55,F,2.0,1.0,30.0,Adeno,,4.0,,,,,,,-321.0,3.0,Nivolumab,PD(L)1,1.0,0.0,1.0,63.0,PD,,1.0,86.0
1,1.0,,1.0,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO18,SU2CLC-CLE-NIVO18-T1,SU2CLC-CLE-NIVO18-N1,SU2CLC-CLE-NIVO18-T1,Keep,68,F,0.0,0.0,0.0,Adeno,,2.0,A,,,,EGFR,,-533.0,4.0,Nivolumab,PD(L)1,1.0,1.0,1.0,50.0,PD,,1.0,161.0
2,1.0,,1.0,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO19,SU2CLC-CLE-NIVO19-T1,SU2CLC-CLE-NIVO19-N1,SU2CLC-CLE-NIVO19-T1,Keep,57,F,2.0,2.0,15.0,Adeno,,4.0,,,,,,,-35.0,1.0,Nivolumab,PD(L)1,0.0,0.0,1.0,297.0,PR,,1.0,297.0
3,,,,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO2,,,SU2CLC-CLE-NIVO2-T1,Keep,63,F,0.0,0.0,0.0,Adeno,,4.0,,,,,,,-262.0,4.0,Nivolumab,PD(L)1,1.0,1.0,1.0,68.0,PD,,1.0,123.0
4,,,,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO20,,,SU2CLC-CLE-NIVO20-T1,Keep,63,F,0.0,1.0,35.0,Adeno,,4.0,,,,,KRAS,,-301.0,2.0,Nivolumab,PD(L)1,1.0,0.0,1.0,273.0,PR,,1.0,273.0


Using two features, **Harmonized_SU2C_RNA_Tumor_Sample_ID_v2** and **Harmonized_Confirmed_BOR**, extract two subsets:

1. Harmonized_SU2C_RNA_Tumor_Sample_ID_v2 & Harmonized_Confirmed_BOR == PR/CR
2. Harmonized_SU2C_RNA_Tumor_Sample_ID_v2 & Harmonized_Confirmed_BOR == SD/PD

**PR/CR**: Response

**SD/PD**: Resistance

In [None]:
## Count the levels of BOR and check whether there are NaN values
print("Total observations number with NaN: ", master_annotations_df_hq["Harmonized_Confirmed_BOR"].isna().sum())
print("\n")

display(master_annotations_df_hq.groupby("Harmonized_Confirmed_BOR").size())

Total observations number with NaN:  13




Harmonized_Confirmed_BOR
CR     8
NE     3
PD    45
PR    44
SD    39
dtype: int64
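A possible shortcut, sketched on a toy series rather than on the real table: `value_counts(dropna=False)` reports the level counts and the number of missing values in a single call. The series `bor` below is hypothetical.

```python
import numpy as np
import pandas as pd

# Toy BOR column with two missing values and one NE (made-up data).
bor = pd.Series(["PR", "PD", np.nan, "NE", "SD", "PD", np.nan, "CR"], name="bor")

# dropna=False keeps NaN as its own row in the count table.
counts = bor.value_counts(dropna=False)
print(counts)

# The two quantities checked separately in the cell above.
n_missing = int(bor.isna().sum())
n_not_evaluable = int((bor == "NE").sum())
print(n_missing, n_not_evaluable)
```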

In [None]:
## Create a function that takes a dataframe as input and returns a subset
## Two column names are fixed inside the function for the extraction of the subset:
## Harmonized_SU2C_RNA_Tumor_Sample_ID_v2 and Harmonized_Confirmed_BOR

def make_subset(data, feature, value_1, value_2):
    """Input: data    ---> dataset
              feature ---> variable on which we want to filter
              value_1 ---> first value that we want to keep
              value_2 ---> second value that we want to keep
       Output: subset with only the sample ID and the BOR columns"""
    rna_patients = data[(data[feature] == value_1) | (data[feature] == value_2)]
    ## Check
    print("Shape is: ", rna_patients.shape)
    display(rna_patients.head())
    print("\n")

    ## Keep only the sample ID and the BOR
    rna_patients = rna_patients.loc[:, ["Harmonized_SU2C_RNA_Tumor_Sample_ID_v2", "Harmonized_Confirmed_BOR"]]
    ## Check
    print("Shape is: ", rna_patients.shape)
    display(rna_patients.head())

    return rna_patients
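A more general variant of the same idea, given here only as a sketch: filtering with `Series.isin` accepts any number of values, and the columns to keep are passed in instead of being fixed. The function name `make_subset_isin` and the toy column names are hypothetical and are not used elsewhere in this notebook.

```python
import pandas as pd

def make_subset_isin(data, feature, values, keep_cols):
    """Keep the rows where `feature` is one of `values`, and only `keep_cols`."""
    mask = data[feature].isin(values)
    # .copy() returns an independent dataframe that is safe to modify later.
    return data.loc[mask, keep_cols].copy()

# Toy clinical table (made-up values).
toy = pd.DataFrame(
    {
        "sample_id": ["S1", "S2", "S3", "S4"],
        "bor": ["PR", "PD", "CR", "NE"],
        "age": [55, 68, 57, 63],
    }
)

subset = make_subset_isin(toy, "bor", ["PR", "CR"], ["sample_id", "bor"])
print(subset)
```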

In [None]:
## Extract the first one (PR/CR)
rna_patients_pr_cr = make_subset(master_annotations_df_hq, "Harmonized_Confirmed_BOR", "PR", "CR")
## Class: Response

Shape is:  (52, 38)


Unnamed: 0,WES_Cohort_1,WES_Cohort_2,WES_All,RNA_Cohort_1,RNA_Cohort_2,RNA_All,Institution,Harmonized_SU2C_Participant_ID_v2,Harmonized_SU2C_WES_Tumor_Sample_ID_v2,Harmonized_SU2C_WES_Normal_Sample_ID_v2,Harmonized_SU2C_RNA_Tumor_Sample_ID_v2,Pre-treatment_RNA_Sample_QC,Patient_Age_at_Diagnosis,Patient_Sex,Patient_Race,Patient_Smoking_Status,Patient_Smoking_Pack_Years_Harmonized,Histology_Harmonized,Histology_Detail,Initial_Stage,Initial_Stage_Substage,PDL1_TPS,PDL1_TPS_Description,Local_Antibody_Clone,Clinical_Driver,Sequencing_Platform,Advanced_Diagnosis_Date,Line_of_Therapy,Agent_PD1,Agent_PD1_Category,Prior_Platinum,Prior_TKI,Harmonized_PFS_Event,Harmonized_PFS_Days,Harmonized_Confirmed_BOR,Harmonized_BOR_RECIST,Harmonized_OS_Event,Harmonized_OS_Days
2,1.0,,1.0,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO19,SU2CLC-CLE-NIVO19-T1,SU2CLC-CLE-NIVO19-N1,SU2CLC-CLE-NIVO19-T1,Keep,57,F,2.0,2.0,15.0,Adeno,,4.0,,,,,,,-35.0,1.0,Nivolumab,PD(L)1,0.0,0.0,1.0,297.0,PR,,1.0,297.0
4,,,,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO20,,,SU2CLC-CLE-NIVO20-T1,Keep,63,F,0.0,1.0,35.0,Adeno,,4.0,,,,,KRAS,,-301.0,2.0,Nivolumab,PD(L)1,1.0,0.0,1.0,273.0,PR,,1.0,273.0
7,1.0,,1.0,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO3,SU2CLC-CLE-NIVO3-T1,SU2CLC-CLE-NIVO3-N1,SU2CLC-CLE-NIVO3-T1,Keep,57,F,0.0,0.0,0.0,Adeno,,4.0,,,,,ALK,,-497.0,4.0,Nivolumab,PD(L)1,1.0,1.0,1.0,392.0,PR,,1.0,636.0
10,,1.0,1.0,,1.0,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO5,SU2CLC-CLE-NIVO5-T1,SU2CLC-CLE-NIVO5-N1,SU2CLC-CLE-NIVO5-T1,Keep,61,M,0.0,1.0,60.0,Adeno,,4.0,,,,,,,-686.0,2.0,Nivolumab,PD(L)1,1.0,0.0,1.0,169.0,PR,,1.0,547.0
11,,,,,1.0,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO52,,,SU2CLC-CLE-NIVO52-T1,Flag,64,M,0.0,1.0,75.0,Adeno,,4.0,,,,,EGFR,,-1054.0,6.0,Nivolumab,PD(L)1,1.0,1.0,1.0,622.0,PR,,1.0,911.0




Shape is:  (52, 2)


Unnamed: 0,Harmonized_SU2C_RNA_Tumor_Sample_ID_v2,Harmonized_Confirmed_BOR
2,SU2CLC-CLE-NIVO19-T1,PR
4,SU2CLC-CLE-NIVO20-T1,PR
7,SU2CLC-CLE-NIVO3-T1,PR
10,SU2CLC-CLE-NIVO5-T1,PR
11,SU2CLC-CLE-NIVO52-T1,PR


In [None]:
## Extract the second one (SD/PD)
rna_patients_sd_pd = make_subset(master_annotations_df_hq, "Harmonized_Confirmed_BOR", "SD", "PD")
## Class: Resistance

Shape is:  (84, 38)


Unnamed: 0,WES_Cohort_1,WES_Cohort_2,WES_All,RNA_Cohort_1,RNA_Cohort_2,RNA_All,Institution,Harmonized_SU2C_Participant_ID_v2,Harmonized_SU2C_WES_Tumor_Sample_ID_v2,Harmonized_SU2C_WES_Normal_Sample_ID_v2,Harmonized_SU2C_RNA_Tumor_Sample_ID_v2,Pre-treatment_RNA_Sample_QC,Patient_Age_at_Diagnosis,Patient_Sex,Patient_Race,Patient_Smoking_Status,Patient_Smoking_Pack_Years_Harmonized,Histology_Harmonized,Histology_Detail,Initial_Stage,Initial_Stage_Substage,PDL1_TPS,PDL1_TPS_Description,Local_Antibody_Clone,Clinical_Driver,Sequencing_Platform,Advanced_Diagnosis_Date,Line_of_Therapy,Agent_PD1,Agent_PD1_Category,Prior_Platinum,Prior_TKI,Harmonized_PFS_Event,Harmonized_PFS_Days,Harmonized_Confirmed_BOR,Harmonized_BOR_RECIST,Harmonized_OS_Event,Harmonized_OS_Days
0,,,,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO10,,,SU2CLC-CLE-NIVO10-T1,Flag,55,F,2.0,1.0,30.0,Adeno,,4.0,,,,,,,-321.0,3.0,Nivolumab,PD(L)1,1.0,0.0,1.0,63.0,PD,,1.0,86.0
1,1.0,,1.0,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO18,SU2CLC-CLE-NIVO18-T1,SU2CLC-CLE-NIVO18-N1,SU2CLC-CLE-NIVO18-T1,Keep,68,F,0.0,0.0,0.0,Adeno,,2.0,A,,,,EGFR,,-533.0,4.0,Nivolumab,PD(L)1,1.0,1.0,1.0,50.0,PD,,1.0,161.0
3,,,,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO2,,,SU2CLC-CLE-NIVO2-T1,Keep,63,F,0.0,0.0,0.0,Adeno,,4.0,,,,,,,-262.0,4.0,Nivolumab,PD(L)1,1.0,1.0,1.0,68.0,PD,,1.0,123.0
6,,,,,1.0,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO24,,,SU2CLC-CLE-NIVO24-T1,Flag,73,F,0.0,1.0,82.0,Squamous,,3.0,A,,,,,,-11.0,2.0,Nivolumab,PD(L)1,1.0,0.0,1.0,172.0,SD,,1.0,172.0
8,,,,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO31,,,SU2CLC-CLE-NIVO31-T1,Flag,83,M,0.0,1.0,4.0,Squamous,,1.0,,,,,,,-4.0,1.0,Nivolumab,PD(L)1,0.0,0.0,1.0,12.0,PD,,1.0,98.0




Shape is:  (84, 2)


Unnamed: 0,Harmonized_SU2C_RNA_Tumor_Sample_ID_v2,Harmonized_Confirmed_BOR
0,SU2CLC-CLE-NIVO10-T1,PD
1,SU2CLC-CLE-NIVO18-T1,PD
3,SU2CLC-CLE-NIVO2-T1,PD
6,SU2CLC-CLE-NIVO24-T1,SD
8,SU2CLC-CLE-NIVO31-T1,PD


In [None]:
## Check
print("Total observations number: ", master_annotations_df_hq.shape[0])
print("Total observations number where BOR is NE: ", master_annotations_df_hq[(master_annotations_df_hq["Harmonized_Confirmed_BOR"] == "NE")].shape[0])
print("Total observations number with NaN: ", master_annotations_df_hq["Harmonized_Confirmed_BOR"].isna().sum())
print("Total observations number where BOR is PR, CR, SD, PD: ", rna_patients_pr_cr.shape[0] + rna_patients_sd_pd.shape[0])

Total observations number:  152
Total observations number where BOR is NE:  3
Total observations number with NaN:  13
Total observations number where BOR is PR, CR, SD, PD:  136
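The four numbers printed above are consistent: 136 + 3 + 13 = 152. The same bookkeeping can be turned into an explicit assertion, sketched here on a toy series (the variable names are hypothetical):

```python
import numpy as np
import pandas as pd

# Toy BOR column: five evaluable samples, one NE, one missing (made-up data).
bor = pd.Series(["PR", "CR", "SD", "PD", "PD", "NE", np.nan])

kept = int(bor.isin(["PR", "CR", "SD", "PD"]).sum())
not_evaluable = int((bor == "NE").sum())
missing = int(bor.isna().sum())

# Every row must fall into exactly one of the three groups.
assert kept + not_evaluable + missing == len(bor)
print(kept, not_evaluable, missing)
```

If an unexpected level ever appeared in the column, the assertion would fail instead of the sample being dropped silently.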


In [None]:
## Consider only the samples where BOR is PR, CR, SD, PD
expression_BOR = ["PR", "CR", "SD", "PD"]

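## Keep only the rows whose BOR is in expression_BOR (this also drops NE and NaN)
## .copy() makes the subset independent, so that adding a column later does not trigger a SettingWithCopyWarning
master_annotations_df_hq_clean = master_annotations_df_hq[master_annotations_df_hq["Harmonized_Confirmed_BOR"].isin(expression_BOR)].copy()
## The result must have 136 rows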

## Check
print("Shape is: ", master_annotations_df_hq_clean.shape)

Shape is:  (136, 38)


Now we can modify the clinical table as follows:

- if Harmonized_Confirmed_BOR is **PR** or **CR**, the class is **Response**
- if Harmonized_Confirmed_BOR is **SD** or **PD**, the class is **Resistance**
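The relabelling rule can also be expressed with `Series.map` and a dictionary. The sketch below uses a toy dataframe with hypothetical column names; note that `map` returns NaN for any value missing from the dictionary, which makes unmapped levels easy to detect.

```python
import pandas as pd

# Mapping from the four BOR levels to the two classes.
bor_to_class = {
    "PR": "Response",
    "CR": "Response",
    "SD": "Resistance",
    "PD": "Resistance",
}

# Toy table (made-up values).
toy = pd.DataFrame({"bor": ["PR", "PD", "SD", "CR"]})
toy["bor_cat"] = toy["bor"].map(bor_to_class)

# Values not present in the dictionary would show up here as NaN.
unmapped = int(toy["bor_cat"].isna().sum())
print(toy)
print("Unmapped rows:", unmapped)
```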

In [None]:
## Modify Harmonized_Confirmed_BOR

## Define the mapping dictionary
substitutions = {"PR": "Response","CR":"Response", "SD":"Resistance", "PD":"Resistance"}
## Substitute
master_annotations_df_hq_clean["BOR_cat"] = [substitutions[x] if x in substitutions else x for x in master_annotations_df_hq_clean["Harmonized_Confirmed_BOR"]]

In [None]:
## Check
print("Shape is: ", master_annotations_df_hq_clean.shape)
display(master_annotations_df_hq_clean.head())

## The new variable must be:
## Response   (n = 52)
## Resistance (n = 84)

Shape is:  (136, 39)


Unnamed: 0,WES_Cohort_1,WES_Cohort_2,WES_All,RNA_Cohort_1,RNA_Cohort_2,RNA_All,Institution,Harmonized_SU2C_Participant_ID_v2,Harmonized_SU2C_WES_Tumor_Sample_ID_v2,Harmonized_SU2C_WES_Normal_Sample_ID_v2,Harmonized_SU2C_RNA_Tumor_Sample_ID_v2,Pre-treatment_RNA_Sample_QC,Patient_Age_at_Diagnosis,Patient_Sex,Patient_Race,Patient_Smoking_Status,Patient_Smoking_Pack_Years_Harmonized,Histology_Harmonized,Histology_Detail,Initial_Stage,Initial_Stage_Substage,PDL1_TPS,PDL1_TPS_Description,Local_Antibody_Clone,Clinical_Driver,Sequencing_Platform,Advanced_Diagnosis_Date,Line_of_Therapy,Agent_PD1,Agent_PD1_Category,Prior_Platinum,Prior_TKI,Harmonized_PFS_Event,Harmonized_PFS_Days,Harmonized_Confirmed_BOR,Harmonized_BOR_RECIST,Harmonized_OS_Event,Harmonized_OS_Days,BOR_cat
0,,,,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO10,,,SU2CLC-CLE-NIVO10-T1,Flag,55,F,2.0,1.0,30.0,Adeno,,4.0,,,,,,,-321.0,3.0,Nivolumab,PD(L)1,1.0,0.0,1.0,63.0,PD,,1.0,86.0,Resistance
1,1.0,,1.0,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO18,SU2CLC-CLE-NIVO18-T1,SU2CLC-CLE-NIVO18-N1,SU2CLC-CLE-NIVO18-T1,Keep,68,F,0.0,0.0,0.0,Adeno,,2.0,A,,,,EGFR,,-533.0,4.0,Nivolumab,PD(L)1,1.0,1.0,1.0,50.0,PD,,1.0,161.0,Resistance
2,1.0,,1.0,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO19,SU2CLC-CLE-NIVO19-T1,SU2CLC-CLE-NIVO19-N1,SU2CLC-CLE-NIVO19-T1,Keep,57,F,2.0,2.0,15.0,Adeno,,4.0,,,,,,,-35.0,1.0,Nivolumab,PD(L)1,0.0,0.0,1.0,297.0,PR,,1.0,297.0,Response
3,,,,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO2,,,SU2CLC-CLE-NIVO2-T1,Keep,63,F,0.0,0.0,0.0,Adeno,,4.0,,,,,,,-262.0,4.0,Nivolumab,PD(L)1,1.0,1.0,1.0,68.0,PD,,1.0,123.0,Resistance
4,,,,1.0,,1.0,Cleveland Clinic,SU2CLC-CLE-NIVO20,,,SU2CLC-CLE-NIVO20-T1,Keep,63,F,0.0,1.0,35.0,Adeno,,4.0,,,,,KRAS,,-301.0,2.0,Nivolumab,PD(L)1,1.0,0.0,1.0,273.0,PR,,1.0,273.0,Response


In [None]:
## Check
master_annotations_df_hq_clean.groupby("BOR_cat").size()

BOR_cat
Resistance    84
Response      52
dtype: int64

In [None]:
## Extract the RNA sample IDs of the selected patients
patients_id = master_annotations_df_hq_clean["Harmonized_SU2C_RNA_Tumor_Sample_ID_v2"]
## Check
print("Total Number with highest quality samples ", len(patients_id))

Total Number with highest quality samples  136


### Prepare RNA-seq Data

In [None]:
# ## Load Annotation
# rna_counts_hq = pd.read_csv(source_path_data + "rna_counts.csv").set_index("Name")
# ## Save description genes
# description_genes = rna_counts_hq["Description"]

# ## Check
# print("Shape is: ", rna_counts_hq.shape)
# display(rna_counts_hq.head())

Now, using patients_id, we can select only the samples we are interested in.
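Selecting columns with a list of IDs raises a `KeyError` if any ID is absent from the expression matrix, so it can help to compare the two sets first. This is a minimal sketch on toy data; `expr` and `wanted` are hypothetical stand-ins for the expression matrix and for `patients_id`.

```python
import pandas as pd

# Toy expression matrix with three samples (made-up values).
expr = pd.DataFrame({"S1": [1.0], "S2": [2.0], "S3": [3.0]}, index=["gene_a"])

# Toy list of wanted sample IDs; "S9" is deliberately absent from expr.
wanted = pd.Series(["S1", "S3", "S9"])

in_matrix = wanted.isin(expr.columns)
present = wanted[in_matrix]
missing = wanted[~in_matrix]
print("Missing from the expression matrix:", missing.tolist())

# Subset only on the IDs that are known to exist.
expr_subset = expr[present.tolist()]
print(expr_subset.shape)
```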


In [None]:
## Load CPM Data (Normalized)
rna_counts_hq_cpm = pd.read_csv(source_path_data + "cpm.csv")
## Rename column
rna_counts_hq_cpm = rna_counts_hq_cpm.rename(columns = ({"Unnamed: 0": "Name"}))
## Set index
rna_counts_hq_cpm = rna_counts_hq_cpm.set_index("Name")

## Check
print("Shape is: ", rna_counts_hq_cpm.shape)
display(rna_counts_hq_cpm.head())

Shape is:  (15924, 136)


Name,SU2CLC-CLE-NIVO10-T1,SU2CLC-CLE-NIVO18-T1,SU2CLC-CLE-NIVO19-T1,SU2CLC-CLE-NIVO2-T1,SU2CLC-CLE-NIVO20-T1,SU2CLC-CLE-NIVO24-T1,SU2CLC-CLE-NIVO3-T1,SU2CLC-CLE-NIVO31-T1,SU2CLC-CLE-NIVO5-T1,SU2CLC-CLE-NIVO52-T1,SU2CLC-CLE-NIVO9-T1,SU2CLC-COL-1001-T1,SU2CLC-COL-1005-T1,SU2CLC-COL-1007-T1,SU2CLC-COL-1008-T1,SU2CLC-COL-1010-T1,SU2CLC-COL-1016-T1,SU2CLC-COL-1017-T1,SU2CLC-COL-1018-T1,SU2CLC-COL-1020-T1,SU2CLC-COL-1021-T1,SU2CLC-COL-1022-T1,SU2CLC-COL-1023-T1,SU2CLC-COL-1025-T1,SU2CLC-COL-1026-T1,SU2CLC-COL-1027-T1,SU2CLC-COL-1029-T1,SU2CLC-COL-1031-T1,SU2CLC-COL-1032-T1,SU2CLC-COL-1033-T1,SU2CLC-COL-1034-T1,SU2CLC-COL-1035-T1,SU2CLC-COL-1036-T1,SU2CLC-COL-1037-T1,SU2CLC-COL-1038-T1,SU2CLC-COL-1039-T2,SU2CLC-COL-1041-T1,SU2CLC-COL-1043-T2,SU2CLC-COL-1044-T1,SU2CLC-DFC-1001-T1,SU2CLC-DFC-1002-T1,SU2CLC-DFC-1003-T1,SU2CLC-DFC-1004-T1,SU2CLC-DFC-1007-T1,SU2CLC-DFC-1012-T1,SU2CLC-DFC-1013-T1,SU2CLC-DFC-1015-T2,SU2CLC-DFC-1016-T1,SU2CLC-DFC-1017-T2,SU2CLC-DFC-1018-T1,SU2CLC-DFC-1019-T1,SU2CLC-DFC-1020-T1,SU2CLC-DFC-1534-T1,SU2CLC-DFC-1535-T1,SU2CLC-DFC-1536-T1,SU2CLC-DFC-1537-T1,SU2CLC-DFC-1538-T1,SU2CLC-DFC-1539-T1,SU2CLC-DFC-DF0032-T1,SU2CLC-DFC-DF0033-T1,SU2CLC-DFC-DF0047-T1,SU2CLC-DFC-DF0107-T1,SU2CLC-DFC-DF0108-T1,SU2CLC-DFC-DF0109-T1,SU2CLC-DFC-DF0112-T1,SU2CLC-DFC-DF0241-T1,SU2CLC-DFC-DF0499-T1,SU2CLC-DFC-DF0510-T1,SU2CLC-DFC-DF0512-T1,SU2CLC-DFC-DF0561-T1,SU2CLC-DFC-DF0668-T1,SU2CLC-DFC-DF0790-T1,SU2CLC-DFC-DF0840-T1,SU2CLC-MDA-1441-T1,SU2CLC-MDA-1442-T1,SU2CLC-MDA-1443-T1,SU2CLC-MDA-1444-T1,SU2CLC-MDA-1561-T1,SU2CLC-MDA-1562-T1,SU2CLC-MDA-1563-T1,SU2CLC-MDA-1564-T1,SU2CLC-MDA-1627-T1,SU2CLC-MDA-1628-T1,SU2CLC-MDA-1629-T1,SU2CLC-MDA-1630-T1,SU2CLC-MDA-1631-T1,SU2CLC-MGH-1044-T1,SU2CLC-MGH-1054-T2,SU2CLC-MGH-1055-T1,SU2CLC-MGH-1135-T2,SU2CLC-MGH-1148-T1,SU2CLC-MGH-1149-T1,SU2CLC-MGH-1150-T1,SU2CLC-MGH-1161-T2,SU2CLC-MGH-1163-T1,SU2CLC-MGH-1169-T1,SU2CLC-MGH-1387-T1,SU2CLC-MGH-1389-T1,SU2CLC-MGH-1409-T1,SU2CLC-MGH-1411-T1,SU2CLC-MGH-1412-T1,SU2CLC-MGH-1413-T1,SU2CLC-MGH-1414-T1,SU2CLC-MGH-1415-T1,SU2CLC-MGH-1416-T1,SU2CLC-MGH-1417-T1,SU2CLC-MGH-1418-T1,SU2CLC-MGH-1487-T1,SU2CLC-MGH-1488-T1,SU2CLC-MGH-1489-T1,SU2CLC-MGH-1490-T1,SU2CLC-MGH-1492-T1,SU2CLC-MGH-1493-T1,SU2CLC-MGH-1495-T1,SU2CLC-MGH-1498-T1,SU2CLC-MGH-1500-T1,SU2CLC-MGH-1501-T1,SU2CLC-MGH-1503-T1,SU2CLC-MGH-1565-T1,SU2CLC-MGH-1568-T1,SU2CLC-MGH-1572-T1,SU2CLC-MGH-1574-T1,SU2CLC-MGH-1575-T1,SU2CLC-MSK-1364-T1,SU2CLC-MSK-1365-T1,SU2CLC-MSK-A2009-T1,SU2CLC-MSK-A2013-T1,SU2CLC-MSK-A2014-T1,SU2CLC-MSK-A2060-T1,SU2CLC-MSK-A2075-T1,SU2CLC-UCD-1124-T1,SU2CLC-UCD-1137-T1,SU2CLC-UCD-1142-T1,SU2CLC-UCD-1143-T1,SU2CLC-UCD-1557-T1,SU2CLC-UCD-1560-T1
ENSG00000187634.11,4.431076,0.453236,1.155171,1.009496,1.767499,3.307169,0.360152,0.850265,5.292916,52.776752,2.684743,4.189089,7.803944,0.928295,0.553878,1.35086,1.682919,0.306656,0.0,7.350413,2.353301,4.706567,3.533557,0.432397,0.860571,4.574732,1.306453,0.495657,4.650064,17.301699,0.335699,7.226855,1.783137,1.114443,1.58655,0.926229,1.926702,3.64374,0.879785,7.798641,2.771499,17.313346,2.388004,5.376212,6.243532,3.387685,4.052297,0.651425,2.532179,32.106544,2.183914,1.458088,4.709117,6.184516,1.502689,5.528765,3.63904,3.89814,7.21049,3.907967,2.027742,2.882646,6.936726,5.329858,4.7363,0.930876,2.706959,0.427445,2.640967,0.092825,1.192969,0.480931,4.105426,8.984268,0.88175,4.555242,0.811472,5.054237,1.412726,2.761905,0.0,6.881449,7.022262,2.075339,18.49583,0.55786,8.206746,8.156931,0.0,5.334674,0.754773,6.509234,3.244957,1.480543,15.500587,1.39094,9.337178,3.920356,3.102597,0.99418,0.520856,5.57787,1.885142,0.966936,2.321603,3.670364,3.960003,0.027218,5.11121,0.543567,3.723049,1.090681,0.880358,7.600478,1.922739,0.875893,0.148757,1.208175,0.252068,2.935332,13.906862,4.539829,1.619512,13.348662,2.940581,3.946299,7.682567,1.270299,4.439976,2.985712,5.150212,2.661842,1.581718,2.516983,2.945791,0.872072
ENSG00000188976.10,25.584913,49.855911,51.597657,38.227745,105.780337,88.426113,53.704539,0.0,40.825648,34.120935,36.342253,48.361057,50.039426,24.739552,67.89074,69.577832,102.786623,62.350654,16.635854,32.127086,38.556317,38.393164,52.60791,51.695477,51.051552,30.32732,52.385391,17.372767,61.323629,46.605695,74.629457,39.108726,62.049038,54.283014,42.290065,75.870929,46.041244,33.718329,56.423081,25.937886,34.855127,77.221622,53.431587,39.530971,36.060203,71.90239,46.851839,41.066923,78.621074,79.724301,29.586835,63.55254,32.655485,94.995108,44.343232,26.419155,27.654637,78.085247,52.399899,41.987386,47.651945,43.077746,58.091516,56.337918,70.777123,63.445043,166.411429,54.452189,39.793555,44.494215,144.681455,54.996788,49.631097,38.931827,31.976931,64.294865,59.663512,40.874157,41.788435,107.150353,20.389659,28.47245,49.725347,65.614057,52.583473,78.024362,62.259033,40.735973,96.684696,49.978231,41.690086,49.419003,80.743658,51.308127,36.468435,73.957881,75.996927,81.839175,28.228684,30.819585,42.694378,34.748357,39.442981,43.129387,28.111425,39.45641,48.222419,87.460821,34.816494,20.228449,61.484533,47.141646,37.030753,27.971127,32.252813,42.117948,35.37222,45.567774,28.670977,57.89182,67.241338,57.504498,87.911023,75.225956,18.007987,134.963427,53.525806,129.900408,58.173232,61.630737,90.646653,41.486701,71.838019,31.44736,94.265307,72.963388
ENSG00000187961.13,3.186733,9.291329,10.847989,9.640137,51.676885,20.620109,10.63705,17.643009,8.423136,0.0,8.591178,9.134985,11.046617,0.973212,15.019859,8.618144,7.830249,8.022788,0.016838,2.890612,6.912822,6.35506,13.394351,6.720172,8.076813,8.498748,8.211993,10.619445,12.3146,10.590737,9.602161,16.898005,8.596162,11.890314,5.34228,20.121526,12.367346,2.783158,13.45796,6.120453,14.092369,23.553181,8.818202,11.815017,9.18256,13.018858,7.683428,7.247104,7.571833,30.271884,6.770133,19.022187,12.851966,53.847938,15.945197,7.161659,5.686,8.46978,16.760262,12.700893,17.134423,21.474095,22.865506,16.899907,24.693691,24.435505,29.820928,12.758157,12.443879,12.252927,25.535575,14.412416,16.501266,9.612538,4.075844,6.733169,8.90591,9.668209,9.521773,19.492399,0.0,0.206322,7.535838,15.861517,8.838817,5.027081,11.434992,6.041971,0.0,13.249942,8.959595,7.900643,16.275488,12.876557,11.084722,7.771529,27.530236,17.983412,6.254704,5.698923,7.765484,4.61209,5.534585,9.125461,6.386263,15.130887,6.539454,2.576647,9.921078,24.538157,10.283568,6.44021,12.347299,7.600478,6.234335,12.763015,7.469712,11.306228,3.896257,19.536158,23.709307,15.230393,20.446342,16.294573,12.99917,16.705999,12.238312,9.238535,10.216719,11.96302,18.062217,8.150305,17.278765,7.431565,13.320098,19.766974
ENSG00000187583.10,1.608541,1.586324,18.549132,0.698882,37.776552,1.174678,1.448984,11.053451,2.636973,0.35814,12.755803,6.150392,4.803462,3.703197,11.875789,3.163406,7.467954,16.261064,0.0,11.583095,7.648229,3.535898,14.478656,5.164743,11.501166,1.518338,6.430466,0.904573,21.526714,10.300359,15.70263,5.831709,9.049677,4.054115,3.40167,2.07603,6.95175,2.57259,7.168872,2.578981,6.834799,27.595891,8.370451,3.504025,2.969485,10.269432,3.972617,5.90354,4.310881,12.425649,3.03668,30.133826,2.873122,9.940246,11.353648,1.082679,1.230244,2.275616,9.260801,6.722634,3.629659,12.21076,2.25515,13.287937,5.385631,7.359741,15.043593,9.52696,5.371459,6.312114,16.837472,4.188754,11.982115,7.979035,4.975589,1.17332,9.636235,4.719636,1.780035,13.014213,0.0,1.456391,4.322175,21.902235,11.569977,6.016016,3.828333,3.456418,0.0,8.601981,13.283998,2.558913,0.00845,6.433065,4.498475,2.305787,29.744206,19.99242,1.023197,4.548179,6.976309,0.847521,2.150996,7.144585,4.620955,8.333224,2.736883,8.791446,4.728738,6.63928,8.476263,2.0948,5.973062,2.923787,5.658161,2.777833,6.130901,15.518513,5.631928,10.298206,19.192157,4.198121,13.053568,28.222885,7.826533,0.0,10.658091,7.044383,8.903822,8.17036,31.747484,3.257587,17.058526,2.775645,14.536837,9.15676
ENSG00000187642.9,0.0,0.0,0.358501,0.199681,2.995761,0.072288,0.879441,0.0,0.663986,0.0,1.191764,0.373074,0.05382,0.0,0.089598,0.752378,1.028451,0.638176,0.0,0.020647,0.294163,0.203075,0.91847,0.510469,1.36257,0.276061,0.169669,0.0,0.814128,0.927597,1.088129,0.206482,0.103071,6.906035,1.447615,0.015969,0.347154,0.082396,1.464017,0.234453,0.587182,0.70308,0.398001,1.720388,0.548213,2.0948,0.022766,1.00428,1.025224,1.584479,0.447182,1.089376,0.112122,3.021171,0.556551,0.088744,0.351498,1.336797,1.871425,0.13957,0.993594,1.085041,0.0,7.165208,2.482738,0.639978,0.088753,1.180908,0.604289,0.804485,2.068819,0.465417,0.525113,1.151829,0.575837,0.30675,3.834207,0.704423,0.452072,1.590626,0.0,1.650577,1.301737,2.482995,1.359818,2.434299,0.666763,2.812735,0.0,0.679484,6.34009,0.607742,0.0,1.449264,0.322929,1.171564,3.513474,1.674173,0.099019,0.422722,0.205186,0.275937,0.193348,0.38946,0.778813,0.955044,0.520734,0.027218,0.475192,0.0,0.524119,1.627365,0.278594,1.376303,2.104007,0.012513,0.170008,2.481656,0.777811,0.461407,0.366875,0.260349,1.109666,1.025808,1.085312,0.460402,1.714709,0.148476,0.334192,1.149902,0.335566,0.551381,0.520565,0.338251,0.064039,0.290691
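For reference, plain counts per million (CPM) divide each raw count by the library size of its sample and multiply by one million. The sketch below applies that formula to a toy count matrix. It is an assumption that `cpm.csv` follows this plain definition: the notebook does not show how the file was produced, and some tools also apply normalization factors before computing CPM.

```python
import pandas as pd

# Toy raw count matrix: genes as rows, samples as columns (made-up values).
counts = pd.DataFrame(
    {"S1": [10, 30, 60], "S2": [5, 5, 10]},
    index=["gene_a", "gene_b", "gene_c"],
)

# Library size = total number of counts in each sample (column sums).
library_size = counts.sum(axis=0)

# CPM = count / library size * 1e6, computed column by column.
cpm = counts.div(library_size, axis=1) * 1e6
print(cpm)
```

With this definition every column of `cpm` sums to one million, which gives a quick sanity check on a matrix that has not been filtered by gene.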


In [None]:
## Subset the CPM matrix to the selected samples
rna_counts_hq_cts = rna_counts_hq_cpm[patients_id]

## Check
print("Shape is: ", rna_counts_hq_cts.shape)
display(rna_counts_hq_cts.head())

Shape is:  (15924, 136)


Name,SU2CLC-CLE-NIVO10-T1,SU2CLC-CLE-NIVO18-T1,SU2CLC-CLE-NIVO19-T1,SU2CLC-CLE-NIVO2-T1,SU2CLC-CLE-NIVO20-T1,SU2CLC-CLE-NIVO24-T1,SU2CLC-CLE-NIVO3-T1,SU2CLC-CLE-NIVO31-T1,SU2CLC-CLE-NIVO5-T1,SU2CLC-CLE-NIVO52-T1,SU2CLC-CLE-NIVO9-T1,SU2CLC-COL-1001-T1,SU2CLC-COL-1005-T1,SU2CLC-COL-1007-T1,SU2CLC-COL-1008-T1,SU2CLC-COL-1010-T1,SU2CLC-COL-1016-T1,SU2CLC-COL-1017-T1,SU2CLC-COL-1018-T1,SU2CLC-COL-1020-T1,SU2CLC-COL-1021-T1,SU2CLC-COL-1022-T1,SU2CLC-COL-1023-T1,SU2CLC-COL-1025-T1,SU2CLC-COL-1026-T1,SU2CLC-COL-1027-T1,SU2CLC-COL-1029-T1,SU2CLC-COL-1031-T1,SU2CLC-COL-1032-T1,SU2CLC-COL-1033-T1,SU2CLC-COL-1034-T1,SU2CLC-COL-1035-T1,SU2CLC-COL-1036-T1,SU2CLC-COL-1037-T1,SU2CLC-COL-1038-T1,SU2CLC-COL-1039-T2,SU2CLC-COL-1041-T1,SU2CLC-COL-1043-T2,SU2CLC-COL-1044-T1,SU2CLC-DFC-1001-T1,SU2CLC-DFC-1002-T1,SU2CLC-DFC-1003-T1,SU2CLC-DFC-1004-T1,SU2CLC-DFC-1007-T1,SU2CLC-DFC-1012-T1,SU2CLC-DFC-1013-T1,SU2CLC-DFC-1015-T2,SU2CLC-DFC-1016-T1,SU2CLC-DFC-1017-T2,SU2CLC-DFC-1018-T1,SU2CLC-DFC-1019-T1,SU2CLC-DFC-1020-T1,SU2CLC-DFC-1534-T1,SU2CLC-DFC-1535-T1,SU2CLC-DFC-1536-T1,SU2CLC-DFC-1537-T1,SU2CLC-DFC-1538-T1,SU2CLC-DFC-1539-T1,SU2CLC-DFC-DF0032-T1,SU2CLC-DFC-DF0033-T1,SU2CLC-DFC-DF0047-T1,SU2CLC-DFC-DF0107-T1,SU2CLC-DFC-DF0108-T1,SU2CLC-DFC-DF0109-T1,SU2CLC-DFC-DF0112-T1,SU2CLC-DFC-DF0241-T1,SU2CLC-DFC-DF0499-T1,SU2CLC-DFC-DF0510-T1,SU2CLC-DFC-DF0512-T1,SU2CLC-DFC-DF0561-T1,SU2CLC-DFC-DF0668-T1,SU2CLC-DFC-DF0790-T1,SU2CLC-DFC-DF0840-T1,SU2CLC-MDA-1441-T1,SU2CLC-MDA-1442-T1,SU2CLC-MDA-1443-T1,SU2CLC-MDA-1444-T1,SU2CLC-MDA-1561-T1,SU2CLC-MDA-1562-T1,SU2CLC-MDA-1563-T1,SU2CLC-MDA-1564-T1,SU2CLC-MDA-1627-T1,SU2CLC-MDA-1628-T1,SU2CLC-MDA-1629-T1,SU2CLC-MDA-1630-T1,SU2CLC-MDA-1631-T1,SU2CLC-MGH-1044-T1,SU2CLC-MGH-1054-T2,SU2CLC-MGH-1055-T1,SU2CLC-MGH-1135-T2,SU2CLC-MGH-1148-T1,SU2CLC-MGH-1149-T1,SU2CLC-MGH-1150-T1,SU2CLC-MGH-1161-T2,SU2CLC-MGH-1163-T1,SU2CLC-MGH-1169-T1,SU2CLC-MGH-1387-T1,SU2CLC-MGH-1389-T1,SU2CLC-MGH-1409-T1,SU2CLC-MGH-1411-T1,SU2CLC-MGH-1412-T1,SU2CLC-MGH-1413-T1,SU2CLC-MGH-1414-T1,SU2CLC-MGH-1415-T1,SU2CLC-MGH-1416-T1,SU2CLC-MGH-1417-T1,SU2CLC-MGH-1418-T1,SU2CLC-MGH-1487-T1,SU2CLC-MGH-1488-T1,SU2CLC-MGH-1489-T1,SU2CLC-MGH-1490-T1,SU2CLC-MGH-1492-T1,SU2CLC-MGH-1493-T1,SU2CLC-MGH-1495-T1,SU2CLC-MGH-1498-T1,SU2CLC-MGH-1500-T1,SU2CLC-MGH-1501-T1,SU2CLC-MGH-1503-T1,SU2CLC-MGH-1565-T1,SU2CLC-MGH-1568-T1,SU2CLC-MGH-1572-T1,SU2CLC-MGH-1574-T1,SU2CLC-MGH-1575-T1,SU2CLC-MSK-1364-T1,SU2CLC-MSK-1365-T1,SU2CLC-MSK-A2009-T1,SU2CLC-MSK-A2013-T1,SU2CLC-MSK-A2014-T1,SU2CLC-MSK-A2060-T1,SU2CLC-MSK-A2075-T1,SU2CLC-UCD-1124-T1,SU2CLC-UCD-1137-T1,SU2CLC-UCD-1142-T1,SU2CLC-UCD-1143-T1,SU2CLC-UCD-1557-T1,SU2CLC-UCD-1560-T1
ENSG00000187634.11,4.431076,0.453236,1.155171,1.009496,1.767499,3.307169,0.360152,0.850265,5.292916,52.776752,2.684743,4.189089,7.803944,0.928295,0.553878,1.35086,1.682919,0.306656,0.0,7.350413,2.353301,4.706567,3.533557,0.432397,0.860571,4.574732,1.306453,0.495657,4.650064,17.301699,0.335699,7.226855,1.783137,1.114443,1.58655,0.926229,1.926702,3.64374,0.879785,7.798641,2.771499,17.313346,2.388004,5.376212,6.243532,3.387685,4.052297,0.651425,2.532179,32.106544,2.183914,1.458088,4.709117,6.184516,1.502689,5.528765,3.63904,3.89814,7.21049,3.907967,2.027742,2.882646,6.936726,5.329858,4.7363,0.930876,2.706959,0.427445,2.640967,0.092825,1.192969,0.480931,4.105426,8.984268,0.88175,4.555242,0.811472,5.054237,1.412726,2.761905,0.0,6.881449,7.022262,2.075339,18.49583,0.55786,8.206746,8.156931,0.0,5.334674,0.754773,6.509234,3.244957,1.480543,15.500587,1.39094,9.337178,3.920356,3.102597,0.99418,0.520856,5.57787,1.885142,0.966936,2.321603,3.670364,3.960003,0.027218,5.11121,0.543567,3.723049,1.090681,0.880358,7.600478,1.922739,0.875893,0.148757,1.208175,0.252068,2.935332,13.906862,4.539829,1.619512,13.348662,2.940581,3.946299,7.682567,1.270299,4.439976,2.985712,5.150212,2.661842,1.581718,2.516983,2.945791,0.872072
ENSG00000188976.10,25.584913,49.855911,51.597657,38.227745,105.780337,88.426113,53.704539,0.0,40.825648,34.120935,36.342253,48.361057,50.039426,24.739552,67.89074,69.577832,102.786623,62.350654,16.635854,32.127086,38.556317,38.393164,52.60791,51.695477,51.051552,30.32732,52.385391,17.372767,61.323629,46.605695,74.629457,39.108726,62.049038,54.283014,42.290065,75.870929,46.041244,33.718329,56.423081,25.937886,34.855127,77.221622,53.431587,39.530971,36.060203,71.90239,46.851839,41.066923,78.621074,79.724301,29.586835,63.55254,32.655485,94.995108,44.343232,26.419155,27.654637,78.085247,52.399899,41.987386,47.651945,43.077746,58.091516,56.337918,70.777123,63.445043,166.411429,54.452189,39.793555,44.494215,144.681455,54.996788,49.631097,38.931827,31.976931,64.294865,59.663512,40.874157,41.788435,107.150353,20.389659,28.47245,49.725347,65.614057,52.583473,78.024362,62.259033,40.735973,96.684696,49.978231,41.690086,49.419003,80.743658,51.308127,36.468435,73.957881,75.996927,81.839175,28.228684,30.819585,42.694378,34.748357,39.442981,43.129387,28.111425,39.45641,48.222419,87.460821,34.816494,20.228449,61.484533,47.141646,37.030753,27.971127,32.252813,42.117948,35.37222,45.567774,28.670977,57.89182,67.241338,57.504498,87.911023,75.225956,18.007987,134.963427,53.525806,129.900408,58.173232,61.630737,90.646653,41.486701,71.838019,31.44736,94.265307,72.963388
ENSG00000187961.13,3.186733,9.291329,10.847989,9.640137,51.676885,20.620109,10.63705,17.643009,8.423136,0.0,8.591178,9.134985,11.046617,0.973212,15.019859,8.618144,7.830249,8.022788,0.016838,2.890612,6.912822,6.35506,13.394351,6.720172,8.076813,8.498748,8.211993,10.619445,12.3146,10.590737,9.602161,16.898005,8.596162,11.890314,5.34228,20.121526,12.367346,2.783158,13.45796,6.120453,14.092369,23.553181,8.818202,11.815017,9.18256,13.018858,7.683428,7.247104,7.571833,30.271884,6.770133,19.022187,12.851966,53.847938,15.945197,7.161659,5.686,8.46978,16.760262,12.700893,17.134423,21.474095,22.865506,16.899907,24.693691,24.435505,29.820928,12.758157,12.443879,12.252927,25.535575,14.412416,16.501266,9.612538,4.075844,6.733169,8.90591,9.668209,9.521773,19.492399,0.0,0.206322,7.535838,15.861517,8.838817,5.027081,11.434992,6.041971,0.0,13.249942,8.959595,7.900643,16.275488,12.876557,11.084722,7.771529,27.530236,17.983412,6.254704,5.698923,7.765484,4.61209,5.534585,9.125461,6.386263,15.130887,6.539454,2.576647,9.921078,24.538157,10.283568,6.44021,12.347299,7.600478,6.234335,12.763015,7.469712,11.306228,3.896257,19.536158,23.709307,15.230393,20.446342,16.294573,12.99917,16.705999,12.238312,9.238535,10.216719,11.96302,18.062217,8.150305,17.278765,7.431565,13.320098,19.766974
ENSG00000187583.10,1.608541,1.586324,18.549132,0.698882,37.776552,1.174678,1.448984,11.053451,2.636973,0.35814,12.755803,6.150392,4.803462,3.703197,11.875789,3.163406,7.467954,16.261064,0.0,11.583095,7.648229,3.535898,14.478656,5.164743,11.501166,1.518338,6.430466,0.904573,21.526714,10.300359,15.70263,5.831709,9.049677,4.054115,3.40167,2.07603,6.95175,2.57259,7.168872,2.578981,6.834799,27.595891,8.370451,3.504025,2.969485,10.269432,3.972617,5.90354,4.310881,12.425649,3.03668,30.133826,2.873122,9.940246,11.353648,1.082679,1.230244,2.275616,9.260801,6.722634,3.629659,12.21076,2.25515,13.287937,5.385631,7.359741,15.043593,9.52696,5.371459,6.312114,16.837472,4.188754,11.982115,7.979035,4.975589,1.17332,9.636235,4.719636,1.780035,13.014213,0.0,1.456391,4.322175,21.902235,11.569977,6.016016,3.828333,3.456418,0.0,8.601981,13.283998,2.558913,0.00845,6.433065,4.498475,2.305787,29.744206,19.99242,1.023197,4.548179,6.976309,0.847521,2.150996,7.144585,4.620955,8.333224,2.736883,8.791446,4.728738,6.63928,8.476263,2.0948,5.973062,2.923787,5.658161,2.777833,6.130901,15.518513,5.631928,10.298206,19.192157,4.198121,13.053568,28.222885,7.826533,0.0,10.658091,7.044383,8.903822,8.17036,31.747484,3.257587,17.058526,2.775645,14.536837,9.15676
ENSG00000187642.9,0.0,0.0,0.358501,0.199681,2.995761,0.072288,0.879441,0.0,0.663986,0.0,1.191764,0.373074,0.05382,0.0,0.089598,0.752378,1.028451,0.638176,0.0,0.020647,0.294163,0.203075,0.91847,0.510469,1.36257,0.276061,0.169669,0.0,0.814128,0.927597,1.088129,0.206482,0.103071,6.906035,1.447615,0.015969,0.347154,0.082396,1.464017,0.234453,0.587182,0.70308,0.398001,1.720388,0.548213,2.0948,0.022766,1.00428,1.025224,1.584479,0.447182,1.089376,0.112122,3.021171,0.556551,0.088744,0.351498,1.336797,1.871425,0.13957,0.993594,1.085041,0.0,7.165208,2.482738,0.639978,0.088753,1.180908,0.604289,0.804485,2.068819,0.465417,0.525113,1.151829,0.575837,0.30675,3.834207,0.704423,0.452072,1.590626,0.0,1.650577,1.301737,2.482995,1.359818,2.434299,0.666763,2.812735,0.0,0.679484,6.34009,0.607742,0.0,1.449264,0.322929,1.171564,3.513474,1.674173,0.099019,0.422722,0.205186,0.275937,0.193348,0.38946,0.778813,0.955044,0.520734,0.027218,0.475192,0.0,0.524119,1.627365,0.278594,1.376303,2.104007,0.012513,0.170008,2.481656,0.777811,0.461407,0.366875,0.260349,1.109666,1.025808,1.085312,0.460402,1.714709,0.148476,0.334192,1.149902,0.335566,0.551381,0.520565,0.338251,0.064039,0.290691


Now we can extract the two datasets:

- rna_counts_pr_cr_df: contains the RNA-seq data for the Response group (PR/CR)
- rna_counts_sd_pd_df: contains the RNA-seq data for the Resistance group (SD/PD)
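
As a minimal sketch of this extraction step, the toy example below selects expression columns by sample ID and first checks that every requested ID is actually a column. The frame `toy_counts`, the sample IDs, and the helper `select_samples` are hypothetical names used only to illustrate the pattern; they are not part of the notebook's data.

```python
import pandas as pd

# Hypothetical toy expression matrix: genes as rows, samples as columns.
toy_counts = pd.DataFrame(
    {"S1": [1.0, 2.0], "S2": [3.0, 4.0], "S3": [5.0, 6.0]},
    index=pd.Index(["geneA", "geneB"], name="Name"),
)

def select_samples(counts, sample_ids):
    """Return the columns of `counts` listed in `sample_ids`, in that order.

    Raises a KeyError naming any requested ID that is not a column.
    """
    missing = [s for s in sample_ids if s not in counts.columns]
    if missing:
        raise KeyError(f"Sample IDs not found in counts: {missing}")
    return counts[list(sample_ids)]

# Keep only the (hypothetical) samples belonging to one group.
toy_response = select_samples(toy_counts, ["S1", "S3"])
print(toy_response.shape)
```

Checking for missing IDs up front gives an error message that lists exactly which samples are absent, which can be easier to act on than the default indexing error.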

In [None]:
## Sample IDs of the patients in the Response group (PR/CR)
columns_to_select_pr_cr = rna_patients_pr_cr["Harmonized_SU2C_RNA_Tumor_Sample_ID_v2"].tolist()
## Keep only the columns (samples) of the Response group
rna_counts_pr_cr_df = rna_counts_hq_cts[columns_to_select_pr_cr]

## Check
print("Patients for Response group (PR/CR)")
print("\n")
print("Shape is: ", rna_counts_pr_cr_df.shape)
display(rna_counts_pr_cr_df.head())

Patients for Response group (PR/CR)


Shape is:  (15924, 52)


Unnamed: 0_level_0,SU2CLC-CLE-NIVO19-T1,SU2CLC-CLE-NIVO20-T1,SU2CLC-CLE-NIVO3-T1,SU2CLC-CLE-NIVO5-T1,SU2CLC-CLE-NIVO52-T1,SU2CLC-COL-1005-T1,SU2CLC-COL-1017-T1,SU2CLC-COL-1018-T1,SU2CLC-COL-1025-T1,SU2CLC-COL-1029-T1,SU2CLC-COL-1036-T1,SU2CLC-COL-1037-T1,SU2CLC-COL-1039-T2,SU2CLC-DFC-1012-T1,SU2CLC-DFC-1020-T1,SU2CLC-DFC-1535-T1,SU2CLC-DFC-1536-T1,SU2CLC-DFC-1537-T1,SU2CLC-DFC-DF0241-T1,SU2CLC-DFC-DF0790-T1,SU2CLC-MDA-1627-T1,SU2CLC-MDA-1631-T1,SU2CLC-MGH-1044-T1,SU2CLC-MGH-1054-T2,SU2CLC-MGH-1055-T1,SU2CLC-MGH-1161-T2,SU2CLC-MGH-1163-T1,SU2CLC-MGH-1387-T1,SU2CLC-MGH-1412-T1,SU2CLC-MGH-1413-T1,SU2CLC-MGH-1414-T1,SU2CLC-MGH-1415-T1,SU2CLC-MGH-1416-T1,SU2CLC-MGH-1417-T1,SU2CLC-MGH-1418-T1,SU2CLC-MGH-1487-T1,SU2CLC-MGH-1488-T1,SU2CLC-MGH-1489-T1,SU2CLC-MGH-1490-T1,SU2CLC-MGH-1498-T1,SU2CLC-MGH-1503-T1,SU2CLC-MGH-1565-T1,SU2CLC-MGH-1574-T1,SU2CLC-MSK-1364-T1,SU2CLC-MSK-A2009-T1,SU2CLC-MSK-A2013-T1,SU2CLC-MSK-A2014-T1,SU2CLC-MSK-A2060-T1,SU2CLC-MSK-A2075-T1,SU2CLC-UCD-1124-T1,SU2CLC-UCD-1137-T1,SU2CLC-UCD-1142-T1
Name
ENSG00000187634.11,1.155171,1.767499,0.360152,5.292916,52.776752,7.803944,0.306656,0.0,0.432397,1.306453,1.783137,1.114443,0.926229,6.243532,1.458088,6.184516,1.502689,5.528765,0.930876,0.480931,6.881449,0.55786,8.206746,8.156931,0.0,1.480543,15.500587,9.337178,0.520856,5.57787,1.885142,0.966936,2.321603,3.670364,3.960003,0.027218,5.11121,0.543567,3.723049,1.922739,1.208175,0.252068,4.539829,13.348662,3.946299,7.682567,1.270299,4.439976,2.985712,5.150212,2.661842,1.581718
ENSG00000188976.10,51.597657,105.780337,53.704539,40.825648,34.120935,50.039426,62.350654,16.635854,51.695477,52.385391,62.049038,54.283014,75.870929,36.060203,63.55254,94.995108,44.343232,26.419155,63.445043,54.996788,28.47245,78.024362,62.259033,40.735973,96.684696,51.308127,36.468435,75.996927,42.694378,34.748357,39.442981,43.129387,28.111425,39.45641,48.222419,87.460821,34.816494,20.228449,61.484533,32.252813,45.567774,28.670977,57.504498,75.225956,134.963427,53.525806,129.900408,58.173232,61.630737,90.646653,41.486701,71.838019
ENSG00000187961.13,10.847989,51.676885,10.63705,8.423136,0.0,11.046617,8.022788,0.016838,6.720172,8.211993,8.596162,11.890314,20.121526,9.18256,19.022187,53.847938,15.945197,7.161659,24.435505,14.412416,0.206322,5.027081,11.434992,6.041971,0.0,12.876557,11.084722,27.530236,7.765484,4.61209,5.534585,9.125461,6.386263,15.130887,6.539454,2.576647,9.921078,24.538157,10.283568,6.234335,11.306228,3.896257,15.230393,16.294573,16.705999,12.238312,9.238535,10.216719,11.96302,18.062217,8.150305,17.278765
ENSG00000187583.10,18.549132,37.776552,1.448984,2.636973,0.35814,4.803462,16.261064,0.0,5.164743,6.430466,9.049677,4.054115,2.07603,2.969485,30.133826,9.940246,11.353648,1.082679,7.359741,4.188754,1.456391,6.016016,3.828333,3.456418,0.0,6.433065,4.498475,29.744206,6.976309,0.847521,2.150996,7.144585,4.620955,8.333224,2.736883,8.791446,4.728738,6.63928,8.476263,5.658161,15.518513,5.631928,4.198121,28.222885,0.0,10.658091,7.044383,8.903822,8.17036,31.747484,3.257587,17.058526
ENSG00000187642.9,0.358501,2.995761,0.879441,0.663986,0.0,0.05382,0.638176,0.0,0.510469,0.169669,0.103071,6.906035,0.015969,0.548213,1.089376,3.021171,0.556551,0.088744,0.639978,0.465417,1.650577,2.434299,0.666763,2.812735,0.0,1.449264,0.322929,3.513474,0.205186,0.275937,0.193348,0.38946,0.778813,0.955044,0.520734,0.027218,0.475192,0.0,0.524119,2.104007,2.481656,0.777811,0.260349,1.025808,0.460402,1.714709,0.148476,0.334192,1.149902,0.335566,0.551381,0.520565


In [None]:
## Sample IDs of the patients in the Resistance group (SD/PD)
columns_to_select_sd_pd = rna_patients_sd_pd["Harmonized_SU2C_RNA_Tumor_Sample_ID_v2"].tolist()
## Keep only the columns (samples) of the Resistance group
rna_counts_sd_pd_df = rna_counts_hq_cts[columns_to_select_sd_pd]

## Check
print("Patients for Resistance group (SD/PD)")
print("\n")
print("Shape is: ", rna_counts_sd_pd_df.shape)
display(rna_counts_sd_pd_df.head())

Patients for Resistance group (SD/PD)


Shape is:  (15924, 84)


Unnamed: 0_level_0,SU2CLC-CLE-NIVO10-T1,SU2CLC-CLE-NIVO18-T1,SU2CLC-CLE-NIVO2-T1,SU2CLC-CLE-NIVO24-T1,SU2CLC-CLE-NIVO31-T1,SU2CLC-CLE-NIVO9-T1,SU2CLC-COL-1001-T1,SU2CLC-COL-1007-T1,SU2CLC-COL-1008-T1,SU2CLC-COL-1010-T1,SU2CLC-COL-1016-T1,SU2CLC-COL-1020-T1,SU2CLC-COL-1021-T1,SU2CLC-COL-1022-T1,SU2CLC-COL-1023-T1,SU2CLC-COL-1026-T1,SU2CLC-COL-1027-T1,SU2CLC-COL-1031-T1,SU2CLC-COL-1032-T1,SU2CLC-COL-1033-T1,SU2CLC-COL-1034-T1,SU2CLC-COL-1035-T1,SU2CLC-COL-1038-T1,SU2CLC-COL-1041-T1,SU2CLC-COL-1043-T2,SU2CLC-COL-1044-T1,SU2CLC-DFC-1001-T1,SU2CLC-DFC-1002-T1,SU2CLC-DFC-1003-T1,SU2CLC-DFC-1004-T1,SU2CLC-DFC-1007-T1,SU2CLC-DFC-1013-T1,SU2CLC-DFC-1015-T2,SU2CLC-DFC-1016-T1,SU2CLC-DFC-1017-T2,SU2CLC-DFC-1018-T1,SU2CLC-DFC-1019-T1,SU2CLC-DFC-1534-T1,SU2CLC-DFC-1538-T1,SU2CLC-DFC-1539-T1,SU2CLC-DFC-DF0032-T1,SU2CLC-DFC-DF0033-T1,SU2CLC-DFC-DF0047-T1,SU2CLC-DFC-DF0107-T1,SU2CLC-DFC-DF0108-T1,SU2CLC-DFC-DF0109-T1,SU2CLC-DFC-DF0112-T1,SU2CLC-DFC-DF0499-T1,SU2CLC-DFC-DF0510-T1,SU2CLC-DFC-DF0512-T1,SU2CLC-DFC-DF0561-T1,SU2CLC-DFC-DF0668-T1,SU2CLC-DFC-DF0840-T1,SU2CLC-MDA-1441-T1,SU2CLC-MDA-1442-T1,SU2CLC-MDA-1443-T1,SU2CLC-MDA-1444-T1,SU2CLC-MDA-1561-T1,SU2CLC-MDA-1562-T1,SU2CLC-MDA-1563-T1,SU2CLC-MDA-1564-T1,SU2CLC-MDA-1628-T1,SU2CLC-MDA-1629-T1,SU2CLC-MDA-1630-T1,SU2CLC-MGH-1135-T2,SU2CLC-MGH-1148-T1,SU2CLC-MGH-1149-T1,SU2CLC-MGH-1150-T1,SU2CLC-MGH-1169-T1,SU2CLC-MGH-1389-T1,SU2CLC-MGH-1409-T1,SU2CLC-MGH-1411-T1,SU2CLC-MGH-1492-T1,SU2CLC-MGH-1493-T1,SU2CLC-MGH-1495-T1,SU2CLC-MGH-1500-T1,SU2CLC-MGH-1501-T1,SU2CLC-MGH-1568-T1,SU2CLC-MGH-1572-T1,SU2CLC-MGH-1575-T1,SU2CLC-MSK-1365-T1,SU2CLC-UCD-1143-T1,SU2CLC-UCD-1557-T1,SU2CLC-UCD-1560-T1
Name
ENSG00000187634.11,4.431076,0.453236,1.009496,3.307169,0.850265,2.684743,4.189089,0.928295,0.553878,1.35086,1.682919,7.350413,2.353301,4.706567,3.533557,0.860571,4.574732,0.495657,4.650064,17.301699,0.335699,7.226855,1.58655,1.926702,3.64374,0.879785,7.798641,2.771499,17.313346,2.388004,5.376212,3.387685,4.052297,0.651425,2.532179,32.106544,2.183914,4.709117,3.63904,3.89814,7.21049,3.907967,2.027742,2.882646,6.936726,5.329858,4.7363,2.706959,0.427445,2.640967,0.092825,1.192969,4.105426,8.984268,0.88175,4.555242,0.811472,5.054237,1.412726,2.761905,0.0,7.022262,2.075339,18.49583,5.334674,0.754773,6.509234,3.244957,1.39094,3.920356,3.102597,0.99418,1.090681,0.880358,7.600478,0.875893,0.148757,2.935332,13.906862,1.619512,2.940581,2.516983,2.945791,0.872072
ENSG00000188976.10,25.584913,49.855911,38.227745,88.426113,0.0,36.342253,48.361057,24.739552,67.89074,69.577832,102.786623,32.127086,38.556317,38.393164,52.60791,51.051552,30.32732,17.372767,61.323629,46.605695,74.629457,39.108726,42.290065,46.041244,33.718329,56.423081,25.937886,34.855127,77.221622,53.431587,39.530971,71.90239,46.851839,41.066923,78.621074,79.724301,29.586835,32.655485,27.654637,78.085247,52.399899,41.987386,47.651945,43.077746,58.091516,56.337918,70.777123,166.411429,54.452189,39.793555,44.494215,144.681455,49.631097,38.931827,31.976931,64.294865,59.663512,40.874157,41.788435,107.150353,20.389659,49.725347,65.614057,52.583473,49.978231,41.690086,49.419003,80.743658,73.957881,81.839175,28.228684,30.819585,47.141646,37.030753,27.971127,42.117948,35.37222,57.89182,67.241338,87.911023,18.007987,31.44736,94.265307,72.963388
ENSG00000187961.13,3.186733,9.291329,9.640137,20.620109,17.643009,8.591178,9.134985,0.973212,15.019859,8.618144,7.830249,2.890612,6.912822,6.35506,13.394351,8.076813,8.498748,10.619445,12.3146,10.590737,9.602161,16.898005,5.34228,12.367346,2.783158,13.45796,6.120453,14.092369,23.553181,8.818202,11.815017,13.018858,7.683428,7.247104,7.571833,30.271884,6.770133,12.851966,5.686,8.46978,16.760262,12.700893,17.134423,21.474095,22.865506,16.899907,24.693691,29.820928,12.758157,12.443879,12.252927,25.535575,16.501266,9.612538,4.075844,6.733169,8.90591,9.668209,9.521773,19.492399,0.0,7.535838,15.861517,8.838817,13.249942,8.959595,7.900643,16.275488,7.771529,17.983412,6.254704,5.698923,6.44021,12.347299,7.600478,12.763015,7.469712,19.536158,23.709307,20.446342,12.99917,7.431565,13.320098,19.766974
ENSG00000187583.10,1.608541,1.586324,0.698882,1.174678,11.053451,12.755803,6.150392,3.703197,11.875789,3.163406,7.467954,11.583095,7.648229,3.535898,14.478656,11.501166,1.518338,0.904573,21.526714,10.300359,15.70263,5.831709,3.40167,6.95175,2.57259,7.168872,2.578981,6.834799,27.595891,8.370451,3.504025,10.269432,3.972617,5.90354,4.310881,12.425649,3.03668,2.873122,1.230244,2.275616,9.260801,6.722634,3.629659,12.21076,2.25515,13.287937,5.385631,15.043593,9.52696,5.371459,6.312114,16.837472,11.982115,7.979035,4.975589,1.17332,9.636235,4.719636,1.780035,13.014213,0.0,4.322175,21.902235,11.569977,8.601981,13.283998,2.558913,0.00845,2.305787,19.99242,1.023197,4.548179,2.0948,5.973062,2.923787,2.777833,6.130901,10.298206,19.192157,13.053568,7.826533,2.775645,14.536837,9.15676
ENSG00000187642.9,0.0,0.0,0.199681,0.072288,0.0,1.191764,0.373074,0.0,0.089598,0.752378,1.028451,0.020647,0.294163,0.203075,0.91847,1.36257,0.276061,0.0,0.814128,0.927597,1.088129,0.206482,1.447615,0.347154,0.082396,1.464017,0.234453,0.587182,0.70308,0.398001,1.720388,2.0948,0.022766,1.00428,1.025224,1.584479,0.447182,0.112122,0.351498,1.336797,1.871425,0.13957,0.993594,1.085041,0.0,7.165208,2.482738,0.088753,1.180908,0.604289,0.804485,2.068819,0.525113,1.151829,0.575837,0.30675,3.834207,0.704423,0.452072,1.590626,0.0,1.301737,2.482995,1.359818,0.679484,6.34009,0.607742,0.0,1.171564,1.674173,0.099019,0.422722,1.627365,0.278594,1.376303,0.012513,0.170008,0.461407,0.366875,1.109666,1.085312,0.338251,0.064039,0.290691
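
The two shapes reported above give 52 + 84 = 136 samples, which should equal the number of columns in the full matrix if every sample belongs to exactly one group. A quick check of that property could look like the sketch below; `check_partition` and the toy frames are hypothetical and only stand in for `rna_counts_hq_cts` and the two extracted datasets.

```python
import pandas as pd

def check_partition(full, group_a, group_b):
    """Return True if the two groups share no columns and together cover `full`."""
    cols_a, cols_b = set(group_a.columns), set(group_b.columns)
    disjoint = cols_a.isdisjoint(cols_b)
    complete = (cols_a | cols_b) == set(full.columns)
    return disjoint and complete

# Hypothetical toy data standing in for the full matrix and the two groups.
toy_full = pd.DataFrame({"S1": [1.0], "S2": [2.0], "S3": [3.0]}, index=["geneA"])
toy_a = toy_full[["S1"]]
toy_b = toy_full[["S2", "S3"]]

print(check_partition(toy_full, toy_a, toy_b))
```

In the notebook, the same function could be called with the full matrix and the two group datasets before saving them.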


At this point, the two datasets are saved as the normalized data to be used for network analysis.

In [None]:
## Save dataset for Response group (PR/CR)
rna_counts_pr_cr_df.to_csv(save_path_data + "rna_counts_pr_cr_normalized.csv", index = True)

In [None]:
## Save dataset for Resistance group (SD/PD)
rna_counts_sd_pd_df.to_csv(save_path_data + "rna_counts_sd_pd_normalized.csv", index = True)

In [None]:
## Save the full dataset that will be the input of Limma
rna_counts_hq_cts.to_csv(save_path_data + "rna_counts_hq.csv")
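
When these files are read back, the gene identifiers written as the first CSV column need to be restored as the index. The sketch below shows one way to do that with `index_col=0`; it uses an in-memory buffer and a hypothetical toy frame (`toy_df`) in place of the real files on Drive.

```python
import io
import pandas as pd

# Hypothetical toy frame with the gene identifiers as the index.
toy_df = pd.DataFrame(
    {"S1": [1.5, 0.0], "S2": [2.25, 4.0]},
    index=pd.Index(["geneA", "geneB"], name="Name"),
)

# Write to an in-memory buffer instead of Drive so the example is self-contained.
buffer = io.StringIO()
toy_df.to_csv(buffer, index=True)
buffer.seek(0)

# index_col=0 turns the first CSV column back into the index.
reloaded = pd.read_csv(buffer, index_col=0)

print(reloaded.shape)
print(list(reloaded.index))
```

Without `index_col=0`, the gene identifiers would come back as an ordinary data column and the frame would gain one extra column.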