# Tutorial for [HDmics](https://github.com/guangxujin/HDmics)

This notebook explains the details of the high-dimensional model by showing the input data, the code for Pareto optimization, the output data, and figures. The contents include:
#### 1. Input
#### 2. Pareto Optimization
#### 3. Parallel computing and GPU-accelerated computing for a large number of single cells within CyTOF data
#### 4. Output
#### 5. Quantification of CD8Tex by the strength scores, $\mathbf{S}_{\wp(\mathbf{\Omega}) }$
#### 6. Association analyses between the quantitative values, $\boldsymbol{\theta}, \mathbf{d(E)},\mathbf{p(E)}$, and immunotherapy responses
## 1. Input

HDmics quantifies CD8 T cell exhaustion (CD8Tex) by Pareto optimization (PO). The PO model uses a $k$-dimensional expression space defined by multiple immune checkpoints (MICs) and related transcription factors (TFs).

\begin{equation*}
\label{PO}
 \textbf{PO:} \begin{cases} \underset{\mathbf{y}}{\text{max}} &\left[\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_k\right]\\ \text{subject to } &\mathbf{y} \in \mathbf{\Theta} \end{cases}
\end{equation*}

The $k$-dimensional hyperspace is described by an input file with $k$ columns, in which each column denotes one marker (an MIC or a TF) and each row is a single cell. See below:

In [20]:
!head ../Tests/input_example.csv

0.000000,4.647093,5.099177,0.000000,0.000000,24.346270
0.000000,27.792419,0.000000,0.000000,68.452293,15.114510
0.000000,21.579929,144.226105,0.000000,0.249643,66.083168
0.000000,15.682070,1.245843,0.000000,61.164871,1.243367
0.000000,3.272880,1.704744,0.000000,0.202038,7.481422
0.000000,4.291564,0.000000,0.000000,9.710076,15.477640
0.000000,32.142818,9.081074,0.000000,57.673870,9.885400
0.000000,20.178490,0.000000,0.000000,20.316780,52.126122
0.000000,4.451198,0.000000,0.000000,1.085103,40.683681
0.000000,0.794618,0.000000,1.730134,42.857761,5.975451


### The column markers are

In [19]:
!head ../Tests/Header_human.csv

TIM3
PD-1
CTLA-4
Lag3
EOMES
Tbet


In [93]:
import pandas as pd
import numpy as np
from display import *
table=read("../Tests/input_example.csv",',')
cols=read("../Tests/Header_human.csv",',')
table=np.array(table[0:10])
df = pd.DataFrame(table)
df.columns = np.array(cols)[:,0]
df.index.name= "single_cell_id"
df

single_cell_id,TIM3,PD-1,CTLA-4,Lag3,EOMES,Tbet
0,0.0,4.647093,5.099177,0.0,0.0,24.34627
1,0.0,27.792419,0.0,0.0,68.452293,15.11451
2,0.0,21.579929,144.226105,0.0,0.249643,66.083168
3,0.0,15.68207,1.245843,0.0,61.164871,1.243367
4,0.0,3.27288,1.704744,0.0,0.202038,7.481422
5,0.0,4.291564,0.0,0.0,9.710076,15.47764
6,0.0,32.142818,9.081074,0.0,57.67387,9.8854
7,0.0,20.17849,0.0,0.0,20.31678,52.126122
8,0.0,4.451198,0.0,0.0,1.085103,40.683681
9,0.0,0.794618,0.0,1.730134,42.857761,5.975451


## 2. Pareto Optimization

When the PO model is applied to the CyTOF data, the variable $\mathbf{y}$ denotes the expression levels of the markers, $\mathbf{E}$, across the single cells of interest, and $k$ is the number of MICs and TFs. The feasible region of the PO model on $\mathbf{\Omega}$, mapped from the decision space $\mathbf{X}$ to the objective space $\mathbf{Y}$, is denoted by $\mathbf{\Theta}$. $\mathbf{\Theta}$ contains one point for every candidate single cell of interest, and each point represents the expression levels of the markers in that single cell.


### Pareto Dominance


<img src="../Tests/img/POF_2D.jpg" style="width: 300px;" />

This example in $2$-dimensional space shows how the PO model treats single cells within the high-dimensional expression space. The single cells shown in red have high CD8Tex with respect to the 2 markers, whose expression levels are denoted $\mathbf{E_1}$ and $\mathbf{E_2}$.

The PO model uses Pareto dominance to determine which single cells tend to have high CD8Tex. In this example, single cell $\mathbf{a}$ dominates single cell $\mathbf{b}$. Pareto dominance between any two single cells is defined on the high-dimensional expression levels of all MICs and TFs. Single cell $\mathbf{a}$ dominates single cell $\mathbf{b}$, written $\mathbf{a} \succ \mathbf{b}$, if two conditions hold: the expression level of every marker in $\mathbf{a}$ is not lower than in $\mathbf{b}$, i.e., $\mathbf{E}_i (\mathbf{a}) \geq \mathbf{E}_i (\mathbf{b}), i=1,2,\dots,k$, and at least one marker is strictly higher in $\mathbf{a}$ than in $\mathbf{b}$, i.e., $\exists i \in \{1,2,\dots,k\}$ such that $\mathbf{E}_i (\mathbf{a})> \mathbf{E}_i (\mathbf{b})$.
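
The dominance test can be sketched in NumPy as follows. This is a minimal illustration of the definition, not the HDmics implementation; the function name `dominates` and the example cells are hypothetical.

```python
import numpy as np

def dominates(a, b):
    """Return True if cell a Pareto-dominates cell b (a ≻ b)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # a must be at least as high as b on every marker ...
    no_lower = np.all(a >= b)
    # ... and strictly higher on at least one marker.
    strictly_higher = np.any(a > b)
    return bool(no_lower and strictly_higher)

# Two hypothetical cells measured on k = 2 markers (E_1, E_2).
cell_a = [5.0, 3.0]
cell_b = [4.0, 3.0]
print(dominates(cell_a, cell_b))  # True
print(dominates(cell_b, cell_a))  # False
print(dominates(cell_a, cell_a))  # False: a cell never dominates itself
```

Two cells that are each higher on a different marker, such as `[1, 2]` and `[2, 1]`, do not dominate each other in either direction.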

<img src="../Tests/img/Fig.1b.jpg" style="width: 600px;" />

### Strength score and fitness score for Pareto front
To quantify the CD8Tex of a single cell, a strength score, $S(\mathbf{a})$, and a fitness score, $F(\mathbf{a})$, are derived from the Pareto dominance defined above.

\begin{equation*}
S(\mathbf{a})=|\{\mathbf{b}|\mathbf{a} \succ \mathbf{b}\}| 
\end{equation*}
and
\begin{equation*}
F(\mathbf{a})=\sum_{\mathbf{c} \succ \mathbf{a}} S(\mathbf{c})
\end{equation*}

$S(\mathbf{a})$ is the number of other single cells dominated by single cell $\mathbf{a}$, and $F(\mathbf{a})$ is the sum of the strength scores of the single cells that dominate $\mathbf{a}$. In this study, a high strength score and a low fitness score represent high CD8Tex. The Pareto front (POF) in our computational model is used to identify cells with high CD8Tex. The POF consists of the nondominated single cells (fitness score of zero) that also have a strength score higher than 0.
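
A small worked example may help to make the two scores concrete. The sketch below builds the full pairwise dominance matrix with NumPy broadcasting and is meant for a handful of cells only, since it needs memory quadratic in the cell number. The function name `strength_and_fitness` and the four cells are hypothetical and are not taken from HDmics.

```python
import numpy as np

def strength_and_fitness(E):
    """Compute S, F and the POF mask for every cell (row of E)."""
    E = np.asarray(E, dtype=float)
    # ge[a, b]: cell a is not lower than cell b on any marker.
    ge = np.all(E[:, None, :] >= E[None, :, :], axis=2)
    # gt[a, b]: cell a is strictly higher than cell b on some marker.
    gt = np.any(E[:, None, :] > E[None, :, :], axis=2)
    dom = ge & gt                    # dom[a, b] is True when a ≻ b
    S = dom.sum(axis=1)              # number of cells dominated by a
    F = dom.T.astype(int) @ S        # sum of S(c) over all c ≻ a
    front = (F == 0) & (S > 0)       # POF: nondominated and S > 0
    return S, F, front

# Four hypothetical cells in a 2-marker expression space.
E = np.array([[3.0, 3.0],
              [1.0, 1.0],
              [2.0, 0.0],
              [0.0, 4.0]])
S, F, front = strength_and_fitness(E)
print(S.tolist())      # [2, 0, 0, 0]
print(F.tolist())      # [0, 2, 2, 0]
print(front.tolist())  # [True, False, False, False]
```

The last cell is nondominated ($F=0$) but dominates no other cell ($S=0$), so under the definition above it is not counted as part of the POF.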

The following is an example of a POF from a $k$-dimensional expression space.

<img src="../Tests/img/Fig.1c.jpg" style="width: 600px;" />


## 3. Parallel computing and GPU-accelerated computing for a large number of single cells within CyTOF data

Pareto dominance requires a pairwise comparison of the $k$-dimensional expression levels of the $k$ markers between any two single cells. Single-cell mass cytometry (CyTOF) data include thousands or even millions of single cells, and the number of pairwise comparisons grows quadratically with the cell number, so the computation becomes time-consuming once the number of single cells exceeds 100,000. We use Python packages to accelerate the computation of Pareto dominance on HPC clusters and GPUs.

### HPC or desktop/laptop
The strategy for HPC is to use the Scoop and multiprocessing packages. The computation of the strength scores (S) and fitness scores (F) is distributed to multiple cores by `futures.map`.

In [107]:
from scoop import futures

In [None]:
S_Parallel = list(futures.map(S, single_cells))
F_Parallel = list(futures.map(F, single_cells))

#### *Run HDmics on HPC with the command:

In [None]:
HDmics -core HPC -core_number 96 -config config

Of note, the Scoop package is not needed on a desktop or laptop; HDmics automatically uses multiprocessing instead of Scoop.
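
The map-style distribution can be illustrated without Scoop. The sketch below is an assumption-laden stand-in and not the HDmics dispatch code: it uses `ThreadPool` from `multiprocessing.pool`, which has the same `map` interface as a process pool but runs inside a notebook without pickling the worker function. The helper `make_strength` and the four cells are hypothetical.

```python
import numpy as np
from multiprocessing.pool import ThreadPool

def make_strength(E):
    """Build a per-cell strength function S(i) over the expression matrix E."""
    def S(i):
        a = E[i]
        ge = np.all(a >= E, axis=1)   # a is not lower on any marker
        gt = np.any(a > E, axis=1)    # a is strictly higher on some marker
        return int(np.sum(ge & gt))   # number of cells dominated by cell i
    return S

# Four hypothetical cells in a 2-marker expression space.
E = np.array([[3.0, 3.0],
              [1.0, 1.0],
              [2.0, 0.0],
              [0.0, 4.0]])
S = make_strength(E)

# Each worker handles a subset of the cell indices; map preserves the order.
with ThreadPool(processes=2) as pool:
    S_parallel = pool.map(S, range(len(E)))
print(S_parallel)  # [2, 0, 0, 0]
```

For CPU-bound work on many cores, a process-based `multiprocessing.Pool` would be the natural replacement; in that case the worker has to be a module-level function so that it can be pickled.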

#### *Run HDmics on a desktop or laptop with the command:

In [None]:
HDmics -core_number 4 -config config

### GPU
The GPU strategy uses the Numba package to speed up the pairwise comparison for Pareto dominance. Numba enables array-oriented computation on GPUs.

In [9]:
from numba import vectorize
from numba import cuda

In [10]:
# Element-wise comparison compiled as a CUDA ufunc: 1.0 where a > b, else 0.0.
@vectorize(['float32(float32, float32)'], target='cuda')
def Pareto_dominace(a, b):
    return a > b

#### *Run HDmics on GPU with the command:

In [None]:
HDmics -core GPU -config config

*For the detailed code and configuration, please refer to [computation.ipynb](https://nbviewer.jupyter.org/github/guangxujin/HDmics/blob/master/pynb/computation.ipynb)

## 4. Output

The output of HDmics is a file with the strength scores and fitness scores of the single cells. The columns are single_cell_id, S, F, and the Pareto front annotation. Each row denotes a single cell.

In [42]:
!head ../Tests/output_example.txt

0	21.0	0.0	front
1	13.0	0.0	front
2	123.0	0.0	front
3	51.0	0.0	front
4	83.0	0.0	front
5	0.0	0.0	other
6	1.0	0.0	front
7	70.0	0.0	front
8	3.0	0.0	front
9	0.0	0.0	other


### The columns are

In [43]:
!head ../Tests/output_header.txt

single_cell_id
S (strength_score)
F (fitness_score)
Pareto_front

In [100]:
import pandas as pd
import numpy as np
from display import *
table=read("../Tests/output_example.txt",'\t')
cols=read("../Tests/output_header.txt",'\t')
table=np.array(table[0:10])
df = pd.DataFrame(table)
df.columns = np.array(cols)[:,0]

df

Unnamed: 0,single_cell_id,S (strength_score),F (fitness_score),Pareto_front
0,0,21.0,0.0,front
1,1,13.0,0.0,front
2,2,123.0,0.0,front
3,3,51.0,0.0,front
4,4,83.0,0.0,front
5,5,0.0,0.0,other
6,6,1.0,0.0,front
7,7,70.0,0.0,front
8,8,3.0,0.0,front
9,9,0.0,0.0,other
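
As a quick consistency check, the annotation column can be re-derived from S and F using the POF definition from Section 2 (fitness score of zero and strength score higher than 0). The sketch below is not part of HDmics; it parses a few of the rows shown above from an in-memory string with plain pandas, and the short column names are chosen here for illustration.

```python
import io
import pandas as pd

# A few rows of the example output shown above (tab-separated, no header).
raw = (
    "0\t21.0\t0.0\tfront\n"
    "1\t13.0\t0.0\tfront\n"
    "2\t123.0\t0.0\tfront\n"
    "5\t0.0\t0.0\tother\n"
    "9\t0.0\t0.0\tother\n"
)
out = pd.read_csv(
    io.StringIO(raw), sep="\t", header=None,
    names=["single_cell_id", "S", "F", "Pareto_front"],
)

# Re-derive the annotation from the POF definition: F == 0 and S > 0.
is_front = (out["F"] == 0) & (out["S"] > 0)
derived = is_front.map({True: "front", False: "other"})
print(bool((derived == out["Pareto_front"]).all()))  # True

# Strength scores of the cells in the POF.
pof = out[out["Pareto_front"] == "front"]
print(pof["S"].tolist())  # [21.0, 13.0, 123.0]
```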


## 5. Quantification of CD8Tex by the strength scores, $\mathbf{S}_{\wp(\mathbf{\Omega}) }$

The strength scores, $\mathbf{S}_{\wp(\mathbf{\Omega}) }$, derived from Pareto dominance, represent to what extent the single cells in the POF are exhausted in the context of all MICs and TFs. When the PO model is applied to the CyTOF data, $\mathbf{S}_{\wp(\mathbf{\Omega}) }$ is derived from a $6$-dimensional expression space defined by the markers. Since the strength score vector, $\mathbf{S}_{\wp(\mathbf{\Omega}) }$, considers multiple checkpoints and TFs, the exhaustion represented by $\mathbf{S}_{\wp(\mathbf{\Omega}) }$ is called overall CD8Tex.

To show the relationship between each marker and the overall CD8Tex, we defined three quantitative values at sample level for each marker. These three quantitative values were derived from two vectors in the 6$-$dimensional expression space. One is the strength score vector, $\mathbf{S}_{\wp(\mathbf{\Omega}) }$, and the other is the expression levels of a marker $i$, $\mathbf{E}_i$, of the single cells in the POF, $\wp(\mathbf{\Omega})$. 

### POF angle
The first quantitative value is the angle between these two vectors, $\theta_i$, which is termed the POF angle (Fig. 1c, 2a; Methods). The POF angle indicates how sensitive the marker is in regulating overall exhaustion. If the overall CD8Tex can be represented by an individual marker, the angle is close to zero degrees.

<img src="../Tests/img/Fig.2a.jpg" style="width: 600px;" />

We first calculate the POF angle, $\theta_i$, between the expression level vector of marker $i$, $\mathbf{E}_{i}$, and the strength score vector, $\mathbf{S}_{\wp(\mathbf{\Omega})}$.

\begin{equation}
\cos\theta_i=\frac{\mathbf{E}_{i} \cdot \mathbf{S}_{\wp(\mathbf{\Omega})}}{\| \mathbf{E}_{i}\| \times \|\mathbf{S}_{\wp(\mathbf{\Omega})}\|}
\end{equation}

where $\mathbf{E}_{i} \cdot \mathbf{S}_{\wp(\mathbf{\Omega})}$ is the inner product or dot product of the two vectors and $\| \cdot \|$ is the length of the vector. Thus, the POF angle, $\theta_i$, is calculated as follows.


\begin{equation}
\theta_i=\arccos  \left ( \frac{\mathbf{E}_{i} \cdot \mathbf{S}_{\wp(\mathbf{\Omega})}}{\| \mathbf{E}_{i}\| \times \|\mathbf{S}_{\wp(\mathbf{\Omega})}\|} \right )
\end{equation}
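
The angle formula translates directly into NumPy. The following is a minimal sketch under the assumption that both vectors are non-zero and list the same POF cells in the same order; the function name `pof_angle` and the example vectors are hypothetical.

```python
import numpy as np

def pof_angle(E_i, S):
    """Angle in degrees between marker expression E_i and strength scores S."""
    E_i = np.asarray(E_i, dtype=float)
    S = np.asarray(S, dtype=float)
    cos_theta = np.dot(E_i, S) / (np.linalg.norm(E_i) * np.linalg.norm(S))
    # Guard against rounding slightly outside [-1, 1] before arccos.
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

# Hypothetical strength scores of three POF cells.
S = [21.0, 13.0, 123.0]
# A marker whose expression is proportional to S gives an angle of about 0 degrees.
print(pof_angle([2.1, 1.3, 12.3], S))
# Orthogonal vectors give 90 degrees.
print(pof_angle([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
```

Because expression levels and strength scores are non-negative, the angle computed this way lies between 0 and 90 degrees.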


### POF expression
Next, another quantitative value, $d(\mathbf{E}_i)$, summarizes the single-cell expression levels at the sample level by normalizing the vector length of $\mathbf{E}_i$ by the cell number of the POF (Fig. 2a). We call this quantitative value, $d(\mathbf{E}_i)$, the POF expression of marker $i$ over the single cells within the POF. POF expression is used to describe CD8Tex by one marker at the sample level.

We compute the POF expression level, $d(\mathbf{E}_{i})$, by the length of expression level vector, $\mathbf{E}_{i}$. To reduce the impact of cell number of $\wp(\mathbf{\Omega})$ on the $d(\mathbf{E}_{i})$, we normalize the length, $\| \mathbf{E}_{i} \|$.

\begin{equation}
d(\mathbf{E}_{i}) = \sqrt{\frac{\sum_{p=1}^{N} \mathbf{E}_{i,p} ^2}{N}}=\frac{\| \mathbf{E}_{i} \|}{\sqrt{N}}
\end{equation}
where $N=| \mathbf{E}_{i} |$ denotes the cell number within the POF.
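
In code, the POF expression is the root mean square of the marker's expression over the POF cells. The sketch below assumes a non-empty vector; the function name `pof_expression` and the example values are hypothetical.

```python
import numpy as np

def pof_expression(E_i):
    """Vector length of E_i normalized by sqrt(N), N = number of POF cells."""
    E_i = np.asarray(E_i, dtype=float)
    N = E_i.size
    return float(np.linalg.norm(E_i) / np.sqrt(N))

# Two hypothetical POF cells: sqrt((3^2 + 4^2) / 2).
d_small = pof_expression([3.0, 4.0])
# The same expression pattern repeated over 100 cells gives the same value,
# which is the purpose of normalizing by the cell number.
d_large = pof_expression([3.0, 4.0] * 50)
print(d_small, d_large)
```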

### Contribution of a marker to CD8Tex ($\mathbf{S}_{\wp(\mathbf{\Omega}) }$)
Lastly, to evaluate how marker $i$ contributes to the overall CD8Tex represented by $\mathbf{S}_{\wp(\mathbf{\Omega}) }$, we define the third quantitative value from the POF angle and the POF expression of marker $i$. Projecting the POF expression value, $d(\mathbf{E}_i)$, of marker $i$ onto $\mathbf{S}_{\wp(\mathbf{\Omega}) }$ by the POF angle, $\theta_i$, defines the POF projected expression value, $p(\mathbf{E}_i)$, of this marker. The POF projected expression denotes the contribution of marker $i$ to the overall CD8Tex (Fig. 2a; Methods).

\begin{equation}
p(\mathbf{E}_{i}) = d(\mathbf{E}_{i})\cdot \cos\theta_i
\end{equation}
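
The projection can be checked against the TIM3 values reported for the large-tumor mouse in the two-mouse comparison of Section 6 (theta_TIM3 = 72.10, d_TIM3 = 0.38, p_TIM3 = 0.12). The sketch assumes that the reported angles are in degrees; the function name `pof_projected_expression` is hypothetical.

```python
import math

def pof_projected_expression(d_i, theta_i_deg):
    """Project the POF expression d(E_i) onto S using the POF angle in degrees."""
    return d_i * math.cos(math.radians(theta_i_deg))

# TIM3 of the large-tumor mouse: d = 0.38, theta = 72.10 degrees.
p_tim3 = pof_projected_expression(0.38, 72.10)
print(round(p_tim3, 2))  # 0.12
```

An angle near 90 degrees shrinks the contribution toward zero even when the POF expression itself is high.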

Thus, for all markers, $i = 1, 2, \ldots, k$, we set 

\begin{equation}
\boldsymbol{\theta}=\{\theta_1,\theta_2, \ldots, \theta_k\}
\end{equation}


\begin{equation}
\mathbf{d}(\mathbf{E})=\{d(\mathbf{E}_{1}), d(\mathbf{E}_{2}),\ldots, d(\mathbf{E}_{k})\}
\end{equation}


\begin{equation}
\mathbf{p}(\mathbf{E})=\{p(\mathbf{E}_{1}), p(\mathbf{E}_{2}),\ldots, p(\mathbf{E}_{k})\}
\end{equation}

where $\boldsymbol{\theta}, \mathbf{d},\mathbf{p}$ denote the application of the equations of POF angle, POF expression, and POF projected expression, to all samples, $\{1, 2, \ldots, s\}$, from a single-cell data set.


#### *Run quantification of CD8Tex:

In [None]:
HDmics -mode quantification -config config

#### Output: $\boldsymbol{\theta}, \mathbf{d(E)},\mathbf{p(E)}$. See the following examples.

## 6. Association analyses between the quantitative values, $\boldsymbol{\theta}, \mathbf{d(E)},\mathbf{p(E)}$, and immunotherapy responses

We associate the quantitative values defined by the strength scores with immunotherapy responses in terms of mouse tumor volumes and patient clinical outcomes. 

### Simple comparison between a mouse with a large tumor and another mouse with complete response after receiving anti-CTLA-4 treatment.

In [106]:
import pandas as pd
import numpy as np
from display import *
table=read("../Tests/comprision_two_mice.txt.4display.txt",'\t')
df = pd.DataFrame(table)
df.columns=["info","Large_tumor","Small_tumor"]
df1=df.set_index("info")
df1

info,Large_tumor,Small_tumor
sample_id,c09_C147-C166__CD8__MICs_TFs,c08_C147-C166__CD8__MICs_TFs
Cohort,1,1
Mouse ID,C156,C155
Barcode number,9,8
Treatment group,aCTLA-4 + GVAX,aCTLA-4 + GVAX
"Final tumor volume (Day 19, mm3)",454.9,0
Ratio of POF single cells,0.69,0.85
theta_TIM3,72.10,75.20
d_TIM3,0.38,0.05
p_TIM3,0.12,0.01


#### Comparison of POF angles, POF expression levels, and POF projected expression levels of the markers:

<img src="../Tests/img/Fig.2c.jpg" style="width: 1200px;" />

The POF expression levels, $d\mathbf{(E_i)}$, are rescaled between the two mice in order to show a consistent pattern among the markers. The strength score vectors of the two mice are merged into a single vector at 0 degrees.

#### Difference of strength score distributions between these two mice

<img src="../Tests/img/Fig.2b2.jpg" style="width: 400px;" />

#### Difference of expression levels of the markers between these two mice

<img src="../Tests/img/Fig.2d.jpg" style="width: 1200px;" />

#### These comparisons, enabled by the POF-derived quantitative values of the two mice, the strength scores of single cells, and the single-cell expression levels of the markers, suggest that high CD8 T cell exhaustion is associated with the mouse that had a large tumor of 454.9 mm$^3$ after anti-CTLA-4 immunotherapy.

### Strong positive correlation between POF-derived quantitative values for overall CD8Tex and ICI outcomes of melanoma mice

The melanoma mice received anti-PD-1 or anti-CTLA-4 immune checkpoint inhibition (ICI): 15 mice received anti-PD-1 treatment, 19 mice received anti-CTLA-4 treatment, and 18 control mice did not receive any ICI. These mice were also treated with a single dose of GVAX tumor vaccine to improve low baseline T cell tumor infiltration and the response to anti-CTLA-4 monotherapy. The ICI outcome is inferred from tumor volume.


In [87]:
import pandas as pd
import numpy as np
from display import *
table=read("../Tests/Supplementary_Table_4.txt",'\t')
table=np.array(table)
df = pd.DataFrame(table[1:len(table),0:len(table[0])])
df.columns = table[0,0:len(table[0,:])].tolist()
df1=df.set_index('sample_id')
df1

sample_id,Cohort,Mouse ID,Barcode number,Treatment group,"Final tumor volume (Day 19, mm3)",Ratio of POF single cells,theta_TIM3,d_TIM3,p_TIM3,theta_VISTA,...,p_TIGIT,theta_PD-1,d_PD-1,p_PD-1,theta_TBET,d_TBET,p_TBET,theta_EOMES,d_EOMES,p_EOMES
c01_2016-0722-BL6-GVAX__CD8__MICs_TFs,2,C240,1,Control + GVAX,339.7,0.77,80.14,0.47,0.08,87.36,...,0.93,79.13,2.38,0.45,72.89,2.03,0.6,75.85,0.49,0.12
c01_2016-0930-B16BL6-TIL__CD8__MICs_TFs,3,C380,1,Control + GVAX,584.2,0.85,81.1,0.33,0.05,86.76,...,0.44,80.62,1.17,0.19,71.2,1.2,0.39,73.14,0.46,0.13
c01_C147-C166__CD8__MICs_TFs,1,C147,1,Control + GVAX,759.7,0.68,70.68,0.33,0.11,84.99,...,1.65,77.47,3.61,0.78,65.8,1.83,0.75,70.14,0.91,0.31
c02_2016-0722-BL6-GVAX__CD8__MICs_TFs,2,C241,2,Control + GVAX,86.2,0.81,73.3,0.2,0.06,87.02,...,0.47,79.24,1.79,0.33,70.97,1.16,0.38,77.48,0.37,0.08
c02_2016-0930-B16BL6-TIL__CD8__MICs_TFs,3,C381,2,Control + GVAX,190.4,0.82,76.27,0.27,0.06,87.13,...,0.42,79.44,1.19,0.22,71.74,1.12,0.35,78.09,0.45,0.09
c02_C147-C166__CD8__MICs_TFs,1,C148,2,Control + GVAX,298.8,0.69,74.07,0.35,0.1,85.59,...,0.98,77.17,2.17,0.48,66.07,1.41,0.57,72.35,0.61,0.18
c03_2016-0722-BL6-GVAX__CD8__MICs_TFs,2,C242,3,Control + GVAX,593.2,0.92,87.44,1.11,0.05,87.34,...,0.83,82.92,0.78,0.1,68.31,0.89,0.33,66.98,0.43,0.17
c03_2016-0930-B16BL6-TIL__CD8__MICs_TFs,3,C382,3,Control + GVAX,339.2,0.78,79.42,0.42,0.08,87.49,...,0.6,80.24,1.92,0.33,72.41,1.66,0.5,76.98,0.69,0.16
c03_C147-C166__CD8__MICs_TFs,1,C149,3,Control + GVAX,261.0,0.68,70.98,0.39,0.13,84.94,...,1.06,76.68,2.97,0.68,67.85,2.06,0.78,72.25,0.83,0.25
c04_2016-0722-BL6-GVAX__CD8__MICs_TFs,2,C243,4,Control + GVAX,281.9,0.8,76.87,0.37,0.08,87.24,...,0.89,79.14,2.09,0.39,72.74,2.41,0.72,79.04,0.88,0.17


#### Correlation between the quantitative values and the tumor volumes of the 52 melanoma mice

<img src="../Tests/img/Fig.3.jpg" style="width: 1200px;" />
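
To illustrate how one quantitative value can be related to tumor volume, the sketch below computes a correlation coefficient for the ten control mice listed in the table above, using their final tumor volumes and p_EOMES values. The choice of the Pearson coefficient is an assumption made here for illustration, and ten control mice are only a subset of the 52 animals, so the result is not expected to reproduce the figure.

```python
import numpy as np

# Final tumor volume (Day 19, mm3) of the ten mice listed in the table above.
tumor_volume = np.array([339.7, 584.2, 759.7, 86.2, 190.4,
                         298.8, 593.2, 339.2, 261.0, 281.9])
# p_EOMES of the same ten mice, in the same row order.
p_eomes = np.array([0.12, 0.13, 0.31, 0.08, 0.09,
                    0.18, 0.17, 0.16, 0.25, 0.17])

# Pearson correlation coefficient (assumed statistic, see the note above).
r = np.corrcoef(tumor_volume, p_eomes)[0, 1]
print(f"Pearson r = {r:.2f}")
```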

#### Combinations of the markers and the correlation between the combined contributions, $\sum_{i \in comb}p\mathbf{(E_i)}$, and the tumor volumes

<img src="../Tests/img/Fig.3g.jpg" style="width: 1200px;" />
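
The combined contribution $\sum_{i \in comb}p\mathbf{(E_i)}$ is a plain sum over the markers in a combination. The sketch below enumerates all non-empty combinations for four of the markers, using the $p$ values of the first sample in the table above; restricting the example to four markers and the dictionary layout are choices made here for illustration, not the HDmics implementation.

```python
from itertools import combinations

# p(E_i) of the first sample in the table above, for four of its markers.
p = {"TIM3": 0.08, "PD-1": 0.45, "TBET": 0.60, "EOMES": 0.12}

# Sum the projected expression over every non-empty marker combination.
combined = {}
for size in range(1, len(p) + 1):
    for comb in combinations(p, size):
        combined[comb] = sum(p[m] for m in comb)

print(len(combined))                         # 15 combinations of 4 markers
print(round(combined[("PD-1", "TBET")], 2))  # 1.05
```

Each combined value can then be related to tumor volume across samples in the same way as a single marker's $p(\mathbf{E}_i)$.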


### Significant association between POF-derived quantitative values for overall CD8Tex and clinical ICI responses of the melanoma patients
The melanoma patients received Ipilimumab (anti-CTLA-4), Nivolumab or Pembrolizumab (anti-PD-1), or the combination of Ipilimumab+Nivolumab. The ICI outcomes are partial response (PR, 1 patient), stable disease (SD, 4 patients), and progressive disease (PD, 2 patients with 3 specimens each). This human CyTOF data set also includes blood samples from 3 healthy donors.


In [83]:
import pandas as pd
import numpy as np
from display import *
table=read("../Tests/Supplementary_Table_7.txt",'\t')
table=np.array(table)
df = pd.DataFrame(table[1:len(table),0:len(table[0])])
df.columns = table[0,0:len(table[0,:])].tolist()
df1 = df.set_index('sample_id')  # indexed copy; df itself keeps the default integer index
df

Unnamed: 0,sample_id,Clinical response,Ratio of POF single cells,theta_TIM3,d_TIM3,p_TIM3,theta_PD-1,d_PD-1,p_PD-1,theta_CTLA-4,...,p_CTLA-4,theta_LAG3,d_LAG3,p_LAG3,thera_EOMES,d_EOMES,p_EOMEs,theta_Tbet,d_Tbet,p_Tbet
0,export_export_120A5_a_CD19-_CD3+___CD8_MICs_TFs,PD,0.93,89.71,2.64,0.01,69.45,10.11,3.55,74.02,...,6.38,82.7,0.12,0.01,71.71,12.22,3.83,67.32,10.52,4.06
1,export_export_120A5_b_CD19-_CD3+___CD8_MICs_TFs,PD,0.91,87.38,0.27,0.01,70.59,10.75,3.57,74.16,...,6.59,82.78,0.12,0.02,72.9,12.61,3.71,68.48,10.87,3.99
2,export_export_120A5_c_CD19-_CD3+___CD8_MICs_TFs,PD,0.9,89.92,1.37,0.0,69.16,10.61,3.78,73.65,...,6.24,81.98,0.14,0.02,71.52,13.59,4.31,67.63,10.05,3.82
3,export_export_170_a_CD19-_CD3+___CD8_MICs_TFs,PD,0.89,87.57,0.35,0.01,73.11,1.45,0.42,69.68,...,13.72,82.09,0.16,0.02,67.7,24.38,9.25,64.23,31.7,13.78
4,export_export_170_b_CD19-_CD3+___CD8_MICs_TFs,PD,0.9,84.72,0.34,0.03,71.41,1.51,0.48,69.0,...,14.85,80.92,0.19,0.03,65.03,23.29,9.83,63.03,30.87,14.0
5,export_export_170_c_CD19-_CD3+___CD8_MICs_TFs,PD,0.89,89.86,2.04,0.0,70.7,1.52,0.5,69.0,...,15.08,81.01,0.19,0.03,65.23,23.95,10.03,63.23,31.58,14.22
6,export_export_193_CD19-_CD3+___CD8_MICs_TFs,PR,0.99,89.29,1.7,0.02,81.16,0.44,0.07,76.1,...,3.15,83.49,0.08,0.01,78.55,3.94,0.78,72.52,5.09,1.53
7,export_export_224C_CD19-_CD3+___CD8_MICs_TFs,SD,0.98,89.61,1.34,0.01,79.93,0.35,0.06,75.83,...,3.17,82.42,0.13,0.02,81.92,1.45,0.2,74.81,3.25,0.85
8,export_export_227_CD19-_CD3+___CD8_MICs_TFs,SD,0.67,88.3,16.15,0.48,74.32,0.78,0.21,72.32,...,1.16,82.68,0.08,0.01,72.48,3.11,0.94,67.18,2.82,1.09
9,export_export_251-3L_CD19-_CD3+___CD8_MICs_TFs,SD,0.77,80.99,0.26,0.04,75.6,0.59,0.15,63.23,...,14.45,83.82,0.11,0.01,66.77,8.25,3.25,60.45,9.47,4.67


#### Association of the quantitative values and the patient groups 

<img src="../Tests/img/Fig.4.jpg" style="width: 1200px;" />

#### Combinations of the markers and the association between the combined contributions, $\sum_{i \in comb}p\mathbf{(E_i)}$, and the patient groups

<img src="../Tests/img/Fig.4g.jpg" style="width: 1200px;" />

#### *For the detailed Python code for the computation and figures, please refer to [computation.ipynb](https://nbviewer.jupyter.org/github/guangxujin/HDmics/blob/master/pynb/computation.ipynb)

In [60]:
from IPython.display import display, HTML
display(HTML("""<a href="https://github.com/guangxujin/HDmics">Go back to Github/HDmics</a>"""))