# CIBERSORTx Tutorial
```
Andrew E. Davidson
aedavids@ucsc.edu
8/23/2022
```

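This notebook creates a small synthetic data set for learning how to use CIBERSORTx: a signature gene file with known expression profiles, and a mixture file whose composition is known by construction.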

In [1]:
from IPython.display import display
from itertools import combinations
import numpy as np
import pandas as pd

## 1) Create Signature Gene file

In [2]:
m =  6 # number of samples
n =  8 # number of genes
k =  3 # number of types

In [3]:
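# build one 0/1 expression profile per type; neighboring types share some genes
geneProfileList = []
for i in range(k):
    tmp = np.zeros(n)
    for j in range(i, 1 + k):
        tmp[i + j] = 1  # type i expresses genes at positions 2*i .. i+k
    geneProfileList.append(tmp)

# give the second type one extra marker gene (the last gene)
geneProfileList[1][n - 1] = 1
geneProfileList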

[array([1., 1., 1., 1., 0., 0., 0., 0.]),
 array([0., 0., 1., 1., 1., 0., 0., 1.]),
 array([0., 0., 0., 0., 1., 1., 0., 0.])]
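
The same profiles can also be built as a single matrix, which makes it easy to check that the three types are distinguishable. This is a minimal sketch, not part of the original notebook; the name `profiles` is illustrative, and the slice bounds are derived from the loop above.

```python
import numpy as np

n, k = 8, 3  # number of genes and number of types, as defined above

# rows are types, columns are genes; same 0/1 pattern as geneProfileList
profiles = np.zeros((k, n))
for i in range(k):
    # the inner loop above sets positions i + j for j in range(i, k + 1)
    profiles[i, 2 * i : i + k + 1] = 1
profiles[1, n - 1] = 1  # extra marker gene for the second type

# full row rank means no profile is a linear combination of the others
rank = np.linalg.matrix_rank(profiles)
print(profiles)
print("rank:", rank)
```

A rank equal to `k` suggests that each type carries information the others do not, which is what a deconvolution method needs in order to tell them apart.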

In [4]:
geneNames = ["G" + str(i + 1) for i in range(n)]
typeNames = ["T" + str(i + 1) for i in range(k)]

# the first column holds the gene names, followed by one column per type
geneProfileDict = {'name': geneNames}
signatueGeneDF = pd.DataFrame(geneProfileDict)

for typeName, geneProfile in zip(typeNames, geneProfileList):
    signatueGeneDF[typeName] = geneProfile

signatueGeneDF

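| | name | T1 | T2 | T3 |
|---|---|---|---|---|
| 0 | G1 | 1.0 | 0.0 | 0.0 |
| 1 | G2 | 1.0 | 0.0 | 0.0 |
| 2 | G3 | 1.0 | 1.0 | 0.0 |
| 3 | G4 | 1.0 | 1.0 | 0.0 |
| 4 | G5 | 0.0 | 1.0 | 1.0 |
| 5 | G6 | 0.0 | 0.0 | 1.0 |
| 6 | G7 | 0.0 | 0.0 | 0.0 |
| 7 | G8 | 0.0 | 1.0 | 0.0 |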


## Create Mixture File

In [5]:
mixtureDF = pd.DataFrame( {"sampleTitle": geneNames})
sampleNames = ["S" + str(i + 1) for i in range(m)]
print(sampleNames)

# create samples that are of a single tissue type
# column 0 of signatueGeneDF holds the gene names, so type i is column i + 1
for i in range(k):
    sampleName = sampleNames[i]
    mixtureDF[sampleName] = signatueGeneDF.iloc[:, i + 1]

mixtureDF

['S1', 'S2', 'S3', 'S4', 'S5', 'S6']


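| | sampleTitle | S1 | S2 | S3 |
|---|---|---|---|---|
| 0 | G1 | 1.0 | 0.0 | 0.0 |
| 1 | G2 | 1.0 | 0.0 | 0.0 |
| 2 | G3 | 1.0 | 1.0 | 0.0 |
| 3 | G4 | 1.0 | 1.0 | 0.0 |
| 4 | G5 | 0.0 | 1.0 | 1.0 |
| 5 | G6 | 0.0 | 0.0 | 1.0 |
| 6 | G7 | 0.0 | 0.0 | 0.0 |
| 7 | G8 | 0.0 | 1.0 | 0.0 |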


In [6]:
# create some mixtures: each remaining sample is the sum of two pure types
arr = [1, 2, 3]  # column positions of T1, T2, T3 in signatueGeneDF
r = 2
combos = list(combinations(arr, r))
print(len(combos))
print(combos)
for t, (i, j) in enumerate(combos):
    sampleName = sampleNames[t + k]  # the first k samples are the pure types
    print("i:{} j:{} t:{} s:{}".format(i, j, t, sampleName))
    x = signatueGeneDF.iloc[:, i]
    y = signatueGeneDF.iloc[:, j]
    mixtureDF[sampleName] = x + y

mixtureDF

3
[(1, 2), (1, 3), (2, 3)]
i:1 j:2 t:0 s:S4
i:1 j:3 t:1 s:S5
i:2 j:3 t:2 s:S6


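| | sampleTitle | S1 | S2 | S3 | S4 | S5 | S6 |
|---|---|---|---|---|---|---|---|
| 0 | G1 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 |
| 1 | G2 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 |
| 2 | G3 | 1.0 | 1.0 | 0.0 | 2.0 | 1.0 | 1.0 |
| 3 | G4 | 1.0 | 1.0 | 0.0 | 2.0 | 1.0 | 1.0 |
| 4 | G5 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 |
| 5 | G6 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 |
| 6 | G7 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 7 | G8 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 |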
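
The mixture table can also be read as a matrix product: signature matrix times a matrix of per-sample type weights. The sketch below is not part of the original notebook; it hard-codes the signature values shown earlier, and the names `signature`, `weights`, and `expected` are illustrative.

```python
import numpy as np
import pandas as pd

# signature matrix: rows are genes G1..G8, columns are types T1..T3
signature = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
    [0, 1, 0],
], dtype=float)

# contribution of each type (rows T1..T3) to each sample (columns S1..S6):
# S1..S3 are pure types, S4 = T1 + T2, S5 = T1 + T3, S6 = T2 + T3
weights = np.array([
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
], dtype=float)

mixture = signature @ weights  # shape: (8 genes, 6 samples)

expected = pd.DataFrame(
    mixture,
    index=["G" + str(i + 1) for i in range(8)],
    columns=["S" + str(i + 1) for i in range(6)],
)
print(expected)
```

Because the weights are known by construction, each mixed sample should ideally be reported as an equal split between its two contributing types.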


## Save sample files
We will need to upload these files to CIBERSORTx.

In [7]:
signatueGeneDF.to_csv("signatureGenes.txt", index=False, sep="\t")
mixtureDF.to_csv("mixture.txt", index=False, sep="\t")
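
Before uploading, it can help to confirm that a file written this way is tab-delimited, starts with a header row, and reads back unchanged. The sketch below is an illustration only: it uses a tiny stand-in frame (`demoDF`) and a temporary directory rather than the real `signatureGenes.txt`, and both names are hypothetical.

```python
import os
import tempfile

import pandas as pd

# small stand-in for signatueGeneDF (hypothetical values)
demoDF = pd.DataFrame({"name": ["G1", "G2"], "T1": [1.0, 0.0], "T2": [0.0, 1.0]})

with tempfile.TemporaryDirectory() as tmpDir:
    path = os.path.join(tmpDir, "demo.txt")
    # same options as the cell above: no index column, tab separator
    demoDF.to_csv(path, index=False, sep="\t")

    # the first line should be the tab-separated column names
    with open(path) as f:
        header = f.readline().rstrip("\n")

    # read the file back to confirm it parses as a table
    roundTrip = pd.read_csv(path, sep="\t")

print(repr(header))
print(roundTrip)
```

Passing `index=False` matters here: without it, pandas writes the row index as an extra unnamed first column.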

## 2) Run CIBERSORTx

### A) upload files
Go to https://cibersortx.stanford.edu/upload.php. After uploading, you should see something like this:

![after upload](ciberSortUpload.png)

### B) configure run
[menu -> run CIBERSORTx](https://cibersortx.stanford.edu/runcibersortx.php)

select option 2, "impute cell fractions"

select Analysis Mode:

configure custom

signature matrix file: select trivialSignature

mixture file: select trivialMixture

permutations for significance analysis: 100

run

### C) results

#### **imputed cell fractions**
![imputed cell fractions ](cellFractions.png)

#### **relative percentage**
![stackedBarChart](stackedBarChart.png)
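
As a rough cross-check on the charts above, the fractions can be estimated locally with ordinary least squares and converted to relative percentages. This is a sanity baseline under stated assumptions, not a reimplementation of CIBERSORTx, which uses its own regression method. The values are copied from the signature and mixture tables earlier in this notebook, and the names `signature`, `mixture`, `coef`, and `pct` are illustrative.

```python
import numpy as np
import pandas as pd

# signature matrix: rows are genes G1..G8, columns are types T1..T3
signature = np.array([
    [1, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 0],
    [0, 1, 1], [0, 0, 1], [0, 0, 0], [0, 1, 0],
], dtype=float)

# mixture matrix: rows are genes G1..G8, columns are samples S1..S6
mixture = np.array([
    [1, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 1, 0],
    [1, 1, 0, 2, 1, 1],
    [1, 1, 0, 2, 1, 1],
    [0, 1, 1, 1, 1, 2],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 1],
], dtype=float)

# solve signature @ coef ~= mixture for all samples at once; coef is (3, 6)
coef, *_ = np.linalg.lstsq(signature, mixture, rcond=None)

# clip tiny negative values caused by floating-point error
coef = np.clip(coef, 0, None)

# normalize each sample (column) so its type weights sum to 100 percent
pct = pd.DataFrame(
    100 * (coef / coef.sum(axis=0)).T,
    index=["S" + str(i + 1) for i in range(6)],
    columns=["T1", "T2", "T3"],
).round(1)
print(pct)
```

For this noise-free data the pure samples should come out as 100 percent of a single type and the mixed samples as an even split between two types, which gives a reference point for reading the CIBERSORTx output.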