# Create CiberSort Gene Signature Matrix
```
Andrew E. Davidson
aedavids@ucsc.edu
9/3/2022
```

Reference: terra/jupyterNotebooks/cibersort/CreateGeneSignatureMatrixOverview.ipynb

expected file format
``` 
$ cat signatureGenes.txt 
name	T1	T2	T3
G1	1.0	0.0	0.0
G2	1.0	0.0	0.0
G3	1.0	1.0	0.0
G4	1.0	1.0	0.0
G5	0.0	1.0	1.0
G6	0.0	0.0	1.0
G7	0.0	0.0	0.0
G8	0.0	1.0	0.0
```


output:
- testSignatureGenesDF.shape:(3, 4)
```
/private/groups/kimlab/GTEx_TCGA/geneSignatureProfiles/best/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25/ciberSort/testSignatureGenes.tsv
```
    
- bestSignatureGeneDF.shape:(832, 84)
```
 /private/groups/kimlab/GTEx_TCGA/geneSignatureProfiles/best/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25/ciberSort/signatureGenes.tsv
```
    
- upSignatureGeneDF.shape:(1087, 84)
```
/private/groups/kimlab/GTEx_TCGA/geneSignatureProfiles/up/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25/ciberSort/signatureGenes.tsv
```

TODO:
- create down regulated, ...

In [1]:
import numpy as np
import os
import pandas as pd
import pathlib as pl
import time    

# use display() to print an html version of a data frame
# useful if dataFrame output is not generated by last like of cell
from IPython.display import display

## 1) Data Overview
Our geneSignature Profile Data was created by extraCellularRNA/terra/jupyterNotebooks/signatureGenesUpsetPlots.ipynb. 
We have several candidate gene signature profiles we want to evaluate using cibersort. For example
- potentially differnt 1vsAll design models
- different hyptothese. ie. use top up regulated or down regulated genes, use best genes ( |lfc| > 2 )

The rows in the output from signatureGenesUpsetPlots.ipynb are DESeq results. Our gene signature matrix must be average scaled count values.

In [2]:
kl = "/private/groups/kimlab"
bestGenesDir = "GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25"
bestDataRootDir = kl + "/GTEx_TCGA/geneSignatureProfiles/best/" + bestGenesDir

print("example of file naming convention\n")
! ls $bestDataRootDir | head

print("\nnumber of 'types' ie GTEx tissue types or TCGA cohorts")
! ls $bestDataRootDir | wc -l

print("\nfile structure\n")
!wc -l $bestDataRootDir/ACC_vs_all.results

print()
! head -n 3 $bestDataRootDir/ACC_vs_all.results

example of file naming convention

ACC_vs_all.results
Adipose_Subcutaneous_vs_all.results
Adipose_Visceral_Omentum_vs_all.results
Adrenal_Gland_vs_all.results
Artery_Aorta_vs_all.results
Artery_Coronary_vs_all.results
Artery_Tibial_vs_all.results
Bladder_vs_all.results
BLCA_vs_all.results
Brain_Amygdala_vs_all.results

number of 'types' ie GTEx tissue types or TCGA cohorts
84

file structure

26 /private/groups/kimlab/GTEx_TCGA/geneSignatureProfiles/best/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25/ACC_vs_all.results

name,baseMean,log2FoldChange,lfcSE,stat,pvalue,padj
FAM181A,73.4815287207693,-22.7051991843563,0.647752856885241,-35.0522563397646,3.60243201558036e-269,6.651710595168349e-265
AL137140.1,19.6803764575685,-19.9939782297626,0.682540256941286,-29.2934783354216,1.2557622372977999e-188,7.729007276861759e-185


## 2) Create Signature Matrix

In [3]:
LOCAL_CACHE_DIR="/scratch/aedavids/tmp"

def loadCache(source, localCacheDir=LOCAL_CACHE_DIR, verbose=False):
    '''
    reading large files over a NFS mount is slow. loadCache() will
    copy the source file into the local cache if it does not not already exist
    
    set verbose=True will print full path to file in localCache
    '''
    # we can not join, combine source if it start from the root of the file system
    tmpSource = source
    if source[0] == "/":
        tmpSource = source[1:]        
    
    localTargetPath = pl.Path(localCacheDir,  tmpSource)
    if verbose:
        print("localCachePath:\n{}\n".format(localTargetPath))
            
    localTargetPath.parent.mkdir(parents=True, exist_ok=True)

    if not localTargetPath.exists():
        #print("localTargetPath:{} does not exits".format(localTargetPath))
        ! cp $source $localTargetPath 
        
    return localTargetPath
    
def testLoadCache():
    source = "/private/groups/kimlab/GTEx_TCGA/geneSignatureProfiles/best/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25/ciberSort/testSignatureGenes.txt"
    loadCache( source )
    
testLoadCache()

In [4]:
class CibersortGeneSignatureMatrix(object):
    '''
    public functions
        __init__
        getCiberSortSignatueDF
        save
    '''
    
    ################################################################################    
    def __init__(self, 
                geneSignatureProfilesDataRootDir, 
                oneVsAllDataDir, 
                groupByGeneCountFilePath,   
                colDataFilePath,
                estimatedScalingFactorsFilePath,
                outdir="ciberSort",
                testSize=None,
                verbose=False):
        '''
        arguments
            geneSignatureProfilesDataRootDir:
                the path the the geneSignatureProfiles file
                ex. "/GTEx_TCGA/geneSignatureProfiles/best/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25"

            oneVsAllDataDir:
                the path to the results from 1vsAll results
                
            groupByGeneCountFilePath:  
                TODO:
                
            colDataFilePath:
                TODO
                
            estimatedScalingFactorsFilePath:
                DESeq normalizing factors. Scales each sample to adjust for different
                library sizes and composition
                ex. GTEx_TCGA/1vsAll/estimatedSizeFactors.csv
                
            outdir:
                string
                default = "ciberSort"
                    save() path will be "geneSignatureProfilesDataRootDir + "/" + outDir"
            testSize:
                integer
                default: None (i.e. select all)
                use to select a sub set for testing purposes
                
            verbose:
                boolean, default = False
                argument passed to loadCache()
        '''
        self.geneSignatureProfilesDataRootDir = geneSignatureProfilesDataRootDir
        print("geneSignatureProfilesDataRootDir\n{}".format(geneSignatureProfilesDataRootDir))
        
        self.oneVsAllDataDir = oneVsAllDataDir
        print("oneVsAllDataDir\n{}".format(oneVsAllDataDir))
        
        self.groupByGeneCountFilePath = groupByGeneCountFilePath
        print("groupByGeneCountFilePath\n{}".format(groupByGeneCountFilePath))

        self.outdir = outdir
        self.testSize = testSize
        self.verbose = verbose
        
        # get a list of results files we want to use
        # and their deconvolution types. 
        self.suffix = "_vs_all.results"
        self.listOfResultsFiles = None
        self.listOfTypes = None
        self._initListOfTypes()
        
        # read the results into a dictionary of data frames
        self.resultsDFDict = None
        self._LoadResultsDFDict()
        
        # create a sorted list of all the uniqe signatue genes
        self.signatureGeneSet = None
        self._createsignatureGeneList()
        
        # free up memory
        self.resultsDFDict = None
        
        self.groupByCountDF = None
        self._readAndFilterGroupByCountDF()
        
        self.colDataFilePath = colDataFilePath
        self.colDataDF = None
        self._readColData()
        
        self.estimatedScalingFactorsFilePath = estimatedScalingFactorsFilePath;
        self.scalingFactors = None
        self._loadScalingFactors()
        
        self.ciberSortSignatueDF = None
        self._createSignatureMatrix()
        
        
        
    ################################################################################
    def getCiberSortSignatueDF(self):
        return self.ciberSortSignatueDF
    
    def save(self, prefixStr=None):
        '''
        saves in a cibersort's expected format
        '''
        tOutDir = self.geneSignatureProfilesDataRootDir + "/" + self.outdir
        #os.makedirs(outDir, exist_ok = True)
        
        if prefixStr:
            path = tOutDir + "/" + prefixStr + "SignatureGenes.tsv"
        else:
            path = tOutDir + "/signatureGenes.tsv"
        
        self.ciberSortSignatueDF.to_csv(path, index=False, sep="\t")
        print("\nsaved to: {}".format(path))
 
        
    ################################################################################
    def _initListOfTypes(self):
        '''
        create a list of all the singature results file
        output by SignatureGenesUpsetPlots.ipynb
        '''
        #listOfResultsFiles = !ls $bestDataRootDir
        listOfResultsFiles = !ls $self.geneSignatureProfilesDataRootDir
        listOfResultsFiles.remove( self.outdir )

        self.listOfResultsFiles = sorted(listOfResultsFiles[0:self.testSize])
        print("listOfResultsFiles:\n\t{}".format(self.listOfResultsFiles))

        # create list of types the we want to deconvolve the mixture matrix into
        self.listOfTypes = [ f.split(self.suffix)[0] for f in self.listOfResultsFiles]
        print("listOfTypes:\n\t{}".format(self.listOfTypes))

    ################################################################################
    def _LoadResultsDFDict(self):
        '''
        read the DESeq 1vsAll results files into a dictionary of pandas data frames. 
        with 'type' string as key
        '''
        self.resultsDFDict = dict()
        for i in range(len(self.listOfResultsFiles)):
            resultFile = self.listOfResultsFiles[i]
            deconvolutionType = self.listOfTypes[i]
            path = self.geneSignatureProfilesDataRootDir + "/" + resultFile
            path = loadCache(path, verbose=self.verbose)
            df = pd.read_csv(path, sep=",")
            self.resultsDFDict[deconvolutionType] = df
            
            
    ################################################################################    
    def _createsignatureGeneList(self):
        '''
        some types may share signature genes
        '''
       
        signatureGeneSet = set()
        for deconvolutionType,df in self.resultsDFDict.items():
            name = df.loc[:,'name']
            signatureGeneSet.update(name)
        
        # keep in sort order. makes debug easier
        self.geneListsorted = sorted( list(signatureGeneSet) )[0:self.testSize]
        print("\nnumber of signature genes:{}".format(len(self.geneListsorted)))
        #print(self.geneListsorted)        
        
    ################################################################################        
    def _readAndFilterGroupByCountDF(self):
        '''
        load the groupByGene count data and select the signature genes
        '''
        
        path = loadCache(self.groupByGeneCountFilePath, verbose=self.verbose)
        df = pd.read_csv(path, sep=",")
        
        # set index to geneId. will make join easier' When we transpose
        # the data frame the index will become the column names
        df = df.set_index('geneId')

        print("_readAndFilterGroupByCountDF() shape:{}".format(df.shape))
        print(df.iloc[0:3, 0:4])
        # geneId is an indx
        #df = df[ df.loc[:, "geneId"].isin(self.geneListsorted)]
        df = df[ df.index.isin(self.geneListsorted)]
        
         # sort makes debug easier
        #df = df.sort_values( by=["geneId"] )
        df = df.sort_index(ascending=True)

        print("\nshape:{}\niloc[0:3, 0:4]")
        print(df.iloc[0:3, 0:4])

        self.groupByCountDF = df
        
    ################################################################################        
    def _readColData(self):
        '''
        load the colData. We only need the sample_id and category columns 
        '''
        path = loadCache(self.colDataFilePath, verbose=self.verbose)
        self.colDataDF = pd.read_csv(path, sep=",").loc[:, ['sample_id', 'category']]
        print("_readColData() self.colDataDF.shape:{}".format(self.colDataDF.shape))
        print("iloc[0:3, :]")
        print( self.colDataDF.iloc[0:3, :] )
              

    ################################################################################                   
    def _loadScalingFactors(self):
        '''
        TODO
        '''
        path = loadCache(self.estimatedScalingFactorsFilePath, verbose=self.verbose)
        self.scalingFactors = pd.read_csv(path, sep=",")
        print("_loadScalingFactors() self.scalingFactors.shape:{}".format(self.scalingFactors.shape))
        print("iloc[0:3, :]")
        print( self.scalingFactors.iloc[0:3, :] )
        
    ################################################################################                   
    def _printInfo(self, str, df):
        print("{}.shape:{}".format(str, df.shape))
        print("{}.iloc[0:3, 0:4]".format(str))
        print(df.iloc[0:3, 0:4])
        
    ################################################################################                   
    def _createSignatureMatrix(self):
        '''
        todo
        '''
        # copy so we do not accidently change original groupDF
        transposeGroupByDF = self.groupByCountDF.transpose(copy=True)
        print("_createSignatureMatrix()")
        self._printInfo('transposeGroupByDF', transposeGroupByDF)
        
        # normalize counts
        # element wise multiplication . use values to to multiply a vector
        self._printInfo('scalingFactors', self.scalingFactors)
        #print("************** AEDWIP_transposeGroupByDF.csv")
        #transposeGroupByDF.to_csv("AEDWIP_transposeGroupByDF.csv")
        normalizedDF = transposeGroupByDF *  self.scalingFactors.values
        self._printInfo('normalizedDF', normalizedDF)


        # join the colData, we need the 'category' col so we can
        # calculate the  signature gene mean values 
        joinDF =  pd.merge(left=normalizedDF, 
                            right=self.colDataDF.loc[:,["sample_id", "category"]], 
                            how='inner', 
                            left_index=True, 
                            right_on="sample_id")      
        self._printInfo('joinDF', joinDF)

        # calculate the expected values for each category
        signatureDF = joinDF.groupby("category").mean()
        self._printInfo('signatureDF', signatureDF)
        
        # convert to ciber sort expected upload format
        ciberSortSignatueDF = signatureDF.transpose()
        ciberSortSignatueDF.index.name = "name"
        
        ciberSortSignatueDF = ciberSortSignatueDF.reset_index()
        # the original index data is now a column with name 'index'
        #ciberSortSignatueDF = ciberSortSignatueDF.rename(columns={"index":"sampleTitle"})
        
        self.ciberSortSignatueDF = ciberSortSignatueDF
        self._printInfo('ciberSortSignatueDF', ciberSortSignatueDF)

In [5]:
def CreateTestCibersortGeneSignatureMatrix():
    '''
    use to work out programming bugs. The returned data frame is to big
    to verify it is  correct
    '''
    kl = "/private/groups/kimlab"
    # geneSignatureProfiles/ conatains output of from 
    #  extraCellularRNA/terra/jupyterNotebooks/signatureGenesUpsetPlots.ipynb
    bestGenesDir = "GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25"
    bestGeneSignatureProfilesDataRootDir = kl + "/GTEx_TCGA/geneSignatureProfiles/best/" + bestGenesDir
    
    oneVsAllDataDir = kl + "/GTEx_TCGA/1vsAll"
    
    # path to gene count file and colData meta data
    groupByDataDir = kl + "/GTEx_TCGA/groupbyGeneTrainingSets"
    trainGroupByGeneCountFilePath = groupByDataDir + "/GTEx_TCGA_TrainGroupby.csv" 
    colDataFilePath = groupByDataDir + "/GTEx_TCGA_TrainColData.csv"
    
    #/private/groups/kimlab/GTEx_TCGA/1vsAll/estimatedSizeFactors.csv
    estimatedScalingFactorsFilePath = oneVsAllDataDir + "/estimatedSizeFactors.csv"
        
    testSize=3
    
    cgsm = CibersortGeneSignatureMatrix(
        bestGeneSignatureProfilesDataRootDir,
        oneVsAllDataDir,                
        trainGroupByGeneCountFilePath,
        colDataFilePath,
        estimatedScalingFactorsFilePath,
        testSize=testSize,
        verbose=True
    )
    
    ret = cgsm.getCiberSortSignatueDF()
    cgsm.save('test')
    print("ciberSortSignatueDF.shape:{}".format(ret.shape))
    
    return ret

In [6]:
%%time
def testCreateTestCibersortGeneSignatureMatrix():
    print("\n************ Testing CibersortGeneSignatureMatrix(")
    testGeneSignatureDF = CreateTestCibersortGeneSignatureMatrix()
    #expectedDict = testGeneSignatureDF.to_dict();
    #expectedDict = testGeneSignatureDF.iloc[0:2, :].to_dict()
    print("expectedDict.iloc[0:3, :]:\n{}".format())
#     expectedDF =  pd.DataFrame(
#                         {'name': {0: 'AC010329.1', 1: 'AC013391.3', 2: 'AC092720.2'}, 
#                         'ACC': {0: -9.43025488776092, 1: -4.23878510464397, 2: 0.216492131302401}, 
#                         'Adipose_Subcutaneous': {0: -5.66625017808804, 1: -8.8504124902406, 2: -8.51645590632103}, 
#                         'Adipose_Visceral_Omentum': {0: -8.52173825915158, 1: -8.15208313712815, 2: -8.64969927703953}
#                         }
#     )
    
#     pd.testing.assert_frame_equal(expectedDF, testGeneSignatureDF)

    return testGeneSignatureDF

testGeneSignatureDF = testCreateTestCibersortGeneSignatureMatrix()


************ Testing CibersortGeneSignatureMatrix(
geneSignatureProfilesDataRootDir
/private/groups/kimlab/GTEx_TCGA/geneSignatureProfiles/best/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25
oneVsAllDataDir
/private/groups/kimlab/GTEx_TCGA/1vsAll
groupByGeneCountFilePath
/private/groups/kimlab/GTEx_TCGA/groupbyGeneTrainingSets/GTEx_TCGA_TrainGroupby.csv
listOfResultsFiles:
	['ACC_vs_all.results', 'Adipose_Subcutaneous_vs_all.results', 'Adipose_Visceral_Omentum_vs_all.results']
listOfTypes:
	['ACC', 'Adipose_Subcutaneous', 'Adipose_Visceral_Omentum']
localCachePath:
/scratch/aedavids/tmp/private/groups/kimlab/GTEx_TCGA/geneSignatureProfiles/best/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25/ACC_vs_all.results

localCachePath:
/scratch/aedavids/tmp/private/groups/kimlab/GTEx_TCGA/geneSignatureProfiles/best/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25/Adipose_Subcutaneous_vs_all.results

localCachePath:
/scratch/aedavid

IndexError: tuple index out of range

## Create Best Gene Signature Profile
abs(log fold change) >= 2

In [12]:
def CreateBestCibersortGeneSignatureMatrix():
    kl = "/private/groups/kimlab"
    # geneSignatureProfiles/ conatains output of from 
    #  extraCellularRNA/terra/jupyterNotebooks/signatureGenesUpsetPlots.ipynb
    bestGenesDir = "GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25"
    bestGeneSignatureProfilesDataRootDir = kl + "/GTEx_TCGA/geneSignatureProfiles/best/" + bestGenesDir
        
    oneVsAllDataDir = kl + "/GTEx_TCGA/1vsAll"
    
    # path to gene count file and colData meta data
    groupByDataDir = kl + "/GTEx_TCGA/groupbyGeneTrainingSets"
    trainGroupByGeneCountFilePath = groupByDataDir + "/GTEx_TCGA_TrainGroupby.csv" 
    colDataFilePath = groupByDataDir + "/GTEx_TCGA_TrainColData.csv"
    
    #/private/groups/kimlab/GTEx_TCGA/1vsAll/estimatedSizeFactors.csv
    estimatedScalingFactorsFilePath = oneVsAllDataDir + "/estimatedSizeFactors.csv"
    
        
    ret = CibersortGeneSignatureMatrix(
        bestGeneSignatureProfilesDataRootDir,
        oneVsAllDataDir,                
        trainGroupByGeneCountFilePath,
        colDataFilePath,
        estimatedScalingFactorsFilePath
    )
    

    return ret

In [13]:
%%time
bestCGSMatrix = CreateBestCibersortGeneSignatureMatrix()
bestSignatureGeneDF = bestCGSMatrix.getCiberSortSignatueDF()
print("bestSignatureGeneDF.shape:{}".format(bestSignatureGeneDF.shape))
bestSignatureGeneDF.iloc[0:3, 0:5]
bestCGSMatrix.save()

geneSignatureProfilesDataRootDir
/private/groups/kimlab/GTEx_TCGA/geneSignatureProfiles/best/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25
oneVsAllDataDir
/private/groups/kimlab/GTEx_TCGA/1vsAll
groupByGeneCountFilePath
/private/groups/kimlab/GTEx_TCGA/groupbyGeneTrainingSets/GTEx_TCGA_TrainGroupby.csv
listOfResultsFiles:
	['ACC_vs_all.results', 'Adipose_Subcutaneous_vs_all.results', 'Adipose_Visceral_Omentum_vs_all.results', 'Adrenal_Gland_vs_all.results', 'Artery_Aorta_vs_all.results', 'Artery_Coronary_vs_all.results', 'Artery_Tibial_vs_all.results', 'BLCA_vs_all.results', 'BRCA_vs_all.results', 'Bladder_vs_all.results', 'Brain_Amygdala_vs_all.results', 'Brain_Anterior_cingulate_cortex_BA24_vs_all.results', 'Brain_Caudate_basal_ganglia_vs_all.results', 'Brain_Cerebellar_Hemisphere_vs_all.results', 'Brain_Cerebellum_vs_all.results', 'Brain_Cortex_vs_all.results', 'Brain_Frontal_Cortex_BA9_vs_all.results', 'Brain_Hippocampus_vs_all.results', 'Brain_Hypothalamus_v

## Create Up regulated Gene Signature Profile

In [14]:
def CreateUpCibersortGeneSignatureMatrix():
    kl = "/private/groups/kimlab"
    # geneSignatureProfiles/ conatains output of from 
    #  extraCellularRNA/terra/jupyterNotebooks/signatureGenesUpsetPlots.ipynb
    upGenesDir = "GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25"
    upGeneSignatureProfilesDataRootDir = kl + "/GTEx_TCGA/geneSignatureProfiles/up/" + upGenesDir
        
    oneVsAllDataDir = kl + "/GTEx_TCGA/1vsAll"
    
    # path to gene count file and colData meta data
    groupByDataDir = kl + "/GTEx_TCGA/groupbyGeneTrainingSets"
    trainGroupByGeneCountFilePath = groupByDataDir + "/GTEx_TCGA_TrainGroupby.csv" 
    colDataFilePath = groupByDataDir + "/GTEx_TCGA_TrainColData.csv"
    
    #/private/groups/kimlab/GTEx_TCGA/1vsAll/estimatedSizeFactors.csv
    estimatedScalingFactorsFilePath = oneVsAllDataDir + "/estimatedSizeFactors.csv"

    ret = CibersortGeneSignatureMatrix(
        upGeneSignatureProfilesDataRootDir,
        oneVsAllDataDir,
        trainGroupByGeneCountFilePath,
        colDataFilePath,
        estimatedScalingFactorsFilePath        
        #testSize=None
    )
    

    return ret

In [15]:
%%time
upCGSMatrix = CreateUpCibersortGeneSignatureMatrix()
upSignatureGeneDF = upCGSMatrix.getCiberSortSignatueDF()
print("upSignatureGeneDF.shape:{}".format(upSignatureGeneDF.shape))
upSignatureGeneDF.iloc[0:3, 0:5]
upCGSMatrix.save()

geneSignatureProfilesDataRootDir
/private/groups/kimlab/GTEx_TCGA/geneSignatureProfiles/up/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25
oneVsAllDataDir
/private/groups/kimlab/GTEx_TCGA/1vsAll
groupByGeneCountFilePath
/private/groups/kimlab/GTEx_TCGA/groupbyGeneTrainingSets/GTEx_TCGA_TrainGroupby.csv
listOfResultsFiles:
	['ACC_vs_all.results', 'Adipose_Subcutaneous_vs_all.results', 'Adipose_Visceral_Omentum_vs_all.results', 'Adrenal_Gland_vs_all.results', 'Artery_Aorta_vs_all.results', 'Artery_Coronary_vs_all.results', 'Artery_Tibial_vs_all.results', 'BLCA_vs_all.results', 'BRCA_vs_all.results', 'Bladder_vs_all.results', 'Brain_Amygdala_vs_all.results', 'Brain_Anterior_cingulate_cortex_BA24_vs_all.results', 'Brain_Caudate_basal_ganglia_vs_all.results', 'Brain_Cerebellar_Hemisphere_vs_all.results', 'Brain_Cerebellum_vs_all.results', 'Brain_Cortex_vs_all.results', 'Brain_Frontal_Cortex_BA9_vs_all.results', 'Brain_Hippocampus_vs_all.results', 'Brain_Hypothalamus_vs_

In [16]:
# !chmod u+w /private/groups/kimlab/GTEx_TCGA/geneSignatureProfiles/up/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25
# !mkdir /private/groups/kimlab/GTEx_TCGA/geneSignatureProfiles/up/GTEx_TCGA_1vsAll-design:~__gender_+_category-padj:0.001-lfc:2.0-n:25/ciberSort/
