# Getting Started

Let's go over the basic functionality and use cases of the CanDI package.

## Importing

CanDI must be imported from the main CanDI directory. The core CanDI objects are contained within the CanDI.candi module and are imported as follows.

In [2]:
import CanDI.candi as can
# Can also be imported as:
from CanDI import candi as can

## Data Object
The Data object is instantiated when CanDI is imported and is accessed as `data` within the candi module.
CanDI dataset paths are defined as attributes of the Data object.

In [3]:
print(can.data.gene_effect) # depmap ceres score
print(can.data.expression) # ccle rna seq data
print(can.data.gene_cn) # ccle copy number data

/home/cyogodzi/projects/candi-paper/CanDI/CanDI/setup/data/depmap/CRISPR_gene_effect.csv
/home/cyogodzi/projects/candi-paper/CanDI/CanDI/setup/data/depmap/CCLE_expression.csv
/home/cyogodzi/projects/candi-paper/CanDI/CanDI/setup/data/depmap/CCLE_gene_cn.csv


## How to Directly Load a Dataset
The load method of the Data object loads a specific dataset into memory. Loaded datasets are stored as pandas DataFrames in attributes of the Data object.

In [4]:
can.data.load("expression")

gene,ACH-001113,ACH-001289,ACH-001339,ACH-001538,ACH-000242,ACH-000708,ACH-000327,ACH-000233,ACH-000461,ACH-000705,...,ACH-000114,ACH-000402,ACH-000036,ACH-000973,ACH-001128,ACH-000750,ACH-000285,ACH-001858,ACH-001997,ACH-000052
TSPAN6,4.990501,5.209843,3.779260,5.726831,7.465648,4.914086,4.032982,0.097611,4.712596,5.101398,...,3.793896,0.070389,4.692650,5.026800,6.699052,4.173127,0.097611,5.045268,5.805292,4.870858
TNMD,0.000000,0.545968,0.000000,0.000000,0.000000,0.176323,0.000000,0.000000,0.000000,0.000000,...,0.028569,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
DPM1,7.273702,7.070604,7.346425,7.086189,6.435462,6.946848,5.806582,5.919102,6.406333,6.309976,...,6.330738,5.858230,6.623369,6.966130,6.131960,6.400879,6.428276,6.991749,7.792855,6.077457
SCYL3,2.765535,2.538538,2.339137,2.543496,2.414136,2.577731,1.948601,3.983678,2.247928,2.361768,...,2.792855,2.757023,2.111031,1.899176,2.235727,1.807355,3.257011,1.807355,2.482848,2.304511
C1orf112,4.480265,3.510962,4.254745,3.102658,3.864929,3.853996,2.684819,3.733354,3.032101,4.280214,...,2.643856,5.103078,2.543496,3.531069,3.971773,3.303050,4.980482,3.270529,3.903038,3.836934
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
POLR2J3,5.781884,4.704319,4.931683,3.858976,4.990501,5.303781,4.996841,6.839960,5.529196,5.860963,...,3.793896,6.669877,6.191010,5.934281,3.097611,5.102658,6.341630,4.607626,4.787119,4.452859
H2BE1,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.594549,...,0.000000,0.176323,0.000000,0.000000,0.000000,0.000000,0.000000,0.111031,0.000000,0.000000
AL445238.1,0.000000,0.000000,0.028569,0.000000,0.000000,0.000000,0.042644,0.000000,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.097611,0.000000,0.000000,0.163499,0.163499
GET1-SH3BGR,0.799087,0.464668,0.263034,0.000000,0.000000,0.263034,0.286881,2.280956,0.275007,0.790772,...,1.416840,0.526069,1.117695,0.378512,0.713696,0.214125,0.310340,1.090853,0.084064,1.422233


## Cell Lines
The Cell Lines dataset contains all cell line metadata. This table is loaded automatically when candi is imported.

In [5]:
can.data.cell_lines.head(5)

DepMap_ID,cell_line_name,stripped_cell_line_name,CCLE_Name,alias,COSMICID,sex,source,Achilles_n_replicates,cell_line_NNMD,culture_type,...,primary_or_metastasis,primary_disease,Subtype,age,Sanger_Model_ID,depmap_public_comments,lineage,lineage_subtype,lineage_sub_subtype,lineage_molecular_subtype
ACH-000001,NIH:OVCAR-3,NIHOVCAR3,NIHOVCAR3_OVARY,OVCAR3,905933.0,Female,ATCC,,,,...,Metastasis,Ovarian Cancer,"Adenocarcinoma, high grade serous",60.0,SIDM00105,,ovary,ovary_adenocarcinoma,high_grade_serous,
ACH-000002,HL-60,HL60,HL60_HAEMATOPOIETIC_AND_LYMPHOID_TISSUE,,905938.0,Female,ATCC,,,,...,Primary,Leukemia,"Acute Myelogenous Leukemia (AML), M3 (Promyelo...",35.0,SIDM00829,,blood,AML,M3,
ACH-000003,CACO2,CACO2,CACO2_LARGE_INTESTINE,"CACO2, CaCo-2",,Male,ATCC,,,,...,,Colon/Colorectal Cancer,Adenocarcinoma,,SIDM00891,,colorectal,colorectal_adenocarcinoma,,
ACH-000004,HEL,HEL,HEL_HAEMATOPOIETIC_AND_LYMPHOID_TISSUE,,907053.0,Male,DSMZ,2.0,-3.079202,Suspension,...,,Leukemia,"Acute Myelogenous Leukemia (AML), M6 (Erythrol...",30.0,SIDM00594,,blood,AML,M6,
ACH-000005,HEL 92.1.7,HEL9217,HEL9217_HAEMATOPOIETIC_AND_LYMPHOID_TISSUE,,,Male,ATCC,2.0,-2.404409,Suspension,...,,Leukemia,"Acute Myelogenous Leukemia (AML), M6 (Erythrol...",30.0,SIDM00593,,blood,AML,M6,


## Genes
The genes dataset contains relevant gene metadata. 
The genes dataset is loaded into memory automatically when candi is imported. 

In [6]:
can.data.genes.head(5)

Approved symbol,Approved name,Accession numbers,UniProt ID,ENTREZ ID,Ensembl ID
PINLYP,phospholipase A2 inhibitor and LY6/PLAUR domai...,,A6NC86,390940.0,ENSG00000234465
ARL6IP1P1,ADP ribosylation factor like GTPase 6 interact...,,,100288702.0,ENSG00000255664
PRAMEF33,PRAME family member 33,,A0A0G2JMD5,645382.0,ENSG00000237700
AL353354.2,,,,,
CTA-298G8.2,,,,,


## Locations
The locations dataset contains location annotations for all genes and their associated confidence scores. Confidence scores were crowdsourced from several protein localization papers and integrated into a single scale. This dataset is loaded into memory automatically when candi is imported.

In [7]:
can.data.locations.head(5)

,gene,location,confidence
0,A1CF,Nucleus,1.0
1,A4GALT,Mitochondria,1.0
2,AAAS,Nucleus,2.0
3,AAAS,Cytoskeleton,2.0
4,AAAS,Cytosol,2.0
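
Because one gene can carry several location annotations, a natural first step is to select the genes annotated to one location that also pass a confidence cutoff. The sketch below does this in plain Python on the five rows shown above. The helper `genes_in` and its `min_conf` parameter are hypothetical, and the sketch assumes that a larger confidence value means a more reliable annotation, which this tutorial does not state.

```python
# Rows copied from the output of can.data.locations.head(5) above.
locations = [
    {"gene": "A1CF", "location": "Nucleus", "confidence": 1.0},
    {"gene": "A4GALT", "location": "Mitochondria", "confidence": 1.0},
    {"gene": "AAAS", "location": "Nucleus", "confidence": 2.0},
    {"gene": "AAAS", "location": "Cytoskeleton", "confidence": 2.0},
    {"gene": "AAAS", "location": "Cytosol", "confidence": 2.0},
]


def genes_in(rows, location, min_conf=1.0):
    """Hypothetical helper: genes annotated to `location` with confidence >= min_conf.

    Assumes a higher confidence value is better (an assumption, not a CanDI fact).
    """
    return sorted(
        {
            row["gene"]
            for row in rows
            if row["location"] == location and row["confidence"] >= min_conf
        }
    )


nucleus_any = genes_in(locations, "Nucleus")
nucleus_strict = genes_in(locations, "Nucleus", min_conf=2.0)
print(nucleus_any)
print(nucleus_strict)
```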


## Basic Object Instantiation
- The user input for object instantiation is used directly for indexing.
- This means that if the input is misspelled, candi will not be able to retrieve the data the user is interested in.
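
Because lookups are exact, it can help to check a name against the known names before instantiating an object. The following is a minimal sketch using the standard library's `difflib`. The `known_names` list is copied from cell line names that appear in this tutorial, and `suggest_name` is a hypothetical helper, not part of CanDI.

```python
import difflib

# Cell line names that appear in this tutorial.
known_names = ["NIH:OVCAR-3", "HL-60", "CACO2", "HEL", "HEL 92.1.7", "A549"]


def suggest_name(query, names, n=3):
    """Return [query] on an exact match, otherwise a list of close candidates."""
    if query in names:
        return [query]
    # cutoff=0.6 is difflib's default similarity threshold.
    return difflib.get_close_matches(query, names, n=n, cutoff=0.6)


exact = suggest_name("A549", known_names)
typo = suggest_name("HL60", known_names)
print(exact)
print(typo)
```

An empty list means no known name is close enough, which is a hint to inspect `can.data.cell_lines` directly.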


In [8]:
kras = can.Gene("KRAS")
lung = can.Cancer("Lung Cancer")
membrane = can.Organelle("Plasma membrane")
a549 = can.CellLine("A549") 

## Gene Object Methods and Attributes
The following function prints the public attributes and methods of CanDI objects.

In [9]:
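def pretty_print_attr(obj):
    """Print the public attributes and methods of a CanDI object."""
    attr = []     # names of str/int attributes
    ls_attr = []  # names of list attributes
    meth = []     # everything else (methods and other attribute types)
    for name in dir(obj):
        if name.startswith("_"):
            continue  # skip private and dunder names
        value = getattr(obj, name)
        if type(value) in (str, int):
            attr.append(name)
        elif type(value) is list:
            ls_attr.append(name)
        else:
            meth.append(name)

    print("Attributes:\n")
    for name in attr:
        print(name + ":", getattr(obj, name))
    for name in ls_attr:
        values = getattr(obj, name)
        # Guard against empty lists, which would otherwise raise IndexError
        print(name + " list first item:", values[0] if values else None)
    for name in ls_attr:
        print(name + " length:", len(getattr(obj, name)))
    print("\nMethods:\n")
    for name in meth:
        print(name)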

pretty_print_attr(kras)


Attributes:

ensembl: ENSG00000133703
entrez: 3845
get_name: KRAS
name: KRAS proto-oncogene, GTPase
symbol: KRAS

Methods:

cn_normal
deletion
dependency_of
dependent
duplication
effect_of
essential
expressed
expression_of
mutated
non_dependent
non_essential
unexpressed


## Gene Indexing Examples
If a dataset has not been loaded into memory, candi will prompt you to load it.
Once a dataset is loaded, Gene.expression gives all the RNA-seq expression data for that specific gene.
In this case we have already instantiated a Gene object.

In [10]:
kras.expression

ACH-001113    4.568640
ACH-001289    4.554589
ACH-001339    3.955127
ACH-001538    5.593354
ACH-000242    3.845992
                ...   
ACH-000750    3.729009
ACH-000285    5.389567
ACH-001858    4.014355
ACH-001997    3.455492
ACH-000052    3.587365
Name: KRAS, Length: 1379, dtype: float64

### Basic CanDI Filtering
The Gene.expressed() method retrieves the cell lines in which the user-defined gene is expressed above 1 transcript per million.
The output is a list of cell line IDs, which can be used to instantiate CellLine or CellLineCluster objects.


In [11]:
kras.expressed()[0:10]

['ACH-001113',
 'ACH-001289',
 'ACH-001339',
 'ACH-001538',
 'ACH-000242',
 'ACH-000708',
 'ACH-000327',
 'ACH-000233',
 'ACH-000461',
 'ACH-000705']

The user can pass style="values" to get the expression values along with the DepMap IDs.

In [12]:
kras.expressed(style="values")

ACH-001113    4.568640
ACH-001289    4.554589
ACH-001339    3.955127
ACH-001538    5.593354
ACH-000242    3.845992
                ...   
ACH-000750    3.729009
ACH-000285    5.389567
ACH-001858    4.014355
ACH-001997    3.455492
ACH-000052    3.587365
Name: KRAS, Length: 1379, dtype: float64

If you pass a DepMap ID as an argument to Gene.expressed(), you get a boolean showing whether the gene is expressed in that cell line.

In [13]:
kras.expressed(a549.depmap_id)

True
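
The three call styles above can be summarised with a small stand-in. This is only a sketch of the behaviour described in this section, not CanDI's implementation: the function `expressed`, its `cutoff` default of 1.0, and the plain dictionaries used in place of pandas objects are all assumptions. The values are copied from the expression outputs earlier in this tutorial.

```python
# Expression values for three cell lines, copied from the outputs above.
expression = {
    "KRAS": {"ACH-001113": 4.568640, "ACH-001289": 4.554589, "ACH-001339": 3.955127},
    "TNMD": {"ACH-001113": 0.000000, "ACH-001289": 0.545968, "ACH-001339": 0.000000},
}


def expressed(gene_row, item=None, style="ids", cutoff=1.0):
    """Toy filter: keep the cell lines whose value is above `cutoff`.

    - item given      -> bool, is the gene expressed in that cell line?
    - style="values"  -> dict of cell line ID -> value
    - otherwise       -> list of cell line IDs
    """
    passing = {line: value for line, value in gene_row.items() if value > cutoff}
    if item is not None:
        return item in passing
    if style == "values":
        return passing
    return list(passing)


kras_ids = expressed(expression["KRAS"])
kras_values = expressed(expression["KRAS"], style="values")
kras_in_line = expressed(expression["KRAS"], "ACH-001289")
tnmd_ids = expressed(expression["TNMD"])
tnmd_in_line = expressed(expression["TNMD"], "ACH-001289")
print(kras_ids, kras_in_line, tnmd_ids, tnmd_in_line)
```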

The user can use the Gene.expression_of() method to check that gene's expression in a specific cell line.
When called from a Gene object, this method accepts only cell line DepMap IDs as arguments.

In [14]:
kras.expression_of(a549.depmap_id)

4.350497247084133
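
DepMap distributes CCLE expression values as log2(TPM + 1). Assuming the value returned above follows that convention (this tutorial does not state the unit), it can be converted back to TPM as sketched below. The helper names `to_tpm` and `from_tpm` are hypothetical.

```python
import math


def to_tpm(log_value):
    """Invert log2(TPM + 1); assumes the input uses that transformation."""
    return 2.0 ** log_value - 1.0


def from_tpm(tpm):
    """Apply the log2(TPM + 1) transformation."""
    return math.log2(tpm + 1.0)


# Value printed by kras.expression_of(a549.depmap_id) above.
kras_a549 = 4.350497247084133
tpm = to_tpm(kras_a549)
print(round(tpm, 2))
```

Under this assumption a stored value of 1.0 corresponds to exactly 1 TPM, which matches the "above 1 transcript per million" rule used by expressed().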

CanDI is consistent in the way this works across all classes and data types. For example, the mutations attribute of a Gene object returns the mutation records for that gene.

In [15]:
kras.mutations

mutations has not been loaded. Do you want to load, y/n?> y
Load Complete


,gene,Entrez_Gene_Id,NCBI_Build,Chromosome,Start_position,End_position,Strand,Variant_Classification,Variant_Type,Reference_Allele,...,isCOSMIChotspot,COSMIChsCnt,ExAC_AF,Variant_annotation,CGA_WES_AC,HC_AC,RD_AC,RNAseq_AC,SangerWES_AC,WGS_AC
1543,KRAS,3845,37,12,25398284,25398284,+,Missense_Mutation,SNP,C,...,True,15813.0,0.000016,other non-conserving,187:172,26:35,,90:89,,17:12
7075,KRAS,3845,37,12,25398284,25398284,+,Missense_Mutation,SNP,C,...,True,15813.0,,other non-conserving,144:0,184:2,,155:2,,
7340,KRAS,3845,37,12,25398284,25398284,+,Missense_Mutation,SNP,C,...,True,15813.0,,other non-conserving,14:0,157:1,,106:1,16:0,24:0
10322,KRAS,3845,37,12,25380276,25380276,+,Missense_Mutation,SNP,T,...,True,141.0,,other non-conserving,34:30,97:47,,52:41,,
15559,KRAS,3845,37,12,25398284,25398284,+,Missense_Mutation,SNP,C,...,True,15813.0,0.000016,other non-conserving,14:20,39:45,,91:89,23:30,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1265558,KRAS,3845,37,12,25398283,25398284,+,In_Frame_Ins,INS,-,...,True,15827.0,,other non-conserving,71:112,,,,,
1265728,KRAS,3845,37,12,25378562,25378562,+,Missense_Mutation,SNP,C,...,True,82.0,,other non-conserving,58:71,,,,,
1265729,KRAS,3845,37,12,25398283,25398284,+,In_Frame_Ins,INS,-,...,True,15827.0,,other non-conserving,76:106,,,,,
1265899,KRAS,3845,37,12,25398283,25398284,+,In_Frame_Ins,INS,-,...,True,15827.0,,other non-conserving,55:70,,,,,


The Gene.mutated() method allows very specific filtering.
The variant argument selects the column on which to filter, and the item argument specifies the value of interest in that column. The example below retrieves all cell lines with KRAS missense mutations.

In [17]:
kras.mutated(variant="Variant_Classification", item="Missense_Mutation")[0:10]

['ACH-000094',
 'ACH-000178',
 'ACH-002186',
 'ACH-000311',
 'ACH-001345',
 'ACH-001843',
 'ACH-001353',
 'ACH-000417',
 'ACH-000347',
 'ACH-000997']
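
Conceptually, this kind of filter selects the rows whose chosen column equals the requested value and then collects the cell lines those rows belong to. The sketch below illustrates that idea on a few made-up records; it is not CanDI's code. The `DepMap_ID` key, the toy IDs, and the helper `mutated_lines` are assumptions for illustration, since the cell line column is elided in the output above.

```python
# Made-up mutation records for illustration only (not real DepMap data).
records = [
    {"DepMap_ID": "LINE-1", "gene": "KRAS", "Variant_Classification": "Missense_Mutation"},
    {"DepMap_ID": "LINE-2", "gene": "KRAS", "Variant_Classification": "In_Frame_Ins"},
    {"DepMap_ID": "LINE-3", "gene": "KRAS", "Variant_Classification": "Missense_Mutation"},
    {"DepMap_ID": "LINE-1", "gene": "KRAS", "Variant_Classification": "Missense_Mutation"},
]


def mutated_lines(rows, variant, item):
    """Return the cell line IDs, in first-seen order, where rows[variant] == item."""
    seen = []
    for row in rows:
        if row[variant] == item and row["DepMap_ID"] not in seen:
            seen.append(row["DepMap_ID"])
    return seen


missense = mutated_lines(
    records, variant="Variant_Classification", item="Missense_Mutation"
)
print(missense)
```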

The unload method of the Data object removes a dataset from memory and resets the corresponding attribute to the dataset's file path.

In [18]:
can.data.unload('mutations')
can.data.mutations

PosixPath('/home/cyogodzi/projects/candi-paper/CanDI/CanDI/setup/data/depmap/CCLE_mutations.csv')
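
The path-or-table behaviour shown above can be pictured with a small stand-in class. This is a sketch under stated assumptions, not CanDI's implementation: `ToyData`, its constructor, and the use of `csv` instead of pandas are hypothetical, and the CSV file is a temporary file created only for the demonstration.

```python
import csv
import tempfile
from pathlib import Path


class ToyData:
    """Keeps a file path per dataset until that dataset is loaded."""

    def __init__(self, paths):
        self._paths = dict(paths)
        for name, path in self._paths.items():
            setattr(self, name, path)  # attribute starts out as a path

    def load(self, name):
        """Read the file and replace the path attribute with the table."""
        with open(self._paths[name], newline="") as handle:
            table = list(csv.DictReader(handle))
        setattr(self, name, table)
        return table

    def unload(self, name):
        """Drop the table and restore the original path."""
        setattr(self, name, self._paths[name])


with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy_mutations.csv"
    toy_path.write_text("gene,Variant_Type\nKRAS,SNP\n")

    data = ToyData({"mutations": toy_path})
    path_before = isinstance(data.mutations, Path)
    data.load("mutations")
    loaded_rows = data.mutations
    data.unload("mutations")
    path_after = isinstance(data.mutations, Path)

print(path_before, loaded_rows, path_after)
```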

## CellLine Methods and Attributes

In [19]:
pretty_print_attr(a549)

Attributes:

ccle_name: A549_LUNG
depmap_id: ACH-000681
get_name: ACH-000681
lineage: lung
name: A549
sanger_id: SIDM00903
sex: Male
source: ATCC
subtype: NSCLC
tissue: lung

Methods:

aliases
cn_normal
cosmic_id
deletion
dependency_of
dependent
duplication
effect_of
essential
expressed
expression_of
mutated
non_dependent
non_essential
unexpressed


All methods work in essentially the same way regardless of the candi object in use.
The CellLine.expressed() method returns all genes with expression above 1 transcript per million in that specific cell line.

In [20]:
a549.expressed()[:10]

['TSPAN6',
 'DPM1',
 'SCYL3',
 'C1orf112',
 'CFH',
 'FUCA2',
 'GCLC',
 'NFYA',
 'STPG1',
 'NIPAL3']

Just as with Gene.expressed(), the user can ask for the values.

In [21]:
a549.expressed(style="values")

gene
TSPAN6         5.176323
DPM1           6.310522
SCYL3          2.017922
C1orf112       4.058316
CFH            3.772941
                 ...   
UPK3BL2        1.367371
AC093512.2     4.087463
ARHGAP11B      1.531069
ABCF2-H2BE1    1.891419
POLR2J3        3.372952
Name: ACH-000681, Length: 11498, dtype: float64

The user can also ask for the expression status of a specific gene.

In [22]:
a549.expressed("KRAS")

True

Calling expressed() with a gene name and style="values" gives the same result as expression_of().

In [23]:
a549.expression_of("KRAS")

4.350497247084133

In [24]:
a549.expressed("KRAS", style="values")

4.350497247084133
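
One way to see why the Gene and CellLine views agree is to think of the expression table as a single matrix with genes as rows and cell lines as columns. The sketch below uses a nested dictionary as a stand-in for that matrix, with values copied from the outputs in this tutorial (ACH-000681 is A549). The helper names `gene_view` and `cell_line_view` are hypothetical.

```python
# gene -> cell line ID -> expression value, copied from the outputs above.
matrix = {
    "TSPAN6": {"ACH-001113": 4.990501, "ACH-000681": 5.176323},
    "DPM1": {"ACH-001113": 7.273702, "ACH-000681": 6.310522},
    "KRAS": {"ACH-001113": 4.568640, "ACH-000681": 4.350497247084133},
}


def gene_view(table, gene):
    """Row of the matrix: one gene across all cell lines."""
    return dict(table[gene])


def cell_line_view(table, depmap_id):
    """Column of the matrix: all genes in one cell line."""
    return {gene: row[depmap_id] for gene, row in table.items()}


from_gene = gene_view(matrix, "KRAS")["ACH-000681"]
from_line = cell_line_view(matrix, "ACH-000681")["KRAS"]
print(from_gene == from_line)
```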

The CellLine.mutations attribute gives all mutation data for that specific cell line.

In [25]:
a549.mutations

mutations has not been loaded. Do you want to load, y/n?> y
Load Complete


,gene,Entrez_Gene_Id,NCBI_Build,Chromosome,Start_position,End_position,Strand,Variant_Classification,Variant_Type,Reference_Allele,...,isCOSMIChotspot,COSMIChsCnt,ExAC_AF,Variant_annotation,CGA_WES_AC,HC_AC,RD_AC,RNAseq_AC,SangerWES_AC,WGS_AC
244692,TPRG1L,127262,37,1,3542384,3542384,+,Missense_Mutation,SNP,G,...,False,0.0,,other non-conserving,,,,,,17:28
244693,ENO1,2023,37,1,8925414,8925414,+,Missense_Mutation,SNP,A,...,False,0.0,,other non-conserving,,,,,,22:30
244694,NMNAT1,64802,37,1,10042579,10042579,+,Missense_Mutation,SNP,C,...,False,0.0,,other non-conserving,33:30,,,13:33,33:31,20:32
244695,MFN2,9927,37,1,12058908,12058908,+,Silent,SNP,C,...,False,0.0,,silent,19:91,,,,20:93,
244696,PRAMEF4,400735,37,1,12942971,12942971,+,Missense_Mutation,SNP,G,...,False,0.0,,other non-conserving,,,,,,29:39
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
245445,IGSF1,3547,37,X,130411178,130411178,+,Missense_Mutation,SNP,G,...,False,0.0,,other non-conserving,,,,,,19:12
245446,HS6ST2,90161,37,X,132091282,132091282,+,Silent,SNP,G,...,False,0.0,,silent,35:30,,,,35:32,16:10
245447,SLITRK4,139065,37,X,142717709,142717709,+,Missense_Mutation,SNP,G,...,False,0.0,,other non-conserving,125:0,,,,128:0,37:0
245448,MAGEA11,4110,37,X,148798368,148798368,+,Missense_Mutation,SNP,G,...,False,0.0,,other non-conserving,96:1,,,,69:1,47:0


In [None]:
# The mutated() method works the same way on all CanDI objects
a549.mutated(variant="Variant_Classification", item="Nonsense_Mutation")[:10]


## Cancer Methods and Attributes


In [26]:
pretty_print_attr(lung)

Attributes:

disease: Lung Cancer
get_name: Lung Cancer
ccle_names list first item: NCIH2077_LUNG
depmap_ids list first item: ACH-000010
names list first item: NCI-H2077
ccle_names length: 273
depmap_ids length: 273
names length: 273

Methods:

cn_normal
deletion
dependency_of
dependent
duplication
effect_of
essential
expressed
expression_of
mutated
mutation_matrix
non_dependent
non_essential
sexes
sources
subtypes
unexpressed


Cancer objects work essentially as a group of CellLine objects.
The Cancer.expression attribute returns a pandas DataFrame rather than a pandas Series, since there are multiple cell lines to consider.

In [27]:
lung.expression

gene,ACH-000010,ACH-000012,ACH-000015,ACH-000021,ACH-000029,ACH-000030,ACH-000033,ACH-000035,ACH-000062,ACH-000066,...,ACH-001386,ACH-001549,ACH-001555,ACH-001556,ACH-001557,ACH-001558,ACH-001559,ACH-001560,ACH-001561,ACH-001562
TSPAN6,4.447579,5.802452,4.794936,4.831371,6.491532,5.399855,5.391974,4.888013,4.731726,5.753818,...,3.060047,0.773996,2.769772,4.280956,4.089159,4.799087,4.275007,4.621173,4.399855,4.628774
TNMD,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.028569,0.000000,0.000000,...,0.000000,0.028569,0.000000,0.000000,0.028569,0.000000,0.000000,0.000000,0.028569,0.000000
DPM1,7.227760,5.997744,6.929436,6.498570,6.672850,6.307246,6.866661,7.254840,5.849499,5.514438,...,6.837691,5.908333,7.010220,7.745170,7.149137,7.057992,7.379725,7.168923,7.191109,7.704941
SCYL3,2.405992,1.970854,2.952334,2.414136,2.475085,2.017922,1.803227,2.289834,2.503349,2.536053,...,2.472488,1.831877,1.895303,2.304511,2.220330,2.204767,2.000000,1.871844,2.060047,2.589763
C1orf112,4.594549,3.784504,3.709291,4.527946,4.486714,3.671293,3.841973,3.795975,3.761285,4.566206,...,4.571677,4.060912,1.550901,4.306700,4.551516,3.526069,4.251719,2.153805,3.231125,3.066950
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
POLR2J3,4.516646,5.238023,5.458119,6.648465,5.239933,5.650190,6.129489,4.511595,5.559798,5.768449,...,5.689020,3.447579,5.375735,3.456806,4.397118,5.673556,3.759156,4.130931,4.688740,5.414474
H2BE1,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
AL445238.1,0.000000,0.000000,0.000000,0.000000,0.000000,0.124328,0.000000,0.000000,0.298658,0.000000,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
GET1-SH3BGR,0.807355,0.536053,0.545968,0.757023,0.150560,0.650765,0.941106,0.432959,0.748461,0.536053,...,0.321928,0.739848,0.739848,0.495695,1.659925,1.028569,0.389567,1.327687,0.650765,0.807355


The Cancer.expressed() method uses an arbitrary threshold to filter genes. By default, a gene is reported as expressed only if it is expressed in 100 percent of the cell lines within the Cancer object.

In [28]:
lung.expressed()[0:10]

['DPM1',
 'SCYL3',
 'C1orf112',
 'GCLC',
 'NFYA',
 'LAS1L',
 'ANKIB1',
 'CYP51A1',
 'KRIT1',
 'RAD52']

The user can relax this threshold as necessary with the threshold argument.

In [29]:
lung.expressed(threshold=0.50)[0:10]

['TSPAN6',
 'DPM1',
 'SCYL3',
 'C1orf112',
 'CFH',
 'FUCA2',
 'GCLC',
 'NFYA',
 'STPG1',
 'NIPAL3']
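
The fraction-based rule can be sketched as follows. This is an illustration, not CanDI's implementation, and it rests on two assumptions: a gene counts as expressed in a cell line when its value is above `cutoff` (1.0 here), and it is reported when the fraction of such cell lines is at least `threshold`. Whether CanDI uses a strict or non-strict comparison is not stated in this tutorial. The values are copied from the lung expression tables in this section for four of the cell lines, and `expressed_genes` is a hypothetical helper.

```python
# Columns: ACH-000010, ACH-000012, ACH-001549, ACH-001555 (values from the lung tables).
lung_subset = {
    "TSPAN6": [4.447579, 5.802452, 0.773996, 2.769772],
    "TNMD": [0.000000, 0.000000, 0.028569, 0.000000],
    "DPM1": [7.227760, 5.997744, 5.908333, 7.010220],
    "CFH": [3.404631, 2.750607, 0.000000, 1.704872],
}


def expressed_genes(table, threshold=1.0, cutoff=1.0):
    """Genes expressed (value > cutoff) in at least `threshold` of the cell lines."""
    genes = []
    for gene, values in table.items():
        fraction = sum(value > cutoff for value in values) / len(values)
        if fraction >= threshold:
            genes.append(gene)
    return genes


strict = expressed_genes(lung_subset)
relaxed = expressed_genes(lung_subset, threshold=0.50)
print(strict)
print(relaxed)
```

On this subset TSPAN6 and CFH pass only the relaxed threshold, which mirrors how they appear in the threshold=0.50 output but not in the default one.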

In [30]:
lung.expressed(threshold=0.50, style="values")

gene,ACH-000010,ACH-000012,ACH-000015,ACH-000021,ACH-000029,ACH-000030,ACH-000033,ACH-000035,ACH-000062,ACH-000066,...,ACH-001386,ACH-001549,ACH-001555,ACH-001556,ACH-001557,ACH-001558,ACH-001559,ACH-001560,ACH-001561,ACH-001562
TSPAN6,4.447579,5.802452,4.794936,4.831371,6.491532,5.399855,5.391974,4.888013,4.731726,5.753818,...,3.060047,0.773996,2.769772,4.280956,4.089159,4.799087,4.275007,4.621173,4.399855,4.628774
DPM1,7.227760,5.997744,6.929436,6.498570,6.672850,6.307246,6.866661,7.254840,5.849499,5.514438,...,6.837691,5.908333,7.010220,7.745170,7.149137,7.057992,7.379725,7.168923,7.191109,7.704941
SCYL3,2.405992,1.970854,2.952334,2.414136,2.475085,2.017922,1.803227,2.289834,2.503349,2.536053,...,2.472488,1.831877,1.895303,2.304511,2.220330,2.204767,2.000000,1.871844,2.060047,2.589763
C1orf112,4.594549,3.784504,3.709291,4.527946,4.486714,3.671293,3.841973,3.795975,3.761285,4.566206,...,4.571677,4.060912,1.550901,4.306700,4.551516,3.526069,4.251719,2.153805,3.231125,3.066950
CFH,3.404631,2.750607,3.560715,2.153805,0.014355,1.214125,0.807355,0.367371,5.036503,3.881665,...,0.056584,0.000000,1.704872,2.901108,5.655638,3.142413,3.300124,0.111031,2.857981,6.017254
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
UPK3BL2,3.279471,2.729009,3.976364,2.643856,5.422233,2.073820,3.065228,3.572890,3.241840,3.084064,...,2.353323,0.863938,0.565597,0.516015,0.650765,1.565597,0.163499,0.887525,0.137504,2.063503
AC093512.2,2.653060,3.700440,2.214125,2.403268,2.933573,3.944858,2.965323,3.026800,2.778209,4.265287,...,4.765535,5.610877,2.424922,4.426265,2.324811,2.344828,1.799087,1.899176,2.904966,2.720278
ARHGAP11B,1.827819,1.695994,0.903038,2.666757,2.195348,1.056584,1.655352,2.996389,1.682573,2.364572,...,1.807355,1.541019,0.613532,0.773996,1.176323,0.565597,1.744161,0.137504,0.956057,1.550901
ABCF2-H2BE1,1.521051,3.132577,1.646163,0.855990,2.100978,2.797013,2.419539,0.344828,2.981853,2.482848,...,0.823749,0.189034,3.712596,1.007196,3.446256,2.592158,1.150560,2.754888,1.765535,3.129283


Cancer and CellLineCluster objects have an additional method, mutation_matrix(), that outputs a binary matrix showing which genes are mutated in which cell lines.

In [31]:
lung.mutation_matrix()

,A1BG,A1CF,A2M,A2ML1,A3GALT2,A4GALT,A4GNT,AAAS,AACS,AADAC,...,ZWILCH,ZWINT,ZXDA,ZXDB,ZXDC,ZYG11A,ZYG11B,ZYX,ZZEF1,ZZZ3
ACH-000523,1,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,0,0,0,0
ACH-000749,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
ACH-000787,1,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
ACH-000852,1,0,1,1,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
ACH-000867,1,0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
ACH-000521,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
ACH-000010,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
ACH-000589,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
ACH-000575,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
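
A binary matrix like this can be derived from a long table of mutation records by marking each (cell line, gene) pair that appears at least once. The sketch below shows the idea with made-up records and nested dictionaries. It is not how CanDI builds its matrix, and `DepMap_ID`, the toy IDs, and `binary_matrix` are hypothetical names.

```python
# Made-up mutation records for illustration only (not real DepMap data).
records = [
    {"DepMap_ID": "LINE-1", "gene": "A1BG"},
    {"DepMap_ID": "LINE-1", "gene": "ZWINT"},
    {"DepMap_ID": "LINE-2", "gene": "A1BG"},
    {"DepMap_ID": "LINE-2", "gene": "A1BG"},  # repeated pairs still yield a single 1
]


def binary_matrix(rows):
    """Return {cell line: {gene: 0 or 1}} with 1 where any mutation record exists."""
    lines = sorted({row["DepMap_ID"] for row in rows})
    genes = sorted({row["gene"] for row in rows})
    table = {line: {gene: 0 for gene in genes} for line in lines}
    for row in rows:
        table[row["DepMap_ID"]][row["gene"]] = 1
    return table


toy_matrix = binary_matrix(records)
for line, flags in toy_matrix.items():
    print(line, flags)
```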


## Organelle Methods and Attributes


In [32]:
pretty_print_attr(membrane)

Attributes:

conf: 3
get_name: Plasma membrane
location: Plasma membrane
genes list first item: ABCA7
genes length: 1547

Methods:

cn_normal
deletion
dependency_of
dependent
duplication
effect_of
essential
expressed
expression_of
genes_and_conf
mutated
non_dependent
non_essential
unexpressed
