## Extract Driver Mutations - Indels and CNVs

#### Paper: Nik-Zainal et al. http://www.nature.com/nature/journal/v534/n7605/full/nature17676.html

Indel analysis used in the paper:
* n_gene = number of indel sites in the gene observed across all samples
* S_gene = size of the coding region of the gene
* u_indel = sum(n_gene)/sum(S_gene), i.e. the average indel rate per coding base across all genes
* Exp_indel = u_indel * S_gene * gene_indel_rate
* gene_indel_rate --> glm.nb(n_indel ~ offset(log(Exp_indel))-1)
    * basically not informative
    * but does give the overdispersion parameter we need and P-values for indel recurrence in the gene
* excluded genes --> those identified as drivers in SNV analysis
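A minimal sketch of the uniform-rate part of the calculation above, assuming a toy table of per-gene indel counts and coding lengths. The gene names and numbers are hypothetical, and the negative binomial fit (`glm.nb`) is not reproduced here; the sketch only shows how `u_indel` and the per-gene expectation could be derived.

```python
# Hypothetical per-gene inputs: (indel count n_gene, coding length S_gene in bp).
genes = {
    "GENE_A": (4, 3000),
    "GENE_B": (1, 1500),
    "GENE_C": (0, 5500),
}

def expected_indels(genes):
    """Return (u_indel, {gene: expected indel count}) under a uniform per-base rate."""
    total_n = sum(n for n, _ in genes.values())
    total_s = sum(s for _, s in genes.values())
    # u_indel = sum(n_gene) / sum(S_gene)
    u_indel = total_n / float(total_s)
    # Expected count per gene before any gene-specific rate is applied.
    expected = {g: u_indel * s for g, (_, s) in genes.items()}
    return u_indel, expected

u_indel, expected = expected_indels(genes)
```

By construction the expected counts sum to the total number of observed indels, so only the distribution across genes is being modelled.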

Combining tests together:
* Fisher's method
* FDR control is stratified to increase sensitivity (Stratified false discovery control for large-scale hypothesis testing with application to genome-wide association studies. Genet Epidemiol, 2006)
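For reference, Fisher's method can be sketched in a few lines. The function name `fisher_combine` is made up for this note, and the sketch assumes the p-values are independent and strictly positive (for example, a substitution-based and an indel-based p-value for the same gene). It uses the closed-form chi-square tail for even degrees of freedom, so it needs only the standard library.

```python
import math

def fisher_combine(pvalues):
    """Combine independent p-values (each > 0) with Fisher's method."""
    k = len(pvalues)
    # Test statistic: X = -2 * sum(ln p_i); chi-square with 2k degrees of freedom under the null.
    x = -2.0 * sum(math.log(p) for p in pvalues)
    # For even degrees of freedom the chi-square upper tail has a closed form:
    # P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return min(1.0, math.exp(-half) * total)

combined = fisher_combine([0.05, 0.05])
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the formula.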

Improvements:
* why not quantify the indels by context as well?
* neutral reference: what about using gold standard indel sets? PacBio? other cancers?

Brainstorming:
* Use the proteomics data to validate the driver mutations (already done it seems: http://www.nature.com/nature/journal/v534/n7605/full/nature18003.html)
* Correlate specific mutation signatures or gene expression or CNVs or microsatellite/TRs to mutations themselves
* limiting dilution
* machine learning approach to cluster gene regions or specific mutations together
    * inputs: possible driver genes + gene expression
    * output: driver mutations + random passengers and the gene expression signature
        * similar to hierarchical clustering but not really

In [1]:
%load_ext rpy2.ipython
%matplotlib inline
%run ipy_setup.py

#brcaFullMutation=pd.read_csv("~/zhang-ipy/Programs/TCGA_brca_sample_by_gene.tsv", sep="\t")
#brcaFullTumorExpression=pd.read_csv("~/zhang-ipy/brca/tcga_BRCA_RNAseqV2_All.txt", sep="\t")
#brcaFullPathway=pd.read_csv("~/zhang-ipy/Programs/TCGA_bayesian_megena_network_from_igor.tsv", sep="\t", nrows=1000)

## Indel analysis

In [2]:
varfile = pd.read_csv("/media/sf_D_DRIVE/zhang/bcgsc.ca__Multi-Center_Mutations_level2.maf",
                     sep="\t")
# keep variants whose start and end positions differ
indels = varfile[varfile["End_Position"] != varfile["Start_Position"]]
len(indels)

4304

In [18]:
import codecs
import csv
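# csv.reader() requires an iterable of lines (e.g. an open file object);
# calling it with no argument raises a TypeError, so it is left commented out
# csvReader = csv.reader()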

In [3]:
import os
indels_list = []
files = ["brca", "cesc", "esca","laml","ovarian","sarc","stad","thym","ucec"]
good_cols = varfile.columns.copy()
for f in files:
    print f
    started = False  # set once the MAF header row (starting with Hugo_Symbol) is seen
    indels_tmp = []
    with open("/media/sf_D_DRIVE/ipython/tcga_somatic_sets/" + f, "r") as fp:
        for line in fp:
            split_line = line.split("\t")
            if "Hugo_Symbol" == split_line[0]:
                print len(split_line)
                started = True
                continue
            # skip preamble lines before the header and rows with too few columns
            if not started or len(split_line) < 34:
                continue
            # "-" marks an empty allele; strip it so allele lengths can be compared
            for i in range(10,13):
                split_line[i] = split_line[i].replace("-", "")
            # keep rows where start != end and the alternate allele length differs from the reference
            if (int(split_line[6]) != int(split_line[5])) and (len(split_line[12]) != len(split_line[10])):
                indels_tmp.append(split_line[0:34])  # first 34 MAF columns
    indels_list.extend(indels_tmp)

brca
55
cesc
73
esca
67
laml
64
ovarian
55
sarc
73
stad
67
thym
70
ucec
49


In [4]:
# all indels from TCGA somatic data - this is our null reference
indels_df = pd.DataFrame(indels_list, columns=good_cols[0:34])
indels_df.to_csv('indels_all_tcga_cancer_somatic.tsv', sep="\t", index=None)

In [87]:
indels_df

Unnamed: 0,Hugo_Symbol,Entrez_Gene_Id,Center,NCBI_Build,Chromosome,Start_Position,End_Position,Strand,Variant_Classification,Variant_Type,Reference_Allele,Tumor_Seq_Allele1,Tumor_Seq_Allele2,dbSNP_RS,dbSNP_Val_Status,Tumor_Sample_Barcode,Matched_Norm_Sample_Barcode,Match_Norm_Seq_Allele1,Match_Norm_Seq_Allele2,Tumor_Validation_Allele1,Tumor_Validation_Allele2,Match_Norm_Validation_Allele1,Match_Norm_Validation_Allele2,Verification_Status,Validation_Status,Mutation_Status,Sequencing_Phase,Sequence_Source,Validation_Method,Score,BAM_File,Sequencer,Tumor_Sample_UUID,Matched_Norm_Sample_UUID
0,A2ML1,0,genome.wustl.edu,37,12,9016563,9016564,+,Frame_Shift_Del,DEL,GC,GC,,novel,,TCGA-D8-A27N-01A-11D-A16D-09,TCGA-D8-A27N-10A-01D-A16D-09,GC,GC,,,,,Unknown,Untested,Somatic,Phase_IV,WXS,none,1,dbGAP,Illumina GAIIx,6a411174-582a-4c68-bb04-5ea2e504bf7c,de420622-f742-4781-873a-e9faffc94dd3
1,AAK1,22848,genome.wustl.edu,37,2,69870164,69870166,+,In_Frame_Del,DEL,CTT,CTT,,novel,,TCGA-E2-A14V-01A-11D-A12B-09,TCGA-E2-A14V-10A-01D-A12B-09,CTT,CTT,,,,,Unknown,Untested,Somatic,Phase_IV,WXS,none,1,dbGAP,Illumina GAIIx,703314fe-bfd5-45d5-9ed5-fcdce8a19fd6,e49f6bc1-0933-4500-9068-aa7d8c276b22
2,AAMP,0,genome.wustl.edu,37,2,219134765,219134766,+,Frame_Shift_Ins,INS,,,G,novel,,TCGA-A8-A08S-01A-11W-A050-09,TCGA-A8-A08S-10A-01W-A055-09,-,-,,,,,Unknown,Untested,Somatic,Phase_IV,WXS,none,1,dbGAP,Illumina GAIIx,9c981525-80af-4f79-b94a-be00131ab872,91135c15-292f-49d0-9eb0-ca16ac6a6a32
3,AASS,10157,genome.wustl.edu,37,7,121721545,121721553,+,Splice_Site,DEL,GTCACTCAC,GTCACTCAC,,novel,,TCGA-BH-A18Q-01A-12D-A12B-09,TCGA-BH-A18Q-11A-34D-A12B-09,GTCACTCAC,GTCACTCAC,,,,,Unknown,Untested,Somatic,Phase_IV,WXS,none,1,dbGAP,Illumina GAIIx,a4de6680-33c3-4f6f-8696-453470a00bcb,7a579f39-1175-4207-a069-b49b9cd87782
4,ABCA1,19,genome.wustl.edu,37,9,107556793,107556794,+,Splice_Site,INS,,,AAA,novel,,TCGA-AO-A0JC-01A-11W-A071-09,TCGA-AO-A0JC-10A-01W-A071-09,-,-,,,,,Unknown,Untested,Somatic,Phase_IV,WXS,none,1,dbGAP,Illumina GAIIx,120f55df-5d1d-4073-a21a-632c892d3da9,a9132e75-b777-4399-969c-d1e8fab91117
5,ABCA7,0,genome.wustl.edu,37,19,1061826,1061827,+,Frame_Shift_Del,DEL,TT,TT,,novel,,TCGA-AR-A256-01A-11D-A167-09,TCGA-AR-A256-10A-01D-A167-09,TT,TT,,,,,Unknown,Untested,Somatic,Phase_IV,WXS,none,1,dbGAP,Illumina GAIIx,ea43434b-197e-48ac-ae2e-46bc7f3776de,09f34836-fc9a-4ff4-a0c9-fd93259e32f3
6,ABCB6,10058,genome.wustl.edu,37,2,220082498,220082499,+,Frame_Shift_Ins,INS,,,T,novel,,TCGA-A8-A07E-01A-11W-A050-09,TCGA-A8-A07E-10A-01W-A055-09,-,-,,,,,Unknown,Untested,Somatic,Phase_IV,WXS,none,1,dbGAP,Illumina GAIIx,fa018a20-2c26-4d47-831f-75280b6464df,305efad7-ff5c-4c4a-8f59-2f5fd43710ec
7,ABCC3,0,genome.wustl.edu,37,17,48755424,48755433,+,Frame_Shift_Del,DEL,CCCATTTCCT,CCCATTTCCT,,novel,,TCGA-D8-A1XS-01A-11D-A14K-09,TCGA-D8-A1XS-10A-01D-A14K-09,CCCATTTCCT,CCCATTTCCT,,,,,Unknown,Untested,Somatic,Phase_IV,WXS,none,1,dbGAP,Illumina GAIIx,5d302c04-302e-4040-9429-37cd672e8d53,9b6db1ae-c8eb-4b02-9913-76070077fedf
8,ABCC9,10060,genome.wustl.edu,37,12,21981950,21981959,+,Frame_Shift_Del,DEL,GTATCCGTCA,GTATCCGTCA,,novel,,TCGA-AN-A0AR-01A-11W-A019-09,TCGA-AN-A0AR-10A-01W-A021-09,GTATCCGTCA,GTATCCGTCA,,,,,Unknown,Untested,Somatic,Phase_IV,WXS,none,1,dbGAP,Illumina GAIIx,a2d77acd-89db-4d2d-89d7-d1cc58cf576b,75042bcc-6363-44b2-97bd-c51d3789ce67
9,ABCC9,10060,genome.wustl.edu,37,12,22035726,22035727,+,Frame_Shift_Ins,INS,,,G,novel,,TCGA-AR-A0TP-01A-11D-A099-09,TCGA-AR-A0TP-10A-01D-A099-09,-,-,,,,,Unknown,Untested,Somatic,Phase_IV,WXS,none,1,dbGAP,Illumina GAIIx,bee5b9c8-739e-4530-b140-cd2b898d7afd,b95ee272-a889-47da-8366-353b57fb93b8


In [92]:
for name, group in indels_df.groupby("Variant_Classification"):
    print name, len(group)
    print "" 
for name, group in indels_df.groupby("Hugo_Symbol"):
    print name, len(group)

Frame_Shift_Del 3910

Frame_Shift_Ins 8416

In_Frame_Del 4167

In_Frame_Ins 977

RNA 3626

Splice_Site 810

A2ML1 4
AADAC 1
AADACL3 1
AADAT 1
AAK1 4
AAMP 1
AARD 1
AARSD1 1
AASDH 5
AASS 1
ABCA1 4
ABCA10 2
ABCA12 5
ABCA13 3
ABCA5 3
ABCA6 2
ABCA7 1
ABCA8 2
ABCA9 3
ABCB1 4
ABCB10 1
ABCB4 3
ABCB5 1
ABCB6 2
ABCB8 2
ABCC1 4
ABCC13 2
ABCC2 1
ABCC3 6
ABCC4 1
ABCC5 3
ABCC6 2
ABCC8 2
ABCC9 5
ABCD2 1
ABCD3 3
ABCD4 1
ABCF1 3
ABCF2 2
ABCG2 1
ABHD10 1
ABHD12B 1
ABHD13 1
ABHD17A 3
ABHD5 1
ABHD8 1
ABI2 1
ABI3BP 2
ABL1 3
ABL2 2
ABLIM2 3
ABRA 1
ABRACL 1
ABT1 1
ABTB1 2
ABTB2 1
ACAA2 1
ACACA 1
ACACB 3
ACAD10 1
ACAD11 1
ACAD9 1
ACADL 1
ACADM 1
ACADVL 2
ACAN 1
ACAP1 1
ACAP2 2
ACAP3 1
ACAT1 2
ACBD5 1
ACCN1 1
ACCN4 1
ACE 1
ACE2 3
ACHE 2
ACIN1 2
ACLY 1
ACO1 2
ACO2 3
ACOT11 1
ACOT4 1
ACOX2 2
ACOX3 1
ACP1 3
ACP2 1
ACP6 1
ACPP 1
ACPT 2
ACR 1
ACSF2 4
ACSL1 1
ACSL5 3
ACSL6 1
ACSM3 2
ACSS1 4
ACSS2 4
ACSS3 1
ACTA1 1
ACTA2 2
ACTB 2
ACTBL2 1
ACTC1 1
ACTG1 3
ACTG2 2
ACTL7B 2
ACTR1B 3
ACTR8 3
ACTRT2 1
ACVR1 1
ACVR1C 1
ACV

In [90]:
len(set(indels_df["Tumor_Sample_Barcode"]))

2217

In [None]:
# process pseudocode:
    # 1. get the number of indels per gene - done
    # 2. get the number of samples in the total file - done - 2217
    # 3. get the length of each gene
        # measure the length of non-translated sequence - useful for RNA and splice site mutations
        # measure the number of codons - useful for # of frame shift and in frame mutations
    # 4. get the # of occurrences of the indel sequence in the gene
        # has to be done on a per-indel basis
        # this rate is measured globally and applied to the equation as a multiplier (less than or greater than 1)
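Steps 1 and 2 of the plan above amount to two simple tallies. A small sketch on made-up records follows; the gene symbols and sample barcodes are hypothetical, and in the notebook itself the same numbers come from `indels_df`.

```python
from collections import Counter

# Hypothetical indel records: (gene symbol, tumor sample barcode).
records = [
    ("GENE_A", "SAMPLE-01"),
    ("GENE_A", "SAMPLE-02"),
    ("GENE_B", "SAMPLE-01"),
    ("GENE_A", "SAMPLE-02"),
]

# Step 1: number of indels per gene.
indels_per_gene = Counter(gene for gene, _ in records)
# Step 2: number of distinct samples in the whole set.
n_samples = len(set(sample for _, sample in records))
```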

In [5]:
%run ipy_setup.py
import varcode, pyensembl
from pyensembl import EnsemblRelease
import pysam
import numpy as np
import scipy as sp
import pandas as pd
from scipy import stats
import itertools
import os.path, sys
import json, pickle

es = EnsemblRelease(75)
test = ''
for ix,var in indels_df.iterrows():
    pos = var["Start_Position"]
    origbase = var["Reference_Allele"]
    altbase = var["Tumor_Seq_Allele2"]
    chrom = var["Chromosome"]
    peffect_obj = varcode.Variant(chrom, pos, origbase,altbase,ensembl=es).effects()
    peffect = str(peffect_obj.top_priority_effect()).split("(")[0]
    test = peffect_obj.top_priority_effect()
    print peffect, var["Variant_Classification"], var["Strand"]
    break

FrameShift Frame_Shift_Del +


In [52]:
gene_info = []
pbar = ProgressBar(len(set(indels_df["Hugo_Symbol"])))
for gene in set(indels_df["Hugo_Symbol"]):
    pbar.animate()
    test = ''
    try:
        if ('ENSG' in gene) or ('LOC' in gene):
            test = es.gene_by_id(gene)
        else:
            test = es.genes_by_name(gene)[0]
    except ValueError:
        #print "Warning: lookup failed for " + gene 
        continue
    pct = [(a.length, a) for a in test.transcripts if a.biotype=="protein_coding"]
    if len(pct) == 0:
        continue
    i = max(pct, key=lambda x:x[0])[1] # gets the longest transcript
    #print len(i.coding_sequence) # this is proportional to the number of in-frame and frame-del mutations
    #print len(i.five_prime_utr_sequence) + len(i.three_prime_utr_sequence) # this is proportional to the number of RNA muts
    #print len(i.exon_intervals) # this is proportional to the number of possible splice site mutations
    #print i.end-i.start - len(i.sequence) # this is proportional to the intronic portion (not called it seems?)
    if i.contains_start_codon and i.contains_stop_codon:
        gene_info.append([gene, len(i.exon_intervals), len(i.sequence), len(i.coding_sequence), len(i.five_prime_utr_sequence) + len(i.three_prime_utr_sequence)])

In [184]:
len(gene_info)

8155

In [166]:
pd.DataFrame(gene_info).to_csv("gene-exons-seq-cod-utr.tsv", sep="\t")

In [None]:
# equation for predicting the number of events

# exp_gene = [len_coding*(frameshift rate + inframe rate) + len_UTRs*(RNA rate) + num_Exons*(splice_site rate)]*gene_mut_rate
# identify genes based on their dN/dS rate (nonsynonymous/syn rate) indicating enrichment
# other possible correlate: the repeat content of the gene
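The expected-count equation in the comment above can be written as a small function. This is only a sketch: the function name, the dictionary keys and the example gene are hypothetical, and `gene_mut_rate` defaults to 1.0 because no per-gene rate has been estimated at this point. The rate values are the ones printed by the rate-estimation cell of this notebook, rounded.

```python
def expected_indel_count(coding_len, utr_len, n_exons, rates, gene_mut_rate=1.0):
    """Expected indel events for one gene:
    [coding*(frameshift + inframe) + UTR*RNA + exons*splice_site] * gene_mut_rate
    """
    per_gene = (
        coding_len * (rates["frameshift"] + rates["inframe"])
        + utr_len * rates["rna"]
        + n_exons * rates["splice_site"]
    )
    return per_gene * gene_mut_rate

# Per-class rates as printed in this notebook (rounded).
rates = {"frameshift": 0.000699, "inframe": 0.000292, "rna": 0.00231, "splice_site": 0.0773}

# Hypothetical gene: 1,500 bp coding sequence, 800 bp of UTR, 10 exons.
exp_gene = expected_indel_count(1500, 800, 10, rates)
```

Keeping the four mutation classes as separate terms makes it easy to see which class dominates the expectation for a given gene.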

In [53]:
# frameshift rate = num frameshift events / len_coding_all
# inframe rate = num inframe events / len_coding_all
# UTR rate = num RNA events / len_UTRs_all
# splice_site_rate = num splice events / num exons all

genes_df = pd.DataFrame(gene_info, columns=["gene", "exons", "total_len", "coding_len", "utr_len"])
genes_df.set_index("gene", inplace=True)
frameshift_count = len(indels_df[["Frame_Shift" in i for i in indels_df["Variant_Classification"]]])
inframe_count = len(indels_df[["In_Frame" in i for i in indels_df["Variant_Classification"]]])
code_len = 0
for g in set(indels_df[["Frame" in i for i in indels_df["Variant_Classification"]]]["Hugo_Symbol"]):
    if g in genes_df.index:
        code_len += genes_df.loc[g]["coding_len"]
framerate = frameshift_count/float(code_len)
inframerate = inframe_count/float(code_len)

utr_count = len(indels_df[["RNA" in i for i in indels_df["Variant_Classification"]]])
utr_len = 0
for g in set(indels_df[["RNA" in i for i in indels_df["Variant_Classification"]]]["Hugo_Symbol"]):
    if g in genes_df.index:
        utr_len += genes_df.loc[g]["utr_len"]
utrrate = utr_count / float(utr_len)

splice_count = len(indels_df[["Splice" in i for i in indels_df["Variant_Classification"]]])
splice_len = 0
for g in set(indels_df[["Splice" in i for i in indels_df["Variant_Classification"]]]["Hugo_Symbol"]):
    if g in genes_df.index:
        splice_len += genes_df.loc[g]["exons"]
splicerate = splice_count / float(splice_len)
print framerate, inframerate, utrrate, splicerate

0.000699115708811 0.000291761415392 0.00230589701683 0.0773491214668


In [54]:
indels_df.set_index("Hugo_Symbol", inplace=True)

In [55]:
varfile = pd.read_csv("/media/sf_D_DRIVE/zhang/bcgsc.ca__Multi-Center_Mutations_level2.maf", sep="\t")
varfile = varfile[varfile["End_Position"] != varfile["Start_Position"]]
n_o = 2217  # samples in the pan-cancer reference set (counted above)
n_i = len(set(varfile["Tumor_Sample_Barcode"]))  # samples in this cohort
adj = n_o/float(n_i)  # scales reference-set expectations to this cohort's size
labels = []
y = []  # observed indels per gene
x = []  # expected indels per gene
mut_arr = []
for name, group in varfile.groupby("Hugo_Symbol"):
    if name in genes_df.index:
        labels.append(name)
        y.append(len(group))
code_arr = []
exon_arr = []
utr_arr = []
gene_arr = []
#indels_df.set_index("Hugo_Symbol", inplace=True)
for name in labels:
    # a list indexer keeps a DataFrame even when the gene has a single row
    group = indels_df.loc[[name]]
    group = group[["In_Frame" in i for i in group["Variant_Classification"]]]
    gene = genes_df.loc[name]
    pred_mut = inframerate*gene["coding_len"]
    # observed/expected in-frame ratio, floored at 1
    mut_arr.append(max(len(group)/pred_mut,1))
    
for ix, name in enumerate(labels):
    gene = genes_df.loc[name]
    gene_arr.append(gene["total_len"])
    code_arr.append(gene["coding_len"])
    exon_arr.append(gene["exons"]*2)
    utr_arr.append(gene["utr_len"])
    # expected count from the per-class rates
    # note: splicerate above was estimated per exon, while exons*2 is used here
    pred = ((framerate+inframerate)*gene["coding_len"] + splicerate*gene["exons"]*2 + utrrate*gene["utr_len"])
    pred = pred/adj
    x.append(pred)

In [56]:
sig_table = []
for gene,x_i,y_i in zip(labels,x,y):
    n_sites = genes_df.loc[gene]["total_len"]
    n_participants = n_i
    pred_sites = x_i
    # one-sided binomial test: are there more indels than expected,
    # with one trial per transcript base per sample?
    sig = sp.stats.binom_test(y_i, n=n_sites*n_participants, p=pred_sites/(n_sites*n_participants), 
                                alternative="greater")
    sig_table.append([gene, sig, x_i, y_i])
# genes sorted by ascending p-value
pd.DataFrame(sorted(sig_table, key=lambda x:x[1]), columns=["gene","indel_p", "exp", "obs"]).to_csv("indel_p_ovarian.tsv", sep="\t")
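Because the number of trials here is very large and the per-trial probability very small, the binomial upper tail is close to a Poisson upper tail with the same mean. The sketch below is that Poisson approximation, not the binomial test used in the cell above; the function name is hypothetical and it relies only on the standard library.

```python
import math

def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    # Accumulate P(X = 0), P(X = 1), ..., P(X = observed - 1), then take the complement.
    term = math.exp(-expected)
    cdf = term
    for k in range(1, observed):
        term *= expected / float(k)
        cdf += term
    return max(0.0, 1.0 - cdf)

# Example: 3 observed indels where 1 was expected.
p_value = poisson_upper_tail(3, 1.0)
```

Taking `1 - cdf` loses precision for extremely small p-values, so this is meant as a sanity check on individual genes rather than a replacement for a library routine.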

## Copy number variation

In [2]:
copy_file = pd.read_csv("../brca/SCNA_input.txt", sep="\t")
copy_file.set_index(["Unique Name","Descriptor"], inplace=True)

In [4]:
rna_file = pd.read_csv("../brca/tcga_BRCA_RNAseqV2_All.txt", sep="\t")

In [5]:
cytobandfile = pd.read_csv("cytoBand.txt", sep="\t", header=None, names=["chrom", "start", "end", "locus", "giemsa"])
cytobandfile.set_index(["chrom","locus"], inplace=True)

In [8]:
es = EnsemblRelease(75)

In [None]:
import re
from sklearn import preprocessing
# columns: gene, chrom, pos, dn_avg, up_avg, spearmanp, kstestp, n_samples, rna_dn, rna_up
sig_results = []
pbar = ProgressBar(len(copy_file))
for ri in range(0,len(copy_file)):
    pbar.animate()
    # copy-number calls for this region, one value per TCGA sample
    df = copy_file[[c for c in copy_file.columns if "TCGA" in c]][ri:ri+1].astype(int)
    # (patient barcode, copy-number call) pairs, sorted by call
    df_as_array = sorted(zip(["-".join(i.split("-")[0:3]) for i in list(df.columns)], list(df.values[0])),key=lambda x:x[1])
    df_as_array_up = [i[0] for i in df_as_array if i[1] > 0]   # samples with a gain
    df_as_array_dn = [i[0] for i in df_as_array if i[1] <= 0]  # neutral or lost samples
    genes_to_look_at = []
    for ix,row in copy_file[ri:ri+1].iterrows():
        if "CN values" not in ix[0]:
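            # split a descriptor such as "8p11.23" into chromosome and band; [pq] and [XY]
            # replace [p|q] and [X|Y], which also matched a literal "|"
            chrom, pos = re.findall(r'\d{1,2}|[pq]\d{1,2}\.\d{1,2}|[pq]\d{1,2}|[XY]\s*', ix[1].strip())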
            gpos = cytobandfile.loc[("chr"+chrom,pos)]
            genes = es.gene_names_at_locus(chrom,gpos["start"], end=gpos["end"])
            genes_to_look_at.extend(genes)
            break
    genes_by_pos = []
    for gene in genes_to_look_at:
        genes_by_pos.append((gene, es.genes_by_name(gene)[0].start))
    genes_by_pos = sorted(genes_by_pos, key=lambda x:x[1])
    try:
        rna_to_see = rna_file.loc[genes_to_look_at].dropna()
    except KeyError:
        continue
    for rir in range(0,len(genes_by_pos)):
        if genes_by_pos[rir][0] in rna_to_see.index:
            rna_to_see2 = rna_to_see[[c for c in rna_to_see.columns if "TCGA" in c]].loc[genes_by_pos[rir][0]].copy()
            rna_to_see_up = rna_to_see2[[c for c in rna_to_see.columns if c in df_as_array_up]]
            rna_to_see_dn = rna_to_see2[[c for c in rna_to_see.columns if c in df_as_array_dn]]
            gene, pos = genes_by_pos[rir]
            dn = dict(df_as_array)
            cnv = [dn[c] for c in rna_to_see2.index if c in dict(df_as_array)]
            rna = [rna_to_see2[c] for c in rna_to_see2.index if c in dict(df_as_array)]
            rd = preprocessing.scale(rna_to_see_dn)
            ru = preprocessing.scale(rna_to_see_up)
            _,spearmanp = sp.stats.spearmanr(rna, cnv)
            _,kstestp = sp.stats.ks_2samp(rd, ru)
            _,ttestp = sp.stats.ttest_ind(rna_to_see_dn, rna_to_see_up)
            dn_avg = np.average(rna_to_see_dn)
            up_avg = np.average(rna_to_see_up)
            sig_results.append([gene, chrom, pos, dn_avg, up_avg, spearmanp, kstestp, len(cnv), rna_to_see_dn, rna_to_see_up])
            if (spearmanp < 0.001) and (kstestp < 0.001):
                print gene, spearmanp, kstestp, ttestp
                sb.plt.hold(b=True)
                sb.distplot(rd, bins=25)
                sb.distplot(ru, bins=25)
                sb.plt.show()
                sb.plt.hold(b=False)
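The core of the loop above is a two-group comparison: samples are split by the sign of their copy-number call and the expression of each gene in the region is compared between the groups. A stripped-down sketch of that split on made-up data follows. The sample names, calls and expression values are hypothetical, and the Spearman and KS tests are omitted.

```python
# Hypothetical copy-number calls and expression values per sample for one gene.
copy_number = {"S1": 1, "S2": 0, "S3": 2, "S4": -1, "S5": 0}
expression = {"S1": 11.0, "S2": 9.5, "S3": 11.5, "S4": 9.0, "S5": 10.0}

def split_by_copy_number(copy_number, expression):
    """Return expression values for gained (call > 0) and non-gained (call <= 0) samples."""
    shared = [s for s in sorted(copy_number) if s in expression]
    up = [expression[s] for s in shared if copy_number[s] > 0]
    dn = [expression[s] for s in shared if copy_number[s] <= 0]
    return up, dn

up, dn = split_by_copy_number(copy_number, expression)
up_avg = sum(up) / float(len(up))
dn_avg = sum(dn) / float(len(dn))
fold = up_avg / dn_avg  # same ratio as the "fold" column (up_avg / dn_avg)
amt = up_avg - dn_avg   # same difference as the "amt" column
```

Only samples present in both tables are used, which mirrors the barcode matching between the copy-number and RNA-seq files.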

In [None]:
for chrom in set(srdf[1]):
    print chrom
    # column 1 is the chromosome; columns 5 and 6 are the Spearman and KS p-values
    values = srdf[(srdf[1]==chrom) & (srdf[6]<1*10**-6) & (srdf[5]<1*10**-6)]
    plt.scatter(values[2], values['fold'])
    plt.show()

In [31]:
srdf = pd.DataFrame(sig_results)
with open("/media/sf_D_DRIVE/zhang/copy_number_variation_gene_linker_distr.pickle","wb") as fp:
    pickle.dump(srdf, fp)

In [27]:
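srdf = pd.DataFrame(sig_results)
# columns 3 and 4 hold dn_avg and up_avg; 5 and 6 hold the Spearman and KS p-values
srdf["fold"] = srdf[4]/srdf[3]
srdf["amt"] = srdf[4]-srdf[3]
# replace the stored expression vectors with their sample counts
srdf[8] = [len(i) for i in srdf[8]]
srdf[9] = [len(i) for i in srdf[9]]
srdf[(srdf[6]<9*10**-2) & (srdf[5]<9*10**-20)].sort_values(5, ascending=True)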



Unnamed: 0,0,1,2,3,4,5,6,7,8,9,fold,amt
702,ASH2L,8,37962760,10.089654,11.103010,6.504826e-102,2.306522e-05,824,527,297,1.100435,1.013355
1103,STARD3,17,37793318,9.418898,10.903584,2.618118e-99,8.207816e-08,824,577,247,1.157628,1.484686
706,DDHD2,8,38082736,9.808912,10.922693,3.254419e-92,5.720896e-05,824,527,297,1.113548,1.113781
704,LSM1,8,38020839,8.979558,10.009659,7.191480e-92,9.322522e-03,824,527,297,1.114716,1.030102
708,WHSC1L1,8,38127215,10.232712,11.330648,1.958991e-89,7.261607e-05,824,527,297,1.107297,1.097936
726,VDAC3,8,42249142,10.678985,11.488800,2.081447e-87,9.589432e-03,824,510,314,1.075833,0.809816
2781,MAP2K4,17,11924141,10.222836,9.396060,4.855681e-84,2.176653e-03,824,343,481,0.919125,-0.826777
2208,RIC8A,11,207511,11.089047,10.502356,3.123335e-83,1.185221e-02,824,599,225,0.947093,-0.586690
697,BRF2,8,37700786,8.266527,9.322489,4.189498e-83,1.712996e-02,824,527,297,1.127740,1.055962
876,FADD,11,70049269,9.151340,10.072154,3.543290e-80,2.379057e-02,824,525,299,1.100621,0.920814
