Quotes from my discussion with Hani:

> (1) Promoter bound by PAI-1 according to ChIP-seq

> (2) Some gene expression change in response to PAI-1 knockdown (the more the better)

> (3) Looking good in IGV (basically something good-looking like what Charles had attached).



> I had already generated those results. Here are the genes with promoter binding:
        `PAI1_ChIP-seq/result_ChIPseq/1.Peak_Calling/RT-112_ChIP_intersect.hg19.annotated.list`
and
        `PAI1_ChIP-seq/result_ChIPseq/1.Peak_Calling/UMUC3_ChIP_intersect.hg19.annotated.list`

> the RNA-seq files are:
`RNA-seq/RT112/exp/PAI1-KD_RT112_logFC.nd.txt`
and
`RNA-seq/UC3/exp/PAI1-KD_UC3_logFC.nd.txt`

> Simply take genes from the "annotated.list" files and overlap with those that are up-regulated in the RNA-seq files. Then start from the ones with the highest logFC and generate some IGV plots.

First, I intersect the genes annotated in the PAI-1 ChIP-seq data with the genes whose expression changes in the RNA-seq data. Then, I take the `bigwig` files from the MACS2 signal track (fold-enrichment) results for all samples (each sample's HTML report, such as [RT112-H3K27ac](https://gitlab.com/abardesigner/goodarzilab-abe/-/blob/master/People/Rosser/data-2020-04-24/RT112-H3K27ac/RT112-H3K27ac.qc.html), describes where that data comes from). Finally, I extract a `wig` file covering each top gene from every `bigwig` file to make IGV plots.

In [6]:
import numpy as np
import pandas as pd 
import subprocess 

Read the PAI-1 ChIP-seq annotated gene lists (as Python lists) and the RNA-seq log fold changes (as pandas DataFrames).

In [7]:
PAI1_RT112 = pd.read_csv('PAI1_ChIP-seq/result_ChIPseq/1.Peak_Calling/RT-112_ChIP_intersect.hg19.annotated.list',header=None).loc[:,0].to_list()
PAI1_UMUC3 = pd.read_csv('PAI1_ChIP-seq/result_ChIPseq/1.Peak_Calling/UMUC3_ChIP_intersect.hg19.annotated.list',header=None).loc[:,0].to_list()

RNASeq_RT112 = pd.read_csv('RNA-seq/RT112/exp/PAI1-KD_RT112_logFC.nd.txt', sep='\t', index_col=0)
RNASeq_UMUC3 = pd.read_csv('RNA-seq/UC3/exp/PAI1-KD_UC3_logFC.nd.txt', sep='\t', index_col=0)

Collect the paths of the fold-enrichment `bigwig` files into a list.

In [8]:
fc_bigwig = subprocess.getoutput('ls data-2020-04-24/*/*/*/*fc*bigwig').split('\n')
# pval_bigwig = subprocess.getoutput('ls data-2020-04-24/*/*/*/*pval*bigwig').split('\n')
fc_bigwig

['data-2020-04-24/RT112-H3K27ac/signal/pooled-rep/rep.pooled_x_RT112-Input_R1.nodup.fc.signal.bigwig',
 'data-2020-04-24/RT112-H3K27ac/signal/rep1/RT112-H3K27ac-1_R1.nodup_x_RT112-Input_R1.nodup.fc.signal.bigwig',
 'data-2020-04-24/RT112-H3K27ac/signal/rep2/RT112-H3K27ac-2_R1.nodup_x_RT112-Input_R1.nodup.fc.signal.bigwig',
 'data-2020-04-24/RT112-H3K4me3/signal/pooled-rep/rep.pooled_x_RT112-Input_R1.nodup.fc.signal.bigwig',
 'data-2020-04-24/RT112-H3K4me3/signal/rep1/RT112-H3K4me3-1_R1.nodup_x_RT112-Input_R1.nodup.fc.signal.bigwig',
 'data-2020-04-24/RT112-H3K4me3/signal/rep2/RT112-H3K4me3-2_R1.nodup_x_RT112-Input_R1.nodup.fc.signal.bigwig',
 'data-2020-04-24/UMUC3-H3K27ac/signal/pooled-rep/rep.pooled_x_UMUC3-Input_R1.nodup.fc.signal.bigwig',
 'data-2020-04-24/UMUC3-H3K27ac/signal/rep1/UMUC3-H3K27ac-1_R1.nodup_x_UMUC3-Input_R1.nodup.fc.signal.bigwig',
 'data-2020-04-24/UMUC3-H3K27ac/signal/rep2/UMUC3-H3K27ac-1_repeat_R1.nodup_x_UMUC3-Input_R1.nodup.fc.signal.bigwig',
 'data-2020-04-24/
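The same listing could probably be produced without shelling out to `ls`. The sketch below uses `pathlib` with the same four-level pattern as the cell above; the function name `find_fc_bigwigs` is hypothetical. One practical difference: `subprocess.getoutput` returns stdout and stderr together, so a pattern with no matches would put the `ls` error message into the list, whereas a glob simply returns an empty list.

```python
from pathlib import Path


def find_fc_bigwigs(root="data-2020-04-24"):
    """Return sorted paths of fold-enrichment bigwig files under `root`.

    Mirrors the shell pattern `<root>/*/*/*/*fc*bigwig` used above.
    `find_fc_bigwigs` is an illustrative name, not part of the notebook.
    """
    return sorted(str(p) for p in Path(root).glob("*/*/*/*fc*bigwig"))
```

Calling `find_fc_bigwigs()` from the directory that contains `data-2020-04-24` should give a list that can be passed wherever `fc_bigwig` is used.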

These functions run `bigWigToWig` to extract a `wig` file for a given region from each `bigwig` file (uncomment the `print` line to see the command).

In [19]:
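def bigwigger(CHR, STR, END, f, o):
    # Extract the region CHR:STR-END from bigwig file f into wig file o.
    cmd = f'bigWigToWig -chrom={CHR} -start={STR} -end={END} {f} {o}'
    subprocess.call(cmd, shell=True)
#     print (cmd)

def genewigger(gene, CHR, STR, END, path_to_bigwigs, path_to_results):
    # Write one wig file per bigwig into {path_to_results}/{gene}/.
    subprocess.call(f'mkdir -p {path_to_results}/{gene}', shell=True)
    for f in path_to_bigwigs:
        # For data-2020-04-24/RT112-H3K27ac/signal/rep1/..., wig[1] is the
        # sample (RT112-H3K27ac) and wig[3] is the replicate folder (rep1).
        wig = f.split("/")
        o = f'{path_to_results}/{gene}/{wig[1]}.{wig[3]}.wig'
        bigwigger(CHR, STR, END, f, o)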
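The calls to `genewigger` further down pass chromosome, start, and end typed by hand from the GeneCards coordinate string. A small parser could reduce copy errors. This is only a sketch: `parse_region` and its `flank` parameter are hypothetical, and the optional flank is my assumption about how one might keep the promoter in view.

```python
import re


def parse_region(region, flank=0):
    """Split 'chr1:89,131,742-89,176,040(GRCh38/hg38)' into (chrom, start, end).

    A trailing parenthesised genome label is ignored. `flank` (hypothetical)
    widens the window by that many bases on both sides; start is clamped at 0.
    """
    pattern = r"(chr\w+):([\d,]+)-([\d,]+)(?:\s*\(.*\))?"
    match = re.fullmatch(pattern, region.strip())
    if match is None:
        raise ValueError(f"unrecognised region: {region!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError(f"start is after end in region: {region!r}")
    return chrom, max(start - flank, 0), end + flank
```

With this helper, a call could look like `genewigger('GBP7', *parse_region('chr1:89,131,742-89,176,040(GRCh38/hg38)'), fc_bigwig, 'igv-wigs')`; the integers are formatted into the `bigWigToWig` command the same way as the strings used in this notebook.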

# RT112

In [10]:
insect_RT112 = RNASeq_RT112.loc[
    # intersect of PAI-1 ChIP-seq and RNA-Seq
    list(set(PAI1_RT112) & set(RNASeq_RT112.index.to_list())),:
]

## up-regulated genes
Top 3 genes with log2FC > 3.5

In [12]:
insect_RT112_top = insect_RT112[insect_RT112.log2FoldChange > 3.5]
insect_RT112_top

Unnamed: 0,log2FoldChange
GPR84,3.621332
GJC3,3.548836
GBP7,4.28775
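The cutoff of 3.5 was chosen by eye so that three genes remain. A possible alternative is to rank the ChIP-bound genes directly and take the top `n` in each direction. The sketch below assumes a DataFrame indexed by gene with a `log2FoldChange` column, as in the cells above; `top_changed` is a hypothetical helper and the toy gene names and values are made up for illustration only.

```python
import pandas as pd


def top_changed(logfc, bound_genes, n=3, column="log2FoldChange"):
    """Return (up, down): the n most up- and down-regulated bound genes."""
    # Keep only genes present in both the ChIP-seq list and the RNA-seq table.
    shared = logfc.index.intersection(pd.Index(bound_genes).unique())
    bound = logfc.loc[shared]
    up = bound[bound[column] > 0].nlargest(n, column)
    down = bound[bound[column] < 0].nsmallest(n, column)
    return up, down


# Toy data, not real results.
toy = pd.DataFrame(
    {"log2FoldChange": [2.0, -1.5, 0.3, 4.1, -3.2]},
    index=["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"],
)
up, down = top_changed(toy, ["GENE_A", "GENE_B", "GENE_D", "GENE_E", "GENE_Z"], n=2)
```

Here `up` holds `GENE_D` and `GENE_A` and `down` holds `GENE_E` and `GENE_B`. Applied to this notebook, the call would be something like `top_changed(RNASeq_RT112, PAI1_RT112)`.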


### [GBP7](https://www.genecards.org/cgi-bin/carddisp.pl?gene=GBP7)
chr1:89,131,742-89,176,040(GRCh38/hg38)

In [20]:
# genewigger('GBP7','chr1','89131742','89176040', fc_bigwig, 'igv-wigs')

<img src="igv-wigs/GBP7.png" style="height:600px">

### [GPR84](https://www.genecards.org/cgi-bin/carddisp.pl?gene=GPR84)
chr12:54,350,784-54,365,253(GRCh38/hg38)

In [22]:
# genewigger('GPR84','chr12','54350784','54365253', fc_bigwig, 'igv-wigs')

<img src="igv-wigs/GRP84.png" style="height:600px">

### [GJC3](https://www.genecards.org/cgi-bin/carddisp.pl?gene=GJC3)
chr7:99,923,266-99,935,091(GRCh38/hg38)

In [27]:
# genewigger('GJC3','chr7','99923266','99935091', fc_bigwig, 'igv-wigs')

<img src="igv-wigs/GJC3.png" style="height:600px">

## down-regulated genes
Top 3 genes with log2FC < -4

In [10]:
insect_RT112_dwn = insect_RT112[insect_RT112.log2FoldChange < -4]
insect_RT112_dwn

Unnamed: 0,log2FoldChange
TAS2R31,-4.166653
TUBA8,-4.62496
IL16,-5.60858


### [IL16](https://www.genecards.org/cgi-bin/carddisp.pl?gene=IL16)
chr15:81,159,575-81,314,058(GRCh38/hg38)

In [20]:
genewigger('IL16','chr15','81159575','81314058', fc_bigwig, 'igv-wigs')

<img src='igv-wigs/IL16.png' style="height:600px">

### [TUBA8](https://www.genecards.org/cgi-bin/carddisp.pl?gene=TUBA8)
chr22:18,110,331-18,146,554(GRCh38/hg38)

In [21]:
genewigger('TUBA8','chr22','18110331','18146554', fc_bigwig, 'igv-wigs')

<img src='igv-wigs/TUBA8.png' style="height:600px">

### [TAS2R31](https://www.genecards.org/cgi-bin/carddisp.pl?gene=TAS2R31)
chr12:11,030,387-11,031,407(GRCh38/hg38)

In [22]:
genewigger('TAS2R31','chr12','11030387','11031407', fc_bigwig, 'igv-wigs')

<img src='igv-wigs/TAS2R31.png' style="height:600px">
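If the screenshots need to be regenerated, IGV can run a plain-text batch script (Tools > Run Batch Script) instead of manual navigation. The sketch below only builds the script text. The helper `igv_batch_script` is hypothetical, and the `genome` default of `hg38` simply follows the coordinate labels in this notebook; it is an assumption and should match the build the signal tracks were aligned to. The example track path follows the naming that `genewigger` produces.

```python
def igv_batch_script(tracks, regions, snapshot_dir, genome="hg38"):
    """Build IGV batch-command text that snapshots each region.

    tracks: track files to load.
    regions: dict mapping a snapshot name to a 'chr:start-end' locus.
    genome: assumed genome build; change it to match the tracks.
    """
    lines = ["new", f"genome {genome}"]
    lines += [f"load {track}" for track in tracks]
    lines.append(f"snapshotDirectory {snapshot_dir}")
    for name, locus in regions.items():
        lines.append(f"goto {locus}")
        lines.append(f"snapshot {name}.png")
    lines.append("exit")
    return "\n".join(lines) + "\n"


script = igv_batch_script(
    tracks=["igv-wigs/IL16/RT112-H3K27ac.pooled-rep.wig"],
    regions={"IL16": "chr15:81159575-81314058"},
    snapshot_dir="igv-wigs",
)
```

Writing `script` to a file and loading it in IGV should then save `IL16.png` into `igv-wigs`; more tracks and regions can be added to the two arguments.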

# UMUC3

In [24]:
insect_UMUC3 = RNASeq_UMUC3.loc[
    # intersect of PAI-1 ChIP-seq and RNA-Seq
    list(set(PAI1_UMUC3) & set(RNASeq_UMUC3.index.to_list())),:
] 

## up-regulated genes
Top 3 genes with log2FC > 4.7

In [28]:
insect_UMUC3_top = insect_UMUC3[insect_UMUC3.log2FoldChange > 4.7]
insect_UMUC3_top

Unnamed: 0,log2FoldChange
GPR35,5.42429
KY,4.947295
INMT,4.787879


### [GPR35](https://www.genecards.org/cgi-bin/carddisp.pl?gene=GPR35)
chr2:240,605,408-240,631,259(GRCh38/hg38)

In [34]:
# genewigger('GPR35','chr2','240605408','240631259', fc_bigwig, 'igv-wigs')

<img src='igv-wigs/GPR35.png' style="height:600px">

### [KY](https://www.genecards.org/cgi-bin/carddisp.pl?gene=ky)
chr3:134,599,923-134,651,677(GRCh38/hg38)

In [36]:
# genewigger('KY','chr3','134599923','134651677', fc_bigwig, 'igv-wigs')

<img src="igv-wigs/KY.png" style="height:600px">

### [INMT](https://www.genecards.org/cgi-bin/carddisp.pl?gene=INMT)
chr7:30,697,985-30,757,602(GRCh38/hg38)

In [38]:
# genewigger('INMT','chr7','30697985','30757602', fc_bigwig, 'igv-wigs')

<img src="igv-wigs/INMT.png" style="height:600px">

Zoomed-in view:

<img src='igv-wigs/INMT-zoom.png' style='height:600px'>

## down-regulated genes
Top 3 genes with log2FC < -4.1

In [34]:
insect_UMUC3_dwn = insect_UMUC3[insect_UMUC3.log2FoldChange < -4.1]

insect_UMUC3_dwn

Unnamed: 0,log2FoldChange
IPCEF1,-4.479412
TNFAIP8L3,-4.621867
ZNF385B,-4.255621


### [TNFAIP8L3](https://www.genecards.org/cgi-bin/carddisp.pl?gene=TNFAIP8L3)
chr15:51,056,596-51,105,276(GRCh38/hg38)

In [41]:
genewigger('TNFAIP8L3','chr15','51056596','51105276', fc_bigwig, 'igv-wigs')

<img src="igv-wigs/TNFAIP8L3.png" style="height:600px">

### [IPCEF1](https://www.genecards.org/cgi-bin/carddisp.pl?gene=IPCEF1)
chr6:154,154,483-154,356,802(GRCh38/hg38)

In [37]:
genewigger('IPCEF1','chr6','154154483','154356802', fc_bigwig, 'igv-wigs')

<img src="igv-wigs/IPCEF1.png" style="height:600px">

### [ZNF385B](https://www.genecards.org/cgi-bin/carddisp.pl?gene=ZNF385B)
chr2:179,441,982-179,862,321(GRCh38/hg38)

In [46]:
genewigger('ZNF385B','chr2','179441982','179862321', fc_bigwig, 'igv-wigs')

<img src='igv-wigs/ZNF385B-squashed.png' style='height:600px'>

# Session info:

In [44]:
!conda list 

# packages in environment at /rumi/shams/abe/anaconda3/envs/bedenv:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                      1_llvm    conda-forge
backcall                  0.2.0                      py_0    anaconda
bedops                    2.4.39               hc9558a2_0    bioconda
blas                      2.16                   openblas    conda-forge
bzip2                     1.0.8                h516909a_2    conda-forge
ca-certificates           2020.6.24                     0    anaconda
certifi                   2020.6.20                py37_0    anaconda
curl                      7.69.1               h33f0ec9_0    conda-forge
cycler                    0.10.0                     py_2    conda-forge
decorator                 4.4.2                      py_0    anaconda
deeptools                 3.4.3                      py_0    