# Task 3: Identify Cis-Regulatory Elements Within the ACE2 TAD
- Identify cis-regulatory regions within the smaller ACE2 TAD based on binding indicators  

1) ENCODE/ReMap/UniBind: **Plot TF peaks**  

a) [ENCODE](https://www.encodeproject.org/)  
b) [ReMap](http://remap.univ-amu.fr/)  
c) [UniBind](https://unibind.uio.no/)  
    
2) Segway/ChromHMM: **Predict the location of regulatory regions**  

a) [Segway](https://pmgenomics.ca/hoffmanlab/proj/segway/)  
b) [ChromHMM](http://compbio.mit.edu/ChromHMM/)  

- Related articles:  
1) [Identifying regulatory elements in eukaryotic genomes](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2764519/)

# Glossary of Cis-Regulatory Elements:
## Promoters:
- Found in proximity to the TSS  
- Contain binding sites for ubiquitous general transcription factors (GTFs) or for activator proteins that interact with the GTFs  

## Enhancers:
- Found at a greater distance from the TSS, either upstream or downstream of the gene or within an intron  

## Silencers:
- Can be present within enhancers or can act as independent modules with binding sites for repressors  

## Insulators:
- Usually contain multiple TF binding sites; insulator strength is directly proportional to the number of binding sites  

# Step-By-Step Walkthrough:
## 1) Determine the 3'- and 5'-edges:
![annotateducsc.png](attachment:annotateducsc.png)  
**Rationale:** The main outliers, i.e. lung (F), small intestine (F), and bladder (M), were excluded and the 3'- and 5'-edges were determined by the median values of the sex-independent distribution.  
- ***3'-edge (chromStart):*** chrX:15,250,000 (i.e. chrX:15,509,068 - 259,068 bp)
- ***5'-edge (chromEnd):*** chrX:15,750,000 (i.e. chrX:15,509,068 + 240,932 bp)
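
The edge arithmetic above can be checked with a few lines of R. This is only a minimal sketch: the anchor coordinate and the two offsets are the values quoted in the bullets, while the variable names and the `fmt` helper are made up for illustration.

```r
# Anchor coordinate and offsets as quoted in the text (chrX)
anchor    <- 15509068L
offset_3p <- 259068L   # bp subtracted to reach the 3'-edge (chromStart)
offset_5p <- 240932L   # bp added to reach the 5'-edge (chromEnd)

chrom_start <- anchor - offset_3p
chrom_end   <- anchor + offset_5p

# Hypothetical helper: format a coordinate with thousands separators
fmt <- function(x) format(x, big.mark = ",", scientific = FALSE, trim = TRUE)

# Build a UCSC-style position string for the region
region <- sprintf("chrX:%s-%s", fmt(chrom_start), fmt(chrom_end))
print(region)

# Width of the selected window in bp
print(chrom_end - chrom_start)
```

The resulting position string can be pasted into the UCSC Genome Browser search box or used as the region in the Table Browser.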
![Rplot-task3.png](attachment:Rplot-task3.png)

![just-task3.PNG](attachment:just-task3.PNG)

# UniBind:
## [UCSC Genome Browser](http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&hubUrl=https://unibind.uio.no/static/UniBind_hubs/hub.txt)
![task3-1.PNG](attachment:task3-1.PNG)  

1) Using the Table Browser and UniBind Track, obtain an output with all TFs for the desired region:  
![task3-2.PNG](attachment:task3-2.PNG)  

2) Using Excel, obtain the name of each TF according to UniBind's naming convention, `<GEO/ArrayExpress/ENCODE identifier>.<cell type/tissue>_<condition>.<TF name>.<JASPAR ID>.<JASPAR version>.<TF binding model>`. The TF name is the third dot-delimited field, so the formula extracts the text between the second and third dots:  
    **TF name** `=MID(D2,(SEARCH(".",D2,SEARCH(".",D2)+1)+1),((SEARCH(".",D2,SEARCH(".",D2,(SEARCH(".",D2)+1)+1)+1)-(SEARCH(".",D2,SEARCH(".",D2)+1)+1))))`  
    ![TASK3-3.PNG](attachment:TASK3-3.PNG)  
    
3) Using R, plot relevant data:

![task3-4.PNG](attachment:task3-4.PNG)

![Rplot-T3-TF.png](attachment:Rplot-T3-TF.png)

![tfs.PNG](attachment:tfs.PNG)
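
Steps 2 and 3 can also be done entirely in R. The sketch below is one possible approach, not necessarily the one used for the plots above: it extracts the third dot-delimited field, as the Excel formula does, then tallies and plots the number of binding sites per TF. The example names are invented to follow the naming convention and are not real UniBind records; like the Excel formula, the code assumes that the identifier and cell type fields contain no dots.

```r
# Hypothetical UniBind-style names (invented for illustration only)
peak_names <- c(
  "EXP000001.cellA_untreated.CTCF.MA0000.1.pwm",
  "EXP000002.cellB_treated.FOXA1.MA0000.2.pwm",
  "EXP000003.cellA_untreated.CTCF.MA0000.1.pwm",
  "EXP000004.cellC_untreated.GATA3.MA0000.3.pwm"
)

# Split each name on literal dots and keep the third field (the TF name)
fields  <- strsplit(peak_names, ".", fixed = TRUE)
tf_name <- vapply(fields, function(x) x[3], character(1))

# Count binding sites per TF, most frequent first
tf_counts <- sort(table(tf_name), decreasing = TRUE)
print(tf_counts)

# Bar plot of binding-site counts per TF
barplot(tf_counts, las = 2,
        ylab = "Number of binding sites",
        main = "TF binding sites in the region")
```

With a real Table Browser export, `peak_names` would be replaced by the name column of the downloaded table.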

# Bioconductor

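```r
# Install BiocManager from CRAN if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
# Install Bioconductor version 3.11
BiocManager::install(version = "3.11")
```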

# Identifying TF Peaks:
![meeting28.PNG](attachment:meeting28.PNG)

# Predicting the Location of Regulatory Regions:
1. Using FANTOM5:  
![FANTOM5.1.PNG](attachment:FANTOM5.1.PNG)  
![FANTOM5.2.PNG](attachment:FANTOM5.2.PNG)  

2. Using UCSC for a combined analysis of chromatin events and TF-binding events:  
![t3-all.PNG](attachment:t3-all.PNG)  
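
The combined view can also be approximated programmatically by intersecting TF binding sites with chromatin-state segments. The sketch below uses base R only and entirely hypothetical data: the TF names, coordinates, and state labels are invented, and real ChromHMM or Segway labels will differ. For real tracks, `findOverlaps()` from the Bioconductor package GenomicRanges is the more usual tool.

```r
# Hypothetical TF binding sites on chrX (invented coordinates)
tf_sites <- data.frame(
  tf    = c("TF_A", "TF_B", "TF_C"),
  start = c(15300000, 15420000, 15450000),
  end   = c(15300020, 15420015, 15450012)
)

# Hypothetical chromatin-state segments (invented labels and coordinates)
segments <- data.frame(
  state = c("promoter-like", "enhancer-like", "quiescent"),
  start = c(15299000, 15410000, 15500000),
  end   = c(15301000, 15430000, 15700000)
)

# Enumerate every (site, segment) pair by row index
idx <- expand.grid(site = seq_len(nrow(tf_sites)),
                   seg  = seq_len(nrow(segments)))

# Two closed intervals overlap when each starts before the other ends
overlaps <- tf_sites$start[idx$site] <= segments$end[idx$seg] &
            segments$start[idx$seg] <= tf_sites$end[idx$site]

# Report the chromatin state under each overlapping TF binding site
hits <- data.frame(
  tf    = tf_sites$tf[idx$site[overlaps]],
  state = segments$state[idx$seg[overlaps]]
)
print(hits)
```

In this toy data `TF_C` falls between segments and therefore does not appear in `hits`. The all-pairs comparison is fine for a single TAD but scales poorly to genome-wide tracks.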