# ParhyaleGenomeViewer

This Jupyter notebook visualizes data and results from the Sun and Patel 2021 paper using IGV.
To use this notebook, make sure you have the following Python packages installed:

- igv (https://github.com/igvteam/igv-jupyter#igvjs-jupyter-extension)
- uniprot (https://github.com/boscoh/uniprot)
- matplotlib
- pandas
- seaborn

In [12]:
# Edit the part of this cell above the "No need to modify" line so that each variable holds the path to the directory containing the data required for IGV visualization.
# Be sure to include a terminal "/" character at the end of each directory string.

### The below folder should contain:
# - the phaw_5.0.fa file
# - the phaw_5.0.fa.fai file
genome_loc = '~/Labwork/Bioinformatics/GenomeSequences/Phaw_5.0_Annotation/genome/'

### The below folder should contain:
# - reference .gff files for any gene annotations you'd like to load
annotation_loc = '~/Labwork/Bioinformatics/Transcripts/'
### The below list contains the filenames for .gffs you'd like to visualize by default
### Be sure to sort your gff file and generate an index file for each reference annotation using igvtools.
annotation_gffs = ['mikado.loci.sorted.gff']

### The below folder should contain:
# - all Genrich stage-specific peaks, downloaded from GEO
# - all Genrich .bw files, downloaded from GEO
OmniATAC_loc = ''

### The below folder should contain:
# - all NucleoATAC files, downloaded from GEO
NucATAC_loc = ''

### The below folder should contain:
# - all HINT-ATAC files, downloaded from GEO
HINT_ATAC_loc = ''

################################################################
############## No need to modify below this line ###############
################################################################

# Imports necessary packages
import igv
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import sys

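def toJupyterURL(loc):
    # Maps a '~'-prefixed path to the URL under which the Jupyter server serves that file.
    # This only works if Jupyter was started from the home directory on port 8888.
    localhostStart = 'http://localhost:8888/files'
    return loc.replace('~', localhostStart, 1)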

# Specifies the reference object, built from genome_loc so that the path set above is used
reference_object = {
    "id": "phaw_5.0",
    "name": "Parhyale hawaiensis genome 5.0",
    "fastaURL": toJupyterURL(genome_loc) + "phaw_5.0.fa",
    "indexURL": toJupyterURL(genome_loc) + "phaw_5.0.fa.fai"
}

annotation_objects = {}
for gff in annotation_gffs:
    annotation_objects[gff] = {
        "name": gff,
        "url": toJupyterURL(annotation_loc) + gff,
        "format": "gff",
        "indexed": True
    }
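
The settings above depend on every directory string ending in "/", because filenames are appended by plain string concatenation. The following is a minimal sketch of a guard against a missing slash. The helper names `with_trailing_slash` and `to_jupyter_url` are hypothetical and not part of the notebook, and the sketch assumes, as `toJupyterURL` does, that the Jupyter server runs at localhost:8888 with the home directory as its root.

```python
def with_trailing_slash(path):
    """Return path with exactly one trailing '/' (empty strings are left empty)."""
    if not path:
        return path
    return path.rstrip('/') + '/'


def to_jupyter_url(path, base='http://localhost:8888/files'):
    """Hypothetical variant of toJupyterURL: swap a leading '~' for the Jupyter files URL."""
    path = with_trailing_slash(path)
    if path.startswith('~'):
        return base + path[1:]
    return path


# The directory is given without a terminal '/', and the helper adds it.
example_url = to_jupyter_url('~/Labwork/Bioinformatics/Transcripts') + 'mikado.loci.sorted.gff'
print(example_url)
```

Normalizing the path inside the helper means a forgotten slash no longer produces a silently broken track URL.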

In [13]:
display(reference_object)

{'id': 'phaw_5.0',
 'name': 'Parhyale hawaiensis genome 5.0',
 'fastaURL': 'files/phaw_5.0.fa',
 'indexURL': 'files/phaw_5.0.fa.fai'}

# Visualize summary data

In [14]:
#Choose the genomic region you'd like to visualize below.

### Input the IGV address of the genomic region of interest:
### IGV address format is: <contig>:<start>-<stop>
igv_address = "phaw_50.283823c:50,641,166-50,644,340"

display(reference_object)

b = igv.Browser({"reference": reference_object})

for obj in annotation_objects:
    b.load_track(annotation_objects[obj])

b.search(igv_address)
b.show()

{'id': 'phaw_5.0',
 'name': 'Parhyale hawaiensis genome 5.0',
 'fastaURL': 'files/phaw_5.0.fa',
 'indexURL': 'files/phaw_5.0.fa.fai'}
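
The IGV address format `<contig>:<start>-<stop>` used above can be checked before it is handed to the browser. The sketch below is one way to do that; `parse_igv_address` is a hypothetical helper, not part of the notebook or of the igv package. It splits on the last ":" so that the contig name itself may contain punctuation, and strips the thousands separators from the coordinates.

```python
def parse_igv_address(address):
    """Split '<contig>:<start>-<stop>' into (contig, start, stop) with integer coordinates."""
    contig, _, span = address.rpartition(':')
    start_text, _, stop_text = span.partition('-')
    # int() raises ValueError if either coordinate is missing or not numeric.
    start = int(start_text.replace(',', ''))
    stop = int(stop_text.replace(',', ''))
    if not contig or start > stop:
        raise ValueError('expected <contig>:<start>-<stop>, got ' + repr(address))
    return contig, start, stop


contig, start, stop = parse_igv_address('phaw_50.283823c:50,641,166-50,644,340')
print(contig, start, stop)
```

A malformed address then fails with a `ValueError` in Python rather than producing an empty view in the browser.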

# Visualize data for a specific developmental stage

In [None]:
### Choose the developmental stage and genome region you'd like to visualize.

# The IGV address format is:
# <contig>:<start>-<stop>
igv_address = 'phaw_50.283823c:50,641,166-50,644,340'

### Set the following flags to "True" or "False" depending on whether or not you want to load the data.
OmniATAC_bool_dict = {
    "load_Genrich_peaks": True,
    "load_bigwig": True
}
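
The flags above could drive which tracks get loaded for a stage. The sketch below shows one possible shape for that step. Everything in it is an assumption: `build_omniatac_tracks` is a hypothetical helper, the stage label and filename patterns are placeholders rather than the names used on GEO, and the `format` values are assumed to be the ones igv.js expects for narrowPeak and bigWig files.

```python
def build_omniatac_tracks(flags, base_url, stage):
    """Return igv.js-style track dicts for the data types switched on in flags.

    The filename patterns below are placeholders, not the names used on GEO.
    """
    tracks = []
    if flags.get('load_Genrich_peaks'):
        tracks.append({
            'name': stage + ' Genrich peaks',
            'url': base_url + stage + '_peaks.narrowPeak',  # hypothetical filename
            'format': 'narrowPeak',  # assumed igv.js format name
        })
    if flags.get('load_bigwig'):
        tracks.append({
            'name': stage + ' signal',
            'url': base_url + stage + '.bw',  # hypothetical filename
            'format': 'bigwig',  # assumed igv.js format name
        })
    return tracks


flags = {'load_Genrich_peaks': True, 'load_bigwig': False}
tracks = build_omniatac_tracks(flags, 'http://localhost:8888/files/OmniATAC/', 'stage_X')
for track in tracks:
    print(track['name'], track['url'])
```

Each returned dict could then be passed to `b.load_track(...)` in the same way as the annotation tracks in the summary cell.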

In [18]:
b = igv.Browser({"genome": "hg19"})
b.show()