# How to use igv-notebook in Jupyter

This notebook uses igv-notebook (https://github.com/igvteam/igv-notebook) in the Python 3 kernels provided by GCP. The example data come from a whole-genome bisulfite sequencing (WGBS) methylation study, GEO accession GSE188157 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE188157). It is a mouse study with liver samples from male and female BRCA KO and wildtype animals.

Bismark coverage files are transformed into BedGraph format (6 columns reduced to 4) and loaded into an interactive IGV browser in the notebook. We use mm10 as the reference and also load an annotation track from the UCSC Genome Browser annotation downloads (https://hgdownload.soe.ucsc.edu/gbdb/mm10/).

In [None]:
%cd ../

In [None]:
# We use the BedGraph format; the methylation coverage files are converted to it later in this notebook
from IPython.display import Image
Image("/home/jupyter/img/BedGraph.PNG")

### Below we are:
1. Making our directory to store our data
2. Downloading data
3. Uncompressing data
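
If you prefer to stay in Python rather than shell out, the extract and uncompress steps can be sketched roughly as below. This is only an illustration using the standard library; the function name `unpack_geo_archive` is hypothetical and is not part of this notebook or of igv-notebook, and the download itself is still assumed to be done with `wget` as in the cells that follow.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def unpack_geo_archive(tar_path, out_dir):
    """Extract a tar archive into out_dir, then gunzip every *.gz file in it.

    Hypothetical helper: mirrors `tar -xvf ... -C out_dir` followed by
    `gunzip out_dir/*.gz`. Returns the list of decompressed file paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: unpack the tar archive
    with tarfile.open(tar_path) as archive:
        archive.extractall(out_dir)

    # Step 2: decompress each .gz file and remove the compressed copy
    unpacked = []
    for gz_file in sorted(out_dir.glob("*.gz")):
        target = gz_file.with_suffix("")  # "x.cov.gz" -> "x.cov"
        with gzip.open(gz_file, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        gz_file.unlink()
        unpacked.append(target)
    return unpacked
```

For this study the call would look like `unpack_geo_archive("/home/jupyter/GSE188157_example/GSE188157_RAW.tar", "/home/jupyter/GSE188157_example")`, assuming the archive has already been downloaded to that path.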

In [None]:
!mkdir GSE188157_example

In [None]:
#Mouse WGBS Study
!wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE188nnn/GSE188157/suppl/GSE188157_RAW.tar -P ~/GSE188157_example

In [None]:
!wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE188nnn/GSE188157/matrix/GSE188157_series_matrix.txt.gz -P ~/GSE188157_example

In [None]:
!tar -xvf /home/jupyter/GSE188157_example/GSE188157_RAW.tar -C /home/jupyter/GSE188157_example/

In [None]:
!gunzip /home/jupyter/GSE188157_example/*.gz

In [None]:
!ls /home/jupyter/GSE188157_example/

In [None]:
!head /home/jupyter/GSE188157_example/GSM5671130_SEQ0032_3_3-MUP1_GRCm38_bismark_bt2_pe.bismark.cov

### Now we will use a shell script to convert each .cov file to a .bedgraph

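This step is specific to Bismark coverage files, which resemble BED files but are not identical. A .cov file has six columns (chromosome, start, end, methylation percentage, methylated count, unmethylated count), so we drop columns 5 and 6 to get the four columns that BedGraph expects. If you are using BED, BedGraph, BAM, or another supported track format, your files should already be in the format you need.

To convert the .cov files, create the conversion script CovToBedGraph.sh and save it in /home/jupyter. The next cell prints its contents.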
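
The conversion described above can also be sketched in Python. This is a minimal illustration, not the contents of CovToBedGraph.sh: it assumes tab-separated .cov files and simply keeps the first four columns, and the function names `cov_to_bedgraph` and `convert_directory` are hypothetical.

```python
from pathlib import Path


def cov_to_bedgraph(cov_path, bedgraph_path):
    """Write the first four tab-separated columns of a Bismark .cov file.

    Coordinates are copied unchanged, matching the column-dropping
    approach described in the text (an assumption about the shell script).
    Returns the number of lines written.
    """
    n_written = 0
    with open(cov_path) as src, open(bedgraph_path, "w") as dst:
        for line in src:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 4:
                continue  # skip blank or malformed lines
            dst.write("\t".join(fields[:4]) + "\n")
            n_written += 1
    return n_written


def convert_directory(data_dir):
    """Convert every *.cov file in data_dir, writing *.bedgraph beside it."""
    outputs = []
    for cov_file in sorted(Path(data_dir).glob("*.cov")):
        out_file = cov_file.with_suffix(".bedgraph")  # x.bismark.cov -> x.bismark.bedgraph
        cov_to_bedgraph(cov_file, out_file)
        outputs.append(out_file)
    return outputs
```

Calling `convert_directory("/home/jupyter/GSE188157_example")` would produce one .bedgraph file per .cov file. Note that Bismark coverage coordinates are 1-based while BedGraph is 0-based and half-open, so a stricter conversion might subtract 1 from the start column; the sketch leaves the coordinates as they are.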

In [None]:
!cat CovToBedGraph.sh

In [None]:
!bash CovToBedGraph.sh

In [None]:
!head /home/jupyter/GSE188157_example/GSM5671130_SEQ0032_3_3-MUP1_GRCm38_bismark_bt2_pe.bismark.bedgraph

In [None]:
# What genome build did this study use?
# The series matrix was already decompressed by the earlier `gunzip *.gz` cell, so we only need to search it.
!grep 'Genome_build' /home/jupyter/GSE188157_example/GSE188157_series_matrix.txt
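
The same lookup can be done in Python if you want the distinct values rather than the raw matching lines. The sketch below assumes the series matrix records the build as `Genome_build: <value>` inside quoted, tab-separated fields; the function name `extract_genome_builds` is hypothetical, and unlike the `grep` call it matches case-insensitively.

```python
import re

# Assumed layout of the field inside the matrix: "Genome_build: mm10"
GENOME_BUILD_PATTERN = re.compile(r'Genome_build:\s*([^"\t\n]+)', re.IGNORECASE)


def extract_genome_builds(matrix_path):
    """Return the sorted, de-duplicated genome build values found in the file."""
    builds = set()
    with open(matrix_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            for value in GENOME_BUILD_PATTERN.findall(line):
                builds.add(value.strip())
    return sorted(builds)
```

For example, `extract_genome_builds("/home/jupyter/GSE188157_example/GSE188157_series_matrix.txt")` returns a list of the build names mentioned in the decompressed matrix file.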

### You can store reference files locally or point to URLs with igv-notebook

Below we download some reference files and store them locally in our notebook environment. If you have a custom reference you'd like to use, you can similarly create a folder for it and upload the files. Alternatively, as we've seen in previous workshops, you can store files in Google Cloud Storage buckets and either mount the bucket with gcsfuse or copy the files in with gsutil commands.

In [None]:
!mkdir Reference_mm10

In [None]:
!wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/635/GCF_000001635.26_GRCm38.p6/GCF_000001635.26_GRCm38.p6_genomic.fna.gz -P Reference_mm10/

In [None]:
!gunzip Reference_mm10/GCF_000001635.26_GRCm38.p6_genomic.fna.gz

In [None]:
!mv Reference_mm10/GCF_000001635.26_GRCm38.p6_genomic.fna Reference_mm10/GCF_000001635.26_GRCm38.p6_genomic.fa

In [None]:
!conda install -c bioconda samtools -y

In [None]:
!samtools faidx Reference_mm10/GCF_000001635.26_GRCm38.p6_genomic.fa

### Install igv-notebook and initialize

The kernel we are using is the default Python 3 (ipykernel) Vertex AI kernel. We install igv-notebook with pip below.

In [None]:
# Install igv-notebook on the managed notebook
!pip install --user igv-notebook

In [None]:
import igv_notebook

igv_notebook.init()

#### Browser 'b1' uses the human reference hg19, a copy of which is hosted by the Broad.

In [None]:
b1 = igv_notebook.Browser(
    {
        "genome": "hg19",
        "locus": "chr22:24,376,166-24,376,456"
    }
)

#### Use this command to convert b1 to a static SVG before exporting your notebook

In [None]:
b1.to_svg()

#### Browser 'b2' uses the local reference files we downloaded earlier

In [None]:
b2 = igv_notebook.Browser(
    {
        "reference": {
            "id": "mm10_custom",
            "name": "mm10_custom",
            "fastaURL": "Reference_mm10/GCF_000001635.26_GRCm38.p6_genomic.fa",
            "indexURL": "Reference_mm10/GCF_000001635.26_GRCm38.p6_genomic.fa.fai",
        },
        "locus": "NT_080256.1:71,136-71,205"
    })
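
A local reference only works if both the FASTA and its `.fai` index exist, and the locus must use the sequence names found in the FASTA (here NCBI accessions such as NT_080256.1 rather than chr-style names). The sketch below checks for both files before building the `reference` dictionary and lists a few sequence names from the index. The helper names `local_reference` and `list_contigs` are hypothetical and are not part of igv-notebook.

```python
from pathlib import Path


def local_reference(fasta_path, ref_id):
    """Build a reference config dict after confirming the FASTA and .fai exist."""
    fasta = Path(fasta_path)
    index = Path(str(fasta) + ".fai")  # samtools faidx writes <fasta>.fai
    missing = [str(p) for p in (fasta, index) if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing reference file(s): " + ", ".join(missing))
    return {
        "id": ref_id,
        "name": ref_id,
        "fastaURL": fasta.as_posix(),
        "indexURL": index.as_posix(),
    }


def list_contigs(fai_path, limit=5):
    """Return up to `limit` sequence names (first column of the .fai index)."""
    names = []
    with open(fai_path) as handle:
        for line in handle:
            if len(names) >= limit:
                break
            if line.strip():
                names.append(line.split("\t")[0])
    return names
```

With these helpers, the browser configuration could be written as `{"reference": local_reference("Reference_mm10/GCF_000001635.26_GRCm38.p6_genomic.fa", "mm10_custom"), "locus": "NT_080256.1:71,136-71,205"}`, and `list_contigs` on the `.fai` file shows which names are valid in a locus string.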

#### Browser 'b3' uses the mm10 mouse reference. We will add tracks to this browser, since the study's Bismark coverage files are on mm10 (GRCm38).

In [None]:
#open interactive genome browser in notebook
b3 = igv_notebook.Browser(
    {
        "genome": "mm10",
        "locus": "chr11:101,479,688-101,606,077"
    }
)

# Example loci:
# Brca1: chr11:101,479,688-101,606,077
# Pax9: chr12:56691693-56712824


In [None]:
#Add local file as a track
b3.load_track(
    {
        "name": "KO",
        "url": "GSE188157_example/GSM5671130_SEQ0032_3_3-MUP1_GRCm38_bismark_bt2_pe.bismark.bedgraph",
        "format": "bedgraph",
        "type": "wig",
        "color": "red",
        "height": 25
    }
)

In [None]:
b3.load_track(
    {
        "name": "WT",
        "url": "GSE188157_example/GSM5671131_SEQ0032_1_4-MUP2_GRCm38_bismark_bt2_pe.bismark.bedgraph",
        "format": "bedgraph",
        "type": "wig",
        "color": "blue",
        "height": 25
    }
)
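
The two track cells differ only in name, file, and color, so with more samples it may be easier to build the configurations in a loop. The sketch below is one way to do that; `bedgraph_track` is a hypothetical helper, and the keys it sets are the same ones used in the cells above.

```python
def bedgraph_track(name, path, color, height=25):
    """Return a wig-type track config for a local BedGraph file."""
    return {
        "name": name,
        "url": path,
        "format": "bedgraph",
        "type": "wig",
        "color": color,
        "height": height,
    }


# Sample label -> (BedGraph path, display color), taken from the cells above
samples = {
    "KO": ("GSE188157_example/GSM5671130_SEQ0032_3_3-MUP1_GRCm38_bismark_bt2_pe.bismark.bedgraph", "red"),
    "WT": ("GSE188157_example/GSM5671131_SEQ0032_1_4-MUP2_GRCm38_bismark_bt2_pe.bismark.bedgraph", "blue"),
}

tracks = [bedgraph_track(name, path, color) for name, (path, color) in samples.items()]

# In the notebook, with browser b3 already created, each track would be loaded with:
# for track in tracks:
#     b3.load_track(track)
```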

In [None]:
#Add annotation track from UCSC website
b3.load_track(
    {
        "name": "UCSC_encode3_chromHmm_mm10_Liver_P0",
        "url": "https://hgdownload.soe.ucsc.edu/gbdb/mm10/encode3/chromHmm/encode3RenChromHmmLiverP0.bb",
        "format": "bigbed",
        "type": "annotation",
        "height": 100
    }
)

In [None]:
b3.to_svg()

#### Learn about the different track types in IGV
https://github.com/igvteam/igv.js/wiki/Tracks-2.0