# Test with ipyigv and local files

## Load modules

In [1]:
import ipyigv as igv
from ipywidgets.widgets.trait_types import InstanceDict
from ipyigv.options import ReferenceGenome, Track
from ipywidgets import Output 

## Remote human genome (reference use case)

In [2]:
genome_dict = {
    'id': 'hg38',
    'name': 'Human (GRCh38/hg38)',
    'fastaURL': 'https://s3.dualstack.us-east-1.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa',
    'indexURL': 'https://s3.dualstack.us-east-1.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa.fai',
    'cytobandURL': 'https://s3.dualstack.us-east-1.amazonaws.com/igv.org.genomes/hg38/annotations/cytoBandIdeo.txt.gz',
    'tracks': [
        {
            'name': 'Refseq Genes',
            'format': 'refgene',
            'url': 'https://s3.dualstack.us-east-1.amazonaws.com/igv.org.genomes/hg38/refGene.txt.gz',
            'indexed': False,
            'visibilityWindow': -1,
            'removable': False,
            'order': 1000000
        }
    ]
}

genome = ReferenceGenome(**genome_dict)
browser = igv.IgvBrowser(genome=genome)
browser

IgvBrowser(genome=ReferenceGenome(cytobandURL='https://s3.dualstack.us-east-1.amazonaws.com/igv.org.genomes/hg…

This remote configuration is the reference case and is expected to load correctly.

## Load local *Ostreococcus tauri* genome

The genome index (`otauri.fa.fai`) was created with the standalone version of IGV, following this guide: https://software.broadinstitute.org/software/igv/LoadGenome
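The index used in this notebook was created with IGV. As a cross-check, the five columns of a `.fai` index (sequence name, length, byte offset of the first base, bases per line, bytes per line) can be recomputed in plain Python. The following is a minimal sketch, assuming a FASTA file whose sequence lines have a uniform width within each record; `build_fai` is a hypothetical helper, not part of ipyigv or IGV.

```python
def build_fai(fasta_path):
    """Recompute .fai rows: (name, length, offset, linebases, linewidth)."""
    rows = []
    name = None
    length = offset = linebases = linewidth = 0
    pos = 0  # byte position of the current line in the file
    with open(fasta_path, "rb") as handle:
        for raw in handle:
            if raw.startswith(b">"):
                # A new record starts: store the previous one, if any.
                if name is not None:
                    rows.append((name, length, offset, linebases, linewidth))
                parts = raw[1:].split()
                name = parts[0].decode() if parts else ""
                length = linebases = linewidth = 0
                # The first base sits right after the header line.
                offset = pos + len(raw)
            elif name is not None:
                bases = raw.rstrip(b"\r\n")
                if linebases == 0 and bases:
                    # The first sequence line defines the line geometry.
                    linebases, linewidth = len(bases), len(raw)
                length += len(bases)
            pos += len(raw)
    if name is not None:
        rows.append((name, length, offset, linebases, linewidth))
    return rows
```

Comparing `build_fai("otauri.fa")` with the tab-separated rows of `otauri.fa.fai` would show whether the index matches the FASTA file.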

In [3]:
genome_dict = {
    "id": "otauri",
    "name": "O tauri",
    "fastaURL": "files/otauri.fa",
    "indexURL": "files/otauri.fa.fai",
    "tracks": [
      {
        "name": "Refseq Genes",
        "format": "refgene",
        "url": "files/otauri.gff",
        "order": 1000000,
        "visibilityWindow": -1,
        "indexed": False
      }
    ]
}

genome = ReferenceGenome(**genome_dict)
browser = igv.IgvBrowser(genome=genome)
browser

IgvBrowser(genome=ReferenceGenome(fastaURL='files/otauri.fa', id='otauri', indexURL='files/otauri.fa.fai', nam…

The genome is not loaded.

## Local human genome

Download the data first:

In [4]:
%%bash

if [[ ! -f hg38.fa ]]
then
    wget -c --no-verbose https://s3.dualstack.us-east-1.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa
fi

if [[ ! -f hg38.fa.fai ]]
then
    wget -c --no-verbose https://s3.dualstack.us-east-1.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa.fai
fi

if [[ ! -f refGene.txt.gz ]]
then
    wget -c --no-verbose https://s3.dualstack.us-east-1.amazonaws.com/igv.org.genomes/hg38/refGene.txt.gz
fi

In [5]:
!ls -lh hg38* ref*

-rw-rw-r-- 1 pierre pierre 3,1G mars  11  2014 hg38.fa
-rw-rw-r-- 1 pierre pierre  19K mars  11  2014 hg38.fa.fai
-rw-rw-r-- 1 pierre pierre 8,0M août  11  2020 refGene.txt.gz


In [6]:
genome_dict = {
    'id': 'hg38',
    'name': 'Human (GRCh38/hg38)',
    'fastaURL': 'files/hg38.fa',
    'indexURL': 'files/hg38.fa.fai',
    'tracks': [
        {
            'name': 'Refseq Genes',
            'format': 'refgene',
            'url': 'files/refGene.txt.gz',
            'order': 1000000,
            'visibilityWindow': -1,
            'indexed': False
        }
    ]
}

genome = ReferenceGenome(**genome_dict)
browser = igv.IgvBrowser(genome=genome)
browser

IgvBrowser(genome=ReferenceGenome(fastaURL='files/hg38.fa', id='hg38', indexURL='files/hg38.fa.fai', name='Hum…

It works!

Now let's change the id of the genome:

In [7]:
genome_dict = {
    'id': 'toto',
    'name': 'Human (GRCh38/hg38)',
    'fastaURL': 'files/hg38.fa',
    'indexURL': 'files/hg38.fa.fai',
    'tracks': [
        {
            'name': 'Refseq Genes',
            'format': 'refgene',
            'url': 'files/refGene.txt.gz',
            'order': 1000000,
            'visibilityWindow': -1,
            'indexed': False
        }
    ]
}

genome = ReferenceGenome(**genome_dict)
browser = igv.IgvBrowser(genome=genome)
browser

IgvBrowser(genome=ReferenceGenome(fastaURL='files/hg38.fa', id='toto', indexURL='files/hg38.fa.fai', name='Hum…

Although the genome files are present locally, the genome is not loaded!

This suggests that igv.js ignores the file URLs and relies only on the id, retrieving the files from the internet.
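To rule out a simple path problem before blaming igv.js, the local URLs in a genome dictionary can be checked against the file system. This is a minimal sketch: `missing_local_files` is a hypothetical helper, and it assumes that the `files/` prefix maps to the notebook's working directory, which is not confirmed here.

```python
from pathlib import Path


def missing_local_files(genome_dict, root=".", prefix="files/"):
    """Return the local URLs of a genome dict whose target file is absent.

    Assumption: a URL such as 'files/hg38.fa' refers to 'hg38.fa' under `root`.
    Remote URLs (anything containing '://') are skipped.
    """
    urls = [genome_dict.get(key) for key in ("fastaURL", "indexURL", "cytobandURL")]
    urls += [track.get("url") for track in genome_dict.get("tracks", [])]
    missing = []
    for url in urls:
        if not url or "://" in url:
            continue
        relative = url[len(prefix):] if url.startswith(prefix) else url
        if not (Path(root) / relative).is_file():
            missing.append(url)
    return missing
```

An empty list for the `genome_dict` above would indicate that the paths themselves are fine and that the failure lies elsewhere.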

This minimal example, which provides only the id, works fine:

In [8]:
genome_dict = {'id': 'hg38'}
genome = ReferenceGenome(**genome_dict)
browser = igv.IgvBrowser(genome=genome)
browser

IgvBrowser(genome=ReferenceGenome(id='hg38'), locus='')

Regarding the example provided by Sylvain, we cannot tell whether the local file is actually loaded, because the reference genome *hg38* ships with its annotations (`refGene.txt.gz`) in any case.

In [9]:
trackDict = {'name': 'Refseq Genes',
   'format': 'refgene',
   'url': 'files/refGene.txt.gz',
   'indexed': False,
   'visibilityWindow': -1,
   'removable': False,
   'order': 1000000}
oneMoreTrack = Track(**trackDict)
browser.add_track(oneMoreTrack)