# Chromviewer-With-Vcf

As the tool name suggests, its purpose is to take a VCF file as input and display the chromosomes and the variants on them in an illustrated way.

The working directory contains a bash script that installs any missing dependencies and then runs the R script with the supplied options.

The only requirement before running the script is to install ssh-askpass, because not all Linux distributions provide a unified askpass. The script uses ssh-askpass, so the following command must be run in a terminal:

`sudo apt-get install ssh-askpass-gnome ssh-askpass`
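To confirm the installation before moving on, a small check like the one below can help. This is only a sketch: `check_cmd` is a hypothetical helper, not part of chromosome_viewer.sh, and it assumes the package puts an executable named `ssh-askpass` on the `PATH`.

```bash
# Hypothetical helper (not part of the tool): report whether a command is on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Example call (assumes the executable is named ssh-askpass):
# check_cmd ssh-askpass
```

If the helper reports the command as missing, rerun the install command above.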

Now that everything is ready, we can run the script and see the output when there are no arguments added.

__You must replace the path to chromosome_viewer.sh and then run it__

In [None]:
%%bash

{path_to_folder}/chromosome_viewer.sh

First we will be prompted for a password so that missing packages can be installed. If none of them are installed, this might take a while (est. 10-30 min).

After the dependency check, a help section appears that lists all the options we can include, along with an error message saying that we must include the input file path.

The only mandatory options are __*-i*__ (the input vcf file's path) and __*-o*__ (the output html file's path). We can test this out and set the output directory to the notebook's directory with the following command:

In [None]:
%%bash 
DIR=`pwd`
{path_to_folder}/chromosome_viewer.sh -i ./data/test.vcf -o $DIR/result1/result.html

This created a directory inside the current one which contains the output files: the result html file with its required javascript files for visualisation, and two text files containing chromosome lengths and annotation info.

The output should look like this:

<img src="Images/result1.png" alt="Result 1"/>

You can check the generated files to take a look too!

When we hover over the colored areas, a dialog window appears. It lists the variants in this region, and clicking on a variant leads to [NIH's dbSNP database](https://www.ncbi.nlm.nih.gov/snp/).

*Note*: When we click on the region the window won't disappear! 

<img src="Images/hover.png" alt="Hover dialog"/>

*Note*: Open the links in a new browser tab

When a variant is listed as null, it means that there is no information for it in the dbSNP database.

***

So far the options we know are:

- *-f* or *--filter* 
> Use only variants with value “PASS” in FILTER field

- *-r* or *--reference*
> If contig tags with lengths aren't provided, you can choose chromosome lengths from either the hg19 or the hg38 reference genome

- *-c* or *--chromosome*
> Filter by chromosomes, ex. <chr#:from-to,chr#:from-to,...>, *IMPORTANT*: notation should be the same as in the vcf file

- *-p* or *--pathogen*
> Filter by pathogenicity

- *-g* or *--clnsig*
> Filter by clinical significance, *USE NUMBERS ONLY*: Uncertain - 0, Not provided - 1, Benign - 2, Likely benign - 3, Likely pathogenic - 4, Pathogenic - 5, Drug-response related - 6, Histocompatibility-related - 7, Other - 255, ex. <-g 0,3,4>

By including the listed options we can narrow the input file down to the variants of interest.
For example, suppose we want only the first chromosome to be displayed. To select chromosomes we use the __*-c*__ option.

__Important__: We need to know the chromosome notation before using the __*-c*__ option. We can check it either in the contig section, if it's present, or in the *#CHROM* column of the variants. Common notations are, for example, chr1,chr2.. / 1,2.. / c1,c2.. etc.
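One way to look up the notation from a terminal is to list the distinct values in the first column of the data lines. The sketch below assumes an uncompressed, tab-separated VCF; `list_chroms` is a hypothetical helper name, not an option of the tool.

```bash
# Hypothetical helper: print each distinct chromosome name used in the data lines.
# Header lines (starting with '#') are skipped; the first tab-separated field is #CHROM.
list_chroms() {
    grep -v '^#' "$1" | cut -f1 | sort -u
}

# Example call on the tutorial's test file:
# list_chroms ./data/test.vcf
```

Whatever names this prints are the ones to pass to __*-c*__.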

In [None]:
%%bash 
DIR=`pwd`
{path_to_folder}/chromosome_viewer.sh -i ./data/test.vcf -o $DIR/result2/result.html -c chr1

The result should look like this:

<img src="Images/result2.png" alt="Result2"/>

We can also filter by chromosome and position range. For example, suppose we want chromosome 1 from its start (1) to base pair 130,000,000, and all of chromosome 2. We can use the following command:

In [None]:
%%bash 
DIR=`pwd`
{path_to_folder}/chromosome_viewer.sh -i ./data/test.vcf -o $DIR/result3/result.html -c chr1:1-130000000,chr2

The result should look like this:

<img src="Images/result3.png" alt="Result 3"/>
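To sanity-check a region before plotting it, we can count the variants that fall inside it directly from the VCF. This is a rough sketch, not the tool's own filtering code: `variants_in_region` is a hypothetical helper, it assumes an uncompressed, tab-separated VCF, and it treats both bounds as inclusive, which may or may not match how __*-c*__ handles them.

```bash
# Hypothetical helper: print the data lines on chromosome $2 whose POS (column 2)
# lies between $3 and $4, both bounds inclusive (an assumption of this sketch).
variants_in_region() {
    awk -F'\t' -v chrom="$2" -v from="$3" -v to="$4" \
        '!/^#/ && $1 == chrom && $2 >= from && $2 <= to' "$1"
}

# Example call: count the variants in chr1:1-130000000
# variants_in_region ./data/test.vcf chr1 1 130000000 | wc -l
```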

We can also display only the variants whose *FILTER* field has the value PASS. For this we use the __*-f*__ option. The command will look like this:

In [None]:
%%bash 
DIR=`pwd`
{path_to_folder}/chromosome_viewer.sh -i ./data/test.vcf -o $DIR/result4/result.html -c chr1:1-130000000,chr2 -f

The result should look like this:

<img src="Images/result4.png" alt="Result 4"/>
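To get a feel for how many variants the __*-f*__ option keeps, we can count the PASS entries ourselves. The sketch below assumes an uncompressed, tab-separated VCF in which FILTER is the seventh column, as in the standard VCF layout; `count_pass` is a hypothetical helper name.

```bash
# Hypothetical helper: count data lines whose FILTER field (column 7) is exactly PASS.
count_pass() {
    awk -F'\t' '!/^#/ && $7 == "PASS" { n++ } END { print n + 0 }' "$1"
}

# Example call on the tutorial's test file:
# count_pass ./data/test.vcf
```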

When the contig tag is missing or lacks chromosome lengths, we should use a reference genome for the chromosome lengths. It can be either *hg19* or *hg38* and is specified with the __*-r*__ option.
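A quick way to tell whether __*-r*__ is needed is to look for `##contig` header lines that carry a `length=` attribute. The sketch below assumes an uncompressed VCF; `contigs_with_length` is a hypothetical helper name.

```bash
# Hypothetical helper: print the ##contig header lines that include a length attribute.
# No output means the header gives no chromosome lengths.
contigs_with_length() {
    grep '^##contig=' "$1" | grep 'length='
}

# Example call on the file used in the next example:
# contigs_with_length ./data/annotated_data.vcf
```

If this prints nothing, the file gives no usable lengths and a reference should be chosen with __*-r*__.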

In the next example we will use a file that lacks a contig section, and we also want to display only pathogenic variants. To display pathogenic variants only, we use the __*-p*__ option:

In [None]:
%%bash 
DIR=`pwd`
{path_to_folder}/chromosome_viewer.sh -i ./data/annotated_data.vcf -o $DIR/result5/result.html -r hg38 -p

The result should look like this:

<img src="Images/result5.png" alt="Result 5"/>

*Note*: When we filter for pathogenic variants, their variant IDs lead instead to [NIH's ClinVar database](https://www.ncbi.nlm.nih.gov/clinvar/). You can try it!

<img src="Images/hover_pathogen.png" alt="Hover on pathogenic"/>

There is also an option to filter by clinical significance. If we wish to use it, we must specify the types with the __*-g*__ option. The types are the following:

- Uncertain - 0
- Not provided - 1
- Benign - 2
- Likely benign - 3
- Likely pathogenic - 4
- Pathogenic - 5
- Drug-response related - 6
- Histocompatibility-related - 7
- Other - 255

We can combine them using a comma.
Ex. If we want to display only variants with clinical significance Benign and Likely benign, we call the option like this: *-g 2,3*

*NOTE*: We can't use __*-p*__ and __*-g*__ options simultaneously. 
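Since __*-g*__ accepts numbers only, a small lookup can save a trip back to the list above. The sketch below is purely illustrative: `clnsig_code` and `clnsig_arg` are hypothetical helpers, and the lowercase labels are names invented for this example. Only the numeric codes come from the list above.

```bash
# Hypothetical helper: map an illustrative label to its numeric code from the list above.
clnsig_code() {
    case "$1" in
        uncertain)          echo 0 ;;
        not-provided)       echo 1 ;;
        benign)             echo 2 ;;
        likely-benign)      echo 3 ;;
        likely-pathogenic)  echo 4 ;;
        pathogenic)         echo 5 ;;
        drug-response)      echo 6 ;;
        histocompatibility) echo 7 ;;
        other)              echo 255 ;;
        *) echo "unknown label: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: join the codes for several labels with commas, ready for -g.
clnsig_arg() {
    local out="" code label
    for label in "$@"; do
        code=$(clnsig_code "$label") || return 1
        out="${out:+$out,}$code"
    done
    echo "$out"
}

# Example call: prints 2,3
# clnsig_arg benign likely-benign
```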

Run the following command to filter only drug response related variants:

In [None]:
%%bash 
DIR=`pwd`
{path_to_folder}/chromosome_viewer.sh -i ./data/annotated_data.vcf -o $DIR/result6/result.html -r hg38 -g 6

The result should look like this:

<img src="Images/result6.png" alt="Result 6"/>