# Pangenomics
--------------------------------------------

# Variant Calling with vg

## Overview

Variants can be called in two ways: from variation already present in the pangenomic graph, and from reads aligned to the graph. You will learn both approaches in this submodule.

## Learning Objectives
+ Understand different types of variants
+ Understand our ability to call variants with different types of reads and pangenomic graphs
+ Learn how to call and interpret variants with vg

## Get Started

First we will learn how to identify variants that are supported by the graph. Then we'll look at identifying novel variants that are not in the graph.

### Call Variants

We will look for variants that are supported by the graph as well as for variants that are novel (not in the graph but supported by the reads aligned to the graph).



## Calling Graph Supported Variants

Compute read support for variation already in the graph.


**vg pack**

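The parameters:

-x S288C.xg  
+ the xg index of the graph

-g S288C.SK1.illumina.gam  
+ the reads aligned to the graph (GAM file)

-Q 5  
+ ignore reads with mapping quality < 5 and bases with base quality < 5

-s 5  
+ ignore the first and last 5bp of each read

-o S288C.SK1.illumina.pack  
+ the output pack file

-t 20  
+ use 20 threads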

In [None]:
!vg pack -x S288C.xg -g S288C.SK1.illumina.gam -Q 5 -s 5 -o S288C.SK1.illumina.pack -t 20
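
To build intuition for what the `-Q` and `-s` filters do to read support, here is a minimal Python sketch on a toy linear reference. It is not how vg pack is implemented (vg works on graph nodes and edges, and `-Q` also filters on base quality, which this toy ignores). The function name `toy_coverage` and the example reads are made up for illustration.

```python
# Toy illustration of mapping-quality filtering and end trimming.
# Assumption: a linear reference and reads given as (start, length, mapq).
# vg pack itself operates on a graph; this only mimics the idea.

def toy_coverage(reads, ref_length, min_mapq=5, trim=5):
    """Count per-base coverage, skipping low-MAPQ reads and read ends."""
    coverage = [0] * ref_length
    for start, length, mapq in reads:
        if mapq < min_mapq:
            continue  # analogous to -Q: drop low-confidence mappings
        first = start + trim           # analogous to -s: skip leading bases
        last = start + length - trim   # exclusive end after skipping trailing bases
        for pos in range(max(first, 0), min(last, ref_length)):
            coverage[pos] += 1
    return coverage

# Hypothetical reads: (0-based start, read length, mapping quality)
reads = [(0, 20, 60), (5, 20, 60), (10, 20, 3)]
coverage = toy_coverage(reads, ref_length=30)
print(coverage)
```

The third read is dropped entirely because its mapping quality is below 5, and the two remaining reads only contribute support from their middle 10 bases.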

Generate a VCF from the read support.

**vg call**

The parameters:

-k S288C.SK1.illumina.pack
+ The read support file to read in

-t 20
+ Use 20 threads

In [None]:
!vg call S288C.xg -k S288C.SK1.illumina.pack -t 20 > S288C.SK1.illumina.graph_calls.vcf
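
Once you have a VCF, a quick way to start interpreting it is to count how many records of each variant type it contains. The sketch below is a rough, length-based classification written in plain Python, not a vg feature. The helper names and the example records (including the `chrI` contig name) are hypothetical, and symbolic alleles such as `<DEL>` are simply skipped.

```python
from collections import Counter

def classify(ref, alt):
    """Label one REF/ALT pair by comparing allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) == len(alt):
        return "MNP"  # same-length, multi-base substitution
    return "insertion" if len(alt) > len(ref) else "deletion"

def count_variant_types(lines):
    """Count variant types over an iterable of VCF text lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]
        for alt in alts.split(","):
            if alt in (".", "*") or alt.startswith("<"):
                continue  # skip missing, spanning, and symbolic alleles
            counts[classify(ref, alt)] += 1
    return counts

# Made-up records for illustration only (not real output from vg call)
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chrI\t100\t.\tA\tG\t50\tPASS\t.",
    "chrI\t250\t.\tAT\tA\t50\tPASS\t.",
    "chrI\t400\t.\tC\tCGG,T\t50\tPASS\t.",
]
counts = count_variant_types(example_vcf)
print(dict(counts))
```

To run this on your own calls, pass an open file handle instead of the example list, for instance `count_variant_types(open("S288C.SK1.illumina.graph_calls.vcf"))`.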

## Calling Novel Variants

Augment the graph with the mapped reads.

**vg augment**

The parameters:

-A S288C.SK1.illumina.aug.gam
+ Write the reads, realigned to the augmented graph, to this GAM file

-t 20
+ Use 20 threads


NOTE: This only supports VG files. Indexes used for mapping must be built from the same VG file being augmented (i.e. indexes built from GFA files that were then converted to VG won't work).


In [None]:
!vg augment S288C.vg S288C.SK1.illumina.gam -A S288C.SK1.illumina.aug.gam -t 20 > S288C.SK1.illumina.aug.vg

Index the augmented graph.

The parameters:

-x S288C.SK1.illumina.aug.xg
+ The xg index file to write

-t 20
+ Use 20 threads

In [None]:
!vg index -x S288C.SK1.illumina.aug.xg S288C.SK1.illumina.aug.vg -t 20

Compute read support for novel variation.

In [None]:
!vg pack -x S288C.SK1.illumina.aug.xg -g S288C.SK1.illumina.aug.gam -Q 5 -s 5 -o S288C.SK1.illumina.aug.pack -t 20

Generate a VCF from the support.

In [None]:
!vg call S288C.SK1.illumina.aug.xg -k S288C.SK1.illumina.aug.pack -t 20 > S288C.SK1.illumina.aug_calls.vcf
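
One way to see which calls are novel is to compare the augmented-graph calls against the graph-supported calls and keep the records found only in the former. The sketch below does this with a simple set difference in Python. It assumes both VCFs report positions on the same reference paths and that a variant is written identically in both files, which will not always hold. The function name and the example records are hypothetical.

```python
def variant_keys(lines):
    """Return a set of (chrom, pos, ref, alt) tuples from VCF text lines."""
    keys = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):
            keys.add((chrom, int(pos), ref, alt))
    return keys

# Made-up records standing in for the two VCFs (illustration only)
graph_calls = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chrI\t100\t.\tA\tG\t50\tPASS\t.",
]
aug_calls = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chrI\t100\t.\tA\tG\t50\tPASS\t.",
    "chrI\t250\t.\tAT\tA\t50\tPASS\t.",
]

# Variants present in the augmented calls but absent from the graph calls
novel = sorted(variant_keys(aug_calls) - variant_keys(graph_calls))
print(novel)
```

With real data, replace the two example lists with open file handles for `S288C.SK1.illumina.graph_calls.vcf` and `S288C.SK1.illumina.aug_calls.vcf`.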

## Outputting Variants Already in the Graph

Output the variants that were used to construct the graph.


**vg deconstruct**

The parameters:

-P S288C
+ report variants relative to paths with names that start with S288C

-t 20
+ use 20 threads

NOTE: S288C.deconstruct.vcf might not be identical to S288C.vcf because vg takes liberties with variants when constructing the graph. S288C.vcf is the variant file the graph was built from, whereas S288C.deconstruct.vcf is read back out of the finished graph.


In [None]:
!vg deconstruct S288C.xg -P S288C -t 20 > S288C.deconstruct.vcf
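
One reason two VCFs can differ while describing the same variation is that the same change can be written with different amounts of flanking sequence. The Python sketch below trims bases shared by REF and ALT so that such records can be compared. It is only an illustration of the idea, not what vg does internally, and it does not left-align indels (full normalization needs the reference sequence). The function name `trim_alleles` is made up.

```python
def trim_alleles(pos, ref, alt):
    """Trim bases shared by REF and ALT, keeping at least one base in each."""
    # Trim the shared suffix first; the position does not change.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim the shared prefix; each trimmed base shifts the position by one.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt

# The same SNP written with and without flanking bases
padded = trim_alleles(100, "CAT", "CGT")
minimal = trim_alleles(101, "A", "G")
print(padded, minimal, padded == minimal)
```

After trimming, both spellings of the SNP reduce to the same position and alleles, so a comparison keyed on (position, REF, ALT) treats them as one variant.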

## Exercises

1. Use vg to index the chromosome VIII graph
2. Use vg to map SK1 reads to the chromosome VIII graph
3. Use vg to call variants from the chromosome VIII read-mapping GAM files


## Conclusion

In this submodule, you learned different ways to call and characterize variants from the graph, including variants supported within the graph and variants supported by reads mapped to the graph.

## Clean up
No cleanup is necessary for this submodule. Don't forget to shut down your Workbench when you are done working through this module!