# Variant calling

This [Jupyter notebook](https://jupyter.org/) is a guide and workflow for detecting variants in whole genome or targeted sequencing data. It consists of two parts:

1. Variant detection using the [nf-core/sarek](https://nf-co.re/sarek) pipeline, a variant calling and annotation pipeline built with the [Nextflow](https://www.nextflow.io/) workflow manager.

2. Identification of functional variants. nf-core/sarek annotates variants using the [SnpEff package](http://pcingola.github.io/SnpEff/). In this section we use [SnpSift](http://pcingola.github.io/SnpEff/snpsift/introduction/) to filter the annotated variants down to those that are biologically relevant.

This workflow was prepared by the [eResearch Office, QUT](https://qutvirtual4.qut.edu.au/group/staff/governance/organisational-structure/academic-division/research-portfolio/research-infrastructure/eresearch).

**********************************

# Contents
[How to use this Jupyter Notebook](#overview)

1. [nf-core/sarek workflow](./WES_1.ipynb)

2. [Identification of functional variants](./WES_2.ipynb)

Each of the numbered links above opens a separate Jupyter Notebook for one of the two main analysis sections.


***************************

## How to use this Jupyter Notebook <a class="anchor" id="overview"></a>

Jupyter Notebooks run a 'kernel' that allows code to be run in code 'cells' in the Notebook. This Notebook is running the Bash kernel, which allows commands to be run on QUT's high performance computing (HPC) cluster.

You can run a code cell by clicking on the cell and then clicking the run button (at the top of this Notebook), or by pressing Shift+Enter.

![](https://data36.com/wp-content/uploads/2021/07/how-to-run-cell-in-jupyter-notebook.png)

<div class="alert alert-block alert-warning">
As an example, run the following code cell to list the contents of your HPC home directory.
</div>

In [3]:
ls $HOME

Afshin                                nextflow
anacapa                               Nextflow_pipelines
Angelico                              NTNU_2301
Annette                               NXF_CONDA_CACHEDIR
bin                                   NXF_SINGULARITY_CACHEDIR
edirect                               pipeliner
errors.log                            projectxx_final_results
git                                   public_html
go                                    R
IMR_scale_slime                       rachel
James_Stanley_groundwater_microbiome  Rolf_bactocell
Jupyter_Nextflow_install              rolf_erik
jupyter_notebooks                     sam_dando
Jupyter_WES                           singularity
liver_project_final_results           temp
local                                 tmux
mahsa                                 Vikki_horses_pacbio16s
marco                                 WGS_police
miniconda2                            wtdbg2
miniconda3                            yeast

**Before each code cell is a colour-coded text box that tells you what the cell does. The colour of the text box tells you whether the code cell must be run as-is, is optional, or requires you to type input.**

<div class="alert alert-block alert-success">
A green text box indicates a code cell that must be run, without alteration, to complete the workflow.
</div>

<div class="alert alert-block alert-warning">
A yellow text box indicates an optional code cell that doesn't have to be run to complete the workflow, but can be run to complete optional tasks.
</div>

<div class="alert alert-block alert-info">
A blue text box indicates a code cell that requires user input. This cell must also be run to complete the workflow, but you need to modify the command in the cell before running it.
</div>
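
To illustrate what a blue-box cell might look like, here is a minimal sketch. The variable name `sample_name` and its value are hypothetical placeholders, not part of the actual workflow; the real cells in the analysis Notebooks will ask for their own inputs.

```shell
# Hypothetical blue-box cell: the variable name and value are placeholders,
# not taken from the actual workflow. Edit the value before running the cell.
sample_name="sample_01"

# Print the value back so you can confirm the edit before continuing.
echo "Sample name set to: ${sample_name}"
```

Printing the value back after setting it is a simple way to check that your edit was picked up before you move on to the next cell.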

*******************************

[**Click here to open the nf-core/sarek Notebook**](./WES_1.ipynb)