# Alignment - WSI specific

## Introduction

In this tutorial, we will align whole-genome sequencing data from a mouse zygote that was subjected to CRISPR-induced mutagenesis. We will find the resulting engineered alleles, and track down some other alleles in this mouse.

## Authors
This tutorial was written by [Vivek Iyer](https://www.sanger.ac.uk/people/directory/iyer-vivek) and [Thomas Keane](https://github.com/tk2). It was adapted for the WSI environment by Vivek Iyer.

## Tutorial sections
This tutorial comprises the following sections:   
 1. [Visualising the reference genome](reference_genome_visualisation.ipynb)   
 2. [Aligning paired FASTQ files with BWA](bwa_alignment.ipynb)   
 3. [Converting a SAM file to a BAM file](sam_bam_conversion.ipynb)   
 4. [Sorting and indexing the BAM file](sort_and_index_bam.ipynb)   
 5. [Using Unix pipes to combine the commands together](piping_commands.ipynb)
 6. [Marking PCR duplicates](pcr_duplicates.ipynb)
 7. [Generating QC stats](qc_stats.ipynb)
 8. [BAM visualisation](bam_visualisation.ipynb)
 

## Running the commands from this tutorial
The commands in this tutorial are meant to be run directly in a WSI (Sanger) server environment, in our case by logging into the server node `vr-login`. We have prepared a working area for each user, `/lustre/scratch115/teams/hgi/WSI_NGS_tutorial/read_alignment/Exercise2/<user_name>`, and all directories mentioned in the tutorial are underneath this directory.

### Sanger user id, 'ssh gateway' and 'network' passwords
We assume you have a Sanger user id, which we will refer to as *_user_*, and that you have already used it to:

* Log into the Sanger ssh gateway (using your ssh gateway password)
* Log into individual servers (using your network password)

### Logging into the WSI server environment and running commands in the terminal
To run these commands you first have to open a terminal window, log into the WSI compute environment and navigate to the working directory. A terminal window is similar to the "Command Prompt" window on MS Windows systems, which allows the user to type DOS commands to manage files.

**Open a terminal window in your VM**

**Log into the Sanger server environment by entering the following command at the prompt:**

    $ ssh -L localhost:3128:wwwcache.sanger.ac.uk:3128 vvi@ssh.sanger.ac.uk

_where 'vvi' should be replaced with your user name (e.g. "kj6" or "ls7")._
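As a reading aid: the `-L` option in the command above asks ssh to forward a local port, so that connections to `localhost:3128` on your machine are tunnelled through the gateway to `wwwcache.sanger.ac.uk:3128`. The sketch below only builds and prints the login command for a given user id, so you can check it before running it. `gateway_cmd` is a hypothetical helper name introduced here for illustration; it is not part of the tutorial or of any Sanger tooling.

```shell
# Hypothetical helper (an assumption, not part of the tutorial's tooling):
# print the gateway login command for the user id given as the first argument.
gateway_cmd() {
    # ssh -L local_address:local_port:remote_host:remote_port forwards a local port
    printf 'ssh -L localhost:3128:wwwcache.sanger.ac.uk:3128 %s@ssh.sanger.ac.uk\n' "$1"
}

# Example: show the command for user "kj6" (replace with your own user id).
gateway_cmd kj6
```

Copy the printed line to the prompt once it shows your own user id.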

**You will receive a password prompt: enter your ssh gateway password.** 

    vvi@ssh.sanger.ac.uk's password: 
    Last login: Sun Aug 25 08:30:51 2019 from cpc91218-cmbg18
                  Wellcome Sanger Institute
                         SSH Gateway

         ********************************************
         * This system is for authorised users only *
         ********************************************

**After entering the ssh gateway, you will be asked "Where would you like to go today?" Enter the server "vr-login".**

    Where would you like to go today (exit to logout)? 

        > vr-login
 
**You will then be prompted for your standard network password:**

    Connecting to vr-login.internal.sanger.ac.uk ...
    vvi@vr-login.internal.sanger.ac.uk's password: 
    
**On providing this, you will get access to a command prompt on vr-login**

    Welcome to Ubuntu 12.04.2 LTS (GNU/Linux 3.2.0-105-generic x86_64)
     * Documentation:  https://help.ubuntu.com/
    This system is managed by CFEngine 3
    Last login: Sun Aug 25 11:11:46 2019 from ssh.sanger.ac.uk
    vvi@vr-2-2-02:~$ 
    
**Set up your software environment to mirror NPG's environment**

    vvi@vr-2-2-02:~$ . /software/npg/etc/profile.npg
   
This will allow you to access the same software (samtools, bcftools etc.) that NPG uses.
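To confirm that the profile has taken effect, you can check that the tools are now on your `PATH`. The following is a minimal sketch: `check_tools` is a hypothetical helper name, and the tool list is simply the two programs named above. It relies only on the standard shell built-in `command -v`.

```shell
# Hypothetical helper (an assumption, not provided by the NPG profile):
# report where each named tool is found, and return non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s: %s\n' "$tool" "$(command -v "$tool")"
        else
            printf '%s: not found on PATH\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Check the two tools mentioned above; print a hint if either is missing.
check_tools samtools bcftools || echo "Some tools are missing: re-run '. /software/npg/etc/profile.npg'"
```

If a tool is reported as not found, source the profile again in the same terminal session before continuing.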

**Navigate to your working area of the course, replacing 'vvi' with your user name**

    vvi@vr-2-2-02:~$ cd /lustre/scratch115/teams/hgi/WSI_NGS_tutorial/read_alignment/Exercise2/vvi/
    
All following commands assume that you are in this directory.
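Because later commands use paths relative to this directory, it can help to verify where you are before starting each section. The sketch below is one way to do that; `in_dir` is a hypothetical helper name, and `user` must be set to your own user id ('vvi' is only the example used throughout this page).

```shell
# Hypothetical helper (an assumption, not part of the tutorial's tooling):
# succeed only if the current directory is the directory given as argument.
in_dir() {
    # Resolve the target in a subshell; fail if it does not exist.
    target="$(cd "$1" 2>/dev/null && pwd -P)" || return 1
    [ "$(pwd -P)" = "$target" ]
}

user="vvi"  # replace with your own user id
workdir="/lustre/scratch115/teams/hgi/WSI_NGS_tutorial/read_alignment/Exercise2/${user}"

if in_dir "$workdir"; then
    echo "OK: you are in your working area"
else
    echo "Not in $workdir (or it does not exist): cd there before continuing"
fi
```

The comparison uses `pwd -P` on both sides so that a path reached through a symbolic link still matches.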

## Let’s get started!

To get started with the tutorial, head to the first section: [Visualising the reference genome](reference_genome_visualisation.ipynb).