In this session you will learn how to perform:

  • genotype calling
  • allele frequency estimation
  • variant (or SNP) calling

We are using the program ANGSD (Analysis of Next Generation Sequencing Data). More information about its rationale and implemented methods can be found here.

According to its website, ANGSD is software for analyzing next generation sequencing data. The software can handle a number of different input types, from mapped reads to imputed genotype probabilities. Most methods take genotype uncertainty into account instead of basing the analysis on called genotypes. This is especially useful for low and medium depth data.

Please follow the preparatory instructions below before running these examples. Briefly, you need to set the paths to the software and to the data that will be used, and create two folders in your working directory: one for your results and one for your intermediate data.

Make sure you are in your home directory.

cd ~

and create a folder for this session and enter it

mkdir day2
cd day2

and you should be in ~/day2.

Next, create the two folders in your working directory, one for your results and one for your intermediate data, and store their paths in variables.

mkdir Results
RESDIR=~/day2/Results

mkdir Data
DATDIR=~/day2/Data
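If you re-run the setup, a plain `mkdir` will complain that the folder already exists. A minimal sketch of a re-runnable variant is shown here; `BASEDIR` is a name assumed for this example only and is not used elsewhere in the session.

```shell
# Hypothetical variant of the setup above: define the paths first, then create them.
# BASEDIR is an assumed helper variable; it defaults to ~/day2 as in the tutorial.
BASEDIR="${BASEDIR:-$HOME/day2}"
RESDIR="$BASEDIR/Results"
DATDIR="$BASEDIR/Data"

# -p creates missing parent folders and does not fail if a folder already exists
mkdir -p "$RESDIR" "$DATDIR"

# List both folders to confirm they are in place
ls -d "$RESDIR" "$DATDIR"
```

The resulting `RESDIR` and `DATDIR` hold the same paths as in the commands above, so the rest of the session should work unchanged.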

Let's set the variables that point to the input data

DIR=/home/ubuntu/Share/physalia-lcwgs/data
DATA=$DIR/BAMS
REF=$DIR/Ref.fa
ANC=$DIR/outgrp_ref.fa
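Before moving on, it can help to confirm that these variables point to something that exists on your machine. The following is only a sketch: `check_path` is a hypothetical helper written for this example, and the paths are the ones set above, which may differ on your system.

```shell
# Same variables as set above
DIR=/home/ubuntu/Share/physalia-lcwgs/data
DATA=$DIR/BAMS
REF=$DIR/Ref.fa
ANC=$DIR/outgrp_ref.fa

# check_path is a hypothetical helper: it reports whether a path exists
check_path() {
    if [ -e "$1" ]; then
        echo "ok: $1"
    else
        echo "MISSING: $1"
    fi
}

# Check the BAM folder and both reference files
for p in "$DATA" "$REF" "$ANC"; do
    check_path "$p"
done
```

Any line starting with `MISSING` suggests a typo in the variable or a different data location on your system.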

The workflow for this session looks like this

[Figure: workflow stages]

This may look daunting at first, but it is not: we will go through the steps one at a time so that each of them is clear.

The workflow is roughly divided into four steps:

1. Data filtering and I/O

2. Genotype likelihoods

3. Genotype calling

4. SNP calling

You are now going to learn how to build your first pipeline in ANGSD for data processing and filtering.

Click here to move to the next session.