VarCal is a simple Python script for calling SNPs.
It takes a BAM file and a reference FASTA as input and reports the called variants in a tabular format.
To run the tool, download the conda environment file (env.yml) and the Python script (varcall.py).
The steps are given below:
$ conda env create -f env.yml
$ conda activate varcall
$ python3 varcall.py -h
usage: varcall.py [-h] -b BAM -r REF -o OUT
Generate variant calls from a BAM file using ref fasta
optional arguments:
-h, --help show this help message and exit
-b BAM, --bam BAM bam file
-r REF, --ref REF reference fasta
-o OUT, --out OUT file name to output variant calls
The output is a tab-delimited text file with the following columns:
- Chr - chromosome
- POS - position of the variant
- REF - Reference base
- ALT - Alternate base
- Avg_BQ - Average base quality
- Avg_MQ - Average mapping quality
- Coverage - Read coverage
- Avg_PiR - Average position of the base in the read
- ALT_DP - Depth of the alternate allele
- AF - Allele frequency
- Entropy - Entropy based on the number of reads on the +ve and -ve strands
- Total_Pos_Strand - Number of total reads mapped to +ve strand
- Total_Neg_Strand - Number of total reads mapped to -ve strand
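The Entropy column is described only as being based on the +ve and -ve strand counts. One common way to compute such a value is Shannon entropy over the two strand proportions, sketched below. This is an assumption, not a confirmed description of what varcall.py does: the function name `strand_entropy` and the choice of log base 2 are hypothetical.

```python
import math


def strand_entropy(pos_reads, neg_reads):
    """Shannon entropy (in bits) of the +ve/-ve strand split.

    Hypothetical helper: assumes entropy is computed over the two
    strand proportions with log base 2. The script may differ.
    """
    total = pos_reads + neg_reads
    if total == 0:
        return 0.0  # no reads, so no strand information
    entropy = 0.0
    for count in (pos_reads, neg_reads):
        if count > 0:  # a zero count contributes nothing to the sum
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


print(strand_entropy(10, 10))  # balanced strands
print(strand_entropy(20, 0))   # all reads on one strand
```

Under this definition a value near 1 means reads are split evenly between strands, while a value near 0 means nearly all reads come from one strand, which can point to strand bias.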
The following filters are applied by default:
- Uniquely mapped reads only
- Minimum total coverage of 5
- Minimum variant-supporting coverage of 2
- Minimum average base quality of q15
- Minimum variant allele frequency of 0.01
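The thresholds listed above can be expressed as a simple row-level check. The following is a minimal sketch, not the code from varcall.py: the function name `passes_default_filters` is hypothetical, it assumes each output row is available as a dictionary keyed by the column names listed earlier, and it assumes the thresholds are inclusive. The uniquely-mapped-reads filter acts on individual reads in the BAM file and is not covered here.

```python
# Thresholds taken from the default filters listed above.
MIN_COVERAGE = 5
MIN_ALT_DP = 2
MIN_AVG_BQ = 15
MIN_AF = 0.01


def passes_default_filters(row):
    """Return True if a variant row meets the default thresholds.

    Hypothetical helper: assumes `row` maps the output column names
    to values and that the comparisons are inclusive (>=).
    """
    return (
        float(row["Coverage"]) >= MIN_COVERAGE
        and float(row["ALT_DP"]) >= MIN_ALT_DP
        and float(row["Avg_BQ"]) >= MIN_AVG_BQ
        and float(row["AF"]) >= MIN_AF
    )


# Made-up values for illustration only.
example = {"Coverage": "12", "ALT_DP": "3", "Avg_BQ": "31.5", "AF": "0.25"}
print(passes_default_filters(example))
```

A check like this can also be reused with stricter constants to post-filter an existing output file, for example by reading it with `csv.DictReader` and `delimiter="\t"`, assuming the file starts with a header row.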