# Align paired-end reads to the SARS-CoV-2 genome (NC_045512)

## Index FASTA file using bowtie2
```
mkdir bowtie2
bowtie2-build NC_045512.fa bowtie2/NC_045512
```
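
As a quick sanity check after indexing, the sketch below verifies that the six files a small bowtie2 index normally consists of are present. The function name `check_bt2_index` is hypothetical, and large indexes (`.bt2l` files) are not handled.

```shell
# Hypothetical helper: succeed only if all six small-index files exist
# and are non-empty for the given index prefix.
check_bt2_index() {
  prefix="$1"
  for suffix in 1 2 3 4 rev.1 rev.2; do
    if [ ! -s "${prefix}.${suffix}.bt2" ]; then
      echo "missing index file: ${prefix}.${suffix}.bt2" >&2
      return 1
    fi
  done
  echo "index OK: ${prefix}"
}

# Usage (after bowtie2-build has run):
#   check_bt2_index bowtie2/NC_045512
```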

## Align FASTQ reads using bowtie2
If needed, first downsample the reads to 10000x coverage using DRAGEN FastQC, with the genome size set to 30329 bp.
```
bowtie2 -t -x bowtie2/NC_045512 -1 SarsCov2Million_558613411_DS_S1_L001_R1_001.fastq.gz -2 SarsCov2Million_558613411_DS_S1_L001_R2_001.fastq.gz -S bowtie2/SarsCov2Million.sam
```
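
To make the 10000x downsampling target concrete, here is a rough estimate of how many read pairs it corresponds to. The coverage and genome size are the values given above; the 150 bp read length is an assumption, since the read length is not stated here.

```shell
TARGET_COVERAGE=10000   # target depth from the text
GENOME_SIZE=30329       # genome size from the text (bp)
READ_LENGTH=150         # ASSUMPTION: read length is not stated in this README

# Each pair contributes two reads' worth of bases.
BASES_PER_PAIR=$((2 * READ_LENGTH))

# pairs = coverage * genome_size / bases_per_pair, rounded up
PAIRS_NEEDED=$(( (TARGET_COVERAGE * GENOME_SIZE + BASES_PER_PAIR - 1) / BASES_PER_PAIR ))

echo "read pairs for ${TARGET_COVERAGE}x: ${PAIRS_NEEDED}"
# prints: read pairs for 10000x: 1010967
```

Under that assumption the target works out to roughly one million read pairs.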

## Sort SAM file and convert to BAM
```
mkdir -p samtools
samtools view -bS bowtie2/SarsCov2Million.sam | samtools sort -o samtools/SarsCov2Million.bam
```
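
Before calling variants it can help to confirm that the sorted BAM is intact and to look at the mapping rate. The following is a minimal sketch, not part of the original workflow; `check_bam` is a hypothetical helper name.

```shell
# Hypothetical helper: basic integrity check, index, and mapping summary for a BAM file.
check_bam() {
  bam="$1"
  if [ ! -s "$bam" ]; then
    echo "missing or empty BAM: $bam" >&2
    return 1
  fi
  samtools quickcheck "$bam" || return 1   # header present and EOF block intact
  samtools index "$bam" || return 1        # needs coordinate-sorted input
  samtools flagstat "$bam"                 # counts of mapped and properly paired reads
}

# Usage:
#   check_bam samtools/SarsCov2Million.bam
```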

## Get consensus FASTA file

### Call variants
```
bcftools mpileup -Ou -f NC_045512.fa samtools/SarsCov2Million.bam | bcftools call -mv -Oz -o SarsCov2Million.vcf.gz
```
### Normalize indels
```
bcftools norm -f NC_045512.fa SarsCov2Million.vcf.gz -Ob -o SarsCov2Million.norm.bcf
```
### Filter variants
```
bcftools filter -sLowQual -g3 -G10 -e '%QUAL<10 || (AC<2 && %QUAL<15)' -Oz SarsCov2Million.norm.bcf -o SarsCov2Million.hard-filtered.vcf.gz
```
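
Note that `-s` only annotates the FILTER column of failing records; it does not remove them from the file. It may therefore be worth checking how many records were flagged before building a consensus. The sketch below tallies FILTER values from headerless VCF text; `count_filters` is a hypothetical helper name.

```shell
# Hypothetical helper: read headerless VCF records on stdin and
# print "FILTER<TAB>count" for each distinct FILTER value.
count_filters() {
  awk -F'\t' '!/^#/ { n[$7]++ } END { for (f in n) print f "\t" n[f] }' | LC_ALL=C sort
}

# Usage (bcftools view -H prints records without the header):
#   bcftools view -H SarsCov2Million.hard-filtered.vcf.gz | count_filters
```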
### Index variants
```
bcftools index SarsCov2Million.hard-filtered.vcf.gz
```
### Generate consensus sequence
```
bcftools consensus -f NC_045512.fa SarsCov2Million.hard-filtered.vcf.gz > SarsCov2Million.cns.fa
```
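
A simple way to sanity-check the result is to compare sequence lengths: the consensus should differ from the reference only by the net length of the applied indels. This is a minimal sketch using standard Unix tools; `fasta_length` is a hypothetical helper name, and it sums over all records if the file contains more than one.

```shell
# Hypothetical helper: total number of sequence characters in a FASTA file
# (header lines and line breaks excluded).
fasta_length() {
  grep -v '^>' "$1" | tr -d '\n\r' | wc -c | tr -d ' '
}

# Usage:
#   fasta_length NC_045512.fa
#   fasta_length SarsCov2Million.cns.fa
```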