# Sequencing Pipeline

### The data files (reference FASTA and raw sequencing FASTQ) are:
- wu_0.v7.fas is the reference file;
- wu_0_A_wgs.fastq is the raw sequencing data file.
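
Before starting, it can help to confirm that both input files are present and to note how many reference sequences and reads they contain. The sketch below is one way to do that. `count_fastq_reads` is a hypothetical helper name, and it assumes an uncompressed FASTQ file with exactly four lines per read.

```shell
# Hypothetical helper: count reads in an uncompressed FASTQ file.
# Assumes each record spans exactly 4 lines (header, sequence, '+', qualities).
count_fastq_reads() {
    lines=$(wc -l < "$1") || return 1
    echo $((lines / 4))
}

# Report only when both input files exist and are non-empty.
if [ -s wu_0.v7.fas ] && [ -s wu_0_A_wgs.fastq ]; then
    # Every FASTA sequence starts with a '>' header line.
    echo "reference sequences: $(grep -c '^>' wu_0.v7.fas)"
    echo "reads: $(count_fastq_reads wu_0_A_wgs.fastq)"
fi
```

The read count is a useful number to keep, since it should match the total that the aligner reports later.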

## Index FASTA

In [None]:
mkdir idx

> $ **bowtie2-build**  [reference_fasta_file.fasta]  [index_file]

In [None]:
bowtie2-build wu_0.v7.fas idx/wu_0_idx
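
After indexing, a quick check that the index files were written can save a failed alignment later. The sketch below assumes a small index, for which `bowtie2-build` writes six `.bt2` files next to the prefix (large genomes use a `.bt2l` extension instead, which this check does not cover). `bowtie2_index_complete` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if all six small-index files exist
# and are non-empty for the given index prefix.
bowtie2_index_complete() {
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        if [ ! -s "$1.$suffix" ]; then
            echo "missing: $1.$suffix" >&2
            return 1
        fi
    done
    return 0
}

# Check the index prefix used in this pipeline.
if bowtie2_index_complete idx/wu_0_idx; then
    echo "index looks complete"
fi
```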

## Alignment

> $ **bowtie2 -p** 4 **-x**  [fasta_index_file]  **-U**  [in.fastq]  **-S**  [out.sam]

In [None]:
bowtie2 -p 4 -x idx/wu_0_idx -U wu_0_A_wgs.fastq -S out.full.sam
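
bowtie2 prints an alignment summary to standard error, but the same numbers can be recovered from the SAM file itself. The sketch below counts mapped and unmapped records using bit 0x4 of the FLAG column (column 2). `sam_mapped_counts` is a hypothetical helper name, and the counts equal read counts only under the assumption of one record per read, as with unpaired input and the default reporting mode.

```shell
# Hypothetical helper: print "<mapped> <unmapped>" record counts for a SAM file.
sam_mapped_counts() {
    awk -F '\t' '
        /^@/ { next }                               # skip header lines
        int($2 / 4) % 2 == 1 { unmapped++; next }   # FLAG bit 0x4 set: unmapped
        { mapped++ }                                # everything else is mapped
        END { printf "%d %d\n", mapped, unmapped }
    ' "$1"
}

# Report only when the alignment output exists.
if [ -f out.full.sam ]; then
    sam_mapped_counts out.full.sam
fi
```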

## SAM to BAM

> $ **samtools view -bT** [reference_fasta_file.fasta] [in.sam] **>** [out.bam]

In [None]:
samtools view -bT wu_0.v7.fas out.full.sam > out.full.bam

## Sort BAM File

> $ **samtools sort** [original.bam] **-o** [sorted.bam]

In [None]:
samtools sort out.full.bam -o out.full.sorted.bam

## Index BAM

> $ **samtools index** [sorted.bam]

In [None]:
samtools index out.full.sorted.bam

## Mpileup to VCF

- The VCF file generated in this step is a list of **candidate** sites, not final variants.

> $ **samtools mpileup -f**  [reference.fasta]  **-uv**  [sorted_indexed.bam]  **>**  [candidate_entry_vcf_file.vcf]

In [None]:
samtools mpileup -f wu_0.v7.fas -uv out.full.sorted.bam > out.full.mpileup.vcf

## Build BCF

- The BCF file generated in this step holds the **genotype likelihoods** for the candidate sites in binary format; the next step calls the actual variants from it.

> $ **samtools mpileup -f**  [reference.fasta]  **-g**  [sorted_indexed.bam]  **>**  [variant.bcf]

In [None]:
samtools mpileup -f wu_0.v7.fas -g out.full.sorted.bam > out.full.mpileup.bcf

## Call Variants

> $ **bcftools call -m -v -O v**  [variant.bcf]  **>**  [variant_call_format_file.vcf]

In [None]:
bcftools call -m -v -O v out.full.mpileup.bcf > out.full.vcf
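
As a quick sanity check on the result, the number of called variants can be read off the VCF by counting the lines that are not header lines (header lines start with `#`). This is a minimal sketch; `count_vcf_records` is a hypothetical helper name, and it assumes an uncompressed VCF as written by `-O v`.

```shell
# Hypothetical helper: count variant records (non-header lines) in a plain-text VCF.
count_vcf_records() {
    # "n + 0" forces a numeric 0 when the file holds no records.
    awk '!/^#/ { n++ } END { print n + 0 }' "$1"
}

# Report only when the final VCF exists.
if [ -f out.full.vcf ]; then
    echo "variant records: $(count_vcf_records out.full.vcf)"
fi
```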