# `snakemake` Short Tutorial

Solutions
---------

Only read this if you have a problem with one of the steps.

If you get an error, please create the `envs` folder manually.

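If writing `envs/mapping.yaml` fails because the `envs` directory does not exist yet, one way to create it from inside the notebook is the minimal sketch below. It assumes the notebook's working directory is the directory that will hold the `Snakefile`.

```python
from pathlib import Path

# Create the "envs" folder in the current working directory.
# exist_ok=True makes this a no-op if the folder is already there.
envs_dir = Path("envs")
envs_dir.mkdir(parents=True, exist_ok=True)
```
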
    %%writefile envs/mapping.yaml
    channels:
        - bioconda
        - conda-forge
    dependencies:
        - bwa =0.7.17
        - samtools =1.9

### Step 2

The rule should look like this:

    %%writefile Snakefile
    rule bwa:
        input:
            "data/genome.fa",
            "data/samples/A.fastq"
        output:
            "mapped/A.bam"
        conda:
            "envs/mapping.yaml"
        shell:
            "bwa mem {input} | samtools view -Sb - > {output}"

### Step 3

The rule should look like this:

    rule bwa:
        input:
            "data/genome.fa",
            "data/samples/{sample}.fastq"
        output:
            "mapped/{sample}.bam"
        conda:
            "envs/mapping.yaml"
        shell:
            "bwa mem {input} | samtools view -Sb - > {output}"

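To see what the `{sample}` wildcard does, here is a rough plain-Python sketch of how a requested file name can be matched against the output pattern and the matched value substituted into the input patterns. It only imitates the idea with a regular expression and is not Snakemake's actual implementation. The helper name `infer_inputs` is made up for illustration.

```python
import re

OUTPUT_PATTERN = "mapped/{sample}.bam"
INPUT_PATTERNS = ["data/genome.fa", "data/samples/{sample}.fastq"]


def infer_inputs(target):
    """Return the inputs needed to build `target`, or None if the rule does not apply."""
    # Turn "mapped/{sample}.bam" into a regex with a named group for the wildcard.
    regex = re.escape(OUTPUT_PATTERN).replace(r"\{sample\}", r"(?P<sample>.+)")
    match = re.fullmatch(regex, target)
    if match is None:
        return None
    # Fill the matched wildcard value into every input pattern.
    return [p.format(sample=match.group("sample")) for p in INPUT_PATTERNS]


print(infer_inputs("mapped/B.bam"))
```

Requesting `mapped/B.bam` binds `sample` to `B`, so the same rule now covers every sample without being rewritten.
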
### Step 4

The rule should look like this:

    rule sort:
        input:
            "mapped/{sample}.bam"
        output:
            "mapped/{sample}.sorted.bam"
        conda:
            "envs/mapping.yaml"
        shell:
            "samtools sort -o {output} {input}"

### Step 5

The rule should look like this:

    samples = ["A", "B", "C"]

    rule call:
        input:
            fa="data/genome.fa",
            bam=expand("mapped/{sample}.sorted.bam", sample=samples)
        output:
            "calls/all.vcf"
        conda:
            "envs/calling.yaml"
        shell:
            "samtools mpileup -g -f {input.fa} {input.bam} | "
            "bcftools call -mv - > {output}"

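For intuition, the `expand(...)` call in the rule above produces one path per entry in `samples`. The plain-Python list comprehension below is a rough equivalent for this single-wildcard case. It is only an illustration, not how Snakemake implements `expand`.

```python
samples = ["A", "B", "C"]

# Roughly what expand("mapped/{sample}.sorted.bam", sample=samples) evaluates to.
bam_files = ["mapped/{sample}.sorted.bam".format(sample=s) for s in samples]
print(bam_files)

# In the shell command, {input.bam} is rendered as a space-separated list of these paths.
print(" ".join(bam_files))
```
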
### Step 6

The rule should look like this:

    rule stats:
        input:
            "calls/all.vcf"
        output:
            "plots/quals.svg"
        conda:
            "envs/stats.yaml"
        script:
            "scripts/plot-quals.py"

### Step 7

The rule should look like this:

    rule all:
        input:
            "calls/all.vcf",
            "plots/quals.svg"

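It has to appear as the first rule in the `Snakefile`, because Snakemake uses the first rule as the default target when no target is given on the command line.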

### Step 8

The complete workflow should look like this:

    %%writefile Snakefile
    samples = ["A", "B", "C"]


    rule all:
        input:
            "calls/all.vcf",
            "plots/quals.svg"


    rule bwa:
        input:
            "data/genome.fa",
            "data/samples/{sample}.fastq"
        output:
            temp("mapped/{sample}.bam")
        conda:
            "envs/mapping.yaml"
        threads: 8
        shell:
            "bwa mem -t {threads} {input} | samtools view -Sb - > {output}"


    rule sort:
        input:
            "mapped/{sample}.bam"
        output:
            "mapped/{sample}.sorted.bam"
        conda:
            "envs/mapping.yaml"
        shell:
            "samtools sort -o {output} {input}"


    rule call:
        input:
            fa="data/genome.fa",
            bam=expand("mapped/{sample}.sorted.bam", sample=samples)
        output:
            "calls/all.vcf"
        conda:
            "envs/calling.yaml"
        shell:
            "samtools mpileup -g -f {input.fa} {input.bam} | "
            "bcftools call -mv - > {output}"


    rule stats:
        input:
            "calls/all.vcf"
        output:
            report("plots/quals.svg", caption="report/calling.rst")
        conda:
            "envs/stats.yaml"
        script:
            "scripts/plot-quals.py"