# Project 2: M. tuberculosis Genome Assembly
## 03 - Assembly Quality Control with QUAST

* **Author:** Youssef Mimoune
* **Date:** 26-Oct-2025
* **Sample IDs:** `DRR749571` (Control), `DRR749572` (Resistant)

### Objective
This notebook evaluates the quality of our two *de novo* assemblies from SPAdes. We will use `QUAST` to generate a comparative report.

### Tool
* `QUAST` (v5.3.0): Quality Assessment Tool for Genome Assemblies.

### Key Metrics to Watch
* **Total Length:** Should be ~4.4 Mbp, the approximate genome size of *M. tuberculosis* (the H37Rv reference is about 4.41 Mbp).
* **# of Contigs:** The number of pieces in our assembly. (Fewer is better.)
* **N50:** The length of the shortest contig in the smallest set of longest contigs that together cover at least 50% of the assembly. (Larger is better.)
* **L50:** The smallest number of contigs needed to cover at least 50% of the assembly. (Smaller is better.)
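
To make the N50 and L50 definitions concrete, here is a minimal sketch that computes both from a list of contig lengths. The function name `n50_l50` and the example lengths are illustrative assumptions, not QUAST's own code or values from our assemblies.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths (hypothetical helper)."""
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half = sum(ordered) / 2                  # 50% of the total assembly length
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'length' is the shortest contig in the covering set (N50),
            # 'count' is how many contigs that set contains (L50).
            return length, count
    raise ValueError("no contig lengths given")


# Made-up lengths: total = 300, so the 50% threshold is 150.
# 100 + 80 = 180 >= 150, hence N50 = 80 and L50 = 2.
print(n50_l50([100, 80, 60, 40, 20]))  # (80, 2)
```

Sorting from longest to shortest is what makes a larger N50 and a smaller L50 both indicate a more contiguous assembly.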

In [None]:
print("--- 1. Creating directory for QUAST reports ---")
!mkdir -p ../analysis/04_quast_qc

print("Directory created.")
!ls -l ../analysis/

In [None]:
print("--- 2. Running QUAST (Simplified Command, no labels) ---")

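# No -l (labels) argument is passed, so QUAST chooses the labels itself from the input files.
# Both inputs are named contigs.fasta, so confirm in the report which column is which sample.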
!quast.py \
  -o ../analysis/04_quast_qc \
  -t 4 \
  ../analysis/03_spades_assembly/DRR749571_assembly/contigs.fasta \
  ../analysis/03_spades_assembly/DRR749572_assembly/contigs.fasta

print("--- QUAST analysis complete ---")
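
Before launching QUAST it can help to confirm that both SPAdes outputs exist. The sketch below rebuilds the same command from Python and checks the inputs first; the helpers `build_quast_command` and `missing_inputs` are hypothetical names introduced here, and only the `-o` and `-t` options already used above are assumed.

```python
from pathlib import Path


def build_quast_command(out_dir, assemblies, threads=4):
    """Return the quast.py argument list: -o, -t, then the input FASTA files."""
    return ["quast.py", "-o", str(out_dir), "-t", str(threads), *map(str, assemblies)]


def missing_inputs(assemblies):
    """Return the input paths that do not exist as regular files."""
    return [str(p) for p in assemblies if not Path(p).is_file()]


assemblies = [
    "../analysis/03_spades_assembly/DRR749571_assembly/contigs.fasta",
    "../analysis/03_spades_assembly/DRR749572_assembly/contigs.fasta",
]

missing = missing_inputs(assemblies)
if missing:
    print("Missing assemblies:", missing)
else:
    # Print the command only; subprocess.run(cmd, check=True) could execute it.
    print(" ".join(build_quast_command("../analysis/04_quast_qc", assemblies)))
```

Building the command as a list keeps each path a single argument, which avoids quoting problems if a directory name ever contains spaces.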

In [None]:
print("\n--- 3. Verifying QUAST Output ---")
!ls -lh ../analysis/04_quast_qc
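
Beyond listing the output directory, the key metrics can be pulled out of QUAST's tab-separated summary (`report.tsv`) for a side-by-side check. This is a rough sketch: `parse_quast_report` is a hypothetical helper, and the sample text below uses made-up labels and numbers purely to show the layout, not results from this run.

```python
import csv
import io


def parse_quast_report(tsv_text):
    """Parse report.tsv-style text into {assembly_label: {metric: value}}."""
    rows = [row for row in csv.reader(io.StringIO(tsv_text), delimiter="\t") if row]
    labels = rows[0][1:]  # first row: "Assembly", then one label per assembly
    report = {label: {} for label in labels}
    for metric, *values in rows[1:]:
        for label, value in zip(labels, values):
            report[label][metric] = value  # values are kept as strings
    return report


# Illustrative input only (invented labels and values).
sample = (
    "Assembly\tsample_A\tsample_B\n"
    "# contigs\t120\t135\n"
    "Total length\t4350000\t4360000\n"
    "N50\t95000\t88000\n"
    "L50\t15\t17\n"
)

report = parse_quast_report(sample)
for label, metrics in report.items():
    print(label, metrics["Total length"], metrics["N50"], metrics["L50"])

# For the real run, the text could be read with:
# Path("../analysis/04_quast_qc/report.tsv").read_text()
```

Note that the dictionary is keyed by label, so this sketch assumes the two assemblies end up with distinct labels in the report.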