# Project 2: M. tuberculosis Genome Assembly
## 04 - Genome Annotation with Prokka

* **Author:** Youssef Mimoune
* **Date:** 26-Oct-2025
* **Sample IDs:** `DRR749571` (Control), `DRR749572` (Resistant)

### Objective
This notebook takes our high-quality SPAdes assemblies (`contigs.fasta`) and annotates them with `Prokka`, which locates genomic features (CDS, rRNA, tRNA) and assigns a predicted function to each one.

This is the final step before we can compare the control and resistant genomes to find resistance-related mutations.

### Tool
* `Prokka` (v1.14.6): Rapid Prokaryotic Genome Annotation.
* **Key Parameters:**
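    * `--gcode 11`: Bacteria, including *Mycobacterium*, use genetic code (translation table) 11 rather than the standard table 1. Prokka already selects this table when `--kingdom Bacteria` is set, so the flag is stated explicitly here to make the choice visible.
    * `--kingdom Bacteria`: Selects Prokka's bacterial annotation mode (its default).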
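
Both samples are annotated with the same flags, so the command can be assembled once and reused. The sketch below is one possible way to do that in Python. The helper name `build_prokka_command` and the `SAMPLES` mapping are hypothetical; the flags and paths are the ones used in the cells that follow.

```python
from pathlib import Path

# Hypothetical mapping of sample ID to the label used in the output prefix.
SAMPLES = {"DRR749571": "control", "DRR749572": "resistant"}


def build_prokka_command(sample_id, label,
                         assembly_root="../analysis/03_spades_assembly",
                         out_root="../analysis/05_prokka_annotation",
                         cpus=4):
    """Return the Prokka argument list for one sample (illustrative helper)."""
    contigs = Path(assembly_root) / f"{sample_id}_assembly" / "contigs.fasta"
    outdir = Path(out_root) / f"{sample_id}_annotation"
    return [
        "prokka",
        "--outdir", outdir.as_posix(),
        "--prefix", f"{sample_id}_{label}",
        "--kingdom", "Bacteria",
        "--gcode", "11",
        "--genus", "Mycobacterium",
        "--species", "tuberculosis",
        "--cpus", str(cpus),
        contigs.as_posix(),  # the input FASTA is the final positional argument
    ]


# Print the commands so they can be compared with the shell cells.
for sample_id, label in SAMPLES.items():
    print(" ".join(build_prokka_command(sample_id, label)))
```

The returned list could be passed to `subprocess.run(cmd, check=True)` instead of using the `!prokka` shell magic, which makes a failed run raise an exception rather than pass silently.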

In [None]:
print("--- 2. Starting Prokka Annotation for DRR749571 (Control) ---")
# Input is the SPAdes assembly under '03_spades...'; an earlier attempt pointed at '03_prokka...' by mistake.

!prokka \
  --outdir ../analysis/05_prokka_annotation/DRR749571_annotation \
  --prefix DRR749571_control \
  --kingdom Bacteria \
  --gcode 11 \
  --genus Mycobacterium \
  --species tuberculosis \
  --cpus 4 \
  ../analysis/03_spades_assembly/DRR749571_assembly/contigs.fasta

print("--- Prokka Annotation for DRR749571 COMPLETE ---")
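
Each Prokka call depends on the SPAdes output path being correct, so a small pre-flight check can catch a wrong path before a run starts. The following is a minimal sketch; the function name `check_contigs` is hypothetical, and it only confirms that the file exists and contains at least one FASTA header.

```python
from pathlib import Path


def check_contigs(path):
    """Return the number of FASTA records in `path`, raising if the file is unusable."""
    contigs = Path(path)
    if not contigs.is_file():
        raise FileNotFoundError(f"Assembly not found: {contigs}")
    with contigs.open() as handle:
        # Every FASTA record starts with a '>' header line.
        n_records = sum(1 for line in handle if line.startswith(">"))
    if n_records == 0:
        raise ValueError(f"No FASTA records in {contigs}")
    return n_records


# Example call (run before the Prokka cell):
# check_contigs("../analysis/03_spades_assembly/DRR749572_assembly/contigs.fasta")
```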

In [None]:
print("--- 3. Starting Prokka Annotation for DRR749572 (Resistant) ---")

!prokka \
  --outdir ../analysis/05_prokka_annotation/DRR749572_annotation \
  --prefix DRR749572_resistant \
  --kingdom Bacteria \
  --gcode 11 \
  --genus Mycobacterium \
  --species tuberculosis \
  --cpus 4 \
  ../analysis/03_spades_assembly/DRR749572_assembly/contigs.fasta

print("--- Prokka Annotation for DRR749572 COMPLETE ---")

In [None]:
print("\n--- 4. Verifying Prokka Output ---")
!ls -lhR ../analysis/05_prokka_annotation/
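
Reading a recursive `ls` listing by eye works, but the same check can be made explicit. The sketch below assumes Prokka writes one file per extension named `<prefix>.<ext>` into the output directory; the extension list is a subset of the outputs described in Prokka's README and may need adjusting for other versions. The name `missing_outputs` is hypothetical.

```python
from pathlib import Path

# Assumed subset of Prokka output extensions; adjust if your version differs.
EXPECTED_EXTENSIONS = ("gff", "gbk", "fna", "faa", "ffn", "tsv", "txt")


def missing_outputs(outdir, prefix, extensions=EXPECTED_EXTENSIONS):
    """Return the expected output file names that are absent or empty."""
    outdir = Path(outdir)
    missing = []
    for ext in extensions:
        candidate = outdir / f"{prefix}.{ext}"
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(candidate.name)
    return missing


# Example call:
# missing_outputs("../analysis/05_prokka_annotation/DRR749571_annotation",
#                 "DRR749571_control")
```

An empty list means every expected file is present and non-empty.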

In [None]:
print("--- 5. Reading Prokka Summary Reports ---")

print("\n--- Summary for DRR749571 (Control) ---")
!cat ../analysis/05_prokka_annotation/DRR749571_annotation/DRR749571_control.txt

print("\n--- Summary for DRR749572 (Resistant) ---")
!cat ../analysis/05_prokka_annotation/DRR749572_annotation/DRR749572_resistant.txt
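
To compare the two reports side by side rather than reading them separately, the summaries can be parsed into dictionaries. This sketch assumes each line of the Prokka `.txt` file has the form `key: value` (for example a feature type followed by its count); the function names `parse_prokka_summary` and `differing_fields` are hypothetical.

```python
def parse_prokka_summary(text):
    """Parse 'key: value' lines into a dict, converting integer values to int."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a 'key: value' shape
        value = value.strip()
        stats[key.strip()] = int(value) if value.isdigit() else value
    return stats


def differing_fields(first, second):
    """Return {key: (first_value, second_value)} for fields that differ."""
    keys = set(first) | set(second)
    return {k: (first.get(k), second.get(k))
            for k in keys if first.get(k) != second.get(k)}


# Example usage with the two reports printed above:
# base = "../analysis/05_prokka_annotation"
# with open(f"{base}/DRR749571_annotation/DRR749571_control.txt") as fh:
#     control = parse_prokka_summary(fh.read())
# with open(f"{base}/DRR749572_annotation/DRR749572_resistant.txt") as fh:
#     resistant = parse_prokka_summary(fh.read())
# print(differing_fields(control, resistant))
```

Differences in feature counts between the samples are only a coarse signal; they can reflect assembly fragmentation as much as biology, so they are a starting point for the genome comparison rather than a result.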