# QIIME2 Getting Started Notebook

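This notebook walks through the first steps of a QIIME2 analysis in Jupyter: importing and demultiplexing the Moving Pictures tutorial reads, denoising them with DADA2, and summarizing the resulting feature table.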

In [None]:
# Import necessary libraries
import qiime2
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from qiime2 import Artifact, Metadata, Visualization

# Set up plotting
sns.set_style("whitegrid")
%matplotlib inline

## Check QIIME2 Installation

In [None]:
# Check QIIME2 version
!qiime info
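
If the `qiime` command is not visible to the notebook kernel, the shell call above fails with a terse error. As a rough sketch (the helper name `check_qiime2` is made up for illustration), the same check can be done from Python with only the standard library:

```python
import importlib.util
import shutil


def check_qiime2():
    """Report whether the qiime2 package and the qiime CLI are visible.

    Hypothetical helper for illustration; it only looks things up and
    never imports or runs QIIME2 itself.
    """
    return {
        # True if `import qiime2` would find a package in this kernel
        "python_package": importlib.util.find_spec("qiime2") is not None,
        # True if a `qiime` executable is on PATH for `!` shell commands
        "cli_on_path": shutil.which("qiime") is not None,
    }


status = check_qiime2()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If either entry is reported missing, the kernel is probably not running inside the QIIME2 environment.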

## Download Sample Data

Let's download the Moving Pictures tutorial data:

In [None]:
!mkdir -p sample-data
!cd sample-data && wget -q "https://data.qiime2.org/2025.7/tutorials/moving-pictures/sample-metadata.tsv"
!cd sample-data && wget -q "https://data.qiime2.org/2025.7/tutorials/moving-pictures/emp-single-end-sequences/sequences.fastq.gz"
!cd sample-data && wget -q "https://data.qiime2.org/2025.7/tutorials/moving-pictures/emp-single-end-sequences/barcodes.fastq.gz"
print("Sample data downloaded!")
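
Where `wget` is not installed, the same files can be fetched with the Python standard library. This is a sketch only: `BASE_URL`, `FILES`, `plan_downloads`, and `download_all` are illustrative names, and the URLs are the ones used in the cell above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Same release and paths as the wget commands above
BASE_URL = "https://data.qiime2.org/2025.7/tutorials/moving-pictures"
FILES = [
    "sample-metadata.tsv",
    "emp-single-end-sequences/sequences.fastq.gz",
    "emp-single-end-sequences/barcodes.fastq.gz",
]


def plan_downloads(dest="sample-data"):
    """Return (url, local_path) pairs; files land flat in `dest`, like wget."""
    dest = Path(dest)
    return [(f"{BASE_URL}/{rel}", dest / Path(rel).name) for rel in FILES]


def download_all(dest="sample-data"):
    """Fetch every file, skipping ones that already exist locally."""
    Path(dest).mkdir(parents=True, exist_ok=True)
    for url, path in plan_downloads(dest):
        if not path.exists():
            urlretrieve(url, str(path))
        print(f"{path}: {path.stat().st_size} bytes")
```

Calling `download_all()` performs the downloads, while `plan_downloads()` only lists what would be fetched. Printing the file sizes gives a quick check that nothing came back empty.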

## Load and Explore Metadata

In [None]:
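# Load metadata with QIIME2 so the `#q2:types` directive row is parsed
# instead of being read as a sample (pd.read_csv would keep it as a data row)
metadata = Metadata.load('sample-data/sample-metadata.tsv')
metadata_df = metadata.to_dataframe()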
print(f"Metadata shape: {metadata_df.shape}")
print(f"\nColumns: {list(metadata_df.columns)}")
metadata_df.head()
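
QIIME2 metadata files can carry a `#q2:types` directive row directly under the header, declaring each column as `categorical` or `numeric`. A plain TSV reader treats that row as a sample. The sketch below uses a small inline table (made-up values, not the tutorial data) to show one way of separating the directive row from the sample rows with the standard library.

```python
import csv
import io

# Tiny made-up metadata table in the QIIME2 TSV layout
tsv_text = (
    "sample-id\tbarcode-sequence\tbody-site\tdays-since-experiment-start\n"
    "#q2:types\tcategorical\tcategorical\tnumeric\n"
    "S1\tAGCTGACTAGTC\tgut\t0\n"
    "S2\tACGATGCGACCA\ttongue\t84\n"
)

rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))

# Split the directive row from the real samples
column_types = {}
samples = []
for row in rows:
    if row["sample-id"] == "#q2:types":
        column_types = {k: v for k, v in row.items() if k != "sample-id"}
    else:
        samples.append(row)

print(f"{len(samples)} samples, column types: {column_types}")
```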

## Import Data into QIIME2

In [None]:
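# Import multiplexed sequences.
# The input directory may contain only sequences.fastq.gz and barcodes.fastq.gz,
# so stage the two files in their own subdirectory first.
!mkdir -p sample-data/emp-single-end-sequences
!cp sample-data/sequences.fastq.gz sample-data/barcodes.fastq.gz sample-data/emp-single-end-sequences/
!qiime tools import \
  --type EMPSingleEndSequences \
  --input-path sample-data/emp-single-end-sequences \
  --output-path sample-data/emp-single-end-sequences.qza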

print("Data imported successfully!")

## Demultiplex Sequences

In [None]:
# Demultiplex the sequences
!qiime demux emp-single \
  --i-seqs sample-data/emp-single-end-sequences.qza \
  --m-barcodes-file sample-data/sample-metadata.tsv \
  --m-barcodes-column barcode-sequence \
  --o-per-sample-sequences sample-data/demux.qza \
  --o-error-correction-details sample-data/demux-details.qza

print("Demultiplexing complete!")
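
Conceptually, demultiplexing takes each barcode read, looks up which sample that barcode belongs to, and files the matching sequence read under that sample. The toy sketch below shows only that lookup, with exact matching on made-up reads. The real `demux emp-single` command additionally handles barcode error correction and works on FASTQ files, none of which is modeled here.

```python
from collections import defaultdict

# Made-up barcode-to-sample mapping (the role played by the
# barcode-sequence column of the metadata file)
barcode_to_sample = {"AGCTGACTAGTC": "S1", "ACGATGCGACCA": "S2"}

# Made-up (barcode read, sequence read) pairs in matching order
reads = [
    ("AGCTGACTAGTC", "TACGTAGGGTGC"),
    ("ACGATGCGACCA", "TACGGAGGATCC"),
    ("AGCTGACTAGTC", "TACGTAGGTGGC"),
    ("TTTTTTTTTTTT", "TACGAAGGGTGC"),  # barcode not in the mapping
]

per_sample = defaultdict(list)
unassigned = 0
for barcode, sequence in reads:
    sample = barcode_to_sample.get(barcode)
    if sample is None:
        unassigned += 1  # exact-match only: unknown barcodes are dropped
    else:
        per_sample[sample].append(sequence)

for sample, seqs in sorted(per_sample.items()):
    print(f"{sample}: {len(seqs)} reads")
print(f"unassigned: {unassigned}")
```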

## Create and View Summary Visualization

In [None]:
# Create summary visualization
!qiime demux summarize \
  --i-data sample-data/demux.qza \
  --o-visualization sample-data/demux.qzv

# Load and display the visualization
viz = Visualization.load('sample-data/demux.qzv')
viz

## Working with QIIME2 Artifacts in Python

In [None]:
# Load an artifact
demux_artifact = Artifact.load('sample-data/demux.qza')

# View artifact information
print(f"Type: {demux_artifact.type}")
print(f"Format: {demux_artifact.format}")
print(f"UUID: {demux_artifact.uuid}")
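
The three properties printed above are stored inside the artifact itself: a `.qza` file is a zip archive whose top-level directory is named after the UUID and contains a `metadata.yaml` with `uuid`, `type`, and `format` keys. The sketch below builds a stand-in archive in memory (the UUID and values are made up) and reads those keys back with the standard library. The parser is a deliberately naive `key: value` splitter, not a YAML implementation, and `read_archive_metadata` is a hypothetical helper name.

```python
import io
import zipfile

# Build a stand-in archive in memory; the UUID below is made up
fake_uuid = "00000000-0000-4000-8000-000000000000"
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr(
        f"{fake_uuid}/metadata.yaml",
        f"uuid: {fake_uuid}\n"
        "type: SampleData[SequencesWithQuality]\n"
        "format: SingleLanePerSampleSingleEndFastqDirFmt\n",
    )


def read_archive_metadata(fileobj):
    """Return the key/value pairs from <uuid>/metadata.yaml in an archive."""
    with zipfile.ZipFile(fileobj) as zf:
        # Pick the metadata.yaml that sits directly under the root directory
        name = next(
            n for n in zf.namelist()
            if n.count("/") == 1 and n.endswith("/metadata.yaml")
        )
        text = zf.read(name).decode("utf-8")
    pairs = (line.split(":", 1) for line in text.splitlines() if ":" in line)
    return {key.strip(): value.strip() for key, value in pairs}


info = read_archive_metadata(buffer)
print(info)
```

Since `zipfile.ZipFile` also accepts a path, the same function should work on a real file such as `sample-data/demux.qza`.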

## Quality Control and Denoising with DADA2

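The quality plots in the demux.qzv visualization guide the trimming parameters. Here we keep the full start of each read (`--p-trim-left 0`) and truncate every read at 120 bases (`--p-trunc-len 120`); reads shorter than 120 bases are discarded.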

In [None]:
# This step takes a few minutes
!qiime dada2 denoise-single \
  --i-demultiplexed-seqs sample-data/demux.qza \
  --p-trim-left 0 \
  --p-trunc-len 120 \
  --o-representative-sequences sample-data/rep-seqs-dada2.qza \
  --o-table sample-data/table-dada2.qza \
  --o-denoising-stats sample-data/stats-dada2.qza

print("DADA2 denoising complete!")
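
The two trimming parameters can be pictured as simple string slicing. The sketch below follows the behavior documented for DADA2's filtering: reads shorter than the truncation length are discarded, longer reads are cut to that length, and then the first `trim_left` bases are removed. It is only an illustration on made-up reads. The function name is hypothetical, and quality filtering, error modeling, and chimera removal are not shown.

```python
def trim_read(seq, trim_left=0, trunc_len=120):
    """Apply trunc_len then trim_left to one read; None means discarded."""
    if trunc_len > 0:
        if len(seq) < trunc_len:
            return None  # too short to truncate: read is dropped
        seq = seq[:trunc_len]
    # A trunc_len of 0 disables truncation and length filtering
    return seq[trim_left:]


reads = ["A" * 150, "C" * 120, "G" * 90]  # made-up reads of varying length
kept = [t for t in (trim_read(r) for r in reads) if t is not None]
print(f"kept {len(kept)} of {len(reads)} reads, lengths {[len(k) for k in kept]}")
```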

## Visualize Denoising Stats

In [None]:
# Create denoising stats visualization
!qiime metadata tabulate \
  --m-input-file sample-data/stats-dada2.qza \
  --o-visualization sample-data/stats-dada2.qzv

# View the stats
stats_viz = Visualization.load('sample-data/stats-dada2.qzv')
stats_viz
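
The denoising statistics report, per sample, how many reads entered the pipeline and how many survived each stage. A quick way to read them is as retention percentages. The sketch below uses made-up counts, and the stage names `input`, `filtered`, `denoised`, and `non-chimeric` are assumed to match the column names in the table; check them against the visualization.

```python
# Made-up per-sample read counts at each stage
stats = {
    "S1": {"input": 10000, "filtered": 9000, "denoised": 8800, "non-chimeric": 8500},
    "S2": {"input": 4000, "filtered": 2000, "denoised": 1900, "non-chimeric": 1800},
}


def retention(counts):
    """Percentage of input reads remaining after each stage."""
    total = counts["input"]
    return {
        stage: round(100 * n / total, 1)
        for stage, n in counts.items()
        if stage != "input"
    }


for sample, counts in stats.items():
    print(sample, retention(counts))
```

A sample that loses most of its reads at the filtering stage, like the made-up `S2` here, is often a hint that the truncation length is too long for the read quality.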

## Feature Table Summary

In [None]:
# Summarize feature table
!qiime feature-table summarize \
  --i-table sample-data/table-dada2.qza \
  --o-visualization sample-data/table-dada2.qzv \
  --m-sample-metadata-file sample-data/sample-metadata.tsv

# Tabulate representative sequences (one row per feature, with its sequence)
!qiime feature-table tabulate-seqs \
  --i-data sample-data/rep-seqs-dada2.qza \
  --o-visualization sample-data/rep-seqs-dada2.qzv

print("Feature table summaries created!")
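
A feature table is a matrix of counts with one row per feature (here, a denoised sequence variant) and one column per sample. Among other things, the summary reports the total count per sample and how many samples each feature appears in. A sketch of both calculations on a made-up table:

```python
# Made-up counts: feature -> {sample: frequency}
table = {
    "feature-a": {"S1": 120, "S2": 0, "S3": 30},
    "feature-b": {"S1": 5, "S2": 40, "S3": 0},
    "feature-c": {"S1": 0, "S2": 0, "S3": 75},
}

# Total frequency per sample (column sums)
sample_totals = {}
for counts in table.values():
    for sample, n in counts.items():
        sample_totals[sample] = sample_totals.get(sample, 0) + n

# Number of samples in which each feature was observed
samples_observed = {
    feature: sum(1 for n in counts.values() if n > 0)
    for feature, counts in table.items()
}

print("per-sample totals:", sample_totals)
print("samples observed in:", samples_observed)
```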

## Next Steps

From here, you can continue with:

1. **Taxonomic Classification**: Classify your sequences against a reference database
2. **Phylogenetic Analysis**: Build a phylogenetic tree
3. **Diversity Analysis**: Calculate alpha and beta diversity metrics
4. **Differential Abundance**: Test for differentially abundant features
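
To make the diversity step concrete, here is a sketch of two common metrics computed by hand on made-up counts: Shannon entropy (an alpha diversity metric, within one sample) and Bray-Curtis dissimilarity (a beta diversity metric, between two samples). The function names are illustrative and the base-2 logarithm is a choice made here. In practice these would be computed by the QIIME2 diversity tooling rather than by hand.

```python
import math


def shannon(counts):
    """Shannon entropy in bits of one sample's feature counts."""
    total = sum(counts)
    probs = [n / total for n in counts if n > 0]
    return sum(p * math.log2(1 / p) for p in probs)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same features."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


# Made-up counts for the same four features in two samples
sample_1 = [25, 25, 25, 25]  # perfectly even community
sample_2 = [100, 0, 0, 0]    # dominated by a single feature

print(f"Shannon S1: {shannon(sample_1):.2f}")
print(f"Shannon S2: {shannon(sample_2):.2f}")
print(f"Bray-Curtis: {bray_curtis(sample_1, sample_2):.2f}")
```

The even sample has the higher entropy, and the two samples are far apart because they share only one feature's worth of counts.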

Check the QIIME2 tutorials for detailed workflows: https://docs.qiime2.org/2025.7/tutorials/