**Lesson 11: CRISPR Basics & Cas9 Mechanism**

Understand how CRISPR-Cas9 works and its role in genome editing.

---

##### 📋 Prerequisites

Before starting this lesson, you should have:
- Completed **Lesson 5** (FASTA Files & BioPython)
- BioPython installed (`pip install biopython`)
- Basic understanding of DNA sequences

---


##### 🧬 What is CRISPR?

**CRISPR** = Clustered Regularly Interspaced Short Palindromic Repeats

##### The Revolutionary Discovery
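- First discovered in bacteria and archaea, where it acts as an adaptive immune system
- Bacteria store short pieces of viral DNA in their CRISPR arrays, then use them to recognize and destroy the same virus on reinfection
- Scientists adapted the system to target and edit chosen DNA sequences in many organisms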

##### Why CRISPR Matters
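1. **Precision**: Targets a specific DNA sequence chosen by the guide RNA
2. **Simplicity**: Retargeting needs only a new guide RNA, whereas older tools (zinc-finger nucleases, TALENs) needed a newly engineered protein for each target
3. **Versatility**: Works in many organisms
4. **Cost**: Much cheaper than older techniques, because synthesizing a short RNA costs far less than engineering a protein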



##### 🔬 How Does Cas9 Work?

The CRISPR-Cas9 system has two main components:

##### 1. **Cas9 Protein** (the molecular scissors)
- Cuts DNA at a specific location
- Like programmable scissors

##### 2. **Guide RNA (gRNA)** (the GPS)
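- Carries a ~20-nucleotide spacer (the full guide RNA is longer, since it also includes a scaffold that binds Cas9)
- Tells Cas9 WHERE to cut
- The spacer base-pairs with the target DNA sequence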
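
To make the link between a target and its guide concrete, here is a minimal sketch that derives the ~20-nt spacer from a target DNA sequence. It assumes the target is written 5'→3' on the strand that carries the PAM, in which case the spacer has the same sequence with U in place of T. The helper name `spacer_from_target` is hypothetical, and the sequence is the example target used in this lesson.

```python
# Hypothetical helper (not part of BioPython): derive the guide RNA spacer
# from a target DNA sequence written 5'->3' on the PAM-carrying strand.
def spacer_from_target(target_dna):
    target_dna = target_dna.upper()
    # Reject anything that is not a plain DNA base
    if set(target_dna) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    # Same sequence as the target, with U replacing T (RNA alphabet)
    return target_dna.replace("T", "U")

spacer = spacer_from_target("GCACTGCCTAGTACGATCGA")
print(spacer)       # GCACUGCCUAGUACGAUCGA
print(len(spacer))  # 20
```

Only the spacer is modeled here; the Cas9-binding scaffold of the guide RNA is left out.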

##### The Process:
```
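1. Guide RNA binds to Cas9 protein
2. The Cas9-gRNA complex scans the DNA for PAM sites
3. Next to each PAM, the gRNA checks for a matching sequence
4. When a match is found, Cas9 cuts both DNA strands, about 3 bp upstream of the PAM
5. Cell repairs the cut (this is where editing happens)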
```
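
The process above can be sketched as a toy search. This is a deliberately simplified model, not a real guide-design tool: it looks at the forward strand only, requires an exact match, and assumes the SpCas9 convention of an NGG PAM with the cut about 3 bp upstream of it. The function name `find_cut_site` and the flanking bases in the example are made up for illustration.

```python
def find_cut_site(dna, guide_dna):
    """Return the index where the cut would fall, or None if no site exists.

    Toy model: forward strand only, exact match, NGG PAM.
    """
    n = len(guide_dna)
    # Each candidate needs n bases of target plus 3 bases of PAM
    for start in range(len(dna) - n - 2):
        protospacer = dna[start:start + n]
        pam = dna[start + n:start + n + 3]
        # Check the PAM first, then compare the target to the guide
        if pam[1:] == "GG" and protospacer == guide_dna:
            # Cut falls 3 bp upstream of the PAM
            return start + n - 3
    return None

guide = "GCACTGCCTAGTACGATCGA"
dna = "TTACG" + guide + "AGG" + "CATTA"  # flanking bases are arbitrary

cut = find_cut_site(dna, guide)
print(cut)                  # 22
print(dna[:cut], dna[cut:])  # the two fragments left by the cut
```

Real Cas9 tolerates some mismatches and can bind either strand, so an exact forward-strand match is the simplest possible case.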



##### 🎯 The Three Key Requirements

For CRISPR to work, you need:

##### 1. Target Sequence (20bp)
```
5'-GCACTGCCTAGTACGATCGA-3'
```

##### 2. PAM Sequence (Protospacer Adjacent Motif)
```
For SpCas9: NGG (where N = any nucleotide)
Must sit immediately 3' of the target, on the same strand
```

##### 3. Complete CRISPR Target Site
```
5'-[20bp target]-[PAM]-3'
5'-GCACTGCCTAGTACGATCGA-AGG-3'
    ^^^^^^^^^^^^^^^^^^^^  ^^^
    guide RNA matches     PAM required
```
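
The layout above maps directly onto string slicing. As a small sketch, assuming a site is always written 5'→3' as 20 nt of target followed by a 3-nt PAM (the helper name `split_target_site` is hypothetical):

```python
def split_target_site(site):
    """Split a 23-nt site (5'->3') into its 20-nt target and 3-nt PAM."""
    if len(site) != 23:
        raise ValueError("expected 20 nt of target plus a 3-nt PAM")
    return site[:20], site[20:]

target, pam = split_target_site("GCACTGCCTAGTACGATCGAAGG")
print(target)              # GCACTGCCTAGTACGATCGA
print(pam)                 # AGG
print(pam.endswith("GG"))  # True
```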



In [None]:
# Let's represent a CRISPR target site
target_sequence = "GCACTGCCTAGTACGATCGA"  # 20bp guide target
pam_sequence = "AGG"  # PAM (must be NGG)
complete_site = target_sequence + pam_sequence

print("Target Sequence:", target_sequence)
print("Length:", len(target_sequence), "bp")
print("PAM Sequence:", pam_sequence)
print("Complete CRISPR Site:", complete_site)
print("Total Length:", len(complete_site), "bp")

##### 🧪 Real-World Applications

##### Medical Applications
1. **Sickle Cell Disease**: Edit a patient's blood stem cells to correct or compensate for the mutated hemoglobin gene
2. **Cancer**: Edit immune cells to fight tumors
3. **Blindness**: Correct genetic vision problems

##### Agricultural Applications
1. **Disease-resistant crops**
2. **Higher crop yields**
3. **Drought tolerance**

##### Research Applications
1. **Study gene function**
2. **Create disease models**
3. **Screen drug targets**



In [None]:
# Function to check if a PAM sequence is valid for Cas9
def is_valid_pam(pam):
    """
    Check if a PAM sequence is valid for SpCas9 (NGG format)
    N = any nucleotide, GG = required
    """
    if len(pam) != 3:
        return False
    
    # Check if last two bases are GG
    if pam[1:3] == "GG":
        # Check if first base is A, T, G, or C
        if pam[0] in "ATGC":
            return True
    return False

# Test the function
test_pams = ["AGG", "TGG", "CGG", "GGG", "AAA", "GGA", "GGT"]

for pam in test_pams:
    result = "✓ Valid" if is_valid_pam(pam) else "✗ Invalid"
    print(f"{pam}: {result}")

##### 💡 Exercise: Identify CRISPR Components

Given this DNA sequence, identify potential CRISPR target sites:

```
5'-ATGCTAGCTGATCGATCGATAGGCTAGCTGATCGATCGATAGGTACGATCGATCGA-3'
```

Look for:
1. PAM sequences (NGG)
2. The 20 bp immediately upstream (5') of each PAM (the potential guide sequence)


In [None]:
# Your turn: Write code to find PAM sequences
dna = "ATGCTAGCTGATCGATCGATAGGCTAGCTGATCGATCGATAGGTACGATCGATCGA"

# Hint: Loop through the sequence looking for "GG" patterns
# Check if the base before "GG" is any nucleotide (making it NGG)

# Your code here:


---

##### 📚 References & Further Reading

**Foundational Papers:**
1. Jinek et al. (2012). "A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity." *Science* 337(6096), 816-821.
2. Cong et al. (2013). "Multiplex genome engineering using CRISPR/Cas systems." *Science* 339(6121), 819-823.
3. Mali et al. (2013). "RNA-guided human genome engineering via Cas9." *Science* 339(6121), 823-826.

**Reviews:**
- Doudna & Charpentier (2014). "The new frontier of genome engineering with CRISPR-Cas9." *Science* 346(6213).

**Online Resources:**
- [Addgene CRISPR Guide](https://www.addgene.org/crispr/guide/)
- [NHGRI: About CRISPR](https://www.genome.gov/about-genomics/policy-issues/what-is-CRISPR)


---

##### 🚀 Next Lesson

Ready to continue? Open the next lesson notebook:
**[Lesson 12: PAM Identification](lesson12_pam_identification.ipynb)**
