# AI-PSCI-001: Introduction to Google Colab & AI-Assisted Coding
## Annotated Teaching Scaffold

**AI in Pharmaceutical Sciences: Bench to Bedside**  
VCU School of Pharmacy | VIP Program | Spring 2026

---

**Week 1 | Estimated Time: 60-90 minutes**

**Prerequisites**: None - this is your starting point!

---

> **About This Notebook**
> 
> This is an **annotated scaffold** showing the complete teaching design of an AGIL talktorial. 
> It demonstrates:
> - How guided inquiries are structured
> - What verification checkpoints look like
> - Expected outputs students should achieve
> - Pedagogical notes explaining design intent
>
> **Code cells remain empty** - this is intentional. Students write code with AI assistance.
> 
> *See manuscript Section 4.2-4.4 for theoretical grounding.*

---
### Instructor Note: Technical Requirements

**Platform:** Google Colab (free tier sufficient for this notebook)

**Packages Used:**
- `math` (built-in) - logarithms for pIC50 calculation
- `datetime` (built-in) - date handling
- `numpy` (pre-installed in Colab) - numerical operations
- `matplotlib` (pre-installed in Colab) - visualization

**No additional installation required** - all packages are available in Colab by default.
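
Instructors who want to confirm the environment before class could run a quick availability check. The sketch below is one possible approach and not part of the original notebook; it uses only the standard library's `importlib.util.find_spec`, and the `required` and `missing` names are illustrative.

```python
import importlib.util

# Packages listed in the technical requirements above.
required = ["math", "datetime", "numpy", "matplotlib"]

# find_spec returns None when a top-level package cannot be located.
missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

On a default Colab runtime the list of missing packages is expected to be empty, but that depends on the runtime image Colab provides at the time.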

---

## Learning Objectives

After completing this talktorial, you will be able to:

1. Navigate the Google Colab interface confidently
2. Understand the difference between code cells and markdown cells
3. Use AI assistants (Claude, ChatGPT) effectively to write Python code
4. Run Python code and interpret outputs
5. Document your work using markdown formatting

---

## Background

### Why Google Colab?

Google Colab (Colaboratory) is a free, cloud-based platform that lets you write and execute Python code in your browser. It's particularly powerful for pharmaceutical sciences because:

- **No installation required**: Everything runs in the cloud
- **Free GPU access**: Important for compute-heavy AI tools like AlphaFold2 (free-tier GPU availability can vary)
- **Easy sharing**: Your notebooks are saved to Google Drive
- **Pre-installed libraries**: Many scientific packages are ready to use

### Why AI-Assisted Coding?

In this course, you'll use AI assistants (like Claude or ChatGPT) to help you write code. This isn't "cheating" - it's a modern skill that:

- **Accelerates learning**: Get unstuck quickly when you hit roadblocks
- **Teaches best practices**: AI can explain why code works, not just what works
- **Mirrors real research**: Professional scientists increasingly use AI tools
- **Focuses on concepts**: Spend more time understanding biology, less time debugging syntax

However, **critical evaluation is essential**. AI can make mistakes, and part of your learning is verifying that code does what you expect.

### Connection to Drug Discovery

Throughout this course, we'll use computational tools to study **dihydrofolate reductase (DHFR)**, an enzyme targeted by antibiotics like trimethoprim. Resistance to these antibiotics is a growing clinical problem, and understanding how mutations affect drug binding requires exactly the kinds of computational approaches you'll learn here.

---
### Instructor Note: Pedagogical Design

**Why this structure?** (See manuscript Section 4.2)

The background section establishes:
1. **Tool justification** - Why Colab? (reduces barrier to entry)
2. **Pedagogical framing** - AI as partner, not answer key
3. **Scientific context** - DHFR connects to real drug discovery

Students read this BEFORE attempting code, priming them for the inquiry tasks.

---

## Colab Orientation

Before we write any code, let's understand the interface:

### Cell Types
- **Code cells** (like the gray boxes below): Where you write and run Python
- **Text/Markdown cells** (like this one): For notes, explanations, documentation

### Key Actions
- **Run a cell**: Click the play button, or press `Shift + Enter`
- **Add a cell**: Click `+ Code` or `+ Text` in the toolbar
- **Move cells**: Drag using the cell handle on the left
- **Delete a cell**: Click the trash icon (or right-click → Delete)

### Runtime
- The "runtime" is your Python environment in the cloud
- Go to `Runtime → Change runtime type` to select GPU (we'll need this later)
- If disconnected, go to `Runtime → Reconnect`
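
To see that a runtime is connected and responding, a short cell like the following can be run. This is a minimal sketch rather than a required step; checking `sys.modules` for `google.colab` is a common heuristic and is treated here as an assumption, not an official Colab API.

```python
import platform
import sys

# Version of the Python interpreter the runtime is using.
python_version = platform.python_version()

# Heuristic (assumption): Colab loads the google.colab module at startup.
running_in_colab = "google.colab" in sys.modules

print(f"Python version: {python_version}")
print(f"Running in Colab: {running_in_colab}")
```

If the cell prints a version number, the runtime is active; if it hangs or errors, reconnecting the runtime is the first thing to try.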

---

## Guided Inquiry 1: Your First Code Cell

### Context
Let's start with the classic first program - but with a pharmaceutical twist!

### Your Task
In the code cell below, write Python code to:
1. Print the message: `Hello, Pharmaceutical Sciences!`
2. Print today's date
3. Print your name

**Prompting Tips**:
- Open Claude (claude.ai) or ChatGPT in another tab
- Try asking: *"How do I print text in Python?"*
- Then ask: *"How do I print today's date in Python?"*
- If you get an error, copy and paste it to your AI assistant and ask for help

### Verification
After running your code, confirm:
- [ ] Three lines of output appear below the cell
- [ ] No error messages (red text)
- [ ] The date looks correct

**Lab Notebook**: Document the prompts you used to get help from your AI assistant.

```python
# Your code here
# Students write code with AI assistance
```


---
### Expected Output (for verification)

Students should see output similar to:
```
Hello, Pharmaceutical Sciences!
Today's date: 2026-01-15
My name is [Student Name]
```

**Key concepts demonstrated:**
- `print()` function for output
- `from datetime import date` for date handling
- f-strings for formatted output (e.g., `f"Today's date: {today}"`)
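
For instructors comparing submissions, one possible shape of a passing answer is sketched below. It is an illustration only, not the official solution code; `student_name` is a placeholder, and students' AI-assisted solutions may reasonably differ (for example, by using `datetime.now()`).

```python
from datetime import date

student_name = "Student Name"  # placeholder; each student substitutes their own name
today = date.today()           # current date as a datetime.date object

print("Hello, Pharmaceutical Sciences!")
print(f"Today's date: {today}")      # a date formats as YYYY-MM-DD inside an f-string
print(f"My name is {student_name}")
```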

---

## Guided Inquiry 2: Variables and Data Types

### Context
In pharmaceutical research, we constantly work with data: drug names, molecular weights, protein sequences, binding affinities. Python uses **variables** to store this information.

### Your Task
Using your AI assistant, write code to:
1. Create a variable called `drug_name` and assign it the value `"Trimethoprim"`
2. Create a variable called `molecular_weight` and assign it the value `290.32` (the MW of trimethoprim in g/mol)
3. Create a variable called `is_antibiotic` and assign it the value `True`
4. Print all three variables with descriptive labels

**Prompting Tips**:
- Ask your AI: *"How do I create variables in Python? I need to store a drug name (text), molecular weight (decimal number), and whether something is an antibiotic (true/false)"*
- Ask: *"What are the different data types in Python?"*

### Verification
After running your code, confirm:
- [ ] Output shows the drug name, molecular weight, and antibiotic status
- [ ] The molecular weight is a number (not in quotes)
- [ ] The boolean is `True` (capital T, no quotes)

**Lab Notebook**: What are the three main data types you used? (Hint: string, float, boolean)

```python
# Your code here
# Students write code with AI assistance
```


---
### Expected Output (for verification)

```
Drug name: Trimethoprim
Molecular weight: 290.32 g/mol
Is antibiotic: True
```

**Key concepts:**
- String (`str`): Text in quotes
- Float (`float`): Decimal numbers
- Boolean (`bool`): `True` or `False`
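
A minimal sketch of code that would produce this output is shown below for instructor reference; it is one possible answer, not the official solution. The final loop, which uses `type()` to confirm each data type, is an optional extra that the task does not require.

```python
drug_name = "Trimethoprim"   # str: text in quotes
molecular_weight = 290.32    # float: decimal number, in g/mol
is_antibiotic = True         # bool: capital T, no quotes

print(f"Drug name: {drug_name}")
print(f"Molecular weight: {molecular_weight} g/mol")
print(f"Is antibiotic: {is_antibiotic}")

# Optional check: print the type name of each variable.
for value in (drug_name, molecular_weight, is_antibiotic):
    print(type(value).__name__)
```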

---

## Guided Inquiry 5: Your First Function

### Context
When you find yourself doing the same calculation repeatedly, you can create a **function** - a reusable block of code. This is essential for building research pipelines.

### Your Task
Write code to:

1. Create a function called `nm_to_pIC50` that:
   - Takes an IC50 value in nanomolar as input
   - Returns the pIC50 value
   - (Remember: pIC50 = -log10(IC50 in Molar), and 1 nM = 1e-9 M)

2. Test your function with these IC50 values and print the results:
   - 15 nM (should give ~7.82)
   - 100 nM (should give 7.0)
   - 1000 nM (should give 6.0)

**Prompting Tips**:
- Ask: *"How do I create a function in Python that takes a number as input and returns a calculated result?"*
- Ask: *"Can you explain what 'def' means in Python?"*

### Verification
- [ ] Function runs without errors
- [ ] 100 nM gives 7.0 (7.00 when rounded to two decimals)
- [ ] 1000 nM gives 6.0 (6.00 when rounded to two decimals)
- [ ] Higher IC50 (less potent) = lower pIC50

**Lab Notebook**: What are the advantages of creating a function instead of copying the same code multiple times?

```python
# Your code here
# Students write code with AI assistance
```


---
### Expected Output (for verification)

```
Testing nm_to_pIC50 function:
------------------------------
IC50: 15 nM → pIC50: 7.82
IC50: 100 nM → pIC50: 7.00
IC50: 1000 nM → pIC50: 6.00
```

**Key insight for verification:**
- 100 nM = 100 × 10⁻⁹ M = 10⁻⁷ M
- pIC50 = -log₁₀(10⁻⁷) = 7.0 exactly

This is why we use 100 nM as a verification checkpoint - the math is clean!
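
The conversion can be sketched as follows for instructor reference. This is one possible implementation under the definition given in the task, not the official solution. It uses the identity -log10(x × 10⁻⁹) = 9 - log10(x), and the input check for non-positive values is an added assumption that the task does not require.

```python
import math

def nm_to_pIC50(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        # Added safeguard (assumption): log10 is undefined for zero or negative input.
        raise ValueError("IC50 must be a positive number")
    # 1 nM = 1e-9 M, so -log10(ic50_nm * 1e-9) equals 9 - log10(ic50_nm).
    return 9 - math.log10(ic50_nm)

print("Testing nm_to_pIC50 function:")
print("-" * 30)
for ic50 in (15, 100, 1000):
    print(f"IC50: {ic50} nM → pIC50: {nm_to_pIC50(ic50):.2f}")
```

Student solutions that compute `-math.log10(ic50_nm * 1e-9)` directly are equally valid; comparing results after rounding to two decimals avoids false alarms from floating-point rounding.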

---
### Instructor Note: Verification Design (Manuscript Section 4.4)

**Why these specific test values?**

The verification values (100 nM → 7.0, 1000 nM → 6.0) are chosen because:
1. **Clean math**: Students can verify mentally without a calculator
2. **Catches common errors**: Unit mix-ups become obvious, because confusing nM with µM (a factor of 1000) shifts pIC50 by exactly 3 units
3. **Builds verification habits**: Students learn to test with known values

**The verification paradox** (Section 4.4.3):
- Novices can't always evaluate AI-generated code quality
- Solution: Provide expected outputs they CAN verify
- This builds verification skills incrementally
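
To illustrate how the known values expose a unit error, the sketch below checks a deliberately buggy conversion against the two checkpoints. Both `buggy_nm_to_pIC50` and `check_against_known_values` are hypothetical names invented for this illustration; they are not part of the course materials.

```python
import math

def buggy_nm_to_pIC50(ic50_nm):
    # Deliberate bug: treats the input as micromolar (1e-6) instead of nanomolar (1e-9).
    return -math.log10(ic50_nm * 1e-6)

def check_against_known_values(func):
    """Return (IC50 in nM, expected pIC50, observed pIC50) for each failed checkpoint."""
    known = {100: 7.0, 1000: 6.0}
    failures = []
    for ic50_nm, expected in known.items():
        observed = func(ic50_nm)
        if not math.isclose(observed, expected, abs_tol=1e-9):
            failures.append((ic50_nm, expected, round(observed, 2)))
    return failures

failures = check_against_known_values(buggy_nm_to_pIC50)
for ic50_nm, expected, observed in failures:
    print(f"{ic50_nm} nM: expected {expected}, got {observed}")
```

Both checkpoints fail by exactly 3 pIC50 units, which is the signature of a factor-of-1000 unit error that a novice can spot without reading the code itself.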

---

## Guided Inquiry 8: Creating a Simple Visualization

### Context
Visualization is crucial in research - both for understanding your data and communicating findings. Let's create a simple plot using matplotlib.

### Your Task
Write code to:

1. Import `matplotlib.pyplot` as `plt`

2. Create a bar chart showing the pIC50 values for our compounds:
   - X-axis: Compound labels (you can use "Compound 1", "Compound 2", etc.)
   - Y-axis: pIC50 values
   - Add a horizontal line at pIC50 = 7 (common activity threshold)

3. Add proper labels:
   - X-axis label: "Compound"
   - Y-axis label: "pIC50"
   - Title: "DHFR Inhibitor Potency Comparison"

4. Display the plot

**Prompting Tips**:
- Ask: *"How do I create a bar chart in Python with matplotlib?"*
- Ask: *"How do I add a horizontal line to a matplotlib plot?"*

### Verification
- [ ] A bar chart appears below the code cell
- [ ] Bars are visible for all compounds
- [ ] Horizontal threshold line is visible at y=7
- [ ] Labels and title are present

```python
# Your code here
# Students write code with AI assistance
```


---
### Expected Output (for verification)

A bar chart showing:
- 6 bars representing compounds with varying pIC50 values
- A horizontal dashed line at pIC50 = 7.0 (potency threshold)
- Bars above the line = potent compounds (IC50 < 100 nM)
- Proper axis labels and title

**Interpretation students should reach:**
- Green/tall bars: Potent (pIC50 > 7, IC50 < 100 nM)
- Orange bars: Moderate (pIC50 6-7)
- Red/short bars: Weak (pIC50 < 6)
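
A chart matching this description could be produced along the following lines. This is a sketch for instructor reference, not the official solution. The pIC50 values are hypothetical placeholders, not the course data, and `potency_color` is an illustrative helper that applies the thresholds listed above.

```python
import matplotlib.pyplot as plt

# Hypothetical pIC50 values for illustration only (not the course dataset).
pic50_values = [8.2, 7.5, 7.0, 6.4, 5.8, 5.1]
labels = [f"Compound {i}" for i in range(1, len(pic50_values) + 1)]

def potency_color(pic50):
    """Map a pIC50 value to a bar color using the thresholds above."""
    if pic50 > 7:
        return "green"   # potent: IC50 < 100 nM
    if pic50 >= 6:
        return "orange"  # moderate: pIC50 6-7
    return "red"         # weak: pIC50 < 6

colors = [potency_color(value) for value in pic50_values]

fig, ax = plt.subplots()
ax.bar(labels, pic50_values, color=colors)
# Dashed horizontal line marking the activity threshold.
ax.axhline(y=7, linestyle="--", color="black", label="pIC50 = 7 threshold")
ax.set_xlabel("Compound")
ax.set_ylabel("pIC50")
ax.set_title("DHFR Inhibitor Potency Comparison")
ax.legend()
plt.show()
```

Color coding is optional for the task as stated; a single-color bar chart with the threshold line and labels also satisfies the verification checklist.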

---

## Checkpoint

Congratulations! You've completed your first talktorial. Before moving on, confirm you can:

- [ ] Navigate Colab: create cells, run code, save your work
- [ ] Use AI assistants to write Python code
- [ ] Create variables of different types (strings, numbers, booleans)
- [ ] Work with lists and perform calculations
- [ ] Write a simple function
- [ ] Use loops to process multiple items
- [ ] Import and use packages (numpy, matplotlib)
- [ ] Create basic visualizations

### Your lab notebook should include:
- [ ] AI prompts you used and what worked well
- [ ] Answers to the reflection questions throughout
- [ ] Any errors you encountered and how you resolved them
- [ ] Notes on concepts that were new or confusing

---

## Reflection Questions

Answer these in your lab notebook:

1. **AI Collaboration**: What kinds of prompts worked best when asking your AI assistant for help? What didn't work as well?

2. **Verification**: How did you verify that the code your AI generated was correct? Did you catch any errors?

3. **Application**: How might you use these Python basics in pharmaceutical research? Give a specific example.

4. **Challenges**: What was the most challenging part of this talktorial? How did you overcome it?

---
### Instructor Note: Assessment Integration (Manuscript Section 6)

**What to look for in student submissions:**

1. **Code correctness**: Do outputs match expected values?
2. **AI documentation**: Did they record their prompts and AI interactions?
3. **Verification habits**: Evidence of testing with known values?
4. **Reflection quality**: Thoughtful responses showing metacognition?

**Lab notebook verification rubric** (0-4 scale, Section 6.6):
- 0: No source cited
- 1: Source URL only
- 2: Citation with relevance statement
- 3: Citation with agreement/disagreement analysis  
- 4: Multiple sources with synthesis

---

## Further Reading

- [Google Colab Welcome Notebook](https://colab.research.google.com/notebooks/intro.ipynb) - Official introduction
- [Python for Beginners](https://www.python.org/about/gettingstarted/) - Python.org resources
- [NumPy Quickstart](https://numpy.org/doc/stable/user/quickstart.html) - NumPy basics
- [Matplotlib Tutorials](https://matplotlib.org/stable/tutorials/index.html) - Visualization guides

---

## Connection to Research

The skills you practiced today form the foundation for everything we'll do in this course:

- **Variables and data types** → Storing molecular properties, protein sequences, experimental results
- **Lists and loops** → Processing compound libraries, analyzing mutation effects
- **Functions** → Building reusable analysis pipelines
- **NumPy** → Efficient numerical computations on large datasets
- **Matplotlib** → Visualizing structure-activity relationships, docking results

In the coming weeks, you'll use these basics to query chemical databases, predict protein structures with AlphaFold2, dock small molecules, and build machine learning models for antibiotic resistance prediction.

The DHFR enzyme you'll study is a real drug target with clinical importance. The computational approaches you'll learn are the same ones used in pharmaceutical industry and academic research labs.

---

*AI-PSCI-001 Complete. Proceed to AI-PSCI-002: Effective AI Collaboration & Prompt Engineering.*

---

## Summary: AGIL Talktorial Design Elements

This scaffold demonstrates the key components of every AGIL talktorial:

| Element | Purpose | Manuscript Section |
|---------|---------|-------------------|
| **Learning Objectives** | Clear outcomes students should achieve | 4.1 |
| **Background** | Scientific context before coding | 4.2 |
| **Guided Inquiries** | Structured tasks with AI prompting tips | 4.2, 4.3 |
| **Empty Code Cells** | Students write code, not run pre-written code | 4.2 |
| **Verification Checkpoints** | Expected outputs for self-assessment | 4.4 |
| **Lab Notebook Prompts** | Documentation of process and AI interactions | 4.5 |
| **Reflection Questions** | Metacognitive development | 6.3 |
| **Connection to Research** | Relevance to real drug discovery | 3.2 |

**For the complete talktorial** (all 8 guided inquiries), see `AI-PSCI-001.ipynb` in this folder.

**For solution code**, educators can request access via [FOR_EDUCATORS.md](../FOR_EDUCATORS.md).