# Week 15: Future Trends in AI - Homework

**ML2: Advanced Machine Learning**

**Estimated Time**: 1 hour

---

This homework combines programming exercises and knowledge-based questions to reinforce this week's concepts.

## Setup

Run this cell to import the necessary libraries:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn

# Set random seed for reproducibility
np.random.seed(42)
torch.manual_seed(42)

print('✓ Libraries imported successfully')

---
## Part 1: Programming Exercises (60%)

Complete the following programming tasks. Read each description carefully and implement the requested functionality.

### Exercise 1: Multimodal Capabilities Experiment

**Time**: 10 min

Explore how models can process text AND images together.

In [None]:
# Multimodal models (GPT-4V, Gemini) can understand images + text

# Task 1: Image captioning
# Input: [image of a cat]
# Output: "A gray cat sitting on a windowsill"

# Task 2: Visual question answering
# Input: [image of a chart] + "What's the trend in 2023?"
# Output: "The trend shows an upward trajectory in 2023"

# Task 3: OCR + reasoning
# Input: [image of a receipt] + "How much did I spend on groceries?"
# Output: "$45.67"

# TODO: Consider what new applications multimodality enables
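
To make the tasks above more concrete, here is a minimal sketch of one common way to feed images and text to the same model: project image-patch features into the text embedding space and concatenate them into a single sequence. All sizes and the random `projection` matrix are illustrative assumptions; this is not a description of how GPT-4V or Gemini work internally.

```python
import numpy as np

rng = np.random.default_rng(0)

d_image = 32   # size of each image-patch feature (hypothetical)
d_model = 16   # size of the language model's token embeddings (hypothetical)

# Stand-ins for real encoder outputs: 9 image patches and 5 text tokens.
patch_features = rng.normal(size=(9, d_image))
text_embeddings = rng.normal(size=(5, d_model))

# A learned projection maps image features into the text embedding space.
# It is random here because nothing is trained in this sketch.
projection = rng.normal(size=(d_image, d_model)) / np.sqrt(d_image)
image_tokens = patch_features @ projection

# The language model then sees one sequence: image tokens followed by text tokens.
fused_sequence = np.concatenate([image_tokens, text_embeddings], axis=0)

print(fused_sequence.shape)  # (14, 16)
```

Once both modalities share one sequence, the same attention layers can relate a word in the question to a region of the image.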

---
## Part 2: Knowledge Questions (40%)

Answer the following questions to test your conceptual understanding.

### Question 1 (Short Answer)

**Question 1 - Multimodality**

Text-only LLMs:
- Input: text → Output: text

Multimodal LLMs:
- Input: text + images + audio → Output: text (or images)

Explain:
1. What new capabilities does multimodality unlock?
2. What challenges arise from combining modalities?
3. Give 3 real-world applications.

**Hint**: New: visual reasoning, accessibility, richer understanding. Challenges: alignment, training complexity.

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 2 (Short Answer)

**Question 2 - Efficiency Frontiers**

Trends in making LLMs more efficient:
- LoRA: Freeze the base model and train small low-rank adapter matrices
- Quantization: Use fewer bits per parameter (FP16 → INT8)
- Distillation: Train a smaller model to mimic a larger one
- Mixture of Experts: Activate only the relevant parts of the model

Explain: Why is efficiency becoming critical as models grow?

**Hint**: Cost, energy, environmental impact, accessibility. Can't keep scaling infinitely.
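
As an illustration of the quantization item above, the following sketch applies symmetric per-tensor INT8 quantization to a random FP16 weight matrix. This is one simple scheme among several, and the matrix size and weight scale are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
weights_fp16 = rng.normal(scale=0.02, size=(256, 256)).astype(np.float16)
weights_fp32 = weights_fp16.astype(np.float32)  # do the arithmetic in FP32

# Symmetric per-tensor quantization: map the largest magnitude to 127.
scale = float(np.abs(weights_fp32).max()) / 127.0
quantized = np.clip(np.round(weights_fp32 / scale), -127, 127).astype(np.int8)

# Dequantize to measure how much precision was lost.
restored = quantized.astype(np.float32) * scale
max_error = float(np.abs(weights_fp32 - restored).max())

print(f"FP16 size: {weights_fp16.nbytes} bytes")  # 131072
print(f"INT8 size: {quantized.nbytes} bytes")     # 65536
print(f"Max round-trip error: {max_error:.6f}")
```

Storage halves, and the round-trip error stays within about half a quantization step (`scale / 2`).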

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 3 (Multiple Choice)

**Question 3 - LoRA (Low-Rank Adaptation)**

Instead of fine-tuning all 175B parameters of a GPT-3-sized model, LoRA trains only a few million added parameters.

How does this work?

A) It removes most of the model
B) It trains small adapter layers while freezing the base model
C) It only works on small models
D) It eliminates the need for training data

**Hint**: LoRA adds small trainable matrices that adapt the frozen base model efficiently.
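
The idea in the hint can be sketched in a few lines of NumPy. The layer sizes and rank below are illustrative assumptions, not values from any real model; `W` stands for a frozen pretrained weight and `A`, `B` for the trainable low-rank matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 512, 512, 4   # illustrative sizes

W = rng.normal(size=(d_out, d_in))              # frozen pretrained weight
A = rng.normal(scale=0.01, size=(rank, d_in))   # trainable, small random init
B = np.zeros((d_out, rank))                     # trainable, zero init

def lora_forward(x):
    # Base output plus low-rank update: (W + B @ A) @ x, without forming B @ A.
    return W @ x + B @ (A @ x)

x = rng.normal(size=d_in)
full_params = W.size
lora_params = A.size + B.size

print(full_params)  # 262144
print(lora_params)  # 4096
```

Because `B` starts at zero, the adapted layer initially behaves exactly like the frozen layer; training then updates only `A` and `B`.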

**Your Answer**: [Write your answer here - e.g., 'B']

**Explanation**: [Explain why this is correct]

### Question 4 (Short Answer)

**Question 4 - Open Source vs Closed Source**

Closed: GPT-4 (OpenAI), Claude (Anthropic)
Open: Llama, Mistral, Falcon

Explain:
1. What advantages does open source provide?
2. What are the risks of fully open models?
3. What's the trend in the community?

**Hint**: Open = transparency, customization, research. Risks = misuse, safety. Trend = moving toward open.

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 5 (Short Answer)

**Question 5 - Mixture of Experts (MoE)**

Instead of activating the ENTIRE model, MoE activates only relevant "experts" for each input.

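Example (simplified): For a coding question, activate a coding expert. For a medical question, activate a medical expert. In practice, routing is learned and usually happens per token, so experts rarely map to clean topics.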

Explain:
1. What efficiency gains does this provide?
2. How does the model decide which experts to activate?
3. What's the tradeoff?

**Hint**: Efficiency: less computation per input. Router network selects experts. Tradeoff: complexity.
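
A minimal sketch of top-k routing, assuming a single linear router and one weight matrix per expert. The sizes, the random weights, and the function name `moe_forward` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, num_experts, top_k = 8, 4, 2   # illustrative sizes

router_weights = rng.normal(size=(d_model, num_experts))
expert_weights = rng.normal(size=(num_experts, d_model, d_model))

def moe_forward(token):
    logits = token @ router_weights        # one score per expert
    chosen = np.argsort(logits)[-top_k:]   # indices of the top-k experts
    gate = np.exp(logits[chosen] - logits[chosen].max())
    gate = gate / gate.sum()               # softmax over the chosen experts only
    # Only the chosen experts run; the others cost no computation for this token.
    output = sum(g * (token @ expert_weights[e]) for g, e in zip(gate, chosen))
    return output, chosen, gate

token = rng.normal(size=d_model)
output, chosen, gate = moe_forward(token)
print(f"Experts used: {sorted(chosen.tolist())} of {num_experts}")
```

All expert weights still have to be stored, which is part of the tradeoff: computation per token drops, but memory use does not.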

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 6 (Multiple Choice)

**Question 6 - Context Window Scaling**

GPT-3: 2k tokens
GPT-4: 8k-128k tokens
Claude: 100k-200k tokens

Why is longer context important?

A) Makes model bigger
B) Enables processing entire books, codebases, conversations
C) Faster inference
D) Lower cost

**Hint**: Longer context = can fit more information without external memory/retrieval.
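
Longer context is not free. As a rough sketch, assuming standard full self-attention with no sparsity or other optimizations, the number of attention scores grows with the square of the context length:

```python
def attention_score_entries(num_tokens: int) -> int:
    # Full self-attention compares every token with every other token,
    # so each head builds a num_tokens x num_tokens score matrix.
    return num_tokens * num_tokens

baseline = attention_score_entries(8_000)
for context in (8_000, 128_000, 200_000):
    ratio = attention_score_entries(context) / baseline
    print(f"{context:>7} tokens -> {ratio:.0f}x the score entries of an 8k context")
```

Going from 8k to 128k tokens is 16 times more text but 256 times more score entries.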

**Your Answer**: [Write your answer here - e.g., 'B']

**Explanation**: [Explain why this is correct]

### Question 7 (Short Answer)

**Question 7 - Synthetic Data**

Running out of internet text to train on? Use LLMs to GENERATE training data.

Phi-2: Trained largely on synthetic data, performs well despite small size.

Explain:
1. How can you use GPT-4 to generate training data for smaller models?
2. What are the risks of synthetic data?
3. When is this approach valuable?

**Hint**: GPT-4 generates diverse examples. Risks: bias amplification, homogenization. Valuable when real data scarce.
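
The overall shape of a synthetic data pipeline can be sketched as follows. `teacher_generate` is a hypothetical stand-in for a call to a large teacher model; here it only fills a template so that the example runs offline.

```python
import random

def teacher_generate(topic: str, seed: int) -> str:
    # Hypothetical stand-in for a call to a large "teacher" model.
    # A real pipeline would send a prompt to an LLM here.
    rng = random.Random(seed)
    a, b = rng.randint(1, 5), rng.randint(1, 5)
    return f"Q: In a {topic} word problem, what is {a} + {b}? A: {a + b}"

def build_dataset(topics, samples_per_topic):
    seen, dataset = set(), []
    for topic in topics:
        for seed in range(samples_per_topic):
            example = teacher_generate(topic, seed)
            if example not in seen:   # drop exact duplicates
                seen.add(example)
                dataset.append(example)
    return dataset

dataset = build_dataset(["shopping", "cooking"], samples_per_topic=50)
print(f"Kept {len(dataset)} unique examples out of 100 generated")
```

Many of the generated examples turn out to be duplicates, which is a small-scale picture of the homogenization risk named in the hint.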

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 8 (Short Answer)

**Question 8 - Constitutional AI**

Instead of relying only on human preference labels (as in RLHF), train LLMs using a "constitution" (a set of written principles) that guides AI-generated feedback.

Example principle: "Be helpful, harmless, and honest."

Explain:
1. How does this differ from RLHF?
2. What advantages might it have?
3. What challenges remain?

**Hint**: Constitution = explicit rules vs implicit human feedback. Advantages: transparency, scalability.

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 9 (Short Answer)

**Question 9 - Compute Trends**

GPT-3: ~3.14 × 10²³ FLOPs to train
GPT-4: (estimated) ~10²⁵ FLOPs

Explain:
1. Can this exponential growth continue?
2. What are the bottlenecks (energy, cost, hardware)?
3. What alternatives to "bigger is better" are being explored?

**Hint**: Can't scale forever. Bottlenecks: power, chips, cost. Alternatives: efficiency, architecture improvements.
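
The GPT-3 figure above can be roughly reproduced with a common rule of thumb: training compute is about 6 × parameters × training tokens. The sketch below uses the commonly cited GPT-3 values of 175B parameters and 300B tokens. The rule is an approximation, not an exact accounting.

```python
def training_flops(num_parameters: float, num_tokens: float) -> float:
    # Rule of thumb: about 6 FLOPs per parameter per training token
    # (roughly 2 for the forward pass and 4 for the backward pass).
    return 6 * num_parameters * num_tokens

gpt3_flops = training_flops(175e9, 300e9)
print(f"{gpt3_flops:.2e}")  # 3.15e+23
```

Under this rule, growing compute by 10× requires 10× more parameters, 10× more tokens, or some mix of the two.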

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 10 (Short Answer)

**Question 10 - Your Prediction**

Crystal ball time! What do you predict for LLMs in the next 5 years?

Consider:
- Model capabilities
- Accessibility (open source, cost)
- Multimodality
- Applications
- Societal impact

Explain your reasoning.

**Hint**: No wrong answer! Think about current trends and extrapolate.

**Your Answer**:

[Write your answer here in 2-4 sentences]

---
## Submission

Before submitting:
1. Run all cells to ensure code executes without errors
2. Check that all questions are answered
3. Review your explanations for clarity

**To Submit**:
- File → Download → Download .ipynb
- Submit the notebook file to your course LMS

**Note**: Make sure your name is in the filename (e.g., homework_15_yourname.ipynb)