# AI Skills Hub - Python for AI
## Lesson 2: Lists and Indexing

**Learn:** Lists, indexing, slicing, and list comprehensions  
**Build:** Data splitting, batch processing, and filtering  
**Runtime:** ~30 minutes  
**GPU Required:** No  
**License:** MIT

---

## Setup: Run this cell first

Verify your Python environment is ready.

In [None]:
import sys
print(f"Python version: {sys.version}")
print(f"Platform: {sys.platform}")
print("\n✅ Setup complete! Ready to learn about lists.")

---
## Part 1: Creating Lists

Lists are ordered, mutable collections that store multiple values. In AI code they commonly hold training data, predictions, and metrics.

In [None]:
# Empty list
empty_list = []

# List of training accuracies over epochs
accuracies = [0.82, 0.85, 0.88, 0.91, 0.93]

# List of epoch numbers
epochs = [1, 2, 3, 4, 5]

# List of model names
models = ["ResNet", "VGG", "MobileNet"]

# Print examples
print(f"Accuracies: {accuracies}")
print(f"Type: {type(accuracies)}")
print(f"Length: {len(accuracies)} epochs")

In [None]:
# AI Example: Tracking training metrics
train_losses = [0.89, 0.76, 0.65, 0.58, 0.52]
val_losses = [0.92, 0.81, 0.72, 0.68, 0.65]

# Display side by side
for epoch, (train_loss, val_loss) in enumerate(zip(train_losses, val_losses), 1):
    print(f"Epoch {epoch}: Train Loss = {train_loss:.4f}, Val Loss = {val_loss:.4f}")

### 🎯 Practice: Create Your Own Lists

Create lists to track model predictions and labels.

In [None]:
# TODO: Create the following lists:
# 1. predictions = [0, 1, 1, 0, 1, 0, 0, 1] (binary predictions)
# 2. labels = [0, 1, 0, 0, 1, 0, 1, 1] (ground truth)
# 3. learning_rates = [0.1, 0.01, 0.001, 0.0001] (LR schedule)

# Your code here:


# Test your code
print(f"Predictions: {predictions}")
print(f"Labels: {labels}")
print(f"Learning rates: {learning_rates}")

---
## Part 2: Indexing - Accessing Elements

Lists use zero-based indexing: the first element is at index 0, and the last is at index `len(list) - 1`.

In [None]:
losses = [0.89, 0.76, 0.65, 0.58, 0.52]

# Positive indexing (from start)
print("Positive indexing:")
print(f"First loss (index 0): {losses[0]}")
print(f"Second loss (index 1): {losses[1]}")
print(f"Last loss (index 4): {losses[4]}")

# Negative indexing (from end)
print("\nNegative indexing:")
print(f"Last loss (index -1): {losses[-1]}")
print(f"Second to last (index -2): {losses[-2]}")
print(f"First loss (index -5): {losses[-5]}")
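
The cell above only uses valid indices. As a small sketch of what happens at the boundaries (the variable names `n` and `out_of_range_raised` are just for illustration), the following shows how negative indices map onto positive ones and that an index past the end raises `IndexError`:

```python
losses = [0.89, 0.76, 0.65, 0.58, 0.52]

# Valid indices run from -len(losses) to len(losses) - 1.
n = len(losses)
print(f"Valid index range: {-n} to {n - 1}")

# A negative index i refers to the same element as index n + i.
print(losses[-1] == losses[n - 1])  # True
print(losses[-n] == losses[0])      # True

# Indexing past either end raises IndexError.
try:
    losses[n]
    out_of_range_raised = False
except IndexError as err:
    out_of_range_raised = True
    print(f"IndexError: {err}")
```

This is why `losses[-1]` is usually preferred over `losses[4]` for the last element: it keeps working when the list grows.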

In [None]:
# AI Use Case: Compare initial vs final metrics
accuracies = [0.65, 0.72, 0.78, 0.82, 0.85, 0.88]

initial_acc = accuracies[0]
final_acc = accuracies[-1]
improvement = final_acc - initial_acc

print(f"Initial accuracy: {initial_acc:.2%}")
print(f"Final accuracy: {final_acc:.2%}")
print(f"Improvement: {improvement:.2%}")

### 🎯 Practice: Indexing Challenge

In [None]:
# Model predictions for 10 samples
predictions = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
labels = [0, 1, 0, 0, 1, 0, 1, 1, 1, 0]

# TODO: Use indexing to:
# 1. Get the first prediction and label
# 2. Get the last prediction and label
# 3. Check if they match (compare with ==)

# Your code here:
first_pred = None
first_label = None
last_pred = None
last_label = None

first_match = None  # True or False
last_match = None   # True or False

print(f"First prediction: {first_pred}, Label: {first_label}, Match: {first_match}")
print(f"Last prediction: {last_pred}, Label: {last_label}, Match: {last_match}")

---
## Part 3: Slicing - Extracting Ranges

Slicing extracts a portion of a list as a new list.  
Syntax: `list[start:stop:step]`, where `start` is included, `stop` is excluded, and `step` defaults to 1.

In [None]:
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print("Basic slicing:")
print(f"First 5: {data[:5]}")      # [0, 1, 2, 3, 4]
print(f"Last 3: {data[-3:]}")      # [7, 8, 9]
print(f"Middle (indices 3 to 6): {data[3:7]}")  # [3, 4, 5, 6]
print(f"Every 2nd element: {data[::2]}")  # [0, 2, 4, 6, 8]
print(f"Reversed: {data[::-1]}")   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
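
Three slicing behaviors tend to surprise newcomers, so here is a brief sketch of them: the `stop` index is excluded, out-of-range bounds are clamped rather than raising an error, and a slice is a new list independent of the original. The names `middle`, `clamped`, and `head` are illustrative only.

```python
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# stop is exclusive, so with step 1 a slice holds stop - start elements.
middle = data[3:7]
print(len(middle))  # 4

# Out-of-range bounds are clamped instead of raising IndexError.
clamped = data[8:100]
print(clamped)  # [8, 9]

# A slice is a new list: changing it leaves the original untouched.
head = data[:3]
head[0] = -1
print(head)      # [-1, 1, 2]
print(data[:3])  # [0, 1, 2]
```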

In [None]:
# AI Use Case: Train/Test Split
dataset = list(range(100))  # 100 samples

# Split 80% train, 20% test
split_index = int(0.8 * len(dataset))
train_data = dataset[:split_index]
test_data = dataset[split_index:]

print(f"Total samples: {len(dataset)}")
print(f"Training samples: {len(train_data)}")
print(f"Test samples: {len(test_data)}")
print(f"Train data (first 10): {train_data[:10]}")
print(f"Test data (first 10): {test_data[:10]}")
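
The split above takes the first 80 samples in their stored order. If the data happens to be sorted (for example by class or by date), that can give an unrepresentative test set. One common remedy, sketched here under the assumption that the samples can be reordered freely, is to shuffle a copy before slicing. The seed value `42` and the names `shuffled` and `rng` are arbitrary choices for this example.

```python
import random

dataset = list(range(100))  # 100 samples

# Copy first so the original order is preserved; [:] is a full slice.
shuffled = dataset[:]

# A seeded generator makes the shuffle reproducible (42 is arbitrary).
rng = random.Random(42)
rng.shuffle(shuffled)

# Same 80/20 slicing as before, applied to the shuffled copy.
split_index = int(0.8 * len(shuffled))
train_data = shuffled[:split_index]
test_data = shuffled[split_index:]

print(f"Training samples: {len(train_data)}")
print(f"Test samples: {len(test_data)}")
print(f"Train data (first 10): {train_data[:10]}")
```

For time-series data, shuffling would leak future information into training, so an ordered split like the original one is usually the better choice there.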

In [None]:
# AI Use Case: Get recent training history
loss_history = [1.5 - (i * 0.01) for i in range(100)]

# Get last 10 epochs
recent_losses = loss_history[-10:]

print(f"Total epochs: {len(loss_history)}")
print(f"Last 10 losses: {recent_losses}")
print(f"Best loss: {min(loss_history):.4f}")
print(f"Worst loss: {max(loss_history):.4f}")

### 🎯 Practice: Data Splitting

In [None]:
# Dataset with 200 samples
full_dataset = list(range(200))

# TODO: Split into 60% train, 20% validation, 20% test
# Calculate the split indices first
# train_end = ?
# val_end = ?

# Your code here:
train_data = None
val_data = None
test_data = None

print(f"Train samples: {len(train_data)}")
print(f"Validation samples: {len(val_data)}")
print(f"Test samples: {len(test_data)}")

---
## Part 4: List Methods

Lists have built-in methods for adding, removing, and modifying elements.

In [None]:
# Start with empty list
losses = []

# append() - Add single element
losses.append(0.89)
losses.append(0.76)
print(f"After append: {losses}")

# extend() - Add multiple elements
losses.extend([0.65, 0.58, 0.52])
print(f"After extend: {losses}")

# insert() - Add at specific position
losses.insert(0, 1.20)  # Insert at beginning
print(f"After insert: {losses}")

# remove() - Remove first occurrence
losses.remove(1.20)
print(f"After remove: {losses}")

# pop() - Remove and return last element
last = losses.pop()
print(f"Popped: {last}")
print(f"After pop: {losses}")
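
A few details of these methods are easy to trip over. The sketch below (with illustrative names `result`, `first`, and `missing_raised`) shows that `append()` modifies the list in place and returns `None`, that `pop()` accepts an optional index, and that `remove()` raises `ValueError` when the value is absent.

```python
losses = [0.89, 0.76, 0.65]

# append() changes the list in place and returns None,
# so do not assign its result back to the variable.
result = losses.append(0.58)
print(result)  # None
print(losses)  # [0.89, 0.76, 0.65, 0.58]

# pop() also accepts an index: pop(0) removes and returns the first element.
first = losses.pop(0)
print(first)   # 0.89

# remove() raises ValueError when the value is not in the list.
try:
    losses.remove(9.99)
    missing_raised = False
except ValueError:
    missing_raised = True
print(missing_raised)  # True
```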

In [None]:
# AI Example: Building training history
train_losses = []
val_losses = []

# Simulate 5 epochs of training
for epoch in range(5):
    # Simulate decreasing loss
    train_loss = 1.0 / (epoch + 1)
    val_loss = 1.1 / (epoch + 1)
    
    # Record metrics
    train_losses.append(train_loss)
    val_losses.append(val_loss)
    
    print(f"Epoch {epoch + 1}: Train={train_loss:.4f}, Val={val_loss:.4f}")

print("\nFinal training history:")
print(f"Train losses: {[f'{loss:.4f}' for loss in train_losses]}")
print(f"Val losses: {[f'{loss:.4f}' for loss in val_losses]}")

### 🎯 Practice: Managing Predictions

In [None]:
# Start with empty predictions list
predictions = []

# TODO: 
# 1. Add predictions [0, 1, 1] using append
# 2. Add predictions [0, 1] using extend
# 3. Remove the last prediction using pop
# 4. Print the final predictions list
# Expected result: [0, 1, 1, 0]

# Your code here:


print(f"Final predictions: {predictions}")

---
## Part 5: List Comprehensions

List comprehensions provide a concise way to build a list from another iterable. They are widely used in data preprocessing.

In [None]:
# Traditional way (verbose)
squares = []
for i in range(10):
    squares.append(i ** 2)
print(f"Traditional: {squares}")

# List comprehension (concise)
squares = [i ** 2 for i in range(10)]
print(f"Comprehension: {squares}")

# With condition: only even squares
even_squares = [i ** 2 for i in range(10) if i % 2 == 0]
print(f"Even squares: {even_squares}")
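
The position of `if` inside a comprehension changes its meaning. As a short sketch (the `activations` values are made up for illustration), a trailing `if` filters elements out, while a conditional expression placed before `for` keeps every element and chooses a value for each one:

```python
activations = [-1.5, 0.3, -0.2, 2.0, 0.0]

# Filter: the trailing `if` drops elements, so the result can be shorter.
positives = [x for x in activations if x > 0]
print(positives)  # [0.3, 2.0]

# Transform: `a if cond else b` before `for` keeps every element
# and picks a value for each one (here, a ReLU-style clamp to zero).
relu = [x if x > 0 else 0.0 for x in activations]
print(relu)  # [0.0, 0.3, 0.0, 2.0, 0.0]
```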

In [None]:
# AI Example: Normalize pixel values
pixel_values = [0, 50, 100, 150, 200, 255]

# Normalize to [0, 1] range
normalized = [pixel / 255.0 for pixel in pixel_values]

print("Original pixels:", pixel_values)
print("Normalized:", [f"{val:.3f}" for val in normalized])

In [None]:
# AI Example: Create training batches
dataset = list(range(100))
batch_size = 16

# Create batches using list comprehension
batches = [dataset[i:i+batch_size] for i in range(0, len(dataset), batch_size)]

print(f"Total samples: {len(dataset)}")
print(f"Batch size: {batch_size}")
print(f"Number of batches: {len(batches)}")
print(f"First batch: {batches[0]}")
print(f"Last batch size: {len(batches[-1])}")
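
Because 100 is not a multiple of 16, the last batch above holds only 4 samples. Some training setups prefer to discard an incomplete final batch so every batch has the same size. A minimal sketch of that variant, assuming dropping the leftover samples is acceptable (the names `num_full` and `full_batches` are illustrative):

```python
dataset = list(range(100))
batch_size = 16

# Integer division gives the number of complete batches.
num_full = len(dataset) // batch_size
print(num_full)  # 6

# Build only the complete batches; the 4 leftover samples are skipped.
full_batches = [
    dataset[i * batch_size:(i + 1) * batch_size]
    for i in range(num_full)
]

print(f"Number of full batches: {len(full_batches)}")
print(f"Last batch size: {len(full_batches[-1])}")
```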

In [None]:
# AI Example: Filter high-confidence predictions
predictions = [0, 1, 1, 0, 1, 0, 1, 1]
confidences = [0.92, 0.65, 0.88, 0.95, 0.55, 0.78, 0.82, 0.91]

# Keep only predictions with confidence > 0.8
threshold = 0.8
high_conf_preds = [pred for pred, conf in zip(predictions, confidences) if conf > threshold]

print(f"All predictions: {predictions}")
print(f"Confidences: {confidences}")
print(f"High confidence (>{threshold}): {high_conf_preds}")
print(f"Kept {len(high_conf_preds)}/{len(predictions)} predictions")
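
The filter above keeps the predictions but loses track of which samples they came from. If the sample positions matter, one option is to wrap the `zip` in `enumerate` and keep the index alongside each prediction. This is a sketch using the same data; `kept` and `kept_indices` are illustrative names.

```python
predictions = [0, 1, 1, 0, 1, 0, 1, 1]
confidences = [0.92, 0.65, 0.88, 0.95, 0.55, 0.78, 0.82, 0.91]
threshold = 0.8

# enumerate() adds the position; each kept item is an (index, prediction) pair.
kept = [
    (i, pred)
    for i, (pred, conf) in enumerate(zip(predictions, confidences))
    if conf > threshold
]
print(kept)  # [(0, 0), (2, 1), (3, 0), (6, 1), (7, 1)]

# The indices alone can be used to look up the matching samples or labels.
kept_indices = [i for i, _ in kept]
print(kept_indices)  # [0, 2, 3, 6, 7]
```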

### 🎯 Practice: List Comprehension Challenge

In [None]:
# Raw loss values (need to be converted to percentages)
losses = [0.05, 0.12, 0.08, 0.03, 0.15, 0.02]

# TODO: Use list comprehension to:
# 1. Convert each loss to percentage (multiply by 100)
# 2. Keep only losses that are less than 10%

# Your code here:
loss_percentages = None  # Convert all to percentages
good_losses = None       # Keep only < 10%

print(f"Loss percentages: {loss_percentages}")
print(f"Good losses (< 10%): {good_losses}")

---
## Part 6: Nested Lists (2D Data)

Nested lists represent multi-dimensional data like batches of samples.

In [None]:
# 2D list: batch of samples with features
# Shape: [batch_size, num_features]
batch = [
    [0.5, 0.3, 0.8],  # Sample 1
    [0.2, 0.9, 0.4],  # Sample 2
    [0.7, 0.1, 0.6]   # Sample 3
]

# Access elements
print("Batch structure:")
print(f"Full batch: {batch}")
print(f"\nFirst sample: {batch[0]}")
print(f"Second sample: {batch[1]}")
print(f"\nFirst feature of first sample: {batch[0][0]}")
print(f"Second feature of first sample: {batch[0][1]}")
print(f"\nBatch shape: [{len(batch)}, {len(batch[0])}]")
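
When creating a nested list filled with a default value, repeating the outer list with `*` is a known pitfall: every row ends up being the same inner list. The sketch below contrasts that with a comprehension, which builds a separate row each time. The sizes and the names `shared` and `independent` are chosen only for this example.

```python
num_samples, num_features = 3, 2

# Pitfall: * repeats a reference to the SAME inner list.
shared = [[0.0] * num_features] * num_samples
shared[0][0] = 1.0
print(shared)  # [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]

# A comprehension builds a fresh inner list on each iteration.
independent = [[0.0] * num_features for _ in range(num_samples)]
independent[0][0] = 1.0
print(independent)  # [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
```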

In [None]:
# AI Example: Process batch of samples
batch = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0]
]

# Calculate mean for each sample
sample_means = [sum(sample) / len(sample) for sample in batch]

print("Batch data:")
for i, sample in enumerate(batch):
    print(f"Sample {i + 1}: {sample} -> Mean: {sample_means[i]:.2f}")
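
The example above averages across each row (one sample). Averaging down each column (one feature across all samples) needs the columns first. As a sketch, a comprehension can pull out a single column, and `zip(*batch)` groups the values of each feature together; `first_feature` and `feature_means` are illustrative names.

```python
batch = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]

# One column: take the same position from every sample.
first_feature = [sample[0] for sample in batch]
print(first_feature)  # [1.0, 4.0, 7.0]

# zip(*batch) yields one tuple per feature: (1.0, 4.0, 7.0), (2.0, 5.0, 8.0), ...
feature_means = [sum(column) / len(column) for column in zip(*batch)]
print(feature_means)  # [4.0, 5.0, 6.0]
```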

---
## 🏆 Final Challenge: Complete Data Pipeline

Apply everything you've learned to create a complete data processing pipeline.

In [None]:
# Full dataset of 100 samples
full_dataset = list(range(100))

# TODO: Complete the following tasks:
# 1. Split into 70% train, 15% val, 15% test
# 2. Create batches of size 8 for training data
# 3. Print first batch and last batch
# 4. Count total batches

# Your code here:
train_end = None  # Calculate split point
val_end = None    # Calculate split point

train_data = None
val_data = None
test_data = None

batch_size = 8
train_batches = None  # Create batches using list comprehension

# Print statistics
print("Data Split:")
print(f"Train samples: {len(train_data)}")
print(f"Val samples: {len(val_data)}")
print(f"Test samples: {len(test_data)}")
print("\nBatch Information:")
print(f"Number of batches: {len(train_batches)}")
print(f"First batch: {train_batches[0]}")
print(f"Last batch: {train_batches[-1]}")
print(f"Last batch size: {len(train_batches[-1])}")

---
## 📝 Quiz: Check Your Understanding

Test your knowledge with these quick questions.

In [None]:
# Question 1: What does data[3:7] return?
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
result_1 = data[3:7]
print(f"Q1: {result_1}")  # Answer: [3, 4, 5, 6]

# Question 2: How to get the last element?
result_2 = data[-1]
print(f"Q2: {result_2}")  # Answer: 9

# Question 3: What's the difference between append and extend?
list1 = [1, 2]
list1.append([3, 4])
print(f"Q3a (append): {list1}")  # [1, 2, [3, 4]]

list2 = [1, 2]
list2.extend([3, 4])
print(f"Q3b (extend): {list2}")  # [1, 2, 3, 4]

# Question 4: Create 80/20 split
dataset = list(range(100))
split_idx = int(0.8 * len(dataset))
train = dataset[:split_idx]
test = dataset[split_idx:]
print(f"Q4: Train={len(train)}, Test={len(test)}")  # 80, 20

---
## 🎉 Congratulations!

You've completed Lesson 2! You now understand:

- ✅ Creating and manipulating lists
- ✅ Indexing and slicing for data access
- ✅ List methods (append, extend, pop, etc.)
- ✅ List comprehensions for efficient processing
- ✅ Nested lists for batch data
- ✅ Train/test splitting and batch creation

**Next Steps:**
- Complete the [Lesson 2 Quiz](https://rajgupt.github.io/ai-for-builders/courses/foundation/python-for-ai/quizzes/#lesson-2)
- Move on to [Lesson 3: Dictionaries and Data Structures](https://rajgupt.github.io/ai-for-builders/courses/foundation/python-for-ai/03-dictionaries/)

---

**Resources:**
- [Python Lists Documentation](https://docs.python.org/3/tutorial/datastructures.html)
- [List Comprehensions Guide](https://realpython.com/list-comprehension-python/)
- [Data Splitting Best Practices](https://scikit-learn.org/stable/modules/cross_validation.html)

**License:** MIT | **Course:** AI Skills Hub | **Lesson:** 2/7