# 📘 Notebook 5: Loops for Repetition and Automation
In this notebook, you'll revise how to use loops in Python to repeat tasks efficiently.

❗**This notebook will include two specific tasks involving Generative AI, so you start to get a feel for appropriate use of it in this unit.**

### 🧠 Why This Matters for Machine Learning
Loops are essential when iterating through datasets, processing multiple inputs, or simulating training over several epochs.

## 🔁 For Loops
- Use `for` loops to iterate over sequences like lists, tuples, or arrays.
- Useful when you know how many items to process (e.g., model names or feature values).

In [None]:
# Example: Loop through a list of model names
models = ["Linear Regression", "SVM", "KNN"]

for model in models:
    print(f"Training {model}...")

### 🔢 Looping by Index
- You can use `range(len(list))` to access index positions.
- Useful when modifying values in place or when you need index numbers.
- This is also useful when you need to access elements from multiple lists simultaneously.

In [None]:
# Looping by index
models = ["Model A", "Model B", "Model C"]
accuracies = [0.91, 0.85, 0.88]

# range(len(models)) yields 0, 1, 2; i takes each of those values in turn as the loop runs
for i in range(len(models)):
    print(f"{models[i]} scored {accuracies[i]*100:.1f}% accuracy")
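
The point above about modifying values in place can be sketched as follows. This is a minimal illustration, not part of the original notebook: the `scores` list is a made-up copy of the accuracy values, used to contrast a plain `for` loop (which leaves the list unchanged) with an index-based loop (which overwrites each entry).

```python
# Hypothetical accuracy values stored as fractions between 0 and 1
scores = [0.91, 0.85, 0.88]
accuracies = [0.91, 0.85, 0.88]

# Looping over the values directly: assigning to the loop variable
# only rebinds that variable, so the list itself is not changed
for score in scores:
    score = score * 100

# Looping by index: assigning to accuracies[i] overwrites the list entry
for i in range(len(accuracies)):
    accuracies[i] = round(accuracies[i] * 100, 1)

print("Unchanged:", scores)
print("Modified in place:", accuracies)
```

The index is what lets the loop write back into the list, which is why `range(len(...))` is the usual choice when values need to be updated where they sit.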

### 🧮 Looping with NumPy Arrays
- NumPy arrays can be looped over like lists, but indexing is often preferred for numerical operations.
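- Prefer `.size` over `len()` here: `.size` always counts every element in the array, whereas `len()` only counts the first dimension (the two agree for 1-D arrays like the one below).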

In [None]:
import numpy as np

# Creating NumPy array
data = np.array([1.5, 2.3, 3.1, 4.0])

# Iterate over the NumPy array by index, using .size (rather than len(), as in the Python list example above)
for i in range(data.size):
    print(f"Value at index {i}: {data[i]}")
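
To see how `.size` and `len()` can differ, here is a small sketch using a 2D array. The `grid` and `flat` arrays are made up for illustration and do not appear elsewhere in this notebook.

```python
import numpy as np

# A hypothetical 2D array with 2 rows and 3 columns
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print("len(grid):", len(grid))     # length of the first dimension (rows)
print("grid.size:", grid.size)     # total number of elements
print("grid.shape:", grid.shape)   # (rows, columns)

# For a 1D array the two agree
flat = np.array([1.5, 2.3, 3.1, 4.0])
print("Same for 1D:", len(flat) == flat.size)
```

For the 1D arrays used in this notebook either would work, but the two values diverge as soon as an array has more than one dimension.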

### ❗ More Advanced Loop Example with NumPy Arrays

You can skip this unless you want to see a more advanced example for an additional challenge.

- You can loop through two NumPy arrays together to compute values like the **Euclidean distance** between two data points.
- The Euclidean distance is used in many clustering-based Machine Learning models, as well as for data analysis purposes.
- It measures how close two data points are to each other: the lower the value, the closer they are.
- In cluster analysis, data points that are all close together form one cluster, and this is how we can measure that closeness programmatically!

In [None]:
# Comparing two NumPy arrays and calculating Euclidean distance between two data points (that have 3 values each)
point1 = np.array([2.0, 3.0, 4.5])
point2 = np.array([1.0, 0.5, 4.0])

# Initialise a distance variable at 0; the loop below adds to it on each pass
distance = 0

# For loop, from 0 until the size of the arrays
for i in range(point1.size):
    diff = point1[i] - point2[i] # difference between the two points at this index (2-1, 3-0.5, 4.5-4)
    distance += diff ** 2        # square the difference so it is never negative, then add it to the running total

distance = np.sqrt(distance)     # finally, take the square root of the total (to undo the squaring inside the for loop)

print(f"Euclidean distance between the two data points: {distance:.3f}")
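
As a worked check of the loop above: the differences are 1.0, 2.5 and 0.5, their squares are 1.0, 6.25 and 0.25, the squares sum to 7.5, and the square root of 7.5 is roughly 2.739. The sketch below repeats the loop and compares it with two loop-free NumPy alternatives (`np.sum` on the whole array, and `np.linalg.norm`). The variable names are chosen for illustration; this is one possible way to cross-check the result, not the only one.

```python
import numpy as np

point1 = np.array([2.0, 3.0, 4.5])
point2 = np.array([1.0, 0.5, 4.0])

# Loop version, as in the cell above
squared_sum = 0.0
for i in range(point1.size):
    squared_sum += (point1[i] - point2[i]) ** 2
loop_distance = np.sqrt(squared_sum)

# Loop-free versions: NumPy subtracts and squares every element at once
vector_distance = np.sqrt(np.sum((point1 - point2) ** 2))
norm_distance = np.linalg.norm(point1 - point2)

print(f"Loop:           {loop_distance:.3f}")
print(f"np.sum version: {vector_distance:.3f}")
print(f"np.linalg.norm: {norm_distance:.3f}")
```

All three approaches should agree; the loop is the clearest way to see each step, while the NumPy versions are shorter once the idea is familiar.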

## 💡 Generative AI

Want to know more about Euclidean distance?
- This is a good example of something you can look up, or...
- Why not ask your **Generative AI** solution of choice to explain it in the way that suits you best?

Example prompt:
 > "I'm a University student learning about Euclidean distance. I'm a practical and visual learner. Explain what Euclidean distance is and how it works in a step by step manner, with examples and code (Python that I can copy into Google Colab). I am not comfortable with mathematical formulas, so if you include them, make sure to explain those in really simple terms. Really break it down into smaller chunks for me."

## 🔄 While Loops
- Use `while` loops when the number of iterations isn't fixed.
- Be careful to avoid infinite loops!

In [None]:
# Example: Simulate training until a threshold is reached
loss = 1.0
epoch = 0
while loss > 0.3:
    epoch += 1
    loss *= 0.8  # pretend loss reduces each epoch
    print(f"Epoch {epoch}: loss = {loss:.4f}")

### 🧬 Anatomy of a While Loop
- A `while` loop keeps running as long as the condition is `True`.
- Be sure that something in the loop eventually makes the condition `False`, or you'll create an **infinite loop**.

**Example:**
```python
count = 0
while count < 3:
    print("Looping...")
    count += 1  # This prevents infinite loop
```

In [None]:
# Infinite loop example (DO NOT RUN this!)
# count = 0
# while count < 3:
#     print("Looping...")
#     # Oops! count is never updated

# Corrected version
count = 0
while count < 3:
    print(f"Iteration {count}")
    count += 1
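
One common safeguard against infinite loops is to add a second condition that caps the number of iterations. The sketch below reuses the simulated training loop from earlier; `max_epochs` is an assumed safety limit introduced here for illustration, and the value 50 is arbitrary.

```python
loss = 1.0
epoch = 0
max_epochs = 50  # assumed safety limit, not part of the original example

# The loop stops when the loss is low enough OR the epoch cap is reached
while loss > 0.3 and epoch < max_epochs:
    epoch += 1
    loss *= 0.8  # pretend loss reduces each epoch

if loss <= 0.3:
    print(f"Stopped after {epoch} epochs: loss = {loss:.4f}")
else:
    print(f"Gave up after {max_epochs} epochs: loss = {loss:.4f}")
```

With the cap in place, the loop still ends even if the loss never falls below the threshold.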

## 🧭 Loop Control: break and continue
- `break` exits a loop early.
- `continue` skips the rest of the current iteration.
- These are useful for checking early stopping criteria or skipping bad data.

In [None]:
# Example: Use break and continue
for i in range(10):
    if i == 2:
        continue  # skip this round
    if i == 5:
        break     # exit loop early
    print(i)
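
The two uses mentioned above (skipping bad data and stopping early) can be sketched together. The `readings` list, the use of `None` to mark a missing value, and the `target` threshold are all assumptions made for this illustration.

```python
# Hypothetical loss readings; None marks a missing (bad) value
readings = [0.9, None, 0.6, 0.4, 0.25, 0.2]
target = 0.3  # assumed early-stopping threshold

kept = []
for value in readings:
    if value is None:
        continue          # skip bad data and move on to the next reading
    kept.append(value)
    if value <= target:
        break             # stop early once the target is reached

print("Readings processed:", kept)
```

Here the missing value is never added to `kept`, and the final reading is never visited because the loop exits as soon as one reading reaches the target.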

## 🎯 Tasks: Try it Yourself

1. Create a list of 5 machine learning model names and use a `for` loop to print each one, prefixed by 'Evaluating: '

In [None]:
models = ["Linear Regression", "SVM", "KNN", "Random Forest", "Naive Bayes"]

# Use a separate loop variable (model) so the models list is not overwritten
for model in models:
    print(f"Evaluating: {model}")

2. Given a list of loss values (e.g., `[0.9, 0.7, 0.5, 0.3]`), use a `for` loop to calculate and print the total loss (i.e., add up all the numbers).

In [None]:
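losses = [0.9, 0.7, 0.5, 0.3]

total = 0
for loss in losses:
    total += loss  # add each loss value to the running total

# Print once, after the loop, so only the final total is shown
print(f"Total loss: {total:.1f}")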

3. Use a `while` loop to simulate training that stops when accuracy reaches 90%. Start with accuracy = 60% and increase by 10% each loop.

In [None]:
accuracy = 60

while accuracy < 90:
    accuracy += 10
    print(f"Current accuracy: {accuracy}%")

## 💥 Mini Challenge
Write a `for` loop that prints accuracy values from a simulated training run: start at 60%, increase by 5% for 10 epochs. Stop printing if accuracy reaches 90%.

In [1]:
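accuracy = 60

for epoch in range(10):
    accuracy += 5
    if accuracy >= 90:
        print(f"Target accuracy reached: {accuracy}%")
        break
    print(f"Epoch {epoch + 1}: accuracy = {accuracy}%")  # print each value until the target is reached

Epoch 1: accuracy = 65%
Epoch 2: accuracy = 70%
Epoch 3: accuracy = 75%
Epoch 4: accuracy = 80%
Epoch 5: accuracy = 85%
Target accuracy reached: 90%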


## 💥 Extra Challenge
Ask Generative AI to create a chart that demonstrates Euclidean distance, written as Python code that you can copy into this notebook.

You may want to be specific in your prompt, or iterate on the results until you get something that shows:
* a 2D chart
* 2 data points (A and B)
* a straight line between them
* the numeric value calculated for the Euclidean distance between A and B

**PS:** You may already have this from earlier if you asked Generative AI about Euclidean distance.

## 🤔 Reflection
- How can loops help reduce repetition in your code?
- What risks do you need to manage when using `while` loops in ML pipelines?

## ✅ Solutions (Click to Expand)

In [None]:
# Task 1
models = ["Linear Regression", "SVM", "KNN", "Random Forest", "Naive Bayes"]
for model in models:
    print(f"Evaluating: {model}")

# Task 2
losses = [0.9, 0.7, 0.5, 0.3]
total_loss = 0
for loss in losses:
    total_loss += loss
print("Total loss:", total_loss)

# Task 3
accuracy = 60
while accuracy < 90:
    print(f"Current accuracy: {accuracy}%")
    accuracy += 10
print("Reached target accuracy!")

In [None]:
# Mini Challenge
accuracy = 60
for epoch in range(10):
    if accuracy >= 90:
        break
    print(f"Epoch {epoch+1}: accuracy = {accuracy}%")
    accuracy += 5

## ✅ Euclidean Distance Visualisation (Click to Expand)

In [None]:
import matplotlib.pyplot as plt
import math

# Define the points
pointA = (2, 3)
pointB = (7, 6)

# Calculate Euclidean distance
x_diff = pointB[0] - pointA[0]
y_diff = pointB[1] - pointA[1]
distance = math.sqrt(x_diff**2 + y_diff**2)

# Plot the points and line
x_values = [pointA[0], pointB[0]]
y_values = [pointA[1], pointB[1]]

plt.plot(x_values, y_values, marker='o', linestyle='-')
plt.text(pointA[0], pointA[1], f'A ({pointA[0]},{pointA[1]}) ', fontsize=12, ha='right', va='bottom')
plt.text(pointB[0], pointB[1], f' B ({pointB[0]},{pointB[1]})', fontsize=12, ha='left', va='bottom')

# Add distance label near the midpoint (shifted right and down so it does not sit on the line)
mid_x = (pointA[0] + pointB[0]) / 2 + 2
mid_y = (pointA[1] + pointB[1]) / 2 - 0.5
plt.text(mid_x, mid_y, f'Distance ≈ {distance:.2f}', fontsize=12, ha='center', va='bottom', color='blue')

# Set axis labels and title
plt.title('Euclidean Distance: Point A to Point B')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')

# Set explicit axis ranges
plt.xlim(0, 10)
plt.ylim(0, 10)

plt.grid(True)
plt.show()