# While Loops — Intermediate Python

---

## The `while` loop

A `while` loop repeats its body **as long as its condition is `True`**.  
The body must change something the condition depends on, otherwise the loop can run forever.

---

### Example: dividing error until ≤ 1

```python
error = 50.0

while error > 1:
    error = error / 4
    print(error)
```

Output:
```
12.5
3.125
0.78125
```

Explanation:
✅ Start with `error = 50`  
✅ Each time, divide `error` by 4  
✅ Loop continues as long as `error > 1`  
✅ Stops when `error <= 1`

---

## Step by step

### First iteration:
```python
error = 50.0
while error > 1:
    error = error / 4
    print(error)
```

Output:
```
12.5
```

---

### Second iteration:
Now `error = 12.5`, still > 1 → loop runs again:
```
3.125
```

---

### Third iteration:
Now `error = 3.125`, still > 1 → loop runs again:
```
0.78125
```

---

### Fourth check (no fourth iteration):
Now `error = 0.78125`, so `error > 1` is `False`. The body does not run again and the loop exits.

Final output sequence:
```
12.5
3.125
0.78125
```
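
To make the difference between condition checks and executed bodies explicit, here is a minimal sketch that logs every check. The `checks` counter, the `keep_going` name, and the message format are illustrative additions, not part of the original example.

```python
error = 50.0
checks = 0  # hypothetical counter: how often the condition is evaluated

while True:
    checks += 1
    keep_going = error > 1
    print(f"check {checks}: error = {error}, error > 1 is {keep_going}")
    if not keep_going:
        break  # the condition failed, so the body is skipped
    error = error / 4

print("body ran", checks - 1, "times")
```

The condition is evaluated four times, but the body runs only three times, which matches the three printed values above.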

---

## ⚠️ Infinite loop

If nothing inside the loop changes the variable used in the condition, the condition stays `True` and the loop runs forever.

```python
error = 50.0

while error > 1:
    # forgot to update error here
    print(error)
```

Output:
```
50.0
50.0
50.0
...
```

You must stop it manually (Ctrl+C in terminal or stop the kernel).
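
One defensive pattern, sketched here under the assumption that a fixed upper bound on iterations is acceptable, is to add a safety cap to the condition. The names `max_iterations` and `iterations` are hypothetical and chosen only for illustration.

```python
error = 50.0
max_iterations = 5  # hypothetical safety cap, chosen for illustration
iterations = 0

while error > 1 and iterations < max_iterations:
    # error is deliberately left unchanged to mimic the bug above
    print(error)
    iterations += 1

if error > 1:
    print("stopped after", iterations, "iterations without reaching error <= 1")
```

The cap does not fix the bug, but it turns a hang into a visible message.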

---

## Summary
✅ Use `while` when you don’t know in advance how many times to loop.  
✅ Make sure the condition eventually becomes False.  
✅ Update your loop variable inside the loop to avoid infinite loops.

---



# For Loops — Intermediate Python


## The `for` loop

General syntax:
```python
for var in seq:
    expression
```

Meaning:  
*"For each `var` in `seq`, execute `expression`."*
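
In this pattern `seq` does not have to be a list. As a small sketch (the variable names `letters` and `squares` are made up for this example), the same syntax works on a string and on `range()`:

```python
# A string yields its characters one at a time
letters = []
for ch in "fam":
    letters.append(ch)
print(letters)

# range(3) yields the integers 0, 1, 2
squares = []
for n in range(3):
    squares.append(n ** 2)
print(squares)
```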

---

## Example: list of heights

```python
fam = [1.73, 1.68, 1.71, 1.89]
print(fam)
```

Output:
```
[1.73, 1.68, 1.71, 1.89]
```

---

### Without a loop: accessing each element manually

```python
fam = [1.73, 1.68, 1.71, 1.89]
print(fam[0])
print(fam[1])
print(fam[2])
print(fam[3])
```

Output:
```
1.73
1.68
1.71
1.89
```

---

## Using a `for` loop: cleaner and more scalable

```python
fam = [1.73, 1.68, 1.71, 1.89]

for height in fam:
    print(height)
```

Output:
```
1.73
1.68
1.71
1.89
```

---

### Step-by-step explanation

**First iteration:**
```python
height = 1.73
print(height)
```
Output:  
`1.73`

---

**Second iteration:**
```python
height = 1.68
print(height)
```
Output:
```
1.73
1.68
```

---

**Third iteration:**
```python
height = 1.71
print(height)
```
Output:
```
1.73
1.68
1.71
```

---

**Fourth iteration:**
```python
height = 1.89
print(height)
```
Output:
```
1.73
1.68
1.71
1.89
```

---

### Notes:
✅ When looping like this, you only get the **values**, not their **indexes**.

---

## Getting both index and value with `enumerate`

Use `enumerate()` when you also want the **index**.

```python
fam = [1.73, 1.68, 1.71, 1.89]

for index, height in enumerate(fam):
    print("index " + str(index) + ": " + str(height))
```

Output:
```
index 0: 1.73
index 1: 1.68
index 2: 1.71
index 3: 1.89
```
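
As a variation on the loop above, `enumerate()` accepts a `start` argument, and an f-string avoids the repeated `str()` calls. This is only a sketch: the `person` label and the `lines` list are illustrative choices, not part of the original example.

```python
fam = [1.73, 1.68, 1.71, 1.89]

lines = []
for position, height in enumerate(fam, start=1):
    # position counts from 1 because of start=1
    lines.append(f"person {position}: {height}")

print("\n".join(lines))
```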

---

## Summary
✅ Without a loop → need to manually access each index.  
✅ With a `for` loop → iterate over values automatically.  
✅ With `enumerate()` → iterate over both index and value.

---


# Exercises

```python
# Loop over list of lists
# Remember the house variable from the Intro to Python course? Have a look at
# its definition in the script. It's basically a list of lists, where each
# sublist contains the name and area of a room in your house.
#
# It's up to you to build a for loop from scratch this time!
#
# Instructions
# Write a for loop that goes through each sublist of house and prints out
# "the x is y sqm", where x is the name of the room and y is the area of the room.

# house list of lists
house = [["hallway", 11.25],
         ["kitchen", 18.0],
         ["living room", 20.0],
         ["bedroom", 10.75],
         ["bathroom", 9.50]]

# Build a for loop from scratch
for x, y in house:
    print("the " + str(x) + " is " + str(y) + " sqm")
```


# Looping over Data Structures — Part 1

---

## 📖 Introduction
When working with Python data structures like dictionaries and NumPy arrays, you often need to iterate over their contents to process or display data.  
Here we learn how to loop:
- Over **dictionaries** (key-value pairs)
- Over **1D NumPy arrays** (element by element)
- Over **2D NumPy arrays** (rows and then elements)

---

## 📝 Looping over a Dictionary

When looping over a dictionary:
- By default, you only get the **keys**.
- If you write `for key, value in world:`, Python tries to unpack each key string into two names, which raises an error for these keys.

### ❌ Incorrect Way
```python
world = {
    "afghanistan": 30.55,
    "albania": 2.77,
    "algeria": 39.21
}

for key, value in world:
    print(key + " -- " + str(value))
```

Output:
```
ValueError: too many values to unpack (expected 2)
```

Explanation:
- Iterating directly over `world` gives only the keys.
- You can’t unpack keys into `key, value`.
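
The following sketch shows what the loop actually receives. The `keys_seen` and `outcome` names are illustrative, and the two-character key `"ab"` is a hypothetical edge case added to show that the error depends on the key length.

```python
world = {"afghanistan": 30.55, "albania": 2.77, "algeria": 39.21}

# Iterating over the dict itself yields only the keys
keys_seen = [item for item in world]
print(keys_seen)

# Unpacking a key means unpacking its characters into key and value
try:
    for key, value in world:
        pass
    outcome = "no error"
except ValueError as exc:
    outcome = str(exc)
print(outcome)

# Hypothetical edge case: a two-character key unpacks silently
for first, second in {"ab": 1}:
    print(first, second)
```

The last loop is the dangerous case: it raises no error, yet `second` holds a character of the key rather than the dictionary value.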

---

### ✅ Correct Way: use `.items()`

```python
world = {
    "afghanistan": 30.55,
    "albania": 2.77,
    "algeria": 39.21
}

for key, value in world.items():
    print(key + " -- " + str(value))
```

Output:
```
afghanistan -- 30.55
albania -- 2.77
algeria -- 39.21
```

### Shorthand Variable Names
You can use `k, v` instead of `key, value`:
```python
for k, v in world.items():
    print(k + " -- " + str(v))
```

Output:
```
afghanistan -- 30.55
albania -- 2.77
algeria -- 39.21
```
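
When only one side of each pair is needed, the dictionary methods `.keys()` and `.values()` can be looped over in the same way. A minimal sketch, in which `total` is an illustrative accumulator:

```python
world = {"afghanistan": 30.55, "albania": 2.77, "algeria": 39.21}

# Keys only (equivalent to looping over the dict directly)
for k in world.keys():
    print(k)

# Values only
total = 0.0
for v in world.values():
    total += v
print(round(total, 2))
```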

---

## 📝 Looping over a 1D NumPy Array

We can use a simple `for` loop to iterate element by element.

```python
import numpy as np

np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])

bmi = np_weight / np_height ** 2

for val in bmi:
    print(val)
```

Output (rounded to three decimals; the actual printout shows more digits):
```
21.852
20.975
21.750
24.747
21.441
```
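
Printing each element of `bmi` directly shows full floating-point precision. One way to get short values like those listed is to round the whole array once with `np.round()` before looping. This is a sketch, and `bmi_rounded` is a name introduced here.

```python
import numpy as np

np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])
bmi = np_weight / np_height ** 2

# Round once, outside the loop, then iterate element by element
bmi_rounded = np.round(bmi, 3)
for val in bmi_rounded:
    print(val)
```

Rounding the array in one call keeps the arithmetic in NumPy and leaves the loop responsible only for printing.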

---

## 📝 Looping over a 2D NumPy Array (by rows)

When working with a 2D array (matrix), looping like this gives you each row:

```python
meas = np.array([np_height, np_weight])

for val in meas:
    print(val)
```

Output:
```
[1.73 1.68 1.71 1.89 1.79]
[65.4 59.2 63.6 88.4 68.7]
```

---

## 📝 Looping over a 2D NumPy Array (element by element)

To iterate **element by element**, use `np.nditer()`:

```python
for val in np.nditer(meas):
    print(val)
```

Output:
```
1.73
1.68
1.71
1.89
1.79
65.4
59.2
63.6
88.4
68.7
```
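
For comparison, here is a sketch of two other ways to visit every element of a 2D array: nested loops and the array's `.flat` iterator. The array is rebuilt inline so the block runs on its own, and the list names are illustrative.

```python
import numpy as np

meas = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
                 [65.4, 59.2, 63.6, 88.4, 68.7]])

# Option 1: nested loops, rows first and then the elements of each row
nested = []
for row in meas:
    for val in row:
        nested.append(float(val))

# Option 2: the .flat iterator walks the array element by element
flat = [float(val) for val in meas.flat]

# Option 3: np.nditer, as in the example above
via_nditer = [float(val) for val in np.nditer(meas)]

print(nested == flat == via_nditer)
```

For this array all three approaches visit the same ten values in the same order.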

---

## 🔁 Recap Table

| Data Structure             | How to loop |
|----------------------------|-------------|
| **Dictionary (keys & values)** | `for key, val in my_dict.items():` |
| **1D NumPy array (elements)**  | `for val in my_array:` |
| **2D NumPy array (rows)**      | `for val in my_array:` |
| **2D NumPy array (elements)**  | `for val in np.nditer(my_array):` |

---

## 🌟 Summary
✅ For **dictionaries**, use `.items()` to get both keys and values.  
✅ For **1D arrays**, a simple `for val` works.  
✅ For **2D arrays**, loop gives rows → use `np.nditer()` for elements.  
✅ Always know what your loop gives back — keys, values, rows, or elements.


# Exercises

```python
# Loop over NumPy array
# If you're dealing with a 1D NumPy array, looping over all elements can be as simple as:
#
#     for x in my_array:
#         ...
#
# If you're dealing with a 2D NumPy array, it's more complicated. A 2D array is
# built up of multiple 1D arrays. To explicitly iterate over all separate
# elements of a multi-dimensional array, you'll need this syntax:
#
#     for x in np.nditer(my_array):
#         ...
#
# Two NumPy arrays that you might recognize from the intro course are available
# in your Python session: np_height, a NumPy array containing the heights of
# Major League Baseball players, and np_baseball, a 2D NumPy array that contains
# both the heights (first column) and weights (second column) of those players.
#
# Instructions
# - Import the numpy package under the local alias np.
# - Write a for loop that iterates over all elements in np_height and prints out
#   "x inches" for each element, where x is the value in the array.
# - Write a for loop that visits every element of the np_baseball array and prints it out.

# Import numpy as np
import numpy as np

# np_height and np_baseball are preloaded in the exercise session (see above)

# For loop over np_height (1D, so a plain for loop is enough)
for x in np_height:
    print(str(x) + " inches")

# For loop over np_baseball (2D, so np.nditer visits every element)
for y in np.nditer(np_baseball):
    print(y)
```

# Looping over Data Structures — Part 2 (DataFrames)



---

## 📖 Introduction
In this part, we focus on **looping over a Pandas DataFrame**.  
You’ll see:
- What happens when you loop directly
- How to use `.iterrows()`
- How to access specific elements
- How to create new columns inside a loop (and why it’s inefficient)
- How to use `.apply()` instead (faster & cleaner)

---

## 📝 Load the DataFrame

We use the `brics.csv` dataset:
```python
import pandas as pd

brics = pd.read_csv("brics.csv", index_col=0)
print(brics)
```

Output:
```
           country    capital    area  population
BR          Brazil   Brasilia   8.516      200.40
RU          Russia     Moscow  17.100      143.50
IN           India  New Delhi   3.286     1252.00
CH           China    Beijing   9.597     1357.00
SA    South Africa   Pretoria   1.221       52.98
```

---

## 🚫 First try: Looping directly over `brics`

```python
for val in brics:
    print(val)
```

Output:
```
country
capital
area
population
```

Explanation:
- When you loop over a DataFrame directly, you only iterate over its **columns (labels)** — not the rows or the data.
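
The CSV file is not needed to see this behaviour. The sketch below builds an in-memory stand-in for `brics` from the values printed earlier (an assumption about how the file is laid out) and also shows `DataFrame.items()`, which pairs each column label with its column.

```python
import pandas as pd

# In-memory stand-in for brics.csv, using the values shown in the printout
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.100, 3.286, 9.597, 1.221],
        "population": [200.40, 143.50, 1252.00, 1357.00, 52.98],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Looping over the DataFrame yields the column labels
labels = [val for val in brics]
print(labels)

# .items() yields (column label, column Series) pairs
for col_name, column in brics.items():
    print(col_name, len(column))
```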

---

## ✅ Correct way: `.iterrows()`

Use `.iterrows()` to iterate over rows:
- Returns `(label, row)` for each row.
- `label` is the index label.
- `row` is a Pandas Series containing the row data.

```python
for lab, row in brics.iterrows():
    print(lab)
    print(row)
```

Sample Output:
```
BR
country         Brazil
capital       Brasilia
area             8.516
population       200.4
Name: BR, dtype: object

RU
country         Russia
capital         Moscow
area             17.1
population       143.5
Name: RU, dtype: object
...
```

---

## 🎯 Selective printing: print only the capital city

You can access specific columns from each row:
```python
for lab, row in brics.iterrows():
    print(lab + ": " + row["capital"])
```

Output:
```
BR: Brasilia
RU: Moscow
IN: New Delhi
CH: Beijing
SA: Pretoria
```
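
An alternative that the text above does not cover is `DataFrame.itertuples()`, which yields one namedtuple per row instead of a Series. The sketch below reproduces the same printout on a hypothetical two-column stand-in for `brics`.

```python
import pandas as pd

# Reduced in-memory stand-in for brics (only the columns used here)
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

lines = []
for row in brics.itertuples():
    # row.Index is the row label; each column becomes an attribute
    lines.append(row.Index + ": " + row.capital)

print("\n".join(lines))
```

Attribute access only works for column names that are valid Python identifiers, so this is a convenience rather than a drop-in replacement for `.iterrows()`.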

---

## ✍️ Adding a column inside the loop

You can create a new column and populate it row by row. This works, but `.iterrows()` creates a new Series on every iteration, so it is inefficient for large datasets.

Here we compute the length of each country name:
```python
for lab, row in brics.iterrows():
    brics.loc[lab, "name_length"] = len(row["country"])

print(brics)
```

Output:
```
           country    capital    area  population  name_length
BR          Brazil   Brasilia   8.516      200.40          6
RU          Russia     Moscow  17.100      143.50          6
IN           India  New Delhi   3.286     1252.00          5
CH           China    Beijing   9.597     1357.00          5
SA    South Africa   Pretoria   1.221       52.98         12
```

---

## ⚡ Better way: use `.apply()`

Instead of a loop, call `.apply()` on the column: it passes every country name to `len` without building a Series per row, so the code is shorter and usually faster.

```python
brics["name_length"] = brics["country"].apply(len)
print(brics)
```

Output:
```
           country    capital    area  population  name_length
BR          Brazil   Brasilia   8.516      200.40          6
RU          Russia     Moscow  17.100      143.50          6
IN           India  New Delhi   3.286     1252.00          5
CH           China    Beijing   9.597     1357.00          5
SA    South Africa   Pretoria   1.221       52.98         12
```
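
For string columns specifically, pandas also offers the `.str` accessor. The sketch below, on a hypothetical one-column stand-in for `brics`, checks that `.str.len()` gives the same lengths as `.apply(len)`. The column name `name_length_str` is made up for the comparison.

```python
import pandas as pd

# Reduced in-memory stand-in for brics (only the country column)
brics = pd.DataFrame(
    {"country": ["Brazil", "Russia", "India", "China", "South Africa"]},
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Approach from the text: call len on every value
brics["name_length"] = brics["country"].apply(len)

# Alternative: the string accessor computes the lengths for the whole column
brics["name_length_str"] = brics["country"].str.len()

print(brics)
print((brics["name_length"] == brics["name_length_str"]).all())
```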

---

## 🔁 Summary Table

| Approach                          | What it does |
|----------------------------------|--------------|
| `for val in df:`                 | Iterates over column names |
| `for lab, row in df.iterrows():` | Iterates over rows as (label, Series) |
| Selective print inside loop      | Access row elements by column name |
| Adding column inside loop        | Works, but slow on big data |
| `.apply()`                       | Calls a function on each value of a column; preferred over a row loop |

---

## 🌟 Best Practices:
✅ Use `.iterrows()` only when you really need to process rows one by one.  
✅ Use `.apply()` to call a function on every value in a column; it is usually faster than `.iterrows()`, although it is not truly vectorized.  
✅ Avoid adding columns inside a loop if you can use `.apply()` instead.


# Exercises

```python
# Loop over DataFrame (1)
# Iterating over a Pandas DataFrame is typically done with the iterrows() method.
# Used in a for loop, every observation is iterated over and on every iteration
# the row label and actual row contents are available:
#
#     for lab, row in brics.iterrows():
#         ...
#
# In this and the following exercises you will be working on the cars DataFrame.
# It contains information on the cars per capita and whether people drive right
# or left for seven countries in the world.
#
# Instructions
# Write a for loop that iterates over the rows of cars and on each iteration
# performs two print() calls: one to print out the row label and one to print
# out all of the row's contents.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col=0)

# Iterate over rows of cars
for lab, row in cars.iterrows():
    print(lab)
    print(row)
```


```python
# Loop over DataFrame (2)
# The row data that's generated by iterrows() on every run is a Pandas Series.
# This format is not very convenient to print out. Luckily, you can easily
# select variables from the Pandas Series using square brackets:
#
#     for lab, row in brics.iterrows():
#         print(row['country'])
#
# Instructions
# - Using the iterators lab and row, adapt the code in the for loop such that
#   the first iteration prints out "US: 809", the second iteration "AUS: 731",
#   and so on.
# - The output should be in the form "country: cars_per_cap". Make sure to print
#   out this exact string (with the correct spacing).
# - You can use str() to convert your integer data to a string so that you can
#   print it in conjunction with the country label.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col=0)

# Adapt for loop
for lab, row in cars.iterrows():
    print(str(lab) + ': ' + str(row['cars_per_cap']))
```


```python
# Add column (1)
# In the video, Hugo showed you how to add the length of the country names of
# the brics DataFrame in a new column:
#
#     for lab, row in brics.iterrows():
#         brics.loc[lab, "name_length"] = len(row["country"])
#
# You can do similar things on the cars DataFrame.
#
# Instructions
# - Use a for loop to add a new column, named COUNTRY, that contains an
#   uppercase version of the country names in the "country" column. You can use
#   the string method upper() for this.
# - To see if your code worked, print out cars. Don't indent this code, so that
#   it's not part of the for loop.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col=0)

# Code for loop that adds COUNTRY column
for lab, row in cars.iterrows():
    # row already holds this row's data, so there is no need to look it up again
    cars.loc[lab, 'COUNTRY'] = row['country'].upper()

# Print cars
print(cars)
```

```python
# Add column (2)
# Using iterrows() to iterate over every observation of a Pandas DataFrame is
# easy to understand, but not very efficient. On every iteration, you're
# creating a new Pandas Series.
#
# If you want to add a column to a DataFrame by calling a function on another
# column, the iterrows() method in combination with a for loop is not the
# preferred way to go. Instead, you'll want to use apply().
#
# Compare the iterrows() version with the apply() version to get the same
# result in the brics DataFrame:
#
#     for lab, row in brics.iterrows():
#         brics.loc[lab, "name_length"] = len(row["country"])
#
#     brics["name_length"] = brics["country"].apply(len)
#
# We can do a similar thing to call the upper() method on every name in the
# country column. However, upper() is a method, so we'll need a slightly
# different approach.
#
# Instructions
# - Replace the for loop with a one-liner that uses .apply(str.upper). The call
#   should give the same result: a column COUNTRY should be added to cars,
#   containing an uppercase version of the country names.
# - As usual, print out cars to see the fruits of your hard labor.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col=0)

# Use .apply(str.upper)
cars["COUNTRY"] = cars["country"].apply(str.upper)

# Print cars
print(cars)
```