
# 🔋 Load Shedding Data Exploration with Python Loops

This notebook explores a small dataset of load shedding information for five South African cities using **`for` loops** and **`while` loops**.

Each question below is answered with explicit iteration and control flow.



## 📥 Load the Data


In [4]:

import pandas as pd

# Load CSV
df = pd.read_csv("../../data/loadshedding_data.csv")
df


| Unnamed: 0 | City | Stage | Hours_Off_Per_Day | Population |
|---|---|---|---|---|
| 0 | Cape Town | 2 | 4 | 433688 |
| 1 | Joburg | 3 | 6 | 957441 |
| 2 | Durban | 1 | 2 | 595061 |
| 3 | Pretoria | 4 | 8 | 741651 |
| 4 | Bloemfontein | 2 | 3 | 256185 |



## ❓ Q1: Print the hours off per day for each city using a `for` loop.


In [5]:

# df has the default 0..n-1 index, so the loop counter doubles as the row label for .loc
for i in range(len(df)):
    city = df.loc[i, "City"]
    hours = df.loc[i, "Hours_Off_Per_Day"]
    print(f"{city} has {hours} hours of power outage per day.")


Cape Town has 4 hours of power outage per day.
Joburg has 6 hours of power outage per day.
Durban has 2 hours of power outage per day.
Pretoria has 8 hours of power outage per day.
Bloemfontein has 3 hours of power outage per day.
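
The same output can be produced without indexing by row number. The sketch below is one possible alternative, not part of the original notebook: it rebuilds the five rows shown in the table in a small DataFrame (`sample_df`, a hypothetical stand-in for the CSV so the cell runs on its own) and iterates with `DataFrame.itertuples`, which yields one named tuple per row.

```python
import pandas as pd

# Stand-in for the CSV: the five rows displayed earlier in the notebook.
sample_df = pd.DataFrame({
    "City": ["Cape Town", "Joburg", "Durban", "Pretoria", "Bloemfontein"],
    "Stage": [2, 3, 1, 4, 2],
    "Hours_Off_Per_Day": [4, 6, 2, 8, 3],
    "Population": [433688, 957441, 595061, 741651, 256185],
})

messages = []
# Each row arrives as a named tuple, so columns are read as attributes.
for row in sample_df.itertuples(index=False):
    messages.append(
        f"{row.City} has {row.Hours_Off_Per_Day} hours of power outage per day."
    )

for message in messages:
    print(message)
```

Because no row labels are involved, this form keeps working if the DataFrame's index is not 0, 1, 2, and so on.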



## ❓ Q2: Use a `while` loop to print cities with Stage >= 3.


In [6]:

i = 0
while i < len(df):
    if df.loc[i, "Stage"] >= 3:
        print(f"{df.loc[i, 'City']} is in Stage {df.loc[i, 'Stage']}")
    i += 1


Joburg is in Stage 3
Pretoria is in Stage 4



## ❓ Q3: Calculate economic loss per city using a `for` loop

Assume each stage costs **R50 per person per day**, so a city's estimated daily loss is `stage * 50 * population`.


In [7]:

for i in range(len(df)):
    stage = df.loc[i, "Stage"]
    population = df.loc[i, "Population"]
    city = df.loc[i, "City"]
    cost = stage * 50 * population
    print(f"Estimated economic loss in {city}: R{cost:,}")


Estimated economic loss in Cape Town: R43,368,800
Estimated economic loss in Joburg: R143,616,150
Estimated economic loss in Durban: R29,753,050
Estimated economic loss in Pretoria: R148,330,200
Estimated economic loss in Bloemfontein: R25,618,500
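
The cost assumption can also be kept in one place by wrapping it in a small function. This is a minimal sketch under the notebook's stated R50 assumption; the names `RATE_PER_STAGE`, `daily_loss`, and `cities` are hypothetical, and the rows are copied from the table above rather than read from the CSV.

```python
RATE_PER_STAGE = 50  # rand per person per day, per stage (the notebook's assumption)


def daily_loss(stage: int, population: int, rate: int = RATE_PER_STAGE) -> int:
    """Return the estimated daily economic loss in rand for one city."""
    return stage * rate * population


# (city, stage, population) copied from the table above.
cities = [
    ("Cape Town", 2, 433688),
    ("Joburg", 3, 957441),
    ("Durban", 1, 595061),
    ("Pretoria", 4, 741651),
    ("Bloemfontein", 2, 256185),
]

losses = {city: daily_loss(stage, population) for city, stage, population in cities}

for city, loss in losses.items():
    print(f"Estimated economic loss in {city}: R{loss:,}")
```

Changing the assumed rate then only requires passing a different `rate` argument, for example `daily_loss(2, 433688, rate=75)`.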



## ❓ Q4: Use a `while` loop to find the city with the highest stage.


In [8]:

i = 0
max_stage = -1
max_city = ""

while i < len(df):
    if df.loc[i, "Stage"] > max_stage:
        max_stage = df.loc[i, "Stage"]
        max_city = df.loc[i, "City"]
    i += 1

print(f"The city with the highest load shedding stage is {max_city} (Stage {max_stage})")


The city with the highest load shedding stage is Pretoria (Stage 4)
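
For comparison, Python's built-in `max` performs the same single pass when given a `key` function. The sketch below assumes the city and stage values from the table above, copied into a hypothetical `city_stages` list so it runs without the CSV.

```python
# (city, stage) pairs copied from the table above.
city_stages = [
    ("Cape Town", 2),
    ("Joburg", 3),
    ("Durban", 1),
    ("Pretoria", 4),
    ("Bloemfontein", 2),
]

# max() compares items by the stage (second element of each pair).
max_city, max_stage = max(city_stages, key=lambda pair: pair[1])

print(f"The city with the highest load shedding stage is {max_city} (Stage {max_stage})")
```

If two cities share the highest stage, `max` returns the first one it encounters, which matches the strict `>` comparison in the `while` loop.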


## ❓ Q5: Calculate the average hours off per day for cities in Stage >= 2 using a `for` loop


In [9]:

total_hours = 0
count = 0

for i in range(len(df)):
    if df.loc[i, "Stage"] >= 2:
        total_hours += df.loc[i, "Hours_Off_Per_Day"]
        count += 1

if count > 0:
    print("Average hours off (Stage >= 2):", total_hours / count)

Average hours off (Stage >= 2): 5.25
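
The filter-then-average pattern can be sketched more compactly with a list comprehension and `statistics.mean` from the standard library. The helper name `average_hours` and the `rows` list are hypothetical; the values are copied from the table above. Returning `None` for an empty selection plays the role of the `count > 0` check.

```python
from statistics import mean

# (stage, hours_off_per_day) pairs copied from the table above.
rows = [(2, 4), (3, 6), (1, 2), (4, 8), (2, 3)]


def average_hours(rows, min_stage):
    """Average hours off for rows at or above min_stage, or None if none match."""
    hours = [hours_off for stage, hours_off in rows if stage >= min_stage]
    # mean() raises StatisticsError on an empty list, so guard against it.
    return mean(hours) if hours else None


print("Average hours off (Stage >= 2):", average_hours(rows, 2))
```

With a threshold that no city reaches, such as `average_hours(rows, 5)`, the function returns `None` instead of dividing by zero.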


## ❓ Q6: Count how many cities are in each stage

In [10]:

stage_counts = {}

for i in range(len(df)):
    # Convert the NumPy integer to a plain int so the dictionary keys print cleanly
    stage = int(df.loc[i, "Stage"])
    stage_counts[stage] = stage_counts.get(stage, 0) + 1

print("City count per stage:", stage_counts)


City count per stage: {2: 2, 3: 1, 1: 1, 4: 1}
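
The `dict.get(key, 0) + 1` counting pattern is common enough that the standard library provides `collections.Counter` for it. A minimal sketch, assuming the `Stage` values from the table above copied into a hypothetical `stages` list:

```python
from collections import Counter

# Stage column copied from the table above, in row order.
stages = [2, 3, 1, 4, 2]

# Counter tallies each distinct value in one pass over the list.
stage_counts = Counter(stages)

print("City count per stage:", dict(stage_counts))
```

A `Counter` returns 0 for keys it has never seen, so `stage_counts[5]` is safe to read even though no city is in Stage 5.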
