# **Week 1: Introduction to Python**

Welcome to the first session of Introduction to Python for Social Data Science. This notebook will guide you through the foundational concepts of Python programming.

## **What We Will Learn Today**

By the end of this session, you will be able to:

1. **Store and manipulate data** using variables and basic data types
2. **Perform calculations** using Python's arithmetic and comparison operators
3. **Work with text** using strings and formatted output
4. **Organise collections of data** using lists
5. **Make decisions in code** using conditional statements
6. **Automate repetitive tasks** using loops
7. **Write reusable code** using functions

---

### **Why These Concepts Matter**

These concepts are the backbone of virtually all programming. Whatever you do with Python, you will use these building blocks constantly.

Think of today's session as learning the grammar of a new language. Once you understand how variables, loops, and functions work, libraries like NumPy (Week 2), Pandas (Week 3), and scikit-learn (Week 6) become tools you can use with ease rather than mysteries to decode.

---

### **How Today's Concepts Connect**

```
Variables & Types
       ↓
   Operators  →  Strings
       ↓
     Lists
       ↓
  Conditionals
       ↓
     Loops
       ↓
   Functions  →  [Week 2: NumPy, Week 3: Pandas, ...]
```

Each concept builds on the previous ones. By the final exercise, you'll combine all of them to write a small but complete data analysis program.

---

### **A Note on Pace**

Some of you have programming experience in R or other languages; others are starting fresh. Both are welcome here.

- **Main exercises** are designed for everyone to complete
- **Extra exercises** (marked with ⭐) offer additional challenge for those who finish early

If you finish early, try the extra exercises or help a neighbour. If you are finding things difficult, focus on the main exercises. That's what matters.

---

# **0. Python Essentials: Objects and Syntax**

Before we dive in, two key ideas will help everything else make sense.

### **Everything in Python is an Object**

Numbers, text, lists, even functions—Python treats them all as *objects*. Each object has:
- A **type** (what kind of thing it is)
- A **value** (the data it holds)
- **Methods** (actions it can perform)

You don't need to understand object-oriented programming deeply right now. Just know that when you see something like `my_text.upper()`, you're asking the object `my_text` to perform its `upper` action.
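
As a rough illustration of type, value, and methods, the sketch below inspects a single string object. The variable name `my_text` and its contents are made up for this example.

```python
# Illustrative only: `my_text` is a made-up example variable
my_text = "social data"

text_type = type(my_text)      # the object's type
text_upper = my_text.upper()   # calling a method returns a new string

print(text_type)    # <class 'str'>
print(my_text)      # social data
print(text_upper)   # SOCIAL DATA
```

Notice that `my_text` itself still holds the lowercase text after `.upper()` is called.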

We will explore objects and classes properly in **Week 5**.

### **Python Cares About Indentation**

Unlike many languages, Python uses indentation (spaces at the start of a line) to define code structure. This will matter when we reach conditionals, loops, and functions.

```python
# Correct
if x > 0:
    print("positive")  # 4 spaces indent

# Wrong - will cause an error
if x > 0:
print("positive")  # no indent
```
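
To make this concrete, here is a small sketch (the variable names are illustrative) in which the indented lines belong to the `if` block, while the line back at the left margin runs regardless of the condition.

```python
x = -3
messages = []

if x > 0:
    messages.append("positive")      # indented: runs only when x > 0
    messages.append("inside block")  # same indentation, same block
messages.append("after block")       # not indented: always runs

print(messages)  # ['after block']
```

Changing `x` to a positive number would put all three messages in the list.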

---

# **1. Variables and Data Types**

Variables are names that point to values stored in your computer's memory. Think of them as labels on boxes.

### **Creating Variables**

Use the `=` sign to assign a value to a variable name:

In [None]:
country = "Germany"
population = 83.2  # millions
gdp = 4072.6  # billions USD
is_democracy = True

print(country)
print(population)

### **Basic Data Types**

| Type | Description | Example | Check with |
|------|-------------|---------|------------|
| `int` | Whole numbers | `42`, `-7`, `0` | `type(42)` → `int` |
| `float` | Decimal numbers | `3.14`, `-0.001` | `type(3.14)` → `float` |
| `str` | Text (strings) | `"hello"`, `'world'` | `type("hi")` → `str` |
| `bool` | True/False values | `True`, `False` | `type(True)` → `bool` |

You can check any value's type using the `type()` function:

In [None]:
print(type(country))  # str
print(type(population))  # float
print(type(is_democracy))  # bool

### **Variable Naming Rules**

| Rule | Valid | Invalid |
|------|-------|--------|
| Use letters, numbers, underscores | `my_var`, `count2` | — |
| Cannot start with a number | `var1` | `1var` |
| Case-sensitive | `Name` ≠ `name` | — |
| No spaces or special characters | `gdp_per_capita` | `gdp per capita`, `gdp-per-capita` |

**Convention:** Use descriptive, lowercase names with underscores: `voter_turnout`, `poll_results`

### **Dynamic Typing**

Python figures out the type automatically; you do not need to declare it. A variable can even change type:

In [None]:
x = 10  # x is an int
print(type(x))

x = "ten"  # now x is a str
print(type(x))

---

### **Exercise 1: Variables**

Create variables to store information about a country of your choice:
- `country_name` — the country's name (string)
- `population_millions` — population in millions (float)
- `gdp_billions` — GDP in billions USD (float)
- `is_eu_member` — whether it's in the EU (boolean)

Print each variable.

In [None]:
### YOUR CODE HERE ###
country_name = "China"
population_millions = 1409.0
gdp_billions = 18740.0
is_eu_member = False

print(country_name)
print(population_millions)
print(gdp_billions)
print(is_eu_member)

### ⭐ **Extra Exercise 1**

Use `type()` to verify the type of each variable you created. Try reassigning `country_name` to a number—what happens to its type?

In [None]:
### YOUR CODE HERE ###
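print(type(country_name))  # str
print(type(population_millions))  # float
print(type(gdp_billions))  # float
print(type(is_eu_member))  # bool

# Reassigning to a number changes the type; restore the name afterwards
country_name = 86
print(type(country_name))  # int
country_name = "China"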

---

# **2. Operators**

Operators let you perform calculations and comparisons.

### **Arithmetic Operators**

| Operator | Description | Example | Result |
|----------|-------------|---------|--------|
| `+` | Addition | `5 + 3` | `8` |
| `-` | Subtraction | `5 - 3` | `2` |
| `*` | Multiplication | `5 * 3` | `15` |
| `/` | Division | `5 / 3` | `1.666...` |
| `//` | Floor division (integer) | `5 // 3` | `1` |
| `%` | Modulus (remainder) | `5 % 3` | `2` |
| `**` | Exponentiation | `5 ** 3` | `125` |

In [None]:
# Calculate GDP per capita
gdp_billions = 4072.6
population_millions = 83.2

# Convert to same units (billions / millions = thousands)
gdp_per_capita = (gdp_billions * 1000) / population_millions
print(gdp_per_capita)
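
The difference between `/`, `//`, and `%` is easiest to see side by side. The sketch below uses made-up numbers: 47 survey respondents split into groups of 5.

```python
respondents = 47
group_size = 5

exact = respondents / group_size         # true division always gives a float
full_groups = respondents // group_size  # floor division drops the remainder
left_over = respondents % group_size     # modulus keeps only the remainder

print(exact)        # 9.4
print(full_groups)  # 9
print(left_over)    # 2
```

Together, `//` and `%` answer "how many full groups, and how many people are left over?"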

### **Comparison Operators**

These return `True` or `False`:

| Operator | Description | Example | Result |
|----------|-------------|---------|--------|
| `==` | Equal to | `5 == 5` | `True` |
| `!=` | Not equal to | `5 != 3` | `True` |
| `>` | Greater than | `5 > 3` | `True` |
| `<` | Less than | `5 < 3` | `False` |
| `>=` | Greater or equal | `5 >= 5` | `True` |
| `<=` | Less or equal | `5 <= 3` | `False` |

In [None]:
# Is GDP per capita above 40,000?
is_high_income = gdp_per_capita > 40000
print(is_high_income)

### **Logical Operators**

Combine boolean values:

| Operator | Description | Example | Result |
|----------|-------------|---------|--------|
| `and` | Both must be True | `True and False` | `False` |
| `or` | At least one True | `True or False` | `True` |
| `not` | Inverts the value | `not True` | `False` |

In [None]:
is_large = population_millions > 50
is_wealthy = gdp_per_capita > 30000

# Large AND wealthy?
print(is_large and is_wealthy)

# Large OR wealthy?
print(is_large or is_wealthy)

---

### **Exercise 2: Operators**

Using the variables you created in Exercise 1:

1. Calculate GDP per capita (GDP in billions × 1000 / population in millions)
2. Store whether GDP per capita exceeds 35,000 in a variable called `is_high_income`
3. Print both results

In [None]:
### YOUR CODE HERE ###
gdp_per_capita_china = (gdp_billions * 1000) / population_millions
is_high_income_china = gdp_per_capita_china > 35000

print(f"GDP per capita of China: {gdp_per_capita_china}")
print(f"Is China high income? {'Yes' if is_high_income_china else 'No'}")

### ⭐ **Extra Exercise 2**

A country qualifies as a "major EU economy" if it is both an EU member AND has GDP per capita above 30,000. Create a variable `is_major_eu_economy` that stores this result using logical operators.

In [None]:
### YOUR CODE HERE ###
is_major_eu_economy = is_eu_member and gdp_per_capita_china > 30000
is_major_eu_economy_text = "is" if is_major_eu_economy else "is not"
print(f"China {is_major_eu_economy_text} a major EU economy.")

---

# **3. Strings**

Strings are sequences of characters (text). They're created with single `'...'` or double `"..."` quotes.

### **Basic String Operations**

| Operation | Description | Example | Result |
|-----------|-------------|---------|--------|
| `+` | Concatenation | `"Hello" + " World"` | `"Hello World"` |
| `*` | Repetition | `"ab" * 3` | `"ababab"` |
| `len()` | Length | `len("Hello")` | `5` |
| `in` | Contains | `"ell" in "Hello"` | `True` |

In [None]:
first_name = "Angela"
last_name = "Merkel"

# Concatenation
full_name = first_name + " " + last_name
print(full_name)

# Length
print(len(full_name))

### **Common String Methods**

Methods are actions that objects can perform. Call them with `.method_name()`:

| Method | Description | Example | Result |
|--------|-------------|---------|--------|
| `.upper()` | Uppercase | `"hello".upper()` | `"HELLO"` |
| `.lower()` | Lowercase | `"HELLO".lower()` | `"hello"` |
| `.strip()` | Remove whitespace | `" hi ".strip()` | `"hi"` |
| `.replace(a, b)` | Replace a with b | `"hello".replace("l", "x")` | `"hexxo"` |
| `.split(sep)` | Split into list | `"a,b,c".split(",")` | `["a", "b", "c"]` |

In [None]:
text = "  Social Data Science  "

print(text.strip())  # Remove extra spaces
print(text.lower())  # All lowercase
print(text.replace("Data", "Political"))  # Substitution
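
One detail worth seeing in code: string methods return a new string and leave the original unchanged, so the result has to be stored if you want to keep it. The sketch below also shows `.split()`, which the cell above does not use. The variable names and the text are illustrative.

```python
raw = "  Berlin, Paris, Rome  "

cleaned = raw.strip()         # new string without the outer spaces
parts = cleaned.split(", ")   # list of the comma-separated pieces

print(repr(raw))  # '  Berlin, Paris, Rome  ' (the original is unchanged)
print(cleaned)    # Berlin, Paris, Rome
print(parts)      # ['Berlin', 'Paris', 'Rome']
```

`repr()` is a built-in function that shows the string with its quotes, which makes the leading and trailing spaces visible.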

### **f-strings (Formatted Strings)**

f-strings let you embed variables directly in text. Prefix the string with `f` and put variables in `{curly braces}`:

In [None]:
country = "Germany"
gdp_per_capita = 48959.5
is_democracy = True

# Basic f-string
summary = f"{country} has a GDP per capita of ${gdp_per_capita}."
print(summary)

# With formatting (2 decimal places, comma separators)
summary = f"{country} has a GDP per capita of ${gdp_per_capita:,.2f}."
print(summary)

### **f-string Formatting Options**

| Format | Description | Example | Result |
|--------|-------------|---------|--------|
| `:.2f` | 2 decimal places | `f"{3.14159:.2f}"` | `"3.14"` |
| `:,` | Comma separator | `f"{1000000:,}"` | `"1,000,000"` |
| `:.1%` | Percentage | `f"{0.756:.1%}"` | `"75.6%"` |
| `:>10` | Right-align, width 10 | `f"{'hi':>10}"` | `"        hi"` |
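
A short sketch of these format options in use. The variable names and numbers are invented for illustration.

```python
turnout_share = 0.756
budget = 1234567.891

pct_text = f"Turnout: {turnout_share:.1%}"   # multiply by 100, 1 decimal, add %
money_text = f"Budget: ${budget:,.2f}"       # comma separators, 2 decimals
aligned_text = f"{'FR':>5}|"                 # right-align in a field of width 5

print(pct_text)      # Turnout: 75.6%
print(money_text)    # Budget: $1,234,567.89
print(aligned_text)  #    FR|
```

The inner string uses single quotes because the f-string itself is wrapped in double quotes; reusing the same quote type inside the braces only works in recent Python versions (3.12 and later).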

---

### **Exercise 3: Strings**

Using your variables from Exercises 1 and 2, create and print a summary sentence using an f-string:

> `"[Country] has a GDP per capita of $[amount] and is [not] an EU member."`

Format the GDP per capita to 0 decimal places with comma separators.

In [None]:
### YOUR CODE HERE ###
print(
    f"{country_name} has a GDP per capita of ${gdp_per_capita_china:,.0f} and is {'not ' if not is_eu_member else ''}an EU member."
)

### ⭐ **Extra Exercise 3**

The string `"  UK, France, Germany, Italy, Spain  "` needs cleaning. Write code that:
1. Removes the extra whitespace at the start and end
2. Replaces "UK" with "United Kingdom"
3. Converts to uppercase
4. Prints the result

Do this in a single chain of method calls (e.g., `text.strip().replace(...)...`).

In [None]:
### YOUR CODE HERE ###
messy_text = "  UK, France, Germany, Italy, Spain  "
text = messy_text.strip().replace("UK", "United Kingdom").upper()
print(text)

---

# **4. Lists**

Lists are ordered collections that can hold multiple items. They're one of Python's most useful data structures.

### **Creating Lists**

In [None]:
# List of strings
countries = ["Germany", "France", "Italy", "Spain", "Poland"]

# List of numbers
populations = [83.2, 67.8, 59.1, 47.4, 37.7]

# Mixed types (less common but allowed)
mixed = ["Germany", 83.2, True]

# Empty list
empty = []

### **Indexing and Slicing**

Access items by position. **Python counts from 0**, not 1.

```
Index:     0         1        2        3        4
        ["Germany", "France", "Italy", "Spain", "Poland"]
Index:    -5        -4       -3       -2       -1
```

| Operation | Description | Example | Result |
|-----------|-------------|---------|--------|
| `[i]` | Item at index i | `countries[0]` | `"Germany"` |
| `[-1]` | Last item | `countries[-1]` | `"Poland"` |
| `[a:b]` | Items from a to b-1 | `countries[1:3]` | `["France", "Italy"]` |
| `[:b]` | First b items | `countries[:2]` | `["Germany", "France"]` |
| `[a:]` | Items from a onward | `countries[3:]` | `["Spain", "Poland"]` |

In [None]:
countries = ["Germany", "France", "Italy", "Spain", "Poland"]

print(countries[0])  # First item
print(countries[-1])  # Last item
print(countries[1:4])  # Items 1, 2, 3 (not 4!)
print(countries[::2])  # Every second item

### **Modifying Lists**

Lists are *mutable*—you can change them after creation.

| Operation | Description | Example |
|-----------|-------------|--------|
| `[i] = x` | Replace item | `countries[0] = "UK"` |
| `.append(x)` | Add to end | `countries.append("UK")` |
| `.insert(i, x)` | Insert at position | `countries.insert(0, "UK")` |
| `.remove(x)` | Remove first occurrence | `countries.remove("Italy")` |
| `.pop(i)` | Remove and return item | `countries.pop(0)` |

In [None]:
countries = ["Germany", "France", "Italy"]

countries.append("Spain")  # Add to end
print(countries)

countries[0] = "United Kingdom"  # Replace first item
print(countries)

removed = countries.pop()  # Remove and return last item
print(f"Removed: {removed}")
print(countries)
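
The cell above covers `.append()`, item replacement, and `.pop()`. As a complementary sketch (using a made-up list named `members`), `.insert()` and `.remove()` behave like this:

```python
members = ["Germany", "France", "Italy"]

members.insert(1, "Spain")   # put "Spain" at index 1, shifting the rest right
print(members)               # ['Germany', 'Spain', 'France', 'Italy']

members.remove("France")     # delete the first item equal to "France"
print(members)               # ['Germany', 'Spain', 'Italy']
```

Both methods change the list in place, so there is no need to assign the result to a new variable.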

### **Useful List Functions**

| Function | Description | Example | Result |
|----------|-------------|---------|--------|
| `len(lst)` | Number of items | `len([1,2,3])` | `3` |
| `sum(lst)` | Sum of numbers | `sum([1,2,3])` | `6` |
| `min(lst)` | Smallest value | `min([3,1,2])` | `1` |
| `max(lst)` | Largest value | `max([3,1,2])` | `3` |
| `sorted(lst)` | Sorted copy | `sorted([3,1,2])` | `[1,2,3]` |

In [None]:
turnouts = [72.5, 65.3, 58.9, 81.2, 69.4]

print(f"Count: {len(turnouts)}")
print(f"Total: {sum(turnouts)}")
print(f"Lowest: {min(turnouts)}")
print(f"Highest: {max(turnouts)}")
print(f"Average: {sum(turnouts) / len(turnouts):.1f}")
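
Note that `sorted()` returns a new list and leaves the original order untouched. A minimal sketch, with an illustrative list name:

```python
scores = [72.5, 65.3, 58.9, 81.2, 69.4]

ascending = sorted(scores)                 # smallest to largest
descending = sorted(scores, reverse=True)  # largest to smallest

print(ascending)   # [58.9, 65.3, 69.4, 72.5, 81.2]
print(descending)  # [81.2, 72.5, 69.4, 65.3, 58.9]
print(scores)      # [72.5, 65.3, 58.9, 81.2, 69.4] (unchanged)
```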

---

### **Exercise 4: Lists**

1. Create a list called `eu_countries` containing these 5 countries: Germany, France, Italy, Spain, Poland
2. Print the first country and the last country
3. Print the middle three countries (index 1, 2, and 3) using slicing
4. Append "Netherlands" to the list
5. Print the updated list and its length

In [None]:
### YOUR CODE HERE ###
eu_countries = ["Germany", "France", "Italy", "Spain", "Poland"]
print(f"First country: {eu_countries[0]}\nLast country: {eu_countries[-1]}")
print(f"Middle: {', '.join(eu_countries[1:4])}")
eu_countries.append("Netherlands")
print(f"Updated list: {eu_countries}, length {len(eu_countries)}")

### ⭐ **Extra Exercise 4**

Given the list `poll_results = [42, 45, 44, 47, 51, 48, 52]`:

1. Calculate and print the average (using `sum()` and `len()`)
2. Print the range (difference between max and min)
3. Create a new list containing only the last 3 poll results
4. Replace the first value with `40` and print the modified list

In [None]:
### YOUR CODE HERE ###
poll_results = [42, 45, 44, 47, 51, 48, 52]
avg = sum(poll_results) / len(poll_results)
print(f"Average poll result: {avg:.2f}")

poll_range = max(poll_results) - min(poll_results)
print(f"Range: {poll_range}")

poll_last_three = poll_results[-3:]
print(f"Last three results: {poll_last_three}")

poll_results[0] = 40
print(f"Modified list: {poll_results}")

---

# **5. Conditionals**

Conditionals let your code make decisions based on whether something is `True` or `False`.

### **The if Statement**

```python
if condition:
    # code runs if condition is True
```

**Note the colon `:` and the indentation (4 spaces)**

In [None]:
turnout = 72

if turnout > 70:
    print("High turnout!")

### **if-else**

```python
if condition:
    # runs if True
else:
    # runs if False
```

In [None]:
turnout = 45

if turnout > 50:
    print("Majority participated")
else:
    print("Less than half participated")

### **if-elif-else**

For multiple conditions, use `elif` (short for "else if"):

```python
if condition1:
    # runs if condition1 is True
elif condition2:
    # runs if condition1 is False AND condition2 is True
elif condition3:
    # runs if above are False AND condition3 is True
else:
    # runs if all above are False
```

In [None]:
turnout = 62

if turnout < 50:
    category = "Low"
elif turnout <= 70:
    category = "Moderate"
else:
    category = "High"

print(f"Turnout of {turnout}% is classified as: {category}")

### **Combining Conditions**

Use `and`, `or`, `not` to combine conditions:

In [None]:
turnout = 75
is_national = True

if turnout > 70 and is_national:
    print("Strong democratic engagement")

if turnout < 40 or not is_national:
    print("Consider engagement strategies")

---

### **Exercise 5: Conditionals**

Write code that classifies voter turnout:

Given a variable `turnout` (a percentage between 0 and 100), print:
- `"Low turnout"` if below 50
- `"Moderate turnout"` if between 50 and 70 (inclusive)
- `"High turnout"` if above 70

Test your code with `turnout = 62`

In [None]:
### YOUR CODE HERE ###
turnout = 62


def classify_turnout(turnout: int) -> str:
    if turnout < 50:
        return "Low turnout"
    elif 50 <= turnout <= 70:
        return "Moderate turnout"
    else:
        return "High turnout"


print(classify_turnout(turnout))

### ⭐ **Extra Exercise 5**

Extend your classifier to also consider whether the election is national or local:

- If **national**: use the same thresholds (50, 70)
- If **local**: use lower thresholds (30, 50) since local elections typically have lower turnout

Test with:
```python
turnout = 45
is_national = False
```
This should print "Moderate turnout" (since 45 falls between the local thresholds of 30 and 50).

In [None]:
### YOUR CODE HERE ###
turnout = 45
is_national = False

if is_national:
    if turnout < 50:
        print("Low turnout")
    elif 50 <= turnout <= 70:
        print("Moderate turnout")
    else:
        print("High turnout")
else:
    if turnout < 30:
        print("Low turnout")
    elif 30 <= turnout <= 50:
        print("Moderate turnout")
    else:
        print("High turnout")

---

# **6. For Loops**

Loops let you repeat code for each item in a sequence, saving you from writing the same code multiple times.

### **Looping Over a List**

```python
for item in sequence:
    # code to run for each item
```

In [None]:
countries = ["Germany", "France", "Italy"]

for country in countries:
    print(f"Processing: {country}")

### **Using range()**

The `range()` function generates a sequence of numbers:

| Usage | Generates | Example |
|-------|-----------|--------|
| `range(n)` | 0 to n-1 | `range(5)` → 0,1,2,3,4 |
| `range(a, b)` | a to b-1 | `range(2, 5)` → 2,3,4 |
| `range(a, b, step)` | a to b-1, by step | `range(0, 10, 2)` → 0,2,4,6,8 |

In [None]:
# Print numbers 1-5
for i in range(1, 6):
    print(i)

print("---")

# Countdown
for i in range(5, 0, -1):
    print(i)
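
`range()` does not display its numbers when printed directly; wrapping it in `list()` is a quick way to check what a given call generates. The variable names below are illustrative.

```python
first_five = list(range(5))
two_to_four = list(range(2, 5))
evens = list(range(0, 10, 2))
countdown = list(range(5, 0, -1))

print(first_five)   # [0, 1, 2, 3, 4]
print(two_to_four)  # [2, 3, 4]
print(evens)        # [0, 2, 4, 6, 8]
print(countdown)    # [5, 4, 3, 2, 1]
```

In every case the stop value itself is excluded.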

### **Combining Loops with Conditionals**

A very common pattern: loop through data and apply conditions to each item.

In [None]:
turnouts = [45, 62, 78, 55, 38]

for turnout in turnouts:
    if turnout < 50:
        category = "Low"
    elif turnout <= 70:
        category = "Moderate"
    else:
        category = "High"

    print(f"{turnout}% → {category}")

### **Building Up Results**

Often you want to accumulate results as you loop:

In [None]:
numbers = [1, 2, 3, 4, 5]

# Calculate sum manually
total = 0
for num in numbers:
    total = total + num  # or: total += num

print(f"Sum: {total}")

# Build a new list
squared = []
for num in numbers:
    squared.append(num**2)

print(f"Squared: {squared}")
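
The same accumulation pattern works with a condition inside the loop, which gives you a filtered list. A sketch with made-up turnout figures and an arbitrary cut-off of 60:

```python
turnout_values = [45, 62, 78, 55, 38]

above_sixty = []
for value in turnout_values:
    if value > 60:                 # keep only values above the cut-off
        above_sixty.append(value)

print(above_sixty)       # [62, 78]
print(len(above_sixty))  # 2
```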

### **enumerate() for Index + Value**

When you need both the position and the value:

In [None]:
countries = ["Germany", "France", "Italy"]

for index, country in enumerate(countries):
    print(f"{index}: {country}")
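
`enumerate()` also accepts an optional `start` argument, which is handy when you want human-friendly numbering from 1 rather than 0. The list below is an illustrative example.

```python
capitals = ["Berlin", "Paris", "Rome"]

numbered = []
for position, capital in enumerate(capitals, start=1):
    numbered.append(f"{position}. {capital}")

print(numbered)  # ['1. Berlin', '2. Paris', '3. Rome']
```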

---

### **Exercise 6: For Loops**

Given `turnouts = [45, 62, 78, 55, 38, 71, 69]`:

1. Loop through each value and print whether it's "Low", "Moderate", or "High" turnout (using your logic from Exercise 5)
2. Your output should look like:
   ```
   45% - Low turnout
   62% - Moderate turnout
   ...
   ```

In [None]:
### YOUR CODE HERE ###
turnouts = [45, 62, 78, 55, 38, 71, 69]

for turnout in turnouts:
    print(f"{turnout}% - {classify_turnout(turnout)}")

### ⭐ **Extra Exercise 6a**

Modify your loop to count how many elections fall into each category. After the loop, print:
```
Low: X elections
Moderate: Y elections  
High: Z elections
```

*Hint: Create counter variables before the loop and increment them inside.*

In [None]:
### YOUR CODE HERE ###
turnouts = [45, 62, 78, 55, 38, 71, 69]
low_count = 0
moderate_count = 0
high_count = 0
for turnout in turnouts:
    match classify_turnout(turnout):
        case "Low turnout":
            low_count += 1
        case "Moderate turnout":
            moderate_count += 1
        case "High turnout":
            high_count += 1
print(f"Low: {low_count} elections")
print(f"Moderate: {moderate_count} elections")
print(f"High: {high_count} elections")

### ⭐ **Extra Exercise 6b**

Given two parallel lists:
```python
countries = ["Germany", "France", "UK", "Italy", "Spain"]
populations = [83.2, 67.8, 67.0, 59.1, 47.4]
```

Use a loop to print each country with its population. You'll need to use either:
- `range(len(countries))` to get indices, or
- `zip(countries, populations)` to pair them up

Try both approaches!

In [None]:
### YOUR CODE HERE ###
countries = ["Germany", "France", "UK", "Italy", "Spain"]
populations = [83.2, 67.8, 67.0, 59.1, 47.4]
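
One possible way to approach this exercise is sketched below, with both options side by side. The output format (`name: populationM`) and the helper list names are arbitrary choices, not requirements.

```python
countries = ["Germany", "France", "UK", "Italy", "Spain"]
populations = [83.2, 67.8, 67.0, 59.1, 47.4]

# Approach 1: loop over index positions
lines_by_index = []
for i in range(len(countries)):
    lines_by_index.append(f"{countries[i]}: {populations[i]}M")

# Approach 2: let zip() pair the items up
lines_by_zip = []
for country, population in zip(countries, populations):
    lines_by_zip.append(f"{country}: {population}M")

for line in lines_by_zip:
    print(line)

print(lines_by_index == lines_by_zip)  # True
```

Both approaches produce the same lines; `zip()` is usually easier to read because no index variable is needed.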

---

# **7. Functions**

Functions are reusable blocks of code. Instead of copying the same code multiple times, you define it once and call it whenever needed.

### **Defining a Function**

```python
def function_name(parameter1, parameter2):
    # code that does something
    return result
```

| Part | Purpose |
|------|--------|
| `def` | Keyword to define a function |
| `function_name` | Name you choose (use lowercase_with_underscores) |
| `parameters` | Inputs the function accepts (optional) |
| `return` | Value the function gives back (optional) |

In [None]:
def greet(name):
    """Return a greeting for the given name."""
    return f"Hello, {name}!"


# Call the function
message = greet("World")
print(message)

print(greet("Python"))

### **Functions with Multiple Parameters**

In [None]:
def calculate_gdp_per_capita(gdp_billions, population_millions):
    """Calculate GDP per capita from GDP (billions) and population (millions)."""
    gdp_per_capita = (gdp_billions * 1000) / population_millions
    return gdp_per_capita


# Germany
germany_gdp_pc = calculate_gdp_per_capita(4072.6, 83.2)
print(f"Germany: ${germany_gdp_pc:,.0f}")

# France
france_gdp_pc = calculate_gdp_per_capita(2782.9, 67.8)
print(f"France: ${france_gdp_pc:,.0f}")

### **Functions with Default Parameters**

In [None]:
def classify_turnout(turnout, low_threshold=50, high_threshold=70):
    """Classify voter turnout into Low, Moderate, or High."""
    if turnout < low_threshold:
        return "Low"
    elif turnout <= high_threshold:
        return "Moderate"
    else:
        return "High"


# Use default thresholds
print(classify_turnout(45))  # Low
print(classify_turnout(65))  # Moderate

# Use custom thresholds for local elections
print(classify_turnout(45, low_threshold=30, high_threshold=50))  # Moderate

### **Functions That Don't Return Anything**

Some functions perform an action (like printing) without returning a value:

In [None]:
def print_report(country, population, gdp):
    """Print a formatted country report."""
    gdp_pc = (gdp * 1000) / population
    print(f"=== {country} ===")
    print(f"Population: {population}M")
    print(f"GDP: ${gdp}B")
    print(f"GDP per capita: ${gdp_pc:,.0f}")
    print()


print_report("Germany", 83.2, 4072.6)
print_report("France", 67.8, 2782.9)
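
A function without a `return` statement still hands something back: the special value `None`. The sketch below uses a hypothetical helper, `print_banner`, to show this.

```python
def print_banner(title):
    """Print a title between two rules; nothing is returned explicitly."""
    print("=" * 20)
    print(title)
    print("=" * 20)


result = print_banner("Report")
print(result)  # None
```

This is why writing `x = print_report(...)` and then using `x` in a calculation leads to an error: `x` is `None`, not a number.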

### **Why Use Functions?**

| Benefit | Explanation |
|---------|-------------|
| **Reusability** | Write once, use many times |
| **Readability** | `classify_turnout(62)` is clearer than 5 lines of if/elif |
| **Maintainability** | Fix a bug in one place, fixed everywhere |
| **Testing** | Easy to verify one function works correctly |

---

### **Exercise 7: Functions**

Turn your turnout classifier from Exercise 5 into a function:

1. Define a function `classify_turnout(turnout)` that takes a turnout percentage
2. It should return `"Low"`, `"Moderate"`, or `"High"` based on the thresholds (50, 70)
3. Test it by calling the function with values: 45, 62, 78
4. Use your function inside a loop to classify all values in `[45, 62, 78, 55, 38]`

In [None]:
### YOUR CODE HERE ###
def classify_turnout(turnout, low_threshold=50, high_threshold=70):
    """Classify voter turnout into Low, Moderate, or High."""
    if turnout < low_threshold:
        return "Low"
    elif turnout <= high_threshold:
        return "Moderate"
    else:
        return "High"


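print(classify_turnout(45))
print(classify_turnout(62))
print(classify_turnout(78))

# Classify every value in the list
for turnout in [45, 62, 78, 55, 38]:
    print(f"{turnout}% → {classify_turnout(turnout)}")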

### ⭐ **Extra Exercise 7**

Create a function `summarize_polls(polls)` that takes a list of poll numbers and returns a dictionary with:
- `"count"`: number of polls
- `"average"`: mean value
- `"min"`: minimum value
- `"max"`: maximum value
- `"range"`: difference between max and min

Test with: `polls = [42, 45, 44, 47, 51, 48, 52]`

*Note: Dictionaries use `{"key": value}` syntax. You'll learn more about them in Week 2, but try using this structure:*
```python
result = {
    "count": ...,
    "average": ...,
}
return result
```
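
If dictionaries are new to you, this minimal sketch shows how one is created and how a value is looked up by its key. The keys and numbers are made up for illustration.

```python
example = {
    "count": 3,
    "average": 44.0,
}

print(example["count"])    # 3
print(example["average"])  # 44.0

example["max"] = 47        # add a new key-value pair
print(len(example))        # 3
```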

In [None]:
### YOUR CODE HERE ###
polls = [42, 45, 44, 47, 51, 48, 52]


def summarize_polls(polls: list[int]) -> dict[str, float | None]:
    """Return summary statistics for a list of poll results."""
    count = len(polls)
    average = sum(polls) / count if count > 0 else None
    minimum = min(polls) if polls else None
    maximum = max(polls) if polls else None

    return {
        "count": count,
        "average": average,
        "min": minimum,
        "max": maximum,
        "range": (maximum - minimum)
        if minimum is not None and maximum is not None
        else None,
    }


print(summarize_polls(polls))

---

# **8. Integration Exercise: Putting It All Together**

Now let's combine everything you've learned into a complete mini-analysis.

### **The Scenario**

You have polling data from the last 7 days of a campaign. You need to analyze the trends and produce a summary report.

### **Exercise 8: Poll Analyzer**

Write a function `analyze_polls(polls, candidate_name)` that:

1. Calculates the **average** poll result
2. Finds the **highest** and **lowest** values
3. Determines the **trend** (compare first half average to second half average):
   - "Rising" if second half > first half
   - "Falling" if second half < first half
   - "Stable" if they're equal
4. Returns a formatted summary string

**Test data:**
```python
poll_results = [42, 45, 44, 47, 51, 48, 52]
```

**Expected output format:**
```
=== Poll Analysis: [Candidate] ===
Polls analyzed: 7
Average: 47.0%
Range: 42% - 52%
Trend: Rising
```

*Hints:*
- To split a list in half: `first_half = polls[:len(polls)//2]`
- Use `sum(lst) / len(lst)` for averages
- Build the output string with f-strings
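
To see what the first hint does with an odd number of items, here is a short sketch using the test data. The names `middle` and `second_half` are illustrative; note that the second half ends up one item longer.

```python
polls = [42, 45, 44, 47, 51, 48, 52]

middle = len(polls) // 2       # 7 // 2 is 3
first_half = polls[:middle]    # items before index 3
second_half = polls[middle:]   # items from index 3 onward

print(first_half)   # [42, 45, 44]
print(second_half)  # [47, 51, 48, 52]
```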

In [None]:
### YOUR CODE HERE ###


def analyze_polls(polls: list[int], candidate_name: str) -> str:
    """Analyze a list of poll results and return a summary."""
    count = len(polls)
    if count < 2:
        # The trend compares two halves, so at least two polls are needed
        return f"=== Poll Analysis: {candidate_name} ===\nNot enough polls to analyze"

    average = sum(polls) / count
    minimum = min(polls)
    maximum = max(polls)

    # Split the list in half; with an odd count the second half is one longer
    middle = count // 2
    first_half_average = sum(polls[:middle]) / middle
    second_half_average = sum(polls[middle:]) / (count - middle)

    if second_half_average > first_half_average:
        trend = "Rising"
    elif second_half_average < first_half_average:
        trend = "Falling"
    else:
        trend = "Stable"

    return (
        f"=== Poll Analysis: {candidate_name} ===\n"
        f"Polls analyzed: {count}\n"
        f"Average: {average:.1f}%\n"
        f"Range: {minimum}% - {maximum}%\n"
        f"Trend: {trend}"
    )


# Test your function
poll_results = [42, 45, 44, 47, 51, 48, 52]
print(analyze_polls(poll_results, "Candidate A"))

### ⭐ **Extra Exercise 8**

Extend your analyzer to handle multiple candidates:

```python
election_data = {
    "Candidate A": [42, 45, 44, 47, 51, 48, 52],
    "Candidate B": [48, 46, 47, 44, 40, 43, 39],
    "Candidate C": [10, 9, 9, 9, 9, 9, 9]
}
```

Write code that:
1. Loops through each candidate
2. Calls your `analyze_polls()` function for each
3. Prints all the summaries
4. At the end, prints which candidate has the highest final poll number

*Hint: To loop through a dictionary, use `for name, polls in election_data.items():`*

In [None]:
### YOUR CODE HERE ###

election_data = {
    "Candidate A": [42, 45, 44, 47, 51, 48, 52],
    "Candidate B": [48, 46, 47, 44, 40, 43, 39],
    "Candidate C": [10, 9, 9, 9, 9, 9, 9],
}


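def analyze_polls_batch(election_data: dict[str, list[int]]) -> None:
    """Analyze polls for multiple candidates and print summaries."""
    leader = None
    highest_final = None
    for candidate, polls in election_data.items():
        print(analyze_polls(polls, candidate))
        print()
        # Track the candidate with the highest final (most recent) poll
        if highest_final is None or polls[-1] > highest_final:
            leader = candidate
            highest_final = polls[-1]
    print(f"Highest final poll: {leader} ({highest_final}%)")


analyze_polls_batch(election_data)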

---

# **Common Errors and How to Fix Them**

When you encounter an error, don't panic. Read the message carefully—Python usually tells you what went wrong.

### **Most Common Errors**

| Error | Cause | Fix |
|-------|-------|-----|
| `NameError: name 'x' is not defined` | Variable doesn't exist (typo?) | Check spelling |
| `SyntaxError: invalid syntax` | Missing colon, bracket, quote | Check the line indicated |
| `IndentationError` | Wrong number of spaces | Use consistent 4-space indents |
| `TypeError: unsupported operand` | Mixing incompatible types | Check your types with `type()` |
| `IndexError: list index out of range` | Accessing index that doesn't exist | Check list length with `len()` |

In [None]:
# Example: NameError
# print(countr)  # Uncomment to see the error - 'countr' is not defined

country = "Germany"
print(country)  # This works

In [None]:
# Example: TypeError
# result = "5" + 3  # Uncomment to see error - can't add string and int

result = int("5") + 3  # Convert string to int first
print(result)
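
An `IndexError` can be avoided in the same spirit, by checking the list length before indexing. A sketch with illustrative names:

```python
cities = ["Berlin", "Paris", "Rome"]
wanted_index = 5

# cities[5] would raise IndexError, because the valid indices are 0, 1, 2
if wanted_index < len(cities):
    chosen = cities[wanted_index]
else:
    chosen = None

print(chosen)  # None
```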

### **Debugging Tips**

1. **Read the error message** — it tells you the line number and error type
2. **Add print statements** — print variables to see their values
3. **Check types** — use `type(variable)` when confused
4. **Simplify** — test small pieces of code separately
5. **Search** — paste the error message into Google or ask ChatGPT

---

# **Quick Reference: Everything on One Page**

### **Data Types**
```python
x = 42           # int
x = 3.14         # float  
x = "hello"      # str
x = True         # bool
x = [1, 2, 3]    # list
```

### **Operators**
```python
# Arithmetic: + - * / // % **
# Comparison: == != > < >= <=
# Logical: and or not
```

### **Strings**
```python
f"Value is {x}"           # f-string
f"{x:.2f}"                # 2 decimal places
"hello".upper()           # "HELLO"
```

### **Lists**
```python
lst[0]           # first item
lst[-1]          # last item
lst[1:3]         # slice
lst.append(x)    # add item
len(lst)         # length
```

### **Conditionals**
```python
if condition:
    do_something()
elif other_condition:
    do_other()
else:
    do_default()
```

### **Loops**
```python
for item in lst:
    process(item)

for i in range(5):
    print(i)
```

### **Functions**
```python
def my_function(param1, param2=default):
    """Docstring explaining the function."""
    result = do_something(param1, param2)
    return result
```

---

# **What's Next?**

In **Week 2**, we'll introduce **NumPy**—a library that makes working with numerical data vastly more efficient. Instead of writing loops to process lists of numbers, NumPy lets you operate on entire arrays at once.

The concepts you learned today—variables, loops, functions—are the foundation. NumPy builds on top of them.

### **Homework**

Complete the exercises in this notebook. Focus on the main exercises; extra exercises are optional.

If you want more practice:
- Try modifying the exercises with different data
- Combine concepts in new ways
- Bring questions to the Python Clinic (Tuesdays, 6-7 PM)

### **Further Resources**

- [Python Official Tutorial](https://docs.python.org/3/tutorial/)
- [Real Python](https://realpython.com/) — beginner-friendly tutorials
- McKinney, *Python for Data Analysis*, Chapters 2-3

---

# **Solutions**

Expand the cells below to see solutions to the main exercises.

*Try to complete the exercises yourself first!*

In [None]:
#@title Solution: Exercise 1 - Variables

country_name = "France"
population_millions = 67.8
gdp_billions = 2782.9
is_eu_member = True

print(country_name)
print(population_millions)
print(gdp_billions)
print(is_eu_member)

In [None]:
#@title Solution: Exercise 2 - Operators

country_name = "France"
population_millions = 67.8
gdp_billions = 2782.9

# Calculate GDP per capita
gdp_per_capita = (gdp_billions * 1000) / population_millions
print(f"GDP per capita: ${gdp_per_capita:,.0f}")

# Check if high income
is_high_income = gdp_per_capita > 35000
print(f"High income: {is_high_income}")

In [None]:
#@title Solution: Exercise 3 - Strings

country_name = "France"
population_millions = 67.8
gdp_billions = 2782.9
is_eu_member = True

gdp_per_capita = (gdp_billions * 1000) / population_millions

# Handle the "not" part conditionally
eu_status = "" if is_eu_member else "not "
summary = f"{country_name} has a GDP per capita of ${gdp_per_capita:,.0f} and is {eu_status}an EU member."
print(summary)

In [None]:
#@title Solution: Exercise 4 - Lists

# 1. Create list
eu_countries = ["Germany", "France", "Italy", "Spain", "Poland"]

# 2. First and last
print(f"First: {eu_countries[0]}")
print(f"Last: {eu_countries[-1]}")

# 3. Middle three
print(f"Middle three: {eu_countries[1:4]}")

# 4. Append
eu_countries.append("Netherlands")

# 5. Print updated list and length
print(f"Updated list: {eu_countries}")
print(f"Length: {len(eu_countries)}")

In [None]:
#@title Solution: Exercise 5 - Conditionals

turnout = 62

if turnout < 50:
    print("Low turnout")
elif turnout <= 70:
    print("Moderate turnout")
else:
    print("High turnout")

In [None]:
#@title Solution: Exercise 6 - For Loops

turnouts = [45, 62, 78, 55, 38, 71, 69]

for turnout in turnouts:
    if turnout < 50:
        category = "Low turnout"
    elif turnout <= 70:
        category = "Moderate turnout"
    else:
        category = "High turnout"
    
    print(f"{turnout}% - {category}")

In [None]:
#@title Solution: Exercise 7 - Functions

def classify_turnout(turnout):
    """Classify voter turnout as Low, Moderate, or High."""
    if turnout < 50:
        return "Low"
    elif turnout <= 70:
        return "Moderate"
    else:
        return "High"

# Test individual values
print(classify_turnout(45))   # Low
print(classify_turnout(62))   # Moderate
print(classify_turnout(78))   # High

# Use in a loop
turnouts = [45, 62, 78, 55, 38]
for turnout in turnouts:
    result = classify_turnout(turnout)
    print(f"{turnout}% → {result}")

In [None]:
#@title Solution: Exercise 8 - Integration

def analyze_polls(polls, candidate_name):
    """Analyze a list of poll results and return a summary."""
    # Calculate statistics
    count = len(polls)
    average = sum(polls) / count
    lowest = min(polls)
    highest = max(polls)
    
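    # Calculate trend by comparing the averages of the first and second halves
    # (assumes at least two polls; with fewer, first_half is empty and the division fails)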
    mid = count // 2
    first_half = polls[:mid]
    second_half = polls[mid:]
    
    first_avg = sum(first_half) / len(first_half)
    second_avg = sum(second_half) / len(second_half)
    
    if second_avg > first_avg:
        trend = "Rising"
    elif second_avg < first_avg:
        trend = "Falling"
    else:
        trend = "Stable"
    
    # Build summary string
    summary = f"""=== Poll Analysis: {candidate_name} ===
Polls analyzed: {count}
Average: {average:.1f}%
Range: {lowest}% - {highest}%
Trend: {trend}"""
    
    return summary

# Test
poll_results = [42, 45, 44, 47, 51, 48, 52]
print(analyze_polls(poll_results, "Candidate A"))