# Week 9 Lab: Lists, Dictionaries, and Pandas
This week’s lab gives you practical experience with data analysis in Python.

You will:
- Traverse lists using for loops and the accumulator pattern
- Use dictionaries to represent structured data and practice common iteration patterns
- Load, access, and explore data using pandas DataFrames

**Instructions**
- Work through the problems in order.
- Write tests where indicated and run them to verify your progress.


#### Run the cell below once to set up the test environment.

In [None]:
import piplite
await piplite.install(["pytest", "ipytest"])

import ipytest
ipytest.autoconfig()

## Problem 1: Movie Ratings Dashboard 
**Focus:** Lists, loops, accumulator pattern, dictionaries

You are designing a simple analytics utility for a movie review site.

### Task 1.1 – Summing and Averaging Ratings
Implement `average_rating(ratings)` to compute the mean rating (return `0.0` for empty lists). Use a **loop + accumulator**; for practice, avoid calling `sum()` and `len()` directly.
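
As a reminder of the loop + accumulator pattern, here is a minimal sketch on a different problem. The function name `total_runtime` and the sample runtimes are hypothetical and chosen only for illustration; they are not part of the lab.

```python
def total_runtime(runtimes: list[int]) -> int:
    """Add up runtimes (in minutes) with a loop + accumulator."""
    total = 0  # the accumulator starts at 0, the neutral value for addition
    for minutes in runtimes:
        total += minutes  # update the accumulator on every iteration
    return total


print(total_runtime([148, 162, 195]))  # 505
print(total_runtime([]))               # 0 (the loop body never runs)
```

For an average you would likely keep a second accumulator that counts the items, and handle the empty list before dividing.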

**Write some test cases.**
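
A test case is just a function whose name starts with `test_` and that contains `assert` statements. The sketch below shows the shape for a hypothetical helper, `count_long_titles`, which is not part of the lab; in the notebook, test functions like these would go in a cell that begins with `%%ipytest -qq`.

```python
def count_long_titles(titles: list[str], min_length: int) -> int:
    """Hypothetical helper: count titles with at least min_length characters."""
    count = 0
    for title in titles:
        if len(title) >= min_length:
            count += 1
    return count


def test_count_long_titles_typical():
    # "Inception" (9 characters) and "Joker" (5) qualify; "Up" (2) does not
    assert count_long_titles(["Up", "Inception", "Joker"], 5) == 2


def test_count_long_titles_empty():
    # Edge case: an empty list has nothing to count
    assert count_long_titles([], 5) == 0
```

Try to cover a typical input and at least one edge case, such as the empty list. When the expected value is a float, comparing with `pytest.approx(expected)` avoids surprises from rounding.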

In [None]:
# Implement using a loop + accumulator

def average_rating(ratings: list[float]) -> float:
    # TODO: replace the placeholder implementation below

    pass


In [None]:
%%ipytest -qq
# Your test cases here


### Use the Movie Ratings Dictionary Below for the Next Two Tasks

In [None]:
movie_ratings = {
    "Inception": [5, 4, 5, 5, 4],
    "Avatar": [4, 3, 4, 4],
    "Titanic": [5, 5, 4, 5],
    "Joker": [3, 3.5, 4]
}
movie_ratings

### Task 1.2 – Compute Average Ratings per Movie
Implement `print_movie_averages(movies)`, which iterates over the dictionary and prints each movie title with its average rating, computed with your `average_rating` function from Task 1.1.
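
Iterating over a dictionary with `.items()` yields one key-value pair per step. The sketch below uses a hypothetical `runtimes` dictionary and a hypothetical `format_runtime` helper, neither of which is part of the lab, to show the iteration and the f-string formatting together.

```python
# Hypothetical data: movie title -> runtime in minutes
runtimes = {"Inception": 148, "Avatar": 162, "Titanic": 195}


def format_runtime(title: str, minutes: int) -> str:
    """Format a runtime in hours, rounded to two decimal places."""
    return f"{title}: {minutes / 60:.2f} h"


# .items() unpacks each entry into a (key, value) pair
for title, minutes in runtimes.items():
    print(format_runtime(title, minutes))
```

The format spec `:.2f` always prints two digits after the decimal point, so `2.7` is shown as `2.70`.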

In [None]:
def print_movie_averages(movies: dict[str, list[float]]) -> None:
    # TODO: iterate over items and print each movie with its average rating. 
    # You can use an f-string to format your printed output: f"{title}: {avg:.2f}"
    
    pass

# Call your function to preview


### Task 1.3 – Reverse Engineering a Function
Implement a function `filter_by_threshold(movies, threshold)` so that all tests in the next cell pass.

In [None]:
def filter_by_threshold(movies: dict[str, list[float]], threshold: float) -> list[str]:

    pass

In [None]:
%%ipytest -qq

def test_filter_by_threshold():
    assert filter_by_threshold(movie_ratings, 4.0) == ['Inception', 'Titanic']
    assert filter_by_threshold(movie_ratings, 4.6) == ['Titanic']
    assert filter_by_threshold(movie_ratings, 3.0) == ['Inception', 'Avatar', 'Titanic', 'Joker']


---
## Problem 2: Pandas (Rows, Columns, Basic Analysis)

You will practice **exactly** the core operations from the lecture:
- `pd.read_csv`
- `head()` / `tail()`
- Row access with `iloc` (including slicing)
- Column access with `orders['column']`
- Series operations: `.mean()`, `.sum()`, `.unique()`

Dataset: **`retail_orders.csv`** (coffee shop sales)

**Columns:** `order_id, date, branch, item, size, quantity, unit_price, order_type, payment_method`


### Load the data

Use `pd.read_csv` and preview the first few rows.

In [None]:
import pandas as pd
orders = pd.read_csv('retail_orders.csv')
orders.head()

### Task 2.1 – Row and Column Access
1. Show rows **5 to 9** (remember slicing excludes the end index).
2. Show every **10th** row starting at 0.
3. Show the **last row** using negative indexing with `iloc`.
4. Get the `quantity` column as a Series and show the first 8 values.
5. Get the **unique** values of `order_type`.
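
The access patterns listed above can be sketched on a small in-memory DataFrame. The `toy` table and its column names are hypothetical and stand in for the lab's dataset; only the indexing syntax carries over.

```python
import pandas as pd

# Hypothetical toy data, not the lab's retail_orders.csv
toy = pd.DataFrame({
    "city": ["Oslo", "Lima", "Cairo", "Perth", "Quito", "Hanoi"],
    "visits": [3, 5, 2, 5, 1, 3],
})

middle = toy.iloc[1:4]        # positions 1, 2, 3 (the end index 4 is excluded)
every_other = toy.iloc[::2]   # positions 0, 2, 4 (step of 2, starting at 0)
last = toy.iloc[-1]           # last row, returned as a Series
visits = toy["visits"]        # one column, returned as a Series
first_three = visits.head(3)  # first three values of that Series
distinct = visits.unique()    # distinct values in order of first appearance

print(middle)
print(distinct)
```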

In [None]:
# TODO: 1 Rows 5 to 9


In [None]:
# TODO: 2 Every 10th row


In [None]:
# TODO: 3 Last row


In [None]:
# TODO: 4 Quantity first 8


In [None]:
# TODO: 5 Unique order_type


### Task 2.2 – Create a Derived Column
Create `total_price = quantity * unit_price` using simple arithmetic. Then preview with `head()`.
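
Arithmetic between two columns is applied element-wise, row by row, and assigning the result to a new column name adds that column to the DataFrame. A minimal sketch with hypothetical columns (`hours`, `hourly_rate`, `pay`) that are not in the lab's dataset:

```python
import pandas as pd

# Hypothetical toy data for illustration only
toy = pd.DataFrame({
    "hours": [2, 3, 1],
    "hourly_rate": [10.0, 12.5, 20.0],
})

# Multiply the two columns row by row and store the result as a new column
toy["pay"] = toy["hours"] * toy["hourly_rate"]

print(toy.head())
```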

In [None]:
# TODO: Create total_price column then preview


### Task 2.3 – Basic Analyses Using Column Selection and Series Methods
Use Boolean filters inside `[]` together with the Series methods `.sum()`, `.mean()`, and `.unique()`:<br>
- **Total quantity** sold for the item `'Latte'` (e.g., `orders[orders['item'] == 'Latte']['quantity'].sum()`).<br>
    - `orders['item'] == 'Latte'`: This expression creates a Boolean Series with one True/False value per row.
    - `orders[orders['item'] == 'Latte']`: This uses Boolean indexing to filter the DataFrame, keeping only the rows where the condition is True.
    - `['quantity']`: This selects just the `quantity` column from the filtered DataFrame.
    - `.sum()`: This adds up the values in that column.
- **Average quantity** for orders with `order_type == 'takeout'`.<br>
- **Unique** items sold at the `'University'` branch.
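
The filter-then-aggregate steps described above can be traced on a small example. The `toy` table, its columns, and its values are hypothetical; the sketch only shows how a Boolean Series selects rows before a Series method is applied.

```python
import pandas as pd

# Hypothetical toy data, not the lab's dataset
toy = pd.DataFrame({
    "genre": ["drama", "comedy", "drama", "action"],
    "tickets": [4, 2, 6, 3],
    "cinema": ["North", "North", "South", "South"],
})

is_drama = toy["genre"] == "drama"           # Boolean Series: True, False, True, False
drama_rows = toy[is_drama]                   # keeps only the rows marked True
drama_tickets = drama_rows["tickets"].sum()  # 4 + 6 = 10

# The same steps chained in one expression, with other Series methods
south_mean = toy[toy["cinema"] == "South"]["tickets"].mean()    # (6 + 3) / 2 = 4.5
north_genres = toy[toy["cinema"] == "North"]["genre"].unique()  # 'drama', 'comedy'

print(drama_tickets, south_mean, north_genres)
```

Naming the intermediate results, as in the first three lines, can make a long chained expression easier to debug.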

**Windowed comparisons:** 
<br>
- For rows **0 – 24** (inclusive), compute the **mean quantity**.<br>
- For rows **25 – 49** (inclusive), compute the **mean quantity**.<br>
- Which window has the higher mean? 
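
A window comparison combines `iloc` slicing with `.mean()`. The sketch below uses a hypothetical six-row table split into two windows of three rows each; the lab's windows are larger, but the slicing rule is the same (the end index is excluded).

```python
import pandas as pd

# Hypothetical toy data for illustration only
toy = pd.DataFrame({"tickets": [4, 2, 6, 3, 5, 1]})

first_window = toy.iloc[0:3]["tickets"].mean()   # positions 0-2: (4 + 2 + 6) / 3 = 4.0
second_window = toy.iloc[3:6]["tickets"].mean()  # positions 3-5: (3 + 5 + 1) / 3 = 3.0

print(first_window, second_window)
print(first_window > second_window)  # True: the first window has the higher mean
```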

In [None]:
# Worked example from the task description: total quantity for Latte
orders[orders['item'] == 'Latte']['quantity'].sum()

In [None]:
# TODO: Mean quantity for takeout


In [None]:
# TODO: Unique items at University


In [None]:
# TODO: Compute window means and compare
