# Homework Assignment 2: Python Fundamentals

## **[Haohua Chen]**

**Due Date:** Feb 18, 2026

**Total Points:** 100

**Instructions:**
- Complete both problems in this notebook
- Write your code in the provided cells
- Save your completed `.ipynb` file to the `homework` folder in your private GitHub repository (shared with the instructor)
- Submit the link to your notebook on Canvas

**⚠️ IMPORTANT:** GitHub records the timestamp of every file update. Your notebook must be committed and pushed to GitHub **before the deadline**. **DO NOT** update the file after the deadline; late modifications will be flagged and may result in a grade penalty.

**Academic Integrity:** This is an individual assignment. You may consult course materials, Python documentation, and AI tools, and you may discuss concepts with classmates, but all code must be your own.

---

## Problem 1: Satellite Image Filename Processor (50 points)

You are working with Landsat satellite imagery. The filenames follow this format:

```
LC08_L2SP_024030_20240615_02_T1_SR_B4.TIF
```

Where the components are: `{sensor}_{level}_{pathrow}_{date}_{collection}_{tier}_SR_{band}.TIF`
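
The naming pattern above can also be checked with a regular expression before any fields are used. The following is a minimal sketch, not part of the required solution. The names `LANDSAT_PATTERN` and `matches_landsat_pattern` are hypothetical, and the field widths (4-character sensor and level, 6-digit path/row, 8-digit date, 2-digit collection, 2-character tier) are assumptions inferred from the single example filename, not from an official specification.

```python
import re

# Assumed field widths, inferred from the one example filename only.
LANDSAT_PATTERN = re.compile(
    r"^(?P<sensor>[A-Z]{2}\d{2})_"
    r"(?P<level>[A-Z0-9]{4})_"
    r"(?P<pathrow>\d{6})_"
    r"(?P<date>\d{8})_"
    r"(?P<collection>\d{2})_"
    r"(?P<tier>[A-Z0-9]{2})_"
    r"SR_(?P<band>B\d+)\.TIF$"
)


def matches_landsat_pattern(filename):
    """Return the named fields as a dict, or None if the name does not fit."""
    match = LANDSAT_PATTERN.match(filename)
    return match.groupdict() if match else None


fields = matches_landsat_pattern("LC08_L2SP_024030_20240615_02_T1_SR_B4.TIF")
print(fields["pathrow"], fields["date"], fields["band"])
```

A name that does not fit the assumed pattern yields `None`, which lets the caller skip or report it instead of failing on an index error later.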

### Your Tasks:

**Part A (20 points):** Write a function called `parse_landsat_filename` that takes a filename string and returns a dictionary containing:
- `sensor`: The sensor ID (e.g., "LC08")
- `path`: The path number (e.g., "024")
- `row`: The row number (e.g., "030")
- `date`: The acquisition date formatted as "YYYY-MM-DD" (e.g., "2024-06-15")
- `band`: The band identifier (e.g., "B4")
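
One possible way to produce the "YYYY-MM-DD" form is the standard-library `datetime` module, which also rejects impossible dates. This is only a sketch of that option; string slicing works as well, and the variable names here are illustrative.

```python
from datetime import datetime

raw_date = "20240615"  # date field as it appears in the filename

# strptime parses the compact form; strftime writes the dashed form.
formatted = datetime.strptime(raw_date, "%Y%m%d").strftime("%Y-%m-%d")
print(formatted)

# An impossible date such as month 13 raises ValueError.
try:
    datetime.strptime("20241340", "%Y%m%d")
    invalid_rejected = False
except ValueError:
    invalid_rejected = True
```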

**Part B (15 points):** Write a function called `classify_band` that takes a band identifier (e.g., "B4") and returns the band's common name based on this table:

| Band | Name |
|------|------|
| B1 | Coastal Aerosol |
| B2 | Blue |
| B3 | Green |
| B4 | Red |
| B5 | NIR |
| B6 | SWIR1 |
| B7 | SWIR2 |

If the band is not in the table, return "Unknown".

**Part C (15 points):** Process the list of filenames provided below. For each file:
1. Parse the filename using your function
2. Print a formatted summary showing the date, path/row, and band name
3. After processing all files, print how many unique dates are represented in the dataset

In [30]:
# Test filenames
filenames = [
    "LC08_L2SP_024030_20240615_02_T1_SR_B4.TIF",
    "LC08_L2SP_024030_20240615_02_T1_SR_B5.TIF",
    "LC08_L2SP_024030_20240701_02_T1_SR_B3.TIF",
    "LC08_L2SP_025031_20240615_02_T1_SR_B4.TIF",
    "LC09_L2SP_024030_20240708_02_T1_SR_B6.TIF",
    "LC08_L2SP_024030_20240701_02_T1_SR_B4.TIF",
]

# Part A: Write your parse_landsat_filename function here
def parse_landsat_filename(filename):
    parts = filename.replace(".TIF", "").split("_")

    sensor = parts[0]
    path = parts[2][:3]
    row = parts[2][3:]

    raw_date = parts[3]
    date = f"{raw_date[:4]}-{raw_date[4:6]}-{raw_date[6:]}"

    band = parts[-1]

    return {
        "sensor": sensor,
        "path": path,
        "row": row,
        "date": date,
        "band": band
    }

# Part B: Write your classify_band function here
def classify_band(band):
    band_map = {
        "B1": "Coastal Aerosol",
        "B2": "Blue",
        "B3": "Green",
        "B4": "Red",
        "B5": "Near Infrared (NIR)",
        "B6": "Shortwave Infrared 1 (SWIR1)",
        "B7": "Shortwave Infrared 2 (SWIR2)"
    }

    return band_map.get(band, "Unknown")

# Part C: Process all filenames and print results
unique_dates = set()

for f in filenames:
    info = parse_landsat_filename(f)
    band_name = classify_band(info["band"])

    print(f"Date: {info['date']}, Path/Row: {info['path']}/{info['row']}, Band: {band_name}")

    unique_dates.add(info["date"])

print("\nNumber of unique dates:", len(unique_dates))


Date: 2024-06-15, Path/Row: 024/030, Band: Red
Date: 2024-06-15, Path/Row: 024/030, Band: Near Infrared (NIR)
Date: 2024-07-01, Path/Row: 024/030, Band: Green
Date: 2024-06-15, Path/Row: 025/031, Band: Red
Date: 2024-07-08, Path/Row: 024/030, Band: Shortwave Infrared 1 (SWIR1)
Date: 2024-07-01, Path/Row: 024/030, Band: Red

Number of unique dates: 3


---
## Problem 2: Forest Plot Data Analysis (50 points)

You have collected tree measurement data from several forest plots. The data includes some missing values marked as `-999` and some potentially erroneous measurements that need to be flagged.

### Your Tasks:

**Part A (15 points):** Write a function called `calculate_basal_area` that:
- Takes DBH (diameter at breast height) in centimeters as input
- Returns the basal area in square meters using: $BA = \pi \times (DBH/200)^2$
- Returns `None` if DBH is negative, zero, or equals -999 (missing)
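
As a worked check of the formula: dividing DBH by 200 converts a diameter in centimeters to a radius in meters (divide by 100 for meters, then by 2 for the radius). For a DBH of 35.4 cm, the value recorded for tree 1 in the data below:

$$BA = \pi \times \left(\frac{35.4}{200}\right)^2 = \pi \times 0.177^2 \approx 0.0984\ \text{m}^2$$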

**Part B (15 points):** Write a function called `classify_tree` that takes DBH (cm) and height (m) and returns a dictionary with:
- `size_class`: "Seedling" (<10cm), "Sapling" (10-25cm), "Pole" (25-50cm), or "Mature" (≥50cm)
- `flag`: `True` if the data seems erroneous (DBH > 200cm, height > 60m, or height < 1m for trees with DBH > 10cm), `False` otherwise
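
The size-class ranges share their endpoints (10, 25, and 50 cm). The sketch below reads them as half-open intervals, so a DBH of exactly 10, 25, or 50 cm falls into the larger class. That reading is an assumption based on "Mature" being defined as ≥50cm. The names `SIZE_BREAKS`, `SIZE_NAMES`, and `size_class_for` are hypothetical, and the lookup uses the standard-library `bisect` module instead of an `if`/`elif` chain.

```python
from bisect import bisect_right

SIZE_BREAKS = [10, 25, 50]  # lower bounds of Sapling, Pole, Mature (cm)
SIZE_NAMES = ["Seedling", "Sapling", "Pole", "Mature"]


def size_class_for(dbh_cm):
    """Map a DBH in cm to a size class, treating each break as inclusive below."""
    # bisect_right counts how many breaks are <= dbh_cm.
    return SIZE_NAMES[bisect_right(SIZE_BREAKS, dbh_cm)]


boundary_classes = {dbh: size_class_for(dbh) for dbh in (9.9, 10, 25, 49.9, 50)}
print(boundary_classes)
```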

**Part C (20 points):** Process the tree data provided below:
1. For each tree, calculate basal area and classify it
2. Skip any trees with missing data (-999 values)
3. Print a warning for any flagged trees
4. Calculate and print summary statistics:
   - Total number of valid trees
   - Total basal area (sum of all valid trees)
   - Count of trees in each size class
   - Number of flagged records
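
The per-class tally in step 4 can be kept with `collections.Counter` from the standard library as an alternative to a hand-initialized dictionary. This is a small sketch that assumes size classes have already been assigned. The list `example_classes` is made-up illustration data, not the assignment's results.

```python
from collections import Counter

# Hypothetical labels for illustration only.
example_classes = ["Pole", "Sapling", "Pole", "Mature"]

class_counts = Counter(example_classes)

# A Counter returns 0 for keys it has never seen, so no pre-filling is needed.
print(class_counts["Pole"], class_counts["Seedling"])
```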

In [31]:
# Tree measurement data: [tree_id, species, dbh_cm, height_m]
tree_data = [
    [1, "Quercus rubra", 35.4, 22.1],
    [2, "Acer saccharum", 28.2, 18.5],
    [3, "Pinus strobus", -999, 25.0],      # Missing DBH
    [4, "Betula papyrifera", 18.7, 12.3],
    [5, "Quercus alba", 52.1, 24.8],
    [6, "Acer rubrum", 8.5, 6.2],
    [7, "Tsuga canadensis", 45.0, 85.0],   # Suspicious height
    [8, "Pinus strobus", 62.3, 28.4],
    [9, "Quercus rubra", 41.2, -999],      # Missing height
    [10, "Fagus grandifolia", 22.5, 0.5],  # Suspicious: short for this DBH
    [11, "Acer saccharum", 5.2, 3.1],
    [12, "Pinus resinosa", 38.9, 21.7],
]

# Part A: Write your calculate_basal_area function here
import math
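def calculate_basal_area(dbh_cm):
    """Return basal area in square meters, or None for missing or invalid DBH."""
    # The -999 missing-value code is already caught by <= 0; it is kept explicit for clarity.
    if dbh_cm <= 0 or dbh_cm == -999:
        return None
    # DBH / 200 converts a diameter in cm to a radius in m.
    return math.pi * (dbh_cm / 200) ** 2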

# Part B: Write your classify_tree function here
def classify_tree(dbh_cm, height_m):
    if dbh_cm < 10:
        size_class = "Seedling"
    elif dbh_cm < 25:
        size_class = "Sapling"
    elif dbh_cm < 50:
        size_class = "Pole"
    else:
        size_class = "Mature"

    # Flag implausible records: very large DBH, very tall trees,
    # or trees under 1 m tall despite a DBH above 10 cm.
    flag = dbh_cm > 200 or height_m > 60 or (dbh_cm > 10 and height_m < 1)

    return {"size_class": size_class, "flag": flag}

# Part C: Process all tree data and print results
total_valid = 0
total_basal_area = 0.0
flagged_count = 0

size_counts = {
    "Seedling": 0,
    "Sapling": 0,
    "Pole": 0,
    "Mature": 0
}

for tree_id, species, dbh, height in tree_data:

    if dbh == -999 or height == -999:
        continue

    ba = calculate_basal_area(dbh)
    if ba is None:
        continue

    result = classify_tree(dbh, height)

    total_valid += 1
    total_basal_area += ba
    size_counts[result["size_class"]] += 1

    if result["flag"]:
        flagged_count += 1
        print(f"WARNING: Tree {tree_id} ({species}) flagged (DBH={dbh} cm, Height={height} m)")

print("\nSummary statistics:")
print("Total number of valid trees:", total_valid)
print("Total basal area:", total_basal_area)
print("Count of trees in each size class:", size_counts)
print("Number of flagged records:", flagged_count)



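WARNING: Tree 7 (Tsuga canadensis) flagged (DBH=45.0 cm, Height=85.0 m)
WARNING: Tree 10 (Fagus grandifolia) flagged (DBH=22.5 cm, Height=0.5 m)

Summary statistics: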
Total number of valid trees: 10
Total basal area: 1.0318199787560514
Count of trees in each size class: {'Seedling': 2, 'Sapling': 2, 'Pole': 4, 'Mature': 2}
Number of flagged records: 2


---
## Submission Checklist

Before submitting, verify that:

- [✅] All code cells run without errors
- [✅] Both problems are complete
- [✅] Output is visible for all cells
- [✅] Your name is included
