# Climate Change Coding Assignment (Redo)
**Author:** Austin Plunkett  
**Date:** October 2025  

This notebook redoes my climate change coding assignment for CU Boulder. Using NOAA temperature data for Boulder, Colorado, I clean the daily values, plot them, and estimate a linear warming trend. Missing values are marked as NaN and excluded rather than filled with zeros, so they cannot distort the averages. Each plot includes a short interpretation, and the notebook concludes with a brief discussion of the observed trend and its context in the broader climate literature.

In [None]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import linregress

# Configure plots
plt.rcParams['figure.figsize'] = (10,5)
plt.rcParams['axes.grid'] = True

In [None]:
# Load and clean NOAA climate data
# Replace the filename below with your real dataset
df = pd.read_csv('data/your_climate_file.csv')

# Convert key columns to numeric and mark sentinel codes (-9999, -99.9) as NaN
for col in ['TAVG','TMAX','TMIN']:
    if col in df.columns:
        df[col] = pd.to_numeric(df[col], errors='coerce')
        df.loc[df[col].isin([-9999, -99.9]), col] = np.nan

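# Convert from tenths of °C to °C if necessary
# Heuristic: no plausible °C reading exceeds 80 in magnitude (Fahrenheit data would also trip this check)
temp_cols = [c for c in ['TAVG','TMAX','TMIN'] if c in df.columns]
if temp_cols and df[temp_cols].abs().max().max() > 80:
    df[temp_cols] = df[temp_cols] / 10.0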

# Parse dates (every later step needs a DATE column, so stop early without one)
if 'DATE' not in df.columns:
    raise KeyError('Please make sure your dataset has a DATE column.')
df['DATE'] = pd.to_datetime(df['DATE'])
df = df.sort_values('DATE')

# Drop rows with a missing average temperature
# (note: this also discards any TMAX/TMIN values recorded on those days)
df = df.dropna(subset=['TAVG'])
df.head()

### Plot 1: Daily Average Temperature
Below is a simple time series of Boulder’s daily average temperatures. The dataset was cleaned to remove missing values without filling them with zeros.

In [None]:
# Daily average temperature plot
plt.plot(df['DATE'], df['TAVG'], color='tab:blue')
plt.title('Daily Average Temperature in Boulder, CO')
plt.xlabel('Date')
plt.ylabel('Temperature (°C)')
plt.show()

**Takeaway:** The time series shows clear annual seasonality with warmer summers and colder winters. Dropping missing values avoids falsely low (zero) readings and keeps the physical meaning of the data intact.
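
To make the point about zero filling concrete, here is a minimal sketch using a made-up week of readings (the values and the `-9999` placement are illustrative, not taken from the Boulder file). It compares the mean when missing readings become NaN against the mean when they become zero.

```python
import numpy as np
import pandas as pd

# Hypothetical week of daily mean temperatures (°C); -9999 marks a missing reading
raw = pd.Series([12.0, 14.0, -9999, 13.0, -9999, 15.0, 16.0])

as_nan = raw.replace(-9999, np.nan)   # missing values become NaN
as_zero = raw.replace(-9999, 0.0)     # missing values become 0 °C (misleading)

mean_nan = as_nan.mean()    # pandas skips NaN by default
mean_zero = as_zero.mean()  # zeros are averaged in as if they were real readings

print(f"mean ignoring NaN:   {mean_nan:.2f} °C")
print(f"mean with zero fill: {mean_zero:.2f} °C")
```

In this toy case the zero-filled mean is 4 °C lower than the NaN-aware mean, even though no real reading was below 12 °C.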

### Trend Analysis
The yearly mean of the daily maximum temperature (TMAX, falling back to TAVG if TMAX is unavailable) is calculated, and a linear warming trend is fitted to those yearly means with ordinary least squares (OLS) regression.
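
As a rough illustration of what the OLS fit computes, the sketch below works out the slope in closed form on synthetic data (an exact 0.02 °C/year trend that I made up for the example) and converts it to °C per decade. The variable names are illustrative and are not the ones used in the analysis cells.

```python
import numpy as np

# Synthetic annual means (°C): an exact 0.02 °C/year trend, for illustration only
years = np.arange(1990, 2020, dtype=float)
temps = 17.0 + 0.02 * (years - years[0])

# Closed-form OLS slope: sum of cross-deviations divided by sum of squared x-deviations
x_dev = years - years.mean()
y_dev = temps - temps.mean()
slope = (x_dev * y_dev).sum() / (x_dev ** 2).sum()
intercept = temps.mean() - slope * years.mean()

# Convert the per-year slope to a per-decade rate
slope_per_decade = slope * 10
print(f"slope: {slope:.4f} °C/year, {slope_per_decade:.2f} °C/decade")
```

Because the synthetic series has no noise, the fit recovers the 0.02 °C/year trend exactly; real yearly means scatter around the line.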

In [None]:
# Calculate yearly mean of TMAX (or TAVG if TMAX not available)
# Grouping by calendar year avoids the resample('A') alias, which newer pandas deprecates
trend_col = 'TMAX' if 'TMAX' in df.columns else 'TAVG'
yearly = df.groupby(df['DATE'].dt.year)[trend_col].mean().dropna()

# Linear regression
years = yearly.index.to_numpy(dtype=float)
vals = yearly.values
slope, intercept, r, p, stderr = linregress(years, vals)
slope_decade = slope * 10

print({
    'slope_per_year_degC': round(slope, 4),
    'slope_per_decade_degC': round(slope_decade, 4),
    'r': round(r, 3),
    'p': round(p, 5)
})

In [None]:
# Plot with regression line
fig, ax = plt.subplots()
ax.plot(yearly.index, yearly.values, label='Yearly Mean Temp', color='tab:orange')
ax.plot(yearly.index, intercept + slope*years, label='OLS Trend', color='black')
ax.set_title('Trend in Yearly Mean Daily Maximum Temperature – Boulder, CO')
ax.set_xlabel('Year')
ax.set_ylabel('Temperature (°C)')
ax.legend()
ax.text(0.02, 0.02, f'Slope ≈ {slope_decade:.3f} °C/decade\np = {p:.3g}, r = {r:.2f}', transform=ax.transAxes, va='bottom')
plt.show()

**Interpretation:** The regression slope gives the rate of warming, reported here in °C per decade. A positive slope with a small p-value (conventionally below 0.05) indicates a statistically significant warming trend, while r measures how closely the yearly means follow the fitted line. For example, a slope of about 0.20 °C/decade would align closely with NOAA’s reported U.S. temperature trends.
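
A slope on its own says nothing about how uncertain it is. The sketch below, on synthetic noisy data that I made up, computes the standard error of the slope (the same quantity `linregress` returns as `stderr`) and turns it into a rough 95% interval. The "two standard errors" rule is an approximation; an exact interval would use a t critical value.

```python
import numpy as np

# Synthetic annual means (°C): a 0.02 °C/year trend plus random noise (illustration only)
rng = np.random.default_rng(0)
years = np.arange(1980, 2020, dtype=float)
temps = 17.0 + 0.02 * (years - years[0]) + rng.normal(0.0, 0.5, size=years.size)

# OLS fit in closed form
x_dev = years - years.mean()
slope = (x_dev * (temps - temps.mean())).sum() / (x_dev ** 2).sum()
intercept = temps.mean() - slope * years.mean()

# Standard error of the slope from the residual variance (n - 2 degrees of freedom)
residuals = temps - (intercept + slope * years)
stderr = np.sqrt((residuals ** 2).sum() / (years.size - 2) / (x_dev ** 2).sum())

# Rough 95% interval: about two standard errors either side of the estimate
low, high = (slope - 2 * stderr) * 10, (slope + 2 * stderr) * 10
print(f"{slope * 10:.3f} °C/decade (roughly {low:.3f} to {high:.3f})")
```

If the interval comfortably excludes zero, that is the same message as a small p-value, expressed in the units of the trend.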

### Optional Rolling Mean
To show the smoothed trend, a 5-year rolling mean is calculated and plotted below. Because `min_periods=1` is used, the first four points are averaged over fewer than five years.

In [None]:
roll = yearly.rolling(window=5, min_periods=1).mean()
plt.plot(yearly.index, yearly.values, alpha=0.5, label='Yearly Mean')
plt.plot(roll.index, roll.values, label='5-Year Rolling Mean', linewidth=2)
plt.title('Smoothed Temperature Trend (5-Year Rolling Mean)')
plt.xlabel('Year')
plt.ylabel('Temperature (°C)')
plt.legend()
plt.show()

**Takeaway:** The rolling mean highlights the long-term warming signal beyond short-term variability, reinforcing the positive temperature trend observed in Boulder.
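
How the window is defined changes what the smoothed line means at the edges. This small sketch, with made-up yearly values, compares the `min_periods=1` setting used above with two alternatives: requiring a full five years, and centering the window on each year.

```python
import pandas as pd

# Hypothetical annual means (°C), indexed by year
yearly_demo = pd.Series([10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0],
                        index=range(2000, 2007))

# Trailing window that accepts partial data: early years average fewer values
partial = yearly_demo.rolling(window=5, min_periods=1).mean()

# Trailing window that needs all 5 years: the first four results are NaN
full = yearly_demo.rolling(window=5).mean()

# Window centered on each year: two NaN values at each end
centered = yearly_demo.rolling(window=5, center=True).mean()

print(pd.DataFrame({'partial': partial, 'full': full, 'centered': centered}))
```

The partial window gives a line with no gaps, at the cost of noisier early points; the centered window lines the smoothed value up with the middle year of each window rather than the last.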

## Conclusion
The estimated trend in the yearly mean of daily maximum temperature for Boulder is approximately **0.2 °C per decade**, consistent with broader NOAA findings of ~0.18–0.20 °C per decade across the continental U.S. This suggests measurable local warming over recent decades. Data limitations include potential changes in station equipment, missing records, and the relatively short period analyzed. Despite these limitations, the results indicate a clear warming signal in the Boulder region.
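
One way to gauge the missing-records limitation is to count valid observations per year and drop years that are mostly empty before fitting a trend. The sketch below uses a made-up record, and the 90% coverage threshold is my own assumption, not a NOAA standard.

```python
import pandas as pd

# Hypothetical daily record: 2020 is complete, 2021 has only January
dates = pd.date_range('2020-01-01', '2020-12-31').append(
    pd.date_range('2021-01-01', '2021-01-31'))
demo = pd.DataFrame({'DATE': dates, 'TMAX': 15.0})

# Count valid (non-NaN) observations per calendar year
counts = demo.groupby(demo['DATE'].dt.year)['TMAX'].count()

# Keep years with at least 90% of days present (the threshold is an assumption)
min_days = int(0.9 * 365)
complete_years = counts[counts >= min_days].index.tolist()

print(counts)
print('Years kept:', complete_years)
```

A year with only winter months, like the hypothetical 2021 here, would otherwise enter the regression as an unusually cold annual mean.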