## Using this notebook

### Climate Trends in Ramona, California (1998–2025)

This notebook analyzes long-term temperature trends in Ramona, CA using NOAA NCEI data. We assess whether local temperature patterns support the assumption that Ramona has experienced measurable climate change over the past 27 years.

### How this supports the broader project
The increasing minimum and maximum temperatures in Ramona provide important context for interpreting vegetation recovery. Warmer conditions may influence drought stress, shrubland regeneration, and NDVI patterns in the years following major wildfire events.

### Step 0: Libraries
- Import required libraries
### Step 1: Import the climate data
- Pull the NCEI data
### Step 2: Clean the dataframe
- For this portion of the project we need to pull the TMAX and TMIN columns
### Step 3: Convert temperature units
- For this portion of the project we will relabel and recalculate the temperatures in the TMAX and TMIN columns
- The recalculation adds Celsius columns alongside the Fahrenheit ones
### Step 4: Plot the data using only Fahrenheit
- Plot the Fahrenheit data for Ramona, CA from 1998 to 2025
    - We will not use Celsius for this, as US audiences are more familiar with Fahrenheit
- We will not have Python plot every single date, as the graph would be illegible
    - Instead we will plot annual averages calculated with the mean() function
### Step 5: Save the plot
- Save the Ramona, CA average yearly temperature plots and export them as HTML and PNG

### WARNING 
- This code was run locally. If reproducing this project, follow the directions below for downloading the required NOAA/NCEI data for this portion of the project. 
- Pay attention to your directories as well. Either delete the directories in code cell 2 or replace them with your own directory format.

In [1]:
# Step 0: libraries

# Libraries for working with NCEI data
from pathlib import Path
import pandas as pd

# Libraries for plotting and saving plot
import hvplot.pandas
import holoviews as hv

# Making trend (slope) lines
import numpy as np
from sklearn.metrics import r2_score

### What the code in 'Step 0.1: Project Paths' does
In this step, we define reusable directory paths for the project. Setting these once at the top of the notebook helps ensure that data, figures, and exported files are saved in consistent locations across multiple notebooks. Future notebooks use similar code to create other useful directories, such as one for figures.


In [2]:
# Step 0.1

# Project paths 
PROJECT_ROOT = Path("..").resolve()
DATA_DIR = PROJECT_ROOT / "data"
TEMP_DIR = DATA_DIR / "temperature"


### Selecting your own data from NOAA/NCEI
In this notebook I will be looking at temperature data for Ramona, CA. You can use any station data you would like for your own project. I have downloaded this data directly from NOAA. Follow the steps below to use your own data.
1. Go to the NOAA National Centers for Environmental Information website, and from the 'Home' tab navigate to 'Climate Data Online'.
2. From there, use the search tool, which is available here: https://www.ncei.noaa.gov/cdo-web/search
3. Under 'Select weather observation type/dataset', select 'Daily Summaries'.
4. For the 'date range', enter the start and end dates (month-day-year) you are interested in.
5. For 'search for', select 'Stations'.
6. Lastly, enter a search term. If you already know your station ID, add it here; otherwise type in a location name, and the search will show you all of the available stations tagged to that area.
7. Regardless of which station you end up adding to your cart, make sure that you download your file as a CSV and that the download includes the 'air temperature' option, which will give you all the data you need for this project.

### Ramona, CA station information

This station (GHCND:USW00053120) provides the most complete and continuous temperature record for the Ramona area, making it suitable for trend analysis from 1998–2025.

In this notebook I am using the following station and download settings:

- **Station:** GHCND:USW00053120
- **Begin date:** 1998-04-16 00:00
- **End date:** 2025-08-26 23:59
- **Data types:** TAVG, TMAX, TMIN
- **Units:** Standard
- **Custom flag(s):** Station Name

### Step 1: Import data into Python

- index_col='DATE' – this sets the DATE column as the index, which is needed for subsetting and resampling later on
- parse_dates=True – this tells pandas to parse the index column as dates, so the index holds datetime values suited to time-series work
- na_values=['NaN'] – this tells pandas which strings in the file to treat as missing values
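
To see what these three arguments do before pointing them at the real file, here is a minimal sketch that reads a tiny in-memory CSV. The three sample rows and their column order are assumptions made up for illustration (the missing TMIN value in particular is invented), so treat this as a demonstration of the arguments rather than a copy of the station file.

```python
from io import StringIO
import pandas as pd

# Hypothetical three-row sample; the real download may order its columns differently
sample_csv = StringIO(
    "STATION,NAME,DATE,TAVG,TMAX,TMIN\n"
    'USW00053120,"RAMONA AIRPORT, CA US",1998-04-16,48,61,34\n'
    'USW00053120,"RAMONA AIRPORT, CA US",1998-04-17,53,69,NaN\n'
    'USW00053120,"RAMONA AIRPORT, CA US",1998-04-18,55,72,37\n'
)

sample_df = pd.read_csv(
    sample_csv,
    index_col="DATE",    # DATE becomes the row index
    parse_dates=True,    # the index is parsed into datetime values
    na_values=["NaN"],   # the string "NaN" is read as a missing value
)

# The index is now a DatetimeIndex, so we can slice by date strings
print(type(sample_df.index).__name__)
print(sample_df.loc["1998-04-17":"1998-04-18", "TMAX"].tolist())

# Count how many TMIN readings are missing
print(sample_df["TMIN"].isna().sum())
```

Because the index holds real dates, the date-based slicing here and the resampling later in the notebook both work without any extra conversion.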


In [3]:
# Step 1: Import data into python
# Using your downloaded data we will pull and read the CSV

## NOTE ##
# It is good practice to rename your data files into something descriptive and easy to identify
# For example, I have renamed the Ramona, CA station data as 'ncei-climate-ramona.csv' 
# Instead of typing this out we will create a var called csv_path

# Specify path
csv_path = TEMP_DIR / "ncei-climate-ramona.csv"

ramona_climate_df = pd.read_csv(
    csv_path,
    index_col="DATE",
    parse_dates=True,
    na_values=["NaN"]
)

ramona_climate_df.head()

DATE,STATION,NAME,TAVG,TMAX,TMIN
1998-04-16,USW00053120,"RAMONA AIRPORT, CA US",48.0,61.0,34.0
1998-04-17,USW00053120,"RAMONA AIRPORT, CA US",53.0,69.0,36.0
1998-04-18,USW00053120,"RAMONA AIRPORT, CA US",55.0,72.0,37.0
1998-04-19,USW00053120,"RAMONA AIRPORT, CA US",59.0,77.0,41.0
1998-04-20,USW00053120,"RAMONA AIRPORT, CA US",63.0,82.0,43.0


### Step 2: clean the dataframe

- We will get rid of the unnecessary columns in this CSV
- You can use double brackets ([[ and ]]) to select only the columns that you want from the dataframe
- Column names are strings, so wrap each one in quotes; Python accepts either single (' ') or double (" ") quotes
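
As a quick illustration of the double-bracket selection, here is a minimal sketch on a made-up two-day dataframe. The name `demo_df` and its values are hypothetical stand-ins for the station data.

```python
import pandas as pd

# Hypothetical two-day frame with the same temperature columns as the station file
demo_df = pd.DataFrame(
    {"TAVG": [48.0, 53.0], "TMAX": [61.0, 69.0], "TMIN": [34.0, 36.0]},
    index=pd.to_datetime(["1998-04-16", "1998-04-17"]),
)

# Double brackets pass a list of names and return a DataFrame
subset = demo_df[['TMAX', 'TMIN']]

# Single brackets with one name return a Series instead
single = demo_df['TMAX']

print(type(subset).__name__, list(subset.columns))
print(type(single).__name__)
```

The outer brackets do the indexing and the inner brackets build the list of column names, which is why a multi-column selection needs both.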

In [4]:
# Step 2: Clean the dataframe
ramona_climate_max_min_df = ramona_climate_df[['TMAX', 'TMIN']]

# We are using TMAX and TMIN because we later want to plot the average yearly max and min temperatures.
# This will help clarify not only whether Ramona, CA is getting warmer, but also how much hotter the highs are year to year.
# The same logic applies to how much the min temp is rising over time.
# We are generally more interested in the effects of these extremes when it comes to ecosystem responses.

# I also know the climate of Ramona, CA well. We generally have more extreme temperature swings in the winter and spring.
# In the summer we stay consistently hot and do not really cool off that much in the evenings.
# As a result, TAVG will actually mask a lot of the seasonal swings that are important to fire season in Ramona, CA,
# especially during October, the month in which we see the most fire activity.

# Call the new variable
ramona_climate_max_min_df

DATE,TMAX,TMIN
1998-04-16,61.0,34.0
1998-04-17,69.0,36.0
1998-04-18,72.0,37.0
1998-04-19,77.0,41.0
1998-04-20,82.0,43.0
...,...,...
2025-08-22,100.0,62.0
2025-08-23,103.0,61.0
2025-08-24,95.0,61.0
2025-08-25,95.0,63.0


### Step 3: convert temperature units

- If you remember from earlier, the units for the NCEI data were 'Standard'. Check the GHCNd documentation to find out what that means for temperature.
- See the GHCNd documentation here: https://www.ncei.noaa.gov/data/global-historical-climatology-network-daily/doc/GHCND_documentation.pdf
- The TMAX and TMIN columns are reported in Fahrenheit, but we will also want Celsius
- We will use some basic math to accomplish this in Python
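
The "basic math" is the standard conversion °C = (°F − 32) × 5/9. Here is a minimal sketch of it as a small function; the name `fahrenheit_to_celsius` is just an illustrative helper and is not used elsewhere in this notebook.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from °F to °C (works on numbers and on pandas columns)."""
    return (temp_f - 32) * 5 / 9


# Sanity checks against the freezing and boiling points of water
print(fahrenheit_to_celsius(32))
print(fahrenheit_to_celsius(212))

# The first TMAX reading in the Ramona data (61 °F)
print(round(fahrenheit_to_celsius(61.0), 6))
```

Checking a formula against two values you already know (freezing and boiling) is a cheap way to catch a swapped 5/9 and 9/5 before applying it to a whole column.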

In [5]:
# Step 3: Convert temperature units
# Label the TMAX and TMIN columns with their units (°F)

climate_temp = ramona_climate_max_min_df.rename(columns={
    "TMAX": "max_temp_f",
    "TMIN": "min_temp_f"
})

# We will then convert temp_f to temp_c (Celsius) using a basic equation

climate_temp['max_temp_c'] = ((climate_temp['max_temp_f']-32)*5/9)
climate_temp['min_temp_c'] = ((climate_temp['min_temp_f']-32)*5/9)

# Call the new var to see the date, temp_f and temp_c columns
climate_temp.head()

DATE,max_temp_f,min_temp_f,max_temp_c,min_temp_c
1998-04-16,61.0,34.0,16.111111,1.111111
1998-04-17,69.0,36.0,20.555556,2.222222
1998-04-18,72.0,37.0,22.222222,2.777778
1998-04-19,77.0,41.0,25.0,5.0
1998-04-20,82.0,43.0,27.777778,6.111111


### Step 4: plot the data using only Fahrenheit

- We do not want to plot every single day of data for Ramona, CA from 1998–2025, so we will resample the dates by creating a new variable that holds the average temperature for each year in the NCEI data.
- For this data, we will resample to the start of the year, or YS.
- To get the average, we will then tell Python to use the mean() function.

### Why resample to annual averages?
Daily data can be noisy and harder to interpret. To analyze long-term climate trends, we resample the dataset to yearly averages. This step allows us to detect underlying warming or cooling patterns over multiple decades.
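
Here is a minimal sketch of annual resampling on a synthetic daily series, so the numbers are easy to check by hand. The series is made up: it starts on 1998-04-16 like the Ramona record and holds a constant 75 °F in 1998 and 77 °F in 1999. The extra `count` column is my own addition to show how many daily readings feed each annual mean.

```python
import numpy as np
import pandas as pd

# Synthetic daily record that starts mid-April, like the Ramona file
dates = pd.date_range("1998-04-16", "1999-12-31", freq="D")
daily = pd.DataFrame({"max_temp_f": np.full(len(dates), 75.0)}, index=dates)

# Make the second year 2 °F warmer so the two annual means differ
daily.loc[daily.index.year == 1999, "max_temp_f"] = 77.0

# 'YS' groups the rows by calendar year; agg returns the mean and the number of days used
annual = daily["max_temp_f"].resample("YS").agg(["mean", "count"])
annual.index = annual.index.year

print(annual)
```

The day count is worth a glance on the real data too. The Ramona record begins on 1998-04-16 and ends on 2025-08-26, so the first and last annual means are built from partial years and may not be directly comparable to the full years in between.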

In [None]:
# Step 4: Plot the data using only Fahrenheit

# Create ann_ramona_climate_df from climate_temp by calculating the average annual value of each temperature column
# 'YS' is the start of the calendar year
# mean() averages every column in climate_temp (both the temp_f and temp_c columns)
ann_ramona_climate_df = (
    climate_temp
    .resample('YS')
    .mean()
)

# Replace the DATE index with the year, and name it "year", so that the displayed table is easier to interpret
ann_ramona_climate_df.index = ann_ramona_climate_df.index.year
ann_ramona_climate_df.index.name = "year"

# Call the new var
ann_ramona_climate_df

# Note that we now have years listed with the mean max and min temps
# Since we are using the mean here, we lose the high peaks Ramona, CA gets in the summers (often over 100 °F)

year,max_temp_f,min_temp_f,max_temp_c,min_temp_c
1998,74.799127,45.50655,23.777293,7.503639
1999,75.240331,42.930939,24.022406,6.072744
2000,75.763085,45.421488,24.312825,7.456382
2001,73.967033,45.068681,23.315018,7.260379
2002,74.620879,44.258242,23.678266,6.810134
2003,75.513889,45.663889,24.174383,7.591049
2004,75.456284,45.084699,24.14238,7.269277
2005,75.476712,45.643836,24.153729,7.579909
2006,76.753425,44.879452,24.863014,7.155251
2007,76.723757,44.475138,24.846532,6.930632


### Understanding the trend calculations below
To quantify long-term temperature change, we calculate:
- **Slope**: the average change in °F per year  
- **Intercept**: the model-predicted value at year 0 (the fit extrapolates all the way back to calendar year 0, far outside our data, so don't worry about the intercept values)
- **R²**: the fraction of the year-to-year variation that the trend line explains (0 means none, 1 means all)  

These values help assess how rapidly Ramona is warming.
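
To make these three quantities concrete, here is a minimal sketch on a synthetic series with a known answer: exactly 0.1 °F of warming per year and no noise. The variable names and values are invented for illustration. R² is computed by hand as 1 − SS_res / SS_tot, which is the same definition `r2_score` uses.

```python
import numpy as np

# Synthetic annual series: 70 °F in 1998, rising exactly 0.1 °F per year
years = np.arange(1998, 2026, dtype=float)
temps = 70.0 + 0.1 * (years - 1998)

# A degree-1 polynomial fit returns the slope first, then the intercept
slope, intercept = np.polyfit(years, temps, 1)
predicted = slope * years + intercept

# R² = 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum((temps - predicted) ** 2)
ss_tot = np.sum((temps - temps.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"Slope:     {slope:.4f} °F/year ({slope * 10:.4f} °F/decade)")
print(f"Intercept: {intercept:.2f} °F (extrapolated to year 0)")
print(f"R²:        {r_squared:.4f}")
```

The intercept comes out strongly negative even though no temperature in the series is below 70 °F, which shows why it should not be read as a real temperature.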

In [8]:
#### CONT ####
# Step 4: Plot the data using only Fahrenheit

# Use new var ann_ramona_climate_df to plot the annual data

# Give this plot a descriptive title and label the x and y axis

# By setting y to max_temp_f it will display only the max Fahrenheit data
# We will do this again to make a separate figure for min_temp_f

### Why create individual figures? ###
# Mainly because the range of the graph (y) with both plotted will make it harder to see the magnitude of peaks and lows

# Creating the slope for the ramona_climate_max_f figure

# Extract values
x = ann_ramona_climate_df.index.values.astype(float)
y_max = ann_ramona_climate_df["max_temp_f"].values

# Compute linear regression (slope & intercept)
slope_max, intercept_max = np.polyfit(x, y_max, 1)
y_max_trend = slope_max * x + intercept_max

# Use hvplot not plot so that the resulting figure is interactive
ramona_climate_max_f = ann_ramona_climate_df.hvplot(
    y=["max_temp_f"],
    title= 'Ramona, CA Average Max Yearly Temperature',
    xlabel='Year',
    ylabel='Temperature (°F)',
    line_width=2,
)

# Trend line as separate HoloViews Curve
trend_line_max = hv.Curve(
    (x, y_max_trend),
    kdims="Year", vdims="Temperature (°F)"
).opts(color="red", line_width=2, line_dash="dashed", alpha=0.7)

# Overlay the trend on only this figure
ramona_climate_max_f_with_trend = ramona_climate_max_f * trend_line_max

# Display it
ramona_climate_max_f_with_trend


In [None]:
# We can also call the slope, intercept and R2 per year and per decade

# In the cell above we already calculated the slope_max and intercept_max so we reuse them here

# Compute predicted values for R²
y_max_pred = slope_max * x + intercept_max

# R-squared
r2_max = r2_score(y_max, y_max_pred)

# Decadal slope
slope_max_decade = slope_max * 10

# Print summary
# These print statements label each value so the output is easy to read
# Everything is displayed in the appropriate units
# The intercept isn't helpful here because it's the predicted temperature at year 0, far outside our data, so it is meaningless for this analysis
print("=== Ramona, CA Annual Max Temperature Trend ===")
print(f"Slope (per year):   {slope_max:.4f} °F/year")
print(f"Slope (per decade): {slope_max_decade:.4f} °F/decade")
print(f"Intercept:          {intercept_max:.2f}")
print(f"R²:                 {r2_max:.4f}")

=== Ramona, CA Annual Max Temperature Trend ===
Slope (per year):   0.1250 °F/year
Slope (per decade): 1.2496 °F/decade
Intercept:          -174.67
R²:                 0.4500


In [10]:
# We can also see it without the trend line

# Call the plot variable with no trend
ramona_climate_max_f

In [11]:
#### CONT ####
# Step 4: Plot the data using only Fahrenheit

# Now we will run the same code as above, but using min_temp_f instead

# Use new var ann_ramona_climate_df to plot the annual data

# Give this plot a descriptive title and label the x and y axis

# By setting y to min_temp_f it will display only the min Fahrenheit data

# Extract values for regression
x = ann_ramona_climate_df.index.values.astype(float)
y_min = ann_ramona_climate_df["min_temp_f"].values

# Compute linear regression (slope & intercept)
slope_min, intercept_min = np.polyfit(x, y_min, 1)
y_min_trend = slope_min * x + intercept_min

# Plot actual values (min temperatures)
ramona_climate_min_f = ann_ramona_climate_df.hvplot(
    y="min_temp_f",
    title="Ramona, CA Average Min Yearly Temperature",
    xlabel="Year",
    ylabel="Temperature (°F)",
    line_width=2,
)

# Trend line as separate HoloViews Curve
trend_line_min = hv.Curve(
    (x, y_min_trend),
    kdims="Year",
    vdims="Temperature (°F)"
).opts(
    color="red",
    line_width=2,
    line_dash="dashed",
    alpha=0.7
)

# Overlay the trend on only this figure
ramona_climate_min_f_with_trend = ramona_climate_min_f * trend_line_min

# Display it
ramona_climate_min_f_with_trend

In [None]:
#### CONT ####
# Step 4: Plot the data using only Fahrenheit
# We can also call the slope, intercept and R2 per year and per decade

# In the cell above we already calculated the slope_min and intercept_min so we reuse them here
# Predicted values for R²
y_min_pred = slope_min * x + intercept_min

# R-squared
r2_min = r2_score(y_min, y_min_pred)

# Decadal slope
slope_min_decade = slope_min * 10

# Print summary
print("=== Ramona, CA Annual Min Temperature Trend ===")
print(f"Slope (per year):   {slope_min:.4f} °F/year")
print(f"Slope (per decade): {slope_min_decade:.4f} °F/decade")
print(f"Intercept:          {intercept_min:.2f}")
print(f"R²:                 {r2_min:.4f}")

=== Ramona, CA Annual Min Temperature Trend ===
Slope (per year):   0.0507 °F/year
Slope (per decade): 0.5071 °F/decade
Intercept:          -56.59
R²:                 0.1290


In [13]:
# We can also see it without the trend line

# Call the plot variable with no trend
ramona_climate_min_f

In [None]:
#### CONT ####
# Step 4: Plot the data using only Fahrenheit

# To make it easier to view these figures we can stack them
# When we save these as html and static images we can save them both individually and joined

# You will notice that I have named these figures differently. That is because I don't want the trend line showing here,
# like we have in the individual trend figures above.

# Specifying .cols(1) lays the figures out in a single column, stacking them top and bottom

ramona_max_temp = ann_ramona_climate_df.hvplot(
    y="max_temp_f",
    title="Annual Average Maximum Temperature",
    xlabel="Year", ylabel="Temperature (°F)", line_width=2,
)

ramona_min_temp = ann_ramona_climate_df.hvplot(
    y="min_temp_f",
    title="Annual Average Minimum Temperature",
    xlabel="Year", ylabel="Temperature (°F)", line_width=2,
)
(ramona_max_temp + ramona_min_temp).cols(1)

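### Step 5: Save the plot
To make results reusable in later notebooks and for the final report, we save the annual temperature figures as interactive `.html` files, which can be opened outside Jupyter and retain their interactivity, and as static `.png` files. Depending on your setup, the `.png` export may need extra packages (with the default Bokeh backend, typically `selenium` and a browser driver).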


In [None]:
# Step 5: Save the plots 
FIG_DIR = PROJECT_ROOT / "figures"
FIG_DIR.mkdir(exist_ok=True)

# Interactive figures with NO trend lines

hv.save(ramona_climate_min_f, FIG_DIR / "Ramona_avg_min_yr_temp_f.html")

hv.save(ramona_climate_max_f, FIG_DIR / "Ramona_avg_max_yr_temp_f.html")

combined = (ramona_max_temp + ramona_min_temp).cols(1)
hv.save(combined, FIG_DIR / "ramona_max_min_temps_vertical.html")

# Interactive figures WITH trend lines

hv.save(ramona_climate_max_f_with_trend, FIG_DIR / "ramona_max_temp_trend.html")

hv.save(ramona_climate_min_f_with_trend, FIG_DIR / "ramona_min_temp_trend.html")


# Static figures with NO trend lines

hv.save(ramona_climate_max_f, FIG_DIR / "ramona_avg_max_temp_f_year.png", fmt="png")

hv.save(ramona_climate_min_f, FIG_DIR / "ramona_avg_min_temp_f_year.png", fmt="png")

hv.save(combined, FIG_DIR / "ramona_max_min_stacked.png", fmt="png")

# Static figures WITH trend lines

hv.save(ramona_climate_max_f_with_trend, FIG_DIR / "ramona_max_temp_f_trend.png", fmt="png")

hv.save(ramona_climate_min_f_with_trend, FIG_DIR / "ramona_min_temp_f_trend.png", fmt="png")

### Storing variables for later notebooks
The `%store` command saves the cleaned annual temperature dataframe so that other notebooks in this project can load it with `%store -r ann_ramona_climate_df`, without needing to reload or reprocess the original data.
Store only data variables; you should not store plots.

In [16]:
# Store any variables for future notebooks
%store ann_ramona_climate_df

Stored 'ann_ramona_climate_df' (DataFrame)


# Ramona, CA: The hot is getting hotter and the cold is getting warmer

## Temperature Trends in Ramona, CA (1998–2025)

Analysis of annual maximum and minimum temperatures from the NOAA NCEI Ramona Airport station shows clear evidence of warming over the past 27 years. Both daytime highs (TMAX) and nighttime lows (TMIN) have increased, although at different rates.

## Minimum Temperature Trend (Nighttime Lows)

Nighttime temperatures in Ramona exhibit a modest but consistent upward trend, which means our cool nights are getting warmer.

- **Warming rate:** +0.0507 °F per year  
- **Equivalent to:** +0.507 °F per decade  
- **R²:** 0.129  

The relatively low R² indicates substantial year-to-year variability around the trend line, but the fitted trend still points toward warming. Increasing minimum temperatures reflect reduced nighttime cooling, and perhaps broader effects of climate change. If we take our yearly warming rate (0.0507 °F per year) and multiply it by the number of years (27), the modeled minimum temperature increased by about 1.37 °F.

## Maximum Temperature Trend (Daytime Highs)

Daytime temperatures show a stronger and more consistent increase, which means our hot days are getting hotter.

- **Warming rate:** +0.1250 °F per year  
- **Equivalent to:** +1.2496 °F per decade  
- **R²:** 0.450  

The higher R² demonstrates a clearer linear warming pattern in daytime highs, consistent with observed regional heat intensification in inland Southern California. Also of note is how rapidly Ramona is warming. If we take our yearly warming rate (0.1250 °F per year) and multiply it by the number of years (27), the modeled maximum temperature increased by about 3.38 °F. That is very fast for a 27-year period.

## Overall Implications

Taken together, the results show that Ramona is warming, with:

- Faster increases in **daytime maximum** temperatures  
- Steady but slower increases in **nighttime minimum** temperatures
- From 1998 to 2025, Ramona’s daytime high temperatures increased by about 3.4 °F, while nighttime minimum temperatures increased by about 1.4 °F.  
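
The two totals in the last bullet are just the fitted slope multiplied by the length of the record. Here is a minimal sketch of that arithmetic, using the slopes printed earlier in this notebook. The variable names are illustrative, and these are modeled changes along the trend line, not differences between two observed years.

```python
# Slopes reported by the trend fits above (°F per year)
slope_max = 0.1250
slope_min = 0.0507

# Length of the record in years
n_years = 2025 - 1998

# Modeled change along each trend line over the whole record
change_max = slope_max * n_years
change_min = slope_min * n_years

print(f"Modeled change in daytime highs:  {change_max:.2f} °F")
print(f"Modeled change in nighttime lows: {change_min:.2f} °F")
```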

These patterns have important implications for:

- **Fire risk:** hotter, drier afternoons increase ignition and spread potential (especially with Southern California's Santa Ana winds)  
- **Ecosystem stress:** vegetation faces higher evaporative demand (meaning when we do our NDVI notebooks we *should* see lower NDVI over time)  
- **Human exposure:** increases in heat-related illnesses (e.g., heat stroke)

## So why have we spent our first notebook looking at climate for Ramona, CA if our focus is on fire?

The stronger rise in maximum temperatures aligns with broader climate trends across Southern California and may compound post-fire recovery challenges in areas affected by the 2003 Cedar Fire and 2007 Witch Fire. Knowing this climate context is important for helping us interpret the NDVI boundary data we analyze and the changes in land cover (e.g., less drought-tolerant species may disappear from the region over time).