# Climate Coding Challenge

Climate change is impacting the way people live around the world.

## There are more Earth Observation data online than any one person could ever look at

[NASA’s Earth Observing System Data and Information System (EOSDIS)
alone manages over 9PB of
data](https://www.earthdata.nasa.gov/learn/articles/getting-petabytes-people-how-eosdis-facilitates-earth-observing-data-discovery-and-use).
1 PB is roughly 100 times the entire Library of Congress (a good
approximation of all the books available in the US). It’s all available
to **you** once you learn how to download what you want.

Here we’re using the NOAA National Centers for Environmental Information
(NCEI) [Access Data
Service](https://www.ncei.noaa.gov/support/access-data-service-api-user-documentation)
application programming interface (API) to request data from their web
servers. We will be using data collected as part of the Global
Historical Climatology Network daily (GHCNd) dataset, which is available
through NOAA’s [Climate Data
Online library](https://www.ncdc.noaa.gov/cdo-web/datasets).

For this example we’re requesting [daily summary data in Karachi,
Pakistan (station ID
PKM00041780)](https://www.ncdc.noaa.gov/cdo-web/datasets/GHCND/stations/GHCND:PKM00041780/detail).

<link rel="stylesheet" type="text/css" href="./assets/styles.css"><div class="callout callout-style-default callout-titled callout-response"><div class="callout-header"><div class="callout-icon-container"><i class="callout-icon"></i></div><div class="callout-title-container flex-fill">Research and cite your data</div></div><div class="callout-body-container callout-body"><ol type="1">
<li>Research the <a
href="https://www.ncei.noaa.gov/metadata/geoportal/rest/metadata/item/gov.noaa.ncdc:C00861/html"><strong>Global
Historical Climatology Network - Daily</strong></a> data source.</li>
<li>In the cell below, write a 2-3 sentence description of the data
source.</li>
<li>Include a citation of the data (<strong>HINT:</strong> See the ‘Data
Citation’ tab on the GHCNd overview page).</li>
</ol>
<p>Your description should include:</p>
<ul>
<li>who takes the data</li>
<li>where the data were taken</li>
<li>what the maximum temperature units are</li>
<li>how the data are collected</li>
</ul></div></div>

**YOUR DATA DESCRIPTION AND CITATION HERE** 🛎️

## Access NCEI GHCNd Data from the internet using its API 🖥️ 📡 🖥️

The cell below contains the URL for the data you will use in this part
of the notebook. We created this URL by generating what is called an
**API endpoint** using the NCEI [API
documentation](https://www.ncei.noaa.gov/support/access-data-service-api-user-documentation).

> **What’s an API?**
>
> An **application programming interface** (API) is a way for two or
> more computer programs or components to communicate with each other.
> It is a type of software interface, offering a service to other pieces
> of software ([Wikipedia](https://en.wikipedia.org/wiki/API)).
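To make this concrete: an NCEI endpoint is a base URL followed by `key=value` query parameters. The sketch below assembles one with the standard library's `urllib.parse.urlencode`. The helper name `build_ncei_url` is made up for illustration, and the parameter names are simply the ones that appear in the URLs used in this notebook, so treat it as one possible approach rather than the official way to build a request.

```python
from urllib.parse import urlencode

# Base address of the NCEI Access Data Service used in this notebook
NCEI_BASE_URL = 'https://www.ncei.noaa.gov/access/services/data/v1'


def build_ncei_url(data_type, station, start_date, end_date,
                   units='standard'):
    """Assemble a daily-summaries request URL (hypothetical helper)."""
    params = {
        'dataset': 'daily-summaries',
        'dataTypes': data_type,
        'stations': station,
        'startDate': start_date,
        'endDate': end_date,
        'units': units,
    }
    # urlencode joins the pairs with '&' and escapes unsafe characters
    return f'{NCEI_BASE_URL}?{urlencode(params)}'


example_url = build_ncei_url('TAVG', 'PKM00041780',
                             '1942-05-06', '2025-07-20')
print(example_url)
```

Building the query from a dictionary makes it harder to drop an `=` or `&` by accident than typing the URL by hand.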

First things first – you will need to import the `earthpy` library to
help with data management, the `pandas` library to work with tabular
data, and `holoviews` and `hvplot` to make interactive plots:

In [None]:
# Import required packages
import holoviews as hv
import hvplot.pandas
import earthpy 
import pandas as pd


In [2]:
%store -r

The cell below contains the URL you will use to download climate data.
There are two things to notice about the URL code:

1.  It is surrounded by quotes – that means Python will interpret it as
    a `string`, or text, type, which makes sense for a URL.
2.  The URL is too long to display as one line on most screens. We’ve
    put parentheses around it so that we can easily split it into
    multiple lines by writing two strings – one on each line.
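The two points above can be checked with a tiny example. This is only a sketch using a shortened version of the URL: Python joins string literals that sit next to each other inside parentheses, so the split version and the one-line version are the same string.

```python
# Adjacent string literals inside parentheses are joined by Python,
# so no '+' is needed between the pieces.
split_url = ('https://www.ncei.noaa.gov/access/services/data/v1?'
             'dataset=daily-summaries')

# The same text on one line (too long for PEP 8, shown for comparison)
one_line_url = 'https://www.ncei.noaa.gov/access/services/data/v1?dataset=daily-summaries'

print(type(split_url).__name__)   # str
print(split_url == one_line_url)  # True
```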

<link rel="stylesheet" type="text/css" href="./assets/styles.css"><div class="callout callout-style-default callout-titled callout-task"><div class="callout-header"><div class="callout-icon-container"><i class="callout-icon"></i></div><div class="callout-title-container flex-fill">Try It: Format your URL for readability</div></div><div class="callout-body-container callout-body"><ol type="1">
<li>Pick an expressive variable name for the URL.</li>
<li>Reformat the URL so that it adheres to the <a
href="https://peps.python.org/pep-0008/#maximum-line-length">79-character
PEP-8 line limit</a>, and so that it is <strong>easy to read</strong>.
If you are using GitHub Codespaces, you should see two vertical lines in
each cell – don’t let your code go past the second line.</li>
<li>Replace ‘DATATYPE’, ‘STATION’, and the start and end dates
‘YYYY-MM-DD’, with the values for the data you want to download.</li>
</ol></div></div>

In [None]:
stuff23 = ('https://www.ncei.noaa.gov/access/services/data/v1?'
           'dataset=daily-summaries&'
           'dataTypes=DATATYPE&'
           'stations=STATION&'
           'startDate=YYYY-MM-DD&'
           'endDate=YYYY-MM-DD')
stuff23

In [21]:
Karachi_API = ('https://www.ncei.noaa.gov/access/services/data/v1?'
           'dataset=daily-summaries&'
           'dataTypes=TAVG&'
           'stations=PKM00041780&'
           'startDate=1942-05-06&'
           'endDate=2025-07-20&'
           'units=standard')
Karachi_API

'https://www.ncei.noaa.gov/access/services/data/v1?dataset=daily-summaries&dataTypes=TAVG&stations=PKM00041780&startDate=1942-05-06&endDate=2025-07-20&units=standard'

In [2]:
FC_PRCP_API = ('https://www.ncei.noaa.gov/access/services/data/v1?'
           'dataset=daily-summaries&'
           'dataTypes=PRCP&'
           'stations=USC00053005&'
           'startDate=1893-01-01&'
           'endDate=2025-07-20&'
           'units=standard'
)
FC_PRCP_API

'https://www.ncei.noaa.gov/access/services/data/v1?dataset=daily-summaries&dataTypes=PRCP&stations=USC00053005&startDate=1893-01-01&endDate=2025-07-20&units=standard'

In [15]:
FC_TMAX_API = ('https://www.ncei.noaa.gov/access/services/data/v1?'
           'dataset=daily-summaries&'
           'dataTypes=TMAX&'
           'stations=USC00053005&'
           'startDate=1893-01-01&'
           'endDate=2025-07-20&'
           'units=standard'
)
FC_TMAX_API

'https://www.ncei.noaa.gov/access/services/data/v1?dataset=daily-summaries&dataTypes=TMAX&stations=USC00053005&startDate=1893-01-01&endDate=2025-07-20&units=standard'

## Download and get started working with NCEI data

Go ahead and use `earthpy` to download data from your API URL. You could
also read the URL directly with `pandas`, but using `earthpy` saves a
file and lets you work offline later on. If you didn’t already, you
should import the `earthpy` library **at the top of this notebook** so
that others who want to use your code can find it easily.
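If you skip `earthpy`, you can get similar save-once behavior with the standard library. The sketch below is not how `earthpy` works internally. `download_once` is a hypothetical helper that fetches a URL only when the local copy is missing, so later runs can work offline.

```python
from pathlib import Path
from urllib.request import urlretrieve


def download_once(url, cache_path):
    """Download url to cache_path unless that file already exists.

    Hypothetical helper for illustration; returns the local path.
    """
    cache_path = Path(cache_path)
    if not cache_path.exists():
        # Create the target folder if needed, then fetch the file
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, cache_path)
    return cache_path
```

With this in place, something like `pd.read_csv(download_once(Karachi_API, 'data/karachi_tavg.csv'))` would read from the saved file on every run after the first (the file name here is an arbitrary example).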

In [None]:
# SKIP: not run in this notebook; the data are read directly with pandas below
project = earthpy.Project(dirname=project_dirname)
project.get_data(url=ncei_url, filename=ncei_filename)
ncei_path = project.project_dir / ncei_filename

In [22]:
# Download the climate data
climate_df = pd.read_csv(
    Karachi_API
    # index_col='DATE',
    # parse_dates=True,
    # na_values=['NaN']
)

# Check that the download worked
climate_df.head()

       STATION        DATE  TAVG
0  PKM00041780  1942-05-06    90
1  PKM00041780  1942-05-07    85
2  PKM00041780  1942-05-08    87
3  PKM00041780  1942-05-09    86
4  PKM00041780  1942-05-10    86


In [None]:
# Download PRCP data
FC_precip_df = pd.read_csv(
    FC_PRCP_API,
    index_col='DATE',
    parse_dates=True,
    na_values=['NaN']
)

# Check that the download worked
FC_precip_df.head()

                STATION  PRCP
DATE
1893-01-01  USC00053005   0.0
1893-01-02  USC00053005   0.0
1893-01-03  USC00053005   0.0
1893-01-04  USC00053005   0.0
1893-01-05  USC00053005   0.0


In [17]:
# Download TMAX data
FC_TMAX_df = pd.read_csv(
    FC_TMAX_API,
    index_col='DATE',
    parse_dates=True,
    na_values=['NaN']
)

# Check that the download worked
FC_TMAX_df.head()

                STATION  TMAX
DATE
1893-01-01  USC00053005  51.0
1893-01-02  USC00053005  53.0
1893-01-03  USC00053005  64.0
1893-01-04  USC00053005  67.0
1893-01-05  USC00053005  60.0


In [12]:
print(FC_precip_df)
print(FC_precip_df.dtypes)

                STATION  PRCP
DATE                         
1893-01-01  USC00053005  0.00
1893-01-02  USC00053005  0.00
1893-01-03  USC00053005  0.00
1893-01-04  USC00053005  0.00
1893-01-05  USC00053005  0.00
...                 ...   ...
2025-07-15  USC00053005  0.00
2025-07-16  USC00053005  0.04
2025-07-18  USC00053005  0.03
2025-07-19  USC00053005  0.00
2025-07-20  USC00053005  0.00

[47814 rows x 2 columns]
STATION     object
PRCP       float64
dtype: object


In [18]:
print(FC_TMAX_df)
print(FC_TMAX_df.dtypes)

                STATION  TMAX
DATE                         
1893-01-01  USC00053005  51.0
1893-01-02  USC00053005  53.0
1893-01-03  USC00053005  64.0
1893-01-04  USC00053005  67.0
1893-01-05  USC00053005  60.0
...                 ...   ...
2025-07-15  USC00053005  96.0
2025-07-16  USC00053005  80.0
2025-07-18  USC00053005  90.0
2025-07-19  USC00053005  90.0
2025-07-20  USC00053005  93.0

[47814 rows x 2 columns]
STATION     object
TMAX       float64
dtype: object


In [19]:
%store FC_precip_df FC_TMAX_df

Stored 'FC_precip_df' (DataFrame)
Stored 'FC_TMAX_df' (DataFrame)


In [None]:
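# Plot daily precipitation
# DATE is the index, so pandas uses it for the x-axis automatically;
# passing x='DATE' would raise a KeyError because DATE is not a column
FC_PCRP_plot = FC_precip_df.plot(
    y='PRCP',
    title='Daily Precipitation in Fort Collins, Colorado, USA',
    xlabel='Date',
    ylabel='Precipitation (inches)')
FC_PCRP_plot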

In [8]:
# Make the plot interactive
# Plot the daily data using .hvplot
FC_PCRP_plot_hv = FC_precip_df.hvplot(
    y='PRCP',
    x='DATE',
    title='Daily Precipitation in Fort Collins, Colorado, USA',
    xlabel='Date',
    ylabel='Precipitation (inches)')
FC_PCRP_plot_hv

In [37]:
# save hv plot
hv.save(FC_PCRP_plot_hv, 'FC_PCRP_plot_hv.html')

In [13]:
# now resample data to get yearly average
FC_annual_PRCP_df = FC_precip_df['PRCP'].resample('YS').mean()
# Store for later
%store FC_annual_PRCP_df
FC_annual_PRCP_df

Stored 'FC_annual_PRCP_df' (Series)


DATE
1893-01-01    0.026111
1894-01-01    0.027143
1895-01-01    0.060424
1896-01-01    0.044842
1897-01-01    0.043295
                ...   
2021-01-01    0.040493
2022-01-01    0.036247
2023-01-01    0.066274
2024-01-01    0.031995
2025-01-01    0.038850
Freq: YS-JAN, Name: PRCP, Length: 133, dtype: float64
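The resampling step above groups the daily values by calendar year and averages each group. Here is a minimal sketch of the same call on a few made-up rows (the numbers are invented for illustration, not NCEI data), which also shows that missing values are left out of the mean:

```python
import io

import pandas as pd

# Made-up daily values; the blank PRCP entry becomes NaN
csv_text = """DATE,PRCP
2000-01-01,0.0
2000-07-01,0.2
2001-01-01,0.1
2001-07-01,
2001-12-31,0.3
"""

# Parse DATE into a DatetimeIndex, which resample() requires
toy_df = pd.read_csv(
    io.StringIO(csv_text),
    index_col='DATE',
    parse_dates=True)

# 'YS' labels each group by the first day of its year
annual_mean = toy_df['PRCP'].resample('YS').mean()
print(annual_mean)
```

Note that this is the mean of the daily values in each year, not the yearly total; swapping `.mean()` for `.sum()` would give annual totals instead. A partial year, such as the current one, is averaged over only the days available so far.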

In [20]:
# now resample data to get yearly average
FC_annual_TMAX_df = FC_TMAX_df['TMAX'].resample('YS').mean()
# Store for later
%store FC_annual_TMAX_df
FC_annual_TMAX_df

Stored 'FC_annual_TMAX_df' (Series)


DATE
1893-01-01    56.626667
1894-01-01    64.361842
1895-01-01    60.717808
1896-01-01    63.553425
1897-01-01    61.406685
                ...    
2021-01-01    65.939726
2022-01-01    65.583562
2023-01-01    64.671233
2024-01-01    67.349727
2025-01-01    64.250000
Freq: YS-JAN, Name: TMAX, Length: 133, dtype: float64

In [27]:
# now make a new interactive plot using .hvplot

FC_annual_PCRP_plot_hv = FC_annual_PRCP_df.hvplot(
    y='PRCP',
    x='DATE',
    title=f'Annual Mean Precipitation in Fort Collins, Colorado, USA',
    xlabel='Date',
    ylabel='Precipitation (inches)')
FC_annual_PCRP_plot_hv

# save hv plot
hv.save(FC_annual_PCRP_plot_hv, 'FC_annual_PCRP_plot_hv.html')

In [28]:
FC_annual_PCRP_plot_hv

In [31]:
# now make a new interactive plot using .hvplot

FC_annual_TMAX_plot_hv = FC_annual_TMAX_df.hvplot(
    y='TMAX',
    x='DATE',
    title=f'Annual Mean Max Temperature in Fort Collins, Colorado, USA',
    xlabel='Date',
    ylabel='Max Temperature (°F)',
    color='red')
FC_annual_TMAX_plot_hv


In [35]:

# save hv plot
hv.save(FC_annual_TMAX_plot_hv, 'FC_annual_TMAX_plot_hv.html')

In [32]:
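# Add labels: keep one point per decade and build the label text
# FC_annual_TMAX_df is a Series, so convert it to a DataFrame first
decade_points_df = (
    FC_annual_TMAX_df[FC_annual_TMAX_df.index.year % 10 == 0]
    .to_frame())
decade_points_df['label_text'] = (
    decade_points_df['TMAX'].round(2).astype(str) + ' °F')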

print("Decade points for labeling:")
print(decade_points_df)


Decade points for labeling:
                 TMAX label_text
DATE                            
1900-01-01  63.583562   63.58 °F
1910-01-01  64.760989   64.76 °F
1920-01-01  59.523288   59.52 °F
1930-01-01  60.561644   60.56 °F
1940-01-01  62.796657    62.8 °F
1950-01-01  62.701370    62.7 °F
1960-01-01  62.822404   62.82 °F
1970-01-01  61.241888   61.24 °F
1980-01-01  63.180328   63.18 °F
1990-01-01  63.315068   63.32 °F
2000-01-01  65.234973   65.23 °F
2010-01-01  64.517808   64.52 °F
2020-01-01  66.265027   66.27 °F


In [33]:


FC_annual_TMAX_plot_hv = FC_annual_TMAX_df.hvplot(
    y='TMAX',
    x='DATE',
    title=f'Annual Mean Max Temperature in Fort Collins, Colorado, USA',
    xlabel='Date',
    ylabel='Max Temperature (°F)',
    color='red')
FC_annual_TMAX_plot_hv

# Create the Labels element
labels_hv = hv.Labels(
    data=decade_points_df,
    kdims=['DATE', 'TMAX'], # X and Y coordinates for the label position
    vdims=['label_text']    # The column containing the label text
)

# Overlay the labels on the plot and customize
# Use yoffset to move the labels slightly above the line
final_plot_with_labels = FC_annual_TMAX_plot_hv * labels_hv.opts(
    hv.opts.Labels(
        xoffset=0,       # No horizontal offset (center on the point)
        yoffset=0.02,    # Vertical offset in data units (°F here)
        text_align='center', # Center the text horizontally
        text_color='black',
        text_font_size='9pt',
        # Optional: Add a white box background for better readability
        # These are Bokeh-specific properties, often prefixed with 'background_'
        background_fill_color='white',
        background_fill_alpha=0.7,
        border_line_color='black', # Optional: Add a border
        border_line_alpha=0.3,
        border_line_width=0.5
    )
)

final_plot_with_labels

In [34]:
# save hv plot
hv.save(final_plot_with_labels, 'final_plot_with_labels.html')

## Wrap up

Don’t forget to store your variables so you can use them in other
notebooks! Replace `var1` and `var2` with the variables you want to
save, separated by spaces.

In [9]:
%store var1 var2

Finally, be sure to `Restart` and `Run all` to make sure your notebook
works all the way through!