<a href="https://colab.research.google.com/github/cchen744/uhi-extreme-heat-response/blob/main/notebooks/01_data_exploration.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**In this notebook, I will quantify the Urban Heat Island effect and extreme heat within #CITIES in #CLIMATE_ZONE climate zones.**
<p>
<img src=https://upload.wikimedia.org/wikipedia/commons/7/77/K%C3%B6ppen_Climate_Types_US_50.png align="middle" width=600 />
</p>

The variables of interest are defined as follows:
- **Urban Heat Island**: Based on land surface temperature (LST), the surface urban heat island intensity is defined as SUHI(day) = LST_urb(day) − LST_rur(day), where LST_urb(day) is the aggregate (mean or median, chosen in advance) over urban pixels and LST_rur(day) is the aggregate over rural reference pixels.
- **Extreme Heat**: Days when the daily mean land surface temperature exceeds the 90th percentile of its historical record.
- **Urban & Rural Definition**:

  - Urban: US Census Urbanized Area (UA)
  - Rural: Spatial mean of all non-urban, non-water pixels within the same UA
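
The definitions above can be sketched in pandas. This is a minimal illustration on synthetic numbers, not the notebook's pipeline: the column names `lst_urb` and `lst_rur` are hypothetical, and taking the percentile over the urban LST series is an assumption, since the definition does not say which series the threshold applies to.

```python
import pandas as pd


def add_suhi_and_extreme(df: pd.DataFrame, percentile: float = 90.0) -> pd.DataFrame:
    """Add a SUHI column and an extreme-heat flag to a daily table.

    Assumes hypothetical columns `lst_urb` and `lst_rur` that hold the
    aggregated urban and rural LST for each day.
    """
    out = df.copy()
    # SUHI(day) = LST_urb(day) - LST_rur(day)
    out["suhi"] = out["lst_urb"] - out["lst_rur"]
    # Assumption: the threshold is the given percentile of the urban LST series.
    threshold = out["lst_urb"].quantile(percentile / 100.0)
    out["extreme_heat"] = out["lst_urb"] > threshold
    return out


# Synthetic values in degrees Celsius, for illustration only (not real data).
demo = pd.DataFrame({
    "date": pd.date_range("2013-06-01", periods=10, freq="D"),
    "lst_urb": [20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 40.0],
    "lst_rur": [18.0, 19.5, 20.0, 21.0, 21.5, 23.0, 23.5, 25.0, 25.0, 35.0],
})
result = add_suhi_and_extreme(demo)
print(result[["date", "suhi", "extreme_heat"]])
```

With these synthetic values only the last day is flagged, because it is the only one above the 90th percentile of the series.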



In [18]:
# Initialize
!git init
!git remote add origin https://github.com/cchen744/uhi-extreme-heat-response.git
!git pull origin main --allow-unrelated-histories

Reinitialized existing Git repository in /content/.git/
error: remote origin already exists.
From https://github.com/cchen744/uhi-extreme-heat-response
 * branch            main       -> FETCH_HEAD
Already up to date.


In [26]:
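import importlib
import os
from pathlib import Path

import ee
import pandas as pd

import uhi_pipeline

# Reload so that edits to uhi_pipeline are picked up without restarting the runtime.
importlib.reload(uhi_pipeline)
print("uhi_pipeline module reloaded.")

# Authenticate with Earth Engine and bind the session to the project.
ee.Authenticate()
ee.Initialize(project='extremeweatheruhi')

# Local folder that receives one CSV per city.
DATA_DIR = Path("data/cities")
DATA_DIR.mkdir(parents=True, exist_ok=True)

# Urban area boundaries uploaded as an Earth Engine asset.
ua_fc = ee.FeatureCollection("projects/extremeweatheruhi/assets/uac20_2025")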

uhi_pipeline module reloaded.


In [28]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [29]:
def append_csv(df_part: pd.DataFrame, out_csv: Path):
    """Append df_part to out_csv, writing the header only when the file is created."""
    if df_part is None or df_part.empty:
        return
    header = not out_csv.exists()
    df_part.to_csv(out_csv, mode="a", header=header, index=False)
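
To make the append behaviour concrete, here is a small self-contained check that repeats the same logic on a temporary file. The `suhi` column and its values are made up for the demonstration. It also reads the file back the way a resumed run might, to recover which months are already present.

```python
import tempfile
from pathlib import Path

import pandas as pd


def append_csv(df_part, out_csv):
    # Same logic as the cell above: the header is written only for a new file.
    if df_part is None or df_part.empty:
        return
    header = not out_csv.exists()
    df_part.to_csv(out_csv, mode="a", header=header, index=False)


with tempfile.TemporaryDirectory() as tmp_dir:
    out_csv = Path(tmp_dir) / "demo_daily_suhi.csv"
    # Synthetic monthly chunks (illustrative values only).
    june = pd.DataFrame({"date": ["2013-06-01", "2013-06-02"], "suhi": [1.5, 2.0]})
    july = pd.DataFrame({"date": ["2013-07-01"], "suhi": [1.0]})

    append_csv(june, out_csv)
    append_csv(july, out_csv)
    append_csv(pd.DataFrame(), out_csv)  # empty chunk: nothing is written

    combined = pd.read_csv(out_csv)
    done_months = set(pd.to_datetime(combined["date"]).dt.strftime("%Y-%m"))

print(len(combined), sorted(done_months))
```

Because the header is written once, the combined file parses as a single table with three rows.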

In [30]:
START_DATE = "2013-06-01"
END_DATE   = "2019-08-31"

CFG = dict(
    lst_band="LST_Night_1km",
    qc_band="QC_Night",
    agg_func="mean",
    ring_outer_m=30000,
    ring_inner_m=5000,
    lst_scale_m=1000,
    min_urban_pixels=30,
    min_rural_pixels=30,
    extreme_percentile=90,
)

def build_city_safe(city_name: str):
    """Build the daily SUHI CSV for one city, month by month, resuming if a file exists.

    Months already present in the output CSV are skipped, and a failure in one
    month is logged without aborting the remaining months.
    """
    out_csv = DATA_DIR / f"{city_name}_daily_suhi.csv"

    # Collect the months already written so that a rerun resumes instead of duplicating rows.
    done_months = set()
    if out_csv.exists():
        tmp = pd.read_csv(out_csv, usecols=["date"])
        # errors="coerce" turns unparseable dates into NaT, which are then dropped.
        dates = pd.to_datetime(tmp["date"], errors="coerce").dropna()
        done_months = set(dates.dt.strftime("%Y-%m"))
        # If no date parses, done_months stays empty and every month is reprocessed (safe fallback).

    months_done = 0
    rows_written = 0

    for s, e in uhi_pipeline.month_starts(START_DATE, END_DATE):
        ym = s[:7]
        if ym in done_months:
            continue

        try:
            df_m = uhi_pipeline.run_city(
                ua_contains=city_name,
                ua_fc=ua_fc,
                start_date=s,
                end_date=e,
                out_csv=None,
                **CFG,
            )
            if df_m is None or df_m.empty:
                continue

            append_csv(df_m, out_csv)
            months_done += 1
            rows_written += len(df_m)
        except ee.EEException as ee_err:
            print(f"EEException for {city_name} on {ym}: {ee_err}. Skipping this month.")
        except Exception as generic_err:
            print(f"Generic error for {city_name} on {ym}: {generic_err}. Skipping this month.")

    return {"city": city_name, "file": str(out_csv), "months_done": months_done, "rows_written": rows_written}
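
The loop above relies on `uhi_pipeline.month_starts`, whose source is not shown in this notebook. A hypothetical stand-in, written only to illustrate the kind of `(start, end)` string pairs the loop consumes, might look like the sketch below. It assumes that `end` is the first day of the following month (an exclusive bound) and that the windows are not clipped to `START_DATE` or `END_DATE`; the real helper may differ on both points.

```python
from datetime import date


def month_starts(start_date: str, end_date: str):
    """Yield (start, end) ISO date strings, one pair per calendar month.

    Hypothetical stand-in for uhi_pipeline.month_starts. Assumes `end` is the
    first day of the next month (exclusive) and that windows are not clipped.
    """
    current = date.fromisoformat(start_date).replace(day=1)
    last = date.fromisoformat(end_date)
    while current <= last:
        # Roll over to January of the next year after December.
        if current.month == 12:
            nxt = date(current.year + 1, 1, 1)
        else:
            nxt = date(current.year, current.month + 1, 1)
        yield current.isoformat(), nxt.isoformat()
        current = nxt


windows = list(month_starts("2013-06-01", "2019-08-31"))
print(len(windows), windows[0], windows[-1])
```

Under these assumptions the date range in this notebook spans 75 monthly windows, and `s[:7]` gives keys such as `2013-06` that match the `%Y-%m` strings stored in `done_months`.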

In [31]:
import geemap
# Check how many urban areas match each city name before running the pipeline.
intended_cities = ['Philadelphia', 'Los Angeles', 'Nashville', 'Phoenix', 'Houston', 'Chicago']
for city_name in intended_cities:
    city = ua_fc.filter(ee.Filter.stringContains("NAME20", city_name))
    city_count = city.size().getInfo()

    if city_count > 0:
        print(f"{city_name} found in ua_fc with {city_count} features.")
        display(geemap.ee_to_df(city))
    else:
        print(f"{city_name} not found in ua_fc.")

Philadelphia found in ua_fc with 3 features.


Unnamed: 0,ALAND20,AWATER20,FUNCSTAT20,GEOID20,GEOIDFQ20,INTPTLAT20,INTPTLON20,LSAD20,MTFCC20,NAME20,NAMELSAD20,UACE20
0,19043536,0,S,69049,400C200US69049,32.772735,-89.1140846,67,G3500,"Philadelphia, MS","Philadelphia, MS Urban Area",69049
1,61787113,833909,S,62731,400C200US62731,40.4892232,-81.4405015,67,G3500,"New Philadelphia--Dover, OH","New Philadelphia--Dover, OH Urban Area",62731
2,4916312705,134125903,S,69076,400C200US69076,39.9872279,-75.2884721,67,G3500,"Philadelphia, PA--NJ--DE--MD","Philadelphia, PA--NJ--DE--MD Urban Area",69076


Los Angeles found in ua_fc with 1 features.


Unnamed: 0,ALAND20,AWATER20,FUNCSTAT20,GEOID20,GEOIDFQ20,INTPTLAT20,INTPTLON20,LSAD20,MTFCC20,NAME20,NAMELSAD20,UACE20
0,4242653496,45567066,S,51445,400C200US51445,33.9849958,-118.122395,67,G3500,"Los Angeles--Long Beach--Anaheim, CA","Los Angeles--Long Beach--Anaheim, CA Urban Area",51445


Nashville found in ua_fc with 2 features.


Unnamed: 0,ALAND20,AWATER20,FUNCSTAT20,GEOID20,GEOIDFQ20,INTPTLAT20,INTPTLON20,LSAD20,MTFCC20,NAME20,NAMELSAD20,UACE20
0,10167128,114650,S,61219,400C200US61219,31.2060314,-83.2446024,67,G3500,"Nashville, GA","Nashville, GA Urban Area",61219
1,1515057888,13721302,S,61273,400C200US61273,36.1259094,-86.6911686,67,G3500,"Nashville-Davidson, TN","Nashville-Davidson, TN Urban Area",61273


Phoenix found in ua_fc with 2 features.


Unnamed: 0,ALAND20,AWATER20,FUNCSTAT20,GEOID20,GEOIDFQ20,INTPTLAT20,INTPTLON20,LSAD20,MTFCC20,NAME20,NAMELSAD20,UACE20
0,330537813,985600,S,69192,400C200US69192,33.4679624,-112.3639331,67,G3500,"Phoenix West--Goodyear--Avondale, AZ","Phoenix West--Goodyear--Avondale, AZ Urban Area",69192
1,2876252398,8810551,S,69184,400C200US69184,33.4999005,-111.9631531,67,G3500,"Phoenix--Mesa--Scottsdale, AZ","Phoenix--Mesa--Scottsdale, AZ Urban Area",69184


Houston found in ua_fc with 1 features.


Unnamed: 0,ALAND20,AWATER20,FUNCSTAT20,GEOID20,GEOIDFQ20,INTPTLAT20,INTPTLON20,LSAD20,MTFCC20,NAME20,NAMELSAD20,UACE20
0,4540250252,65791956,S,40429,400C200US40429,29.7730212,-95.4003948,67,G3500,"Houston, TX","Houston, TX Urban Area",40429


Chicago found in ua_fc with 1 features.


Unnamed: 0,ALAND20,AWATER20,FUNCSTAT20,GEOID20,GEOIDFQ20,INTPTLAT20,INTPTLON20,LSAD20,MTFCC20,NAME20,NAMELSAD20,UACE20
0,6055713369,100561449,S,16264,400C200US16264,41.8304099,-87.9086694,67,G3500,"Chicago, IL--IN","Chicago, IL--IN Urban Area",16264
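
The lookup shows that a substring match on `NAME20` is ambiguous for Philadelphia, Nashville, and Phoenix, and how `uhi_pipeline.run_city` resolves several matches is not shown here. One possible way to pin a single urban area, offered as a sketch and not as the pipeline's behaviour, is to keep the match with the largest land area (`ALAND20`) and record its `GEOID20`. The rows below are copied from the Philadelphia output above, limited to three columns; the largest-area rule is a heuristic assumed for this example.

```python
import pandas as pd

# Rows copied from the Philadelphia lookup output (subset of columns).
matches = pd.DataFrame({
    "GEOID20": ["69049", "62731", "69076"],
    "NAME20": [
        "Philadelphia, MS",
        "New Philadelphia--Dover, OH",
        "Philadelphia, PA--NJ--DE--MD",
    ],
    "ALAND20": [19043536, 61787113, 4916312705],
})


def pick_largest(df: pd.DataFrame) -> pd.Series:
    """Return the matching urban area with the largest land area (ALAND20)."""
    return df.loc[df["ALAND20"].idxmax()]


chosen = pick_largest(matches)
print(chosen["NAME20"], chosen["GEOID20"])
```

On the Earth Engine side, an exact filter such as `ee.Filter.eq("GEOID20", ...)` with the chosen identifier would then select one feature, provided the property type in the asset (string or number) matches the value passed.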


In [32]:
CITIES = ['Philadelphia', 'Nashville'] #'Los Angeles','Phoenix','Houston','Chicago'

results = []
for c in CITIES:
    print(c)
    print("Building:", c)
    results.append(build_city_safe(c))

pd.DataFrame(results)[["city", "months_done", "rows_written"]].head()

Philadelphia
Building: Philadelphia




Nashville
Building: Nashville


Unnamed: 0,city,months_done,rows_written
0,Philadelphia,75,1812
1,Nashville,75,1447


In [33]:
!git status

On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	.config/
	drive/
	sample_data/

nothing added to commit but untracked files present (use "git add" to track)
