# Total heatwave exposures

From the health-system perspective, we want to know not just the exposure to change (which, from the climate perspective, is useful for demonstrating that heatwaves (HWs) really do follow a trend relative to the null hypothesis of being normally distributed around 0) but also the absolute values. The aim is to know (a) how big the change is relative to 'normal' and (b) how it compares with what we already cope with. More generally, if you measure millions more exposure days against a total of billions, then even a statistically significant trend might not matter much from the policy point of view. On the other hand, if exposure is twice the historical level, it is an issue.

The ideal would be to show the percentage change relative to a baseline. The problem is that the population data doesn't exist, and even if it did, it wouldn't make sense to average it over 20 years as we do for climatologies.

The first step is simply to calculate absolute values. These aren't too problematic, since the 'HW delta' is in any case double-normalised: the 20-year period is used once for the climatology and again for the baseline of the delta. Plotting the time series then gives a pretty good idea of where you stand relative to 'normal'.

The next idea is to copy how GDP is presented, as a year-on-year percentage change. Since it doesn't make sense to normalise population to a baseline period, and picking a single year or period is very arbitrary, plot the percentage change from the previous year instead (e.g. https://fred.stlouisfed.org/graph/?g=eUmi).
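
The year-on-year presentation can be sketched as follows. This is a minimal illustration with made-up totals, not the indicator values; `years` and `total_exposure` are hypothetical names introduced here.

```python
import numpy as np

# Hypothetical annual exposure totals (person-days); illustrative values only.
years = np.array([2019, 2020, 2021, 2022])
total_exposure = np.array([2.0e9, 2.5e9, 2.0e9, 3.0e9])

# Percentage change from the previous year.
# The first year has no predecessor, so the result is one element shorter.
pct_change = 100 * np.diff(total_exposure) / total_exposure[:-1]

for year, change in zip(years[1:], pct_change):
    print(f"{year}: {change:+.1f}%")
```

With a pandas series the same quantity should be available as `100 * series.pct_change()`.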


In [1]:
from pathlib import Path
import numpy as np
import pandas as pd

import xarray as xr
import matplotlib.pyplot as plt
import matplotlib.colors as colors

from cartopy import crs as ccrs
from scipy import stats

import os
import sys

project_path = os.path.abspath(os.path.join('..', '..'))
if project_path not in sys.path:
    sys.path.insert(0, project_path)

from source.config import DATA_SRC, POP_DATA_SRC, WEATHER_SRC


In [2]:
# Figure settings
plt.rcParams['figure.dpi'] = 100
plt.rcParams['savefig.dpi'] = 300
plt.rcParams['figure.figsize'] = (5,2.5)
plt.rcParams['figure.titlesize'] = 'medium'
plt.rcParams['axes.titlesize'] = 'medium'
plt.rcParams['savefig.bbox'] = 'tight'

In [3]:
MAP_PROJECTION = ccrs.EckertIII()

In [4]:
MAX_YEAR = 2023
MIN_YEAR = 1980
REFERENCE_YEAR_START = 1986
REFERENCE_YEAR_END = 2005

RESULTS_FOLDER = DATA_SRC / 'lancet/results/results_2024/gpw_hw_exposure'





# Load Data

## Load population and demographic data

In [5]:
DEMOGRAPHICS_TOTALS_FILE = POP_DATA_SRC / 'hybrid_2023' / 'demographics_hybrid_1950_2020_15_min_era_compat.nc'

demographics_totals = xr.open_dataarray(DEMOGRAPHICS_TOTALS_FILE)
population_over_65 = demographics_totals.sel(age_band_lower_bound=65).sel(year=slice(1980,2020))

infants_totals_file = POP_DATA_SRC / 'hybrid_2023' / 'infants_1950_2020_hybrid_15_min_era_compat.nc' # files generated for lancet report 2023
population_infants = xr.open_dataarray(infants_totals_file).sel(year=slice(1980,2020))

Extrapolate demographic data from 2020 to `MAX_YEAR` (2023)

In [6]:
extrapolated_years = np.arange(2020+1, MAX_YEAR+1)

In [7]:
population_over_65 = xr.concat(
    [population_over_65, 
     population_over_65.interp(year=extrapolated_years, kwargs=dict(fill_value="extrapolate"))
    ], 'year').compute()

In [8]:
population_infants = (
    xr.concat([population_infants, 
               population_infants.interp(year=extrapolated_years, kwargs=dict(fill_value="extrapolate"))
              ], 'year').load())
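
To make the extrapolation step concrete, here is a rough NumPy sketch of linear extrapolation for a single grid cell. It assumes that `interp` with its default linear method and `fill_value="extrapolate"` continues the line through the last two known years; that is my understanding of the behaviour rather than something verified in this notebook. `extrapolate_linear` is a hypothetical helper and the numbers are invented.

```python
import numpy as np

def extrapolate_linear(years, values, new_years):
    """Extend a series using the slope between its last two points (assumed behaviour)."""
    slope = (values[-1] - values[-2]) / (years[-1] - years[-2])
    return values[-1] + slope * (np.asarray(new_years) - years[-1])

# Hypothetical population of one grid cell for the last known years.
known_years = np.array([2018, 2019, 2020])
known_pop = np.array([100.0, 104.0, 110.0])

new_years = np.arange(2020 + 1, 2023 + 1)  # same pattern as extrapolated_years
extrapolated = extrapolate_linear(known_years, known_pop, new_years)
print(dict(zip(new_years.tolist(), extrapolated.tolist())))
```

Note that a linear continuation can go negative where the population of a cell is falling, so the extrapolated values may be worth checking.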

In [9]:
population = xr.concat([population_infants, population_over_65], 
                      dim=pd.Index([0, 65], name='age_band_lower_bound'))

## Load heatwave lengths and counts data

> TODO would like to split this up into yearly files so I only need to re-do one year at a time...

In [11]:
HEATWAVE_FOLDER = DATA_SRC / 'lancet/results/results_2024'

heatwave_metrics_files = sorted((HEATWAVE_FOLDER / 'heatwave_days_era5').glob('*.nc'))
heatwave_metrics = xr.open_mfdataset(heatwave_metrics_files, combine='by_coords')

## Calculate some utility data

In [12]:
# Get the grid weighting factor from the latitude
cos_lat = np.cos(np.radians(heatwave_metrics.latitude))
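
The cells shown here do not use `cos_lat` further. A typical use, sketched below on a toy grid, is an area-weighted mean over a regular latitude-longitude grid, where cells shrink towards the poles. The arrays are invented for illustration.

```python
import numpy as np

# Toy regular grid: 3 latitudes x 2 longitudes (hypothetical values).
latitude = np.array([-60.0, 0.0, 60.0])
field = np.array([[10.0, 10.0],
                  [2.0, 2.0],
                  [10.0, 10.0]])

# Cell area on a regular lat-lon grid is proportional to cos(latitude).
cos_lat = np.cos(np.radians(latitude))
weights = np.broadcast_to(cos_lat[:, None], field.shape)

weighted_mean = np.sum(field * weights) / np.sum(weights)
unweighted_mean = field.mean()
print(weighted_mean, unweighted_mean)
```

The weighted mean is lower than the plain mean here because the high-latitude rows, which hold the larger values, cover less area.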

# Calculate total exposures and save for all metrics

Because the calculation is the same for all metrics, we can calculate it once on the dataset and save the result.
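
As a minimal sketch of what the exposure calculation does, assuming exposure is simply heatwave days multiplied by population in each cell (person-days), the toy example below uses plain NumPy arrays with dimensions (year, latitude, longitude). The values are invented.

```python
import numpy as np

# Toy arrays with dimensions (year, latitude, longitude); values are invented.
heatwave_days = np.array([[[0, 3], [5, 1]],
                          [[2, 4], [6, 0]]], dtype=float)
population = np.array([[[100, 10], [0, 50]],
                       [[110, 10], [0, 55]]], dtype=float)

# Exposure in person-days per cell and year.
exposures = heatwave_days * population

# Global total per year: sum over the two spatial axes.
total_exposures = exposures.sum(axis=(1, 2))
print(total_exposures)
```

Unlike NumPy, xarray aligns the arrays by dimension name and coordinate value, so the multiplication only covers years and cells present in both inputs.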

In [28]:
exposures_over65 = heatwave_metrics['heatwaves_days'].transpose('year','latitude','longitude') * population_over_65.transpose('year','latitude','longitude')

# drop_vars replaces the deprecated DataArray.drop
exposures_over65 = exposures_over65.drop_vars('age_band_lower_bound')

exposures_infants = heatwave_metrics['heatwaves_days'].transpose('year','latitude','longitude')  * population_infants.transpose('year','latitude','longitude')

# exposures = xr.concat([exposures_infants, exposures_over65], 
#                       dim=pd.Index([0, 65], name='age_band_lower_bound'))

In [29]:
exposures_over65.to_netcdf(RESULTS_FOLDER / f'heatwave_exposure_over65_multi_threshold_{MIN_YEAR}-{MAX_YEAR}.nc')

In [30]:
exposures_infants.to_netcdf(RESULTS_FOLDER / f'heatwave_exposure_infants_multi_threshold_{MIN_YEAR}-{MAX_YEAR}.nc')

In [21]:
#exposures.to_netcdf(INTERMEDIATE_RESULTS_FOLDER / f'heatwave_exposure_change_multi_threshold_{MIN_YEAR}-{MAX_YEAR}.nc')

## Total exposure to change in heatwaves

Calculate exposure changes in terms of different aspects of heatwaves: frequency, length, load.

> **NOTE**: The number of individual events (rather than the number of days) is kept for historical reasons but is no longer the focus, because the number of heatwave days is generally a better measure than the number of individual instances.


In [19]:
exposures_over65

Array: 348.05 MiB, shape (44, 720, 1440), float64 (numpy.ndarray). Chunk: 7.91 MiB, shape (1, 720, 1440). Count: 309 tasks, 44 chunks.

In [25]:
total_exposures_over65 = exposures_over65.sum(dim=['latitude', 'longitude']).to_dataframe('elderly')
total_exposures_infants = exposures_infants.sum(dim=['latitude', 'longitude']).to_dataframe('infants')

In [24]:
# total_exposures_infants.to_excel(RESULTS_FOLDER / 'heatwave_exposure_indicator_totals_infants.xlsx')
# total_exposures_infants.to_csv(RESULTS_FOLDER / 'heatwave_exposure_indicator_totals_infants.csv')

# Weighted mean change

In [22]:
weighted_mean_infants = (exposures_infants / population_infants.sum(dim=['latitude', 'longitude']))
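
To illustrate the normalisation above: dividing each cell's exposure by the total population for that year gives the cell's contribution to the population-weighted mean, and summing those contributions gives mean heatwave days per person. This is a sketch with invented numbers; the variable names are hypothetical.

```python
import numpy as np

# Toy arrays with dimensions (year, latitude, longitude); values are invented.
heatwave_days = np.array([[[0.0, 3.0], [5.0, 1.0]]])
population = np.array([[[100.0, 10.0], [0.0, 50.0]]])

exposures = heatwave_days * population

# Total population per year, kept broadcastable against the grid.
total_population = population.sum(axis=(1, 2), keepdims=True)

# Per-cell contribution to the population-weighted mean.
weighted_contribution = exposures / total_population

# Summing the contributions gives mean heatwave days per person.
mean_days_per_person = weighted_contribution.sum(axis=(1, 2))
print(mean_days_per_person)
```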

In [23]:
divnorm = colors.TwoSlopeNorm(vmin=-100, vcenter=0, vmax=400)

In [27]:
# baseline = weighted_mean_infants.sel(year=slice(2001,2010)).mean(dim='year')
# decadal = 100 * (weighted_mean_infants.sel(year=slice(2011,2020)).mean(dim='year') - baseline) / baseline
# decadal = decadal.compute()

# f, ax = plt.subplots(figsize=(6,3), subplot_kw=dict(projection=MAP_PROJECTION),dpi=300)

# decadal.heatwaves_days.plot.pcolormesh(
#     norm=divnorm,
#     cbar_kwargs=dict(label='%'),
#     transform=ccrs.PlateCarree(),
#     ax=ax)

# ax.coastlines(linewidth=0.5)
# ax.set_title(f'Exposure change of infants between decades\n 2001-2010 and 2011-2020')
# f.savefig(RESULTS_FOLDER / 'decade change lt 1.png')
# f.savefig(RESULTS_FOLDER / 'decade change lt 1.pdf')

In [28]:
# weighted_mean_over65 = (exposures_over65 / population_over_65.sum(dim=['latitude', 'longitude']))

# baseline = weighted_mean_over65.sel(year=slice(2001,2010)).mean(dim='year')
# decadal = 100 * (weighted_mean_over65.sel(year=slice(2011,2020)).mean(dim='year') - baseline) / baseline
# decadal = decadal.compute()

# f, ax = plt.subplots(figsize=(6,3), subplot_kw=dict(projection=MAP_PROJECTION),dpi=300)

# decadal.heatwaves_days.plot(norm=divnorm,
# #                             robust=True,
# #                             vmin=-100, vmax=400, cmap='plasma',
#                             cbar_kwargs=dict(label='%'),
#                             transform=ccrs.PlateCarree(),
#                             ax=ax)

# ax.coastlines(linewidth=0.5)
# ax.set_title(f'Exposure change of over-65s between decades\n 2001-2010 and 2011-2020')
# f.savefig(RESULTS_FOLDER / 'decade change over 65.png')
# f.savefig(RESULTS_FOLDER / 'decade change over 65.pdf')

In [29]:
divnorm

<matplotlib.colors.TwoSlopeNorm at 0x2b53fb8728c0>