Skip to content

Repository for scripts generated for the Australian Urban Health Indicator (AusUrbHI) project.

Notifications You must be signed in to change notification settings

AURIN-OFFICE/AusUrbHI

Repository files navigation

AusUrbHI Heat Vulnerability and Urban Livability Case Study Documentation Archive

This is the repository for scripts and supporting documents generated for the Australian Urban Health Indicator (AusUrbHI) project. The study aims to analyze the number and cause of emergency department (ED) presentations, hospitalizations, and deaths during a heatwave using person-level deidentified linked health data. The methodology involves determining the health component of the sensitivity sub-indicator, normalizing and categorizing variables, and conducting a Poisson multivariable regression to calculate the Heat Vulnerability Index (HVI) score. Spatial smoothing will be applied to the geographic data to protect data privacy and ensure statistical stability while adjusting for age, sex, and comorbidities. The resulting heat health vulnerability indicator will reveal the relative vulnerability of locations across the study area, and hotspot analysis will be performed to investigate statistically significant locations of heat health vulnerability. This comprehensive approach combines statistical methods and spatial analysis techniques to provide valuable insights into heat health vulnerability. The repository is structured as shown below.

├── scripts
├── documentation
    ├── data_catalogue_and_metadata
    ├── linked_health_data_application_and_documents (sensitive and not publicly awailable)
├── img


Source: http://www.bom.gov.au/state-of-the-climate/australias-changing-climate.shtml

Project diagram
Project diagram

Systen e-Infrastructure
Systen e-Infrastructure

Study area


Case Study Area

Population:

Greater Sydney - 4.8 million
Wollongong - 0.3 million
Newcastle - 0.17 million
Maitland - 0.5 million
Tweed heads - 0.65 million
Albury - 0.05 million

Key Datasets

AURIN ABS Socio-Demographic Datasets

AURIN Health-Related Datasets

  • PHIDU diseases, conditions, risk factors, etc.
  • ABS Health & Disability
  • NHSD health service locations
    • Density of numbers of GPs and EDs within buffered region

Built and Natual Environment Data

  • Geoscape
    • Buildings - roof height, material, cooling, building area, levels, swimming pools
    • surface cover - % of bare earth, road, vegetation, built-up, water, building ,etc.
    • closeness to hydro body, train station, and green space

Temperature Data

  • Longpaddock SILO
    • Average and max summer percentile deviation
    • Excess Heat Factor (EHF) average and maximum
    • Days and average duration of excess heat, heatwave, and extreme heatwave days

SA1-Level Individual Record Linked Health Datasets

  • APDC admitted-patient-data-coll: 298,120,88 records for 4,540,451 patients
  • EDDC emergency-department-data-coll: 23,330,967 records for 4,273,272 patients
  • CODURF cause-of-death-unit-record-file: 221,787 records for 221,751 patients
  • RBDM registry-of-births-deaths-and-marriages-death-registrations: 264, 594 records for 264,530 patients

RMIT Urban Observatory Livability Data

(Hidden unpublished content under review.)

Data Pre-pocessing

AURIN vector data processing

  • transformation and spatialization if needed (e.g., NHSD)
  • refinement to study area
  • 2016 to 2021 ASGS concordance
    • n:m mapping resolution
  • Disaggreation (PHA and SA2 to SA1)
    • population based
    • number of sub-division area based
    • no division

SA1 Data Concordance
SA1 Data Concordance

Temperature raster data cube building, storage, and analysis

Creating Land Surface Temperature Data Cube from Raster
Creating Land Surface Temperature Data Cube from Raster

Deriving EHI and EHF to Identify Heat Days and Heat Waves
Deriving EHI and EHF to Identify Heat Days and Heat Waves

EHF is calculated based on the definition and algorithm provided by BOM.

EHF Algorithm
EHF Algorithm

Datasets we initially planned to use:

Dataset Spatial Resolution Temporal Resolution
MODIS 250m - 1km Daily, 8-day, 16-day, and monthly (depending on product)
Landsat-8 15m - 100m 16 days
BOM/SILO Varies (station-based) Daily, monthly, and annual
ERA5 ~31km Hourly

MODIS (Moderate Resolution Imaging Spectroradiometer) provides data with a spatial resolution ranging from 250 meters to 1 kilometer, depending on the specific product. The temporal resolution varies as well, with daily, 8-day, 16-day, and monthly options available.
Landsat-8 offers data with a spatial resolution of 15 meters for the panchromatic band, 30 meters for the visible and near-infrared bands, and 100 meters for the thermal infrared bands. Landsat-8 has a 16-day repeat cycle, which means it captures data for the same location every 16 days.
BOM (Bureau of Meteorology, Australia) provides weather station-based data, and its spatial resolution varies depending on the density and distribution of weather stations. BOM data is typically available on daily, monthly, and annual timescales.
Longpaddock SILO Processed weather station data
ERA5, a reanalysis dataset produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), has a spatial resolution of approximately 31 kilometers. It provides data at an hourly temporal resolution.
BOM The heatwave data (EHF) is available from late 2018 on wards only.\

Key processing steps:

  • download max/min daily temperature netcdf files from 2010 to 2022
  • merge nc to 2010_2022_max_temp.nc and 2011_2022_min_temp.nc
  • refine to study area and buff by 0.5, fill nan value
  • derive EHI, EHF, tx90, and tn90
  • perform SA1 level analysis of the previously mentioned variables (e.g., days and average) and derive shapefile

Geoscape building polygon data transformation and statistics

Building Data Statistics
Building Data Statistics

Geoscape landcover raster data analysis integration and statistics

Land Cover Data Transformation and Statistics
Land Cover Data Transformation and Statistics

raster input include 2m surface rasters in NSW, as well as two 30m surface rasters ACTNSW_SURFACECOVER_30M_Z55.tif and Z56.tif (others are not interesting with the study area)
  • preprocess 30m raster
    • clip the two 30m rasters with study area
    • resample the rasters to 2m
  • mosaic dataset creation
    • create a mosaic dataset in GDA94
    • add all rasters (original 30m and 2m) to it
    • avoid duplicated count from overlapping regions by setting Mosaic rule (By Attribute)
  • derive landuse shapefile
    • reclassify symbology to value with 0-12 from 256 (stretch -> discrete)
    • zonal histogram analysis
    • convert to shapefile (table to excel, and postprocessing)
    • manual validation and change field name

Service area network distance analysis


NHSD service accessibility

input:

  • osm line shapefile
  • nhsd service location point shapefile

step (using ArcGIS Pro):

  • make sure CRS is consistent, and refine shapefiles to sudy area
  • create feature dataset in geodatabase, add the line data to it, and create network dataset
  • Analysis (top task bar) -> Network Analysis -> Service Area
  • build network dataset (right-click the dataset from geodatabase)
  • service area layer (top task bar) -> import facilities
  • run service area analysis, save resulting polygons as /polygons<1000>/<2000>/<3000>.shp
  • generating a shapefile with service accessibility (e.g., number of GPs within 1000m) divided by SA1 area (Python). This could be done with tools such as Spatial Join and Summary Statistics


Closest train stations from each SA1

step changed (ArcGIS Pro):

  • using Closest Facility instead of Service Area
  • define train stations as facilities and SA1 centroids as incidents
  • compute route length for each SA1 centroid to the closest train station

Linked data curation, processing, and analysis

Health Indices Generation


Indices and sub-indices

Heat Health Indices Correlation
Heat Health Indices Correlation

(Archived) Microsoft building footprint Building Point Cloud Data Processing

The building footprint processing methodology consists of three steps.

  1. Hole Removal: A BuildingHoleRemover class is initialized with an input shapefile of building polygons. It reads the shapefile, removes small holes from the building polygons based on a minimum area, and saves the processed polygons to a new shapefile.
  2. Rebuffering: A BuildingReBuffer class is initialized with the input shapefile of processed building polygons. It applies a buffer to the building polygons and saves the buffered polygons to a new shapefile.
  3. Regularization: A regularize_building_footprints function takes an input feature class of building footprints, simplifies them using the Simplify Building tool in ArcGIS Pro, and saves the regularized footprints to an output feature class. An example result of the approach is shown below:

Process building point cloud data
Process building point cloud data

Comparison of processing result
Comparison of processing result

(Archived) SNOMED-CT-AU to ICD-10 Mapping

The package performs code mapping between SNOMED-CT-AU and ICD-10. The methodology involved mapping SNOMED CT codes to ICD codes using a mock-up dataset with 4,000+ entries. Four steps were taken: 1) using Snapper to resolve synonyms and unofficial names, 2) using SnoMap, 3) using the IHTSDO international SNOMED mapping tool for cross-validation, and 4) using the I-MAGIC mapper, which resolved some unmatched entries. The final result was 99.68% recall, with 13 unmatched codes. However, there are risks associated with the process: 1) the actual dataset might not be as clean as the mock-up, 2) the size of the real dataset is unknown and may affect tool usability, 3) precision has not been checked, making validation difficult, 4) a coder with professional knowledge may be required to resolve unmatched entries and disambiguate multi-match cases, and 5) differences between the Australian and international versions of the codes may impact the reliability of the results from international tools. In conclusion, the methodology achieved a high recall rate, but further validation and professional expertise may be needed to ensure accurate results.

The SnoMap tool
The SnoMap tool

The IHTSDO international SNOMED mapping tool
The IHTSDO international SNOMED mapping tool

The I-MAGIC tool
The I-MAGIC tool

About

Repository for scripts generated for the Australian Urban Health Indicator (AusUrbHI) project.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published