This code is for processing air temperature (T2m) and land surface temperature (LST) data, and producing climate indices from both sources for comparison.
This is not a module, and the scripts should be run separately. Instructions for running the code, and the order in which to run the scripts, are given below.
Note that parallel processing is strongly advised for some scripts to speed up the processing.
Air temperature data is from GHCNd: https://www.ncei.noaa.gov/cdo-web/search?datasetid=GHCND
LST data is from LST_cci: https://catalogue.ceda.ac.uk/uuid/a7e811fe11d34df5abac6f18c920bbeb/
IPCC AR6 region data: https://github.com/IPCC-WG1/Atlas/blob/v2.0-final/reference-regions/IPCC-WGI-reference-regions-v4.geojson?short_path=b4bcbc8
- Download data
- Set up conf.json
- Run pre_run_code/assign_AR6regions.py
- Run processing/pre_process_ghcnd.py (parallel processing recommended)
- Run processing/pre_process_lst_cci.py (parallel processing recommended)
- Run these in any order:
  - analysis/indices.py
  - analysis/percentiles.py (parallel processing recommended)
- Run analysis/summarise_indices.py
- Run the plotting scripts (different scripts plot different indices)
The regions are usually identified by letter codes, but in this code they are replaced by numbers, which can be found in the shape files for the Working Group 1 Atlas.
Whenever regions are supplied to the code, either on the command line or by editing the scripts, users must supply the region numbers and not the region letter codes.
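To look up the number for a letter code, a minimal sketch along the following lines may help. It assumes (this README does not confirm either point) that each GeoJSON feature stores its letter code in a property called `Acronym` and that the region number is the feature's 0-based position in the file. The function name and the placeholder codes are hypothetical.

```python
import json


def region_numbers_by_code(geojson_text, code_property="Acronym"):
    """Map region letter codes to their 0-based position in the GeoJSON.

    Assumptions (not confirmed by this README): the letter code lives in a
    feature property named "Acronym", and the region number used by the
    scripts is the feature's position in the file.
    """
    features = json.loads(geojson_text)["features"]
    return {
        feature["properties"][code_property]: number
        for number, feature in enumerate(features)
    }


# Tiny stand-in for IPCC-WGI-reference-regions-v4.geojson.
# "AAA" and "BBB" are placeholder codes, not real AR6 regions.
example = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"Acronym": "AAA"}, "geometry": None},
        {"type": "Feature", "properties": {"Acronym": "BBB"}, "geometry": None},
    ],
})

lookup = region_numbers_by_code(example)
print(lookup)
```

When run against the real file, the result can be checked against the example given for the percentile plotting script, where NEU and WCE are supplied as 16 and 17.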
The configuration file conf.json stores custom filepaths and variables which can change the analysis:
- year_start - the start year of the analysis
- year_end - the end year of the analysis
- n_missing_days_allowed - the number of missing days allowed when calculating the monthly percentiles used for the percentile indices
- AR6_shape_file - location of IPCC-WGI-reference-regions-v4.geojson
- ghcnd_station_metadata - directory containing the .txt station metadata downloaded from GHCNd
- ghcnd_metadata - file name to save the processed metadata from the pre-run code to (this metadata includes AR6 regions and IDs, and is in .csv format)
- ghcnd_folder - directory containing the downloaded GHCNd station data
- lst_cci_folder - directory containing LST_cci data
- mw_file_template - template of microwave LST files
- ghcnd_output_dir - directory to save processed station data to
- lst_output_dir - directory to save processed LST data to
- results_dir - directory to save indices results to
- plot_dir - directory to save plots to
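As an illustration, the sketch below builds a configuration with the keys listed above, writes it out as JSON text, and checks that none of them is missing. Every value is a placeholder (the paths, the years, the missing-day allowance and the template string are assumptions, not the project's settings), and `missing_keys` is a hypothetical helper rather than part of the code base.

```python
import json

# Keys described in this README; the real conf.json may hold more.
LISTED_KEYS = [
    "year_start", "year_end", "n_missing_days_allowed", "AR6_shape_file",
    "ghcnd_station_metadata", "ghcnd_metadata", "ghcnd_folder",
    "lst_cci_folder", "mw_file_template", "ghcnd_output_dir",
    "lst_output_dir", "results_dir", "plot_dir",
]


def missing_keys(conf):
    """Return the listed keys that are absent from a configuration dict."""
    return [key for key in LISTED_KEYS if key not in conf]


# Placeholder values only: replace every entry with your own settings.
example_conf = {
    "year_start": 2000,
    "year_end": 2020,
    "n_missing_days_allowed": 5,
    "AR6_shape_file": "/path/to/IPCC-WGI-reference-regions-v4.geojson",
    "ghcnd_station_metadata": "/path/to/ghcnd/metadata_dir",
    "ghcnd_metadata": "/path/to/ghcnd/processed_metadata.csv",
    "ghcnd_folder": "/path/to/ghcnd/stations",
    "lst_cci_folder": "/path/to/lst_cci",
    "mw_file_template": "PLACEHOLDER_TEMPLATE",
    "ghcnd_output_dir": "/path/to/output/ghcnd",
    "lst_output_dir": "/path/to/output/lst",
    "results_dir": "/path/to/results",
    "plot_dir": "/path/to/plots",
}

conf_text = json.dumps(example_conf, indent=2)  # what conf.json could contain
conf = json.loads(conf_text)
print(missing_keys(conf))
```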
pre_run_code/assign_AR6regions.py
This script reads and formats the .txt metadata file provided by GHCNd. It also assigns AR6 regions and IDs to the stations for use later.
This takes about 10 minutes depending on computer power.
processing/pre_process_ghcnd.py
This script reads GHCNd station Daily Summaries and extracts quality controlled T2m data.
Run by using python processing/pre_process_ghcnd.py [path to conf.json]
Additional keyword arguments are:
- region - IPCC AR6 region to restrict stations to (defaults to None)
- batches - How many batches to split the data into (defaults to None)
- subset - Specific batch to process (defaults to None)
The batches and subset keywords are useful when parallel processing.
Using the defaults (i.e. None) processes all regions in a single batch.
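The idea behind batches and subset can be sketched as follows. This is not the script's actual splitting logic, which is not shown here: the function name is hypothetical, subset is assumed to be a 0-based batch number, and striding is just one reasonable way to keep batches evenly sized.

```python
def select_batch(stations, batches=None, subset=None):
    """Return the stations that belong to one batch.

    Assumption: subset counts batches from 0. With either keyword left as
    None, everything is returned as a single batch.
    """
    stations = list(stations)
    if batches is None or subset is None:
        return stations
    if not 0 <= subset < batches:
        raise ValueError("subset must be between 0 and batches - 1")
    # Taking every batches-th station keeps batch sizes within one of each other.
    return stations[subset::batches]


stations = [f"STATION{i:02d}" for i in range(10)]
parts = [select_batch(stations, batches=3, subset=k) for k in range(3)]
for k, part in enumerate(parts):
    print(k, part)
```

Each batch can then be handed to a separate process, and together the batches cover every station exactly once.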
Data are written to one Pickle file per station in the conf.json[ghcnd_output_dir]/BASE directory.
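For a quick look at the per-station output, something like the sketch below could be used. The `.pkl` extension, the file naming and the contents of each file are assumptions (the README does not specify them), and `load_station_files` is a hypothetical helper. If the files hold pandas objects, pandas must be installed for unpickling to work, and `pandas.read_pickle` would be the more usual call.

```python
import pickle
import tempfile
from pathlib import Path


def load_station_files(directory, pattern="*.pkl"):
    """Load every Pickle file in a directory, keyed by file name stem.

    Only unpickle files you created or trust: unpickling can run code.
    """
    directory = Path(directory)
    return {
        path.stem: pickle.loads(path.read_bytes())
        for path in sorted(directory.glob(pattern))
    }


# Demonstration with a throwaway directory standing in for .../BASE.
with tempfile.TemporaryDirectory() as tmp:
    base_dir = Path(tmp) / "BASE"
    base_dir.mkdir()
    # Made-up station content, purely to exercise the loader.
    (base_dir / "STATION_A.pkl").write_bytes(pickle.dumps({"tmax": [3.1, 7.4]}))
    loaded = load_station_files(base_dir)

print(sorted(loaded))
```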
processing/pre_process_lst_cci.py
This script iterates through GHCNd station Pickle files and for each station gets the LST from LST_cci data. The process co-locates the grid cells with station locations, extracts quality controlled data, and produces a timeseries of temperatures.
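The co-location step can be pictured with the sketch below, which finds the grid cell containing a station on a regular global latitude/longitude grid. The 0.25 degree resolution, the grid origin at (-90, -180) and the function name are assumptions for illustration; the script's own method and the layout of the LST_cci files may differ.

```python
def containing_cell(lat, lon, resolution=0.25):
    """Return (row, col) of the grid cell that contains a point.

    Assumes a regular global grid whose row 0 starts at latitude -90 and
    whose column 0 starts at longitude -180, with square cells of
    `resolution` degrees.
    """
    n_rows = round(180.0 / resolution)
    n_cols = round(360.0 / resolution)
    # Clamp so that latitude +90 falls in the last row.
    row = min(int((lat + 90.0) / resolution), n_rows - 1)
    # Wrap so that longitude +180 maps back onto column 0.
    col = int((lon + 180.0) / resolution) % n_cols
    return row, col


print(containing_cell(51.5, -0.1))
print(containing_cell(90.0, 180.0))
```

Once the cell is known, the values of that cell through time form the station's LST timeseries.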
Run by using python processing/pre_process_lst_cci.py [path to conf.json]
Additional keyword arguments are:
- region - IPCC AR6 region to restrict stations to (defaults to None)
- batches - How many batches to split the data into (defaults to None)
- subset - Specific batch to process (defaults to None)
The batches and subset keywords are useful when parallel processing.
Using the defaults (i.e. None) processes all regions in a single batch.
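One way to drive the batches from Python is to build one command per batch, as in the sketch below. The flag spellings `--batches`, `--subset` and `--region` are assumptions made by analogy with the `--index` and `--region` flags shown for the percentile plotting script, and subset is assumed to count from 0; check the script's argument parser before relying on them. `batch_commands` is a hypothetical helper.

```python
import sys


def batch_commands(script, conf_path, batches, region=None):
    """Build one command line per batch for a processing script.

    Assumed flags: --batches, --subset and --region (not confirmed by the
    README). Assumed numbering: subset runs from 0 to batches - 1.
    """
    commands = []
    for subset in range(batches):
        command = [sys.executable, script, conf_path,
                   "--batches", str(batches), "--subset", str(subset)]
        if region is not None:
            command += ["--region", str(region)]
        commands.append(command)
    return commands


commands = batch_commands("processing/pre_process_lst_cci.py", "conf.json",
                          batches=4, region=16)
for command in commands:
    print(" ".join(command[1:]))
```

Each command could then be started with `subprocess.Popen` and waited on, or submitted as a separate job on a cluster scheduler.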
Data are written to one Pickle file per station in the conf.json[lst_output_dir]/BASE directory.
analysis/indices.py
Computes the value-based and threshold-based indices listed in config_and_paths/index_details.py
Run by using python analysis/indices.py [path to conf.json]
Additional keyword arguments are:
- region - IPCC AR6 region to restrict stations to (defaults to None)
- batches - How many batches to split the data into (defaults to None)
- subset - Specific batch to process (defaults to None)
The batches and subset keywords are useful when parallel processing.
Using the defaults (i.e. None) processes all regions in a single batch.
Indices are written to per-station Pickle files in the conf.json[ghcnd_output_dir]/INDEX directory.
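To make the two index families concrete, the sketch below computes one value-based and one threshold-based quantity in the style of the ETCCDI indices (annual maximum of daily maximum temperature, and number of days with daily minimum below 0 °C). Whether these particular indices appear in config_and_paths/index_details.py is not stated here, and the function names and sample values are illustrative only.

```python
def annual_maximum(daily_values):
    """Value-based index: highest valid daily value (None marks missing)."""
    valid = [value for value in daily_values if value is not None]
    return max(valid) if valid else None


def days_below(daily_values, threshold):
    """Threshold-based index: count of valid days strictly below a threshold."""
    return sum(1 for value in daily_values
               if value is not None and value < threshold)


# Made-up daily values in degrees Celsius, with missing days as None.
tmax = [3.1, None, 7.4, 5.0]
tmin = [-2.0, 0.5, None, -0.1]

print(annual_maximum(tmax))   # 7.4
print(days_below(tmin, 0.0))  # 2
```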
analysis/percentiles.py
Computes the percentile-based indices listed in config_and_paths/index_details.py. This requires computing monthly percentiles.
Run by using python analysis/percentiles.py [path to conf.json]
Additional keyword arguments are:
- region - IPCC AR6 region to restrict stations to (defaults to None)
- batches - How many batches to split the data into (defaults to None)
- subset - Specific batch to process (defaults to None)
- multiprocess - include this to process the T2m and LST percentile-based indices simultaneously; otherwise they are processed sequentially
The batches and subset keywords are useful when parallel processing. The multiprocess keyword can be used in conjunction with the other keywords, but users should ensure that a sufficient number of processors is available.
Using the defaults (i.e. None) processes all regions in a single batch, with T2m and LST processed sequentially.
Indices are written to per-station Pickle files in the conf.json[ghcnd_output_dir]/INDEX directory.
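The role of n_missing_days_allowed in the monthly percentiles can be sketched as below. How the script pools days into a month, and which interpolation rule it uses, are not described in this README; the linear-interpolation percentile and both function names are assumptions chosen for illustration.

```python
import math


def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation between ranks."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def monthly_percentile(daily_values, q, n_missing_days_allowed):
    """Percentile of one month's daily values, or None if too many are missing.

    Missing days are represented by None.
    """
    valid = [value for value in daily_values if value is not None]
    n_missing = len(daily_values) - len(valid)
    if not valid or n_missing > n_missing_days_allowed:
        return None
    return percentile(valid, q)


print(percentile([10, 20, 30, 40, 50], 50))        # 30.0
print(monthly_percentile([10, None, 30], 50, 0))   # None
print(monthly_percentile([10, None, 30], 50, 1))   # 20.0
```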
analysis/summarise_indices.py
As all indices are kept in separate per-station files, this script pools the results for stations in the same IPCC AR6 region.
Run by using python analysis/summarise_indices.py
Currently this script reads the configuration file with default of ./conf.json. Users will need to change this path in the script if their configuration file isn't in the root directory.
Outputs two sets of files:
- Separate T2m and LST files per index per region, holding all the index results as timeseries. Stored at conf.json[region_station_index_file].
- Separate T2m and LST files per index, holding the different regional index means as timeseries. Stored at conf.json[index_regiona_mean_file].
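The pooling and regional averaging described above can be sketched as follows. The data layout (one dictionary of year to value per station), the function name and the sample numbers are assumptions for illustration; the script's real data structures are not shown in this README.

```python
from collections import defaultdict


def regional_means(station_series, station_region):
    """Pool per-station yearly index values by region and average them.

    station_series: {station_id: {year: value or None}}
    station_region: {station_id: region number}
    Returns {region number: {year: mean of the valid station values}}.
    """
    pooled = defaultdict(lambda: defaultdict(list))
    for station, series in station_series.items():
        region = station_region[station]
        for year, value in series.items():
            if value is not None:  # skip missing years
                pooled[region][year].append(value)
    return {
        region: {year: sum(values) / len(values)
                 for year, values in sorted(years.items())}
        for region, years in pooled.items()
    }


# Made-up index values for three hypothetical stations.
series = {
    "A": {2001: 10.0, 2002: 12.0},
    "B": {2001: 14.0, 2002: None},
    "C": {2001: 20.0},
}
regions = {"A": 16, "B": 16, "C": 17}

means = regional_means(series, regions)
print(means)
```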
Plots the index results, producing 2D density plots of T2m index values against LST index values, and timeseries of the T2m and LST index means.
For value-based and threshold-based indices, run either:
- python plotting/plot_value_indices.py
- python plotting/plot_threshold_indices.py
To select specific regions or indices, edit the script: pass the desired regions and/or indices as lists to the region and/or indices keywords where the ValueIndexPlotter or ThresholdIndexPlotter is instantiated.
Currently these scripts read the configuration file with default of ./conf.json. Users will need to change this path in the script if their configuration file isn't in the root directory.
For percentile-based indices, run:
python plotting/plot_percentile_indices.py
This can be used with keywords:
- index - the positions of the chosen indices in config_and_paths/index_details.py, e.g. for TX95p and TN5p --index 0 9. Defaults to -1, i.e. all percentile-based indices.
- region - the IPCC AR6 regions to produce percentile-based index plots for, e.g. for regions NEU and WCE --region 16 17.
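Flags that take several integers, as in the examples above, are commonly parsed with `argparse` and `nargs="+"`. The sketch below shows that pattern; it is not the script's actual parser. Storing the default index as the list `[-1]` and leaving region as `None` by default are assumptions.

```python
import argparse


def build_parser():
    """Parser accepting one or more integers for --index and --region."""
    parser = argparse.ArgumentParser(
        description="Sketch of a parser for percentile index plotting.")
    parser.add_argument("--index", type=int, nargs="+", default=[-1],
                        help="positions of the chosen indices; -1 means all")
    parser.add_argument("--region", type=int, nargs="+", default=None,
                        help="IPCC AR6 region numbers, not letter codes")
    return parser


args = build_parser().parse_args(["--index", "0", "9", "--region", "16", "17"])
print(args.index, args.region)  # [0, 9] [16, 17]

defaults = build_parser().parse_args([])
print(defaults.index, defaults.region)  # [-1] None
```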