# Download timeseries climate data for cities

This notebook downloads, cleans and saves timeseries data from a climate model (GCM) and from the ERA-Interim reanalysis.

**Requires:**
* Baspy module https://github.com/scott-hosking/baspy
* Run `python setup.py install` to use modules

**Notes:**
* Data is saved at `/data_directory/filetype/variable_label/City_rcpX.nc` (for GCM) or `/data_directory/filetype/variable_label/City_ERAI.nc` (for ERA-Interim). You can specify the parent `data_directory` and the `variable_label` in `settings`.
* Extracting one city for one GCM takes around 3-5 minutes on JASMIN; ERA-Interim is quicker.
* Downloading GCM data over a period that spans past and future (e.g. from 1980 to 2050) concatenates the historical run and the future RCP run into one timeseries.
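
The file layout described in the first note can be sketched as follows. This is only an illustration: `output_path` is a hypothetical helper, not part of the `downloader` package, and it assumes the `.nc` extension shown above (the name used when `filetype` is `'df'` is not stated in this notebook).

```python
from pathlib import PurePosixPath


def output_path(data_directory, filetype, variable_label, city, rcp=None):
    """Build the save path described in the notes (hypothetical helper)."""
    # GCM files carry the RCP scenario in their name; ERA-Interim files use 'ERAI'.
    suffix = rcp if rcp is not None else "ERAI"
    filename = f"{city}_{suffix}.nc"
    return PurePosixPath(data_directory) / filetype / variable_label / filename


# Example with the values used in the settings of this notebook.
print(output_path("../data/riskindex/", "netcdf", "max_temperature", "London", rcp="rcp85"))
print(output_path("../data/riskindex/", "netcdf", "max_temperature", "London"))
```

With two cities and two RCP scenarios, this layout would give four GCM files and two ERA-Interim files under the same `variable_label` folder.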


In [4]:
settings = {
    # --------------------------- #
    #  Location / variable        #
    # --------------------------- #
    # 'cities': ['London', 'NYC', 'Beijing', 'Tokyo', 'Madrid'],  # Cities to download
    'cities': ['London', 'NYC'],

    'variable_name_gcm': 'tasmax',  # GCM official variable name e.g. 'tas' (mean T), 'tasmax' (max T), 'tasmin' (min T)
    'variable_name_era': 't2max',   # ERAI official variable name e.g. 'T2' (mean T), 't2max' (max T), 't2min' (min T)
    'variable_label': 'max_temperature',  # Downloaded data will go in this folder; ensure the description is accurate

    'no_leap_years': True,   # Remove every 29 February to get a consistent 365-day calendar

    # --------------------------- #
    #  GCM                        #
    # --------------------------- #
    'model': 'HadGEM2-CC',  # Climate model
    'future_rcp': ['rcp85', 'rcp45'],  # Model RCP
    'model_start': 1980,  # Model start year (inclusive)
    'model_end': 2050,  # Model end year (inclusive)

    # --------------------------- #
    #  ERA Interim                #
    # --------------------------- #
    'observed_start': 1980,  # ERA Interim start year (inclusive)
    'observed_end': 2017,  # ERA Interim end year (inclusive)

    # --------------------------- #
    #  Saving files               #
    # --------------------------- #
    'save_coords': True,  # Optional: save loaded coordinates from cities
    'filetype': 'netcdf',  # File type to save: 'netcdf' or 'df'
    'data_directory': '../data/riskindex/'  # Directory to save created data
}
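
To illustrate what the `no_leap_years` option is described as doing, here is a minimal standard-library sketch that drops every 29 February from a daily date range. The functions `daily_dates` and `drop_leap_days` are hypothetical names used for illustration; the package's own implementation is not shown in this notebook and presumably works on the time coordinate of the downloaded data instead.

```python
from datetime import date, timedelta


def daily_dates(start_year, end_year):
    """Yield every calendar day from 1 Jan start_year to 31 Dec end_year (inclusive)."""
    day = date(start_year, 1, 1)
    last = date(end_year, 12, 31)
    while day <= last:
        yield day
        day += timedelta(days=1)


def drop_leap_days(dates):
    """Keep every date except 29 February."""
    return [d for d in dates if not (d.month == 2 and d.day == 29)]


# 1980 and 1984 are leap years, so two days are removed from this range.
days = drop_leap_days(daily_dates(1980, 1984))
print(len(days) == 5 * 365)
```

After this filter every year contributes exactly 365 entries, which makes it straightforward to compare the same day of year across years.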

In [5]:
from downloader.get1D import ClimateDataProcessing
cd = ClimateDataProcessing(settings)

## Download clean data for all cities in one line:

### GCM

In [6]:
cd.save_clean_gcm_data()

Geopy found the following locations:
* 'London': London, Greater London, England, SW1A 2DX, United Kingdom
* 'NYC': New York, United States of America


IsADirectoryError: [Errno 21] Is a directory: './data/riskindex/coords'

### ERA-Interim

In [None]:
cd.save_clean_era_data()

### If you want to manually specify coordinates:

In [None]:
coords = {}
coords['Some_city1'] = {'longitude': -10, 'latitude': 20}
coords['Some_city2'] = {'longitude': 100, 'latitude': 0}  # latitude must lie in [-90, 90]

cd.save_clean_gcm_data(coords=coords)
cd.save_clean_era_data(coords=coords)
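
Before passing hand-written coordinates, it can help to check them. The sketch below uses a hypothetical `validate_coords` helper (not part of `ClimateDataProcessing`) and assumes longitudes are given in the range -180 to 180 degrees, as the negative longitude in the example above suggests; a dataset using 0 to 360 would need a different bound.

```python
def validate_coords(coords):
    """Raise ValueError if an entry lacks a key or is out of range (hypothetical helper)."""
    for city, point in coords.items():
        # Assumed bounds: longitude in [-180, 180], latitude in [-90, 90].
        for key, limit in (("longitude", 180), ("latitude", 90)):
            if key not in point:
                raise ValueError(f"{city}: missing '{key}'")
            if not -limit <= point[key] <= limit:
                raise ValueError(f"{city}: {key} {point[key]} outside [-{limit}, {limit}]")
    return coords


example = {"Some_city1": {"longitude": -10, "latitude": 20}}
validate_coords(example)  # valid entries pass silently and the dict is returned
```

A check like this fails fast on a swapped or mistyped value, rather than several minutes into an extraction.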

### Load saved data

In [None]:
da_gcm = cd.load_data_at_location(city='London', rcp=45)
da_era = cd.load_data_at_location(city='London', model='era')

In [None]:
da_gcm

In [None]:
da_era

In [None]:
# To convert to pandas dataframe:
from scripts import dataprocessing
df_gcm = dataprocessing.da_to_df(da_gcm)
df_gcm.head()

## Behind ClimateDataProcessing