## Obtaining Climate Data from Google Earth Engine

To investigate how climate impacts human cases of rodent-borne disease, we must obtain historical climate data for a variety of countries. The climate parameters we are interested in, and the datasets we source them from, are listed below.

Parameters used in another paper:
- Tmean, Tmin, Tmax: mean, minimum, and maximum daily air temperature (all temperatures are the monthly averages of the corresponding daily values, at 2 m above ground, in °C)
- Pr: total precipitation in mm
- SD: total sunshine duration in hours
- ST: mean monthly soil temperature at 5 cm depth under uncovered typical soil of the location, in °C
- SM: soil moisture under grass and sandy loam, in percent plant-usable water

WorldClim for historical data? https://developers.google.com/earth-engine/datasets/catalog/UCSB-CHG_CHIRPS_PENTAD


|Parameter|Dataset|Resolution|Details|Reference|
|----|----|----|----|----|
|Mean daily land surface temperature|MODIS MOD11A1 v061|1 km|Daily temperature|https://doi.org/10.5067/MODIS/MOD11A1.061|
|Monthly total precipitation|CHIRPS 2.0|0.05 degrees (5566 meters)|Climate Hazards Group InfraRed Precipitation with Station data. Each image represents a pentad (5 days).|https://developers.google.com/earth-engine/datasets/catalog/UCSB-CHG_CHIRPS_PENTAD|
|Precipitation|IMERG GPM v6|11132 meters|Precipitation observations every 3 hours|https://doi.org/10.5067/GPM/IMERG/3B-HH-L/06|
|Soil temperature| | | | |
|Enhanced vegetation index|MODIS Terra Vegetation 16-Day Global 1km| | | |

In [19]:
# import packages
import ee
from datetime import datetime
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import os

Initialize Earth Engine. If this is the first time you are using GEE in a notebook, make sure that a project has been created and your account has been added, then run the authenticate command to link your account and enter the authorization code you receive from Google into the box. Additionally, set up the working directory.

In [20]:
ee.Authenticate()
ee.Initialize()

wd = '~/Dropbox/RBDML'


Successfully saved authorization token.
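
Once a token has been saved, re-running `ee.Authenticate()` on every kernel restart is usually unnecessary. A minimal sketch of a common pattern, assuming credentials may or may not already be stored: try to initialize first and fall back to the interactive flow only if that fails. The helper name `init_earth_engine` and the `project` placeholder are illustrative, and the module is passed in as an argument only so the sketch can be exercised without Earth Engine installed.

```python
def init_earth_engine(ee_module, project=None):
    """Initialize Earth Engine, authenticating only when initialization fails.

    ee_module is the imported `ee` package; project is an optional
    Cloud project ID (hypothetical placeholder, supply your own).
    Returns True if the interactive authentication flow was needed.
    """
    kwargs = {"project": project} if project else {}
    try:
        ee_module.Initialize(**kwargs)
        return False
    except Exception:
        # No usable stored credentials: run the interactive flow once, then retry.
        ee_module.Authenticate()
        ee_module.Initialize(**kwargs)
        return True
```

In this notebook it would be called as `init_earth_engine(ee)` after `import ee`.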


### Define Regions of Interest

The case data for hantavirus and Lassa fever cover only the following countries, selected using the criteria listed below. We will download gridded raster data for the climate parameters in these countries.

Country inclusion criteria:
- More than 10 cases in one month
- More than 300 total cases across all years
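
The two criteria map directly onto the per-country summary columns `max_monthly_cases` and `total_cases` in the country list. A minimal sketch in plain Python, assuming both criteria must hold (the text does not say whether they combine with AND or OR). The function name and the "Exampleland" row are hypothetical; the other two rows are copied from the country list preview.

```python
def meets_inclusion_criteria(row, min_monthly=10, min_total=300):
    """Return True when a country exceeds both case thresholds."""
    return row["max_monthly_cases"] > min_monthly and row["total_cases"] > min_total

rows = [
    {"country": "Austria", "max_monthly_cases": 46, "total_cases": 1134},
    {"country": "Chile", "max_monthly_cases": 18, "total_cases": 9055},
    # Hypothetical row that fails the monthly threshold
    {"country": "Exampleland", "max_monthly_cases": 4, "total_cases": 350},
]
included = [r["country"] for r in rows if meets_inclusion_criteria(r)]
print(included)
```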

Countries included:

In [24]:
# import csv with regions and date ranges
df = pd.read_csv('../../data/processed/country_list.csv')
#df.set_index('country', inplace=True)
df.head()

Unnamed: 0,country,minyear,maxyear,max_monthly_cases,total_cases
0,Austria,2008,2021,46,1134
1,Bolivia,2010,2023,219,9276
2,Brazil,2001,2020,65,544071
3,Chile,1995,2022,18,9055
4,China,2004,2023,4189,228738


Because we will use prior climate data to inform predictions, we obtain climate data from up to 2 years before the first case year by subtracting 2 from each country's minimum year.

In [26]:
df.minyear = df.minyear - 2

### 1. Obtain Land Surface Temperature (LST)

First we will define the functions needed:

In [None]:
# Earlier draft of sumDailyPrecip, kept commented out.
# The working version is defined in the next cell.
# def sumDailyPrecip(collection, date):
#     # Calculate the sum for the day
#     date = ee.Date(date)
#     return (collection
#             .filter(ee.Filter.date(date.getRange('day')))
#             .sum()
#             .set("day", date.format("DD"))
#             .set("month", date.format("MM"))
#             .set("year", date.format("YYYY"))
#             .set("system:index", ee.String(year).cat('-').cat(month).cat('-').cat(day)))

In [43]:
def scaleAndMask(img):
    '''
        Takes in an image and returns the day LST in Celsius, masked by quality.
    '''
    # Select the QA bands
    qc = img.select('QC_Day')

    # Create masks:
    qa_flag_mask = qc.bitwiseAnd(0b11).lt(2) # Bits 0-1 <= 1; good or other quality
    data_quality_mask = qc.bitwiseAnd(0b1100).rightShift(2).eq(0) # Bits 2-3 = 0; good data quality

    good_qc = qa_flag_mask.And(data_quality_mask)

    return(img.select('LST_Day_1km')
           .multiply(0.02) # apply the scale factor to get Kelvin
           .subtract(273.15) # convert from Kelvin to Celsius
           .updateMask(good_qc)
           .copyProperties(img, ['system:time_start']))

def eviMask(img):
    '''
        Takes in an image and returns the EVI, scaled and masked by quality.
    '''
    # Select the QA band
    qc = img.select('SummaryQA')

    # Create mask:
    qa_flag_mask = qc.bitwiseAnd(0b11).eq(0) # Bits 0-1 = 0; good data (1 marginal, 2 snow/ice, 3 cloudy)

    return(img.select('EVI')
           .multiply(0.0001)
           .updateMask(qa_flag_mask)
           .copyProperties(img, ['system:time_start']))



def sumDailyPrecip(collection, date):
    '''
        Calculate the sum for the day
    '''
    date = ee.Date(date)
    # Filter the collection for the given date
    filtered_collection = collection.filterDate(date, date.advance(1, 'day'))
    # Calculate the sum for the day
    daily_sum = filtered_collection.sum()
    # Get day, month, and year
    day = date.format("DD")
    month = date.format("MM")
    year = date.format("YYYY")
    # Set properties
    return daily_sum.set({
        "day": day,
        "month": month,
        "year": year,
        "system:index": ee.String(year).cat('-').cat(month).cat('-').cat(day)
    })


def sumMonthlyComposite(collection, date):
    '''
        Calculate the sum for a month for an image.
    '''
    date = ee.Date(date)
    return (collection
            .filterDate(date, date.advance(1, 'month'))
            .sum()
            .set("month", date.format("MM"))
            .set("year", date.format("YYYY"))
            .set("system:index", date.format("MM-YYYY")))

def meanMonthlyComposite(collection, date):
    '''
        Calculate the mean for a month for an image.
    '''
    date = ee.Date(date)
    return (collection
            .filterDate(date, date.advance(1, 'month'))
            .mean()
            .set("month", date.format("MM"))
            .set("year", date.format("YYYY"))
            .set("system:index", date.format("MM-YYYY")))


def fcToDict(fc):
  '''
    Turns a feature collection into a dictionary. 
  '''
  prop_names = fc.first().propertyNames()
  prop_lists = fc.reduceColumns(
      reducer=ee.Reducer.toList().repeat(prop_names.size()),
      selectors=prop_names).get('list')

  return ee.Dictionary.fromLists(prop_names, prop_lists)
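
The quality masks in `scaleAndMask` rely on bit fields packed into the `QC_Day` band. To make the bit arithmetic easier to check, here is a plain-Python sketch that applies the same two tests to a single integer QC value and the same scale-and-offset conversion to a single raw LST value. It makes no Earth Engine calls, and the function names are illustrative.

```python
def lst_qc_is_good(qc):
    """Mirror the two QC_Day tests on one integer pixel value."""
    mandatory_flag = qc & 0b11          # bits 0-1
    data_quality = (qc & 0b1100) >> 2   # bits 2-3
    return mandatory_flag < 2 and data_quality == 0

def raw_lst_to_celsius(raw):
    """Apply the 0.02 scale factor (giving Kelvin), then convert to Celsius."""
    return raw * 0.02 - 273.15

for qc in (0b0000, 0b0001, 0b0010, 0b0100):
    print(format(qc, "04b"), lst_qc_is_good(qc))
print(round(raw_lst_to_celsius(15000), 2))
```

Only the first two QC values pass: the third fails the bits 0-1 test and the fourth fails the bits 2-3 test.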

Next we need to create a list of the countries we will be using and load the image collection.

In [31]:
# Feature Collection for list of countries
# Using the FAO GAUL Admin Layers: https://developers.google.com/earth-engine/datasets/catalog/FAO_GAUL_2015_level0#table-schema
# Export for admin 1 level for all 
countries = ee.Filter.inList('ADM0_NAME', list(df['country'])) # match on the GAUL country name
regions = ee.FeatureCollection('FAO/GAUL/2015/level1').filter(countries)

# Dates required
#year = '2012'
#start = ee.Date('2012-01-01')
#end = ee.Date('2013-01-01')

# Image collection
# Import the MODIS land surface temperature collection.
#collection = ee.ImageCollection('MODIS/006/MOD11A1').filterDate(start,end).map(scaleAndMask)

### Export daily LST

In [490]:
collection = ee.ImageCollection('MODIS/061/MOD11A1').map(scaleAndMask)
collection_min_date = ee.Date(collection.aggregate_min('system:time_start'))
collection_max_date = ee.Date(collection.aggregate_max('system:time_start'))

for country in df['country']:
    # define date range
    start = ee.Date.fromYMD(df.loc[df['country'] == country, 'minyear'].item(), 1, 1)
    # filterDate treats the end date as exclusive, so use 1 January of the following year
    end = ee.Date.fromYMD(df.loc[df['country'] == country, 'maxyear'].item() + 1, 1, 1)

    # if dates are outside of collection range
    if start.difference(collection_min_date, 'days').getInfo() < 0: start = collection_min_date
    if end.difference(collection_max_date, 'days').getInfo() > 0: end = collection_max_date
    
    region = ee.FeatureCollection('FAO/GAUL/2015/level1').filter(ee.Filter.eq('ADM0_NAME', country))
    daily_lst_filtered = collection.filterDate(start, end)

    # Determine mean for admin region
    daily_lst = daily_lst_filtered.map(lambda image: image.reduceRegions(
        reducer=ee.Reducer.mean().setOutputs(['mean_lst']),
        collection=region,
        scale=1000
    )).flatten()

    # Export the FeatureCollection to Google Drive as a CSV file
    task = ee.batch.Export.table.toDrive(
        collection=daily_lst,
        description=country+'_DailyLST',
        fileFormat='CSV',
        folder='test',
        selectors=['system:index', 'ADM0_CODE', 'ADM0_NAME', 'ADM1_CODE', 'ADM1_NAME', 'mean_lst']
    )

    # Start the export task
    task.start()
    print('Starting task for '+ country)

Starting task for Austria
Starting task for Brazil
Starting task for Chile
Starting task for China
Starting task for Finland
Starting task for France
Starting task for Germany
Starting task for Nigeria
Starting task for Norway
Starting task for Slovakia
Starting task for Slovenia
Starting task for Sweden
Starting task for United States of America
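
The export loop clamps each country's requested period to the period the image collection actually covers, and `filterDate` treats its end argument as exclusive. A plain-Python sketch of both ideas using `datetime.date`, with made-up dates; the helper names are illustrative and nothing here calls Earth Engine.

```python
from datetime import date

def clamp_range(start, end, coll_min, coll_max):
    """Restrict [start, end) to the dates the collection covers."""
    return max(start, coll_min), min(end, coll_max)

def in_half_open_range(day, start, end):
    """True when start <= day < end, the convention filterDate follows."""
    return start <= day < end

# Requested period starts before the (made-up) collection start date
start, end = clamp_range(date(1999, 1, 1), date(2021, 1, 1),
                         date(2000, 2, 24), date(2023, 6, 30))
print(start, end)
print(in_half_open_range(date(2020, 12, 31), start, end))
print(in_half_open_range(date(2021, 1, 1), start, end))
```

Because the end is exclusive, an end date of 31 December leaves that day out, while 1 January of the following year includes it.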


Finland's case data are at the second administrative level, so we export LST for Finland at GAUL level 2 for more detail.

In [77]:
collection = ee.ImageCollection('MODIS/061/MOD11A1').map(scaleAndMask)
collection_min_date = ee.Date(collection.aggregate_min('system:time_start'))
collection_max_date = ee.Date(collection.aggregate_max('system:time_start'))

for country in ['Finland']:
    # define date range
    start = ee.Date.fromYMD(df.loc[df['country'] == country, 'minyear'].item(), 1, 1)
    # filterDate treats the end date as exclusive, so use 1 January of the following year
    end = ee.Date.fromYMD(df.loc[df['country'] == country, 'maxyear'].item() + 1, 1, 1)

    # if dates are outside of collection range
    if start.difference(collection_min_date, 'days').getInfo() < 0: start = collection_min_date
    if end.difference(collection_max_date, 'days').getInfo() > 0: end = collection_max_date
    
    region = ee.FeatureCollection('FAO/GAUL/2015/level2').filter(ee.Filter.eq('ADM0_NAME', country))
    daily_lst_filtered = collection.filterDate(start, end)

    # Determine mean for admin region
    daily_lst = daily_lst_filtered.map(lambda image: image.reduceRegions(
        reducer=ee.Reducer.mean().setOutputs(['mean_lst']),
        collection=region,
        scale=1000
    )).flatten()

    # Export the FeatureCollection to Google Drive as a CSV file
    task = ee.batch.Export.table.toDrive(
        collection=daily_lst,
        description=country+'_DailyLST',
        fileFormat='CSV',
        folder='GEE/Temperature/',
        selectors=['system:index', 'ADM0_CODE', 'ADM0_NAME', 'ADM1_CODE', 'ADM1_NAME', 'ADM2_CODE', 'ADM2_NAME', 'mean_lst']
    )

    # Start the export task
    task.start()
    print('Starting task for '+ country)

Starting task for Finland


## Precipitation Data

For CHIRPS data

In [59]:
collection = ee.ImageCollection('UCSB-CHG/CHIRPS/DAILY')
collection_min_date = ee.Date(collection.aggregate_min('system:time_start'))
collection_max_date = ee.Date(collection.aggregate_max('system:time_start'))

for country in df['country']:
    # define date range
    start = ee.Date.fromYMD(df.loc[df['country'] == country, 'minyear'].item(), 1, 1)
    # filterDate treats the end date as exclusive, so use 1 January of the following year
    end = ee.Date.fromYMD(df.loc[df['country'] == country, 'maxyear'].item() + 1, 1, 1)

    # if dates are outside of collection range
    if start.difference(collection_min_date, 'days').getInfo() < 0: 
        print("Data for "+country+" starts earlier than dataset.")
        start = collection_min_date
    if end.difference(collection_max_date, 'days').getInfo() > 0: 
        print("Data for "+country+" extends past dataset limit.")
        end = collection_max_date
    
    region = ee.FeatureCollection('FAO/GAUL/2015/level2').filter(ee.Filter.eq('ADM0_NAME', country))
    filtered_collection = collection.filterDate(start, end)

    # determine mean for admin region
    daily_precip = filtered_collection.map(lambda image: image.reduceRegions(
        reducer=ee.Reducer.mean().setOutputs(['precipitation']),
        collection=region,
        scale=5000
    )).flatten()

    # Export the FeatureCollection to Google Drive as a CSV file
    task = ee.batch.Export.table.toDrive(
        collection=daily_precip,
        description=country+'_DailyCHIRPSPrecip',
        fileFormat='CSV',
        folder='test',
        # regions are GAUL level 2, so keep the level-2 identifiers as well
        selectors=['system:index', 'ADM0_CODE', 'ADM0_NAME', 'ADM1_CODE', 'ADM1_NAME', 'ADM2_CODE', 'ADM2_NAME', 'precipitation']
    )

    # Start the export task
    task.start()
    print('Starting task for '+ country)

Data for Finland extends past dataset limit. Setting end date as 
Starting task for Finland
Starting task for Norway
Data for Sweden extends past dataset limit. Setting end date as 
Starting task for Sweden


For daily IMERG data (https://developers.google.com/earth-engine/datasets/catalog/NASA_GPM_L3_IMERG_V06#description)


precipitationCal: Multi-satellite precipitation estimate with gauge calibration (recommended for general use)
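
According to the catalog entry, IMERG V06 images are half-hourly and `precipitationCal` is a rate in mm/hr, so a daily total in mm is the sum of each rate multiplied by 0.5 hours rather than the plain sum of the rates. A small plain-Python sketch of that arithmetic with made-up rates; the function name is illustrative.

```python
def daily_total_mm(half_hourly_rates_mm_per_hr, step_hours=0.5):
    """Integrate precipitation rates (mm/hr) over fixed time steps to get mm."""
    return sum(rate * step_hours for rate in half_hourly_rates_mm_per_hr)

# 48 half-hourly steps: 2 mm/hr for the first 3 hours, dry afterwards (made-up values)
rates = [2.0] * 6 + [0.0] * 42
print(daily_total_mm(rates))  # 6.0 mm, whereas sum(rates) gives 12.0
```

If a plain sum over the half-hourly images is exported, halving it gives the same daily total.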

In [41]:
filtered_collection.first()

<ee.image.Image at 0x28945c9d0>

In [44]:
collection = ee.ImageCollection("NASA/GPM_L3/IMERG_V06")
collection_min_date = ee.Date(collection.aggregate_min('system:time_start'))
collection_max_date = ee.Date(collection.aggregate_max('system:time_start'))
collection = collection.select('precipitationCal')

for country in [df['country'][1]]:  # a single country for testing; wrap in a list so the loop does not iterate over characters
    # define date range
    #start = ee.Date.fromYMD(df.loc[df['country'] == country, 'minyear'].item(), 1, 1)
    #end = ee.Date.fromYMD(df.loc[df['country'] == country, 'maxyear'].item(), 12, 31)
    start = ee.Date.fromYMD(2012, 1, 1)
    end = ee.Date.fromYMD(2012, 1, 5)

    # if dates are outside of collection range
    if start.difference(collection_min_date, 'days').getInfo() < 0: 
        print("Data for "+country+" starts earlier than dataset.")
        start = collection_min_date
    if end.difference(collection_max_date, 'days').getInfo() > 0: 
        print("Data for "+country+" extends past dataset limit.")
        end = collection_max_date

    region = ee.FeatureCollection('FAO/GAUL/2015/level1').filter(ee.Filter.eq('ADM0_NAME', country))
    filtered_collection = collection.filterDate(start, end)

    n_days = end.difference(start, 'day').subtract(1)
    days = ee.List.sequence(0, n_days).map(lambda n : start.advance(n, 'day'))
    daily_collection = ee.ImageCollection(days.map(lambda date: sumDailyPrecip(filtered_collection, date)))

    # Reduce each daily image over the admin regions and copy the image's
    # date properties onto every resulting feature
    def reduceWithDate(image):
        stats = image.reduceRegions(
            reducer=ee.Reducer.mean().setOutputs(['precipitation']),
            collection=region,
            scale=10000
        )
        return stats.map(lambda feature: feature.copyProperties(image, ['day', 'month', 'year']))

    # Map over the daily images and flatten into a single feature collection
    transferred_precip = daily_collection.map(reduceWithDate).flatten()

    # Export the FeatureCollection to Google Drive as a CSV file
    task = ee.batch.Export.table.toDrive(
        collection=transferred_precip,
        description=country+'_sum_GPM',
        fileFormat='CSV',
        folder='GEE_data/precip_daily',
        selectors=['system:index', 'day', 'month', 'year', 'ADM0_CODE', 'ADM0_NAME', 'ADM1_CODE', 'ADM1_NAME', 'precipitation']
    )

    # Start the export task
    task.start()

KeyboardInterrupt: 

Trying the ERA5-Land monthly aggregates instead: https://developers.google.com/earth-engine/datasets/catalog/ECMWF_ERA5_LAND_MONTHLY_AGGR

In [147]:
collection = ee.ImageCollection("ECMWF/ERA5_LAND/MONTHLY_AGGR")
collection_min_date = ee.Date(collection.aggregate_min('system:time_start'))
collection_max_date = ee.Date(collection.aggregate_max('system:time_start'))
collection = collection.select('total_precipitation_sum')

for country in df['country']:
    # define date range
    start = ee.Date.fromYMD(df.loc[df['country'] == country, 'minyear'].item(), 1, 1)
    end = ee.Date.fromYMD(df.loc[df['country'] == country, 'maxyear'].item(), 12, 31)

    # if dates are outside of collection range
    if start.difference(collection_min_date, 'days').getInfo() < 0: 
        print("Data for "+country+" starts earlier than dataset.")
        start = collection_min_date
    if end.difference(collection_max_date, 'days').getInfo() > 0: 
        print("Data for "+country+" extends past dataset limit.")
        end = collection_max_date
    
    region = ee.FeatureCollection('FAO/GAUL/2015/level1').filter(ee.Filter.eq('ADM0_NAME', country))
    filtered_collection = collection.filterDate(start, end)

   # n_days = end.difference(start, 'day').subtract(1)
   # days = ee.List.sequence(0, n_days).map(lambda n : start.advance(n, 'day'))
   # daily_collection = ee.ImageCollection(days.map(lambda date: sumDailyPrecip(filtered_collection, date)))

    # determine mean for admin region
    monthly_precip = filtered_collection.map(lambda image: image.reduceRegions(
        reducer=ee.Reducer.mean().setOutputs(['precipitation_m']),
        collection=region,
        scale=10000
    )).flatten()

    # Export the FeatureCollection to Google Drive as a CSV file
    task = ee.batch.Export.table.toDrive(
        collection=monthly_precip,
        description=country+'_monthly_ERA5',
        fileFormat='CSV',
        folder='GEE_data/precip_monthly',
        selectors=['system:index', 'ADM0_CODE', 'ADM0_NAME', 'ADM1_CODE', 'ADM1_NAME', 'precipitation_m']
    )

    # Start the export task
    task.start()

Data for China extends past dataset limit.
Data for Finland extends past dataset limit.
Data for Germany extends past dataset limit.
Data for Nigeria extends past dataset limit.
Data for Sweden extends past dataset limit.
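
ERA5-Land reports `total_precipitation_sum` in meters (hence the `precipitation_m` output name), while CHIRPS reports millimeters. A minimal sketch of the conversion needed before the two sources are compared; the function name and the sample value are illustrative.

```python
def meters_to_mm(value_m):
    """Convert a precipitation depth from meters to millimeters."""
    return value_m * 1000.0

monthly_total_m = 0.0825  # made-up monthly total in meters
print(meters_to_mm(monthly_total_m))
```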


In [34]:
val = img.get('HQ_Precipitation')

In [39]:
collection = ee.ImageCollection('UCSB-CHG/CHIRPS/DAILY')
print(collection.first())

ee.Image({
  "functionInvocationValue": {
    "functionName": "Collection.first",
    "arguments": {
      "collection": {
        "functionInvocationValue": {
          "functionName": "ImageCollection.load",
          "arguments": {
            "id": {
              "constantValue": "UCSB-CHG/CHIRPS/DAILY"
            }
          }
        }
      }
    }
  }
})


In [40]:
collection = ee.ImageCollection("NASA/GPM_L3/IMERG_V06")
collection_min_date = ee.Date(collection.aggregate_min('system:time_start'))
collection_max_date = ee.Date(collection.aggregate_max('system:time_start'))
collection = collection.select('precipitationCal')

for country in df['country']:
    # define date range
    start = ee.Date.fromYMD(df.loc[df['country'] == country, 'minyear'].item(), 1, 1)
    # filterDate treats the end date as exclusive, so use 1 January of the following year
    end = ee.Date.fromYMD(df.loc[df['country'] == country, 'maxyear'].item() + 1, 1, 1)

    # if dates are outside of collection range
    if start.difference(collection_min_date, 'days').getInfo() < 0: 
        print("Data for "+country+" starts earlier than dataset.")
        start = collection_min_date
    if end.difference(collection_max_date, 'days').getInfo() > 0: 
        print("Data for "+country+" extends past dataset limit.")
        end = collection_max_date
    
    region = ee.FeatureCollection('FAO/GAUL/2015/level1').filter(ee.Filter.eq('ADM0_NAME', country))
    filtered_collection = collection.filterDate(start, end)

    n_days = end.difference(start, 'day').subtract(1)
    days = ee.List.sequence(0, n_days).map(lambda n : start.advance(n, 'day'))
    daily_collection = ee.ImageCollection(days.map(lambda date: sumDailyPrecip(filtered_collection, date)))

    # determine mean for admin region
    daily_precip = daily_collection.map(lambda image: image.reduceRegions(
        reducer=ee.Reducer.mean().setOutputs(['precipitation']),
        collection=region,
        scale=10000
    )).flatten()

    # Export the FeatureCollection to Google Drive as a CSV file
    task = ee.batch.Export.table.toDrive(
        collection=daily_precip,
        description=country+'_GPM',
        fileFormat='CSV',
        folder='GEE_data/precip',
        selectors=['system:index', 'ADM0_CODE', 'ADM0_NAME', 'ADM1_CODE', 'ADM1_NAME', 'precipitation']
    )

    # Start the export task
    task.start()
    print('Starting task for '+ country)

Starting task for Austria
Starting task for Brazil
Data for Chile starts earlier than dataset.
Starting task for Chile
Starting task for China
Data for Finland starts earlier than dataset.
Starting task for Finland
Starting task for France
Starting task for Germany
Starting task for Nigeria
Starting task for Norway
Starting task for Slovakia
Starting task for Slovenia
Starting task for Sweden
Data for United States of America starts earlier than dataset.
Starting task for United States of America
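
Each `task.start()` call only queues an export; the CSV appears in Drive once the task finishes. A hedged sketch of polling queued tasks until none is pending, assuming the task objects are collected in a list as they are started (the notebook does not do this) and relying on `status()` returning a dictionary with `state` and `description` keys. The helper name and polling interval are illustrative.

```python
import time

# States in which a task has not yet finished
PENDING_STATES = {"UNSUBMITTED", "READY", "RUNNING", "CANCEL_REQUESTED"}

def wait_for_tasks(tasks, poll_seconds=30, max_polls=None):
    """Poll export tasks until none is pending; return {description: state}."""
    polls = 0
    while True:
        statuses = [t.status() for t in tasks]
        pending = [s for s in statuses if s.get("state") in PENDING_STATES]
        polls += 1
        if not pending or (max_polls is not None and polls >= max_polls):
            return {s.get("description"): s.get("state") for s in statuses}
        time.sleep(poll_seconds)
```

With this helper, the loop would append each `task` to a list after `task.start()` and call `wait_for_tasks(tasks)` once at the end.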


## MODIS EVI Data

In [91]:
collection = ee.ImageCollection("MODIS/061/MOD13A2")
collection_min_date = ee.Date(collection.aggregate_min('system:time_start'))
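collection_max_date = ee.Date(collection.aggregate_max('system:time_start'))
# Apply the EVI scale factor and quality mask defined in eviMask
collection = collection.map(eviMask)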

#for country in ['Finland']:
country = 'Finland'
# define date range
start = ee.Date.fromYMD(df.loc[df['country'] == country, 'minyear'].item(), 1, 1)
end = ee.Date.fromYMD(df.loc[df['country'] == country, 'maxyear'].item(), 12, 31)

# if dates are outside of collection range
if start.difference(collection_min_date, 'days').getInfo() < 0: 
    print("Data for "+country+" starts earlier than dataset.")
    start = collection_min_date
if end.difference(collection_max_date, 'days').getInfo() > 0: 
    print("Data for "+country+" extends past dataset limit.")
    end = collection_max_date

region = ee.FeatureCollection('FAO/GAUL/2015/level2').filter(ee.Filter.eq('ADM0_NAME', country))
filtered_collection = collection.filterDate(start, end)

# determine mean for admin region
daily_evi = filtered_collection.map(lambda image: image.reduceRegions(
    reducer=ee.Reducer.mean().setOutputs(['EVI']),
    collection=region,
    scale=1000
)).flatten()

# Export the FeatureCollection to Google Drive as a CSV file
task = ee.batch.Export.table.toDrive(
    collection=daily_evi,
    description=country+'_MonthlyModisEVI',
    fileFormat='CSV',
    folder='GEE_data',
    selectors=['system:index', 'ADM0_CODE', 'ADM0_NAME', 'ADM1_CODE', 'ADM1_NAME', 'ADM2_CODE', 'ADM2_NAME', 'EVI']
)

# Start the export task
task.start()
print('Starting task for '+ country)

Data for Finland starts earlier than dataset.
Data for Finland extends past dataset limit.
Starting task for Finland


### Organize downloaded data

In [6]:
data_dir = '../../data/climate/'

In [7]:
# define a dictionary for filename, value name, and unit

climate_factors = {'DailyLST': ['mean_lst', 'celsius'],
            'CHIRPSPrecip': ['precipitation', 'mm'],
            'ModisEVI': ['EVI', 'spectral index']
            }

In [9]:
import os
import pandas as pd

# get and combine files
climate_df_list = []

for key, value in climate_factors.items():
    ext = key + '.csv'
    files = [file for file in os.listdir(data_dir) if file.endswith(ext)]

    df_list = []

    # create dataframe list
    for file_name in files:
        file_path = os.path.join(data_dir, file_name)
        file_df = pd.read_csv(file_path)  # separate name so the country list in df is not overwritten
        df_list.append(file_df)

    # combine to one dataframe
    climate_df = pd.concat(df_list, ignore_index=True)

    # rename column 
    climate_df.rename(columns={value[0]: 'value'}, inplace=True)
    climate_df['measure'] = value[0]
    climate_df['unit'] = value[1]

    # fix date column
    if key != 'CHIRPSPrecip':
        climate_df[['year', 'month', 'day', 'index']] = climate_df['system:index'].str.split('_', expand=True)
    else:
        climate_df['year'] = climate_df['system:index'].str[:4]
        climate_df['month'] = climate_df['system:index'].str[4:6]
        climate_df['day'] = climate_df['system:index'].str[6:8]
        climate_df['index'] = climate_df['system:index'].str[9:]

    climate_df = climate_df.drop(columns=['system:index', 'index'])

    # summarize by country and day (aggregating across admin units)
    if key != 'CHIRPSPrecip':
        climate_df = climate_df.groupby(['ADM0_NAME', 'ADM0_CODE', 'year', 'month', 'day']).agg({
            'value': 'mean',
         #   'ADM0_CODE': 'first',  
          #  'ADM0_NAME': 'first',
            'measure': 'first',
            'unit': 'first',
        }).reset_index()
    else:
        climate_df = climate_df.groupby(['ADM0_NAME', 'ADM0_CODE', 'year', 'month', 'day']).agg({
            'value': 'sum',
         #   'ADM0_CODE': 'first',  
         #   'ADM0_NAME': 'first',
            'measure': 'first',
            'unit': 'first',
        }).reset_index()

    # add to list of dataframes
    climate_df_list.append(climate_df)
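
The date handling above depends on how the flattened `system:index` strings are laid out. A plain-Python sketch of the two layouts the code assumes: underscore-separated `YYYY_MM_DD_<feature id>` for the MODIS exports and compact `YYYYMMDD_<feature id>` for CHIRPS. The example index values are made up to match those layouts, and the function name is illustrative.

```python
def parse_system_index(system_index, compact=False):
    """Split a flattened system:index into (year, month, day) strings."""
    if compact:
        # CHIRPS-style: date packed into the first eight characters
        return system_index[:4], system_index[4:6], system_index[6:8]
    # MODIS-style: date parts separated by underscores
    year, month, day = system_index.split("_")[:3]
    return year, month, day

print(parse_system_index("2012_01_05_00000000000000000003"))
print(parse_system_index("20120105_00000000000000000003", compact=True))
```

Checking a few real index values from each exported CSV against these layouts before running the full loop would catch a format mismatch early.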

In [12]:
out_df = pd.concat(climate_df_list)

In [15]:
# combine dfs
#df = pd.concat(climate_df_list, ignore_index=True)

# change column order
#df = df[['ADM0_NAME', 'ADM0_CODE', 'ADM1_NAME', 'ADM1_CODE', 'year', 'month', 'value', 'measure', 'unit']]

# save
out_df.to_csv(os.path.join(data_dir, '../processed/climate_data_daily.csv'), index=False)
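
The inclusion criteria count cases per month, so the daily file would likely need to be collapsed to months before it is joined to the case data. A minimal sketch of that step in plain Python, assuming temperature-like measures are averaged and precipitation is summed; the records are made up and the function name is hypothetical.

```python
from collections import defaultdict

def monthly_summary(records, how="mean"):
    """Collapse daily (country, year, month, value) records to one value per month."""
    groups = defaultdict(list)
    for country, year, month, value in records:
        groups[(country, year, month)].append(value)
    if how == "sum":
        return {key: sum(vals) for key, vals in groups.items()}
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}

daily = [
    ("Austria", "2012", "01", 1.0),
    ("Austria", "2012", "01", 3.0),
    ("Austria", "2012", "02", 5.0),
]
print(monthly_summary(daily))
print(monthly_summary(daily, how="sum"))
```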