# Download survey GeoTiffs

New, faster method for downloading DHS cluster images! Based on [this blog post by Noel Gorelick](https://gorelick.medium.com/fast-er-downloads-a2abd512aa26).

Adapted from code provided by Markus Pettersson.

Import, authenticate, and initialize the Earth Engine library.

In [1]:
import ee

ee.Authenticate()

# Initialize the Google Earth Engine API with the high volume end-point.
# See https://developers.google.com/earth-engine/cloud/highvolume
ee.Initialize(opt_url='https://earthengine-highvolume.googleapis.com')

Successfully saved authorization token.
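The high-volume endpoint is meant for many small requests issued concurrently, and the linked post pairs it with parallel workers and retries. The sketch below shows that general pattern only. `fetch_all`, `with_retries`, and the stand-in `fake_fetch` are hypothetical names used for illustration, not part of Earth Engine or of `satellite_sampling_annual_v2`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retries(fetch, item, tries=3, delay=0.0):
    """Call fetch(item), retrying on any exception up to `tries` attempts."""
    for attempt in range(1, tries + 1):
        try:
            return fetch(item)
        except Exception:
            if attempt == tries:
                raise
            # Wait a little longer after each failed attempt
            time.sleep(delay * attempt)


def fetch_all(items, fetch, max_workers=8, tries=3, delay=0.0):
    """Run fetch over all items in parallel threads, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda item: with_retries(fetch, item, tries, delay), items))


# Stand-in for a real download request (assumption: one request per sample)
def fake_fetch(sample_id):
    return f'{sample_id:05d}.tif'


file_names = fetch_all([0, 1, 2], fake_fetch)
```

In a real run, `fake_fetch` would be replaced by whatever function requests and saves a single image.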


In [3]:
# Import other libraries
import pandas as pd
import os
import satellite_sampling_annual_v2
import datetime

Read the CSV file with survey points.

In [4]:
interim_data_dir = '/mimer/NOBACKUP/groups/globalpoverty1/cindy/eoml_ch_wb/data/interim'
dhs_cluster_file_path = os.path.join(interim_data_dir, 'dhs_est_iwi.csv')
df = pd.read_csv(dhs_cluster_file_path)
df.head()

Unnamed: 0,country,survey_start_year,year,lat,lon,households,rural,iwi,dhs_id,image_file,...,iwi_1990_1992_est,iwi_1993_1995_est,iwi_1996_1998_est,iwi_1999_2001_est,iwi_2002_2004_est,iwi_2005_2007_est,iwi_2008_2010_est,iwi_2011_2013_est,iwi_2014_2016_est,iwi_2017_2019_est
0,south_africa,2016,2016,-34.463232,19.542468,6,1,70.723295,48830,./data/dhs_tifs/south_africa_2016/00743.tif,...,35.205078,30.981445,33.911133,43.969727,38.295898,33.579102,32.757568,38.330078,44.604492,49.267578
1,south_africa,2016,2016,-34.418873,19.188926,11,0,76.798705,48781,./data/dhs_tifs/south_africa_2016/00694.tif,...,49.243164,53.222656,56.29883,59.228516,60.98633,63.51563,66.223145,66.45508,66.137695,64.501953
2,south_africa,2016,2016,-34.412835,19.178965,4,0,81.053723,48828,./data/dhs_tifs/south_africa_2016/00741.tif,...,48.388672,51.97754,54.44336,58.71582,60.419923,63.03711,66.430664,65.934247,66.186523,64.25781
3,south_africa,2016,2016,-34.292107,19.563813,6,1,72.76688,48787,./data/dhs_tifs/south_africa_2016/00700.tif,...,21.789551,22.2229,20.300293,25.082397,27.207032,27.719727,26.94702,34.114584,36.865234,42.041016
4,south_africa,2016,2016,-34.1875,22.113079,3,0,77.864113,48756,./data/dhs_tifs/south_africa_2016/00669.tif,...,44.04297,46.875,49.617514,48.321533,53.23242,56.865233,65.01465,65.65755,72.90039,67.529297
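The `Unnamed: 0` column in the preview is what pandas typically produces when a CSV was written with its index included. Whether that is how `dhs_est_iwi.csv` was created is an assumption. If so, passing `index_col=0` to `pd.read_csv` would restore it as the index, as this small sketch with made-up data shows:

```python
import io

import pandas as pd

# Toy data standing in for the survey file (values are illustrative only)
csv_text = "country,year\nangola,2006\nbenin,1996\n"

# to_csv() writes the index by default, which adds an unnamed first column
saved = pd.read_csv(io.StringIO(csv_text)).to_csv()

reloaded_default = pd.read_csv(io.StringIO(saved))
reloaded_indexed = pd.read_csv(io.StringIO(saved), index_col=0)

print(list(reloaded_default.columns))  # ['Unnamed: 0', 'country', 'year']
print(list(reloaded_indexed.columns))  # ['country', 'year']
```

The extra column is harmless here, because the per-survey dataframes are re-indexed before file names are derived.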


Split the dataframe into one dataframe per country-year combination:

In [5]:
surveys_with_dfs = [(survey, survey_df.reset_index(drop=True)) for survey, survey_df in 
                    df.groupby(['country', 'year'])]
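To illustrate what this produces, here is a minimal sketch on a toy dataframe (the values are made up). Grouping by a list of two columns yields a `(country, year)` tuple as the key, and `reset_index(drop=True)` makes each group's index start from 0 again:

```python
import pandas as pd

toy = pd.DataFrame({
    'country': ['angola', 'angola', 'benin'],
    'year': [2006, 2006, 1996],
    'lat': [-8.8, -12.3, 6.4],
})

groups = [(key, group.reset_index(drop=True))
          for key, group in toy.groupby(['country', 'year'])]

for (country, year), group in groups:
    print(country, year, list(group.index))
# angola 2006 [0, 1]
# benin 1996 [0]
```

Because the index restarts per group, a row's index identifies it only within its own survey.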

Function that checks whether a sample has already been downloaded, so that the script can be restarted without downloading finished files again:

In [6]:
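def check_if_downloaded(row, save_dir, min_file_size=3145728):
    """Return True if the GeoTIFF for `row` exists in `save_dir` and is
    larger than `min_file_size` bytes (default 3145728 = 3 * 1024 * 1024).
    Smaller files are treated as missing or partial downloads.
    """
    # File names are the zero-padded row index, e.g. 00042.tif
    file_name = f'{row.name:05d}.tif'
    file_path = os.path.join(save_dir, file_name)

    # Check if file exists and is larger than min_file_size
    return os.path.isfile(file_path) and (os.stat(file_path).st_size > min_file_size)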
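A small sketch of how this check behaves, using a temporary directory and dummy files. The function is repeated so the example runs on its own, and `min_file_size=100` is chosen only to keep the dummy files tiny:

```python
import os
import tempfile

import pandas as pd


def check_if_downloaded(row, save_dir, min_file_size=3145728):
    file_path = os.path.join(save_dir, f'{row.name:05d}.tif')
    return os.path.isfile(file_path) and (os.stat(file_path).st_size > min_file_size)


toy = pd.DataFrame({'lat': [0.0, 1.0, 2.0]})

with tempfile.TemporaryDirectory() as save_dir:
    # Row 0: file above the size threshold
    with open(os.path.join(save_dir, '00000.tif'), 'wb') as f:
        f.write(b'x' * 200)
    # Row 1: file below the size threshold, as after an interrupted download
    with open(os.path.join(save_dir, '00001.tif'), 'wb') as f:
        f.write(b'x' * 10)
    # Row 2: no file at all

    is_downloaded = toy.apply(
        lambda row: check_if_downloaded(row, save_dir, min_file_size=100), axis=1)
    remaining = toy[~is_downloaded]

print(list(remaining.index))  # [1, 2]
```

Only the rows whose file is missing or too small remain to be downloaded.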

Download each survey from Google Earth Engine

In [7]:
for survey, survey_df in surveys_with_dfs:
    country, year = survey
    print(f'Downloading images for {country}-{year}...'+
        datetime.datetime.now().strftime("%d.%b %Y %H:%M:%S"))
       
    data_dir = '/mimer/NOBACKUP/groups/globalpoverty1/cindy/eoml_ch_wb/data/'    
    save_dir = os.path.join(data_dir, f'dhs_tifs_annual/{country}_{year}')        
           
    # Check if survey is already fully/partially downloaded
    if os.path.exists(save_dir):
        is_downloaded = survey_df.apply(lambda row: check_if_downloaded(row, save_dir), axis=1)
        samples_to_download = survey_df[~is_downloaded]
    else:
        os.makedirs(save_dir)
        samples_to_download = survey_df
    
    # If there are no samples to download for this survey, continue to next one
    if len(samples_to_download) == 0:
        continue
    
    satellite_sampling_annual_v2.export_images(samples_to_download, save_dir, span_length=1)

Downloading images for angola-2006...08.Oct 2023 08:44:20
Downloading images for benin-1996...08.Oct 2023 08:46:42
Downloading images for burkina_faso-1999...08.Oct 2023 08:51:40
Downloading images for burundi-2010...08.Oct 2023 08:53:58
Downloading images for cameroon-2004...08.Oct 2023 09:00:52
Downloading images for central_african_republic-1995...08.Oct 2023 09:13:21
Downloading images for chad-2014...08.Oct 2023 09:15:15
Downloading images for comoros-2012...08.Oct 2023 09:23:00
Downloading images for democratic_republic_of_congo-2007...08.Oct 2023 09:27:15
Downloading images for egypt-1996...08.Oct 2023 09:38:03
Downloading images for eswatini-2006...08.Oct 2023 09:38:51
Downloading images for ethiopia-2000...08.Oct 2023 09:43:52
Downloading images for gabon-2012...08.Oct 2023 10:08:13
Downloading images for ghana-1999...08.Oct 2023 10:17:38
Downloading images for guinea-1999...08.Oct 2023 10:21:35
Downloading images for ivory_coast-1999...08.Oct 2023 10:31:25
Downloading images 