# Introduction

The *pyveg* package contains some useful functions for interacting with the Python API of Google Earth Engine to download images, and also code to process these images and prepare them for analysis.

In particular, we want to look at the "connectedness" of patterned vegetation.  To do this, we download NDVI (Normalised Difference Vegetation Index) images from GEE and use image processing techniques to convert them into binary black-and-white images. We then divide these into 50x50 pixel sub-images and run a network analysis on each one.
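As a rough illustration of the first step, the sketch below computes NDVI from near-infrared and red reflectance using the standard definition, NDVI = (NIR - Red) / (NIR + Red), and then binarizes the result. The fixed threshold, the helper names `ndvi` and `binarize`, and the pixel values are all assumptions made for this example; this is not the pyveg implementation.

```python
def ndvi(nir, red):
    """Return the NDVI for one pixel from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for empty pixels
    return (nir - red) / total


def binarize(ndvi_rows, threshold):
    """Map each NDVI value to 1 (above threshold) or 0 (otherwise)."""
    return [[1 if value > threshold else 0 for value in row] for row in ndvi_rows]


# Made-up reflectance values for a 2x2 image, for illustration only.
nir_band = [[0.50, 0.30], [0.20, 0.60]]
red_band = [[0.10, 0.30], [0.20, 0.20]]

ndvi_image = [
    [ndvi(n, r) for n, r in zip(nir_row, red_row)]
    for nir_row, red_row in zip(nir_band, red_band)
]

# Hypothetical fixed threshold; a real pipeline may choose it adaptively.
binary_image = binarize(ndvi_image, threshold=0.3)
print(binary_image)
```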

Before we can use GEE, we need to authenticate (assuming we have an account).

In [1]:
import ee
ee.Authenticate()

Enter verification code: 4/4gHY1mbA7s9uO-zuYK0K7y5Q-BdxFj5xfLOMEebXFBBueb3ZsZgiEeU

Successfully saved authorization token.


## Pipelines, Sequences, and Modules

In pyveg, we have the concept of a "Pipeline" for downloading and processing data from GEE.

A Pipeline is composed of one or more Sequences, which are in turn composed of Modules.

A Module is a class designed for one specific task (e.g. "download vegetation data from GEE", or "calculate network centrality of binary images"). Modules are generally grouped into Sequences such that each Module works on the output of the previous one.  
So our standard Pipeline has:
* A vegetation Sequence consisting of VegetationDownloader, VegetationImageProcessor, NetworkCentralityCalculator, and NDVICalculator.
* A weather Sequence consisting of WeatherDownloader and WeatherImageToJSON.
* A combiner Sequence consisting of a single combiner Module, that takes the outputs of the other two Sequences and produces a final output file.
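The nesting described above can be pictured with a few toy classes. This is only a sketch of the idea: the classes below share names with the pyveg concepts but are hypothetical stand-ins, and the `task` functions are placeholders rather than real download or processing code.

```python
class Module:
    """One task: takes the previous Module's output and returns its own."""

    def __init__(self, name, task):
        self.name = name
        self.task = task  # a function: previous output -> new output

    def run(self, data):
        return self.task(data)


class Sequence:
    """An ordered chain of Modules, each fed by the one before it."""

    def __init__(self, name):
        self.name = name
        self.modules = []

    def add(self, module):
        self.modules.append(module)

    def run(self, data=None):
        for module in self.modules:
            data = module.run(data)
        return data


class Pipeline:
    """A collection of Sequences, run one after another."""

    def __init__(self, name):
        self.name = name
        self.sequences = []

    def run(self):
        return {seq.name: seq.run() for seq in self.sequences}


# Placeholder tasks standing in for "download" and "process".
vegetation = Sequence("vegetation")
vegetation.add(Module("download", lambda _: ["image.tif"]))
vegetation.add(Module("process", lambda files: [f.replace(".tif", ".png") for f in files]))

pipeline = Pipeline("demo")
pipeline.sequences.append(vegetation)
results = pipeline.run()
print(results)
```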

### Running the full pipeline from the command-line

In the second part of this notebook we will demonstrate running individual Modules and Sequences, but the majority of users will probably just want to run the full Pipeline for their selected location/collection/date range, so we will cover that first.

We have a couple of "entrypoints" (i.e. command-line commands) linked to functions in some pyveg scripts to help do this.  
* To configure and run a downloading-and-processing pipeline we run the command `pyveg_run_pipeline --config_file <some-config-file>`
* To generate the config file in the above command we have the command `pyveg_generate_config`.

Both of these commands accept several command-line arguments, which can be listed with the `--help` argument:

In [2]:
!pyveg_generate_config --help

usage: pyveg_generate_config [-h] [--coords_id COORDS_ID]
                             [--configs_dir CONFIGS_DIR]
                             [--collection_name COLLECTION_NAME]
                             [--output_dir OUTPUT_DIR] [--test_mode]
                             [--latitude LATITUDE] [--longitude LONGITUDE]
                             [--country COUNTRY] [--start_date START_DATE]
                             [--end_date END_DATE]
                             [--time_per_point TIME_PER_POINT]
                             [--region_size REGION_SIZE]
                             [--pattern_type PATTERN_TYPE]
                             [--run_mode RUN_MODE] [--n_threads N_THREADS]

Create a config file for running pyveg_pipeline. If run with no arguments
(recommended), the user will be prompted for each parameter, or can choose a
default value.

optional arguments:
  -h, --help            show this help message and exit
  --coords_id COORDS_ID
         

For `pyveg_generate_config`, any required parameters that are not provided as command-line arguments will be requested from the user. The allowed options are listed, along with (in most cases) a default value that is used if the user just presses "enter".
Running `pyveg_generate_config` with no arguments and responding to the prompts is probably the easiest approach on the command line, but it doesn't seem to work so well with Jupyter, so let's just provide all the arguments it needs:

In [3]:
!pyveg_generate_config --configs_dir ../../pyveg/configs --collection_name Sentinel2 --output_dir ./ --test_mode --latitude 11.58 --longitude 27.94 --country Sudan --start_date 2019-01-01 --end_date 2019-04-01 --time_per_point 1m --run_mode local --n_threads 2   --region_size 0.08 --pattern_type 'unknown'


    output_location ./Sentinel2-11.58N-27.94E-Sudan
    collection: Sentinel2
    latitude: 11.58
    longitude: 27.94
    country: Sudan
    pattern_type: unknown
    start_date: 2019-01-01
    end_date: 2019-04-01
    time_per_point: 1m
    region_size: 0.08
    run_mode: local
    n_threads: 2
    
Wrote file 
  ../../pyveg/configs/testconfig_Sentinel2_11.58N_27.94E_Sudan_0.08_unknown_2019-01-01_2019-04-01_1m_local.py
We recommend that you add and commit this to your version control repository.

To run pyveg using this configuration, do:

pyveg_run_pipeline --config_file ../../pyveg/configs/testconfig_Sentinel2_11.58N_27.94E_Sudan_0.08_unknown_2019-01-01_2019-04-01_1m_local.py




We can see from the output that a new config file has been written, and the command we should use to run with it.
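The generated file name encodes most of the parameters we supplied, so it is easy to mistype. The sketch below rebuilds the name from its parts, following the pattern visible in the output above. The function `config_filename` is hypothetical, and the `"config"` prefix for non-test runs and the fixed `N`/`E` suffixes are assumptions; only the test-mode name shown above is confirmed by the output.

```python
def config_filename(collection, latitude, longitude, country, region_size,
                    pattern_type, start_date, end_date, time_per_point,
                    run_mode, test_mode=False):
    """Join the parameters with underscores, mirroring the name printed above."""
    # Assumption: non-test runs use a plain "config" prefix.
    prefix = "testconfig" if test_mode else "config"
    parts = [
        prefix, collection, f"{latitude}N", f"{longitude}E", country,
        str(region_size), pattern_type, start_date, end_date,
        time_per_point, run_mode,
    ]
    return "_".join(parts) + ".py"


name = config_filename(
    "Sentinel2", 11.58, 27.94, "Sudan", 0.08, "unknown",
    "2019-01-01", "2019-04-01", "1m", "local", test_mode=True,
)
print(name)
```

Checking the full path with `os.path.exists` before launching a long run is a cheap safeguard against a mistyped file name.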

In [4]:
!pyveg_run_pipeline --config_file ../../pyveg/configs/testconfig_Sentinel2_11.58N_27.94E_Sudan_2019-01-01_2019-04-01_1m_local.py




    azure_config.py not found - this is needed for using Azure storage or batch.
    Copy pyveg/azure_config_template.py to pyveg/azure_config.py then input your
    own values for Azure Storage account name and Access key, then redo `pip install .`
    

Traceback (most recent call last):
  File "/anaconda3/envs/veg/bin/pyveg_run_pipeline", line 8, in <module>
    sys.exit(main())
  File "/anaconda3/envs/veg/lib/python3.7/site-packages/pyveg/scripts/run_pyveg_pipeline.py", line 147, in main
    pipeline = build_pipeline(args.config_file, args.from_cache)
  File "/anaconda3/envs/veg/lib/python3.7/site-packages/pyveg/scripts/run_pyveg_pipeline.py", line 41, in build_pipeline
    raise FileNotFoundError("Unable to find config file {}".f


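Note that the run recorded above stopped with a `FileNotFoundError`: the `--config_file` path passed to `pyveg_run_pipeline` omits the `_0.08_unknown` part of the file name reported by `pyveg_generate_config`. When run with the path exactly as printed by `pyveg_generate_config`, the pipeline:
* Downloads some Sentinel2 images from GEE.
* Converts these raw tif images into RGB png, greyscale NDVI png, and black-and-white binarized png images.
* Splits the above pngs into 50x50 sub-images.
* Calculates the Network Centrality and total NDVI of each sub-image.
* Downloads some ERA5 weather data from GEE.
* Reads off the values of precipitation and temperature from these tifs.
* Combines the vegetation and weather data into one output file.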

The final output file is called "results_summary.json" and contains some metadata describing the configuration, and time-series data for the vegetation and weather.

# (Optional) Running the pieces individually

Though the above method is the easiest way to get up-and-running, some users may be interested in running the components of pyveg individually.

In [5]:
from pyveg.src.download_modules import VegetationDownloader


    azure_config.py not found - this is needed for using Azure storage or batch.
    Copy pyveg/azure_config_template.py to pyveg/azure_config.py then input your
    own values for Azure Storage account name and Access key, then redo `pip install .`
    



In [6]:
# instantiate this Module:
vd = VegetationDownloader("Sentinel2_download")

Many of the parameters we need to configure this Module are in the `configs/collections.py` file, which contains a large dictionary with values for each collection, e.g. Sentinel 2.

In [7]:
from pyveg.configs.collections import data_collections
s2_config = data_collections["Sentinel2"]
print(s2_config)

{'collection_name': 'COPERNICUS/S2', 'data_type': 'vegetation', 'RGB_bands': ['B4', 'B3', 'B2'], 'NIR_band': 'B8', 'mask_cloud': True, 'cloudy_pix_frac': 50, 'cloudy_pix_flag': 'CLOUDY_PIXEL_PERCENTAGE', 'min_date': '2016-01-01', 'max_date': '2020-01-01', 'time_per_point': '1m'}


We also need to specify the coordinates we want to look at (in ***(long,lat)*** format). Let's look at one of our locations in the Sahel:

In [8]:
coords = [28.37,11.12]

And we need to choose a date range.  If we are looking at vegetation data as in this case, we will take the median of all images available within this date range (after filtering out cloudy ones).
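To make the idea of a median composite concrete, here is a minimal sketch that takes the per-pixel median across several small images. The helper `median_composite` and the pixel values are invented for this example; in practice the compositing happens on the GEE side, and this is not how pyveg or GEE implement it.

```python
from statistics import median


def median_composite(images):
    """Per-pixel median across equally sized images (each a list of rows)."""
    n_rows = len(images[0])
    n_cols = len(images[0][0])
    return [
        [median(img[r][c] for img in images) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Three made-up 1x2 "images" of the same location on different days.
images = [
    [[0.2, 0.9]],
    [[0.4, 0.1]],
    [[0.3, 0.5]],
]
composite = median_composite(images)
print(composite)
```

Compared with a mean, the median is less affected by a single outlying value, such as a pixel contaminated by residual cloud in one image.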

For the sake of this tutorial, let's just look at a short date range - in fact just a single month:

In [9]:
date_range = ["2018-06-01","2018-07-01"]

We also need to set an output location to store the files.  We can just use a temporary directory.   Within this, the downloaded files go into one subdirectory per date sub-range, named after the mid-point of that sub-range, and then into a further subdirectory called "RAW".   Here, we are just looking at one month, so there is a single mid-point, "2018-06-16".
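The mid-point quoted above can be reproduced with the standard library. This is a sketch assuming the mid-point is simply halfway between the start and end dates; the helper name `midpoint` is hypothetical, and how pyveg rounds ranges with an odd number of days is not shown here.

```python
from datetime import date


def midpoint(start, end):
    """Return the ISO date halfway between two ISO date strings."""
    start_date = date.fromisoformat(start)
    end_date = date.fromisoformat(end)
    # Adding a timedelta to a date ignores any fractional-day part.
    return (start_date + (end_date - start_date) / 2).isoformat()


mid = midpoint("2018-06-01", "2018-07-01")
print(mid)
```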

In [10]:
import os
if os.name == "posix":
    TMPDIR = "/tmp"
else:
    # expand the %TMP% environment variable on Windows
    TMPDIR = os.path.expandvars("%TMP%")

output_veg_location = os.path.join(TMPDIR,"gee_veg_download_example")
output_location_type = "local" # the other currently supported option is `azure` (MS Azure cloud), if set up

Now we're ready to configure the module:

In [11]:
# we could go through all the key,value pairs in the s2_config dict setting them all
# individually, but let's do them all at once
vd.set_parameters(s2_config)
vd.coords = coords
vd.date_range = date_range
vd.output_location = output_veg_location
vd.output_location_type = output_location_type
vd.configure()
print(vd)

2020-09-28 11:20:24,909 [INFO] Sentinel2_download: setting collection_name to COPERNICUS/S2
2020-09-28 11:20:24,913 [INFO] Sentinel2_download: setting data_type to vegetation
2020-09-28 11:20:24,914 [INFO] Sentinel2_download: setting RGB_bands to ['B4', 'B3', 'B2']
2020-09-28 11:20:24,915 [INFO] Sentinel2_download: setting NIR_band to B8
2020-09-28 11:20:24,916 [INFO] Sentinel2_download: setting mask_cloud to True
2020-09-28 11:20:24,917 [INFO] Sentinel2_download: setting cloudy_pix_frac to 50
2020-09-28 11:20:24,918 [INFO] Sentinel2_download: setting cloudy_pix_flag to CLOUDY_PIXEL_PERCENTAGE
2020-09-28 11:20:24,919 [INFO] Sentinel2_download: setting min_date to 2016-01-01
2020-09-28 11:20:24,919 [INFO] Sentinel2_download: setting max_date to 2020-01-01
2020-09-28 11:20:24,920 [INFO] Sentinel2_download: setting time_per_point to 1m


        [Module]: Sentinel2_download 
        depends_on: []
        is_configured: True
        is_finished: False
        run_status: {'succeeded': 0, 'failed': 0, 'incomplete': 0}
        collection_name: COPERNICUS/S2
        data_type: vegetation
        RGB_bands: ['B4', 'B3', 'B2']
        NIR_band: B8
        mask_cloud: True
        cloudy_pix_frac: 50
        cloudy_pix_flag: CLOUDY_PIXEL_PERCENTAGE
        min_date: 2016-01-01
        max_date: 2020-01-01
        time_per_point: 1m
        coords: [28.37, 11.12]
        date_range: ['2018-06-01', '2018-07-01']
        output_location: /tmp/gee_veg_download_example
        output_location_type: local
        region_size: 0.08
        scale: 10
        replace_existing_files: False
        num_files_per_point: 4
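The log lines above suggest that `set_parameters` assigns each dictionary entry as an attribute of the Module. A minimal sketch of that pattern is shown below; `ConfigurableModule` and the logger name are hypothetical, and the real pyveg method may do more (for example, validation).

```python
import logging

logger = logging.getLogger("sketch_logger")


class ConfigurableModule:
    """Toy class whose attributes can be set in bulk from a dictionary."""

    def __init__(self, name):
        self.name = name

    def set_parameters(self, config):
        # Copy every key/value pair onto the instance, logging each one.
        for key, value in config.items():
            logger.info("%s: setting %s to %s", self.name, key, value)
            setattr(self, key, value)


module = ConfigurableModule("demo_download")
module.set_parameters({"collection_name": "COPERNICUS/S2", "NIR_band": "B8"})
print(module.collection_name, module.NIR_band)
```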




The Module is all configured and ready-to-go!

In [12]:
vd.run()

2020-09-28 11:20:56,293 [INFO] Sentinel2_download: Will download to /tmp/gee_veg_download_example/2018-06-16/RAW
INFO:pyveg_logger:Sentinel2_download: Will download to /tmp/gee_veg_download_example/2018-06-16/RAW
2020-09-28 11:20:56,320 [INFO] Sentinel2_download: download succeeded for date range ['2018-06-01', '2018-07-01']
INFO:pyveg_logger:Sentinel2_download: download succeeded for date range ['2018-06-01', '2018-07-01']


{'succeeded': 1, 'failed': 0, 'incomplete': 0}

There should now be some files in the output location:

In [13]:
os.listdir(os.path.join(output_veg_location,"2018-06-16","RAW"))

['download.NDVI.tif', 'download.B4.tif', 'download.B3.tif', 'download.B2.tif']

So we have one .tif file per band.   

The next Module we would normally run in the vegetation Sequence is the VegetationImageProcessor that will take these tif files and produce png images from them.  This includes histogram equalization, adaptive thresholding and median filtering on an input image, to give us binary NDVI images.  It then divides these into 50x50 sub-images.
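The splitting step can be sketched in plain Python as follows. The helper `split_into_tiles` is hypothetical, and dropping incomplete tiles at the right and bottom edges is an assumption. The 850x850 image size is chosen only because it gives 17 x 17 = 289 tiles; the actual image size is not stated here.

```python
def split_into_tiles(image, tile_size=50):
    """Split a 2D image (list of rows) into square tiles, row by row."""
    n_rows = len(image) // tile_size   # incomplete edge tiles are dropped
    n_cols = len(image[0]) // tile_size
    tiles = []
    for i in range(n_rows):
        for j in range(n_cols):
            rows = image[i * tile_size:(i + 1) * tile_size]
            tile = [row[j * tile_size:(j + 1) * tile_size] for row in rows]
            tiles.append(tile)
    return tiles


# A blank 850x850 "image", for illustration only.
image = [[0] * 850 for _ in range(850)]
tiles = split_into_tiles(image)
print(len(tiles), len(tiles[0]), len(tiles[0][0]))
```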

In [14]:
from pyveg.src.processor_modules import VegetationImageProcessor
vip = VegetationImageProcessor("Sentinel2_img_processor")
vip.set_parameters(s2_config)
vip.coords = coords


2020-09-28 11:20:56,677 [INFO] Sentinel2_img_processor: setting collection_name to COPERNICUS/S2
INFO:pyveg_logger:Sentinel2_img_processor: setting collection_name to COPERNICUS/S2
2020-09-28 11:20:56,678 [INFO] Sentinel2_img_processor: setting data_type to vegetation
INFO:pyveg_logger:Sentinel2_img_processor: setting data_type to vegetation
2020-09-28 11:20:56,680 [INFO] Sentinel2_img_processor: setting RGB_bands to ['B4', 'B3', 'B2']
INFO:pyveg_logger:Sentinel2_img_processor: setting RGB_bands to ['B4', 'B3', 'B2']
2020-09-28 11:20:56,681 [INFO] Sentinel2_img_processor: setting NIR_band to B8
INFO:pyveg_logger:Sentinel2_img_processor: setting NIR_band to B8
2020-09-28 11:20:56,683 [INFO] Sentinel2_img_processor: setting mask_cloud to True
INFO:pyveg_logger:Sentinel2_img_processor: setting mask_cloud to True
2020-09-28 11:20:56,684 [INFO] Sentinel2_img_processor: setting cloudy_pix_frac to 50
INFO:pyveg_logger:Sentinel2_img_processor: setting cloudy_pix_frac to 50
2020-09-28 11:20:56,

The only other things we need to set are the `input_location` (the `output_location` of the downloader) and the `output_location`. We will reuse the downloader's output location here; the results will go into different subdirectories of the date-named subdirectories.

In [15]:
vip.input_location = vd.output_location
vip.output_location = vd.output_location
vip.configure()
print(vip)

        [Module]: Sentinel2_img_processor 
        depends_on: []
        is_configured: True
        is_finished: False
        run_status: {'succeeded': 0, 'failed': 0, 'incomplete': 0}
        collection_name: COPERNICUS/S2
        data_type: vegetation
        RGB_bands: ['B4', 'B3', 'B2']
        NIR_band: B8
        mask_cloud: True
        cloudy_pix_frac: 50
        cloudy_pix_flag: CLOUDY_PIXEL_PERCENTAGE
        min_date: 2016-01-01
        max_date: 2020-01-01
        time_per_point: 1m
        coords: [28.37, 11.12]
        input_location: /tmp/gee_veg_download_example
        output_location: /tmp/gee_veg_download_example
        replace_existing_files: False
        num_files_per_point: 3
        input_location_type: local
        output_location_type: local
        dates_to_process: []
        run_mode: local
        n_batch_tasks: -1
        batch_task_dict: {}
        timeout: 30
        region_size: 0.08
        split_RGB_images: True
        input_location_subdirs: [

In [16]:
vip.run()

2020-09-28 11:20:56,702 [INFO] Sentinel2_img_processor: Running local
INFO:pyveg_logger:Sentinel2_img_processor: Running local
2020-09-28 11:20:56,706 [INFO] Sentinel2_img_processor processing files in /tmp/gee_veg_download_example/2018-06-16/RAW
INFO:pyveg_logger:Sentinel2_img_processor processing files in /tmp/gee_veg_download_example/2018-06-16/RAW
2020-09-28 11:20:56,708 [INFO] ['download.NDVI.tif', 'download.B4.tif', 'download.B3.tif', 'download.B2.tif']
INFO:pyveg_logger:['download.NDVI.tif', 'download.B4.tif', 'download.B3.tif', 'download.B2.tif']
2020-09-28 11:20:56,709 [INFO] Sentinel2_img_processor: Saving RGB image for 2018-06-16 28.37_11.12
INFO:pyveg_logger:Sentinel2_img_processor: Saving RGB image for 2018-06-16 28.37_11.12
2020-09-28 11:21:02,051 [INFO] Will save image to /tmp/gee_veg_download_example/2018-06-16/PROCESSED / 2018-06-16_28.37_11.12_RGB.png
INFO:pyveg_logger:Will save image to /tmp/gee_veg_download_example/2018-06-16/PROCESSED / 2018-06-16_28.37_11.12_RGB.p

Saved image '/tmp/gee_veg_download_example/2018-06-16/PROCESSED/2018-06-16_28.37_11.12_RGB.png'
Saved image '/tmp/gee_veg_download_example/2018-06-16/PROCESSED/2018-06-16_28.37_11.12_NDVI.png'
Saved image '/tmp/gee_veg_download_example/2018-06-16/PROCESSED/2018-06-16_28.37_11.12_BWNDVI.png'


{'succeeded': 1, 'failed': 0, 'incomplete': 0}

This should have created two new subdirectories: "PROCESSED" contains the full-size RGB, greyscale, and black-and-white images (the first of these using the RGB bands, and the latter two based on the NDVI band).  "SPLIT" contains the 50x50 sub-images.

In [17]:
os.listdir(os.path.join(output_veg_location,"2018-06-16","PROCESSED"))

['2018-06-16_28.37_11.12_NDVI.png',
 '2018-06-16_28.37_11.12_BWNDVI.png',
 '2018-06-16_28.37_11.12_RGB.png']

## Calculating network centrality

The next step in the standard vegetation sequence is the calculation of "offset50", which is related to the "connectedness" of the vegetation in the black-and-white NDVI sub-images.

In [18]:
from pyveg.src.processor_modules import NetworkCentralityCalculator
ncc = NetworkCentralityCalculator("Sentinel2_ncc")
ncc.set_parameters(s2_config)
ncc.input_location = vip.output_location
ncc.output_location = vip.output_location # same output location again - will create a 'JSON' subdir
ncc.configure()
print(ncc)

2020-09-28 11:21:14,402 [INFO] Sentinel2_ncc: setting collection_name to COPERNICUS/S2
INFO:pyveg_logger:Sentinel2_ncc: setting collection_name to COPERNICUS/S2
2020-09-28 11:21:14,404 [INFO] Sentinel2_ncc: setting data_type to vegetation
INFO:pyveg_logger:Sentinel2_ncc: setting data_type to vegetation
2020-09-28 11:21:14,405 [INFO] Sentinel2_ncc: setting RGB_bands to ['B4', 'B3', 'B2']
INFO:pyveg_logger:Sentinel2_ncc: setting RGB_bands to ['B4', 'B3', 'B2']
2020-09-28 11:21:14,407 [INFO] Sentinel2_ncc: setting NIR_band to B8
INFO:pyveg_logger:Sentinel2_ncc: setting NIR_band to B8
2020-09-28 11:21:14,408 [INFO] Sentinel2_ncc: setting mask_cloud to True
INFO:pyveg_logger:Sentinel2_ncc: setting mask_cloud to True
2020-09-28 11:21:14,410 [INFO] Sentinel2_ncc: setting cloudy_pix_frac to 50
INFO:pyveg_logger:Sentinel2_ncc: setting cloudy_pix_frac to 50
2020-09-28 11:21:14,411 [INFO] Sentinel2_ncc: setting cloudy_pix_flag to CLOUDY_PIXEL_PERCENTAGE
INFO:pyveg_logger:Sentinel2_ncc: setting cl

        [Module]: Sentinel2_ncc 
        depends_on: []
        is_configured: True
        is_finished: False
        run_status: {'succeeded': 0, 'failed': 0, 'incomplete': 0}
        collection_name: COPERNICUS/S2
        data_type: vegetation
        RGB_bands: ['B4', 'B3', 'B2']
        NIR_band: B8
        mask_cloud: True
        cloudy_pix_frac: 50
        cloudy_pix_flag: CLOUDY_PIXEL_PERCENTAGE
        min_date: 2016-01-01
        max_date: 2020-01-01
        time_per_point: 1m
        input_location: /tmp/gee_veg_download_example
        output_location: /tmp/gee_veg_download_example
        replace_existing_files: False
        num_files_per_point: 1
        input_location_type: local
        output_location_type: local
        dates_to_process: []
        run_mode: local
        n_batch_tasks: -1
        batch_task_dict: {}
        timeout: 30
        n_threads: 4
        n_sub_images: -1
        input_location_subdirs: ['SPLIT']
        output_location_subdirs: ['JSON', '

One other setting that we might want to change is the number of sub-images per full-size image for which we do the network centrality calculation.   There are 289 sub-images per full-size image, and it can be quite time-consuming to process all of them (even though some parallelization is implemented - see the `n_threads` argument).   We can set this to a smaller number for testing purposes.
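The interplay between a sub-image limit and a worker count can be sketched with the standard library. Everything here is illustrative: `process_sub_image` is a placeholder for the expensive calculation, treating `-1` as "process everything" is inferred from the default shown in the Module printout, and pyveg's own parallelization may be implemented differently.

```python
from concurrent.futures import ThreadPoolExecutor


def process_sub_image(index):
    """Placeholder for an expensive per-sub-image calculation."""
    return {"sub_image": index, "value": index * index}


def process_subset(n_total, n_sub_images=-1, n_threads=4):
    """Process at most n_sub_images items, spread over n_threads workers."""
    # Assumption: a negative limit means "process all sub-images".
    n_to_process = n_total if n_sub_images < 0 else min(n_sub_images, n_total)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(process_sub_image, range(n_to_process)))


results = process_subset(n_total=289, n_sub_images=10, n_threads=2)
print(len(results))
```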

In [19]:
ncc.n_sub_images = 10

In [20]:
ncc.run()

2020-09-28 11:21:14,428 [INFO] Sentinel2_ncc: Running local
INFO:pyveg_logger:Sentinel2_ncc: Running local
2020-09-28 11:21:14,434 [INFO] Sentinel2_ncc: processing 2018-06-16
INFO:pyveg_logger:Sentinel2_ncc: processing 2018-06-16
2020-09-28 11:21:14,813 [INFO] Sentinel2_ncc found 289 sub-images
INFO:pyveg_logger:Sentinel2_ncc found 289 sub-images


Processed 10 sub-images...

2020-09-28 11:21:25,138 [INFO] 
 Consolidating json from all subimages
INFO:pyveg_logger:
 Consolidating json from all subimages


{'succeeded': 1, 'failed': 0, 'incomplete': 0}

We should now have a json file in the output directory:

In [21]:
os.listdir(os.path.join(output_veg_location,"2018-06-16","JSON","NC"))

['network_centralities.json']

In [22]:
import json
nc_path = os.path.join(output_veg_location,"2018-06-16","JSON","NC","network_centralities.json")
# use a context manager so that the file is closed after reading
with open(nc_path) as f:
    j = json.load(f)


The json file contains a list of dictionaries (one entry per sub-image), and the dictionary keys include the latitude and longitude of the sub-image, as well as "offset50".

In [23]:
j[0].keys()

dict_keys(['slope', 'offset', 'offset50', 'mean', 'std', 'feature_vec', 'date', 'latitude', 'longitude'])
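Once loaded, a list of dictionaries like this is easy to summarise. The sketch below averages "offset50" over sub-images. The records are made up and carry only a subset of the keys listed above, and the possibility of missing (`None`) values is an assumption made for this example.

```python
from statistics import mean

# Made-up records, for illustration only; real values come from the json file.
records = [
    {"latitude": 11.10, "longitude": 28.35, "offset50": -12.0},
    {"latitude": 11.10, "longitude": 28.36, "offset50": -8.0},
    {"latitude": 11.11, "longitude": 28.35, "offset50": None},
]

# Keep only sub-images that have a usable offset50 value.
valid = [r["offset50"] for r in records if r["offset50"] is not None]
mean_offset50 = mean(valid)
print(mean_offset50)
```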

### Running the weather Sequence

Here we ran the vegetation-related Modules one-by-one, but we can also combine Modules into Sequences.  As an example, let's do this for the weather downloader Module, and the Module that reads the downloaded images and produces output json files.

In [24]:
from pyveg.src.pyveg_pipeline import Sequence
from pyveg.src.download_modules import WeatherDownloader
from pyveg.src.processor_modules import WeatherImageToJSON

In [25]:
era_config = data_collections["ERA5"]
era_config

{'collection_name': 'ECMWF/ERA5/MONTHLY',
 'data_type': 'weather',
 'precipitation_band': ['total_precipitation'],
 'temperature_band': ['mean_2m_air_temperature'],
 'min_date': '1986-01-01',
 'max_date': '2020-01-01',
 'time_per_point': '1m'}

The default is to download all the monthly weather data since 1986, but for the sake of speed, let's just look at the same small date range as before.

In [26]:
s=Sequence("era5_sequence")
s.date_range = date_range
s.coords = coords # use the same location as we used above, in the Sahel
s.set_config(era_config)

2020-09-28 11:21:25,205 [INFO] era5_sequence: setting collection_name to ECMWF/ERA5/MONTHLY
INFO:pyveg_logger:era5_sequence: setting collection_name to ECMWF/ERA5/MONTHLY
2020-09-28 11:21:25,207 [INFO] era5_sequence: setting data_type to weather
INFO:pyveg_logger:era5_sequence: setting data_type to weather
2020-09-28 11:21:25,209 [INFO] era5_sequence: setting precipitation_band to ['total_precipitation']
INFO:pyveg_logger:era5_sequence: setting precipitation_band to ['total_precipitation']
2020-09-28 11:21:25,211 [INFO] era5_sequence: setting temperature_band to ['mean_2m_air_temperature']
INFO:pyveg_logger:era5_sequence: setting temperature_band to ['mean_2m_air_temperature']
2020-09-28 11:21:25,213 [INFO] era5_sequence: setting min_date to 1986-01-01
INFO:pyveg_logger:era5_sequence: setting min_date to 1986-01-01
2020-09-28 11:21:25,215 [INFO] era5_sequence: setting max_date to 2020-01-01
INFO:pyveg_logger:era5_sequence: setting max_date to 2020-01-01
2020-09-28 11:21:25,217 [INFO] e

Now we can add Modules to the Sequence, just using the "+=" operator:

In [27]:
s += WeatherDownloader()
s += WeatherImageToJSON()
s.configure()
print(s)


    [Sequence]: era5_sequence 
    depends_on: []
    output_location: gee_28.37_11.12_era5_sequence
    output_location_type: local
    is_configured: True
    is_finished: False
    run_status: {}
    date_range: ['2018-06-01', '2018-07-01']
    coords: [28.37, 11.12]
    collection_name: ECMWF/ERA5/MONTHLY
    data_type: weather
    precipitation_band: ['total_precipitation']
    temperature_band: ['mean_2m_air_temperature']
    min_date: 1986-01-01
    max_date: 2020-01-01
    time_per_point: 1m

    ------- Modules ----------

        [Module]: era5_sequence_WeatherDownloader 
        depends_on: []
        is_configured: True
        is_finished: False
        run_status: {'succeeded': 0, 'failed': 0, 'incomplete': 0}
        output_location: gee_28.37_11.12_era5_sequence
        output_location_type: local
        coords: [28.37, 11.12]
        date_range: ['2018-06-01', '2018-07-01']
        region_size: 0.08
        scale: 10
        replace_existing_files: False
        num_

We have been given a default value for the "output_location", which we might want to override for this example and just use a temporary location.

In [28]:
output_weather_location = os.path.join(TMPDIR, "gee_weather_download_example")
s.output_location = output_weather_location
# need to reconfigure to propagate this to the Modules
s.configure()
print(s)


    [Sequence]: era5_sequence 
    depends_on: []
    output_location: /tmp/gee_weather_download_example
    output_location_type: local
    is_configured: True
    is_finished: False
    run_status: {}
    date_range: ['2018-06-01', '2018-07-01']
    coords: [28.37, 11.12]
    collection_name: ECMWF/ERA5/MONTHLY
    data_type: weather
    precipitation_band: ['total_precipitation']
    temperature_band: ['mean_2m_air_temperature']
    min_date: 1986-01-01
    max_date: 2020-01-01
    time_per_point: 1m

    ------- Modules ----------

        [Module]: era5_sequence_WeatherDownloader 
        depends_on: []
        is_configured: True
        is_finished: False
        run_status: {'succeeded': 0, 'failed': 0, 'incomplete': 0}
        output_location: /tmp/gee_weather_download_example
        output_location_type: local
        coords: [28.37, 11.12]
        date_range: ['2018-06-01', '2018-07-01']
        region_size: 0.08
        scale: 10
        replace_existing_files: False
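Two behaviours seen above, adding Modules with "+=" and needing to call `configure()` again after changing `output_location`, can be sketched with toy classes. These are hypothetical stand-ins, not pyveg code; the naming scheme (Sequence name plus Module class name) is inferred from the printout.

```python
class ToyModule:
    """Minimal Module stand-in: a name and an output location."""

    def __init__(self, name=None):
        self.name = name or type(self).__name__
        self.output_location = None


class ToyDownloader(ToyModule):
    pass


class ToySequence:
    def __init__(self, name):
        self.name = name
        self.output_location = None
        self.modules = []

    def __iadd__(self, module):
        # "s += module" calls this method; it must return the Sequence itself.
        module.name = f"{self.name}_{module.name}"
        self.modules.append(module)
        return self

    def configure(self):
        # Settings only reach the Modules when configure() is called.
        for module in self.modules:
            module.output_location = self.output_location


seq = ToySequence("era5_sequence")
seq += ToyDownloader()
seq.output_location = "/tmp/example"
before = seq.modules[0].output_location  # not yet propagated
seq.configure()
after = seq.modules[0].output_location
print(seq.modules[0].name, before, after)
```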
    

And we're ready to run!

In [29]:
s.run()

2020-09-28 11:21:28,893 [INFO] era5_sequence_WeatherDownloader: Will download to /tmp/gee_weather_download_example/2018-06-16/RAW
INFO:pyveg_logger:era5_sequence_WeatherDownloader: Will download to /tmp/gee_weather_download_example/2018-06-16/RAW
2020-09-28 11:21:28,900 [INFO] era5_sequence_WeatherDownloader: download succeeded for date range ['2018-06-01', '2018-07-01']
INFO:pyveg_logger:era5_sequence_WeatherDownloader: download succeeded for date range ['2018-06-01', '2018-07-01']
2020-09-28 11:21:28,901 [INFO] era5_sequence_WeatherImageToJSON: Running local
INFO:pyveg_logger:era5_sequence_WeatherImageToJSON: Running local
2020-09-28 11:21:28,904 [INFO] era5_sequence_WeatherImageToJSON: Processing date 2018-06-16
INFO:pyveg_logger:era5_sequence_WeatherImageToJSON: Processing date 2018-06-16


Let's check we got some output:

In [30]:
os.listdir(os.path.join(output_weather_location, "2018-06-16","JSON","WEATHER"))

['weather_data.json']

In [31]:
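import json
weather_path = os.path.join(output_weather_location, "2018-06-16","JSON","WEATHER","weather_data.json")
# use a context manager so that the file is closed after reading
with open(weather_path) as f:
    j = json.load(f)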
print(j)

{'mean_2m_air_temperature': 302.51324462890625, 'total_precipitation': 0.04973767697811127}
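These raw values are easier to interpret after a unit conversion. The sketch below assumes the units documented for the ERA5 monthly dataset, namely temperature in kelvin and precipitation in metres; check the dataset description before relying on this.

```python
# Values copied from the output above.
weather = {
    "mean_2m_air_temperature": 302.51324462890625,
    "total_precipitation": 0.04973767697811127,
}

# Assumption: temperature is in kelvin and precipitation is in metres.
temperature_celsius = weather["mean_2m_air_temperature"] - 273.15
precipitation_mm = weather["total_precipitation"] * 1000.0

print(round(temperature_celsius, 2), round(precipitation_mm, 2))
```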
