<a href="https://colab.research.google.com/github/Vizzuality/copernicus-climate-data/blob/master/upload_and_define_datasets.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Prepare data for the copernicus-climate project

https://github.com/Vizzuality/copernicus-climate-data

`Edward P. Morris (vizzuality.)`

## Description
This notebook exports tables of time-series per location, and defines datasets and layers using the API.

### TODO
+ add breaks as attributes to zarr data sources

```
MIT License

Copyright (c) 2020 Vizzuality

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

# Setup

Instructions for setting up the computing environment.

In [None]:
%%bash
# Remove sample_data
rm -r sample_data

## Linux dependencies

Instructions for adding Linux system packages (including Node, etc.).

In [None]:
# Packages for projections and geospatial processing
!apt install -q -y libspatialindex-dev libproj-dev proj-data proj-bin libgeos-dev

Reading package lists...
Building dependency tree...
Reading state information...
proj-data is already the newest version (4.9.3-2).
proj-data set to manually installed.
The following package was automatically installed and is no longer required:
  libnvidia-common-440
Use 'apt autoremove' to remove it.
The following additional packages will be installed:
  libspatialindex-c4v5 libspatialindex4v5
Suggested packages:
  libgdal-doc
The following NEW packages will be installed:
  libgeos-dev libproj-dev libspatialindex-c4v5 libspatialindex-dev
  libspatialindex4v5 proj-bin
0 upgraded, 6 newly installed, 0 to remove and 33 not upgraded.
Need to get 860 kB of archives.
After this operation, 5,014 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic/universe amd64 libgeos-dev amd64 3.6.2-1build2 [73.1 kB]
Get:2 http://archive.ubuntu.com/ubuntu bionic/universe amd64 libspatialindex4v5 amd64 1.8.5-5 [219 kB]
Get:3 http://archive.ubuntu.com/ubuntu bionic/unive

## Python packages

In [None]:
# connect to Google cloud storage
!pip install -q gcsfs

In [None]:
# xarray, Zarr and geometry tools
!pip install -q cftime netcdf4 nc-time-axis zarr xarray bottleneck rtree geopandas shapely --upgrade

  Building wheel for zarr (setup.py) ... done
  Building wheel for rtree (setup.py) ... done
  Building wheel for asciitree (setup.py) ... done
  Building wheel for numcodecs (setup.py) ... done


In [None]:
#!pip uninstall -y earthengine-api
#!pip install 'earthengine-api==0.1.215'

In [None]:
# Need to restart kernel
#import importlib
#importlib.reload(earthengine-api)

In [None]:
!pip install -q Skydipper jenkspy palettable ipythonblocks #carto 

  Building wheel for jenkspy (setup.py) ... done
  Building wheel for earthengine-api (setup.py) ... done
  Building wheel for pypng (setup.py) ... done


In [None]:
import Skydipper

In [None]:
# Show python package versions
!pip list

Package                  Version        
------------------------ ---------------
absl-py                  0.9.0          
alabaster                0.7.12         
albumentations           0.1.12         
altair                   4.1.0          
asciitree                0.3.3          
asgiref                  3.2.10         
astor                    0.8.1          
astropy                  4.0.1.post1    
astunparse               1.6.3          
atari-py                 0.2.6          
atomicwrites             1.4.0          
attrs                    19.3.0         
audioread                2.1.8          
autograd                 1.3            
Babel                    2.8.0          
backcall                 0.2.0          
beautifulsoup4           4.6.3          
bleach                   3.1.5          
blis                     0.4.1          
bokeh                    1.4.0          
boto                     2.49.0         
boto3                    1.14.9         
botocore        

## Authorisation

Setting up connections and authorisation to cloud services.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive


In [None]:
import os
import json
import shutil

env_fn = 'env-variables.json'

# Get json file defining env variable key-value pairs
shutil.copyfile(f"/content/drive/My Drive/{env_fn}", f"/root/.{env_fn}")
with open(f"/root/.{env_fn}") as f:
   for k,v in json.load(f).items():
      os.environ[k] = v
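
The cell above assumes the JSON file is a flat object of key-value pairs. A minimal sketch of the same step with a check for missing keys follows; the function name `load_env_from_json` and its `required` argument are illustrative and not part of the notebook.

```python
import json
import os


def load_env_from_json(path, required=()):
    """Load a flat JSON object of key-value pairs into os.environ.

    Hypothetical helper: `required` lists keys that must be present.
    Returns the sorted list of keys that were set.
    """
    with open(path) as f:
        pairs = json.load(f)
    missing = [k for k in required if k not in pairs]
    if missing:
        raise KeyError(f"Missing environment keys: {missing}")
    for k, v in pairs.items():
        # os.environ only accepts strings, so coerce other JSON types
        os.environ[k] = str(v)
    return sorted(pairs)
```

The keys referenced later in this notebook are `SKY_API_EMAIL`, `SKY_API_PWD` and `CARTO_API_KEY`, so they would be natural values for `required`.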

### Google Cloud

Authentication can be done via a URL or by adding service account credentials.

If you do not share the notebook, you can mount your Drive and transfer credentials to disk. Note that if the notebook is shared, you always need to authenticate via URL.

In [None]:
# Set Google Cloud information
gc_project = "skydipper-196010"
gc_creds = "skydipper-196010-f842645fd0f3.json"
gc_user = "edward-morris@skydipper-196010.iam.gserviceaccount.com"
gcs_prefix = "gs://copernicus-climate"
gcs_http_url = "https://storage.googleapis.com/copernicus-climate"

In [None]:
# For auth WITHOUT service account
# https://cloud.google.com/resource-manager/docs/creating-managing-projects
#from google.colab import auth
#auth.authenticate_user()
#!gcloud config set project {project_id}

In [None]:
# If the notebook is shared
#from google.colab import drive
#drive.mount('/content/drive')

In [None]:
# If Drive is mounted, copy GC credentials to home (place in your GDrive, and connect Drive)
!cp "/content/drive/My Drive/{gc_creds}" "/root/.{gc_creds}"

In [None]:
# Auth WITH service account
!gcloud auth activate-service-account {gc_user} --key-file=/root/.{gc_creds} --project={gc_project}

Activated service account credentials for: [edward-morris@skydipper-196010.iam.gserviceaccount.com]


In [None]:
# Test GC auth
!gsutil ls {gcs_prefix}

gs://copernicus-climate/heatwave_seasonal_06_2020.zip
gs://copernicus-climate/heatwaves_historical_Basque.zip
gs://copernicus-climate/heatwaves_historical_Basque_coastal.zip
gs://copernicus-climate/heatwaves_longterm_Basque.zip
gs://copernicus-climate/heatwaves_longterm_Basque_coastal.zip
gs://copernicus-climate/heatwaves_seasonal_2020_06_Basque_coastal.zip
gs://copernicus-climate/spain.zarr.zip
gs://copernicus-climate/coldsnaps/
gs://copernicus-climate/data_for_PET/
gs://copernicus-climate/dataset/
gs://copernicus-climate/european-nuts-lau-geometries.zarr/
gs://copernicus-climate/heatwaves/
gs://copernicus-climate/pet/
gs://copernicus-climate/spain-zonal-stats.zarr/
gs://copernicus-climate/spain.zarr/
gs://copernicus-climate/tasmax/
gs://copernicus-climate/tasmin/
gs://copernicus-climate/to_delete/
gs://copernicus-climate/zonal_stats/


### Skydipper API

You need to register with the API at https://api.skydipper.com/auth. Then log in with your email and password to get an authorisation token. Be aware that users need specific authorisation scopes linked to projects.

In [None]:
# Set API information (note credentials should be defined in ENV)
sky_api_app = "copernicusClimate"
sky_creds = "skydipper-creds.txt"

In [None]:
# Set up first time
# Get auth token from API
#import requests
#import os
#payload = {
#    "email":os.environ['SKY_API_EMAIL'],
#    "password":os.environ['SKY_API_PWD']
#}
#url = 'https://api.skydipper.com/auth/login'
#headers = {'Content-Type': 'application/json'}
#r = requests.post(url, json=payload, headers=headers)
#r.json()
#token= r.json().get('data').get('token')
#headers = {'Authorization': f"Bearer {token}"}
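
The commented cell above logs in with `requests`. As a dependency-free sketch of the same exchange, the block below prepares the login request and turns a parsed response into headers without sending anything. The helper names `build_login_request` and `auth_headers` are hypothetical; the URL and the `data.token` response shape are taken from the cell above.

```python
import json
import urllib.request

LOGIN_URL = "https://api.skydipper.com/auth/login"


def build_login_request(email, password):
    """Prepare (but do not send) the JSON login request."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        LOGIN_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def auth_headers(login_response):
    """Build Authorization headers from a parsed login response.

    Assumes the shape {'data': {'token': ...}} used in the cell above.
    """
    token = (login_response.get("data") or {}).get("token")
    if not token:
        raise ValueError("Login response did not contain a token")
    return {"Authorization": f"Bearer {token}"}
```

Keeping request construction separate from the network call makes it possible to inspect the payload before any credentials leave the notebook.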

In [None]:
# Copy previously generated creds
!mkdir -p /root/.Skydipper
!cp "/content/drive/My Drive/{sky_creds}" /root/.Skydipper/creds
with open("/root/.Skydipper/creds") as f:
   # strip whitespace so a trailing newline does not end up in the header
   sky_api_token = f.read().strip()
headers = {'Authorization': f"Bearer {sky_api_token}"}

In [None]:
# Check it works
import Skydipper

Skydipper.Dataset('3a46bbff-73bc-4abc-bad6-11be6e99e2cb')

### Carto

In [None]:
# Set API information (note credentials should be defined in ENV)
carto_user = "skydipper"
carto_base_url = f"http://35.233.41.65/user/{carto_user}"  

In [None]:
#from carto.auth import APIKeyAuthClient
#import os

#auth_client = APIKeyAuthClient(api_key=os.environ['CARTO_API_KEY'], base_url=carto_base_url)

# Utils

Generic helper functions used in the subsequent processing. For easy navigation, each function is separated into its own section, titled with the function name.

## copy_gcs

In [None]:
import os
import subprocess

def copy_gcs(source_list, dest_list, opts=""):
  """
  Use gsutil to copy each corresponding item in source_list
  to dest_list.

  Example:
  copy_gcs(["gs://my-bucket/data-file.csv"], ["."])

  """
  for s, d  in zip(source_list, dest_list):
    cmd = f"gsutil -m cp -r {opts} {s} {d}"
    print(f"Processing: {cmd}")
    r = subprocess.call(cmd, shell=True)
    if r == 0:
        print("Task created")
    else:
        print("Task failed")
  print("Finished copy")
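
Because `copy_gcs` interpolates paths into a shell string, a path containing spaces would split into several arguments. One alternative, sketched here under the assumption that `gsutil` is on the `PATH`, is to build an argument list and skip the shell. `build_gsutil_cp`, `copy_pairs` and the `dry_run` flag are hypothetical names.

```python
import shlex
import subprocess


def build_gsutil_cp(source, dest, opts=""):
    """Return the argv list for a recursive, parallel gsutil copy."""
    return ["gsutil", "-m", "cp", "-r", *shlex.split(opts), source, dest]


def copy_pairs(source_list, dest_list, opts="", dry_run=True):
    """Copy each source to its matching destination; return the exit codes.

    With dry_run=True the commands are only printed and every code is None.
    """
    codes = []
    for s, d in zip(source_list, dest_list):
        argv = build_gsutil_cp(s, d, opts)
        print("Processing:", shlex.join(argv))
        codes.append(None if dry_run else subprocess.call(argv))
    return codes
```

Returning the exit codes lets the caller decide what to do when a copy fails, instead of relying on printed messages.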

## get_cached_remote_zarr

In [None]:
import gcsfs
import zarr
import xarray as xr



def get_cached_remote_zarr(
    group,
    root,
    project_id = gc_project,
    token=f"/root/.{gc_creds}",
    force_consolidate=False):
  
  # Connect to GS
  gc = gcsfs.GCSFileSystem(project=project_id, token=token)
  store = gc.get_mapper(root, check=False, create=True)
  if force_consolidate:
    # consolidate metadata at root
    zarr.consolidate_metadata(store)
  # Check zarr is consolidated
  consolidated = gc.exists(f'{root}/.zmetadata')
  # Cache the zarr store
  #store = zarr.ZipStore(store, mode='r')
  cache = zarr.LRUStoreCache(store, max_size=4737418240)
  # Return cached zarr group
  return xr.open_zarr(cache, group=group, consolidated=consolidated)

## set_acl_to_public

In [None]:
import subprocess

# Set asset permissions to public for https read
def set_acl_to_public(gs_path):
  """ 
  Set all Google Storage assets to public read access.

  Requires GS authentication

  Parameters
  ----------
  gs_path : str
    The Google Storage path. Note the "-r" option is used, setting the ACL of all assets below this path.
  """
  cmd = f"gsutil -m acl -r ch -u AllUsers:R {gs_path}"
  print(cmd)
  r = subprocess.call(cmd, shell=True)
  if r == 0:
    print("Set acl(s) sucsessful")
  else:
    print("Set acl(s) failed")  

#set_acl_to_public("gs://skydipper-water-quality/cloud-masks")

## to_geopandas

In [None]:
import geopandas as gpd
# Import the submodules explicitly; `import shapely` alone may not load them
import shapely.wkb
import shapely.wkt

def to_geopandas(ds, rounding_precision=False):
  df = ds.reset_coords().to_dataframe().dropna().reset_index()
  # Return as geopandas object, converting hex-encoded WKB geometry to shapely objects
  geoms = [shapely.wkb.loads(g, hex=True) for g in df.geometry.values]
  # Adjust precision by round-tripping through WKT
  if rounding_precision:
    geoms = [shapely.wkt.loads(shapely.wkt.dumps(g, rounding_precision=rounding_precision)) for g in geoms]
  return gpd.GeoDataFrame(df, geometry = geoms)

## create_admin_dict

In [None]:
# Create gid lookup tables
import geopandas as gpd
import rtree

def create_admin_dict(gdfs, debug=False):
  """ Generates dictionary of admin to lower admin gid codes.

  Input should be a list of geopandas dfs, level 0 to 4."""
  
  # Buffer geometry (copy each selection so the assignment does not target a slice)
  gdfbs = [gdfs[i][['gid', 'geometry']].copy() for i in range(0, len(gdfs) - 1)]
  for gdfb in gdfbs:
    gdfb.loc[:, 'geometry'] = gdfb.buffer(0.1).values

  # create dict of conversions
  return {
    "admin0to1": gpd.sjoin(gdfbs[0], gdfs[1][['gid', 'geometry', 'geoname', 'admin_level']], op='contains').drop('geometry', axis=1),
    "admin0to2": gpd.sjoin(gdfbs[0], gdfs[2][['gid', 'geometry', 'geoname', 'admin_level']], op='contains').drop('geometry', axis=1),
    "admin0to3": gpd.sjoin(gdfbs[0], gdfs[3][['gid', 'geometry', 'geoname', 'admin_level']], op='contains').drop('geometry', axis=1),
    "admin0to4": gpd.sjoin(gdfbs[0], gdfs[4][['gid', 'geometry', 'geoname', 'admin_level']], op='contains').drop('geometry', axis=1),
    "admin1to2": gpd.sjoin(gdfbs[1], gdfs[2][['gid', 'geometry', 'geoname', 'admin_level']], op='contains').drop('geometry', axis=1),
    "admin1to3": gpd.sjoin(gdfbs[1], gdfs[3][['gid', 'geometry', 'geoname', 'admin_level']], op='contains').drop('geometry', axis=1),
    "admin1to4": gpd.sjoin(gdfbs[1], gdfs[4][['gid', 'geometry', 'geoname', 'admin_level']], op='contains').drop('geometry', axis=1),
    "admin2to3": gpd.sjoin(gdfbs[2], gdfs[3][['gid', 'geometry', 'geoname', 'admin_level']], op='contains').drop('geometry', axis=1),
    "admin2to4": gpd.sjoin(gdfbs[2], gdfs[4][['gid', 'geometry', 'geoname', 'admin_level']], op='contains').drop('geometry', axis=1),
    "admin3to4": gpd.sjoin(gdfbs[3], gdfs[4][['gid', 'geometry', 'geoname', 'admin_level']], op='contains').drop('geometry', axis=1),
  }
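
The ten joins above follow a regular pattern: one entry for every pair of admin levels (i, j) with i < j. A sketch of generating the keys and level pairs with `itertools.combinations` follows; the helper name `admin_level_pairs` is illustrative.

```python
from itertools import combinations


def admin_level_pairs(n_levels=5):
    """Map keys such as 'admin0to1' to (upper, lower) level index pairs."""
    return {f"admin{i}to{j}": (i, j) for i, j in combinations(range(n_levels), 2)}


pairs = admin_level_pairs()
print(list(pairs))
```

Inside `create_admin_dict`, each pair could then drive a single `gpd.sjoin(gdfbs[i], gdfs[j][...], op='contains')` call in place of the hand-written entries.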

## show_colors_as_blocks

In [None]:
def show_colors_as_blocks(colors, block_size=90):
        """
        Show colors in the IPython Notebook using ipythonblocks.

        Parameters
        ----------
        colors : list of str
            Color strings understood by PIL.ImageColor, e.g. hex codes.
        block_size : int, optional
            Size of displayed blocks.
        """
        from ipythonblocks import BlockGrid
        from PIL import ImageColor

        grid = BlockGrid(len(colors), 1, block_size=block_size)

        for block, color in zip(grid, colors):
            block.rgb = ImageColor.getcolor(color, "RGB")
        grid.show()
        print(f"\n {colors}:")

## create_breaks

In [None]:
import numpy as np
import jenkspy

def create_breaks(da, n, decimals, method='quantiles', null_value = -9999):
  
  if method == 'quantiles':
    # n evenly spaced probabilities between 0 and 1 (inclusive)
    q = np.linspace(0, 1, n)
    out = da.quantile(q, skipna=True).values.round(decimals).tolist()
  elif method == 'jenks':
    # Jenks natural breaks computed on the non-NaN values
    out = np.round(jenkspy.jenks_breaks(da.values[np.logical_not(np.isnan(da.values))], nb_class=n), decimals).tolist()
  else:
    raise ValueError(f"Unknown method: {method!r}; use 'quantiles' or 'jenks'")

  if null_value is not None:
    # prepend the value used to flag missing data
    out = [null_value] + out
  
  return out
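
For the `'quantiles'` method, `np.linspace(0, 1, n)` yields n evenly spaced probabilities, so the first and last breaks are the minimum and maximum of the data. The dependency-free sketch below reproduces that calculation, assuming linear interpolation between sorted values. The function name `quantile_breaks` is hypothetical and the block is an illustration, not a replacement for the xarray call.

```python
def quantile_breaks(values, n, decimals=1, null_value=None):
    """Return n quantile breaks spanning the minimum to the maximum value."""
    # Drop None and NaN (NaN is the only value not equal to itself)
    data = sorted(v for v in values if v is not None and v == v)
    if not data or n < 2:
        raise ValueError("Need at least one value and n >= 2")
    breaks = []
    for i in range(n):
        q = i / (n - 1)                  # probability in [0, 1]
        pos = q * (len(data) - 1)        # fractional index into sorted data
        lo = int(pos)
        hi = min(lo + 1, len(data) - 1)
        # Linear interpolation between the two neighbouring values
        value = data[lo] + (data[hi] - data[lo]) * (pos - lo)
        breaks.append(round(value, decimals))
    if null_value is not None:
        breaks = [null_value] + breaks
    return breaks


print(quantile_breaks(range(11), 3))
```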


## create_carto_css_cloropleth_ramp

In [None]:
def create_carto_css_cloropleth_ramp(data_var, color_ramp, breaks, null_color=None, line_width= 0.5, line_color= '#FFFFFF', line_opacity= 0.5):
  
  # Copy the ramp so `colors` is defined even when no null color is given
  colors = list(color_ramp)
  if isinstance(null_color, str):
    colors = [null_color] + colors
  colors = ",".join(colors)
  breaks = [str(b) for b in breaks]
  breaks = ",".join(breaks)
  #print(breaks)
  out = "#layer {"\
  f"polygon-fill: ramp([{data_var}], "\
  f"({colors}), "\
  f"({breaks}), "\
  "'>=')"\
  " } #layer::outline { "\
  f"line-width: {line_width}; line-color: {line_color}; line-opacity: {line_opacity};"\
  "}"
  return out

#data_var = 'max_tasmax'
#color_ramp = ['#FEE0D2', '#FCBBA1', '#FC9272', '#FB694A', '#EF3B2C', '#CB181D', '#67000D']
#breaks = create_breaks(da, 7, 1, 'jenks', null_value = -9999)
#create_carto_css_cloropleth_ramp(data_var, color_ramp, breaks, null_color='#F5F5F5', line_width= 0.5, line_color= '#FFFFFF', line_opacity= 0.5)

## create_mapbox_cloropleth_paint

In [None]:
def create_mapbox_cloropleth_paint(nbreaks):
  ramp = ['interpolate', ['linear'],['get', "{column_name}"]]
  for n in range(0, nbreaks):
    ramp = ramp + ["{" + f"break{n}" + "}"] + ["{" + f"color{n}" + "}"]
  return {'fill-color': ramp, 'fill-opacity': "{fill_opacity}"}        

# Example
# --------
#create_mapbox_cloropleth_paint(7)          
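
The paint object contains placeholders such as `{break0}` and `{color0}` rather than concrete values. How the API fills them is not shown in this notebook; the sketch below only illustrates the idea, assuming that a string consisting of a single `{key}` is replaced by the matching layer parameter. `fill_placeholders` is a hypothetical helper.

```python
import re

# Matches strings that consist of exactly one '{key}' placeholder
PLACEHOLDER = re.compile(r"^\{(\w+)\}$")


def fill_placeholders(obj, params):
    """Recursively replace '{key}' strings with params[key]."""
    if isinstance(obj, dict):
        return {k: fill_placeholders(v, params) for k, v in obj.items()}
    if isinstance(obj, list):
        return [fill_placeholders(v, params) for v in obj]
    if isinstance(obj, str):
        m = PLACEHOLDER.match(obj)
        if m and m.group(1) in params:
            return params[m.group(1)]
    return obj


# A two-break paint object in the shape produced by the function above
paint = {
    "fill-color": ["interpolate", ["linear"], ["get", "{column_name}"],
                   "{break0}", "{color0}", "{break1}", "{color1}"],
    "fill-opacity": "{fill_opacity}",
}
params = {"column_name": "max_tasmax", "break0": 0, "color0": "#FEE0D2",
          "break1": 2, "color1": "#FCBBA1", "fill_opacity": 0.75}
filled = fill_placeholders(paint, params)
```

Placeholders without a matching parameter are left untouched, which makes a missing key easy to spot in the rendered config.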

## create_layer_config

In [None]:
import json

def create_layer_config(
    layer_id,
    layer_type,
    layer_params,
    render_layers,
    provider_type,
    provider_account,
    provider_layer_sql = "SELECT * FROM {table_name}",
    sql_params = None):

  # only vector layers are implemented
  if layer_type != "vector":
    raise ValueError(f"Unsupported layer_type: {layer_type!r}; only 'vector' is implemented")

  # check for sql_params and append each key to the sql as a placeholder
  if sql_params:
    for k in sql_params.keys():
      provider_layer_sql = provider_layer_sql + " {" + f"{k}" + "}"

  out =  {
      "id": layer_id,
      "params": layer_params,
      "source": {
          "type": "vector",
          "provider": {
              "type": provider_type,
              "account": provider_account,
              "layers": [{
                  "options": {
                      "sql": provider_layer_sql,
                      "type": "cartodb"
                    }
              }]
          }
      },
      "render": {
          "layers": render_layers,
          "type": "vector",
          "version": "3.0"
      }
  }
  if sql_params:
      # add sql_params
      out.update({"sqlParams": sql_params})  
  
  return out

# Example
# -------

# Create a layer where the data source is carto-skydipper
# and the layer is rendered as a mapboxGL chloropleth

# Set layer type
layer_type = "vector"
provider_type = "carto-skydipper"
provider_account = "skydipper"

# Set the layers parameters, each <key> in the config will be replaced by value
layer_params = {
    "table_name": 'historical_total_zs_nuts_level_234',
    "column_name" : 'max_tasmax',
    }

# Set the colors, breaks, fill opacity, line color and width for the cloropleth
colors = ['#FEE0D2', '#FCBBA1', '#FC9272', '#FB694A', '#EF3B2C', '#CB181D', '#67000D']
breaks = [0,2,4,5,7,10,15]
fill_opacity = 0.75

# Add to layers parameters
layer_params.update(zip([f"break{n}" for n in range(0, len(breaks))], breaks))
layer_params.update(zip([f"color{n}" for n in range(0, len(colors))], colors))
layer_params.update({"fill_opacity": fill_opacity})

# Generate the layer id
layer_id = f"map-box-chloropleth-{len(breaks)}"

# Set data SOURCE
# set the provider sql
provider_layer_sql = "SELECT * FROM {table_name}"
# add extra sql value
sql_params = None#{"where": {"admin_level": 2}}
             #,"and": { "experiment" : "{experiment}"}}

# Set RENDER for MapboxGL Chloropleth map
# generate mapbox paint object
paint_object = create_mapbox_cloropleth_paint(nbreaks = len(breaks))
# make the render_layers list 
render_layers = [{"paint": paint_object, "source-layer": "layer0", "type": "fill"}]
                        
# make layerConfig dict
print(json.dumps(create_layer_config(layer_id, layer_type, layer_params, render_layers, provider_type, provider_account,  provider_layer_sql, sql_params), indent=4))            

{
    "id": "map-box-chloropleth-7",
    "params": {
        "table_name": "historical_total_zs_nuts_level_234",
        "column_name": "max_tasmax",
        "break0": 0,
        "break1": 2,
        "break2": 4,
        "break3": 5,
        "break4": 7,
        "break5": 10,
        "break6": 15,
        "color0": "#FEE0D2",
        "color1": "#FCBBA1",
        "color2": "#FC9272",
        "color3": "#FB694A",
        "color4": "#EF3B2C",
        "color5": "#CB181D",
        "color6": "#67000D",
        "fill_opacity": 0.75
    },
    "source": {
        "type": "vector",
        "provider": {
            "type": "carto-skydipper",
            "account": "skydipper",
            "layers": [
                {
                    "options": {
                        "sql": "SELECT * FROM {table_name}",
                        "type": "cartodb"
                    }
                }
            ]
        }
    },
    "render": {
        "layers": [
            {
                "paint":

## create_legend_config

In [None]:
def create_legend_config(names, colors):
  out = {"type": "basic", "items": [{ "color": f"{str(color)}", "name": f"{str(name)}"} for color, name in zip(colors, names)]}
  return out 

create_legend_config(['NA',0.0,0.0,1.6,2.0,2.5,3.2,4.2,5.2], ['#FEE0D2', '#FCBBA1', '#FC9272', '#FB694A', '#EF3B2C', '#CB181D', '#67000D'])

{'items': [{'color': '#FEE0D2', 'name': 'NA'},
  {'color': '#FCBBA1', 'name': '0.0'},
  {'color': '#FC9272', 'name': '0.0'},
  {'color': '#FB694A', 'name': '1.6'},
  {'color': '#EF3B2C', 'name': '2.0'},
  {'color': '#CB181D', 'name': '2.5'},
  {'color': '#67000D', 'name': '3.2'}],
 'type': 'basic'}
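
Note that `zip` stops at the shorter input, which is why the nine names above yield only seven legend items. The sketch below is a stricter variant that derives names from breaks and rejects mismatched lengths. The function name and the `-9999` null marker (taken from the `create_breaks` default) are assumptions for illustration.

```python
def legend_from_breaks(breaks, colors, null_value=-9999, null_name="NA"):
    """Build a basic legend config, labelling the null break as 'NA'."""
    if len(breaks) != len(colors):
        raise ValueError(f"{len(breaks)} breaks but {len(colors)} colors")
    items = [
        {"color": str(c), "name": null_name if b == null_value else str(b)}
        for b, c in zip(breaks, colors)
    ]
    return {"type": "basic", "items": items}


legend = legend_from_breaks([-9999, 0.0, 1.6], ["#F5F5F5", "#FEE0D2", "#FCBBA1"])
```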

## create_interaction_config

In [None]:
def create_interaction_config(dvars, dtypes, formats, names):
  out = {"output": [{"format": f, "type": dtype, "property": name, "column": dvar} for dvar, dtype, f, name in zip(dvars, dtypes, formats, names)]}
  return out

#import json
#dvars = ["max_tasmax", "min_tasmin", "total_heatwave_alerts", "total_coldsnap_warnings", "total_tasmin_std"]
#dtypes = [dss[dvar].dtype.name for dvar in dvars]
#formats = [None for i in dvars]
#names = [dvar.replace("_", " ").capitalize() for dvar in dvars]
#json.dumps(create_interaction_config(dvars, dtypes, formats, names))  

# Processing

Data processing organised into sections.

## Geometries and GID look-up table

### Write admin lookup to CSV and geometries to GeoJSON 

In [None]:
import pprint
import gcsfs
from dask.diagnostics import ProgressBar, Profiler, ResourceProfiler, CacheProfiler, visualize
import json
import encodings
import numpy as np

p= "zonal_stats"
# Make name id JSON
gda = get_cached_remote_zarr(group = 'nuts-2016-lau-2018', root = "copernicus-climate/european-nuts-lau-geometries.zarr")
gdas = gda.where((gda.admin_level.isin([0, 2, 3, 4]))&(gda.iso3=='ESP'), drop=True)

gdf = to_geopandas(gdas, 6)
centroids = gdf.centroid

fs = gcsfs.GCSFileSystem(project=gc_project, token=f"/root/.{gc_creds}")
with Profiler() as prof, ResourceProfiler(dt=1) as rprof, CacheProfiler() as cprof:
  with ProgressBar():
    with fs.open(f"{gcs_prefix}/{p}/geoname_gid_lookup_esp_nuts_lau_levels_0234.json", 'w', encoding='utf-8') as f:
      out = {"locations":[
             {"geoname":geoname, "gid":gid, "admin_level":admin_level, "longitude": np.round(x,6), "latitude": np.round(y,6)}\
             for geoname, gid, admin_level, x, y \
             in zip(gdas.coords['geoname'].values, gdas.coords['gid'].values, gdas.coords['admin_level'].values.tolist(), centroids.x, centroids.y)]}
      json.dump(out, f)


#pprint.pprint(json.dumps(out, ensure_ascii=False),indent=4)

[########################################] | 100% Completed |  0.1s
[########################################] | 100% Completed |  0.1s


### Set ACLs to public

In [None]:
# Set ACLs to public
p="zonal_stats"
set_acl_to_public(f"{gcs_prefix}/{p}/")

gsutil -m acl -r ch -u AllUsers:R gs://copernicus-climate/zonal_stats/
Set acl(s) sucsessful


In [None]:
%%time
# Write CSV to GCS
import gcsfs
import pandas as pd
from dask.diagnostics import ProgressBar, Profiler, ResourceProfiler, CacheProfiler, visualize
import json

# Geometries
gda = get_cached_remote_zarr(group = 'nuts-2016-lau-2018', root = "copernicus-climate/european-nuts-lau-geometries.zarr")
print(gda)

p = "zonal_stats"
fs = gcsfs.GCSFileSystem(project=gc_project, token=f"/root/.{gc_creds}")
with Profiler() as prof, ResourceProfiler(dt=1) as rprof, CacheProfiler() as cprof:
  with ProgressBar():
    with fs.open(f"{gcs_prefix}/{p}/admin_lookup_esp_nuts_lau_levels_0to4.csv", 'w') as f:
      print("\nwriting Admin. lookup\n")
      # Create admin lookup dictionary and GeoJSON files
      levels = [0,1,2,3,4]
      gdfs = [to_geopandas(gda.where((gda.admin_level==l)&(gda.iso3=='ESP'), drop=True), rounding_precision=6) for l in levels]
      # Reuse the GeoDataFrames built above instead of converting every level twice
      admin_dict = create_admin_dict(gdfs)
      pd.concat(admin_dict.values())[['admin_level', 'gid_left','gid_right', 'geoname']].to_csv(f, index=False)
      
    print("\nwriting GeoJSON\n")
    # Write to GeoJSON
    # FIXME: Geopandas does not play well with stream as path!
    pd.concat(gdfs).to_file("geometries_esp_nuts_lau_levels_0to4.geojson", driver="GeoJSON")
    copy_gcs(["geometries_esp_nuts_lau_levels_0to4.geojson"], [f"{gcs_prefix}/{p}/geometries_esp_nuts_lau_levels_0to4.geojson"])

<xarray.Dataset>
Dimensions:      (gid: 104568)
Coordinates:
    admin_level  (gid) int64 dask.array<chunksize=(104568,), meta=np.ndarray>
    geoname      (gid) object dask.array<chunksize=(26142,), meta=np.ndarray>
  * gid          (gid) object 'AL' 'CZ' 'DE' ... 'UK_W06000023' 'UK_W06000024'
    iso3         (gid) object dask.array<chunksize=(26142,), meta=np.ndarray>
Data variables:
    geometry     (gid) object dask.array<chunksize=(26142,), meta=np.ndarray>
Attributes:
    crs:                 EPSG:4326
    geospatial_lat_max:  75.814181
    geospatial_lat_min:  26.018616
    geospatial_lon_max:  69.103165
    geospatial_lon_min:  61.78629
    history:             Created by combining `ref-nuts-2016-01m` and `LAU-20...
    keywords:            Statistical units, NUTS, LAU
    summary:             This dataset represents the regions for levels 1, 2 ...
    title:               European Union Nomenclature of Territorial Units for...

writing Admin. lookup

[########################

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.obj[item] = s



writing GeoJSON

Processing: gsutil -m cp -r  geometries_esp_nuts_lau_levels_0to4.geojson gs://copernicus-climate/zonal_stats/geometries_esp_nuts_lau_levels_0to4.geojson
Task created
Finished copy
CPU times: user 25.2 s, sys: 922 ms, total: 26.2 s
Wall time: 35.2 s


### Set ACLs to public

In [None]:
# Set ACLs to public
set_acl_to_public(f"{gcs_prefix}/{p}/")

gsutil -m acl -r ch -u AllUsers:R gs://copernicus-climate/zonal_stats/
Set acl(s) sucsessful


### Upload to Carto

In [None]:
# Upload to Carto
# FIXME how to automatically make public??
import requests
import os

# Set some parameters
p = 'zonal_stats'
tis = ['admin_lookup', 'geometries']
ends = ['csv', 'geojson']
upload_tasks = list()

for ti, e in zip(tis, ends):
  payload = {
    "api_key":os.environ['CARTO_API_KEY'],
    "url":f"{gcs_http_url}/{p}/{ti}_esp_nuts_lau_levels_0to4.{e}",
    "privacy":"public",
    "interval":86400*7
    }
  url = f"{carto_base_url}/api/v1/synchronizations"
  headers = {'Content-Type': 'application/json'}
  r = requests.post(url=url, json=payload, headers=headers)
  upload_tasks.append(r.json())

for task in upload_tasks:
  print(task)

{'data_import': {'endpoint': '/api/v1/imports', 'item_queue_id': 'ae1828ae-0c4d-43d9-bb9e-2e6af581ed1f'}, 'id': 'ccc2e948-9ff3-11ea-b692-16056af2ae5d', 'name': None, 'interval': 604800, 'url': 'https://storage.googleapis.com/copernicus-climate/zonal_stats/admin_lookup_esp_nuts_lau_levels_0to4.csv', 'state': 'queued', 'user_id': 'c7980b72-f84a-4229-a346-ecc742f86552', 'created_at': '2020-05-27T08:26:44.944+00:00', 'updated_at': '2020-05-27T08:26:45.312+00:00', 'run_at': '2020-06-03T08:26:44.940+00:00', 'ran_at': '2020-05-27T08:26:44.941+00:00', 'modified_at': None, 'etag': None, 'checksum': '', 'log_id': None, 'error_code': None, 'error_message': None, 'retried_times': 0, 'service_name': None, 'service_item_id': None, 'type_guessing': True, 'quoted_fields_guessing': True, 'content_guessing': False, 'visualization_id': None, 'from_external_source': False}
{'data_import': {'endpoint': '/api/v1/imports', 'item_queue_id': '71ae4d81-02c7-422c-85d7-0aaa3c0b8bc0'}, 'id': 'cd26b874-9ff3-11ea-b6
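
The synchronization request takes the public file URL and a refresh interval in seconds; `86400*7` is one week, which matches the `'interval': 604800` echoed in the response. The sketch below separates payload construction from the HTTP call and flags a task that reports an error. `build_sync_payload` and `task_has_error` are hypothetical helpers, and the field names are taken from the request and response shown above.

```python
SECONDS_PER_DAY = 86400


def build_sync_payload(api_key, file_url, interval_days=7, privacy="public"):
    """Build the JSON body for a Carto synchronization request."""
    return {
        "api_key": api_key,
        "url": file_url,
        "privacy": privacy,
        "interval": SECONDS_PER_DAY * interval_days,
    }


def task_has_error(task):
    """True if a synchronization response carries an error code or message."""
    return task.get("error_code") is not None or bool(task.get("error_message"))


payload = build_sync_payload(
    "my-api-key",  # placeholder; the notebook reads os.environ['CARTO_API_KEY']
    "https://storage.googleapis.com/copernicus-climate/zonal_stats/admin_lookup_esp_nuts_lau_levels_0to4.csv",
)
```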

### Create Sky API datasets

In [None]:
import Skydipper as sky

In [None]:
# Remember Carto changes all '-' to '_' !
tis = ['admin_lookup', 'geometries']
datasets = list()
for ti in tis:
  atts = { 
    'name': f"{ti}_esp_nuts_lau_levels_0to4",
    'application': ['copernicusClimate'],
    'connectorType': 'rest',
    'provider': 'cartodb',
    'connectorUrl': f"http://35.233.41.65/user/skydipper/dataset/{ti}_esp_nuts_lau_levels_0to4",
    'tableName': f"{ti}_esp_nuts_lau_levels_0to4",
    'env': 'production'
    }
  #print(atts)
  ds = sky.Dataset(attributes=atts)
  datasets.append(ds)
  print(ds)

Dataset 29039f99-5300-4aa9-905b-632e963ee3f4 admin_lookup_esp_nuts_lau_levels_0to4
Dataset f681fa69-640b-4ee6-9f68-3a73cf749bf7 geometries_esp_nuts_lau_levels_0to4


In [None]:
datasets[0]

In [None]:
datasets[1]
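
Since Carto replaces every `-` with `_` when it creates a table, the dataset attributes have to use the converted name rather than the uploaded file name. The sketch below covers only the hyphen rule mentioned above; `carto_table_name` is a hypothetical helper, and Carto may apply other normalizations that are not handled here.

```python
import os


def carto_table_name(file_name):
    """Derive the Carto table name from an uploaded file name."""
    stem, _ = os.path.splitext(os.path.basename(file_name))
    return stem.replace("-", "_")


print(carto_table_name("historical_monthly_zs_nuts-level-234.csv"))
```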

## Create datasets for WIDGETS monthly climatic variables per location

### Write CSV tables to storage

In [None]:
%%time
# Write CSV to GCS
import gcsfs
import pandas as pd
from dask.diagnostics import ProgressBar, Profiler, ResourceProfiler, CacheProfiler, visualize

# Set some parameters
p = 'zonal_stats'
tis = ['historical', 'future-seasonal', 'future-longterm']

fs = gcsfs.GCSFileSystem(project=gc_project, token=f"/root/.{gc_creds}")
with Profiler() as prof, ResourceProfiler(dt=1) as rprof, CacheProfiler() as cprof:
  with ProgressBar():
    for ti in tis:
      print(f"writing {ti}")
      with fs.open(f"{gcs_prefix}/{p}/{ti}_monthly_zs_nuts-level-234.csv", 'w') as f:
        xr.merge([get_cached_remote_zarr(f"{ti}-monthly-zs-nuts-level-{l}", 'copernicus-climate/spain-zonal-stats.zarr') for l in [2,3,4]])\
        .to_dataframe().reset_index(drop=False).to_csv(f, index=False)

writing historical
[########################################] | 100% Completed |  0.5s
[########################################] | 100% Completed |  0.5s
[########################################] | 100% Completed | 17.3s
[########################################] | 100% Completed |  0.4s
[########################################] | 100% Completed |  0.5s
[########################################] | 100% Completed | 17.6s
[########################################] | 100% Completed |  0.5s
[########################################] | 100% Completed |  0.5s
[########################################] | 100% Completed | 16.9s
[########################################] | 100% Completed |  0.4s
[########################################] | 100% Completed |  0.5s
[########################################] | 100% Completed | 17.5s
[########################################] | 100% Completed |  0.4s
[########################################] | 100% Completed |  0.5s
[############################

### Set ACLs to public

In [None]:
# Set ACLs to public
set_acl_to_public(f"{gcs_prefix}/{p}/")

gsutil -m acl -r ch -u AllUsers:R gs://copernicus-climate/zonal_stats/
Set acl(s) sucsessful
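
The log line above shows the `gsutil` command that ends up being run. As a rough sketch only, the same command can be assembled as an argument list, which is the form `subprocess.run` expects. `public_read_acl_command` is a hypothetical helper, not the notebook's `set_acl_to_public`, and nothing is executed here.

```python
import shlex

def public_read_acl_command(gcs_url):
    """Build (but do not run) the gsutil call that grants public read access."""
    return ["gsutil", "-m", "acl", "-r", "ch", "-u", "AllUsers:R", gcs_url]

cmd = public_read_acl_command("gs://copernicus-climate/zonal_stats/")
# shlex.join renders the list as a single shell-safe string for logging
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would run it, assuming `gsutil` is installed and authenticated.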


### Upload to Carto

In [None]:
# Upload to Carto
# FIXME how to automatically make public??
import requests
import os

# Set some parameters
p = 'zonal_stats'
tis = ['historical', 'future-seasonal', 'future-longterm']

upload_tasks = list()
for ti in tis:
  #print(f"{gcs_http_url}/{p}/{ti}_monthly_zs_nuts-level-234.csv")
  payload = {
    "api_key":os.environ['CARTO_API_KEY'],
    "url":f"{gcs_http_url}/{p}/{ti}_monthly_zs_nuts-level-234.csv",
    "privacy":"public",
    "interval":86400*7
    }
  url = f"{carto_base_url}/api/v1/synchronizations"
  headers = {'Content-Type': 'application/json'}
  r = requests.post(url=url, json=payload, headers=headers)
  upload_tasks.append(r.json())

# Print an overview once all uploads have been queued
for task in upload_tasks:
  print(task)

{'data_import': {'endpoint': '/api/v1/imports', 'item_queue_id': 'd8b5fccd-01bc-4aa9-a2b7-6856b9e611fc'}, 'id': '6f0a1bfa-bd58-11ea-8f23-869c81b9a1d7', 'name': None, 'interval': 604800, 'url': 'https://storage.googleapis.com/copernicus-climate/zonal_stats/historical_monthly_zs_nuts-level-234.csv', 'state': 'queued', 'user_id': 'c7980b72-f84a-4229-a346-ecc742f86552', 'created_at': '2020-07-03T18:10:10.671+00:00', 'updated_at': '2020-07-03T18:10:11.036+00:00', 'run_at': '2020-07-10T18:10:10.592+00:00', 'ran_at': '2020-07-03T18:10:10.664+00:00', 'modified_at': None, 'etag': None, 'checksum': '', 'log_id': None, 'error_code': None, 'error_message': None, 'retried_times': 0, 'service_name': None, 'service_item_id': None, 'type_guessing': True, 'quoted_fields_guessing': True, 'content_guessing': False, 'visualization_id': None, 'from_external_source': False}
{'data_import': {'endpoint': '/api/v1/imports', 'item_queue_id': 'd8b5fccd-01bc-4aa9-a2b7-6856b9e611fc'}, 'id': '6f0a1bfa-bd58-11ea-8f2
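
Each synchronization response is a dictionary that includes `id`, `state` and `error_code` keys, as the output above shows. A minimal sketch of a summary over such dictionaries follows. `summarize_tasks` is a hypothetical helper, the first sample entry is trimmed from the response above, and the second entry (including its `failure` state and error code) is invented purely for illustration.

```python
from collections import Counter

def summarize_tasks(tasks):
    """Count tasks per state and list the ids of tasks reporting an error code."""
    states = Counter(t.get("state", "unknown") for t in tasks)
    failed = [t.get("id") for t in tasks if t.get("error_code") is not None]
    return dict(states), failed

sample = [
    # trimmed from the first response printed above
    {"id": "6f0a1bfa-bd58-11ea-8f23-869c81b9a1d7", "state": "queued", "error_code": None},
    # invented entry, only to exercise the error branch
    {"id": "hypothetical-id", "state": "failure", "error_code": 1001},
]
states, failed = summarize_tasks(sample)
print(states, failed)
```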

### Create Sky API datasets

In [None]:
import Skydipper as sky

In [None]:
# Remember Carto changes all '-' to '_' !
tis = ['historical', 'future_seasonal', 'future_longterm']
datasets = list()
for ti in tis:
  #print(f"{ti}_monthly_zs_nuts-level-234")

  atts = { 
    'name': f"{ti}_monthly_zs_nuts-level-234",
    'application': ['copernicusClimate'],
    'connectorType': 'rest',
    'provider': 'cartodb',
    'connectorUrl': f"http://35.233.41.65/user/skydipper/dataset/{ti}_monthly_zs_nuts_level_234",
    'tableName': f"{ti}_monthly_zs_nuts_level_234",
    'env': 'production'
    }
  #print(atts)
  ds = sky.Dataset(attributes=atts)
  datasets.append(ds)
  print(ds)

Dataset 6aa6e725-4725-4e5a-8d46-4196db9f8634 historical_monthly_zs_nuts-level-234
Dataset f9fc8dc6-128a-48ee-90b3-8d79a718f2f3 future_seasonal_monthly_zs_nuts-level-234
Dataset 47586c47-5c58-4f88-8fe9-25f683180dd5 future_longterm_monthly_zs_nuts-level-234


In [None]:
datasets[0]

In [None]:
datasets[1]

In [None]:
datasets[2]
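
The comment above notes that Carto replaces every `-` with `_`, which is why the CSV files use `future-seasonal` while the table names use `future_seasonal`. A small sketch of that mapping, assuming hyphen replacement is the only change Carto makes (it may normalise names in other ways too); `carto_table_name` is a hypothetical helper.

```python
def carto_table_name(file_stem):
    """Mimic the renaming noted above: every '-' becomes '_'."""
    return file_stem.replace("-", "_")

for ti in ["historical", "future-seasonal", "future-longterm"]:
    stem = f"{ti}_monthly_zs_nuts-level-234"   # CSV file name without extension
    print(stem, "->", carto_table_name(stem))
```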

### Create queries

In [None]:
# Access Carto queries response

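def get_timeseries(theme, time_interval, gid = "ES11", start_date = "1980-01-01", end_date = "2100-01-01"):
  """Query the monthly zonal-statistics table for one location.

  theme is 'heatwaves' or 'coldsnaps'; time_interval is 'historical',
  'future_seasonal' or 'future_longterm'. Returns the requests response.
  """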

  # Define SQL conditions
  # experiment is only future_longterm
  se = ""
  we = ""
  dvs = ""

  # choose dataset
  # for future use mean {data_var}_mean 
  # and standard deviation {data_var}_std
  if time_interval == "future_longterm":
    se = "experiment, "
    we = "AND experiment = 'rcp85'"
    dataset_id = 'bef42c82-2714-4ba0-8694-75e49916013a'
    table_name = 'future_longterm_monthly_zs_nuts_level_234'
    if theme == 'heatwaves':
      data_vars = ["tasmax", "heatwave_alarms", "heatwave_alerts", "heatwave_warnings"] 
    if theme == 'coldsnaps':
      data_vars = ["tasmin", "coldsnap_alarms", "coldsnap_alerts", "coldsnap_warnings"]
    dvs = [f"{data_var}_mean, {data_var}_std " for data_var in data_vars]
  
  if time_interval == "future_seasonal":
    dataset_id = 'e1cc3f3e-133a-4a14-b2c2-f3192ee213c3'
    table_name = "future_seasonal_monthly_zs_nuts_level_234"
    if theme == 'heatwaves':
      data_vars = ["tasmax", "heatwave_alarms", "heatwave_alerts", "heatwave_warnings"] 
    if theme == 'coldsnaps':
      data_vars = ["tasmin", "coldsnap_alarms", "coldsnap_alerts", "coldsnap_warnings"]
    dvs = [f"{data_var}_mean, {data_var}_std " for data_var in data_vars]
    
  if time_interval == "historical":
    dataset_id = '3a46bbff-73bc-4abc-bad6-11be6e99e2cb'
    table_name = 'historical_monthly_zs_nuts_level_234'
    if theme == 'heatwaves':
      data_vars = ["tasmax", "heatwave_alarms", "heatwave_alerts", "heatwave_warnings", "heatstress_extreme", "heatstress_strong", "heatstress_moderate"] 
    if theme == 'coldsnaps':
      data_vars = ["tasmin", "coldsnap_alarms", "coldsnap_alerts", "coldsnap_warnings", "coldstress_extreme", "coldstress_strong", "coldstress_moderate"]
    dvs = [f"{data_var}_mean " for data_var in data_vars]
    

  # Convert variables to string
  dvstring = ", ".join(dvs)
  #print(dvstring)

  sql = \
  f"SELECT gid, {se}time, {dvstring}"\
  f"FROM {table_name} "\
  f"WHERE gid = '{gid}' AND time between '{start_date}' AND '{end_date}' {we} "\
  "ORDER BY time"

  #print(sql)

  url = f"http://api.skydipper.com/v1/query/{dataset_id}/"
  params = {"sql": sql}
  headers = {'Authorization': f"Bearer {sky_api_token}"}
  r = requests.post(url=url, params=params, headers=headers)
  return r

In [None]:
import urllib
themes = ['heatwaves', 'coldsnaps']
time_intervals = ['historical', 'future_seasonal', 'future_longterm']
headers = {'Authorization': f"Bearer {sky_api_token}"}

print("\nHeader:\n")
print(headers)
print("\nAPI queries:\n")
for theme in themes:  
  print(f"\n{theme}:\n")
  for time_interval in time_intervals:
    r = get_timeseries(theme, time_interval)
    print(f"\n{time_interval}:\n")
    print(urllib.parse.unquote_plus(r.url))



Header:

{'Authorization': 'Bearer <redacted>'}

API queries:


heatwaves:


historical:


future_seasonal:


future_longterm:


coldsnaps:


historical:


future_seasonal:


future_longterm:
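
The branching in `get_timeseries` can also be expressed with lookup tables, which makes unsupported themes fail early instead of raising a `NameError` later. The sketch below only builds the SQL string and sends no request. `build_timeseries_sql`, `THEME_VARS` and `HISTORICAL_EXTRA` are hypothetical names, and, like the original, it interpolates values directly into the SQL, so it is only suitable for trusted inputs.

```python
THEME_VARS = {
    "heatwaves": ["tasmax", "heatwave_alarms", "heatwave_alerts", "heatwave_warnings"],
    "coldsnaps": ["tasmin", "coldsnap_alarms", "coldsnap_alerts", "coldsnap_warnings"],
}
# Extra variables that only the historical table provides
HISTORICAL_EXTRA = {
    "heatwaves": ["heatstress_extreme", "heatstress_strong", "heatstress_moderate"],
    "coldsnaps": ["coldstress_extreme", "coldstress_strong", "coldstress_moderate"],
}
TIME_INTERVALS = ("historical", "future_seasonal", "future_longterm")

def build_timeseries_sql(theme, time_interval, gid="ES11",
                         start_date="1980-01-01", end_date="2100-01-01"):
    """Return the SQL text for one theme, time interval and location."""
    if theme not in THEME_VARS:
        raise ValueError(f"unknown theme: {theme}")
    if time_interval not in TIME_INTERVALS:
        raise ValueError(f"unknown time interval: {time_interval}")

    data_vars = list(THEME_VARS[theme])
    if time_interval == "historical":
        data_vars += HISTORICAL_EXTRA[theme]
        columns = [f"{v}_mean" for v in data_vars]
    else:
        # future tables carry a mean and a standard deviation per variable
        columns = [c for v in data_vars for c in (f"{v}_mean", f"{v}_std")]

    select = ["gid"]
    where = [f"gid = '{gid}'", f"time BETWEEN '{start_date}' AND '{end_date}'"]
    if time_interval == "future_longterm":
        # experiment only exists in the long-term projections
        select.append("experiment")
        where.append("experiment = 'rcp85'")
    select += ["time"] + columns

    table = f"{time_interval}_monthly_zs_nuts_level_234"
    return (f"SELECT {', '.join(select)} FROM {table} "
            f"WHERE {' AND '.join(where)} ORDER BY time")

print(build_timeseries_sql("heatwaves", "future_longterm"))
```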



## Create datasets for WIDGETS daily PET climatology per month per location

In [None]:
tst = get_cached_remote_zarr(f"historical-hourly-petmax-quantiles-zs-nuts-level-3", 'copernicus-climate/spain-zonal-stats.zarr').chunk({'gid':-1})
tst.pet_mean.where(tst.pet_mean.notnull(), drop=True)

Unnamed: 0,Array,Chunk
Bytes,1.32 MB,109.82 kB
Shape,"(52, 12, 24, 11)","(52, 1, 24, 11)"
Count,49 Tasks,12 Chunks
Type,float64,numpy.ndarray


### Write CSV tables to storage

In [None]:
%%time
# Write CSV to GCS
import gcsfs
import pandas as pd
from dask.diagnostics import ProgressBar, Profiler, ResourceProfiler, CacheProfiler, visualize

# Set some parameters
p = 'zonal_stats'

fs = gcsfs.GCSFileSystem(project=gc_project, token=f"/root/.{gc_creds}")
with Profiler() as prof, ResourceProfiler(dt=1) as rprof, CacheProfiler() as cprof:
  with ProgressBar():
    for l in [2,3,4]:
      with fs.open(f"{gcs_prefix}/{p}/historical-hourly-petmax-quantiles-zs-nuts-level-{l}.csv", 'w') as f:
        get_cached_remote_zarr(f"historical-hourly-petmax-quantiles-zs-nuts-level-{l}", 'copernicus-climate/spain-zonal-stats.zarr').chunk({'gid':-1})\
        .to_dataframe().reset_index(drop=False).to_csv(f, index=False)

[########################################] | 100% Completed |  0.6s
[########################################] | 100% Completed |  0.6s
[########################################] | 100% Completed |  3.8s
CPU times: user 1min 33s, sys: 3.21 s, total: 1min 36s
Wall time: 2min 3s


In [None]:
import pandas as pd
tst = pd.read_csv(f"{gcs_http_url}/{p}/historical-hourly-petmax-quantiles-zs-nuts-level-3.csv")
tst.dropna()

Unnamed: 0,admin_level,gid,hour,month,quantile,pet_mean
0,3,ES111,0,1,0.0,-17.537402
1,3,ES111,0,1,0.1,-10.289688
2,3,ES111,0,1,0.2,-8.853540
3,3,ES111,0,1,0.3,-7.729150
4,3,ES111,0,1,0.4,-6.814788
...,...,...,...,...,...,...
164731,3,ES640,23,12,0.6,-0.212689
164732,3,ES640,23,12,0.7,0.766985
164733,3,ES640,23,12,0.8,2.019611
164734,3,ES640,23,12,0.9,3.393640
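
The table is in long format, with one row per location, hour, month and quantile. A widget plotting a daily profile would typically need the values grouped by hour. The sketch below shows one way to do that with the standard library only; `quantiles_by_hour` is a hypothetical helper and the sample rows are the first five rows printed above with the index column dropped.

```python
import csv
import io
from collections import defaultdict

sample_csv = """admin_level,gid,hour,month,quantile,pet_mean
3,ES111,0,1,0.0,-17.537402
3,ES111,0,1,0.1,-10.289688
3,ES111,0,1,0.2,-8.853540
3,ES111,0,1,0.3,-7.729150
3,ES111,0,1,0.4,-6.814788
"""

def quantiles_by_hour(csv_text, gid, month):
    """Return {hour: {quantile: pet_mean}} for one location and month."""
    out = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # skip other locations/months and rows with a missing value
        if row["gid"] == gid and int(row["month"]) == month and row["pet_mean"]:
            out[int(row["hour"])][float(row["quantile"])] = float(row["pet_mean"])
    return dict(out)

profile = quantiles_by_hour(sample_csv, "ES111", 1)
print(profile[0])
```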


### Set ACLs to public

In [None]:
# Set ACLs to public
p ="zonal_stats"
set_acl_to_public(f"{gcs_prefix}/{p}/")

gsutil -m acl -r ch -u AllUsers:R gs://copernicus-climate/zonal_stats/
Set acl(s) sucsessful


### Upload to Carto

In [None]:
# Upload to Carto
# FIXME how to automatically make public??
import requests
import os

# Set some parameters
p = 'zonal_stats'

upload_tasks = list()
for l in [2,3,4]:
  #print(f"{gcs_http_url}/{p}/historical-hourly-petmax-quantiles-zs-nuts-level-{l}.csv")
  payload = {
    "api_key":os.environ['CARTO_API_KEY'],
    "url":f"{gcs_http_url}/{p}/historical-hourly-petmax-quantiles-zs-nuts-level-{l}.csv",
    "privacy":"public",
    "interval":86400*7
    }
  url = f"{carto_base_url}/api/v1/synchronizations"
  headers = {'Content-Type': 'application/json'}
  r = requests.post(url=url, json=payload, headers=headers)
  upload_tasks.append(r.json())

for task in upload_tasks:
  print(task)

{'data_import': {'endpoint': '/api/v1/imports', 'item_queue_id': '4fd52958-3f60-48b6-b772-c7319debf949'}, 'id': 'bb04e074-bd5a-11ea-8f23-869c81b9a1d7', 'name': None, 'interval': 604800, 'url': 'https://storage.googleapis.com/copernicus-climate/zonal_stats/historical-hourly-petmax-quantiles-zs-nuts-level-2.csv', 'state': 'queued', 'user_id': 'c7980b72-f84a-4229-a346-ecc742f86552', 'created_at': '2020-07-03T18:26:37.061+00:00', 'updated_at': '2020-07-03T18:26:37.086+00:00', 'run_at': '2020-07-10T18:26:37.058+00:00', 'ran_at': '2020-07-03T18:26:37.058+00:00', 'modified_at': None, 'etag': None, 'checksum': '', 'log_id': None, 'error_code': None, 'error_message': None, 'retried_times': 0, 'service_name': None, 'service_item_id': None, 'type_guessing': True, 'quoted_fields_guessing': True, 'content_guessing': False, 'visualization_id': None, 'from_external_source': False}
{'data_import': {'endpoint': '/api/v1/imports', 'item_queue_id': 'e05c3ce1-ebf6-4ac2-99b9-ee3851755f79'}, 'id': 'bb3a8e18

### Create Sky API datasets

In [None]:
import Skydipper as sky

In [None]:
# Remember Carto changes all '-' to '_' !
datasets = list()
for l in [2,3,4]:
  #print(f"historical_hourly_petmax_quantiles_zs_nuts_level_{l}")

  atts = { 
    'name': f"historical_hourly_petmax_quantiles_zs_nuts_level_{l}",
    'application': ['copernicusClimate'],
    'connectorType': 'rest',
    'provider': 'cartodb',
    'connectorUrl': f"http://35.233.41.65/user/skydipper/dataset/historical_hourly_petmax_quantiles_zs_nuts_level_{l}",
    'tableName': f"historical_hourly_petmax_quantiles_zs_nuts_level_{l}",
    'env': 'production'
    }
  #print(atts)
  ds = sky.Dataset(attributes=atts)
  datasets.append(ds)

In [None]:
datasets[0]

In [None]:
datasets[1]

In [None]:
datasets[2]

### Create queries

In [None]:
# Access Carto queries response
import requests

def get_pet_climatology(month = 8, gid = "ES11", admin_level= 2):
  
  # Set data_variables
  data_vars = ["pet_mean"]
  
  # Convert variables to string
  dvstring = ", ".join(data_vars)

  # Set table name
  table_name = f"historical_hourly_petmax_quantiles_zs_nuts_level_{admin_level}" 

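  # Set dataset ID (only admin levels 2 and 3 have a dataset ID here)
  dataset_ids = {
    2: "3b07a5ef-05fe-4b0d-b6af-8dee4784714e",
    3: "0fcceb53-ac29-45c3-b0e3-7dab2970f448"
  }
  if admin_level not in dataset_ids:
    raise ValueError(f"No dataset ID defined for admin_level {admin_level}")
  dataset_id = dataset_ids[admin_level]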
    
  # Create SQL query
  sql = \
  f"SELECT gid, month, hour, quantile, {dvstring} "\
  f"FROM {table_name} "\
  f"WHERE gid = '{gid}' AND month = {month} "\
  "ORDER BY hour"
  print(sql)

  # Create request
  url = f"http://api.skydipper.com/v1/query/{dataset_id}/"
  print(url)
  params = {"sql": sql}
  headers = {'Authorization': f"Bearer {sky_api_token}"}
  r = requests.post(url=url, params=params, headers=headers)
  return r

In [None]:
r = get_pet_climatology(month = 8, gid = "ES11", admin_level=3)
print(r.url)

SELECT gid, month, hour, quantile, pet_mean FROM historical_hourly_petmax_quantiles_zs_nuts_level_3 WHERE gid = 'ES11' AND month = 8 ORDER BY hour
http://api.skydipper.com/v1/query/0fcceb53-ac29-45c3-b0e3-7dab2970f448/
https://api.skydipper.com/v1/query/0fcceb53-ac29-45c3-b0e3-7dab2970f448/?sql=SELECT+gid%2C+month%2C+hour%2C+quantile%2C+pet_mean+FROM+historical_hourly_petmax_quantiles_zs_nuts_level_3+WHERE+gid+%3D+%27ES11%27+AND+month+%3D+8+ORDER+BY+hour
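
The last line of the output shows the SQL after URL encoding: spaces become `+`, commas `%2C`, `=` becomes `%3D` and single quotes `%27`. The standard library reproduces the same query string, which can help when building or debugging these URLs without sending a request. This is only an illustration of the encoding, not of how `requests` implements it internally.

```python
from urllib.parse import urlencode, unquote_plus

sql = ("SELECT gid, month, hour, quantile, pet_mean "
       "FROM historical_hourly_petmax_quantiles_zs_nuts_level_3 "
       "WHERE gid = 'ES11' AND month = 8 "
       "ORDER BY hour")

query_string = urlencode({"sql": sql})   # form-style encoding of the parameter
print(query_string)
print(unquote_plus(query_string))        # decodes back to readable SQL
```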


## Create datasets for MAPS total climatic variables per location

### Write CSV tables to storage

In [None]:
%%time
# Write CSV to GCS
import gcsfs
import pandas as pd
from dask.diagnostics import ProgressBar, Profiler, ResourceProfiler, CacheProfiler, visualize

# Set some parameters
p = 'zonal_stats'
tis = ['historical', 'future-seasonal', 'future-longterm']

fs = gcsfs.GCSFileSystem(project=gc_project, token=f"/root/.{gc_creds}")
with Profiler() as prof, ResourceProfiler(dt=1) as rprof, CacheProfiler() as cprof:
  with ProgressBar():
    for ti in tis:
      print(f"writing {ti}")
      with fs.open(f"{gcs_prefix}/{p}/{ti}_total_zs_nuts-level-234.csv", 'w') as f:
        get_cached_remote_zarr(f"{ti}-total-zs-nuts-level-234", 'copernicus-climate/spain-zonal-stats.zarr')\
        .to_dataframe().reset_index(drop=False).to_csv(f, index=False)

writing historical
[########################################] | 100% Completed | 15.4s
[########################################] | 100% Completed |  0.1s
[########################################] | 100% Completed |  0.1s
[########################################] | 100% Completed |  0.1s
[########################################] | 100% Completed |  0.1s
[########################################] | 100% Completed |  1.5s
[########################################] | 100% Completed |  0.5s
[########################################] | 100% Completed |  0.4s
[########################################] | 100% Completed | 15.3s
[########################################] | 100% Completed | 15.0s
[########################################] | 100% Completed | 15.0s
[########################################] | 100% Completed | 15.0s
[########################################] | 100% Completed | 15.2s
[########################################] | 100% Completed | 15.0s
[############################

### Set ACLs to public

In [None]:
# Set ACLs to public
set_acl_to_public(f"{gcs_prefix}/{p}/")

gsutil -m acl -r ch -u AllUsers:R gs://copernicus-climate/zonal_stats/
Set acl(s) sucsessful


### Upload to Carto

In [None]:
# Upload to Carto
# FIXME how to automatically make public??
import requests
import os

# Set some parameters
p = 'zonal_stats'
tis = ['historical', 'future-seasonal', 'future-longterm']

upload_tasks = list()
for ti in tis:
  #print(f"{gcs_http_url}/{p}/{ti}_total_zs_nuts-level-234.csv")
  payload = {
    "api_key":os.environ['CARTO_API_KEY'],
    "url":f"{gcs_http_url}/{p}/{ti}_total_zs_nuts-level-234.csv",
    "privacy":"public",
    "interval":86400*7
    }
  url = f"{carto_base_url}/api/v1/synchronizations"
  headers = {'Content-Type': 'application/json'}
  r = requests.post(url=url, json=payload, headers=headers)
  upload_tasks.append(r.json())

# print overview
for task in upload_tasks:
  print(task)

{'data_import': {'endpoint': '/api/v1/imports', 'item_queue_id': '62265d4f-bc2a-4572-b8cc-9ac69cbaed6b'}, 'id': '11f570a2-9feb-11ea-b692-16056af2ae5d', 'name': None, 'interval': 604800, 'url': 'https://storage.googleapis.com/copernicus-climate/zonal_stats/historical_total_zs_nuts-level-234.csv', 'state': 'queued', 'user_id': 'c7980b72-f84a-4229-a346-ecc742f86552', 'created_at': '2020-05-27T07:24:15.567+00:00', 'updated_at': '2020-05-27T07:24:16.084+00:00', 'run_at': '2020-06-03T07:24:15.564+00:00', 'ran_at': '2020-05-27T07:24:15.565+00:00', 'modified_at': None, 'etag': None, 'checksum': '', 'log_id': None, 'error_code': None, 'error_message': None, 'retried_times': 0, 'service_name': None, 'service_item_id': None, 'type_guessing': True, 'quoted_fields_guessing': True, 'content_guessing': False, 'visualization_id': None, 'from_external_source': False}
{'data_import': {'endpoint': '/api/v1/imports', 'item_queue_id': '4bc157fe-b7e1-4c52-a566-3e49a19ae9e5'}, 'id': '1297b5a6-9feb-11ea-b692-

### Create Sky API datasets

In [None]:
import Skydipper as sky

In [None]:
# Remember Carto changes all '-' to '_' !
tis = ['historical', 'future_seasonal', 'future_longterm']
datasets = list()
for ti in tis:
  atts = { 
    'name': f"{ti}_total_zs_nuts-level-234",
    'application': ['copernicusClimate'],
    'connectorType': 'rest',
    'provider': 'cartodb',
    'connectorUrl': f"http://35.233.41.65/user/skydipper/dataset/{ti}_total_zs_nuts_level_234",
    'tableName': f"{ti}_total_zs_nuts_level_234",
    'env': 'production'
    }
  #print(atts)
  ds = sky.Dataset(attributes=atts)
  datasets.append(ds)
  print(ds)

Dataset 4212100b-d1da-47b4-9fdd-e2564ca955bb historical_total_zs_nuts-level-234
Dataset df6f7198-3b05-4d14-919a-6726e34f1603 future_seasonal_total_zs_nuts-level-234
Dataset 5cc5ee88-13f1-464c-a30e-73d3d556a8cd future_longterm_total_zs_nuts-level-234


In [None]:
datasets[0]

In [None]:
datasets[1]

In [None]:
datasets[2]

### Add Metadata

In [None]:
# Metadata template; most fields are still to be filled in
metadata = [{
                "id": "59a4226f7b6c000012baa6f5",
                "type": "metadata",
                "attributes": {
                    "dataset": "134caa0a-21f7-451d-a7fe-30db31a424aa",
                    "application": "gfw",
                    "resource": {
                        "id": "134caa0a-21f7-451d-a7fe-30db31a424aa",
                        "type": "dataset"
                    },
                    "language": "es",
                    "name": "",
                    "description": "",
                    "source": "",
                    "citation": "",
                    "license": "",
                    "info": {
                        "dataDownload": "",
                        "organization": "",
                        "source-long": "",
                        "short-description": "",
                        "caution": "",
                        "updateFrequence": "",
                        "dateContent": "",
                        "spatialResolution": "",
                        "geographicCoverage": "",
                        "function": "",
                        "subtitle": ""
                    },
                    "createdAt": "2017-08-28T14:02:23.744Z",
                    "updatedAt": "2017-08-28T14:02:23.744Z",
                    "status": "published"
                }
            }]

In [None]:
# Get list of subadmins

def get_gids(gid='ES', admin_level=2):
  url = f"http://api.skydipper.com/v1/query/29039f99-5300-4aa9-905b-632e963ee3f4/"
  sql = \
  f"SELECT gid_left, gid_right "\
  f"FROM admin_lookup_esp_nuts_lau_levels_0to4 "\
  f"WHERE gid_left = '{gid}' AND admin_level = {admin_level}"
  #print(sql)
  params = {"sql": sql}
  headers = {'Authorization': f"Bearer {sky_api_token}"}
  r = requests.post(url=url, params=params, headers=headers)
  admin_lookup = r.json().get('data')
  admin_lookup = [d.get('gid_right') for d in admin_lookup]
  return admin_lookup, r

a, r = get_gids()
print(urllib.parse.unquote_plus(r.url))  

https://api.skydipper.com/v1/query/29039f99-5300-4aa9-905b-632e963ee3f4/?sql=SELECT gid_left, gid_right FROM admin_lookup_esp_nuts_lau_levels_0to4 WHERE gid_left = 'ES' AND admin_level = 2


In [None]:
# Access Carto queries response

def get_map(theme, time_interval, admin_level=2, gid='ES'):
  
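  # Get list of subadmins
  # get_gids returns (gid list, response); keep only the list,
  # otherwise the tuple itself ends up inside the IN (...) clause
  sub_gids, _ = get_gids(gid, admin_level)
  gidstring = ", ".join([f"'{g}'" for g in sub_gids])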
  #print(gidstring)
  
  # Define SQL conditions
  # experiment is only future_longterm
  se = ""
  we = ""
  dvs = ""

  # choose dataset
  # for future use mean {data_var}_mean 
  # and standard deviation {data_var}_std
  if time_interval == "future_longterm":
    table_name = 'future_longterm_total_zs_nuts_level_234'
    dataset_id = "817e02ec-802c-4594-a755-8dca6891175a"
    se = f"{table_name}.experiment, "
    we = "AND experiment = 'rcp85'"
    if theme == 'heatwaves':
      data_vars = ["max_tasmax"]#, "heatwave_alarms", "heatwave_alerts", "heatwave_warnings"] 
    if theme == 'coldsnaps':
      data_vars = ["min_tasmin"]#, "coldsnap_alarms", "coldsnap_alerts", "coldsnap_warnings"]
    dvs = [f"{table_name}.{data_var} " for data_var in data_vars]
  
  if time_interval == "future_seasonal":
    dataset_id = "075eb3e5-77bb-4fd7-a6b9-3108ae6ba166"
    table_name = "future_seasonal_total_zs_nuts_level_234"
    if theme == 'heatwaves':
      data_vars = ["max_tasmax"]#, "heatwave_alarms", "heatwave_alerts", "heatwave_warnings"] 
    if theme == 'coldsnaps':
      data_vars = ["min_tasmin"]#, "coldsnap_alarms", "coldsnap_alerts", "coldsnap_warnings"]
    dvs = [f"{table_name}.{data_var} " for data_var in data_vars]
    
  if time_interval == "historical":
    dataset_id = "5d0bc927-6780-4f64-ba3d-dc241be6c26d"
    table_name = 'historical_total_zs_nuts_level_234'
    if theme == 'heatwaves':
      data_vars = ["max_tasmax"]#, "heatwave_alarms", "heatwave_alerts", "heatwave_warnings", "heatstress_extreme", "heatstress_strong", "heatstress_moderate"] 
    if theme == 'coldsnaps':
      data_vars = ["min_tasmin"]#, "coldsnap_alarms", "coldsnap_alerts", "coldsnap_warnings", "coldstress_extreme", "coldstress_strong", "coldstress_moderate"]
    dvs = [f"{table_name}.{data_var} " for data_var in data_vars]
    
  # Convert variables to string
  dvstring = ", ".join(dvs)
  #print(dvstring)

  #SELECT geometries_esp_nuts_lau_levels_0to4.the_geom, geometries_esp_nuts_lau_levels_0to4.geoname, historical_total_zs_nuts_level_234.gid,  historical_total_zs_nuts_level_234.max_tasmax FROM historical_total_zs_nuts_level_234 INNER JOIN geometries_esp_nuts_lau_levels_0to4 ON historical_total_zs_nuts_level_234.gid=geometries_esp_nuts_lau_levels_0to4.gid WHERE historical_total_zs_nuts_level_234.gid IN ('ES70', 'ES11', 'ES43', 'ES12', 'ES63', 'ES61', 'ES41', 'ES13', 'ES30', 'ES42', 'ES64', 'ES21', 'ES23', 'ES22', 'ES62', 'ES24', 'ES52', 'ES51', 'ES53')  ORDER BY historical_total_zs_nuts_level_234.gid
  sql = \
  f"SELECT geometries_esp_nuts_lau_levels_0to4.the_geom, geometries_esp_nuts_lau_levels_0to4.geoname, {table_name}.gid, {se} {dvstring}"\
  f"FROM {table_name} "\
  f"JOIN geometries_esp_nuts_lau_levels_0to4 ON {table_name}.gid=geometries_esp_nuts_lau_levels_0to4.gid "\
  f"WHERE {table_name}.gid IN ({gidstring}) {we} "\
  f"ORDER BY {table_name}.gid"

  print(sql)

  url = f"http://api.skydipper.com/v1/query/{dataset_id}/"
  params = {"sql": sql}
  headers = {'Authorization': f"Bearer {sky_api_token}"}
  r = requests.post(url=url, params=params, headers=headers)
  return r

In [None]:
import urllib
themes = ['heatwaves', 'coldsnaps']
time_intervals = ['historical', 'future_seasonal', 'future_longterm']
headers = {'Authorization': f"Bearer {sky_api_token}"}

print("\nHeader:\n")
print(headers)
print("\nAPI queries:\n")
for theme in themes:  
  print(f"\n{theme}:\n")
  for time_interval in time_intervals:
    r = get_map(theme, time_interval)
    print(f"\n{time_interval}:\n")
    print(urllib.parse.unquote_plus(r.url))



Header:

{'Authorization': 'Bearer <redacted>'}

API queries:


heatwaves:

SELECT geometries_esp_nuts_lau_levels_0to4.the_geom, geometries_esp_nuts_lau_levels_0to4.geoname, historical_total_zs_nuts_level_234.gid,  historical_total_zs_nuts_level_234.max_tasmax FROM historical_total_zs_nuts_level_234 JOIN geometries_esp_nuts_lau_levels_0to4 ON historical_total_zs_nuts_level_234.gid=geometries_esp_nuts_lau_levels_0to4.gid WHERE historical_total_zs_nuts_level_234.gid IN ('['ES70', 'ES11', 'ES43', 'ES12', 'ES63', 'ES61', 'ES41', 'ES13', 'ES30', 'ES42', 'ES64', 'ES21', 'ES23', 'ES22', 'ES62', 'ES24', 'ES52

In [None]:
ds = sky.Dataset(id_hash="5d0bc927-6780-4f64-ba3d-dc241be6c26d")
sql = "SELECT *  FROM historical_total_zs_nuts_level_234 JOIN geometries_esp_nuts_lau_levels_0to4 ON historical_total_zs_nuts_level_234.gid=geometries_esp_nuts_lau_levels_0to4.gid WHERE historical_total_zs_nuts_level_234.gid IN ('['ES70', 'ES11', 'ES43', 'ES12', 'ES63', 'ES61', 'ES41', 'ES13', 'ES30', 'ES42', 'ES64', 'ES21', 'ES23', 'ES22', 'ES62', 'ES24', 'ES52', 'ES51', 'ES53']', '<Response [200]>')  ORDER BY historical_total_zs_nuts_level_234.gid"
ds.query(sql=sql)

ValueError: ignored
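
The `IN ('['ES70', ...` fragment in the SQL above is what results when the whole return value of `get_gids` (a list plus the response object) is formatted as if it were the list of gids. The sketch below reproduces the effect with stand-in values. `sql_in_list` is a hypothetical helper, the three gids are a subset of those listed above, and the response is represented by a plain string.

```python
def sql_in_list(values):
    """Quote each value and join with commas, for use inside IN (...)."""
    return ", ".join(f"'{v}'" for v in values)

gids = ["ES70", "ES11", "ES43"]
response = "<Response [200]>"   # stand-in for the requests response object

# Formatting the (gids, response) tuple: each element is quoted as a whole
broken = sql_in_list((gids, response))
# Formatting only the list of gids gives the intended clause
fixed = sql_in_list(gids)

print(broken)
print(fixed)
```

Unpacking the tuple before formatting (`gids, r = get_gids(...)`) is enough to avoid the malformed clause.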

## Create Sky API Dataset Layers

Create a single layer for each variable of each theme per admin level. For future-longterm, also per experiment.

+ define basic structure
+ get breaks for each variable
+ create cartocss
+ create layers
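
To illustrate the "get breaks" step: a classification into `n` classes needs `n + 1` break edges. The notebook uses Jenks breaks through its own `create_breaks` helper; the sketch below swaps in plain quantile breaks from the standard library as a simpler stand-in, so the edges it produces will generally differ from Jenks. `quantile_breaks` is a hypothetical name and the input data are invented.

```python
import statistics

def quantile_breaks(values, nclasses=7, ndigits=1):
    """Return nclasses + 1 edges: minimum, interior quantiles, maximum."""
    data = sorted(v for v in values if v is not None)
    # statistics.quantiles returns the nclasses - 1 interior cut points
    interior = statistics.quantiles(data, n=nclasses, method="inclusive")
    edges = [data[0]] + interior + [data[-1]]
    return [round(e, ndigits) for e in edges]

values = [float(v) for v in range(0, 71)]   # invented data: 0.0 to 70.0
breaks = quantile_breaks(values)
print(breaks)
```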

In [None]:
import Skydipper as sky

### Define color palettes

In [None]:
#import palettable as pal

# Define function to choose color ramps

print("\nDiverging blues to reds:\n")
color_ramp_blue_red = ["#08306B", "#0A519C", "#2171B5", "#4292C5", "#6BAED6", "#C6DBEF", "#FEE0D2", "#FCBBA1", "#FC9272", "#FB694A", "#EF3B2C", "#CB181D", "#67000D"]
show_colors_as_blocks(color_ramp_blue_red) 

# negative
color_ramp_blue_white = ["#08306B", "#0A519C", "#2171B5", "#4292C5", "#6BAED6", "#C6DBEF", "#F1EEF6"]
# positive
color_ramp_white_red = ["#FEE0D2", "#FCBBA1", "#FC9272", "#FB694A", "#EF3B2C", "#CB181D", "#67000D"]
# positive
color_ramp_white_blue = ["#F1EEF6", "#D0D1E6", "#A6BDDB", "#74A9CF", "#3690C0", "#0570B0", "#034E7B"]

def cr(dvar):
  out = ['Error no color ramp found for variable']
  if "cold" in dvar:
    out = color_ramp_white_blue
  if "heat" in dvar:
    out = color_ramp_white_red
  if "min" in dvar:
    out = color_ramp_blue_white
  if "max" in dvar:
    out = color_ramp_white_red  
  return out

for dvar in ["max_tasmax", "min_tasmin", "total_heatwave_alerts", "total_coldsnap_warnings", "total_tasmin_std"]:
  print(f"\n{dvar}:\n")
  show_colors_as_blocks(cr(dvar))           


Diverging blues to reds:




 ['#08306B', '#0A519C', '#2171B5', '#4292C5', '#6BAED6', '#C6DBEF', '#FEE0D2', '#FCBBA1', '#FC9272', '#FB694A', '#EF3B2C', '#CB181D', '#67000D']:

max_tasmax:




 ['#FEE0D2', '#FCBBA1', '#FC9272', '#FB694A', '#EF3B2C', '#CB181D', '#67000D']:

min_tasmin:




 ['#08306B', '#0A519C', '#2171B5', '#4292C5', '#6BAED6', '#C6DBEF', '#F1EEF6']:

total_heatwave_alerts:




 ['#FEE0D2', '#FCBBA1', '#FC9272', '#FB694A', '#EF3B2C', '#CB181D', '#67000D']:

total_coldsnap_warnings:




 ['#F1EEF6', '#D0D1E6', '#A6BDDB', '#74A9CF', '#3690C0', '#0570B0', '#034E7B']:

total_tasmin_std:




 ['#08306B', '#0A519C', '#2171B5', '#4292C5', '#6BAED6', '#C6DBEF', '#F1EEF6']:
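
A seven-colour ramp pairs with eight break edges, one colour per interval. The sketch below shows how a value could be mapped to its colour with `bisect`. The break list is one of those printed further down for the historical dataset (which variable it belongs to is not stated), and the interval convention (closed on the left, out-of-range values clamped) is an assumption; the Mapbox paint expression used by the layers may treat boundaries differently. `color_for` is a hypothetical helper.

```python
from bisect import bisect_right

color_ramp_white_red = ["#FEE0D2", "#FCBBA1", "#FC9272", "#FB694A",
                        "#EF3B2C", "#CB181D", "#67000D"]
breaks = [31.9, 38.9, 41.5, 43.5, 45.4, 47.3, 49.5, 54.9]

def color_for(value, breaks, colors):
    """Pick the colour of the class whose [lower, upper) interval holds value."""
    if len(breaks) != len(colors) + 1:
        raise ValueError("expected one more break than colours")
    idx = bisect_right(breaks, value) - 1
    idx = max(0, min(idx, len(colors) - 1))   # clamp values outside the range
    return colors[idx]

print(color_for(40.0, breaks, color_ramp_white_red))
```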


### Create layer attributes

In [None]:
%%time
# Create layer attributes
# for a VCS-API carto table rendered using Mapbox GL as a choropleth
# admin_level is external selector in all layers
# experiment is also selectable for future-longterm

# Dataset specific params
tis = ['historical', 'future-seasonal', 'future-longterm']
provider_type = "carto-skydipper"
provider_account = "skydipper"

# choropleth graphical params
nbreaks = 7
breaks_method = 'jenks'
fill_opacity = 0.75
paint_object = create_mapbox_cloropleth_paint(nbreaks)
# make the render_layers list 
render_layers = [{"paint": paint_object, "source-layer": "layer0", "type": "fill"}]

# loop through time_intervals to create layer attributes per dataset
ly_list = list()
for ti in tis:
    print(f"\nProcessing {ti}")
    scenario_text = ""
    if ti == "future-longterm":
      table_name = 'future_longterm_total_zs_nuts_level_234'
      dataset_id = "5cc5ee88-13f1-464c-a30e-73d3d556a8cd"
      scenario_text = " for scenario CMIP5 RCP45 and RCP85"
      provider_layer_sql = "SELECT * FROM {table_name} WHERE admin_level = {admin_level} AND experiment = {experiment}"
      sd = '2020'
      ed = '2100'
  
    if ti == "future-seasonal":
      dataset_id = "df6f7198-3b05-4d14-919a-6726e34f1603"
      table_name = "future_seasonal_total_zs_nuts_level_234"
      provider_layer_sql = "SELECT * FROM {table_name} WHERE admin_level = {admin_level}"
      sd = '2020-02-01'
      ed = '2020-07-30'

    if ti == "historical":
      dataset_id = "4212100b-d1da-47b4-9fdd-e2564ca955bb"
      table_name = 'historical_total_zs_nuts_level_234'
      provider_layer_sql = "SELECT * FROM {table_name} WHERE admin_level = {admin_level}"
      sd = '1980'
      ed = '2019'

    print(f"Dataset {dataset_id} {table_name}")
    
    # get zarr dataset
    dss = get_cached_remote_zarr(f"{ti}-total-zs-nuts-level-234", 'copernicus-climate/spain-zonal-stats.zarr')
    
    print("\n...getting data variable names\n")
    dvars = list(dss.data_vars.keys())
    dvars = [dvar for dvar in dvars if (dvar.startswith('min_')) or (dvar.startswith('max_')) or (dvar.startswith('total_'))]
    print(dvars)
    
    print("\n...creating breaks\n")
    breaks = [create_breaks(dss[dvar], 7, 1, method=breaks_method, null_value = None) for dvar in dvars]
    print(breaks)
    
    print("\n...creating legendConfig\n")
    legendConfigs = [create_legend_config(b, cr(dvar)) for b, dvar in zip(breaks, dvars)]
    print(legendConfigs)

    print("\n...creating interaction config\n")
    dvars_ic = list(dss.data_vars.keys())
    dvars_ic = [dvar for dvar in dvars_ic if (dvar.startswith('min_')) or (dvar.startswith('max_')) or (dvar.startswith('total_')) or (dvar.startswith('date_'))]
    dtypes = [dss[dvar].dtype.name for dvar in dvars_ic]
    formats = [None for i in dvars_ic]
    names = [dvar.replace("_", " ").capitalize() for dvar in dvars_ic]
    interactionConfig = create_interaction_config(dvars_ic, dtypes, formats, names)
    print(interactionConfig)

    print("\n...creating layer config\n")
    # Create a layer where the data source is carto-skydipper
    # and the layer is rendered as a Mapbox GL choropleth

    def mk_lc_params(b, dvar):
      # Set the layers parameters, each <key> in the config will be replaced by value
      lp = {
        "admin_level": 2,
        "experiment": "'rcp85'",
        "table_name": table_name,
        "column_name" : dvar,
        "fill_opacity": fill_opacity
      }
      # Add to layers parameters
      colors = cr(dvar)
      lp.update(zip([f"break{n}" for n in range(0, len(b))], b))
      lp.update(zip([f"color{n}" for n in range(0, len(colors))], colors))
      return lp
    
    layerConfigs = [create_layer_config(
        f"map-box-chloropleth-{nbreaks}",
        "vector",
        mk_lc_params(b, dvar),
        render_layers,
        provider_type,
        provider_account,
        provider_layer_sql,
        sql_params = None) for b, dvar in zip(breaks, dvars)]
                        
    # Create the layers
    print("\n...creating api layer attributes\n")
    atts_list = list()
    for dvar, layerConfig, legendConfig in zip(dvars, layerConfigs, legendConfigs):
        # create name
        dvar_name = dvar.replace("_", " ")
        print(dvar_name)
        # time interval name
        time_name = ti.replace("-", " ").capitalize()
        new_atts = {
          "name": f"{time_name} {dvar_name} admin level 2, 3, and 4",
          "dataset": dataset_id,
          "description": f"{time_name} {dvar_name} averaged per admin. 2, 3, and 4 geometry between {sd} to {ed}{scenario_text}",
          "application": ["copernicusClimate"],
          "iso": ['ESP'],
          "env": "production",
          "provider": "cartodb",
          "layerConfig": layerConfig,
          "legendConfig": legendConfig,
          "interactionConfig": interactionConfig,
          "applicationConfig": {},
          "staticImageConfig": {}
        }
        atts_list.append(new_atts)
    # add to list
    ly_list.append(atts_list)

print("\ndone!\n")      


Processing historical
Dataset 4212100b-d1da-47b4-9fdd-e2564ca955bb historical_total_zs_nuts_level_234

...getting data variable names


...creating breaks

[[31.9, 38.9, 41.5, 43.5, 45.4, 47.3, 49.5, 54.9], [27.6, 33.4, 35.9, 37.6, 39.2, 41.0, 42.9, 46.1], [-33.0, -27.0, -22.0, -19.0, -16.5, -13.9, -10.4, -4.1], [-28.8, -19.8, -16.3, -13.5, -10.5, -7.3, -3.9, 2.7], [0.0, 0.0, 25.6, 29.1, 32.0, 35.0, 38.6, 46.0], [0.0, 0.0, 43.4, 47.9, 51.4, 54.9, 58.7, 65.9], [0.0, 0.0, 145.7, 151.7, 156.6, 161.2, 166.6, 179.9], [0.0, 2.6, 6.5, 10.8, 14.6, 18.9, 24.1, 34.9], [6.0, 13.8, 19.8, 24.8, 28.0, 31.3, 37.8, 56.6], [0.5, 4.3, 8.0, 11.7, 15.8, 19.4, 22.1, 27.0], [0.0, 0.7, 1.9, 3.5, 5.5, 7.9, 12.0, 18.4], [1.0, 11.5, 18.5, 24.6, 29.0, 32.9, 38.0, 45.6], [0.0, 3.8, 7.6, 11.8, 17.4, 23.2, 28.9, 34.8], [0.0, 6.7, 15.8, 21.9, 27.0, 31.4, 35.7, 44.7], [0.0, 0.0, 25.0, 34.7, 43.7, 51.0, 57.6, 68.2], [0.0, 0.0, 99.9, 116.1, 130.5, 142.6, 153.0, 170.7]]

...creating legendConfig

[{'type': 'basic', 'it
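
The `mk_lc_params` helper above flattens each list of breaks and colors into numbered keys (`break0`, `break1`, ..., `color0`, ...). A minimal, self-contained sketch of that step follows. The color ramp, the `fill_opacity` value and the `<key>` substitution function `fill_template` are hypothetical stand-ins: `cr` and `create_layer_config` are defined elsewhere in the notebook and may behave differently.

```python
import re

def numbered_params(prefix, values):
    """Map a list of values to {prefix0: v0, prefix1: v1, ...}."""
    return {f"{prefix}{n}": v for n, v in enumerate(values)}

def fill_template(template, params):
    """Replace every <key> token with str(params[key]) (hypothetical helper)."""
    return re.sub(r"<(\w+)>", lambda m: str(params[m.group(1)]), template)

# First three breaks of the first list printed in the output above
breaks = [31.9, 38.9, 41.5]
# Illustrative color ramp (an assumption, not the project's palette)
colors = ["#ffffcc", "#fd8d3c", "#800026"]

# column_name and fill_opacity are example values for this sketch
params = {"column_name": "max_petmax", "fill_opacity": 0.8}
params.update(numbered_params("break", breaks))
params.update(numbered_params("color", colors))

template = "[<column_name>] >= <break1> -> <color1> @ <fill_opacity>"
print(fill_template(template, params))
```

Using `enumerate` keeps the key index and the value together, so the number of keys always matches the number of breaks or colors.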

In [None]:
import pprint
print(len(ly_list))

# View first result
for att_list in ly_list:
    print(len(att_list))
    pprint.pprint(att_list[0], indent=2)

3
16
{ 'application': ['copernicusClimate'],
  'applicationConfig': {},
  'dataset': '4212100b-d1da-47b4-9fdd-e2564ca955bb',
  'description': 'Historical max petmax averaged per admin. 2, 3, and 4 '
                 'geometry between 1980 to 2019',
  'env': 'production',
  'interactionConfig': { 'output': [ { 'column': 'date_max_petmax',
                                       'format': None,
                                       'property': 'Date max petmax',
                                       'type': 'datetime64[ns]'},
                                     { 'column': 'date_max_tasmax',
                                       'format': None,
                                       'property': 'Date max tasmax',
                                       'type': 'datetime64[ns]'},
                                     { 'column': 'date_min_petmin',
                                       'format': None,
                                       'property': 'Date min petmin',
                 
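The printed attributes suggest that `interactionConfig` holds one `output` entry per column, with `column`, `format`, `property` and `type` keys. The sketch below rebuilds that structure from parallel lists. `build_interaction_config` is a hypothetical stand-in for `create_interaction_config`, which is defined elsewhere in the notebook and may differ.

```python
def build_interaction_config(columns, dtypes, formats, names):
    """Zip four parallel lists into an {'output': [...]} dictionary."""
    if not (len(columns) == len(dtypes) == len(formats) == len(names)):
        raise ValueError("all input lists must have the same length")
    return {
        "output": [
            {"column": c, "format": f, "property": n, "type": t}
            for c, t, f, n in zip(columns, dtypes, formats, names)
        ]
    }

# Two of the date columns that appear in the printed attributes
columns = ["date_max_petmax", "date_max_tasmax"]
dtypes = ["datetime64[ns]"] * len(columns)
formats = [None] * len(columns)
names = [c.replace("_", " ").capitalize() for c in columns]

config = build_interaction_config(columns, dtypes, formats, names)
print(config["output"][0])
```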

### Remove previous layers

In [None]:
%%time
import requests

def rmv_layers_from_dataset(dataset_id):
  """Delete every layer attached to the dataset via the Skydipper API."""
  ds = sky.Dataset(id_hash=dataset_id)
  lyids = [ly.id for ly in ds.layers]
  headers = {'Authorization': f"Bearer {sky_api_token}"}
  for layer_id in lyids:
    url = f"https://api.skydipper.com/v1/dataset/{dataset_id}/layer/{layer_id}"
    print(layer_id)
    r = requests.delete(url=url, headers=headers)
    # Print the response so that failed deletions are visible
    print(r)

# Remove any previous layers
for dataset_id in ["5cc5ee88-13f1-464c-a30e-73d3d556a8cd",
                   "df6f7198-3b05-4d14-919a-6726e34f1603",
                   "4212100b-d1da-47b4-9fdd-e2564ca955bb"]:
  print(dataset_id)
  rmv_layers_from_dataset(dataset_id)


5cc5ee88-13f1-464c-a30e-73d3d556a8cd
df6f7198-3b05-4d14-919a-6726e34f1603
4212100b-d1da-47b4-9fdd-e2564ca955bb
CPU times: user 74.4 ms, sys: 5.2 ms, total: 79.6 ms
Wall time: 1.86 s
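
The deletion loop above prints each response but does not stop or report when a request fails. One way to collect failures is sketched below. `delete_layers` is a hypothetical helper; it takes the HTTP function as an argument (for example `requests.delete`), so the logic can be exercised with a fake response object and without touching the API.

```python
from types import SimpleNamespace

API_ROOT = "https://api.skydipper.com/v1"

def delete_layers(dataset_id, layer_ids, token, http_delete):
    """Delete each layer and return {layer_id: status_code} for failed requests.

    http_delete must accept url= and headers= keywords and return an object
    with a status_code attribute, as requests.delete does.
    """
    headers = {"Authorization": f"Bearer {token}"}
    failed = {}
    for layer_id in layer_ids:
        url = f"{API_ROOT}/dataset/{dataset_id}/layer/{layer_id}"
        response = http_delete(url=url, headers=headers)
        if response.status_code >= 400:
            failed[layer_id] = response.status_code
    return failed

# Fake transport for a dry run: pretend the layer "missing" does not exist
def fake_delete(url, headers):
    status = 404 if url.endswith("/missing") else 200
    return SimpleNamespace(status_code=status)

failed = delete_layers("demo-dataset", ["layer-a", "missing"], "dummy-token", fake_delete)
print(failed)
```

In the notebook this could be called as `delete_layers(dataset_id, lyids, sky_api_token, requests.delete)`, assuming the same variables as the cell above.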


### Add layers to API

In [None]:
import Skydipper as sky

In [None]:
%%time
for att_list in ly_list:
  for att in att_list:
    #print(att)
    ly = sky.Layer(attributes=att)
    print(ly)

Layer 9a07929a-d9c7-4ed6-b4c6-3c6e6c4d5ce3 Historical max petmax admin level 2, 3, and 4
Layer 424a20e9-efc1-4813-b031-791a4707dbcb Historical max tasmax admin level 2, 3, and 4
Layer 9a061750-3202-4004-8ab7-cc4bd4a13bf5 Historical min petmin admin level 2, 3, and 4
Layer e0692e43-a495-49d3-876b-8bd3b83ab5be Historical min tasmin admin level 2, 3, and 4
Layer aa3e0b4e-4087-4089-bf78-c71632e7ab57 Historical total coldsnap alarms admin level 2, 3, and 4
Layer 1b3119a3-38fd-4f71-befa-f4dd0b97dfbe Historical total coldsnap alerts admin level 2, 3, and 4
Layer 9c737ec4-d9eb-452c-b3e9-8aa04ebee7ed Historical total coldstress extreme admin level 2, 3, and 4
Layer 7dfdd38d-6feb-42bf-a4c3-24f4dc4ca896 Historical total coldstress moderate admin level 2, 3, and 4
Layer 960e4aeb-1534-4e4a-b1d4-eb05cddc5a9b Historical total coldstress strong admin level 2, 3, and 4
Layer 36fbcf0e-e88f-412d-ab9f-e1710d18f267 Historical total heatstress extreme admin level 2, 3, and 4
Layer b2a4ea5e-2cc0-4f8a-90b0-6b

### Export layer summary

In [None]:
import Skydipper as sky

In [None]:
%%time
# Export layer summary
import json
import gcsfs

tis = ['historical', 'future-seasonal', 'future-longterm']
out = {}
for ti in tis:
  dataset_id = ""
  if ti == "future-longterm":
    dataset_id = "5cc5ee88-13f1-464c-a30e-73d3d556a8cd"
  elif ti == "future-seasonal":
    dataset_id = "df6f7198-3b05-4d14-919a-6726e34f1603"
  elif ti == "historical":
    dataset_id = "4212100b-d1da-47b4-9fdd-e2564ca955bb"
  ds = sky.Dataset(id_hash=dataset_id)
  layers = ds.layers
  out[ti] = {"name": ds.attributes["name"],
             "id": ds.id,
             "layers": [{"name": ly.attributes["name"],
                         "id": ly.id,
                         "endpoint": f"https://api.skydipper.com/v1/layer/{ly.id}"} for ly in layers]
             }
  print(f"\n{ti}\n")
  print(ds, ":")
  for ly in layers:
    print(" --- ", ly)
    #print("     ", ly.attributes.get("description"))

# Export
#pprint.pprint(out)    
fs = gcsfs.GCSFileSystem(project=gc_project, token=f"/root/.{gc_creds}")
with fs.open(f"{gcs_prefix}/zonal_stats/layer_definitions.json", 'w') as f:
  json.dump(obj=out, fp=f)


historical

Dataset 4212100b-d1da-47b4-9fdd-e2564ca955bb historical_total_zs_nuts-level-234 :
 ---  Layer 9a07929a-d9c7-4ed6-b4c6-3c6e6c4d5ce3 Historical max petmax admin level 2, 3, and 4
 ---  Layer 424a20e9-efc1-4813-b031-791a4707dbcb Historical max tasmax admin level 2, 3, and 4
 ---  Layer 9a061750-3202-4004-8ab7-cc4bd4a13bf5 Historical min petmin admin level 2, 3, and 4
 ---  Layer e0692e43-a495-49d3-876b-8bd3b83ab5be Historical min tasmin admin level 2, 3, and 4
 ---  Layer aa3e0b4e-4087-4089-bf78-c71632e7ab57 Historical total coldsnap alarms admin level 2, 3, and 4
 ---  Layer 1b3119a3-38fd-4f71-befa-f4dd0b97dfbe Historical total coldsnap alerts admin level 2, 3, and 4
 ---  Layer 9c737ec4-d9eb-452c-b3e9-8aa04ebee7ed Historical total coldstress extreme admin level 2, 3, and 4
 ---  Layer 7dfdd38d-6feb-42bf-a4c3-24f4dc4ca896 Historical total coldstress moderate admin level 2, 3, and 4
 ---  Layer 960e4aeb-1534-4e4a-b1d4-eb05cddc5a9b Historical total coldstress strong admin leve
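
Each entry written to `layer_definitions.json` pairs a layer name and id with its API endpoint. The sketch below shows that record structure, using ids from the output above and an in-memory JSON round trip instead of `gcsfs`. `layer_record` is a hypothetical helper name, not something the notebook defines.

```python
import json

def layer_record(name, layer_id):
    """Build one layer entry with the endpoint derived from the id."""
    return {
        "name": name,
        "id": layer_id,
        "endpoint": f"https://api.skydipper.com/v1/layer/{layer_id}",
    }

summary = {
    "historical": {
        "name": "historical_total_zs_nuts-level-234",
        "id": "4212100b-d1da-47b4-9fdd-e2564ca955bb",
        "layers": [
            layer_record(
                "Historical max tasmax admin level 2, 3, and 4",
                "424a20e9-efc1-4813-b031-791a4707dbcb",
            )
        ],
    }
}

# Round trip through JSON to confirm the structure serialises cleanly
restored = json.loads(json.dumps(summary))
print(restored["historical"]["layers"][0]["endpoint"])
```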

### Export demo layer summary

In [None]:
# Export demo layer summary
import json
import gcsfs
fs = gcsfs.GCSFileSystem(project=gc_project, token=f"/root/.{gc_creds}")
with fs.open(f"{gcs_prefix}/zonal_stats/layer_definitions.json", 'r') as f:
  ls = json.load(f)
#print(ls)

def get_layer(ti, name):
  # Return the layer entries for time interval `ti` whose name matches exactly
  return [ly for ly in ls.get(ti).get('layers') if ly.get('name') == name]

out = {
    "historical": {
        "heatwaves": {
            "layers":  get_layer('historical', "Historical max tasmax admin level 2, 3, and 4")
        },
        "coldsnaps": {
            "layers":  get_layer('historical', "Historical min tasmin admin level 2, 3, and 4")
        },
        "thermalcomfort": {
            "layers": get_layer('historical', "Historical max petmax admin level 2, 3, and 4") + get_layer('historical', "Historical min petmin admin level 2, 3, and 4")
        }
    },
    "future-seasonal": {
        "heatwaves": {
            "layers": get_layer('future-seasonal', "Future seasonal max tasmax admin level 2, 3, and 4")
        },
        "coldsnaps": {
            "layers":  get_layer('future-seasonal', "Future seasonal min tasmin admin level 2, 3, and 4")
        }
    },
    "future-longterm": {
        "heatwaves": {
            "layers":  get_layer('future-longterm', "Future longterm max tasmax admin level 2, 3, and 4")
        },
        "coldsnaps": {
            "layers": get_layer('future-longterm', "Future longterm min tasmin admin level 2, 3, and 4")
        }
    }
}

# Export
print(json.dumps(out, indent=4))    
fs = gcsfs.GCSFileSystem(project=gc_project, token=f"/root/.{gc_creds}")
with fs.open(f"{gcs_prefix}/zonal_stats/demo_map_layer_definitions.json", 'w') as f:
  json.dump(obj=out, fp=f)

{
    "historical": {
        "heatwaves": {
            "layers": [
                {
                    "name": "Historical max tasmax admin level 2, 3, and 4",
                    "id": "424a20e9-efc1-4813-b031-791a4707dbcb",
                    "endpoint": "https://api.skydipper.com/v1/layer/424a20e9-efc1-4813-b031-791a4707dbcb"
                }
            ]
        },
        "coldsnaps": {
            "layers": [
                {
                    "name": "Historical min tasmin admin level 2, 3, and 4",
                    "id": "e0692e43-a495-49d3-876b-8bd3b83ab5be",
                    "endpoint": "https://api.skydipper.com/v1/layer/e0692e43-a495-49d3-876b-8bd3b83ab5be"
                }
            ]
        },
        "thermalcomfort": {
            "layers": [
                {
                    "name": "Historical max petmax admin level 2, 3, and 4",
                    "id": "9a07929a-d9c7-4ed6-b4c6-3c6e6c4d5ce3",
                    "endpoint": "https://api.skydip
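
`get_layer` returns an empty list when a name does not match exactly, so a typo in a layer name produces an empty `layers` entry without any warning. A stricter variant is sketched below. `get_layer_strict` and the sample `ls` dictionary are illustrative assumptions, not part of the notebook.

```python
def get_layer_strict(definitions, ti, name):
    """Return matching layer entries, raising if the interval or name is unknown."""
    if ti not in definitions:
        raise KeyError(f"unknown time interval: {ti!r}")
    matches = [ly for ly in definitions[ti]["layers"] if ly.get("name") == name]
    if not matches:
        raise LookupError(f"no layer named {name!r} in {ti!r}")
    return matches

# Sample definitions with one layer copied from the output above
ls = {
    "historical": {
        "layers": [
            {
                "name": "Historical max tasmax admin level 2, 3, and 4",
                "id": "424a20e9-efc1-4813-b031-791a4707dbcb",
            }
        ]
    }
}

found = get_layer_strict(ls, "historical", "Historical max tasmax admin level 2, 3, and 4")
print(found[0]["id"])

# A misspelled name now fails loudly instead of returning []
try:
    get_layer_strict(ls, "historical", "Historical max tasmx admin level 2, 3, and 4")
except LookupError as err:
    print(err)
```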

### Set ACLs to public

In [None]:
# Set ACLs to public
p = "zonal_stats"
set_acl_to_public(f"{gcs_prefix}/{p}/")

gsutil -m acl -r ch -u AllUsers:R gs://copernicus-climate/zonal_stats/
Set acl(s) sucsessful
