# Open and run analysis on multiple polygons <img align="right" src="../Supplementary_data/dea_logo.jpg">

* **Compatibility:** Notebook currently compatible with both the `NCI` and `DEA Sandbox` environments
* **Products used:** 
[ga_ls8c_ard_3](https://explorer.sandbox.dea.ga.gov.au/ga_ls8c_ard_3)

## Background
Many users need to run analyses on their own areas of interest. 
A common use case involves running the same analysis across multiple polygons in a vector file (e.g. ESRI Shapefile or GeoJSON). 
This notebook will demonstrate how to use a vector file and the Open Data Cube to extract satellite data from Digital Earth Australia corresponding to individual polygon geometries.

## Description
If we have a vector file containing multiple polygons, we can use the python package [geopandas](https://geopandas.org/) to open it as a `GeoDataFrame`. 
We can then iterate through each geometry and extract satellite data corresponding with the extent of each geometry. 
Further analysis can then be conducted on each resulting `xarray.Dataset`.

We can retrieve data for each polygon, perform an analysis like calculating NDVI and plot the data.

1. First we open the vector file as a `geopandas.GeoDataFrame`
2. Iterate through each polygon in the `GeoDataFrame`, and extract satellite data from DEA
3. Calculate NDVI as an example analysis on one of the extracted satellite timeseries
4. Plot NDVI for the polygon extent

***


## Getting started
To run this analysis, run all the cells in the notebook, starting with the "Load packages" cell. 

### Load packages
Note the import of `geometry` from `datacube.utils`: 
it is used to store the coordinate reference system of the incoming vector file in a format that the Digital Earth Australia query can understand.

In [1]:
# Install rioxarray if required

!pip install rioxarray



In [2]:
%matplotlib inline

import datacube
import rasterio.crs
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
from datacube.utils import geometry
import numpy as np

import sys
sys.path.append('../Scripts')
# from dea_datahandling import load_ard
from dea_bandindices import calculate_indices
from dea_plotting import rgb, map_shapefile
from dea_temporaltools import time_buffer
from dea_spatialtools import xr_rasterize

from dea_coastaltools import tidal_tag
from dea_coastaltools import tidal_stats

## Packages from the polygon labelling workflow (Sentinel_2_Cloud_Labelling.ipynb)

# %matplotlib widget

import functools
import os
# import sys
# import datacube
from datacube.storage.masking import make_mask
import datacube.utils.cog
# import geopandas as gpd
import ipyleaflet
from IPython.display import display
import ipywidgets as widgets
import matplotlib.cm
import matplotlib.colors
# import matplotlib.pyplot as plt
# import numpy as np
import odc.ui
from odc.ui import with_ui_cbk
import rasterio.features
import rioxarray
from shapely.geometry import shape
import skimage.color as colour
import skimage.io
import sklearn.metrics
from tqdm.notebook import tqdm
import xarray
import pickle

# sys.path.append("../Scripts")
from dea_dask import create_local_dask_cluster
from dea_datahandling import load_ard, array_to_geotiff
from dea_plotting import display_map
# from dea_plotting import rgb



### Connect to the datacube
Connect to the datacube database to enable loading Digital Earth Australia data.

In [3]:
dc = datacube.Datacube(app='Analyse_multiple_polygons')
create_local_dask_cluster()

Client: Scheduler: tcp://127.0.0.1:43779, Dashboard: /user/cp/proxy/8787/status
Cluster: Workers: 1, Cores: 2, Memory: 14.18 GB


## Analysis parameters

* `time_of_interest` : Enter a time, in units YYYY-MM-DD, around which to load satellite data e.g. `'2019-01-01'`
* `time_buff` : A buffer of a given duration (e.g. days) around the time_of_interest parameter, e.g. `'30 days'`
* `vector_file` : A path to a vector file (ESRI Shapefile or GeoJSON)
* `attribute_col` : A column in the vector file used to label the output `xarray` datasets containing satellite images. Each row of this column should have a unique identifier
* `products` : A list of product names to load from the datacube e.g. `['ga_ls7e_ard_3', 'ga_ls8c_ard_3']`
* `measurements` : A list of band names to load from the satellite product e.g. `['nbart_red', 'nbart_green']`
* `resolution` : The spatial resolution of the loaded satellite data e.g. for Landsat, this is `(-30, 30)`
* `output_crs` : The coordinate reference system/map projection to load data into, e.g. `'EPSG:3577'` to load data in the Albers Equal Area projection
* `align` : How to align the x, y coordinates with respect to each pixel. Landsat Collection 3 should be centre aligned (`align = (15, 15)`) if data is loaded in its native UTM zone projection, e.g. `'EPSG:32756'` 
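
The `time_of_interest` and `time_buff` parameters together define a date range. As a rough illustration of that idea using only the standard library, a symmetric buffer in days could be computed as below. The helper name `buffer_time_range` is hypothetical; it is not the `time_buffer` function imported from `dea_temporaltools`, whose exact behaviour may differ.

```python
from datetime import date, timedelta


def buffer_time_range(time_of_interest, buffer_days):
    """Return (start, end) ISO date strings centred on time_of_interest.

    Hypothetical helper for illustration only.
    """
    centre = date.fromisoformat(time_of_interest)
    delta = timedelta(days=buffer_days)
    return (centre - delta).isoformat(), (centre + delta).isoformat()


time_range = buffer_time_range('2019-01-01', 30)
print(time_range)  # ('2018-12-02', '2019-01-31')
```

The resulting tuple has the same `(start, end)` shape as the `time` entry used in a datacube query.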

In [4]:
# Setup the general query


time = ('2013-01-01', '2020-08-01')
products= ['ga_ls8c_ard_3']
bands = ['nbart_red', 'nbart_green', 'nbart_blue', 'nbart_nir', 'nbart_swir_1']
resolution = (-30, 30)
output_crs = 'EPSG:3577'
align = (0, 0)

# Query
query = {
#     'x': lon, # Use query.update({str(key):variable}) for the classification coords
#     'y': lat,
    'time': time,
    'measurements' : bands,
    'output_crs': 'EPSG:3577',
    'resolution': (-30, 30),
    'group_by': 'solar_day'
}

# Designate dask chunks
# It doesn't really matter how big the chunks we load are, as long as time ~ 1.
chunks = {"time": 1, "x": 3000, "y": 3000}

# Choice: polygon/class source
* ### If the polygons are already provided in a vector file, use the *Predetermined polygons* workflow below
* ### If the polygons are to be drawn by the user, use the *Polygon classification* workflow below

# *Predetermined polygons*
User defines:
* vector polygon file

Automatically defined:
* coords (from vector file extents)
* class ints

In [5]:
### Run this cell with predetermined vector file sets

# vector_file = '../Supplementary_data/Analyse_multiple_polygons/multiple_polys.shp'
vector_file = 'QISMCQ_polygons_testarea.shp'
attribute_col = 'id'

# Read in the polygon vector file
gdf = gpd.read_file(vector_file)

In [6]:
# Attribute each class with an integer value
# Unique class names, in order of first appearance
classes = gdf['BRD_HAB'].unique().tolist()

# Integer code for each class name
class_codes = {name: code for code, name in enumerate(classes)}

# Integer class code for each polygon, in row order
num_list = [class_codes[name] for name in gdf['BRD_HAB']]

# Key to interpret the integer attribute for each class: [[name, code], ...]
val = [[name, code] for name, code in class_codes.items()]
        
print ('The attribute values for each class are as follows: ' + str(val))

# Update the geodataframe of vector polygons with the integer attribution for each class
gdf['id'] = num_list

# Map the shapefiles from imported vector set
map_shapefile(gdf, attribute=attribute_col)

The attribute values for each class are as follows: [['Intertidal grass-herb-sedge-other succulent', 0], ['Intertidal mangroves and other trees & shrubs', 1], ['Intertidal consolidated substrate', 2], ['Intertidal seagrass', 3], ['Intertidal unconsolidated substrate', 4]]


Label(value='')

Map(center=[-24.085816122499978, 151.55457361750007], controls=(ZoomControl(options=['position', 'zoom_in_text…

In [None]:
# Load data for predetermined polygons - may need some debugging to correspond to above cell

# Dictionary to save results 
results = {}
# results2 = {}

# Loop through polygons in geodataframe and extract satellite data
for index, row in gdf.iterrows():
    
    print(f'Feature: {index + 1}/{len(gdf)}')
    print (gdf['BRD_HAB'].values[index])
    print (str(index))
    print (str(row))
    
    if not (str(row[attribute_col]) in results.keys()):
        results[str(row[attribute_col])] = {}
    
    # Extract the feature's geometry as a datacube geometry object
    geom = geometry.Geometry(geom=row.geometry, crs=gdf.crs)
    
    # Update the query to include our geopolygon
    query.update({'geopolygon': geom}) 
    
    # Load landsat
    ds = load_ard(dc=dc, 
                  products=products,
                  # min_gooddata=0.99,  # only take uncloudy scenes
                  ls7_slc_off = False,                  
                  **query)
    
    # Tidally tag datasets
    ds = tidal_tag(ds, ebb_flow=True)
    
    # Generate a polygon mask to keep only data within the polygon
    mask = xr_rasterize(gdf.iloc[[index]], ds)
    
    # Mask dataset to set pixels outside the polygon to `NaN`
    ds = ds.where(mask)
    
    # Append results to a dictionary using the attribute
    # column as a key
#     results.update({str(row[attribute_col]): ds}) ## Original. I think it only saves one polygon per class
#     results.update({str(row[attribute_col]):{str(index): ds}}) # New dict of dicts with each polygon recorded under each class.
    results[str(row[attribute_col])][str(index)] = ds
    
#         ## Tidally tag datasets
#     ds1 = tidal_tag(ds, ebb_flow=True)
    
#         # Generate a polygon mask to keep only data within the polygon
#     mask1 = xr_rasterize(gdf.iloc[[index]], ds1)
    
#         # Mask dataset to set pixels outside the polygon to `NaN`
#     ds1 = ds1.where(mask1)
    
#     # Append results to a dictionary using the attribute column as key
#     results2[str(row[attribute_col])][str(index)] = ds1 

    print (row[attribute_col], index)
    print ('----------------------')

Feature: 1/394
Intertidal grass-herb-sedge-other succulent
0
OBJECTID                                                     11
CONSOL                                                        C
DOM_TYPE                                                      1
DOM_LABEL                   Grass-herb-sedge (undifferentiated)
CO_TYPES                                                   None
TIDE_ZONE                                            Intertidal
BRD_HAB             Intertidal grass-herb-sedge-other succulent
Shape_Leng                                           0.00706932
Shape_Area                                          1.27758e-06
geometry      POLYGON ((151.5643725000001 -24.12794199999996...
id                                                            0
Name: 0, dtype: object
Finding datasets
    ga_ls8c_ard_3
Applying pixel quality/cloud mask
Loading 159 time steps
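
The loop above stores each polygon's dataset in a dictionary of dictionaries, keyed first by class and then by polygon index. A minimal sketch of that structure follows, using placeholder strings in place of `xarray.Dataset` objects and made-up class/index pairs. It shows how `dict.setdefault` could stand in for the explicit key check.

```python
# Placeholder (class id, polygon index) pairs; in the notebook these come
# from row[attribute_col] and the GeoDataFrame index.
polygons = [(0, 0), (0, 1), (1, 2), (0, 3), (2, 4)]

results = {}
for class_id, index in polygons:
    ds = f'dataset for polygon {index}'  # stand-in for the loaded xarray.Dataset
    # setdefault creates the inner dict the first time a class is seen
    results.setdefault(str(class_id), {})[str(index)] = ds

# Number of polygons stored under each class
counts = {class_id: len(polys) for class_id, polys in results.items()}
print(counts)  # {'0': 3, '1': 1, '2': 1}
```

Keying by class first keeps every polygon of a class together, so later cells can loop over one class at a time.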


In [None]:
# save results dict (optional)


# A context manager closes the file even if pickling fails
with open('resultsdict _' + str(time[0]), 'wb') as resultsdict:
    pickle.dump(results, resultsdict)

In [None]:
#  Open (pickled) results dict (optional)

# with open('resultsdict _' + str(time[0]), 'rb') as resultsdict:
#     results = pickle.load(resultsdict)
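
As a small self-contained check of the save/load pattern, the sketch below pickles a placeholder dictionary to a temporary directory and reads it back. The dictionary contents and the file name are illustrative only. Note that `pickle.load` expects an open binary file object rather than a path.

```python
import os
import pickle
import tempfile

# Placeholder for the nested results dictionary
example_results = {'0': {'0': [0.1, 0.2]}, '1': {'5': [0.3]}}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'resultsdict_example.pkl')

    # Write the dictionary in binary mode
    with open(path, 'wb') as f:
        pickle.dump(example_results, f)

    # Read it back from an open file object
    with open(path, 'rb') as f:
        restored = pickle.load(f)

print(restored == example_results)  # True
```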

# *Polygon classification*
User defines:
* coords
* dataset_name for file naming


In [None]:
# # Define the coordinate extent for the area of interest

# # Gulf of Carpentaria
# # lon = (138.8998, 139.0989)
# # lat = (-16.8341, -16.9264)

# # Queensland East
# lon = (150.65, 150.8)
# lat = (-22.457, -22.654)

# # Update the query
# query.update({'x': lon})
# query.update({'y': lat})

# # Filename for saved datasets
# dataset_name = 'Test_05'


In [None]:
# # Confirm area of interest

# # Display the area of interest given the coords from the query
# display_map(x=query['x'], y=query['y'])#, crs=query['output_crs'])

In [None]:
# query

In [None]:
# # Load datasets

# # ds = dc.load(product=products,
# # #              progress_cbk=with_ui_cbk(),  # Add a progress bar like this if you decide to use eager loading.
# #              dask_chunks=chunks,
# # #              measurements=bands,
# #              **query)
# # Load landsat
# ds = load_ard(dc=dc, 
#               products=products,
#               min_gooddata=0.99,  # only take uncloudy scenes             
# #               group_by='solar_day',
#               **query)

#         ## Tidally tag datasets
# ds = tidal_tag(ds, ebb_flow=True)

# # Attribute and sort for tide height
# lowest = ds.tide_height.quantile([0.05]).values
# ds_lowtide = ds.where(ds.tide_height <= lowest, drop=True)

In [None]:
# # Load ITEMv2 and mask out everything above max tide height 
# # (later, update with dynamic DEACL high tide lines)

# # Remove 'measurements' key from query as it is not a valid 
# # measurement of the ITEMv2 product

# query.pop('measurements')
# query.pop('time')

# # Load ITEMv2 for the area of interest
# ds_ITEM = dc.load(
#     'item_v2',
# #     dask_chunks = chunks,
#     **query)

# # Filter dataset for ITEM values greater than 0 and less than 9
# ds_ITEM = ds_ITEM.where(ds_ITEM.relative > 0)
# ds_ITEM = ds_ITEM.where(ds_ITEM.relative < 9)

# # Remove the time dimension from ITEM to enable masking
# ds_ITEM = ds_ITEM.squeeze(dim='time', drop=True)

# # Optional: view the loaded, filtered ITEM dataset
# # %matplotlib inline # Confirm whether plot shows using %mpl widget
# ds_ITEM.relative.plot()

# ITEM integer interpretation
Single Band Integer Raster:

0 – Always water

1 – Exposed at lowest 0-10% of the observed tidal range

2 – Exposed at 10-20% of the observed tidal range

3 – Exposed at 20-30% of the observed tidal range

4 – Exposed at 30-40% of the observed tidal range

5 – Exposed at 40-50% of the observed tidal range

6 – Exposed at 50-60% of the observed tidal range

7 – Exposed at 60-70% of the observed tidal range

8 – Exposed at 70-80% of the observed tidal range

9 – Exposed at highest 80-100% of the observed tidal range (land)

-6666 – No Data
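
The class values above can be captured in a small lookup table. The sketch below is illustrative only: the names `ITEM_CLASSES` and `is_intertidal` are hypothetical and not part of any DEA package. It labels each value and flags those strictly between 0 and 9, the range kept by the (commented-out) ITEM filtering step in this notebook.

```python
# Lookup of ITEM 'relative' values, as listed above
ITEM_CLASSES = {
    0: 'Always water',
    9: 'Exposed at highest 80-100% of the observed tidal range (land)',
    -6666: 'No Data',
}
# Classes 1-8 each cover a 10% slice of the observed tidal range
for value in range(1, 9):
    lower, upper = (value - 1) * 10, value * 10
    prefix = 'lowest ' if value == 1 else ''
    ITEM_CLASSES[value] = f'Exposed at {prefix}{lower}-{upper}% of the observed tidal range'


def is_intertidal(value):
    """True for classes 1-8, i.e. not always water, land or No Data."""
    return 0 < value < 9


pixels = [0, 1, 5, 8, 9, -6666]
print([is_intertidal(v) for v in pixels])
# [False, True, True, True, False, False]
print(ITEM_CLASSES[3])
# Exposed at 20-30% of the observed tidal range
```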

In [None]:
# # Three band colour combinations for viewing imagery
# rgb_bands = bands[:3]
# snr_bands = ['nbart_swir_1', 'nbart_nir', 'nbart_red'] # For false colour

# # Add ITEM mask as a coordinate array to imagery ds
# ds_lowtide.coords['ITEM'] = (('y', 'x'), ds_ITEM.relative)

# # Select image to classify
# # (view a filmstrip of low-tide images - with and without masking)

# # Plot unmasked true colour filmstrip of low tide datasets
# rgb(ds_lowtide, bands=['nbart_red', 'nbart_green', 'nbart_blue'], col='time')

# # Plot unmasked false colour filmstrip of low tide datasets
# rgb(ds_lowtide, bands=['nbart_swir_1', 'nbart_nir', 'nbart_red'], col='time')
# #     plt.title('Index: ' + str(x) + ', Time: ' + str(ds_lowtide.time[x].values))

# # Plot ITEM-masked true colour filmstrip of low tide datasets
# rgb(ds_lowtide.where(ds_lowtide.ITEM<9), bands=['nbart_red', 'nbart_green', 'nbart_blue'], col='time')
# # plt.title('title')

# # Plot ITEM-masked false colour filmstrip of low tide datasets
# rgb(ds_lowtide.where(ds_lowtide.ITEM<9), bands=['nbart_swir_1', 'nbart_nir', 'nbart_red'], col='time')
# # plt.title('title')

In [None]:
# # Plot images as above with adjusted titles
# for x in range (0,len(ds_lowtide.time)):
#     rgb(ds_lowtide.isel(time=x), bands=['nbart_red', 'nbart_green', 'nbart_blue'], size=4)
#     plt.title('Index: ' + str(x) + ', Time: ' + str(ds_lowtide.time[x].values))
#     rgb(ds_lowtide.isel(time=x), bands=['nbart_swir_1', 'nbart_nir', 'nbart_red'], size=4)
#     plt.title('Index: ' + str(x) + ', Time: ' + str(ds_lowtide.time[x].values))
#     rgb(ds_lowtide.where(ds_lowtide.ITEM<9).isel(time=x), bands=['nbart_red', 'nbart_green', 
#                                                          'nbart_blue'], size=4)
#     plt.title('Index: ' + str(x) + ', Time: ' + str(ds_lowtide.time[x].values))

In [None]:
# # Classify image - this cell may not be necessary

# item_masked = ds_lowtide.where(ds_lowtide.ITEM<9)
# # item_masked

# # Based on the above imagery, choose a timeslice (based upon its index value in the figure title)
# # to classify

# index = 1
# # ds_lowtide.isel(time=index)


### Pinched from Sentinel_2_Cloud_Labelling.ipynb
Now let's label the nominated image. We'll use an ipyleaflet widget based on `Imagery_on_web_map.ipynb` and `interactive_polygons.ipynb`. 

Different classes are labelled separately below.

## This code sets up the widget:

In [None]:
# # Generate a list of low_tide images from which to classify
# dates = ds_lowtide.time.values

# selected_dates=[]
# for n in dates:
#     print (n)
#     selected_dates.append(n)

# print('Based on your index value identified above, the imagery to be classified will be: ' + str(selected_dates[index]))

In [None]:
# # Define the functionality to get the interactive map, overlay it with nominated image/s,
# # create the polygon classification tool and build buttons to flick between imagery colour combos

# # Set up the interactive map
# def get_interactive_map(times, index, cover_type='green', clamp=3000):
#     # Set up the map.
#     bbox = ds.geobox.extent.to_crs('EPSG:4326').boundingbox
#     zoom = odc.ui.zoom_from_bbox(bbox)
#     center = (bbox.bottom + bbox.top) * 0.5, (bbox.right + bbox.left) * 0.5
#     m = ipyleaflet.Map(
#         center=center,
#         zoom=zoom,
#         scroll_wheel_zoom=True,  # Allow zoom with the mouse scroll wheel
#         layout=widgets.Layout(
#             width='600px',   # Set Width of the map to 600 pixels, examples: "100%", "5em", "300px"
#             height='600px',  # Set height of the map
#         ))
#     # Add a false colour image
#     def add_image_layer_snr(time):
#         # Add the false colour image.

#         img_layer_snr = odc.ui.mk_image_overlay(
#             item_masked.sel(time=time).drop('time').drop('tide_height').drop('ebb_flow').drop('ITEM')\
#                 .rio.reproject(dst_crs='EPSG:3857', shape=ds_lowtide.nbart_red.shape[1:], 
#                 resampling=rasterio.warp.Resampling.bilinear), 
#             bands=snr_bands, #select snr_bands for false colour, rgb_bands for true colour
#             clamp=clamp,
#             fmt='jpeg')

#         m.add_layer(img_layer_snr)
        
#         return img_layer_snr
#     # Add a true colour image
#     def add_image_layer_rgb(time):
#         # Add the true colour image.

#         img_layer_rgb = odc.ui.mk_image_overlay(
#             item_masked.sel(time=time).drop('time').drop('tide_height').drop('ebb_flow').drop('ITEM')\
#                 .rio.reproject(dst_crs='EPSG:3857', shape=ds_lowtide.nbart_red.shape[1:], 
#                 resampling=rasterio.warp.Resampling.bilinear), 
#             bands=rgb_bands, #select snr_bands for false colour, rgb_bands for true colour
#             clamp=clamp,
#             fmt='jpeg')
        
#         m.add_layer(img_layer_rgb)
        
#         return img_layer_rgb

#     # Add images to the map for the nominated imagery date
#     idx = index
#     img_layer_snr = add_image_layer_snr(times[idx])
#     img_layer_rgb = add_image_layer_rgb(times[idx])
     
#     # Add the drawing controls for the classes
#     fill_colours = {'green': '#BEBEFF', 'beige': '#BEFFBE', 'brown': '#FFBEBE'}
#     feature_collection = {
#         'type': 'FeatureCollection',
#         'features': [],
#     }
#     for type_ in [cover_type]:  
#         draw_control = ipyleaflet.DrawControl()
#         draw_control.polygon = {
#             "shapeOptions": {
#                 "fillColor": fill_colours[type_],
#                 "color": fill_colours[type_],
#             },
#             "allowIntersection": False,
#             'title': type_,
#         }
#         # Disable polyline and circlemarker controls so that only polygon remains.
#         draw_control.polyline = {}
#         draw_control.circlemarker = {}
#         def handle_draw(self, action, geo_json):
#             geo_json['properties']['type'] = type_
#             geo_json['properties']['index'] = times[idx]
#             feature_collection['features'].append(geo_json),
#         draw_control.on_draw(handle_draw)
#         draw_control.edit = False
#         draw_control.remove = False
#         m.add_control(draw_control)
        
#     draw_control.edit = False  # Until syncing works.
#     draw_control.remove = False  # Until syncing works.
    
    
#     # Add button to show true colour image
#     button_next = widgets.Button(
#         description='True colour',
#         button_style='info',
#         icon='next')
#     def on_click(self):
#         nonlocal idx
#         nonlocal img_layer_rgb
#         nonlocal img_layer_snr
#         m.remove_layer(img_layer_snr)
#         img_layer_rgb = add_image_layer_rgb(times[idx])
#     button_next.on_click(on_click)

#         # Add button to show false colour image
#     button_back = widgets.Button(
#         description='False colour',
#         button_style='info',
#         icon='back')
#     def on_click(self):
#         nonlocal idx
#         nonlocal img_layer_rgb
#         nonlocal img_layer_snr
#         m.remove_layer(img_layer_rgb)
#         img_layer_snr = add_image_layer_snr(times[idx])
#     button_back.on_click(on_click)
    
#     buttons=[button_next, button_back]
    
#     return widgets.VBox([widgets.HBox(buttons), m]), feature_collection

# # Convert datetimes into strings following polygon identification
# def save_fc(fc, fn):
#     features = []
#     for feature in fc['features']:
#         feature = feature.copy()
#         feature['properties']['index'] = str(feature['properties']['index'])
#         features.append(feature)
        
#     if not os.path.exists(fn):
#         gdf = gpd.GeoDataFrame.from_features(features)
#         gdf.to_file(fn)
#     else:
#         raise RuntimeError('Label file already exists')


## You can then run the widget to get the interactive map, as well as a list that will be populated with polygons.

In [None]:
# # Label first class type

# map_, fc_green = get_interactive_map(selected_dates, index, cover_type='green')

# print('Draw a polygon around green cover classes of interest (e.g. seagrass) for further interrogation')
# print ('Imagery date: ' + str(selected_dates[index]))
# map_

In [None]:
# save_fc(fc_green, f'green_{dataset_name}.shp')
# gdf_green = gpd.read_file(f'green_{dataset_name}.shp')

### Label next class type

In [None]:
# print('Please label areas of beige (e.g. sand).')
# map_, fc_beige = get_interactive_map(selected_dates, index, cover_type='beige')
# map_

In [None]:
# save_fc(fc_beige, f'beige_{dataset_name}.shp')
# gdf_beige = gpd.read_file(f'beige_{dataset_name}.shp')

# Label next class type

In [None]:
# print('Please label areas of brown (e.g. mud)')
# map_, fc_brown = get_interactive_map(selected_dates, index, cover_type='brown')
# map_

In [None]:
# save_fc(fc_brown, f'brown_{dataset_name}.shp')
# gdf_brown = gpd.read_file(f'brown_{dataset_name}.shp')

## We now have 3 GeoDataFrames which we can easily combine into one for the whole dataset.

In [None]:
# # Combine geodataframes for each class

# gdf_brown = gpd.read_file(f'brown_{dataset_name}.shp')
# gdf_beige = gpd.read_file(f'beige_{dataset_name}.shp')
# gdf_green = gpd.read_file(f'green_{dataset_name}.shp')

In [None]:
# cover_labels_gdf = gpd.pd.concat([gdf_green, gdf_beige, gdf_brown], ignore_index=True)  # here, index means the numeric row index

In [None]:
# # pd.concat fails to produce a GDF in some versions of GPD; https://gis.stackexchange.com/questions/162659/joining-concat-list-of-similar-dataframes-in-geopandas
# assert isinstance(cover_labels_gdf, gpd.GeoDataFrame)

In [None]:
# plt.figure()
# for i in range(2):
#     ax = plt.subplot(1, 1, 1)
#     timestamp = sorted(set(cover_labels_gdf['index']))[i]
#     subgdf = cover_labels_gdf[cover_labels_gdf['index'] == timestamp]
#     subgdf[subgdf['type'] == 'green'].plot(alpha=0.5, ax=ax, color='blue')
#     subgdf[subgdf['type'] == 'beige'].plot(alpha=0.2, ax=ax, color='red')
#     subgdf[subgdf['type'] == 'brown'].plot(alpha=0.5, ax=ax, color='grey')
# plt.axis('off')

In [None]:
# gdf = subgdf

# attribute_col = 'id'


In [None]:
# # Need to attribute an integer per class (type)

# # Rename 'type' column to 'class' to match the style of this workflow
# gdf.rename(columns = {'type':'class'}, inplace = True)

# # Attribute each class with an integer value
# val = (gdf['class'].unique()).tolist()

# num_list = []
# attr_key = []

# d = 0
# for x in range (len(gdf)):
#     for d in range (len(val)):        
#         if gdf['class'].values[x] == str(val[d]):
#             num_list.append(d)
#         # Create a key to interpret the integer attribute for each class
#         for y in num_list:
#             if y not in attr_key:
#                     attr_key.append(y)

# val = [[el] for el in val]
# for x in attr_key:
#     val[x].append(attr_key[x])
        
# print ('The attribute values for each class are as follows: ' + str(val))

# # Update the geodataframe of vector polygons with the integer attribution for each class
# gdf['id'] = num_list

# # Set the crs of the newly classified geodataframe
# gdf = gdf.set_crs("EPSG:4326")

### Look at the structure of the vector file
Import the file and take a look at how the file is structured so we understand what we are iterating through. 
There are two polygons in the file:

In [None]:
# gdf = gpd.read_file(vector_file)
# # gdf.head()


We can then plot the `geopandas.GeoDataFrame` using the function `map_shapefile` to make sure it covers the area of interest we are concerned with:

In [None]:
map_shapefile(gdf, attribute=attribute_col)

### Create a datacube query object
We then create a dictionary that will contain the parameters that will be used to load data from the DEA data cube:

> **Note:** We do not include the usual `x` and `y` spatial query parameters here, as these will be taken directly from each of our vector polygon objects.

In [None]:
# # query = {'time': (time_buffer(time_of_interest, buffer=time_buff)),
# query = {'time': time,
#          'measurements': bands,
#          'resolution': resolution,
#          'output_crs': output_crs,
#          'align': align,
#          }

# query
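
Since each polygon supplies its own spatial extent, one option is to build a fresh query per polygon rather than updating a shared dictionary in place. The sketch below uses placeholder strings for the geometries (in the notebook these would be `datacube.utils.geometry.Geometry` objects) and a shortened band list.

```python
# Base query shared by every polygon (values taken from the parameters above)
base_query = {
    'time': ('2013-01-01', '2020-08-01'),
    'measurements': ['nbart_red', 'nbart_nir'],
    'output_crs': 'EPSG:3577',
    'resolution': (-30, 30),
}

# Placeholder geometries keyed by polygon index
geometries = {'0': 'polygon 0 geometry', '1': 'polygon 1 geometry'}

# Unpacking into a new dict leaves base_query untouched
queries = {key: {**base_query, 'geopolygon': geom} for key, geom in geometries.items()}

print('geopolygon' in base_query)  # False
print(queries['1']['geopolygon'])  # polygon 1 geometry
```

This keeps the shared parameters in one place and avoids a stale `geopolygon` from a previous polygon lingering in the query.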

## Loading satellite data

Here we will iterate through each row of the `geopandas.GeoDataFrame` and load satellite data.  The results will be appended to a dictionary object which we can later index to analyse each dataset.

In [None]:
# # This cell is likely superseded by the new notebook structure

# # Dictionary to save results 
# results = {}
# # results2 = {}

# # Loop through polygons in geodataframe and extract satellite data
# for index, row in gdf.iterrows():
    
#     print(f'Feature: {index + 1}/{len(gdf)}')
#     print (gdf['class'].values[index])
#     print (str(index))
#     print (str(row))
    
#     if not (str(row[attribute_col]) in results.keys()):
#         results[str(row[attribute_col])] = {}
    
#     # Extract the feature's geometry as a datacube geometry object
#     geom = geometry.Geometry(geom=row.geometry, crs=gdf.crs)
    
#     # Update the query to include our geopolygon
#     query.update({'geopolygon': geom}) 
    
#     # Load landsat
#     ds = load_ard(dc=dc, 
#                   products=products,
#                   # min_gooddata=0.99,  # only take uncloudy scenes
#                   ls7_slc_off = False,                  
#                   group_by='solar_day',
#                   **query)
    
#             ## Tidally tag datasets
#     ds = tidal_tag(ds, ebb_flow=True)
    
#     # Generate a polygon mask to keep only data within the polygon
#     mask = xr_rasterize(gdf.iloc[[index]], ds)
    
#     # Mask dataset to set pixels outside the polygon to `NaN`
#     ds = ds.where(mask)
    
#     # Append results to a dictionary using the attribute
#     # column as a key
# #     results.update({str(row[attribute_col]): ds}) ## Original. I think it only saves one polygon per class
# #     results.update({str(row[attribute_col]):{str(index): ds}}) # New dict of dicts with each polygon recorded under each class.
#     results[str(row[attribute_col])][str(index)] = ds
    
# #         ## Tidally tag datasets
# #     ds1 = tidal_tag(ds, ebb_flow=True)
    
# #         # Generate a polygon mask to keep only data within the polygon
# #     mask1 = xr_rasterize(gdf.iloc[[index]], ds1)
    
# #         # Mask dataset to set pixels outside the polygon to `NaN`
# #     ds1 = ds1.where(mask1)
    
# #     # Append results to a dictionary using the attribute column as key
# #     results2[str(row[attribute_col])][str(index)] = ds1 

#     print (row[attribute_col], index)
#     print ('----------------------')

---
## Further analysis

Our `results` dictionary will contain `xarray` objects labelled by the unique `attribute column` values we specified in the `Analysis parameters` section:

In [None]:
# Create a second dictionary that contains only datasets filtered by nominated tide range

results2 = {}

for k in results:
    
    if not (str(k) in results2.keys()):
        results2[str(k)] = {}
    
    for kk in results[k]:
        ds = results[k][kk] #.where(ds.ebb_flow == "Ebb", drop=True)
        lowest_20 = ds.tide_height.quantile([0.20]).values ### Need a workaround for this as lowest 20% is calculated per ds and wont be consistent across all observations. However, perhaps this isn't a problem??
#         print (lowest_20)
        results2[k][kk] = ds.where(ds.tide_height <= lowest_20, drop=True)
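
The comment in the cell above notes that the 20th-percentile threshold is computed separately for each dataset. The plain-Python sketch below uses made-up tide heights and a hypothetical `percentile` helper based on linear interpolation between sorted values. It illustrates how a per-polygon threshold can differ from one computed across all observations.

```python
def percentile(values, q):
    """Quantile q (0-1) with linear interpolation between sorted values."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Made-up tide heights (metres) observed over two polygons
tides = {
    'polygon_a': [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5],
    'polygon_b': [0.0, 0.5, 1.0, 1.5, 2.0, 2.5],
}

# One threshold per polygon
per_polygon = {name: percentile(heights, 0.20) for name, heights in tides.items()}

# A single threshold shared by all polygons
all_heights = [h for heights in tides.values() for h in heights]
shared = percentile(all_heights, 0.20)

# Observations kept for each polygon under each approach
kept_per_polygon = {n: [h for h in hs if h <= per_polygon[n]] for n, hs in tides.items()}
kept_shared = {n: [h for h in hs if h <= shared] for n, hs in tides.items()}

print(per_polygon, shared)  # {'polygon_a': -0.5, 'polygon_b': 0.5} 0.0
print(kept_per_polygon)     # {'polygon_a': [-1.0, -0.5], 'polygon_b': [0.0, 0.5]}
print(kept_shared)          # {'polygon_a': [-1.0, -0.5, 0.0], 'polygon_b': [0.0]}
```

With per-polygon thresholds every polygon keeps a similar share of its observations, while a shared threshold keeps the same absolute tide heights everywhere. Which is preferable depends on whether the comparison is relative or absolute.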

In [None]:
results2.keys()

for k in results2.keys():
    print (val[int(k)], len(results2[k]))

Enter one of those values below to index our dictionary and conduct further analysis on the satellite timeseries for that polygon.

In [None]:
class_key = '0'
# print (results[class_key].keys())
polygon_index_key = '0'

### Plot an RGB image     -     needs debugging
We can now use the `dea_plotting.rgb` function to plot our loaded data as a three-band RGB plot:

In [None]:
# # To show the results of a single polygon:

# # Would be useful to see the outline of the whole polygon to compare pixel retrievals

# rgb(results2[class_key][polygon_index_key], col='time', size=4)
# rgb(results2['1'][polygon_index_key], col='time', size=4)

In [None]:
# To show the results for every polygon in a given class:

# Would be useful to see the outline of the whole polygon to compare pixel retrievals

# for k in results2[class_key]:
#     rgb(results2[class_key][k], col = 'time')#, size=4)

### Calculate NDVI and plot
We can also apply analyses to data loaded for each of our polygons.
For example, we can calculate the Normalised Difference Vegetation Index (NDVI) to identify areas of growing vegetation:

In [None]:
# zonal_stats = 'mean'
# indice = 'NDVI'
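
For reference, NDVI is the normalised difference of the near-infrared and red bands, (NIR - Red) / (NIR + Red). The sketch below applies that formula to a few made-up reflectance pairs in plain Python. In the notebook the calculation is done by `calculate_indices`; the helper name `ndvi` here is hypothetical.

```python
import math


def ndvi(nir, red):
    """NDVI for one pixel; NaN where the bands sum to zero or either is NaN."""
    total = nir + red
    if math.isnan(total) or total == 0:
        return math.nan
    return (nir - red) / total


# Made-up surface reflectance values as (nir, red) pairs
pixels = {'vegetation': (4000, 1000), 'bare sand': (3000, 3000), 'water': (200, 600)}
for label, (nir, red) in pixels.items():
    print(label, round(ndvi(nir, red), 2))
# vegetation 0.6
# bare sand 0.0
# water -0.5
```

Values near 1 indicate dense green vegetation, values near 0 bare surfaces, and negative values typically water.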

In [None]:
# For 'results' ds, calculate NDVI for all pixels inside each polygon then calculate a zonal stat for the polygon
# This code works for a single time step for each polygon

polydrill = {}

for k in results:
    
    if not (str(k) in polydrill.keys()):
        polydrill[str(k)] = {}
    
    for kk in results[k]:
        
        ds = results[k][kk]
#         print(ds)
        
        # calculate ndvi for pixels inside the polygon
        ds = calculate_indices(ds, index='NDVI', collection='ga_ls_3')
        
        # calculate a zonal stat for the polygon: the mean over all valid (non-NaN) pixels per timestep
        # (reducing over both dimensions at once avoids the unequal pixel weighting of a mean of per-column means)
        ds = ds.NDVI.mean(dim=['x', 'y'])
        
        polydrill[str(k)][str(kk)] = ds
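
Because pixels outside each polygon are set to `NaN`, the way the spatial mean is taken matters. Averaging along `y` and then along `x` gives every column equal weight regardless of how many valid pixels it contains, so the result can differ from the mean over all valid pixels. The plain-Python sketch below illustrates the difference with a made-up 2 x 3 grid of NDVI values; the `nanmean` helper is hypothetical.

```python
import math


def nanmean(values):
    """Mean of the non-NaN values, or NaN if there are none."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


nan = math.nan
# Made-up NDVI grid for one timestep, indexed [y][x];
# NaN marks pixels outside the polygon
grid = [
    [0.25, 0.75, nan],
    [0.50, nan, nan],
]

# Mean along y for each x column, then mean of those column means
column_means = [nanmean(column) for column in zip(*grid)]
mean_of_means = nanmean(column_means)

# Mean over all valid pixels at once
pixel_mean = nanmean([v for row in grid for v in row])

print(mean_of_means, pixel_mean)  # 0.5625 0.5
```

In xarray, passing both dimensions in one call, e.g. `.mean(dim=['x', 'y'])`, reduces over all valid pixels together.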

In [None]:
# For 'results2' ds (low tide imagery only), calculate NDVI for all pixels inside each polygon then 
# calculate a zonal stat for the polygon
# This code works for a single time step for each polygon

polydrill2 = {}

for k in results2:
    
    if not (str(k) in polydrill2.keys()):
        polydrill2[str(k)] = {}
    
    for kk in results2[k]:
        
        ds = results2[k][kk]
#         print(ds)
       
        # To work around the calculate_indices function which was stalling on the additional 
        # coastal variables:
        # drop tide_height and ebb_flow variables
#         tide_height = ds['tide_height']
#         ebb_flow = ds['ebb_flow']
#         ds.drop_vars(names = ('tide_height', 'ebb_flow'))
#         print (ds)
        
        # calculate ndvi for pixels inside the polygon
        ds = calculate_indices(ds, index='NDVI', collection='ga_ls_3')
        
        # calculate a zonal stat for the polygon: the mean over all valid (non-NaN) pixels per timestep
        # (reducing over both dimensions at once avoids the unequal pixel weighting of a mean of per-column means)
        ds = ds.NDVI.mean(dim=['x', 'y'])
        
        # reattach tide_height and ebb_flow variables
#         ds['tide_height']= tide_height
#         ds['ebb_flow'] = ebb_flow
        
        polydrill2[str(k)][str(kk)] = ds

In [None]:
# Show zonal mean NDVI per polygon, per timestep for every polygon in the class (all tide heights)

fig, axes = plt.subplots(nrows=(len(val)),sharex='all', figsize=(10,10))

for x in val:
    for kk in polydrill[str(x[1])]:
        polydrill[str(x[1])][kk].plot.line(marker='o', linewidth = 0, ax=axes[x[1]])
        axes[x[1]].set_title(x[0])

In [None]:
# Show zonal mean NDVI per polygon, per timestep for every low-tide polygon in the class

fig, axes = plt.subplots(nrows=(len(val)),sharex='all', figsize=(10,10))

for x in val:
    for kk in polydrill2[str(x[1])]:
        polydrill2[str(x[1])][kk].plot.line(marker='o', linewidth = 0, ax=axes[x[1]])
        axes[x[1]].set_title(x[0])

***

## Additional processing



### Testing cells

**Goal:** attach a polygon-level (not pixel-level) NDVI value (mean, median, or whatever is designated above) to the polygon set for plotting (heat map style).

In [None]:
# Map the shapefiles from imported vector set
# map_shapefile(gdf, attribute=attribute_col)

In [None]:
# results2.keys()

for k in results2.keys():
#     print (type(k))
    print (val[int(k)], len(results2[k]))
    
print (len(gdf))

In [None]:
# For every polygon, extract list pairs on the polygon id and temporal mean and std of the nominated indice

lsmean = []
lsstd = []

for k in polydrill2:
    for kk in polydrill2[k]:
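        # float() converts the 0-d DataArray into a plain number so the new columns are numeric
        lsmean.append([int(kk), float(polydrill2[k][kk].mean())])
        lsstd.append([int(kk), float(polydrill2[k][kk].std())])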

# Sort the list by polygon id to match up to the original polygon gdf
# Separate the sorted polygon ids from the indice statistic to build into a pd.DataFrame
lsmean=sorted(lsmean)
lsstd = sorted(lsstd)

indicemean = []
indicestd = []
polyid = []

for x in lsmean:
#     print (x[1])
    polyid.append(x[0])
    indicemean.append(x[1])
    
for x in lsstd:
    indicestd.append(x[1])

# Build a pd.DataFrame from the sorted polygon id and indice statistics. Nominate a name for the new column.

indexstats = pd.DataFrame(indicemean, index = polyid, columns = ['Indice mean'])
indexstats['Indice std'] = None
indexstats.loc[polyid, 'Indice std'] = indicestd

# Merge the indice statistic for each polygon into the original polygon gdf

gdf = gdf.merge(indexstats, on=indexstats.index)

In [None]:
# gdf

In [None]:
# map_shapefile(gdf_tests, attribute='Indice std')
# map_shapefile(gdf, attribute=attribute_col)

In [None]:
# Map the shapefiles from imported vector set. Nominate the new indice statistic column
# map_shapefile(gdf_test, attribute=attribute_col)
# map_shapefile(gdf, attribute='Indice std')

#Next steps:
# drop consolidated and unconsolidated polygons
# add variance metric on indice (e.g. NDVI sd)
# plot variance in nominated indice

### Plan:
- take polygon NDVI values from `polydrill2` and calculate the mean and sd across all combined timesteps
- attach the NDVI mean and sd to the correct polygon ids in `gdf`
- plot: `map_shapefile(gdf, attribute=attribute_col)` using the new NDVI sd column as `attribute`

In [None]:
# # Drop consolidated and unconsolidated substrate class polygons
# Update: this doesn't seem to be working. Continue to generate class dict from gdf then pop these two classes
# # x=[]
# # for index, row in gdf_test.iterrows():
# for x in range (len(gdf)):      
#     if gdf['BRD_HAB'].values[x] == "Intertidal consolidated substrate":
# #         print (gdf_test['BRD_HAB'].values[index])
#         gdf.drop(x, inplace=True)
#     else:
#         if gdf['BRD_HAB'].values[x] == "Intertidal unconsolidated substrate":
#             gdf.drop(x, inplace=True)
# #         print (row)
# #         x.append(index)
# #     print (row)
# # print(len(gdf))

In [None]:
# map_shapefile(gdf, attribute='Indice mean')

In [None]:
# gdf

In [None]:
##### Separate classes from the master gdf to view/plot individually

# Define function to append multiple values to single keys in dict
# Reference: https://thispointer.com/python-how-to-add-append-key-value-pairs-in-dictionary-using-dict-update/#6
def append_value(d, key, value):
    # Always store values in a list, so that a class with a single polygon
    # is still a list of rows rather than one bare row.
    # (The parameter is named `d` to avoid shadowing the built-in `dict`.)
    d.setdefault(key, []).append(value)
        
# create dict to store classes separately (to create class specific gdf's later on)
classdict = {}
classdict['columns'] = gdf.columns

# group gdf classes into dict keys (for later conversion to class specific gdfs)
for x in gdf['key_0']:
    # Each row belongs to exactly one class, so use its class name directly as the key
    append_value(classdict, gdf['BRD_HAB'].values[x], gdf.values[x])

In [None]:
# Remove columns, consolidated and unconsolidated substrate classes from this analysis

classdictkeys = classdict.pop('columns')
classdict.pop('Intertidal unconsolidated substrate')
classdict.pop('Intertidal consolidated substrate')


In [None]:
## find a way to avoid hard-coding these class names for their new gdf's
for key in classdict:
    print(key)

# Hardcoded gdfs:
grasses = gpd.GeoDataFrame()
mangroves = gpd.GeoDataFrame()
seagrass = gpd.GeoDataFrame()

for item in classdictkeys:
#     print (item)
    grasses[item] = []
    mangroves[item] = []
    seagrass[item] = []
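
One possible way to avoid hard-coding the class names is to group the table by its class column and keep one sub-table per class in a dictionary. The sketch below uses a plain `pandas.DataFrame` with made-up rows; the class labels, the `excluded` set and the `by_class` name are assumptions for illustration. The same `groupby` call should also work on a `GeoDataFrame`, although that has not been tested against this notebook's data.

```python
import pandas as pd

# Hypothetical attribute table standing in for the merged GeoDataFrame
table = pd.DataFrame({
    "BRD_HAB": [
        "Intertidal seagrass",
        "Intertidal mangroves",
        "Intertidal seagrass",
        "Intertidal consolidated substrate",
    ],
    "Indice std": [0.12, 0.05, 0.20, 0.01],
})

# Classes to leave out of the analysis
excluded = {
    "Intertidal consolidated substrate",
    "Intertidal unconsolidated substrate",
}

# One sub-table per remaining class, keyed by class name
by_class = {
    name: group
    for name, group in table.groupby("BRD_HAB")
    if name not in excluded
}

for name, group in by_class.items():
    print(name, len(group))
```

Each value in `by_class` keeps all original columns, so no column-by-column rebuilding is needed.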

In [None]:
grasses['key_0'] = [x[0] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['OBJECTID']= [x[1] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['CONSOL']= [x[2] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['DOM_TYPE']= [x[3] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['DOM_LABEL']= [x[4] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['CO_TYPES']= [x[5] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['TIDE_ZONE']= [x[6] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['BRD_HAB']= [x[7] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['Shape_Leng']= [x[8] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['Shape_Area']= [x[9] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['geometry']= [x[10] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['id']= [x[11] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['Indice mean']= [x[12] for x in classdict['Intertidal grass-herb-sedge-other succulent']]
grasses['Indice std']= [x[13] for x in classdict['Intertidal grass-herb-sedge-other succulent']]

grasses.set_index('key_0', inplace=True)
grasses = grasses.set_crs("EPSG:4326")

# print(grasses['Indice std'].std())

map_shapefile(grasses, attribute='Indice std', continuous=True)

In [None]:
mangroves['key_0'] = [x[0] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['OBJECTID']= [x[1] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['CONSOL']= [x[2] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['DOM_TYPE']= [x[3] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['DOM_LABEL']= [x[4] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['CO_TYPES']= [x[5] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['TIDE_ZONE']= [x[6] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['BRD_HAB']= [x[7] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['Shape_Leng']= [x[8] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['Shape_Area']= [x[9] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['geometry']= [x[10] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['id']= [x[11] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['Indice mean']= [x[12] for x in classdict['Intertidal mangroves and other trees & shrubs']]
mangroves['Indice std']= [x[13] for x in classdict['Intertidal mangroves and other trees & shrubs']]

In [None]:
mangroves.set_index('key_0', inplace=True)
mangroves = mangroves.set_crs("EPSG:4326")

In [None]:
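# Note: the keyword must be spelt `continuous`. A misspelt keyword (e.g. `continous`) is silently
# absorbed by **style_kwargs, so the map falls back to categorical class codes, which likely
# explains legend values far above the true maximum std.
map_shapefile(mangroves, attribute='Indice std', continuous=True)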
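
Because `map_shapefile` accepts arbitrary `**style_kwargs` (see its signature later in this notebook), a misspelt keyword such as `continous` is not rejected. The sketch below uses a hypothetical stand-in function, `plot_layer`, to show the general Python behaviour; it is not the `map_shapefile` implementation.

```python
def plot_layer(attribute, continuous=False, **style_kwargs):
    """Hypothetical stand-in for a plotting function that forwards extra keywords as style options."""
    mode = "continuous" if continuous else "categorical"
    return mode, style_kwargs

# Correct spelling: the flag is recognised
print(plot_layer("Indice std", continuous=True))

# Misspelt keyword: no error is raised; it lands in style_kwargs and the default applies
print(plot_layer("Indice std", continous=True))
```

The second call runs without complaint but returns the categorical mode, with the misspelt keyword sitting unused among the style options.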

In [None]:
seagrass['key_0'] = [x[0] for x in classdict['Intertidal seagrass']]
seagrass['OBJECTID']= [x[1] for x in classdict['Intertidal seagrass']]
seagrass['CONSOL']= [x[2] for x in classdict['Intertidal seagrass']]
seagrass['DOM_TYPE']= [x[3] for x in classdict['Intertidal seagrass']]
seagrass['DOM_LABEL']= [x[4] for x in classdict['Intertidal seagrass']]
seagrass['CO_TYPES']= [x[5] for x in classdict['Intertidal seagrass']]
seagrass['TIDE_ZONE']= [x[6] for x in classdict['Intertidal seagrass']]
seagrass['BRD_HAB']= [x[7] for x in classdict['Intertidal seagrass']]
seagrass['Shape_Leng']= [x[8] for x in classdict['Intertidal seagrass']]
seagrass['Shape_Area']= [x[9] for x in classdict['Intertidal seagrass']]
seagrass['geometry']= [x[10] for x in classdict['Intertidal seagrass']]
seagrass['id']= [x[11] for x in classdict['Intertidal seagrass']]
seagrass['Indice mean']= [x[12] for x in classdict['Intertidal seagrass']]
seagrass['Indice std']= [x[13] for x in classdict['Intertidal seagrass']]

seagrass.set_index('key_0', inplace=True)
seagrass = seagrass.set_crs("EPSG:4326")

map_shapefile(seagrass, attribute='Indice std', continuous=True)

In [None]:
##### Monday: explore what is actually being plotted in the maps. Why is the polygon colouring different when continuous is set to True???
# It would be awesome to see a legend too
# Seagrass will include effects of tide range selection (here, it is set to bottom 20% which is likely why the NDVI values range down to -0.5)

In [None]:
# Testing cell

from ipyleaflet import Map, LegendControl

m = Map(center=(-10,-45), zoom=4)

legend = LegendControl({"low":"#FAA", "medium":"#A55", "High":"#500"}, name="Legend", position="bottomright")
m.add_control(legend)

m

In [None]:
# import modules to enable dea_plotting.py functions

# Import required packages
import math
import folium
import calendar
import ipywidgets
import numpy as np
import geopandas as gpd
import matplotlib as mpl
import matplotlib.patheffects as PathEffects
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from datetime import datetime
from pyproj import Proj, transform
from IPython.display import display
from matplotlib.colors import ListedColormap
from mpl_toolkits.axes_grid1.inset_locator import inset_axes
from mpl_toolkits.axes_grid1 import make_axes_locatable
from ipyleaflet import Map, Marker, Popup, GeoJSON, basemaps, Choropleth, LegendControl
from skimage import exposure
from branca.colormap import linear
from odc.ui import image_aspect

from matplotlib.animation import FuncAnimation
import pandas as pd
from pathlib import Path
from shapely.geometry import box
from skimage.exposure import rescale_intensity
from tqdm.auto import tqdm
import warnings

In [None]:
def _degree_to_zoom_level(l1, l2, margin=0.0):
    
    """
    Helper function to set zoom level for `display_map`
    """
    
    degree = abs(l1 - l2) * (1 + margin)
    zoom_level_int = 0
    if degree != 0:
        zoom_level_float = math.log(360 / degree) / math.log(2)
        zoom_level_int = int(zoom_level_float)
    else:
        zoom_level_int = 18
    return zoom_level_int
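
As a quick check of the helper above, the sketch below repeats the same logic under a different name (`degree_to_zoom_level`, chosen here so the example runs on its own) and evaluates it for a few made-up latitude extents.

```python
import math

def degree_to_zoom_level(l1, l2, margin=0.0):
    """Same logic as `_degree_to_zoom_level`, repeated so the example is self-contained."""
    degree = abs(l1 - l2) * (1 + margin)
    if degree == 0:
        return 18
    return int(math.log(360 / degree) / math.log(2))

print(degree_to_zoom_level(-35.0, -36.0))               # extent of 1 degree
print(degree_to_zoom_level(-35.0, -36.0, margin=-0.5))  # extent shrunk to 0.5 degrees
print(degree_to_zoom_level(-35.0, -35.0))               # zero extent falls back to 18
```

These calls return 8, 9 and 18 respectively: halving the extent increases the zoom level by one, which is why a negative `margin` zooms the map in.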

In [None]:
# # Testing cell

# from ipyleaflet import LegendControl

# # map_shapefile function from dea_plotting.py script

# def map_shapefile(gdf,
#                   attribute,
#                   continuous=False,
#                   colormap='YlOrRd_09',
#                   basemap=basemaps.Esri.WorldImagery,
#                   default_zoom=None,
#                   hover_col=True,
#                   **style_kwargs):
#     """
#     Plots a geopandas GeoDataFrame over an interactive ipyleaflet 
#     basemap, with features coloured based on attribute column values. 
#     Optionally, can be set up to print selected data from features in 
#     the GeoDataFrame. 

#     Last modified: February 2020

#     Parameters
#     ----------  
#     gdf : geopandas.GeoDataFrame
#         A GeoDataFrame containing the spatial features to be plotted 
#         over the basemap.
#     attribute: string, required
#         A required string giving the name of any column in the
#         GeoDataFrame you wish to have coloured on the choropleth.
#     continuous: boolean, optional
#         Whether to plot data as a categorical or continuous variable. 
#         Defaults to remapping the attribute which is suitable for 
#         categorical data. For continuous data set `continuous` to True.
#     colormap : string, optional
#         Either a string giving the name of a `branca.colormap.linear` 
#         colormap or a `branca.colormap` object (for example, 
#         `branca.colormap.linear.YlOrRd_09`) that will be used to style 
#         the features in the GeoDataFrame. Features will be coloured 
#         based on the selected attribute. Defaults to the 'YlOrRd_09' 
#         colormap.
#     basemap : ipyleaflet.basemaps object, optional
#         An optional `ipyleaflet.basemaps` object used as the basemap for 
#         the interactive plot. Defaults to `basemaps.Esri.WorldImagery`.
#     default_zoom : int, optional
#         An optional integer giving a default zoom level for the 
#         interactive ipyleaflet plot. Defaults to None, which infers
#         the zoom level from the extent of the data.
#     hover_col : boolean or str, optional
#         If True (the default), the function will print  values from the 
#         GeoDataFrame's `attribute` column above the interactive map when 
#         a user hovers over the features in the map. Alternatively, a 
#         custom shapefile field can be specified by supplying a string
#         giving the name of the field to print. Set to False to prevent 
#         any attributes from being printed.
#     **style_kwargs :
#         Optional keyword arguments to pass to the `style` parameter of
#         the `ipyleaflet.Choropleth` function. This can be used to 
#         control the appearance of the shapefile, for example 'stroke' 
#         and 'weight' (controlling line width), 'fillOpacity' (polygon 
#         transparency) and 'dashArray' (whether to plot lines/outlines
#         with dashes). For more information:
#         https://ipyleaflet.readthedocs.io/en/latest/api_reference/choropleth.html

#     """

#     def on_hover(event, id, properties):
#         with dbg:
#             text = properties.get(hover_col, '???')
#             lbl.value = f'{hover_col}: {text}'
            
#     # Verify that attribute exists in shapefile   
#     if attribute not in gdf.columns:
#         raise ValueError(f"The `attribute` {attribute} does not exist "
#                          f"in the geopandas.GeoDataFrame. "
#                          f"Valid attributes include {gdf.columns.values}.")
        
#     # If hover_col is True, use 'attribute' as the default hover attribute.
#     # Otherwise, hover_col will use the supplied attribute field name
#     if hover_col and (hover_col is True):
#         hover_col = attribute
        
#     # If a custom string is supplied to hover_col, check this exists
#     elif hover_col and (type(hover_col) == str):
#         if hover_col not in gdf.columns:
#                 raise ValueError(f"The `hover_col` field {hover_col} does "
#                                  f"not exist in the geopandas.GeoDataFrame. "
#                                  f"Valid attributes include "
#                                  f"{gdf.columns.values}.")

#     # Convert to WGS 84 and GeoJSON format
#     gdf_wgs84 = gdf.to_crs(epsg=4326)
#     data_geojson = gdf_wgs84.__geo_interface__
    
#     # If continuous is False, remap categorical classes for visualisation
#     if not continuous:
        
#         # Zip classes data together to make a dictionary
#         classes_uni = list(gdf[attribute].unique())
#         classes_clean = list(range(0, len(classes_uni)))
#         classes_dict = dict(zip(classes_uni, classes_clean))
        
#         # Get values to colour by as a list 
#         classes = gdf[attribute].map(classes_dict).tolist()  
    
#     # If continuous is True then do not remap
#     else: 
        
#         # Get values to colour by as a list
#         classes = gdf[attribute].tolist()  

#     # Create the dictionary to colour map by
#     keys = gdf.index
#     id_class_dict = dict(zip(keys.astype(str), classes))  

#     # Get centroid to focus map on
#     lon1, lat1, lon2, lat2 = gdf_wgs84.total_bounds
#     lon = (lon1 + lon2) / 2
#     lat = (lat1 + lat2) / 2

#     if default_zoom is None:

#         # Calculate default zoom from latitude of features
#         default_zoom = _degree_to_zoom_level(lat1, lat2, margin=-0.5)

#     # Plot map
#     m = Map(center=(lat, lon),
#             zoom=default_zoom,
#             basemap=basemap,
#             layout=dict(width='800px', height='600px'))
    

    
#     # Define default plotting parameters for the choropleth map. 
#     # The nested dict structure sets default values which can be 
#     # overwritten/customised by `style_kwargs` values
#     style_kwargs = dict({'fillOpacity': 0.8}, **style_kwargs)

#     # Get colormap from either string or `branca.colormap` object
#     if type(colormap) == str:
#         colormap = getattr(linear, colormap)
    
#     # Create the choropleth
#     choropleth = Choropleth(geo_data=data_geojson,
#                             choro_data=id_class_dict,
#                             colormap=colormap,
#                             style={**style_kwargs})
    
#     # If the vector data contains line features, they will not be
#     # coloured by default. To resolve this, we need to manually copy
#     # across the 'fillColor' attribute to the 'color' attribute for each
#     # feature, then plot the data as a GeoJSON layer rather than the
#     # choropleth layer that we use for polygon data.
#     linefeatures = any(x in ['LineString', 'MultiLineString'] 
#                        for x in gdf.geometry.type.values)
#     if linefeatures:
    
#         # Copy colour from fill to line edge colour
#         for i in keys:
#             choropleth.data['features'][i]['properties']['style']['color'] = \
#             choropleth.data['features'][i]['properties']['style']['fillColor']

#         # Add GeoJSON layer to map
#         feature_layer = GeoJSON(data=choropleth.data)
#         m.add_layer(feature_layer)
        
#     else:
        
#         # Add Choropleth layer to map
#         m.add_layer(choropleth)

#     # If a column is specified by `hover_col`, print data from the
#     # hovered feature above the map
#     if hover_col and not linefeatures:
        
#         # Use choropleth object if data is polygon
#         lbl = ipywidgets.Label()
#         dbg = ipywidgets.Output()
#         choropleth.on_hover(on_hover)
#         display(lbl)
        
#     elif hover_col:
        
#         lbl = ipywidgets.Label()
#         dbg = ipywidgets.Output()
#         feature_layer.on_hover(on_hover)
#         display(lbl)

#     # Set the legend conditions and round to 2 decimal places
#     legend = LegendControl({np.around(np.quantile(classes, 0.00),2): colormap.rgb_hex_str(0.00),
#                             np.around(np.quantile(classes, 0.25),2): colormap.rgb_hex_str(0.25),
#                             np.around(np.quantile(classes, 0.50),2): colormap.rgb_hex_str(0.50),
#                             np.around(np.quantile(classes, 0.75),2): colormap.rgb_hex_str(0.75),
#                             np.around(np.quantile(classes, 1.00),2): colormap.rgb_hex_str(1.00) }, 
#                            name=attribute, position="bottomleft")
#     m.add_control(legend)

#     # Display the map
#     display(m)
# #     return colormap
#     return len(id_class_dict)
# #     print (sorted(classes)[0], sorted(classes). ,sorted(classes)[-1])


In [None]:
# # Testing cell

map_shapefile(seagrass, attribute='Indice std', continuous=True)


In [None]:
colormap = 'YlOrRd_09'

if isinstance(colormap, str):
    colormap = getattr(linear, colormap)

In [None]:
colormap.rgb_hex_str(0.5)

In [None]:
# Get values to colour by as a list
classes = seagrass['Indice std'].tolist()
type(classes)

In [None]:
from statistics import median
median(classes)

In [None]:
sorted(classes)[-1]

In [None]:
np.around((np.quantile(classes, 1)),2)

In [None]:
mangroves['Indice std'].max()