# speciesgrids demo

_Originally written by Pieter Provoost and available in the [iobis/hackathon repository.](https://github.com/iobis/hackathon/blob/master/notebooks/Python/speciesgrids_demo.ipynb)_

This notebook demonstrates how to query the speciesgrids data product using geopandas and duckdb. The examples read from the remote `s3://obis-products/speciesgrids/h3_7/` data source by default. For better performance, download a copy of the product and pass its local file path instead.
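
Switching between the remote source and a local copy can be kept in one place. The sketch below is one possible way to do that; the helper name `speciesgrids_source` and the local directory `data/speciesgrids/h3_7` are assumptions for illustration, not part of the data product or of any library.

```python
from pathlib import Path

# Remote location used throughout this notebook.
REMOTE_SOURCE = "s3://obis-products/speciesgrids/h3_7/"


def speciesgrids_source(local_dir=None):
    """Return the local copy if that directory exists, otherwise the remote S3 URI."""
    if local_dir is not None and Path(local_dir).is_dir():
        return str(local_dir)
    return REMOTE_SOURCE


# Hypothetical local path; adjust to wherever the product was downloaded.
source = speciesgrids_source("data/speciesgrids/h3_7")
```

The resulting `source` string can then be passed wherever the examples use the S3 URI.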

## Regional species list using duckdb

In the example below we use duckdb to query the gridded product with H3 resolution 7 to obtain a regional species list. The [geometry for our region of interest](https://wktmap.com/?b40bc054) is encoded as WKT and used in the duckdb query.
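
For a rectangular region, the WKT string can also be built from a bounding box instead of being copied from a map tool. The following is a minimal sketch using only string formatting; `bbox_to_wkt` is a hypothetical helper, and the coordinates are the ones from the region used in this notebook.

```python
def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Build a closed WKT polygon (lon lat order) from a bounding box."""
    ring = [
        (min_lon, min_lat),
        (min_lon, max_lat),
        (max_lon, max_lat),
        (max_lon, min_lat),
        (min_lon, min_lat),  # repeat the first vertex to close the ring
    ]
    coords = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"POLYGON (({coords}))"


wkt = bbox_to_wkt(2.694397, 51.187951, 3.013, 51.271367)
```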

In [None]:
# First install modules
!pip install duckdb pyarrow geopandas lonboard seaborn

In [None]:
import duckdb
import pyarrow.dataset as ds

wkt = "POLYGON ((2.694397 51.187951, 2.694397 51.271367, 3.013 51.271367, 3.013 51.187951, 2.694397 51.187951))"

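# In-memory duckdb connection with the spatial extension for the st_* functions
con = duckdb.connect()
con.sql("""
    install spatial;
    load spatial;
""")

# Expose the remote parquet dataset to duckdb under the name "dataset"
dataset = ds.dataset("s3://obis-products/speciesgrids/h3_7/", format="parquet")
con.register("dataset", dataset)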

df = con.execute(f"""
    select
        species,
        sum(records) as records,
        min(min_year) as min_year,
        max(max_year) as max_year,
        max(source_obis) as source_obis,
        max(source_gbif) as source_gbif
    from dataset
    where st_intersects(st_geomfromwkb(geometry), st_geomfromtext('{wkt}'))
    group by species
    order by sum(records) desc
""").fetchdf()
df

## Species distributions using geopandas

This example uses geopandas to query the gridded product with H3 resolution 7 to obtain species distributions for the genus Gadus, and lonboard to visualize the distributions on a map.

In [None]:
import geopandas
import lonboard
import seaborn as sns

filters = [("genus", "==", "Gadus")]
gdf = geopandas.read_parquet("s3://obis-products/speciesgrids/h3_7/", filters=filters)[["cell", "records", "geometry", "species"]]

def generate_colors(species):
    """Return one RGB color per row, based on the species name in that row."""
    unique_species = species.unique()
    palette = sns.color_palette("Paired", len(unique_species))
    # seaborn returns floats in [0, 1]; lonboard expects integers in [0, 255]
    rgb_colors = [[int(r * 255), int(g * 255), int(b * 255)] for r, g, b in palette]
    color_map = dict(zip(unique_species, rgb_colors))
    return lonboard.colormap.apply_categorical_cmap(species, color_map)

# One point per grid cell, colored by species
point_layer = lonboard.ScatterplotLayer.from_geopandas(gdf)
point_layer.get_radius = 10000
point_layer.radius_max_pixels = 2
point_layer.get_fill_color = generate_colors(gdf["species"])
lonboard.Map([point_layer])
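
The color assignment above maps each species name to one RGB triplet. The "Paired" palette contains 12 colors, and seaborn cycles through them when asked for more, so a genus with more species would get repeated colors. As a rough alternative, the sketch below spreads hues evenly using only the standard library; `categorical_colors` is a hypothetical helper, and the species names are illustrative rather than taken from the query result.

```python
import colorsys


def categorical_colors(categories):
    """Map each category to an [r, g, b] list (0-255) with evenly spaced hues."""
    categories = list(categories)
    n = max(len(categories), 1)
    color_map = {}
    for i, name in enumerate(categories):
        # Fixed saturation and value; only the hue changes per category
        r, g, b = colorsys.hsv_to_rgb(i / n, 0.65, 0.9)
        color_map[name] = [int(r * 255), int(g * 255), int(b * 255)]
    return color_map


# Illustrative input; in the notebook this would be gdf["species"].unique()
color_map = categorical_colors(
    ["Gadus morhua", "Gadus chalcogrammus", "Gadus macrocephalus"]
)
```

A dictionary of this shape could be passed as the second argument to `lonboard.colormap.apply_categorical_cmap` in place of the seaborn-based one.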