# JupyterGIS demo

## Outline

* Aggregate gridded data based on vector regions (e.g. neighborhoods)
  * Not straightforward to do in Python
    * Design:
      * Start in a Notebook, prepared with Maryam's expertise
      * Loading GeoPandas, tools for Zonal Statistics
      * Programmatically create .jGIS document, add input data sources and output data sources.
      * Demonstrate collaboration in JGIS alongside the Notebook: annotation, adding a layer from the catalog, etc.
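
The aggregation named in the first bullet can be sketched in plain Python. This is only an illustration of the idea, assuming the vector regions have already been rasterized onto the same grid as the data (each cell holds a zone id, or `None` outside every region). The function name `zonal_mean` and the sample grids are hypothetical and do not come from any of the libraries used later.

```python
from collections import defaultdict


def zonal_mean(zones, values):
    """Return {zone_id: mean of the values whose cell carries that zone id}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for zone_row, value_row in zip(zones, values):
        for zone, value in zip(zone_row, value_row):
            # Skip cells outside every zone and cells with missing data
            if zone is None or value is None:
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical 3x3 grids: zone ids and the gridded values to aggregate
zones = [
    [1, 1, 2],
    [1, 2, 2],
    [None, 2, 2],
]
values = [
    [10.0, 20.0, 30.0],
    [30.0, 40.0, 50.0],
    [99.0, 60.0, None],
]

means = zonal_mean(zones, values)
print(means)
```

The hard part in practice is producing the zone grid from polygons and handling cells that straddle a boundary, which is what the tools in the rest of these notes take care of.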

### Resources

* https://www.geopythontutorials.com/notebooks/xarray_zonal_stats.html?utm_source=chatgpt.com#data-pre-processing
* Carl's class
  * https://espm-288.carlboettiger.info/tutorials/python/spatial-2.html
  * https://espm-288.carlboettiger.info/tutorials/python/spatial-1.html
  * https://espm-288.carlboettiger.info/tutorials/python/spatial-3.html
  * https://espm-288.carlboettiger.info/tutorials/python/spatial-4.html
* https://carpentries-incubator.github.io/geospatial-python/10-zonal-statistics.html
* https://medium.com/data-science/zonal-statistics-algorithm-with-python-in-4-steps-382a3b66648a
* https://automating-gis-processes.github.io/CSC18/lessons/L6/zonal-statistics.html

## From geopythontutorials.com

https://www.geopythontutorials.com/notebooks/xarray_zonal_stats.html?utm_source=chatgpt.com

New dependencies:

* rioxarray
* geocube
* xarray-spatial

### Download the data

In [None]:
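import os
from urllib.request import urlretrieve

data_folder = "data"

def download(url, data_folder):
    """Download url into data_folder unless the file is already there."""
    # urlretrieve does not create missing directories
    os.makedirs(data_folder, exist_ok=True)
    filename = os.path.join(data_folder, os.path.basename(url))
    if not os.path.exists(filename):
        local, _ = urlretrieve(url, filename)
        print('Downloaded ' + local)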

raster_file = 'chirps-v2.0.2021.tif'
zones_file = 'cb_2021_us_county_500k.zip'

files = [
    'https://data.chc.ucsb.edu/products/CHIRPS-2.0/global_annual/tifs/' + raster_file,
    'https://www2.census.gov/geo/tiger/GENZ2021/shp/' + zones_file,
]

for file in files:
    download(file, data_folder)

### Data pre-processing

In [None]:
import geopandas as gpd

zones_file_path = os.path.join(data_folder, zones_file)

zones_df = gpd.read_file(zones_file_path)
# TODO: Louisiana instead?
california_df = zones_df[zones_df['STATE_NAME'] == 'California'].copy()
california_df.iloc[:5, :5]

In [None]:
# GEOID is read as a string; cast it to int so it can be rasterized as a numeric measurement
california_df['GEOID'] = california_df.GEOID.astype(int)

In [None]:
import rioxarray as rxr

raster_filepath = os.path.join(data_folder, raster_file)
raster = rxr.open_rasterio(raster_filepath, mask_and_scale=True)
clipped = raster.rio.clip(california_df.geometry)
clipped

In [None]:
precipitation = clipped.sel(band=1)
precipitation

In [None]:
from geocube.api.core import make_geocube

california_raster = make_geocube(
    vector_data=california_df,
    measurements=['GEOID'],
    like=precipitation,
)
california_raster

In [None]:
from xrspatial import zonal_stats

stats_df = zonal_stats(zones=california_raster.GEOID, values=precipitation)
stats_df.iloc[:5]

In [None]:
stats_df['GEOID'] = stats_df['zone'].astype(int)

In [None]:
joined = california_df.merge(stats_df[['GEOID', 'mean']], on='GEOID')
joined.iloc[:5, -5:]
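
The two integer casts on `GEOID` exist so that the join keys on both sides of the merge have the same type. A minimal sketch of that reasoning with hypothetical sample values, assuming the shapefile stores GEOIDs as zero-padded strings and the zonal statistics table reports zone ids as floats:

```python
# Hypothetical sample values, not taken from the real datasets
county_geoids = ["06001", "06075"]  # as read from the shapefile
zone_ids = [6001.0, 6075.0]         # as assumed to come back from zonal statistics

# Casting both sides to int gives keys that compare equal
county_keys = [int(geoid) for geoid in county_geoids]
zone_keys = [int(zone) for zone in zone_ids]
keys_match = county_keys == zone_keys

# The cast drops the leading zero; zfill restores the 5-digit form if needed
restored = [str(key).zfill(5) for key in county_keys]
print(keys_match, restored)
```

Because the merge uses the default inner join, any county whose key has no match in the statistics table would be dropped from `joined` without a warning, so it is worth comparing row counts before and after.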

In [None]:
import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 1)
fig.set_size_inches(10,10)

legend_kwds = {
    'orientation': 'horizontal',  # Make the legend horizontal
    'shrink': 0.5,  # Reduce the size of the legend bar by 50%
    'pad': 0.05,  # Add some padding around the legend
    'label': 'Precipitation (mm)',  # Set the legend label (optional)
}
joined.plot(ax=ax, column='mean', cmap='Blues',
            legend=True, legend_kwds=legend_kwds)
ax.set_axis_off()
ax.set_title('Total Precipitation 2021 for California Counties')
plt.show()

In [None]:
joined.explore()

## From Carl's class

New dependencies (don't add to environment, this is just for accessing data):

* ibis-duckdb
* odc-stac

New dependencies (add to environment):

* exactextract

### Setting up raster data (NDVI)

https://espm-288.carlboettiger.info/tutorials/python/spatial-3.html

In [31]:
# `!export` only affects a throwaway subshell; %env sets the variable in the kernel process
%env CURL_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
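
A shell command launched from a notebook cell runs in its own subprocess, so a variable exported there is gone by the time later cells run. A minimal sketch of that behavior, using a hypothetical variable name `DEMO_CA_BUNDLE` and a placeholder path, and assuming a POSIX shell is available:

```python
import os
import subprocess

VAR = "DEMO_CA_BUNDLE"  # hypothetical name, used only for this demonstration
os.environ.pop(VAR, None)

# Exporting inside a child shell changes only that shell's environment
subprocess.run(f"export {VAR}=/tmp/ca.crt", shell=True, check=True)
seen_after_subshell = os.environ.get(VAR)

# Assigning through os.environ changes the current (kernel) process
os.environ[VAR] = "/tmp/ca.crt"
seen_after_assignment = os.environ.get(VAR)

print(seen_after_subshell, seen_after_assignment)
```

Whether this explains the `/vsicurl/` failure noted in the next cell is not established here; it only shows that the variable has to be set in the kernel process to be visible to libraries loaded there.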

In [32]:
import ibis
from ibis import _

con = ibis.duckdb.connect(extensions=["spatial"])
# TODO: Why doesn't this work?
# "does not exist in the file system, and is not recognized as a supported dataset name"
#redlines = (
#    con
#    .read_geo("/vsicurl/https://dsl.richmond.edu/panorama/redlining/static/mappinginequality.gpkg")
#    .filter(_.city == "New Haven", _.residential)
#)
redlines = (
    con
    .read_geo("./mappinginequality.gpkg")
    .filter(_.city == "New Haven", _.residential)
)
city = redlines.execute()
box = city.total_bounds
box

IOException: IO Error: GDAL Error (4): Failed to open file /home/jovyan/workshop-open-source-geospatial/modules/06-geojupyter/mappinginequality.gpkg: {"exception_type":"IO","exception_message":"Cannot open file \"/home/jovyan/workshop-open-source-geospatial/modules/06-geojupyter/mappinginequality.gpkg\": No such file or directory","errno":"2"}

LINE 1: ... "ibis_read_geo_mi2kfggcinh4xhgc2c7kig6cl4" AS SELECT * FROM ST_READ('/home/jovyan/workshop-open-source-geospatial/modules...
                                                                        ^

In [35]:
from pystac_client import Client

items = (
  Client.
  open("https://earth-search.aws.element84.com/v1").
  search(
    collections = ['sentinel-2-l2a'],
    bbox = box,
    datetime = "2024-06-01/2024-09-01",
    query={"eo:cloud_cover": {"lt": 20}}).
  item_collection()
)
items

In [36]:
import odc.stac

### Zonal statistics

https://espm-288.carlboettiger.info/tutorials/python/spatial-4.html