## Rough Agenda

* 00:00 – 00:15 Welcome and introductions
* 00:15 – 00:30 Technical setup in Colab or Jupyter
* 00:30 – 00:55 Introduction to spatial data types
* 00:55 – 01:30 Hands-on with spatial data – simple mapping and joins
* 01:30 – 01:45 Coordinate Systems
* 01:45 – 02:00 Break
* 02:00 – 02:45 Rasters and zonal statistics
* 02:45 – 03:15 Putting the pieces together – spatial joins, aggregations, and mapping
* 03:15 – 03:30 Wrap up and further resources 

# Goals and Approach
This tutorial assumes familiarity with tabular data, table joins, and the general data science concepts and data structures that most research software engineers already know. It focuses on adding spatial capabilities to that toolset. It intentionally uses geopandas and related libraries for data processing, so that spatial support acts as a layer on top of the tools you already use for working with data rather than as a whole new suite of concepts. It won't explain concepts such as tabular subsets (often called selections in GUI-based GIS tools) in depth, because these are common to data science approaches, but it will explain the new capabilities that spatial data provides.

## How is Spatial Data Different?
Spatial data will feel familiar to most experienced analysts of tabular and image data, but it differs conceptually in a few ways:
1. Tabular fields with spatial information can be connected to other tables through that spatial information. This leads to new kinds of joins based not on keys, but on *location*, with varied criteria for matching (for example, whether one feature contains, intersects, or lies within another). This is a powerful capability that opens up a whole new class of questions and answers.
2. Image data (not just visible-light images, but also gridded datasets) carry geographic referencing, which allows them to be overlaid in space with other image or tabular data.
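
To make the idea of joining on *location* concrete, here is a minimal pure-Python sketch. The `sites` and `zones` tables, their coordinates, and the helper names are all hypothetical, made up for illustration. It uses a simple ray-casting test as the matching criterion; in practice a library such as geopandas would handle this for you (for example with `geopandas.sjoin`), with more robust geometry handling than this sketch has.

```python
# Hypothetical "tables": point records and polygon records share no key column.
sites = [
    {"site": "A", "x": 1.0, "y": 1.0},
    {"site": "B", "x": 5.0, "y": 1.5},
    {"site": "C", "x": 9.0, "y": 9.0},
]
zones = [
    {"zone": "west", "ring": [(0, 0), (4, 0), (4, 4), (0, 4)]},
    {"zone": "east", "ring": [(4, 0), (8, 0), (8, 4), (4, 4)]},
]


def point_in_polygon(x, y, ring):
    """Ray casting: count the polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means "inside"
    return inside


def spatial_join(points, polygons):
    """Attach to each point the name of the first zone that contains it (or None)."""
    joined = []
    for p in points:
        match = next(
            (z["zone"] for z in polygons if point_in_polygon(p["x"], p["y"], z["ring"])),
            None,
        )
        joined.append({**p, "zone": match})
    return joined


joined = spatial_join(sites, zones)
for row in joined:
    print(row)
```

The rows are matched purely by where they are, not by a shared ID. Swapping the matching function (contains, intersects, within a distance) changes the kind of question the join answers.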

## Spatial Data Questions
* **What exists** at a certain location?
* **Where** (or when) are certain conditions satisfied?
* **What has changed** in a place over time?
* What **spatial patterns** exist?
* **What if** this condition occurred at this place? (modeling, hypothesis testing)
* **Where** do variables interact?
* We can try to answer questions of “why” and “how” too, but they require reformulating the question.


# Spatial Data Packages in Python
To start out, let's review a few of our friends for working with spatial data:

* **GDAL**: The Geospatial Data Abstraction Library, the software underlying most of the spatial operations we'll do. You usually don't need to call it directly, but you usually *do* need it installed.
* **fiona**: A Python package for reading and writing spatial data formats.
* **geopandas**: A package that extends pandas DataFrames with spatial data support (including spatial operations), crossing data science approaches with spatial information!
* **folium**: A package for making simple interactive maps with Leaflet. We'll use it in this tutorial to visualize the results of our code within the notebook.

# Spatial Data Types

In [35]:
# Let's start by just showing a basic Leaflet map with none of our own data on it

import folium
m = folium.Map(location=(37.365, -120.424), zoom_start=16)  # initialize a folium map centered on UC Merced
# We set its center to a (latitude, longitude) pair. Note the coordinate order: lat, lon is roughly Y, X, not X, Y
# ("roughly" because these are angular coordinates, not Cartesian ones).
# Each zoom level doubles the map scale (a factor of 2 per level), so the higher the number, the more zoomed in.

m   # show the map in the notebook
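
To get a feel for what "a factor of 2 per zoom level" means on the ground, here is a small sketch that estimates the ground resolution of a web map at a given latitude and zoom. It assumes the standard Web Mercator tiling with 256-pixel tiles; the function name `meters_per_pixel` and the constants are our own, not part of folium, and the result is an approximation.

```python
import math

EARTH_RADIUS_M = 6378137.0  # equatorial radius used by Web Mercator
TILE_SIZE_PX = 256          # assumed tile size in pixels


def meters_per_pixel(lat_deg, zoom):
    """Approximate ground resolution (meters per pixel) at a latitude and zoom level."""
    # At zoom 0 the whole equator fits across a single tile
    equator_res = 2 * math.pi * EARTH_RADIUS_M / TILE_SIZE_PX
    # Each zoom level halves the resolution; Mercator stretching shrinks it by cos(latitude)
    return equator_res * math.cos(math.radians(lat_deg)) / (2 ** zoom)


lat, lon = 37.365, -120.424  # same (lat, lon) order as the folium map above
for zoom in (10, 13, 16):
    print(f"zoom {zoom}: {meters_per_pixel(lat, zoom):.2f} m/pixel")
```

Note that the latitude, not the longitude, affects the result, which is one more reason to keep track of the coordinate order.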

In [None]:
# show a folium map of point data, line data, and polygon data

# Making Simple Maps

# Spatial Joins

# Coordinate Systems

# Raster Data

# Raster to Vector Joins -> Zonal Statistics!

# Putting the pieces together – spatial joins, aggregations, and mapping

# Further Resources
## Other tutorials
Here are some additional tutorials and resources that may be useful as points of reference:

1. [Data Carpentry's Geospatial Workshop](https://datacarpentry.org/geospatial-workshop/)

## Housekeeping and installations
First, some housekeeping. On Linux/Mac machines, dependencies are usually easy to build, but on Windows machines, things get weird quickly.

For this notebook, we'll make sure it runs fine in the Binder environment (yay!), but we want you to have some notes so that you don't throw your computer out the window the first time you try to apply what you've learned outside of the notebook.

Anaconda/conda environments are often a solution, but I've seen `geopandas` break conda environments (entirely) as often as I've seen it install successfully, so I recommend a different approach. Use your preferred environment manager and install the wheels built by Christopher Gohlke, who builds current wheels of common spatial software for current Python versions. After that, it's safe to `pip install geopandas`.

### Linux/Mac
On Linux/Mac, `python -m pip install geopandas` should get you the whole stack, though you may need to install some system packages for GDAL first (for example, `gdal-bin` and `libgdal-dev` on Debian/Ubuntu).

If you want to use geopandas on one of these platforms, take a look at both:
1. the [`gdal` installation instructions](https://pypi.org/project/GDAL/), and
2. the [`geopandas` installation instructions](https://geopandas.org/en/stable/getting_started/install.html)
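
Once an install finishes, on any platform, a quick sanity check can save some head-scratching. The sketch below uses only the standard library to report which of the packages mentioned in this tutorial are visible to the current Python environment. The package list and the helper name `installed_versions` are our own choices for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI (assumed; adjust to your own stack)
PACKAGES = ["geopandas", "fiona", "folium", "GDAL"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it isn't installed."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(PACKAGES)
for name, ver in report.items():
    print(f"{name:10s} {ver or 'not installed'}")
```

A missing entry isn't necessarily a problem. For example, binary wheels often bundle the GDAL library itself, so the separate `GDAL` Python bindings may not show up even when everything works.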

### On Windows
See these [notes on installing spatial Python on Windows](https://github.com/nickrsan/spatial_resources/blob/main/installing_spatial_python_windows.md).

All of this is subject to change. These notes aren't definitive, but they are meant to give you a place to start after this workshop. The spatial Python ecosystem is spread across many packages, so you're not stuck with geopandas, but it's powerful and may work well with many workflows RSEs already use.