# SnowExSQL Database 

 
__Tutorial Author Micah'__: [Micah Sandusky](https://github.com/micah-prime)

__Tutorial Author Micah_o__: [Micah Johnson](https://github.com/micahjohnson150)

[SnowEx](https://snow.nasa.gov/campaigns/snowex) has introduced a unique opportunity to study snow water equivalent (SWE) in an unprecedented way, but with more data come new challenges.

![examples](./images/snowex_database/data_examples.png)


<!-- 
<img src="https://snowexsql.readthedocs.io/en/latest/_images/gallery_overview_example_12_0.png" alt="Grand Mesa Overview" width="1000px"> -->

**The SnowEx database is a resource that shortcuts the time it takes to ask cross-dataset questions**

      
- Standardizing diverse data
- Cross referencing data
- Provenance!
- Added GIS functionality
- Connect w/ ArcGIS or QGIS!
- **CITABLE** 

    * [*2022 - Estimating snow accumulation and ablation with L-band interferometric synthetic aperture radar (InSAR)*](https://tc.copernicus.org/articles/17/1997/2023/tc-17-1997-2023-discussion.html)
    * [*2024 - Thermal infrared shadow-hiding in GOES-R ABI imagery: snow and forest temperature observations from the SnowEx 2020 Grand Mesa field campaign*](https://tc.copernicus.org/articles/18/2257/2024/)
      
      

## What's in it?

* Snow pits - Density, hardness profiles, grain types + sizes
* Manual snow depths - TONS of depths (Can you say spirals?)
* Snow Micropenetrometer (SMP) profiles - (Subsampled to every 100th)
* Snow depth + SWE rasters from ASO Inc.
* GPR
* Pit site notes
* Camera Derived snow depths
* Snow off DEM from USGS 3DEP 
* And almost all the associated metadata

## Technically, what is it?

* PostgreSQL database
* PostGIS extension
* Supports vector and raster data
* And a host of GIS operations
* AND NOW WITH API!


### So what's the catch?
New tech can create barriers...

```{figure} ./images/snowex_database/pits_not_bits.jpg
:scale: 20 %
:alt: pits not bits
```

### TL;DR Do less wrangling, do more crunching. 

## How do I get at this magical box of data?
* [SQL](https://www.postgresql.org/docs/13/tutorial-sql.html) 
* [snowexsql](https://github.com/SnowEx/snowexsql/) <span style="font-size:20pt;"> **&#8592; 😎**</span>


### Welcome to API Land

In [None]:
from snowexsql.api import PointMeasurements

df = PointMeasurements.from_filter(type="depth", instrument='pit ruler', limit=100)
df.plot(column='value', cmap='jet', vmin=10, vmax=150)
df

### Old Ways / Advanced Users 
Advanced queries can be made using SQL or SQLAlchemy under the hood. 

See previous presentations

Engine objects, session objects, and a crash course in ORM, oh my! 
* [Hackweek 2021](https://snowex-2021.hackweek.io/tutorials/database/index.html)
* [Hackweek 2022](https://snowex-2022.hackweek.io/tutorials/database/index.html)

# How is the Database Structured?

The goal of the database is to hold as much of the SnowEx data as possible in one place and make it easier to 
do research with. With that in mind, follow the steps below to see how the database is structured.


## Where do datasets live (i.e. tables)?

Data in the database lives in 1 of 4 places. 


```{figure} ./images/snowex_database/structure.png
:scale: 50 %
:alt: Structure of the snowex db

Layout of the database tables

```

The 4th table is a table detailing the site information. Lots and lots of metadata for which the API has not been written yet.

So how does this look in python?

In [None]:
from snowexsql.api import PointMeasurements, LayerMeasurements, RasterMeasurements
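
As a rough mental model of the layout above, the sketch below maps the shape of a dataset to the API class that would serve it. The `TABLE_FOR_SHAPE` dictionary and the `guess_table` helper are hypothetical names invented for this illustration and are not part of snowexsql. The site table is included for completeness even though it has no API class yet.

```python
# Hypothetical lookup, not part of snowexsql: dataset shape -> where it lives.
TABLE_FOR_SHAPE = {
    "point": "PointMeasurements",    # one value at one location and date (e.g. manual depths, GPR)
    "layer": "LayerMeasurements",    # values that vary with depth in the snowpack (e.g. pit density, SMP)
    "raster": "RasterMeasurements",  # gridded images (e.g. ASO snow depth, DEMs)
    "site": None,                    # site metadata table, no API class yet
}


def guess_table(shape):
    """Return the API class name for a dataset shape, or None if no API exists."""
    if shape not in TABLE_FOR_SHAPE:
        raise ValueError(f"Unknown shape {shape!r}; choose from {sorted(TABLE_FOR_SHAPE)}")
    return TABLE_FOR_SHAPE[shape]


print(guess_table("layer"))
```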

## How are tables structured?
Each table consists of rows and columns. Below are the available columns!


In [None]:
# Import the class reflecting the points table in the db
from snowexsql.api import PointMeasurements as measurements

# Grab one measurement to see what attributes are available
df = measurements.from_filter(type="depth", limit=1)

# Print out the results nicely
print("These are the available columns in the table:\n \n* {}\n".format('\n* '.join(df.columns)))

**Try this:** Repeat what we just did, but swap out PointMeasurements for LayerMeasurements.


**Question:** Did you collect any data? What is it? What table do you think it would go in?

For more detail, check out the readthedocs page on [database structure](https://snowexsql.readthedocs.io/en/latest/database_structure.html) to see how data gets categorized. 

## Bonus Step: Learning to help yourself
[snowexsql](https://github.com/SnowEx/snowexsql/) has a host of resources to help you help yourself. When you are looking for something, first check the snowexsql docs.
There you will find notes on the database structure, datasets, and of course our new API! 

### Database Usage/Examples
* [snowexsql Code](https://github.com/SnowEx/snowexsql/) 
* [snowexsql Documentation](https://snowexsql.readthedocs.io/en/latest/) 

### Database Building/Notes
* [snowex_db Code](https://github.com/SnowEx/snowex_db/) 
* [snowex_db Documentation](https://snowex_db.readthedocs.io/en/latest/) 

## Recap 
You just explored the database structure and discussed how the tables differ.

**You should know:**
* Which table a dataset might live in
* What columns you can work with (or how to get the available columns)
* Some resources to begin helping yourself.

If you don't feel comfortable with these, you are probably not alone, let's discuss it!

# Forming Queries through the API!

Get familiar with the tools available for querying the database. The simplest way is to use the API classes:
* [`snowexsql.api.PointMeasurements`](https://github.com/SnowEx/snowexsql/blob/830fa76de8cf13c5101e1b4b663c1b399f81d7e6/snowexsql/api.py#L185)
* [`snowexsql.api.LayerMeasurements`](https://github.com/SnowEx/snowexsql/blob/830fa76de8cf13c5101e1b4b663c1b399f81d7e6/snowexsql/api.py#L262)

* Each class has two very useful functions
  1. [`from_filter`](https://github.com/SnowEx/snowexsql/blob/830fa76de8cf13c5101e1b4b663c1b399f81d7e6/snowexsql/api.py#L192)
  2. [`from_area`](https://github.com/SnowEx/snowexsql/blob/830fa76de8cf13c5101e1b4b663c1b399f81d7e6/snowexsql/api.py#L210)

## Useful Function - `from_filter`

Use the `from_filter` function to find density profiles.


In [None]:
# Import the class we need to access the db
from snowexsql.api import LayerMeasurements
from datetime import datetime 

# Find some density pit measurements at the Boise site in December 2019.
df = LayerMeasurements.from_filter(
    type="density",
    site_name="Boise River Basin",
    date_less_equal=datetime(2020, 1, 1),
    date_greater_equal=datetime(2019, 12, 1),
)

# Plot Example!
df.plot()

# Show off the dataframe
df

# Analysis Example - Find the bulk density 
df['value'] = df['value'].astype(float)
print(df[['site_id', 'value']].groupby(by='site_id').mean())
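
A plain mean treats every layer equally, regardless of how thick it is. If the layers differ in thickness, a thickness-weighted mean may be a closer estimate of bulk density. The sketch below is library-free and runs on made-up numbers. It assumes each layer carries a top depth, a bottom depth, and a density, which is an assumption about how the `depth`, `bottom_depth`, and `value` columns are used, so check the columns of your own dataframe first. `weighted_bulk_density` is a hypothetical helper, not a snowexsql function.

```python
def weighted_bulk_density(layers):
    """Thickness-weighted mean density.

    layers: iterable of (top_cm, bottom_cm, density) tuples for one pit.
    """
    total_thickness = 0.0
    weighted_sum = 0.0
    for top, bottom, density in layers:
        thickness = abs(top - bottom)
        total_thickness += thickness
        weighted_sum += thickness * density
    if total_thickness == 0:
        raise ValueError("Layers have no thickness")
    return weighted_sum / total_thickness


# Made-up pit: a thin 10 cm layer sitting on top of a thick 30 cm layer
toy_pit = [(40, 30, 200.0), (30, 0, 300.0)]
simple_mean = sum(density for _, _, density in toy_pit) / len(toy_pit)
weighted_mean = weighted_bulk_density(toy_pit)
print(simple_mean, weighted_mean)
```

In this toy pit the thicker, denser layer pulls the weighted value above the simple mean.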

## Useful Function - `from_area`
Find specific surface area within a certain distance of a pit.

In [None]:
# Import our api class
from snowexsql.api import LayerMeasurements
from datetime import datetime
import geopandas as gpd 

# import some gis functionality 
from shapely.geometry import Point 

# Find some SSA measurements within a distance of a known point
pnt = Point(740820.624625,4.327326e+06)
df = LayerMeasurements.from_area(pt=pnt, crs=26912, buffer=500,
    type='specific_surface_area')

# plot up the results
ax = df.plot()

# plot the site so we can see how close everything is.
site = gpd.GeoDataFrame(geometry=[pnt], crs=26912)
site.plot(ax=ax, marker='^', color='magenta')

# show off the dataframe
df
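
Conceptually, `from_area` with a point and a buffer keeps the records that fall within `buffer` of that point. EPSG:26912 is a UTM projection with units of meters, so the 500 above means 500 m. The toy function below mimics that idea with plain Euclidean distance on made-up coordinates. It is only an illustration and not how the database performs the query; `within_buffer` is a hypothetical name.

```python
import math


def within_buffer(center, points, buffer):
    """Keep the (x, y) points no farther than `buffer` from `center`."""
    cx, cy = center
    return [(x, y) for x, y in points if math.hypot(x - cx, y - cy) <= buffer]


pit = (740820.624625, 4327326.0)
candidates = [
    (740900.0, 4327400.0),  # roughly 110 m from the pit
    (742000.0, 4327326.0),  # roughly 1.2 km from the pit
]
nearby = within_buffer(pit, candidates, buffer=500)
print(nearby)
```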

## How do I know what to filter on?
We've got tools for that! Each class has a host of properties that start with `all_`. Each one returns the unique values in that column. 

 * `all_types` - all the data types e.g. depth, swe, density...
 * `all_instruments` - all instruments available in the table
 * `all_dates` - all dates listed in the table
 * `all_site_names` - all the site names available in the table. e.g. Grand Mesa

In [None]:
from snowexsql.api import PointMeasurements

# Instantiate the class to use the properties!
measurements = PointMeasurements()

# Get the unique data names/types in the table
results = measurements.all_types
print('Available types = {}'.format(', '.join([str(r) for r in results])))

# Get the unique instrument in the table
results = measurements.all_instruments
print('\nAvailable Instruments = {}'.format(', '.join([str(r) for r in results])))

# Get the unique dates in the table
results = measurements.all_dates
print('\nAvailable Dates = {}'.format(', '.join([str(r) for r in results])))

# Get the unique site names in the table
results = measurements.all_site_names
print('\nAvailable sites = {}'.format(', '.join([str(r) for r in results])))
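
Because the `all_*` properties return the valid values, they can also be used to catch a typo before a query is sent. The helper below is a hypothetical sketch (not part of snowexsql) that checks a requested value against a list of available ones and suggests close matches using the standard library. The list of types here is made up; in practice it would come from something like `measurements.all_types`.

```python
from difflib import get_close_matches


def check_choice(value, available):
    """Return `value` if it is available, otherwise raise with suggestions."""
    options = [str(a) for a in available]
    if value in options:
        return value
    hints = get_close_matches(value, options, n=3)
    suggestion = ", ".join(hints) if hints else "no close match"
    raise ValueError(f"{value!r} not found. Did you mean: {suggestion}?")


toy_types = ["depth", "swe", "density"]  # stand-in for measurements.all_types
print(check_choice("depth", toy_types))

try:
    check_choice("depht", toy_types)
except ValueError as err:
    print(err)
```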

### More specific filtering options
Sometimes we need to filter a bit more to learn what we can filter on. A question like "What dates was the SMP used?" is a bit more complicated than "Give me all the dates for SnowEx".

The good news is, we have a tool for that! `from_unique_entries` is your friend!

In [None]:
# import layer measurements
from snowexsql.api import LayerMeasurements

# Query dates where SMP was used
LayerMeasurements.from_unique_entries(['date'], instrument='snowmicropen')
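
The idea behind `from_unique_entries` can be pictured as: filter the rows first, then collect the distinct values of the requested columns. The toy below reproduces that on a handful of made-up records. `unique_entries` is a hypothetical stand-in, and the real query is executed by the database.

```python
from datetime import date


def unique_entries(rows, columns, **filters):
    """Distinct values of `columns` among rows matching all `filters`."""
    matching = [r for r in rows if all(r[k] == v for k, v in filters.items())]
    return sorted({tuple(r[c] for c in columns) for r in matching})


# Made-up records, only for illustration
toy_rows = [
    {"date": date(2020, 1, 28), "instrument": "snowmicropen"},
    {"date": date(2020, 1, 28), "instrument": "snowmicropen"},
    {"date": date(2020, 1, 29), "instrument": "IRIS"},
    {"date": date(2020, 2, 1), "instrument": "snowmicropen"},
]
smp_dates = unique_entries(toy_rows, ["date"], instrument="snowmicropen")
print(smp_dates)
```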

## Query Nuances
### Limit size 
To avoid accidental large queries, we have added some bumper rails. By default if you ask for more than 1000 records then an error will pop up unless you explicitly say you want more. 

**Try This**: Do a large query. Run the code block below without the limit keyword argument ("kwarg"):

In [None]:
# Import PointMeasurements
from snowexsql.api import PointMeasurements

# Query db using a vague filter or on a huge dataset like GPR but remove the limit kwarg
df = PointMeasurements.from_filter(type='two_way_travel', limit=100)

# Show the dataframe
df



We have added this on the db to allow you to explore without accidentally pulling the entire SnowEx universe down. If you know you want a large query (defined as > 1000) then use the `limit = ####` option in the `from_filter` or `from_area` function.

**Warning** - It is better to narrow a query with other filters than to rely on the limit, because the limit is not intelligent. It simply keeps the first records, in the order they were submitted to the database, that fit your filter. If you hit the limit error, first consider how to tighten up the filter.
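
To see why a limit is a blunt tool, the toy model below imitates the behavior described above: a query without a limit fails once it matches more than 1000 records, and a limit simply keeps the first matches in stored order. Everything here (`toy_query`, `MAX_RECORDS`, the `RuntimeError`) is a hypothetical stand-in, and the actual exception raised by snowexsql may differ.

```python
MAX_RECORDS = 1000  # threshold described in the text


def toy_query(rows, limit=None, **filters):
    """Return rows matching `filters`, guarded the way the text describes."""
    matching = [r for r in rows if all(r[k] == v for k, v in filters.items())]
    if limit is None and len(matching) > MAX_RECORDS:
        raise RuntimeError(
            f"Query matches {len(matching)} records; tighten the filter or pass limit="
        )
    return matching if limit is None else matching[:limit]


# 1500 made-up records for each of two days, stored day by day
rows = [{"type": "depth", "day": d} for d in (1, 2) for _ in range(1500)]

first_100 = toy_query(rows, limit=100, type="depth")
print({r["day"] for r in first_100})  # only day 1 shows up
```

The limited result never reaches day 2, which is why adding a filter on the thing you care about is usually the better fix.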

### List of Criteria
You can use lists in your requests too!

In [None]:
# Import layer measurements
from snowexsql.api import LayerMeasurements

# Grab all the data that used one of these instruments (hint hint SSA)
ssa_instruments = ["IS3-SP-15-01US", "IRIS",  "IS3-SP-11-01F"]

# Query the DB (throw a limit for safety)
LayerMeasurements.from_filter(instrument=ssa_instruments, limit=100)

### Greater than or Less than
Sometimes we want to isolate certain ranges of values or even dates. The `greater_equal` and `less_equal` terms can be added on to `value` or `date`. 

* `date_greater_equal`
* `date_less_equal`
* `value_greater_equal`
* `value_less_equal`
 

In [None]:
# Import the point measurements class
from snowexsql.api import PointMeasurements

# Filter values >= 100 cm from the pulse EKKO GPR
df = PointMeasurements.from_filter(value_greater_equal=100, type='depth', instrument='pulse EKKO Pro multi-polarization 1 GHz GPR', limit=100)

# Show off the dataframe
df
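
The keyword conventions from this section can be summarized with a small toy matcher: a list means "any of these", and the `_greater_equal` / `_less_equal` suffixes turn an equality test into a range test. `row_matches` is a hypothetical illustration written for this tutorial on made-up records, and it is not the snowexsql implementation.

```python
def row_matches(row, **filters):
    """Check one record (a dict) against from_filter-style keyword arguments."""
    for key, wanted in filters.items():
        if key.endswith("_greater_equal"):
            column = key[: -len("_greater_equal")]
            if not row[column] >= wanted:
                return False
        elif key.endswith("_less_equal"):
            column = key[: -len("_less_equal")]
            if not row[column] <= wanted:
                return False
        elif isinstance(wanted, (list, tuple)):
            # A list of criteria matches if the row has any one of them
            if row[key] not in wanted:
                return False
        elif row[key] != wanted:
            return False
    return True


# Made-up records, only for illustration
toy_rows = [
    {"type": "depth", "instrument": "magnaprobe", "value": 85},
    {"type": "depth", "instrument": "pit ruler", "value": 120},
    {"type": "swe", "instrument": "pit ruler", "value": 300},
]
hits = [
    r for r in toy_rows
    if row_matches(r, type="depth", instrument=["magnaprobe", "pit ruler"],
                   value_greater_equal=100)
]
print(hits)
```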

## Recap 
You just came in contact with the new API tools. We can use each API class to pull from specific tables and filter the data. 

**You should know:**
* How to build queries using `from_filter`, `from_area`, `from_unique_entries`
* Determine what values to filter on
* Manage the limit error
* Filtering on greater and less than
  
If you don't feel comfortable with these, you are probably not alone, let's discuss it!

# Exercise: Visualize a Manual Depth Spiral

During the SnowEx campaigns a TON of manual snow depths were collected, and past surveys for hackweek showed an overwhelming interest in the manual 
snow depths dataset. This tutorial shows how easy it is to get at that data in the database while learning how to build queries.

**Goal**: Visualize a small subset of snow depth, ideally a full spiral (mostly cause they are cool!)

**Approach**: 
1. Determine the necessary details for isolating manual depths
2. Find a pit where many spirals were done. 
3. Buffer on the pit location and grab all manual snow depths

## Process


In [None]:
from snowexsql.api import LayerMeasurements
data_type = 'depth'

### Step 1: Find a pit of interest

In [None]:
# Pick the first one we find
site_id = LayerMeasurements().all_site_ids[0]

# Query the database, we only need one point to get a site id and its geometry
site_df = LayerMeasurements.from_filter(site_id=site_id, limit=1)

# Print it out 
site_df

### Step 2: Collect Snow Depths

In [None]:
# We import the points measurements because snow depths is a single value at single location and date
from snowexsql.api import PointMeasurements 

# Filter the results to within 200 m of the point from our pit
df = PointMeasurements.from_area(pt=site_df.geometry[0], type=data_type, buffer=200)
df

### Step 3: Plot it!

In [None]:
# Get the Matplotlib Axes object from the dataframe object, color the points by snow depth value
ax = df.plot(column='value', legend=True, cmap='PuBu')
site_df.plot(ax=ax, marker='^', color='m')

# Use non-scientific notation for x and y ticks
ax.ticklabel_format(style='plain', useOffset=False)

# Set the various plots x/y labels and title.
ax.set_title(f'{len(df.index)} Manual Snow depths collected at {site_id}')
ax.set_xlabel('Easting [m]')
ax.set_ylabel('Northing [m]');


**Try This:**

A. Go back and add a filter to reduce to just one spiral. What would you change to reduce this?

B. Try filtering to add more spirals. What happens?


## Recap 
You just plotted snow depths and reduced the scope of the data by using `from_area` on it.

**You should know:**

* Manual depths are neat.
* Filtering using `from_area` is pretty slick.
* We can use LayerMeasurements to get site details easily. 


If you don't feel comfortable with these, you are probably not alone, let's discuss it!
