# MAC Final Project: Dataset Overview and Use Case Examples
## EDS 220, Fall 2022


## Using Microsoft Planetary Computer Datasets to Examine Fire and Ground Water in California

#### Authors
- Meagan Brown, UC Santa Barbara, meagan_brown@ucsb.edu
- Andre Dextre, UC Santa Barbara, adextre@ucsb.edu
- Carlo Broderick, UC Santa Barbara, carlobroderick@ucsb.edu

## Table of Contents


[1. Purpose](#purpose)

[2. Dataset Description](#overview)

[3. Data I/O](#io)

[4. Metadata Display and Basic Visualization](#display)

[5. Use Case Examples](#usecases)

[6. Create Binder Environment](#binder)

[7. References](#references)

<a id='purpose'></a>
### Notebook Purpose
This notebook demonstrates how to use the Microsoft Planetary Computer (MPC) API, with groundwater and wildfire in California as use cases.



<a id='overview'></a>
### Dataset Description
This portion of the notebook contains a summary description of the MPC environmental datasets used in this notebook.

### gridMET
gridMET is a dataset of daily, high-spatial-resolution (~4 km, 1/24th degree) surface meteorological data covering the contiguous US from 1979 to yesterday, provided and maintained by the Climatology Lab at UC Merced. The Climatology Lab generates the data by interpolating gridded climate data from PRISM (https://www.prism.oregonstate.edu/) and regional reanalysis from NLDAS-2 (https://ldas.gsfc.nasa.gov/nldas/NLDAS2forcing.php). The data are available through the Microsoft Planetary Computer API and as netCDF files from the lab’s website. Microclimates at scales below 4 km^2 and wind data at scales below 32 km^2 fall under the resolution threshold and will be difficult to analyze using this dataset.
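
To make the "~4 km, 1/24th degree" figure concrete, the sketch below converts the grid spacing to kilometers. It assumes a spherical Earth with roughly 111.32 km per degree of latitude, and the latitude of 37 degrees north is a hypothetical value chosen to fall in central California; neither number comes from the gridMET documentation.

```python
import math

# Assumption: a spherical Earth, on which one degree of latitude spans about 111.32 km.
KM_PER_DEGREE = 111.32
GRID_STEP_DEG = 1 / 24  # gridMET grid spacing from the description above


def cell_size_km(lat_deg: float, step_deg: float = GRID_STEP_DEG) -> tuple[float, float]:
    """Return the approximate (north-south, east-west) cell size in km at a latitude."""
    north_south = step_deg * KM_PER_DEGREE
    # Meridians converge toward the poles, so east-west size shrinks with cos(latitude).
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


# 37 degrees north is a hypothetical latitude chosen to sit in central California.
ns_km, ew_km = cell_size_km(37.0)
print(f"north-south: {ns_km:.2f} km, east-west: {ew_km:.2f} km")
```

Under these assumptions a cell is about 4.6 km tall and 3.7 km wide over California, which is consistent with the nominal ~4 km resolution.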
### MODIS Snow Cover 8-day
The MODIS Snow Cover 8-day dataset was created by the National Snow and Ice Data Center (part of CIRES at the University of Colorado Boulder). The dataset provides global coverage of the snow cover extent observed over each eight-day period, delivered in 10° x 10° MODIS sinusoidal grid tiles. Tiles are generated by compositing 500 m observations from the ‘MODIS Snow Cover Daily L3 Global 500m Grid’ dataset. Snow cover data range from 02/18/2000 to present. The data are stored as GeoTIFF (COG) files. We will access the data through Microsoft Planetary Computer, which provides an API similar to that of Google Earth Engine. We are not aware of any data quality issues that may affect our results.
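
As an illustration of the 10° x 10° tiling, the sketch below estimates which sinusoidal tile contains a given point. It assumes the standard 36 x 18 MODIS tile layout with tile h00v00 in the upper-left corner; the function name `modis_tile` and the sample coordinate are hypothetical and not part of any library.

```python
import math

TILE_SIZE_DEG = 10.0  # tile size from the description above


def modis_tile(lat_deg: float, lon_deg: float) -> tuple[int, int]:
    """Return the (h, v) MODIS sinusoidal tile indices containing a point."""
    # The vertical index counts 10-degree rows down from the north pole.
    v = int(math.floor((90.0 - lat_deg) / TILE_SIZE_DEG))
    # In the sinusoidal projection, x is longitude scaled by cos(latitude).
    x_deg = lon_deg * math.cos(math.radians(lat_deg))
    h = int(math.floor((x_deg + 180.0) / TILE_SIZE_DEG))
    # Clamp points that sit exactly on the far edges of the grid.
    return min(h, 35), min(v, 17)


# Hypothetical point in central California (37 N, 120 W).
h, v = modis_tile(37.0, -120.0)
print(f"h{h:02d}v{v:02d}")
```

For this sample point the sketch prints `h08v05`.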
### MODIS Thermal Anomalies/Fire 8-Day
The MODIS Thermal Anomalies/Fire 8-Day dataset was created by NASA LP DAAC at the USGS EROS Center (https://lpdaac.usgs.gov/products/mod14a2v061/) and is hosted on Microsoft’s Planetary Computer (https://planetarycomputer.microsoft.com/dataset/modis-14A2-061).
This dataset spans 02/18/2000 to present and contains global data on thermal anomalies/fires at a 1 km spatial resolution. The data are stored as GeoTIFF (COG) and HDF files. We will access the data through Microsoft Planetary Computer, which provides an API similar to that of Google Earth Engine. We are not aware of any data quality issues that may affect our results.
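
Both MODIS products above are 8-day composites, so it can help to know where the compositing periods begin. The sketch below lists the period start dates for one year, assuming the usual MODIS convention that periods start on day-of-year 1, 9, 17, and so on, and restart every January 1. The function name is hypothetical.

```python
from datetime import date, timedelta


def composite_start_dates(year: int) -> list[date]:
    """Return the start dates of the 8-day compositing periods in a year."""
    first_day = date(year, 1, 1)
    # Periods begin on day-of-year 1, 9, 17, ..., 361 and restart each January 1.
    return [first_day + timedelta(days=offset) for offset in range(0, 361, 8)]


starts_2000 = composite_start_dates(2000)
print(len(starts_2000))                  # number of periods in the year
print(date(2000, 2, 18) in starts_2000)  # is the first available date a period start?
```

Under this convention each year has 46 periods, and the 02/18/2000 start date quoted above falls on a period boundary.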


<a id='io'></a> 
### Dataset Input/Output 

1) Import all necessary packages 
- pystac-client
- planetary-computer
- geopandas
- rich

2) Parameters:
- Data are stored on Microsoft Planetary Computer, can be accessed via the API as outlined in the mpc_example.ipynb notebook, and can be browsed at: https://planetarycomputer.microsoft.com/
- Data temporal availability
  - MODIS Snow Cover 8-day (02/18/2000 – Present)
  - MODIS Thermal Anomalies/Fire 8-Day (02/18/2000 – Present)
  - gridMET (01/01/1979 – 12/31/2020)
- California lat and lon

3) Examples of reading in data from Microsoft Planetary Computer are shown in the mpc_example.ipynb notebook.
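
The parameters listed above could be collected into a small helper before querying the catalog. This is a minimal sketch, not the notebook's own method: the helper name `build_search_kwargs` is hypothetical, the California bounding box is an approximate assumption, and the collection ID `modis-10A2-061` is assumed from the Planetary Computer catalog (only `modis-14A2-061` appears in a URL in this notebook).

```python
# Collection IDs: "modis-14A2-061" appears in this notebook; "modis-10A2-061" is an assumption.
COLLECTIONS = {
    "snow_cover": "modis-10A2-061",
    "fire": "modis-14A2-061",
}

# Approximate California bounding box (assumption): [min lon, min lat, max lon, max lat].
CALIFORNIA_BBOX = [-124.5, 32.5, -114.1, 42.1]


def build_search_kwargs(dataset: str, start: str, end: str, bbox=None) -> dict:
    """Build keyword arguments for a STAC catalog search over one dataset."""
    if dataset not in COLLECTIONS:
        raise KeyError(f"unknown dataset: {dataset!r}")
    if start > end:  # ISO 8601 date strings sort chronologically
        raise ValueError("start must not be after end")
    return {
        "collections": [COLLECTIONS[dataset]],
        "bbox": list(bbox if bbox is not None else CALIFORNIA_BBOX),
        "datetime": f"{start}/{end}",
    }


# Hypothetical date range chosen for illustration.
fire_kwargs = build_search_kwargs("fire", "2020-08-01", "2020-09-30")
print(fire_kwargs)
```

With a catalog opened through `pystac_client.Client.open`, the result could then be passed as `catalog.search(**fire_kwargs)`.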

In [2]:
# import libraries
%matplotlib inline
import pystac_client
import planetary_computer
import geopandas
import rich.table
from IPython.display import Image

In [3]:
# Connect with Microsoft Planetary Computer (MPC)
catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)

In [4]:
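# specify search presets
time_range = "2020-12-01/2020-12-31"
# bbox order is [min lon, min lat, max lon, max lat]; this box covers the Seattle, WA area
bbox = [-122.2751, 47.5469, -121.9613, 47.7458]
source = "landsat-c2-l2"

search = catalog.search(collections=[source], bbox=bbox, datetime=time_range)
# item_collection() replaces the deprecated get_all_items()
items = search.item_collection()
len(items)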

8

<a id='display'></a> 
### Metadata Display and Basic Visualization



In [5]:
# extract a data frame
df = geopandas.GeoDataFrame.from_features(items.to_dict(), crs="epsg:4326")

In [6]:
# select the item with the lowest cloud cover
selected_item = min(items, key=lambda item: item.properties["eo:cloud_cover"])
print(selected_item)

<Item id=LC08_L2SP_047027_20201204_02_T1>
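
The `min(...)` call above raises an error if the search returned no items or if an item lacks the `eo:cloud_cover` property. A more defensive variant might look like the sketch below. It uses plain dictionaries as hypothetical stand-ins for STAC items (with real `pystac` items, `item.properties` would replace `item["properties"]`), and the scene IDs and cloud values are made up for illustration.

```python
def least_cloudy(items, cloud_key="eo:cloud_cover"):
    """Return the item with the lowest cloud cover, or None if no item reports it."""
    # Keep only items that actually carry the cloud-cover property.
    candidates = [item for item in items if cloud_key in item["properties"]]
    if not candidates:
        return None
    return min(candidates, key=lambda item: item["properties"][cloud_key])


# Hypothetical stand-ins for STAC items; IDs and values are made up for illustration.
sample_items = [
    {"id": "scene-a", "properties": {"eo:cloud_cover": 62.1}},
    {"id": "scene-b", "properties": {"eo:cloud_cover": 3.4}},
    {"id": "scene-c", "properties": {}},
]
print(least_cloudy(sample_items)["id"])
```

Here `scene-b` is selected, `scene-c` is skipped because it has no cloud-cover value, and an empty list returns `None` instead of raising.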


In [7]:
# create table of bands and descriptions
table = rich.table.Table("Asset Key", "Description")
for asset_key, asset in selected_item.assets.items():
    table.add_row(asset_key, asset.title)

table

In [8]:
# display the rendered preview image
selected_item.assets["rendered_preview"].to_dict()
Image(url=selected_item.assets["rendered_preview"].href, width=500)

<a id='usecases'></a> 
### Use Case Examples

This is the "meat" of the notebook, and what will take the majority of the time to present in class. This section should provide:
1) A plain-text summary (1-2 paragraphs) of the use case example you have chosen: include the target users and audience, and potential applicability. 

2) Markdown and code blocks demonstrating how one walks through the desired use case example. This should be similar to the labs we've done in class: you might want to demonstrate how to isolate a particularly interesting time period, then create an image showing a feature you're interested in, for example.

3) A discussion of the results and how they might be extended on further analysis. For example, if there are data quality issues which impact the results, you could discuss how these might be mitigated with additional information/analysis.

Just keep in mind, you'll have roughly 20 minutes for your full presentation, and that goes surprisingly quickly! Probably 2-3 diagnostics is the most you'll be able to get through (you could try practicing with your group members to get a sense of timing).


<a id='binder'></a> 
### Create Binder Environment

The last step is to create a Binder environment for your project, so that we don't have to spend time configuring everyone's environment each time we switch between group presentations. Instructions are below:

 - Assemble all of the data needed in your GitHub repo: Jupyter notebooks, a README file, and any datasets needed (these should be small, if included within the repo). Larger datasets should be stored on a separate server, and access codes included within the Jupyter notebook as discussed above. 
 
 - Create an _environment_ file: this is a text file which contains information on the packages needed in order to execute your code. The filename should be "environment.yml": an example that you can use for the proper syntax is included in this template repo. To determine which packages to include, you'll probably want to start by displaying the packages loaded in your environment: you can use the command `conda list -n [environment_name]` to get a list.
 
 More information on environment files can be found here:
 https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#

 - Create Binder. Use http://mybinder.org to create a URL for your notebook Binder (you will need to enter your GitHub repo URL). You can also add a Launch Binder button directly to your GitHub repo by including the following in your README.md:

```
launch with myBinder
[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/<path to your repo>)
```
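
The environment file described in the steps above could be drafted from the package list in the Data I/O section. The sketch below only builds the file's text. The environment name `mpc-fire-groundwater`, the `conda-forge` channel, and the addition of `python` and `matplotlib` (needed for `%matplotlib inline`) are assumptions rather than requirements stated in this template.

```python
def render_environment_yml(name, packages, channel="conda-forge"):
    """Return the text of a minimal conda environment file."""
    lines = [f"name: {name}", "channels:", f"  - {channel}", "dependencies:"]
    # One list entry per package, with no version pins in this sketch.
    lines += [f"  - {package}" for package in packages]
    return "\n".join(lines) + "\n"


# Packages imported in this notebook, plus python and matplotlib as assumptions.
packages = [
    "python",
    "pystac-client",
    "planetary-computer",
    "geopandas",
    "rich",
    "matplotlib",
]
env_text = render_environment_yml("mpc-fire-groundwater", packages)
print(env_text)
```

Saving `env_text` to a file named `environment.yml` in the repo root gives Binder the package list; pinning versions is left out here for brevity.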

<a id='references'></a> 
### References

List relevant references. Here are some additional resources on creating professional, shareable notebooks you may find useful:
