# Geospatial Python
## Accessing satellite imagery using Python
Setup: https://carpentries-incubator.github.io/geospatial-python/index.html

Instruction: https://carpentries-incubator.github.io/geospatial-python/05-access-data.html

Objectives:
* Search public [SpatioTemporal Asset Catalog (STAC)](https://github.com/radiantearth/stac-api-spec/tree/release/v1.0.0) repositories of satellite imagery using Python.
* Inspect the metadata of the search results.
* Download (a subset of) the assets available for a satellite scene.
* Open satellite imagery as raster data and save it to disk.

Before executing the code cells, be sure to replace each "_____" placeholder with an appropriate value.

In [None]:
# first import necessary libraries
import rioxarray # to open and download remote raster data
from pystac_client import Client # to query STAC API endpoint
from shapely.geometry import Point # to create a point 

In [None]:
# get the source url (top right button) from https://radiantearth.github.io/stac-browser/#/external/earth-search.aws.element84.com/v1
# to access the STAC catalog items
api_url = "_____"

# open the api
client = Client.open(api_url)
# see documentation https://pystac-client.readthedocs.io/en/stable/


In [None]:
# perform a metadata search, limited to 10 results, for Sentinel-2 Level 2A scenes stored as Cloud Optimized GeoTIFFs (COGs)

# store a variable pointing to the collection of interest
# Note: collection ID is taken from Sentinel-2 Level 2A - https://radiantearth.github.io/stac-browser/#/external/earth-search.aws.element84.com/v1/collections/sentinel-2-c1-l2a
collection = "_____" 
# The collection includes Sentinel-2 data products
# pre-processed at level 2A (bottom-of-atmosphere reflectance)
# and saved in Cloud Optimized GeoTIFF (COG) format.

# create a point to intersect from, note values are in format x (long), y (lat) https://shapely.readthedocs.io/en/stable/reference/shapely.Point.html
point = Point("_____", "_____")  # AMS (Amsterdam Airport Schiphol) coordinates, use https://www.google.com/maps

search = client.search(
    collections=[collection],
    intersects=point,
    max_items=10,
)
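Both shapely and GeoJSON put longitude (x) before latitude (y), which is the reverse of what most map services display. According to the pystac_client documentation, `intersects` also accepts a GeoJSON-like dictionary, so the expected order can be made explicit without shapely. The sketch below is only an illustration: `point_geojson` is a hypothetical helper, and the coordinates are the example point from the last exercise in this notebook, not Schiphol.

```python
import json

def point_geojson(lon, lat):
    """Build a GeoJSON Point; longitude (x) comes first, latitude (y) second."""
    return {"type": "Point", "coordinates": [lon, lat]}

# Example coordinates taken from the last exercise of this notebook
geom = point_geojson(-73.97, 40.78)
print(json.dumps(geom))
```

A dictionary like `geom` could then be passed as `intersects=geom` in place of the shapely `Point`.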

In [None]:
# show the total number of scenes matching the search (a scene is the portion of the footage recorded by the satellite);
# this count is not limited by max_items
print(search.matched())

In [None]:
# store the metadata of the search results
items = search.item_collection()

In [None]:
# get the length of items
print(len(items))

In [None]:
# loop over the items to get their ids
for item in items:
    print(item)

In [None]:
# inspect the metadata associated with the first item of the search result
item = items["_____"]
print(item.datetime)
print(item.geometry)
print(item.properties)

In [None]:
'''
EXERCISE: Search the Sentinel-2 Level 2A collection (the `collection` variable defined above) for all available scenes that satisfy the following criteria:
- intersect a bounding box of ±0.01 deg in lat/lon around the previously defined point;
- were recorded between 20 March 2020 and 30 March 2020;
- have a cloud cover below 15%. Note: some collections implement the eo extension (https://github.com/stac-extensions/eo), which allows cloud cover to be queried.

Then:
* get the number of matching scenes
* save the results to a JSON file
'''
bbox = point.buffer("_____").bounds

search = client.search(
    collections=[collection],
    bbox=bbox,
    datetime="2020-03-20/2020-03-30",
    query=["eo:cloud_cover<15"]
)
print(search.matched())
items = search.item_collection()
items.save_object("search.json") # json file saved alongside notebook
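The bounding box and date range used in this exercise can also be built with the standard library alone, which makes the order of the values explicit. The sketch below is an illustration only: `bbox_around` and `date_interval` are hypothetical helpers, and the centre coordinates are the example point from the last exercise, not Schiphol. STAC expects `bbox` as (min lon, min lat, max lon, max lat), the same order as shapely's `.bounds`, and `datetime` as a `start/end` interval string.

```python
from datetime import date

def bbox_around(lon, lat, half_width_deg):
    """Return (min_lon, min_lat, max_lon, max_lat) for a square centred on a point."""
    return (lon - half_width_deg, lat - half_width_deg,
            lon + half_width_deg, lat + half_width_deg)

def date_interval(start, end):
    """Format two dates as the 'start/end' string used for STAC datetime filters."""
    if end < start:
        raise ValueError("end date is before start date")
    return f"{start.isoformat()}/{end.isoformat()}"

bbox = bbox_around(-73.97, 40.78, 0.01)
interval = date_interval(date(2020, 3, 20), date(2020, 3, 30))
print(bbox)
print(interval)
```

Both values can be passed to `client.search` through its `bbox` and `datetime` arguments, as in the cell above.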

## Access the assets


In [None]:
# first item's assets
assets = items[0].assets  

print(assets.keys())

In [None]:
# print a minimal description of the available assets
for key, asset in assets.items():
    print(f"{key}: {asset.title}")

In [None]:
# show one metadata value
print(assets["thumbnail"])
print(assets["thumbnail"].href)

In [None]:
# open nir with the rioxarray library
nir_href = assets["nir"].href
nir = rioxarray.open_rasterio(nir_href)
print(nir)

In [None]:
# save the whole tif image to disk - this may take a while
nir.rio.to_raster("_____")

In [None]:
# save portion of an image to disk
nir[0,1500:2200,1500:2200].rio.to_raster("nir_subset.tif")
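The subset above indexes the array as (band, y, x): band 0, then rows and columns 1500 to 2199. As a small sketch of how the size of such a window follows from the slice bounds, the code below uses only the standard library. `window_shape` is a hypothetical helper, and the 10980 x 10980 grid size is an assumption (typical for a 10 m Sentinel-2 band); check `nir.shape` for the real value.

```python
def window_shape(row_slice, col_slice, n_rows, n_cols):
    """Return (rows, cols) selected by two slices from an n_rows x n_cols grid."""
    rows = len(range(*row_slice.indices(n_rows)))
    cols = len(range(*col_slice.indices(n_cols)))
    return rows, cols

# Assumed full size of the band; replace with the values from nir.shape
shape = window_shape(slice(1500, 2200), slice(1500, 2200), 10980, 10980)
print(shape)
```

Slices that run past the edge of the grid are clipped rather than raising an error, so a window near the border can come out smaller than requested.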

In [None]:
'''
EXERCISE:
Using pystac_client, connect to the NASA CMR STAC endpoint https://cmr.earthdata.nasa.gov/stac/LPCLOUD and
- search for all items of the Landsat 8 collection (HLSL30.v2.0, described at https://lpdaac.usgs.gov/products/hlsl30v002/)
- from February to March 2021,
- intersecting the point with longitude/latitude coordinates (-73.97, 40.78) deg.
* Visualize an item's thumbnail (asset key: browse).

'''
cmr_api_url = "https://cmr.earthdata.nasa.gov/stac/LPCLOUD"
client = Client.open(cmr_api_url)

# set up the search
search = client.search(
    collections=["_____"],
    intersects=Point("_____", "_____"),
    datetime="2021-02-01/2021-03-30",
) # NASA CMR cloud cover filtering is currently broken: https://github.com/nasa/cmr-stac/issues/239

# retrieve search results
items = search.item_collection()
print(len(items))

In [None]:
# sort by cloud cover and show details for the least cloudy image
items_sorted = sorted(items, key=lambda x: x.properties["eo:cloud_cover"])
item = items_sorted[0]
print(item)
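Indexing `properties["eo:cloud_cover"]` raises a `KeyError` if any item lacks that field. A minimal sketch of a more tolerant sort is shown below on plain dictionaries with made-up values, not real search results; entries without a cloud cover value are placed last. `cloud_cover_key` is a hypothetical helper.

```python
import math

# Made-up stand-ins for item.properties; these are not real search results
properties_list = [
    {"id": "scene-a", "eo:cloud_cover": 42.0},
    {"id": "scene-b"},  # cloud cover missing
    {"id": "scene-c", "eo:cloud_cover": 3.5},
]

def cloud_cover_key(properties):
    """Sort key: the cloud cover if present, otherwise +infinity."""
    value = properties.get("eo:cloud_cover")
    return math.inf if value is None else value

ordered = sorted(properties_list, key=cloud_cover_key)
print([p["id"] for p in ordered])
```

With real items, the same key could be applied as `sorted(items, key=lambda x: cloud_cover_key(x.properties))`.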

In [None]:
# show the image url
print(item.assets["browse"].href)

### Final note: Public metadata does not mean public data.
Consider getting a free NASA Earthdata account at https://urs.earthdata.nasa.gov/
and creating a netrc file for access with https://git.earthdata.nasa.gov/projects/LPDUR/repos/daac_data_download_python/browse/EarthdataLoginSetup.py

Then point GDAL to a cookie file so that it can store and reuse the login cookies:<br>
`import os`<br>
`os.environ["GDAL_HTTP_COOKIEFILE"] = "./cookies.txt"`<br>
`os.environ["GDAL_HTTP_COOKIEJAR"] = "./cookies.txt"`
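As a rough sketch of what a netrc entry for Earthdata Login looks like, the code below writes one to a temporary file and reads it back with Python's `netrc` module. It assumes the usual `machine ... login ... password ...` layout for the host `urs.earthdata.nasa.gov`; `write_netrc` is a hypothetical helper and the credentials are placeholders. A real file would normally live at `~/.netrc` with permissions restricted to the owner.

```python
import netrc
import os
import tempfile

HOST = "urs.earthdata.nasa.gov"  # Earthdata Login host from the link above

def write_netrc(path, login, password):
    """Write a single-host netrc file readable only by the owner."""
    with open(path, "w") as f:
        f.write(f"machine {HOST} login {login} password {password}\n")
    os.chmod(path, 0o600)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "netrc_example")
    write_netrc(path, "my_user", "my_password")  # placeholder credentials
    # authenticators() returns a (login, account, password) tuple
    auth = netrc.netrc(path).authenticators(HOST)

print(auth[0])
```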
