# SpatioTemporal Asset Catalogs (STAC)

This lab will demonstrate how to search for and download geospatial data in the cloud. 

It will introduce <a href="https://stacspec.org/en/" target="_blank">SpatioTemporal Asset Catalogs (STAC)</a>, a specification that makes it easy to query and search through large collections of geospatial data assets stored in the cloud. 

You will also learn to use the <a href="https://pystac-client.readthedocs.io/en/stable/index.html" target="_blank">pystac_client</a> package which provides tools for working with STAC in Python.

We're going to use the pystac_client package to query a range of STAC Catalogs hosted in the cloud. We'll complete the following tasks:

* Find the least cloudy Sentinel-2 image for a field in Western Australia using the Microsoft Planetary Computer.
* Find the least cloudy Landsat image for a field in Western Australia using the Microsoft Planetary Computer.
* Find and download a time-series of cloud free satellite images during a growing season.
* Find and download meteorological data from cloud-based data catalogs.

## Setup

### Run the labs

You can run the labs locally on your machine or you can use cloud environments provided by Google Colab. **If you're working with Google Colab be aware that your sessions are temporary and you'll need to take care to save, backup, and download your work.**

<a href="https://colab.research.google.com/github/geog3300-agri3003/coursebook/blob/main/docs/notebooks/week-6_2.ipynb" target="_blank">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

### Download data

If you need to download the data for this lab, run the following code snippet.  

In [None]:
import os
import subprocess

if "data_lab-6" not in os.listdir(os.getcwd()):
    subprocess.run('wget "https://github.com/geog3300-agri3003/lab-data/raw/main/data_lab-6.zip"', shell=True, capture_output=True, text=True)
    subprocess.run('unzip "data_lab-6.zip"', shell=True, capture_output=True, text=True)
    if "data_lab-6" not in os.listdir(os.getcwd()):
        print("Has a directory called data_lab-6 been downloaded and placed in your working directory? If not, try re-executing this code chunk")
    else:
        print("Data download OK")

### Working in Colab

If you're working in Google Colab, you'll need to install the required packages that don't come with the Colab environment.

In [None]:
if 'google.colab' in str(get_ipython()):
    !pip install xarray[complete]
    !pip install rioxarray
    !pip install mapclassify
    !pip install rasterio
    !pip install planetary-computer
    !pip install pystac-client
    !pip install odc-stac
    !pip install adlfs

## SpatioTemporal Asset Catalogs (STAC)

First, let's briefly outline what the STAC specification is before completing some data querying and downloading tasks to make the concepts concrete.

**spatiotemporal asset:** this is a file comprising geospatial data for a location and point in time. For example, this could be a Landsat or Sentinel-2 satellite image stored in the cloud, such as in Microsoft Azure or Amazon Web Services. We can download this file and use its data in our analysis and applications. However, if you look at <a href="https://planetarycomputer.microsoft.com/catalog" target="_blank">Microsoft's Planetary Computer Data Catalog</a>, <a href="https://aws.amazon.com/marketplace/search/results?trk=868d8747-614e-4d4d-9fb6-fd5ac02947a8&sc_channel=el&FULFILLMENT_OPTION_TYPE=DATA_EXCHANGE&CONTRACT_TYPE=OPEN_DATA_LICENSES&filters=FULFILLMENT_OPTION_TYPE%2CCONTRACT_TYPE" target="_blank">Amazon Web Services Open Data</a>, or the <a href="https://explorer.sandbox.dea.ga.gov.au/stac/" target="_blank">Digital Earth Australia Open Data Cube</a> you will see there are lots of spatiotemporal assets available (for free). The challenge is searching through these collections of assets to find the data you need and downloading it. The STAC specification provides a solution for this.

The STAC specification comprises:

* **STAC Item** - a GeoJSON feature that represents a spatiotemporal asset with links to the spatiotemporal asset and additional metadata fields (e.g. bounding box, thumbnail, datetime, cloud cover).
* **STAC Catalog** - a JSON file of links to STAC Items to support querying and retrieving STAC Items. STAC Catalogs can comprise sub-catalogs that group together related data within a larger structure. For example, Microsoft's Planetary Computer might create a STAC Catalog for all of its spatiotemporal assets and organise these assets in sub-catalogs (e.g. a catalog for Landsat 7, Landsat 8, Sentinel-2, SRTM DEM etc.).
* **STAC Collection** - an extension of a STAC Catalog with additional metadata properties (e.g. extents, licences, providers) to describe STAC Items within the collection. 
* **STAC API** - an API that allows clients to query a STAC Collection, search for STAC Items, and retrieve their links for downloading. The search endpoint is designed to receive queries of STAC Catalogs that filter on location, date, and time as well as other fields. It returns a GeoJSON FeatureCollection of the STAC Items that meet the search criteria.
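
To make these definitions more concrete, here is a minimal, hand-written sketch of a STAC Item as a Python dict. The id, coordinates, `href`, and property values are invented for illustration and do not describe a real asset; only the top-level keys are intended to follow the STAC Item specification.

```python
import json

# A hand-written, hypothetical STAC Item (all values are invented for illustration)
example_item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-scene-20191021",
    "collection": "example-collection",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [116.0, -31.0], [117.0, -31.0], [117.0, -30.0],
            [116.0, -30.0], [116.0, -31.0],
        ]],
    },
    "bbox": [116.0, -31.0, 117.0, -30.0],
    # metadata fields that a search can filter on
    "properties": {
        "datetime": "2019-10-21T02:04:51Z",
        "eo:cloud_cover": 3.2,
    },
    # links to the underlying data files
    "assets": {
        "B03": {
            "href": "https://example.com/example-scene/B03.tif",
            "type": "image/tiff; application=geotiff; profile=cloud-optimized",
            "title": "Green band",
        },
    },
    "links": [],
}

print(example_item["properties"]["eo:cloud_cover"])
print(list(example_item["assets"].keys()))
print(json.dumps(example_item["bbox"]))
```

The split between `properties` (what you search on) and `assets` (where the data lives) is the part of this structure we'll rely on most in the rest of the lab.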

### Tips

Here are some tips for working with STAC.

* use rectangular bounding boxes or area-of-interest geometries to quickly identify STAC Items that intersect with their extent.
* for exploratory work use small areas-of-interest to minimise the size of searches of STAC Collections and the amount of data transmitted over the network. 

### Useful links

* <a href="https://stacspec.org/en" target="_blank">STAC website</a>: the STAC homepage with details about STAC, tutorials, and links to STAC catalogs.
* <a href="https://radiantearth.github.io/stac-browser/#/" target="_blank">STAC Browser</a>: a web browser to search for STAC catalogs.
* <a href="https://stacindex.org/" target="_blank">STAC Index</a>: an index of STAC catalogs and tutorials.
* <a href="https://planetarycomputer.microsoft.com/catalog" target="_blank">Microsoft Planetary Computer Catalog</a>: Microsoft Planetary Computer's STAC catalogs.

### Import modules

In [None]:
import os
import json
import geopandas as gpd
import pandas as pd
import numpy as np
import xarray as xr
import odc.stac
import pystac_client
import planetary_computer as pc
import plotly.express as px
import plotly.io as pio
from skimage import io

from pystac.extensions.eo import EOExtension as eo

# setup renderer
if 'google.colab' in str(get_ipython()):
    pio.renderers.default = "colab"
else:
    pio.renderers.default = "jupyterlab"

## Sentinel-2 and Microsoft Planetary Computer

To provide an introduction to the STAC specification and using it to search for spatiotemporal assets, we'll use it to query Microsoft's Planetary Computer to find a cloud free Sentinel-2 satellite image for a field in Western Australia.

We'll be using the <a href="https://pystac-client.readthedocs.io/en/stable/" target="_blank">pystac_client</a> package which is a STAC Python Client providing classes for working with STAC Catalogs and APIs.

First, we need to create a `pystac_client.Client` object which contains the methods and attributes to interact with a given STAC Catalog. Using the `pystac_client.Client.open()` method we can open a STAC Catalog or API and read the root catalog. 

The `pystac_client.Client.open()` method requires a `url` which points to the STAC Catalog or API. The `url` for the Microsoft Planetary Computer STAC API is `"https://planetarycomputer.microsoft.com/api/stac/v1"`.

In [None]:
# open a connection to the Microsoft Planetary Computer's root STAC catalog
pc_catalog = pystac_client.Client.open(
    url="https://planetarycomputer.microsoft.com/api/stac/v1",
    # modifier=pc.sign_inplace
)

A `pystac_client.Client` object has a `search()` method that can be used to specify a query to search a STAC Collection for STAC Items that meet certain conditions. The `search()` method has the following parameters that can be used to define the scope of the query:

* `max_items` - maximum number of items to return from the search. 
* `bbox` - a list or tuple of bounding box coordinates. STAC Items that intersect the bounding box will be returned.
* `intersects` - a str or dict representation of a GeoJSON geometry or Shapely `geometry`. STAC Items that intersect the geometry will be returned. 
* `datetime` - a single datetime or datetime range used to filter STAC Items. 
* `query` - list of JSON or query parameters using the STAC API query extension. 

You can see the full details for the `search()` method <a href="https://pystac-client.readthedocs.io/en/stable/api.html#pystac_client.Client.search" target="_blank">here</a>.
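
Roughly speaking, these arguments end up as fields in the request that is sent to the STAC API's search endpoint. The sketch below assembles a dict shaped like such a request body. `build_search_body()` is a hypothetical helper written for this illustration and the coordinates are made up; `pystac_client` builds and sends the real request for you, and the exact request it sends may differ.

```python
import json


def build_search_body(collections, bbox=None, datetime=None, query=None, limit=None):
    """Assemble a dict shaped like a STAC API item-search request body.

    Hypothetical helper for illustration only.
    """
    body = {"collections": list(collections)}
    # only include the filters that were actually provided
    if bbox is not None:
        body["bbox"] = list(bbox)
    if datetime is not None:
        body["datetime"] = datetime
    if query is not None:
        body["query"] = query
    if limit is not None:
        body["limit"] = limit
    return body


body = build_search_body(
    collections=["sentinel-2-l2a"],
    bbox=[116.0, -31.0, 117.0, -30.0],  # made-up coordinates
    datetime="2019-10-01/2019-11-01",
    query={"eo:cloud_cover": {"lt": 10}},
)
print(json.dumps(body, indent=2))
```

Seeing the search as a small JSON document can help when reading the STAC API documentation, which describes filters in these terms rather than as Python arguments.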

#### Area of interest

Before we can `search()` the Planetary Computer STAC Catalog we need to create the geographic extent for our query. 

We're going to start by reading in a geometry for the field boundary stored in a shapefile. We need to convert the shapefile to one of:

* bounding box coordinates
* a GeoJSON geometry
* a Shapely `geometry`

We'll demonstrate how to do each of these conversions for your reference. 

First let's read the data from file. Then, we'll compute the <a href="https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoSeries.envelope.html" target="_blank">`envelope`</a> of the field's geometry. The envelope is the smallest rectangular geometry to cover the field's geometry. It is often beneficial to pass in simpler geometries than more complex shapes for identifying STAC Items that intersect with an area-of-interest.

In [None]:
# load field boundary from shapefile
data_path = os.path.join(os.getcwd(), "data_lab-6", "BF66_bdy.shp")
aoi = gpd.read_file(data_path)

# add the field boundary to a map object
m = aoi.explore()
aoi_env = aoi["geometry"].envelope
# draw envelope in red
aoi_env.explore(m=m, color="red", style_kwds={"fillOpacity": 0})

A `GeoSeries` is a sequence of Shapely `geometry` objects. Thus, we can just extract the first and only element of the `aoi_env` `GeoSeries` to obtain a Shapely `geometry`.

In [None]:
# get Shapely geometry object
aoi_shapely = aoi_env[0]
print(aoi_shapely)

The process to obtain a GeoJSON str or dict representation of the envelope is more involved. First, we use the `GeoPandas` `to_json()` method to convert the `GeoSeries` to a GeoJSON FeatureCollection in str format. 

Then, we use the `json.loads()` function to parse the JSON string data to a Python dict.

Finally, we can subset the `geometry` property out of the dict.

In [None]:
aoi_json = json.loads(aoi_env.to_json())
print("AOI Envelope as GeoJSON FeatureCollection")
print("")
print(aoi_json)
aoi_geometry = dict(aoi_json["features"][0])["geometry"]
print("")
print("AOI Envelope as GeoJSON Geometry")
print("")
print(aoi_geometry)
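
If you don't have the shapefile to hand, the same str to dict to geometry steps can be tried on a small, made-up GeoJSON FeatureCollection. The string below is invented for illustration; it only mirrors the general shape of what `to_json()` returns.

```python
import json

# a made-up GeoJSON FeatureCollection string with one rectangular feature
feature_collection_str = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "id": "0",
      "type": "Feature",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[116.0, -31.0], [116.5, -31.0], [116.5, -30.5], [116.0, -30.5], [116.0, -31.0]]]
      }
    }
  ]
}
"""

# str -> dict
feature_collection = json.loads(feature_collection_str)
# FeatureCollection -> first Feature -> geometry
geometry = feature_collection["features"][0]["geometry"]
print(geometry["type"])
# number of vertices in the exterior ring (first and last vertex are the same point)
print(len(geometry["coordinates"][0]))
```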

Finally, it is simple to obtain a list of coordinates for the bounding box by using the `total_bounds` property of the `GeoSeries` and converting it to a list object.

See the GeoPandas <a href="https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoSeries.total_bounds.html" target="_blank">`total_bounds` docs</a>.

In [None]:
bbox = aoi_env.total_bounds.tolist()
bbox
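
The values in the bounding box are ordered `[minx, miny, maxx, maxy]`. As a plain-Python sketch of what that means, the snippet below computes the same kind of list from the vertices of a made-up polygon (the coordinates are invented for illustration).

```python
# exterior ring of a made-up polygon as (lon, lat) pairs
ring = [(116.2, -30.9), (116.6, -30.8), (116.5, -30.4), (116.1, -30.5), (116.2, -30.9)]

lons = [lon for lon, lat in ring]
lats = [lat for lon, lat in ring]

# same ordering as total_bounds: [minx, miny, maxx, maxy]
example_bbox = [min(lons), min(lats), max(lons), max(lats)]
print(example_bbox)
```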

#### Datetime

Let's specify a datetime range to search. Here, we'll look for all Sentinel-2 STAC Items that intersect our area-of-interest for the month of October 2019. 

In [None]:
time_of_interest = "2019-10-01/2019-11-01"
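
The range is written as two ISO 8601 dates separated by a forward slash. The sketch below uses only the standard library to split and parse the string; it illustrates the format rather than what `pystac_client` does internally. Note that the end date of `2019-11-01` falls in November: if the end of the range is treated as inclusive (which is our understanding, but check the `pystac_client` docs), then `"2019-10-01/2019-10-31"` would be the tighter range for October alone.

```python
from datetime import date

time_of_interest = "2019-10-01/2019-11-01"

# an interval is written as "start/end"
start_str, end_str = time_of_interest.split("/")
start = date.fromisoformat(start_str)
end = date.fromisoformat(end_str)

print(start, end)
# number of calendar days spanned when both the start and end dates are counted
n_days = (end - start).days + 1
print(n_days)
```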

#### Extensions

The STAC specification permits extensions which allow for more detailed descriptions of STAC Items in a collection. A commonly used extension is the <a href="https://github.com/stac-extensions/eo" target="_blank">`Electro-Optical Extension Specification`</a> for describing snapshots of the Earth for a point-in-time and designed for data that's captured for one or more wavelengths of the electromagnetic spectrum (i.e. remote sensing data).

It includes the following item properties:

* `eo:bands`: an array of available bands (i.e. different spectral wavebands for a remote sensing image).
* `eo:cloud_cover`: an estimate of cloud cover for the STAC Item.
* `eo:snow_cover`: an estimate of snow and ice cover for the STAC Item.

The `eo:cloud_cover` property could be useful to help with searching a STAC Collection for cloud free scenes.

We can set up a query of `eo` properties as: `{"eo:cloud_cover": {"lt": 10}}`. This will find all STAC Items with a property of `eo:cloud_cover` less than 10%.
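
To see how a query like this reads, the sketch below applies the same kind of condition to a list of made-up item properties in plain Python. `matches_query()` and the `OPERATORS` table are hypothetical and cover only a subset of comparison operators; in practice the filtering is done by the STAC API on the server, not in your notebook.

```python
import operator

# comparison operators used in this sketch (a subset, chosen for illustration)
OPERATORS = {
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
    "eq": operator.eq,
}


def matches_query(properties, query):
    """Return True if a dict of item properties satisfies every condition."""
    for prop_name, conditions in query.items():
        if prop_name not in properties:
            return False
        for op_name, threshold in conditions.items():
            if not OPERATORS[op_name](properties[prop_name], threshold):
                return False
    return True


# made-up item properties
items_properties = [
    {"id": "scene-a", "eo:cloud_cover": 2.5},
    {"id": "scene-b", "eo:cloud_cover": 45.0},
    {"id": "scene-c", "eo:cloud_cover": 9.9},
]

query = {"eo:cloud_cover": {"lt": 10}}
low_cloud = [p["id"] for p in items_properties if matches_query(p, query)]
print(low_cloud)
```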

#### Search

We're now ready to search the Planetary Computer STAC Catalog's `sentinel-2-l2a` for all images with low cloud cover in October 2019 that intersect our area-of-interest. 

The `s2_search` object is an `ItemSearch` instance which represents the search of a STAC API. We can retrieve the STAC Items returned by the search as an `ItemCollection` using the `item_collection()` method.

We can print the `ItemCollection` and interactively explore its contents. This helpfully illustrates the structure of the STAC specification. Our search of the `sentinel-2-l2a` collection returned 2 STAC Items. Each STAC Item corresponds to a Sentinel-2 image.

We can explore each of the STAC Items and see that it has several metadata properties (e.g. Bounding Box, Datetime, platform, proj:epsg, eo:cloud_cover). It also has an Assets slot which stores links to the underlying data referenced by the STAC Item. In this case the assets are cloud-optimised GeoTIFF files stored in Microsoft Azure.

In [None]:
# Search the Planetary Computer's Sentinel-2 collection
s2_search = pc_catalog.search(
    collections=["sentinel-2-l2a"],
    bbox=bbox,
    datetime=time_of_interest,
    query={"eo:cloud_cover": {"lt": 10}},
)

# Check how many items were returned
s2_items = s2_search.item_collection()
print(f"Returned {len(s2_items)} Items")

In [None]:
s2_items

### Download data

Now we've completed a search of the STAC API and identified that there are two Sentinel-2 images that meet our search criteria, we're in a position to download these images and use their data. 

As these are optical images of the Earth's surface, we'd like to use the least cloudy image.  We can write a small routine to find the STAC Item with the lowest `eo:cloud_cover` value and download that item. 

We imported the `EOExtension` class as `eo` at the start of the notebook. We can call its `ext()` method on a STAC Item to extend it with properties from the `eo` extension. This allows us to get the `eo` item properties such as `cloud_cover`.

Let's loop over all the STAC Items in our search, retrieve their `eo:cloud_cover` value, and append that value to a list. 

In [None]:
# empty list
cloud_cover = []
for i in s2_items:
    cloud_cover.append(eo.ext(i).cloud_cover)

Next, we'll find the minimum cloud cover value and that STAC Item's position in our `ItemCollection` `s2_items`. 

In [None]:
min_cloud_cover = min(cloud_cover)
min_cloud_cover_idx = cloud_cover.index(min_cloud_cover)
print(f"The STAC Item with lowest cloud cover had {min_cloud_cover}% cloud cover")
print(f"The index position of the STAC Item with lowest cloud cover in our ItemCollection is {min_cloud_cover_idx}")
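
As an aside, the list-then-index approach above can be collapsed into a single call to `min()` with a `key` function. The sketch below uses plain dicts with made-up values as stand-ins for STAC Items, so it runs without a connection to the Planetary Computer.

```python
# made-up stand-ins for STAC Items: only the cloud cover property is included
example_items = [
    {"id": "scene-a", "properties": {"eo:cloud_cover": 7.1}},
    {"id": "scene-b", "properties": {"eo:cloud_cover": 0.4}},
    {"id": "scene-c", "properties": {"eo:cloud_cover": 3.8}},
]

# min() with a key function returns the item itself, not just the value
least_cloudy = min(example_items, key=lambda item: item["properties"]["eo:cloud_cover"])
print(least_cloudy["id"], least_cloudy["properties"]["eo:cloud_cover"])
```

With real STAC Items, the key function could instead be `lambda item: eo.ext(item).cloud_cover`. The two-step version is kept in this lab because it also gives us the index position, which we use next.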

Let's subset the STAC Item with the lowest cloud cover from our `ItemCollection`. This should give us a single STAC Item which we can inspect.

In [None]:
least_cloudy_s2 = s2_items[min_cloud_cover_idx]
least_cloudy_s2

Now we've identified the STAC Item with the lowest cloud cover, we need to download it. This is where we head to the Assets property of the STAC Item where we see a series of `href` properties with hyperlinks to where that data is physically stored (here, this is in Azure Blob Storage as cloud-optimised GeoTIFF files). 

We can print out the list of Assets associated with the STAC Item.

In [None]:
# print assets properties of STAC Item
least_cloudy_s2.assets.keys()

In [None]:
# lets look at the property for B02 - blue band reflectance
least_cloudy_s2.assets["B02"]

The `href` points to a cloud-optimised GeoTIFF (COG) file stored in Azure Blob Storage (i.e. in the cloud). A COG file is similar to a regular GeoTIFF file, but it is organised so that a client can use HTTP requests to retrieve only the portions of data that correspond to a geographic extent and a particular zoom level.

Planet (a commercial CubeSat company that makes use of STAC and GeoTIFFs in its products) has a blog post, "An Introduction to Cloud Optimized GeoTIFFS (COGs) Part 1: Overview", that introduces COGs.
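
To give a feel for why partial reads matter, here is a simplified sketch that assumes an image stored as a grid of square tiles and counts how many tiles overlap a small pixel window. The image size, tile size, and window are all assumptions chosen for illustration, and `tiles_for_window()` is a hypothetical helper; real COG readers handle this for you (along with overviews and compression).

```python
def tiles_for_window(col_off, row_off, width, height, tile_size=512):
    """List the (tile_row, tile_col) indices that overlap a pixel window."""
    first_col = col_off // tile_size
    last_col = (col_off + width - 1) // tile_size
    first_row = row_off // tile_size
    last_row = (row_off + height - 1) // tile_size
    return [
        (r, c)
        for r in range(first_row, last_row + 1)
        for c in range(first_col, last_col + 1)
    ]


# a hypothetical 10980 x 10980 pixel image split into 512 x 512 pixel tiles
image_size = 10980
tile_size = 512
tiles_per_side = -(-image_size // tile_size)  # ceiling division
total_tiles = tiles_per_side ** 2

# a small window of 200 x 150 pixels somewhere in the image
needed = tiles_for_window(col_off=5000, row_off=7000, width=200, height=150)
print(f"{len(needed)} of {total_tiles} tiles needed")
```

Under these assumptions only a handful of tiles need to be fetched to cover a field-sized window, rather than the whole file.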

#### Recap quiz

<details>
    <summary><b>Why do these features of a cloud-optimised GeoTIFF make them more suited to working with big geospatial datasets than regular GeoTIFF files?</b></summary>
As geospatial datasets increase in size (e.g. satellites capturing data with ever finer spatial resolutions and with a higher cadence) the amount of data we'd need to store and read into memory increases. This might exceed our computer's capacity or result in long runtimes for our program. COGs allow us to just read the data that corresponds to our area-of-interest and not the entire file. This means we can make use of the larger storage capacity of cloud providers and just retrieve the data we need. 
</details>

<p></p>

To download the data for the least cloudy Sentinel-2 image we can use the `load()` function from the <a href="https://odc-stac.readthedocs.io/en/latest/_api/odc.stac.load.html" target="_blank">odc-stac package</a>.

The `load()` function has the following parameters:

* `items` - an iterable of STAC items to download.
* `bands` - a list of bands to download. Defaults to all if an argument is not passed into the function.
* `bbox` - a bounding box of latitude and longitude values to download data for in the format: `[min(lon), min(lat), max(lon), max(lat)]`.
* `geopolygon` - a geometry to download data for which can be a GeoJSON dict, geopandas GeoDataFrame, or shapely object.
* `patch_url` - a function that transforms the URL describing the asset's location. This is useful for working with the Planetary Computer to sign the link.

#### Signing links

To download data from the Planetary Computer the link needs to be "signed". This allows Microsoft to manage traffic and use of the Planetary Computer's resources in the cloud. 

The `planetary_computer` package was imported as `pc` and has a `sign()` function we can use to sign links.
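
As we understand it, signing amounts to appending a short-lived access token to the asset's URL as query parameters. The sketch below mimics that idea with the standard library only. The URL and token are made up, `append_token()` is a hypothetical helper, and this is not how `planetary_computer` is implemented; in practice `pc.sign()` requests a real token for you.

```python
from urllib.parse import urlsplit, urlunsplit


def append_token(href, token):
    """Append a query-string token to a URL (hypothetical helper)."""
    parts = urlsplit(href)
    # keep any existing query parameters and add the token after them
    query = f"{parts.query}&{token}" if parts.query else token
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


# made-up URL and token, for illustration only
unsigned_href = "https://example.blob.core.windows.net/container/scene/B03.tif"
fake_token = "st=2019-10-21T00%3A00%3A00Z&se=2019-10-22T00%3A00%3A00Z&sig=not-a-real-signature"

signed_href = append_token(unsigned_href, fake_token)
print(signed_href)
```

Because tokens of this kind expire, a signed link that worked earlier in a session may stop working later, in which case re-signing the item is the usual fix.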

For a full list of parameters that you can use to control how `load()` downloads data from STAC inspect the <a href="https://odc-stac.readthedocs.io/en/latest/_api/odc.stac.load.html" target="_blank">docs</a>.  

The data is downloaded into an `xarray.Dataset` object. 

In [None]:
s2_xr = odc.stac.load(
    [least_cloudy_s2], patch_url=pc.sign, bbox=bbox
)
s2_xr

The Sentinel-2 data that we have downloaded is stored as a collection of `xarray.DataArray` objects within a larger container called `xarray.Dataset`. An `xarray.Dataset` can store many `xarray.DataArray` objects that share `dims` and `coordinates`. Here, we have different arrays of different `Variables` that correspond to the same locations and time-periods but different spectral bands.

If we want to stack the arrays corresponding to spectral bands into a 3D-array stored within a `xarray.DataArray` object, we can use the `to_array()` method.

In [None]:
s2_xr = s2_xr.to_array()
s2_xr

#### Recap quiz

**Can you visualise the array of visible green reflectance values from the `xarray.DataArray` `s2_xr`?**

**Green reflectance is referenced by the Variable label `B03`.**

In [None]:
## ADD CODE HERE

<details>
    <summary><b>answer</b></summary>

```python
s2_xr.sel(variable="B03").plot(robust=True)
```
<p></p>

Or, if we're being precise we can also select the array to visualise by variable and time labels. This is also necessary if we're visualising the data using `imshow()` which requires either a 2D array or 3D array with three bands (to visualise as an RGB image).
```python
s2_xr.sel(variable="B03", time="2019-10-21T02:04:51.024000000").plot.imshow(robust=True)
```

</details>

#### Recap quiz

**Often we don't need all the spectral bands in a satellite image and it's good to be efficient with the amount of data we download. You can call the `keys()` method on a STAC `item`'s `assets` to get a list of bands (e.g. `least_cloudy_s2.assets.keys()`). Can you use this information and the `bands` parameter of `odc.stac.load()` to download only the red and near infrared Sentinel-2 bands?**

**You can find a table listing the Sentinel-2 bands <a href="https://planetarycomputer.microsoft.com/dataset/sentinel-2-l2a" target="_blank">here</a>.**

In [None]:
## ADD CODE HERE 

<details>
    <summary><b>answer</b></summary>

```python
s2_r_nir_xr = odc.stac.load(
    [least_cloudy_s2], bands=["B04", "B08"], patch_url=pc.sign, bbox=bbox
).to_array()
s2_r_nir_xr
```
</details>

## Landsat and Microsoft Planetary Computer

<a href="" target="_blank">Landsat</a> is a series of satellite missions run by the US Geological Survey (USGS) and NASA. It has been operational since 1972 and the current Landsat mission is Landsat 9. Landsat satellites provide measures of spectral reflectance in the visible, near infrared, shortwave infrared, and thermal portions of the electromagnetic spectrum. The visible and near infrared bands have a spatial resolution of 30 m. 

The Landsat level 2 STAC Collection in the Planetary Computer is labelled `"landsat-c2-l2"`. We can search this STAC Collection for Landsat scenes that intersect the bounding box for our field and for a specified time frame:

In [None]:
# Search the Planetary Computer's Landsat collection
landsat_search = pc_catalog.search(
    collections=["landsat-c2-l2"],
    bbox=bbox,
    datetime=time_of_interest,
    query={
        "eo:cloud_cover": {"lt": 10},
        "platform": {"in": ["landsat-8", "landsat-9"]}
    },
)

# Check how many items were returned
landsat_items = landsat_search.item_collection()
print(f"Returned {len(landsat_items)} Items")

Similar to the example working with Sentinel-2 data above, we can query each `item`'s cloud cover property to find the least cloudy image. 

In [None]:
# empty list
cloud_cover = []
for i in landsat_items:
    cloud_cover.append(eo.ext(i).cloud_cover)

min_cloud_cover = min(cloud_cover)
min_cloud_cover_idx = cloud_cover.index(min_cloud_cover)
print(f"The STAC Item with lowest cloud cover had {min_cloud_cover}% cloud cover")
print(f"The index position of the STAC Item with lowest cloud cover in our ItemCollection is {min_cloud_cover_idx}")

Next, we can select the least cloudy `item` and display its attributes.

In [None]:
least_cloudy_item = landsat_items[min_cloud_cover_idx]
least_cloudy_item

We can print out the band names and their description too. This is useful if we want to subset particular bands from the image for further analysis or visualisation. 

In [None]:
max_key_length = len(max(least_cloudy_item.assets, key=len))
for key, asset in least_cloudy_item.assets.items():
    print(f"{key.rjust(max_key_length)}: {asset.title}")

Finally, let's read the red, green, blue, and near infrared bands for the least cloudy Landsat image covering the field we're working with. 

In [None]:
landsat_xr = odc.stac.load(
    [least_cloudy_item], bands=["blue", "green", "red", "nir08"], patch_url=pc.sign, bbox=bbox
).isel(time=0).to_array()
landsat_xr

#### Recap quiz

**Can you visualise the least cloudy Landsat image as a true colour (RGB) composite image?**

**You can refer to the <a href="https://planetarycomputer.microsoft.com/dataset/landsat-c2-l2#Example-Notebook" target="_blank">Planetary Computer examples</a> to help with this task.**

In [None]:
## ADD CODE HERE

<details>
    <summary><b>answer</b></summary>

```python
landsat_xr.sel(variable=["red", "green", "blue"]).plot.imshow(robust=True)
```
</details>

#### Recap quiz

**Can you compute the NDVI for the least cloudy Landsat image and visualise it?**

**You can refer to the <a href="https://planetarycomputer.microsoft.com/dataset/landsat-c2-l2#Example-Notebook" target="_blank">Planetary Computer examples</a> to help with this task.**

In [None]:
## ADD CODE HERE

<details>
    <summary><b>answer</b></summary>

```python
red = landsat_xr.sel(variable="red").astype("float")
nir = landsat_xr.sel(variable="nir08").astype("float")
ndvi = (nir - red) / (nir + red)
ndvi.plot.imshow(robust=True)
```
</details>
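
For reference, NDVI is computed per pixel as `(NIR - red) / (NIR + red)`. The plain-Python sketch below works through one made-up pixel. It also rescales the stored integer values to surface reflectance first, assuming the scale factor (0.0000275) and offset (-0.2) published for Landsat Collection 2 Level-2 surface reflectance; treat those defaults as an assumption to check against the dataset documentation. `to_reflectance()` is a hypothetical helper.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index for a pair of values."""
    return (nir - red) / (nir + red)


def to_reflectance(dn, scale=0.0000275, offset=-0.2):
    """Convert a stored integer value to surface reflectance.

    The default scale and offset are assumed to be the values published
    for Landsat Collection 2 Level-2 surface reflectance.
    """
    return dn * scale + offset


# made-up stored values for one vegetated pixel
red_dn, nir_dn = 10000, 20000

red = to_reflectance(red_dn)
nir = to_reflectance(nir_dn)

# NDVI from rescaled reflectance
print(round(ndvi(nir, red), 3))
# NDVI from the stored integers, for comparison
print(round(ndvi(nir_dn, red_dn), 3))
```

Because of the offset, the two results differ, so it may be worth rescaling before computing indices when absolute NDVI values matter. For a quick visual check of relative greenness, the stored values are often sufficient.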

## Visualising a time-series of satellite images

We can use the `odc.stac.load()` function to download many satellite images of the same location captured on different dates. Let's visualise all the relatively cloud free Landsat 8 images captured during the 2019 growing season for the field in Western Australia we're working with. First, we need to expand the time period we're searching for images in.

In [None]:
season_of_interest = "2019-05-01/2019-10-31"

We can pass this expanded time period into the `search()` method of the `pc_catalog` object to search the Planetary Computer's `landsat-c2-l2` collection for all Landsat 8 scenes that intersect our bounding box between May and October 2019.

In [None]:
# Search the Planetary Computer's Landsat collection
landsat_search = pc_catalog.search(
    collections=["landsat-c2-l2"],
    bbox=bbox,
    datetime=season_of_interest,
    query={
        "eo:cloud_cover": {"lt": 10},
        "platform": {"in": ["landsat-8"]}
    },
)

# Check how many items were returned
landsat_items = landsat_search.item_collection()
print(len(landsat_items))

The first argument to `odc.stac.load()` is an iterable object (e.g. a list) of STAC `items`. We can pass in the `landsat_items` `ItemCollection` object which stores a series of STAC `items`. 

In [None]:
landsat_xr = odc.stac.load(
    landsat_items, bands=["blue", "green", "red", "nir08"], patch_url=pc.sign, bbox=bbox
)

This has returned to us an `xarray.Dataset` with four variables (one for each of the spectral bands we requested) and each variable is a 3D array with x, y, and time dimensions. 

In [None]:
landsat_xr

Let's convert the `xarray.Dataset` object to an `xarray.DataArray` object which stores a 4D array with x, y, variable, and time dimensions. 

In [None]:
landsat_arr = landsat_xr.to_array()
landsat_arr

If you refer back to week 2, we used facet plots, where we plot data on many subplots that share axes. Plotting a time series of satellite images is a good use for a faceted plot. We can create a different subplot for each time point and keep the x and y axes representing geographic location the same. An `xarray.DataArray`'s `plot.imshow()` method has a `col` argument that we can pass a dimension into for creating faceted plots; here, we'll pass in the `"time"` dimension.

In [None]:
landsat_arr.sel(variable=["red", "green", "blue"]).plot.imshow(col="time", col_wrap=2)

## Downloading meteorological data from STAC Collections

Alongside remote sensing images, meteorological data is often organised within STAC Collections. There are lots of examples of how to access meteorological data from the <a href="https://planetarycomputer.microsoft.com/catalog" target="_blank">Planetary Computer's Data Catalog</a>. You are encouraged to try out a few of them.

Here is a short example of how we can retrieve air temperature data covering the field in Western Australia from the <a href="https://planetarycomputer.microsoft.com/dataset/era5-pds#overview" target="_blank">ERA5 climate reanalysis product</a>.

First, let's search the `era5-pds` collection for all data in May 2019 and extract the first item in the `ItemCollection` returned to us. We can see that the assets key points to the location of data representing many climate variables (`href` - a URL to where the data file is stored in the cloud).

In [None]:
search = pc_catalog.search(
    collections=["era5-pds"], datetime="2019-05", query={"era5:kind": {"eq": "an"}}
)
items = search.item_collection()

print(len(items))
item = items[0]
item

To download data from the Planetary Computer we need to sign it, as discussed above. Let's also subset out the air temperature asset.

In [None]:
signed_item = pc.sign(item)
air_temp = signed_item.assets["air_temperature_at_2_metres"]

We can directly read the air temperature data into an `xarray.Dataset` object using the `xr.open_dataset()` function as the data is stored in zarr format (refer back to week 3 for a refresher on zarr data).

In [None]:
ds = xr.open_dataset(air_temp.href, **air_temp.extra_fields["xarray:open_kwargs"])

In [None]:
ds

Let's get the coordinates for the centre of the field, and extract the air temperature data for that location and plot it. 

In [None]:
# get the coordinates for the field's centroid
x, y = aoi_env.centroid[0].coords.xy
print(x, y)

In [None]:
ds["air_temperature_at_2_metres"].sel(lon=x[0], lat=y[0], method="nearest").plot()

#### Recap quiz

**What does setting `method="nearest"` enable in the `sel()` method?**

<details>
    <summary><b>answer</b></summary>

It lets us select the `xarray.DataArray` values at the grid coordinates closest to the lat and lon values we pass in, even when those exact values are not present in the array's coordinates.
</details>
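
The idea behind nearest-neighbour selection can be sketched in plain Python. The grid and temperature values below are made up, and this is only an illustration of the concept rather than how `xarray` implements `sel()`.

```python
# a made-up regular grid of latitudes with 0.25 degree spacing
lats = [-30.0, -30.25, -30.5, -30.75, -31.0]
# made-up temperature values, one per latitude
temps = [291.2, 290.8, 290.1, 289.7, 289.5]

target_lat = -30.58  # not present in lats

# nearest-neighbour lookup: pick the grid position with the smallest distance to the target
nearest_idx = min(range(len(lats)), key=lambda i: abs(lats[i] - target_lat))
print(lats[nearest_idx], temps[nearest_idx])
```

Without `method="nearest"`, a label-based lookup for `-30.58` would fail because that exact value is not one of the grid's coordinates.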