# Accessing Satellite Imagery

## Searching a STAC Catalog
The [STAC Browser](https://radiantearth.github.io/stac-browser/#/) is a website where you can discover available datasets.

In [1]:
#specify url
api_url = 'https://earth-search.aws.element84.com/v0'

You can query a STAC API endpoint from Python using the pystac_client library:

In [2]:
from pystac_client import Client

client = Client.open(api_url)

We ask for scenes belonging to the sentinel-s2-l2a-cogs collection. This dataset includes Sentinel-2 data products pre-processed at level 2A (bottom-of-atmosphere reflectance) and saved in Cloud Optimized GeoTIFF (COG) format:

In [3]:
#specify collection
collection = 'sentinel-s2-l2a-cogs'

We also ask for scenes intersecting a geometry defined using the shapely library (in this case, a point):

In [4]:
from shapely.geometry import Point

point = Point(4.89, 52.37)

Note: at this stage, we are only dealing with metadata, so no image is going to be downloaded yet. But even metadata can be quite bulky if a large number of scenes match our search! For this reason, we limit the search result to 10 items:

In [5]:
search = client.search(
    collections=[collection],
    intersects=point,
    max_items=10,
)

We submit the query and find out how many scenes match our search criteria (please note that this output can be different as more data is added to the catalog):

In [6]:
print(search.matched())

791


Finally, we retrieve the metadata of the search results:

In [7]:
items = search.get_all_items()

In [8]:
#check size
print(len(items))

10


This is consistent with the maximum number of items that we set in the search criteria. We can iterate over the returned items and print them to show their IDs:

In [9]:
for item in items:
    print(item)

<Item id=S2A_31UFU_20230502_0_L2A>
<Item id=S2A_31UFU_20230425_0_L2A>
<Item id=S2A_31UFU_20230422_0_L2A>
<Item id=S2B_31UFU_20230420_0_L2A>
<Item id=S2B_31UFU_20230417_0_L2A>
<Item id=S2A_31UFU_20230415_0_L2A>
<Item id=S2A_31UFU_20230412_0_L2A>
<Item id=S2B_31UFU_20230410_0_L2A>
<Item id=S2B_31UFU_20230407_0_L2A>
<Item id=S2A_31UFU_20230405_0_L2A>
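
The item IDs printed above appear to follow a `<platform>_<tile>_<date>_<sequence>_<level>` pattern. This reading is inferred from the IDs themselves rather than taken from a documented specification. A minimal sketch that splits one ID under that assumption (the helper name `parse_item_id` is made up for illustration):

```python
from datetime import datetime

def parse_item_id(item_id):
    """Split an ID such as 'S2A_31UFU_20230502_0_L2A' into its parts.

    Assumes the layout <platform>_<tile>_<YYYYMMDD>_<sequence>_<level>,
    which is inferred from the printed IDs, not from a specification.
    """
    platform, tile, day, sequence, level = item_id.split('_')
    return {
        'platform': platform,
        'tile': tile,
        'date': datetime.strptime(day, '%Y%m%d').date(),
        'sequence': int(sequence),
        'level': level,
    }

parsed = parse_item_id('S2A_31UFU_20230502_0_L2A')
print(parsed['tile'], parsed['date'])
```

An ID that does not have exactly five underscore-separated parts raises a `ValueError`, which is a reasonable signal that the assumed pattern does not apply.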


Each of the items contains information about the scene geometry, its acquisition time, and other metadata that can be accessed as a dictionary from the properties attribute.

Let’s inspect the metadata associated with the first item of the search results:

In [10]:
item = items[0]
print(item.datetime)
print(item.geometry)
print(item.properties)

2023-05-02 10:46:26+00:00
{'type': 'Polygon', 'coordinates': [[[6.071664488869862, 52.22257539160586], [4.807969632231231, 52.248710365643134], [5.234343690941738, 53.22867343801652], [6.1417542968794585, 53.20819279121764], [6.071664488869862, 52.22257539160586]]]}
{'datetime': '2023-05-02T10:46:26Z', 'platform': 'sentinel-2a', 'constellation': 'sentinel-2', 'instruments': ['msi'], 'gsd': 10, 'view:off_nadir': 0, 'proj:epsg': 32631, 'sentinel:utm_zone': 31, 'sentinel:latitude_band': 'U', 'sentinel:grid_square': 'FU', 'sentinel:sequence': '0', 'sentinel:product_id': 'S2A_MSIL2A_20230502T103621_N0509_R008_T31UFU_20230502T150052', 'sentinel:data_coverage': 66.95, 'eo:cloud_cover': 88.85, 'sentinel:valid_cloud_cover': True, 'sentinel:processing_baseline': '05.09', 'sentinel:boa_offset_applied': True, 'created': '2023-05-02T20:03:12.581Z', 'updated': '2023-05-02T20:03:12.581Z'}
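
Because `item.properties` is a plain dictionary, individual fields can be looked up by key. The sketch below uses a hand-copied subset of the output above instead of a live item, so it runs without network access. The 10% threshold is only an illustrative choice.

```python
# Subset of the properties printed above, copied by hand for illustration.
properties = {
    'datetime': '2023-05-02T10:46:26Z',
    'platform': 'sentinel-2a',
    'proj:epsg': 32631,
    'sentinel:data_coverage': 66.95,
    'eo:cloud_cover': 88.85,
}

cloud_cover = properties['eo:cloud_cover']
# .get() avoids a KeyError for fields that a catalog may not provide.
epsg = properties.get('proj:epsg')

is_clear = cloud_cover < 10  # illustrative threshold, in percent
print(f'cloud cover: {cloud_cover}%, EPSG: {epsg}, clear: {is_clear}')
```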


### Exercise: Search satellite scenes using metadata filters
Search for all the available Sentinel-2 scenes in the sentinel-s2-l2a-cogs collection that satisfy the following criteria:

- intersect a provided bounding box (use ±0.01 deg in lat/lon from the previously defined point);
- have been recorded between 20 March 2020 and 30 March 2020;
- have a cloud coverage smaller than 10% (hint: use the query input argument of client.search).

In [11]:
bbox = point.buffer(0.01).bounds

search = client.search(
    collections=[collection],
    bbox=bbox,
    datetime='2020-03-20/2020-03-30',
    query=['eo:cloud_cover<10'],
)

print(search.matched())

4
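
`point.buffer(0.01)` builds a polygon approximating a circle of radius 0.01 degrees around the point, and `.bounds` returns its `(minx, miny, maxx, maxy)` envelope. As a rough cross-check, the same box can be written out by hand. This sketch assumes the buffer reaches exactly ±0.01 degrees along each axis, and the name `manual_bbox` is only illustrative.

```python
lon, lat = 4.89, 52.37   # same coordinates as the point defined earlier
delta = 0.01             # half-width of the box, in degrees

# Order follows the (minx, miny, maxx, maxy) convention of shapely's .bounds
manual_bbox = (lon - delta, lat - delta, lon + delta, lat + delta)

# Round for display, since floating-point subtraction is not exact.
print([round(value, 2) for value in manual_bbox])
```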


Save the search results in GeoJSON format:

In [12]:
items = search.get_all_items()
#items.save_object('search.json')
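
To illustrate the kind of file a GeoJSON export produces, the sketch below writes and reloads a `FeatureCollection` with the standard library only. The single feature is a trimmed, hand-built stand-in for a STAC item (its geometry is rounded from the first search result shown earlier), and the exact fields that pystac writes are not reproduced here.

```python
import json
import os
import tempfile

# Hand-built stand-in for one STAC item; coordinates are rounded from the
# geometry printed earlier, and most fields of a real item are omitted.
feature = {
    'type': 'Feature',
    'id': 'S2A_31UFU_20230502_0_L2A',
    'geometry': {
        'type': 'Polygon',
        'coordinates': [[
            [6.0717, 52.2226], [4.8080, 52.2487], [5.2343, 53.2287],
            [6.1418, 53.2082], [6.0717, 52.2226],
        ]],
    },
    'properties': {'datetime': '2023-05-02T10:46:26Z', 'eo:cloud_cover': 88.85},
}
collection_dict = {'type': 'FeatureCollection', 'features': [feature]}

# Write to a temporary directory so the sketch leaves no file behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'search.json')
    with open(path, 'w') as f:
        json.dump(collection_dict, f)
    with open(path) as f:
        reloaded = json.load(f)

print(reloaded['type'], len(reloaded['features']))
```

Since the result is ordinary GeoJSON, it can be opened in most GIS tools to inspect the scene footprints.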

## Access the assets
So far we have only discussed metadata - but how can one get to the actual images of a satellite scene (the “assets” in the STAC nomenclature)? These can be reached via links that are made available through the item’s attribute assets.


In [13]:
assets = items[0].assets
print(assets.keys())

dict_keys(['thumbnail', 'overview', 'info', 'metadata', 'visual', 'B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B8A', 'B09', 'B11', 'B12', 'AOT', 'WVP', 'SCL'])


In [14]:
# We can print a minimal description of the available assets:
for key, asset in assets.items():
    print(f'{key}: {asset.title}')

thumbnail: Thumbnail
overview: True color image
info: Original JSON metadata
metadata: Original XML metadata
visual: True color image
B01: Band 1 (coastal)
B02: Band 2 (blue)
B03: Band 3 (green)
B04: Band 4 (red)
B05: Band 5
B06: Band 6
B07: Band 7
B08: Band 8 (nir)
B8A: Band 8A
B09: Band 9
B11: Band 11 (swir16)
B12: Band 12 (swir22)
AOT: Aerosol Optical Thickness (AOT)
WVP: Water Vapour (WVP)
SCL: Scene Classification Map (SCL)
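
The asset keys mix raster bands with auxiliary files. Going by the listing above, the spectral band keys all start with `B`, so a simple filter can separate them. This naming pattern is read off the output rather than guaranteed by STAC, and the key list is copied by hand so the sketch runs offline.

```python
# Asset keys copied from the output above.
asset_keys = ['thumbnail', 'overview', 'info', 'metadata', 'visual',
              'B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08',
              'B8A', 'B09', 'B11', 'B12', 'AOT', 'WVP', 'SCL']

# Assumption: spectral bands are the keys starting with an upper-case 'B'.
band_keys = [key for key in asset_keys if key.startswith('B')]
other_keys = [key for key in asset_keys if key not in band_keys]

print(len(band_keys), band_keys)
print(other_keys)
```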


Among others, the assets include multiple raster data files (one per optical band, as acquired by the multi-spectral instrument), a thumbnail, a true-color image (“visual”), instrument metadata and scene-classification information (“SCL”). Let’s get the URL of one of these assets, the thumbnail:

In [15]:
print(assets['thumbnail'].href)

https://roda.sentinel-hub.com/sentinel-s2-l1c/tiles/31/U/FU/2020/3/28/0/preview.jpg


This URL can be used to download the file.
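
One way to do this with only the standard library is sketched below. The helper names are hypothetical, the download call is left commented out because it needs network access, and any access conditions the server may impose are not handled here.

```python
import posixpath
from urllib.parse import urlparse
from urllib.request import urlretrieve

def filename_from_href(href):
    """Return the last path component of a URL, e.g. 'preview.jpg'."""
    return posixpath.basename(urlparse(href).path)

def download_asset(href, target=None):
    """Download href to a local file and return the local path."""
    target = target or filename_from_href(href)
    urlretrieve(href, target)  # performs the actual HTTP request
    return target

href = 'https://roda.sentinel-hub.com/sentinel-s2-l1c/tiles/31/U/FU/2020/3/28/0/preview.jpg'
print(filename_from_href(href))
# download_asset(href)  # uncomment to fetch the thumbnail
```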

Remote raster data can be opened directly with the rioxarray library:

In [16]:
import rioxarray

b01_href = assets['B01'].href
b01 = rioxarray.open_rasterio(b01_href)
print(b01)

<xarray.DataArray (band: 1, y: 1830, x: 1830)>
[3348900 values with dtype=uint16]
Coordinates:
  * band         (band) int32 1
  * x            (x) float64 6e+05 6.001e+05 6.002e+05 ... 7.097e+05 7.098e+05
  * y            (y) float64 5.9e+06 5.9e+06 5.9e+06 ... 5.79e+06 5.79e+06
    spatial_ref  int32 0
Attributes:
    AREA_OR_POINT:       Area
    OVR_RESAMPLING_ALG:  AVERAGE
    _FillValue:          0
    scale_factor:        1.0
    add_offset:          0.0
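
The printed summary already tells us how large the band is. Here is a quick back-of-the-envelope check that uses only numbers from the output above plus one assumption: that the tile spans 109,800 m per side (the nominal Sentinel-2 tile size). That figure is consistent with the rounded coordinates but is not stated by them.

```python
n_bands, n_rows, n_cols = 1, 1830, 1830   # shape from the DataArray summary
bytes_per_value = 2                       # uint16 uses 2 bytes per value

n_values = n_bands * n_rows * n_cols
size_mib = n_values * bytes_per_value / 1024**2

# Assumption: nominal Sentinel-2 tile width of 109,800 m.
tile_width_m = 109_800
pixel_size_m = tile_width_m / n_cols

print(n_values)  # should agree with the value count reported in the summary
print(f'{size_mib:.1f} MiB uncompressed, about {pixel_size_m:.0f} m per pixel')
```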


In [17]:
#save image to disk
#b01.rio.to_raster('B01.tif')

### Exercise: Downloading Landsat 8 Assets
In this exercise we put into practice the skills we have learned in this episode to retrieve images from a different mission: Landsat 8. In particular, we browse images from the Harmonized Landsat Sentinel-2 (HLS) project, which provides images from NASA’s Landsat 8 and ESA’s Sentinel-2 that have been made consistent with each other. The HLS catalog is indexed in the NASA Common Metadata Repository (CMR) and can be accessed from the STAC API endpoint at the following URL: https://cmr.earthdata.nasa.gov/stac/LPCLOUD.

Using pystac_client, search for all items of the Landsat 8 collection (HLSL30.v2.0) from February to March 2021 that intersect the point with longitude/latitude coordinates (-73.97, 40.78) deg.
Then visualize an item’s thumbnail (asset key browse).

In [18]:
# specify url
api_url = 'https://cmr.earthdata.nasa.gov/stac/LPCLOUD'

# open the STAC API endpoint with pystac_client
client = Client.open(api_url)

#specify collection
collection = 'HLSL30.v2.0'

In [19]:
#create search parameter
search = client.search(
    collections=[collection],
    intersects=Point(-73.97, 40.78),
    datetime='2021-02-01/2021-03-31',
)

print(search.matched())

5


In [20]:
#retrieve the items of the HLS search first; otherwise `items` still holds
#the Sentinel-2 results from the previous section
items = search.get_all_items()
assets = items[0].assets
for key, asset in assets.items():
    print(f'{key}: {asset.title}')


In [22]:
#retrieve the search results
items = search.get_all_items()

#sort by cloud cover and select the least cloudy item
items_sorted = sorted(items, key=lambda x: x.properties['eo:cloud_cover'])
item = items_sorted[0]
print(item)

<Item id=HLS.L30.T18TWL.2021039T153324.v2.0>
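
The sort above raises a `KeyError` if any item lacks the `eo:cloud_cover` property. Whether that can happen depends on the catalog. A defensive variant is sketched below on hypothetical property dictionaries (the IDs and values are invented for illustration, not taken from the HLS catalog):

```python
# Hypothetical stand-ins for item.properties; values are invented.
fake_properties = [
    {'id': 'scene-a', 'eo:cloud_cover': 42},
    {'id': 'scene-b'},                        # no cloud cover reported
    {'id': 'scene-c', 'eo:cloud_cover': 7},
]

# Scenes without a cloud-cover value sort last instead of raising KeyError.
ranked = sorted(fake_properties,
                key=lambda props: props.get('eo:cloud_cover', float('inf')))
print([props['id'] for props in ranked])
```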


In [23]:
print(item.assets['browse'].href)

https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-public/HLSL30.020/HLS.L30.T18TWL.2021039T153324.v2.0/HLS.L30.T18TWL.2021039T153324.v2.0.jpg
