# 2.3 Data Access and Basic Processing: Lazy Loading with Pangeo

<center><img src="https://raw.githubusercontent.com/EO-College/cubes-and-clouds/main/icons/cnc_3icons_process_circle.svg"
     alt="Cubes & Clouds logo"
     style="float: center; margin-right: 10px; margin-left: 10px; max-height: 250px;" /></center>

<img src="https://raw.githubusercontent.com/pangeo-data/pangeo.io/refs/heads/main/public/Pangeo-assets/pangeo_logo.png"
     alt="Pangeo logo"
     style="float: center; margin-right: 10px; max-height: 80px;"/>

This exercise uses the Pangeo ecosystem to access and process data.

## Lazy data loading with the Pangeo ecosystem

When you access data through an API, it is usually loaded **lazily**.

This means that only the metadata is loaded at first. The metadata is enough to know the data dimensions and their extents (spatial and temporal), the available bands, and other additional information.
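The idea can be illustrated with a minimal, library-free sketch. The `LazyArray` class and the `fake_download` function below are hypothetical and only mimic the behaviour; they are not part of Pangeo, xarray, or stackstac. The point is that shape and size come from metadata alone, while the expensive loading step runs only when it is explicitly requested.

```python
class LazyArray:
    """Hypothetical stand-in for a lazily loaded array: metadata now, values later."""

    def __init__(self, shape, itemsize, loader):
        self.shape = shape        # known from metadata alone
        self.itemsize = itemsize  # bytes per element, also metadata
        self._loader = loader     # callable that fetches the actual values

    @property
    def nbytes(self):
        # The size is derived from metadata; no values are read.
        total = self.itemsize
        for dim in self.shape:
            total *= dim
        return total

    def compute(self):
        # Only here is the (potentially expensive) loader called.
        return self._loader()


calls = []  # records every simulated download


def fake_download():
    # Hypothetical loader that pretends to fetch pixel values.
    calls.append("download")
    return [[1, 2], [3, 4]]


lazy = LazyArray(shape=(2, 2), itemsize=8, loader=fake_download)
print(lazy.shape, lazy.nbytes, calls)  # metadata is available, nothing downloaded yet
values = lazy.compute()
print(values, calls)                   # the download happened exactly once
```

Real lazy arrays follow the same pattern, with the loader replaced by reads from remote files.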

Let's start by using the Python STAC client library `pystac_client` to search a public STAC Collection for Sentinel-2 data, which we will then load lazily.

We need to specify an Area Of Interest (AOI) to get only part of the Collection. Otherwise, our code would try to load the metadata of all Sentinel-2 tiles available in the world!
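Before sending a search, it can help to check the AOI and get a feeling for its size. The sketch below is a hypothetical helper (`describe_bbox` is not part of any STAC library). It assumes a `[west, south, east, north]` box in degrees, rejects boxes that cross the antimeridian for simplicity, and uses the rough approximation of 111.32 km per degree of latitude.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude


def describe_bbox(bbox):
    """Validate a [west, south, east, north] box and return its approximate size in km.

    Simplification: boxes crossing the antimeridian (west > east) are rejected.
    """
    west, south, east, north = bbox
    if not (-180 <= west < east <= 180):
        raise ValueError("expected -180 <= west < east <= 180")
    if not (-90 <= south < north <= 90):
        raise ValueError("expected -90 <= south < north <= 90")
    # Longitude degrees shrink with the cosine of the latitude.
    mid_lat = math.radians((south + north) / 2)
    width_km = (east - west) * KM_PER_DEGREE * math.cos(mid_lat)
    height_km = (north - south) * KM_PER_DEGREE
    return width_km, height_km


# Example box in the same [west, south, east, north] order
width_km, height_km = describe_bbox([11.1, 46.1, 11.5, 46.5])
print(f"AOI is roughly {width_km:.0f} km x {height_km:.0f} km")
```

A common mistake is to swap the coordinate order, for example by passing latitude first; the checks above catch some, but not all, of these cases.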

In [None]:
import pystac_client
import stackstac

In [None]:
# Bounding box in degrees: [west, south, east, north]
spatial_extent = [11.1, 46.1, 11.5, 46.5]
# Start and end dates of the search period
temporal_extent = ["2015-01-01", "2022-01-01"]

**Running the next cell may take up to 2 minutes.**

In [None]:
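# Public STAC API endpoint
URL = "https://earth-search.aws.element84.com/v1"
catalog = pystac_client.Client.open(URL)
# Search the Sentinel-2 L2A collection within the AOI and time range,
# then collect the matching items (metadata only, no pixel data)
items = catalog.search(
    bbox=spatial_extent,
    datetime=temporal_extent,
    collections=["sentinel-2-l2a"],
).item_collection()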

Calling `stackstac.stack()` on `items` loads the data lazily and returns an `xarray.DataArray` object.

Running the next cell displays the content of the selected data, including the dimension names and their extents.

In [None]:
datacube = stackstac.stack(items, bounds_latlon=spatial_extent)
datacube

In the output of the previous cell you can notice something really interesting: **the size of the selected data is more than 3 TB!**

Yet the cell ran far too quickly for this huge amount of data to have been downloaded.

This is what lazy loading allows: you get all the information about the data quickly, without having to access and download all the available files.
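The reported size can be derived from the dimensions alone, which is why no download is needed to know it. As a minimal sketch, the hypothetical helper `estimate_nbytes` below multiplies the dimension sizes by the bytes per element. The shape used here is made up for illustration and is not the shape of the datacube above; the 8 bytes per element assume 64-bit floats, which is the default data type of `stackstac.stack()`.

```python
from math import prod


def estimate_nbytes(shape, itemsize=8):
    """Bytes needed to hold an array of the given shape in memory."""
    return itemsize * prod(shape)


# Hypothetical dimension sizes in (time, band, y, x) order, NOT the real datacube.
hypothetical_shape = (100, 10, 1000, 1000)
size_gb = estimate_nbytes(hypothetical_shape) / 1e9
print(f"{size_gb:.1f} GB")  # prints "8.0 GB"
```

On the real object, `datacube.shape`, `datacube.dtype` and `datacube.nbytes` give the same information directly, still without downloading any pixel values.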

**Quiz hint: look carefully at the dimensions of the loaded datacube!**