# Loading data from the Chile datacube

* **Prerequisites:** Users of this notebook should have a basic understanding of:
    * How to run a [Jupyter notebook](01_Jupyter_notebooks.ipynb)
    * Inspecting available [Products and measurements](02_Products_and_measurements.ipynb)

## Background
Loading data from the Chile instance of the [Open Data Cube](https://www.opendatacube.org/) requires the construction of a data query that specifies the what, where, and when of the data request.
Each query returns a [multi-dimensional xarray object](http://xarray.pydata.org/en/stable/) containing the contents of your query.
It is essential to understand the `xarray` data structures as they are fundamental to the structure of data loaded from the datacube.
Manipulations, transformations and visualisation of `xarray` objects provide datacube users with the ability to explore and analyse datasets, as well as pose and answer scientific questions.

## Description
This notebook will introduce how to load data from the Chile datacube through the construction of a query and use of the `dc.load()` function.
Topics covered include:

* Loading data using `dc.load()`
* Interpreting the resulting `xarray.Dataset` object
    * Inspecting an individual `xarray.DataArray`
* Customising parameters passed to the `dc.load()` function
    * Loading specific measurements
    * Loading data for coordinates in a custom coordinate reference system (CRS)
    * Projecting data to a new CRS and spatial resolution 
    * Specifying a specific spatial resampling method
* Loading data using a reusable dictionary query
* Loading matching data from multiple products using `like`
* Adding a progress bar to the data load

***

## Getting started
To run this introduction to loading data from the datacube, run all the cells in the notebook starting with the "Load packages" cell. For help with running notebook cells, refer back to the [Jupyter Notebooks notebook](01_Jupyter_notebooks.ipynb).

### Load packages
First we need to load the `datacube` package.
This will allow us to query the datacube database and load some data. 
The `with_ui_cbk` function from `odc.ui` will allow us to show a progress bar when loading large amounts of data.

In [1]:
import datacube
from odc.ui import with_ui_cbk

### Connect to the datacube
We then need to connect to the datacube database.
We will then be able to use the `dc` datacube object to load data.
The `app` parameter is a unique name that identifies this notebook; it has no effect on the analysis.

In [2]:
dc = datacube.Datacube(app="03_Loading_data")

## Loading data using `dc.load()`

Loading data from the datacube uses the [dc.load()](https://datacube-core.readthedocs.io/en/latest/dev/api/generate/datacube.Datacube.load.html) function.

The function requires the following minimum arguments:

* `product`: A specific product to load (to revise products, see the [Products and measurements](02_Products_and_measurements.ipynb) notebook).
* `x`: Defines the spatial region in the *x* dimension. By default, the *x* and *y* arguments accept queries in the WGS84 geographic coordinate system, identified by the EPSG code *4326*.
* `y`: Defines the spatial region in the *y* dimension. The dimensions ``longitude``/``latitude`` and ``x``/``y`` can be used interchangeably.
* `time`: Defines the temporal extent. The time dimension can be specified using a tuple of datetime objects or strings in the "YYYY", "YYYY-MM" or "YYYY-MM-DD" format. 
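
To make the accepted `time` formats concrete, the sketch below expands a date string into the first and last calendar day it covers. It only illustrates the idea: `expand_time` is a hypothetical helper, not part of `datacube`, and the library's own parsing may differ in detail.

```python
import calendar
from datetime import date
from typing import Tuple


def expand_time(value: str) -> Tuple[date, date]:
    """Return the first and last day covered by 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD'."""
    parts = [int(p) for p in value.split("-")]
    if len(parts) == 1:
        (year,) = parts
        return date(year, 1, 1), date(year, 12, 31)
    if len(parts) == 2:
        year, month = parts
        # monthrange returns (weekday of first day, number of days in month)
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    if len(parts) == 3:
        day = date(*parts)
        return day, day
    raise ValueError(f"unsupported time format: {value!r}")


print(expand_time("2020"))      # the whole of 2020
print(expand_time("2020-02"))   # all of February 2020 (a leap year)
```

Under this reading, `time="2020"` and `time=("2020-01-01", "2020-12-31")` describe the same period.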

Let's run a query to load 2020 data from Landsat 8 over an area of northern Chile (around 30°S, 71°W).
For this example, we can use the following parameters:

* `product`: `usgs_espa_ls8c1_sr`
* `x`: `(-71.1, -71.5)`
* `y`: `(-29.5, -30)`
* `time`: `("2020-01-01", "2020-12-31")`

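Run the following cell to load all datasets from the `usgs_espa_ls8c1_sr` product that match this spatial and temporal extent.
The cell also sets `output_crs` and `resolution`, which are explained later in this notebook, and `dask_chunks`, which makes the load lazy so that pixel values are only read when they are needed: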

In [3]:
ds = dc.load(product="usgs_espa_ls8c1_sr",
             x=(-71.1, -71.5),
             y=(-29.5, -30),
             output_crs = "EPSG:32719",
             time = ("2020-01-01", "2020-12-31"),
             resolution = (-25, 25),
             dask_chunks={"time": 1}
            )
ds

[Output: an `xarray.Dataset` whose data variables are lazily loaded dask arrays of shape (time: 32, y: 2249, x: 1593), chunked as (1, 2249, 1593), with `int16`, `uint16` or `uint8` data types.]


### Interpreting the resulting `xarray.Dataset`
The variable `ds` holds an `xarray.Dataset` containing all data that matched the spatial and temporal query parameters passed to `dc.load()`.

*Dimensions* 

* Identifies the number of timesteps returned in the search (`time: 32`) as well as the number of pixels in the `x` and `y` directions of the data query.

*Coordinates* 

* `time` identifies the date attributed to each returned timestep.
* `x` and `y` are the coordinates for each pixel within the spatial bounds of your query.

*Data variables*

* These are the measurements available for the nominated product. 
For every date (`time`) returned by the query, the measured value at each pixel (`y`, `x`) is returned as an array for each measurement.
Each data variable is itself an `xarray.DataArray` object ([see below](#Inspecting-an-individual-xarray.DataArray)). 

*Attributes*

* `crs` identifies the coordinate reference system (CRS) of the loaded data. 
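
The size reported for each lazily loaded variable follows directly from its shape and data type. As a rough check, the sketch below recomputes the size of an `int16` variable of shape `(32, 2249, 1593)` split into one chunk per timestep. `array_nbytes` is an illustrative helper, not part of `datacube` or `xarray`.

```python
from math import prod

# Bytes per element for the data types that appear in the loaded output
ITEMSIZE = {"int16": 2, "uint16": 2, "uint8": 1}


def array_nbytes(shape, dtype):
    """Uncompressed in-memory size in bytes of an array with this shape and dtype."""
    return prod(shape) * ITEMSIZE[dtype]


full = array_nbytes((32, 2249, 1593), "int16")   # the whole variable
chunk = array_nbytes((1, 2249, 1593), "int16")   # one timestep, as set by dask_chunks={"time": 1}

print(f"variable: {full / 1e6:.2f} MB, chunk: {chunk / 1e6:.2f} MB, chunks: {full // chunk}")
# prints: variable: 229.29 MB, chunk: 7.17 MB, chunks: 32
```

A quick estimate like this is useful before triggering a computation, because it shows how much memory the full array would need once it is actually loaded.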

### Inspecting an individual `xarray.DataArray`
The `xarray.Dataset` we loaded above is itself a collection of individual `xarray.DataArray` objects that hold the actual data for each data variable/measurement. 
For example, all measurements listed under _Data variables_ above (e.g. `blue`, `green`, `red`, `nir`, `swir1`, `swir2`) are `xarray.DataArray` objects.

We can inspect the data in these `xarray.DataArray` objects using either of the following syntaxes:
```
ds["measurement_name"]
```
or:
```
ds.measurement_name
```

Being able to access data from individual data variables/measurements allows us to manipulate and analyse data from individual satellite bands or specific layers in a dataset. 
For example, we can access data from the near infra-red satellite band (i.e. `nir`):

In [4]:
ds.nir

[Output: an `xarray.DataArray` backed by a lazily loaded dask array of shape (time: 32, y: 2249, x: 1593), chunked as (1, 2249, 1593), with data type `int16`.]


Note that the object header informs us that it is an `xarray.DataArray` containing data for the `nir` satellite band. 

Like an `xarray.Dataset`, the array also includes information about the data's **dimensions** (i.e. `(time: 32, y: 2249, x: 1593)`), **coordinates** and **attributes**.
This particular data variable/measurement contains some additional information that is specific to the `nir` band, including details of the array's nodata value (i.e. `nodata: -999`).

> **Note**: For a more in-depth introduction to `xarray` data structures, refer to the [official xarray documentation](http://xarray.pydata.org/en/stable/data-structures.html)

## Customising the `dc.load()` function

The `dc.load()` function can be tailored to refine a query.

Customisation options include:

* `measurements`: This argument is used to provide a list of measurement names to load, as listed in `dc.list_measurements()`. 
For satellite datasets, measurements contain data for each individual satellite band (e.g. near infrared). 
If not provided, all measurements for the product will be returned.
* `crs`: The coordinate reference system (CRS) of the query's `x` and `y` coordinates is assumed to be `WGS84`/`EPSG:4326` unless the `crs` field is supplied, even if the stored data is in another projection or the `output_crs` is specified. 
The `crs` parameter is required if your query's coordinates are in any other CRS.
* `group_by`: Satellite datasets based around scenes can have multiple observations per day with slightly different time stamps as the satellite collects data along its path.
These observations can be combined by reducing the `time` dimension to the day level using `group_by="solar_day"`.
* `output_crs` and `resolution`: To reproject the data or change its resolution, supply the `output_crs` and `resolution` fields.
* `resampling`: This argument allows you to specify a custom spatial resampling method to use when data is reprojected into a different CRS. 

Example syntax on the use of these options follows in the cells below.

> For help or more customisation options, run `help(dc.load)` in an empty cell or visit the function's [documentation page](https://datacube-core.readthedocs.io/en/latest/dev/api/generate/datacube.Datacube.load.html)
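
To illustrate what grouping by solar day means, the following is a simplified sketch of the idea: each UTC timestamp is shifted by roughly one hour per 15 degrees of longitude and then truncated to a date, so that scenes from the same satellite pass share one timestep. The `solar_day` helper and the acquisition times are hypothetical, and the implementation inside `datacube` may differ in detail.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def solar_day(utc_time, longitude):
    """Approximate local solar date: shift UTC by one hour per 15 degrees of longitude."""
    offset = timedelta(hours=round(longitude / 15))
    return (utc_time + offset).date()


# Hypothetical acquisition times (UTC): two adjacent scenes from one pass,
# followed by a scene from a later pass
observations = [
    datetime(2020, 1, 5, 14, 32, 10),
    datetime(2020, 1, 5, 14, 32, 34),
    datetime(2020, 1, 21, 14, 32, 5),
]

# Collect the observations that fall on the same solar day
groups = defaultdict(list)
for t in observations:
    groups[solar_day(t, longitude=-71.3)].append(t)

for day, times in sorted(groups.items()):
    print(day, len(times))
```

The three observations collapse into two groups, which is the effect `group_by="solar_day"` has on the `time` dimension.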


### Specifying measurements
By default, `dc.load()` will load *all* measurements in a product.

To load data from the `red`, `green` and `blue` satellite bands only, we can add `measurements=["red", "green", "blue"]` to our query:

In [5]:
# Note the optional inclusion of the measurements list
ds_rgb = dc.load(product="usgs_espa_ls8c1_sr",
                 measurements=["red", "green", "blue"],
                 x=(-71.1, -71.5),
                 y=(-29.5, -30),
                 output_crs = "EPSG:32719",
                 time = ("2020-01-01", "2020-12-31"),
                 resolution = (-25, 25),
                 dask_chunks={"time": 1}
                )

ds_rgb

[Output: an `xarray.Dataset` with three `int16` data variables, each a lazily loaded dask array of shape (time: 32, y: 2249, x: 1593), chunked as (1, 2249, 1593).]


Note that the *Data variables* component of the `xarray.Dataset` now includes only the measurements specified in the query (i.e. the `red`, `green` and `blue` satellite bands).

### Loading data for coordinates in any CRS
By default, `dc.load()` assumes that your query `x` and `y` coordinates are provided in degrees in the `WGS84/EPSG:4326` CRS.
If your coordinates are in a different coordinate system, you need to specify this using the `crs` parameter.
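
If you are unsure which projected CRS covers your area of interest, the EPSG code of the matching WGS84 UTM zone can be derived from a longitude and latitude. The sketch below shows the arithmetic. `utm_epsg` is a hypothetical helper rather than a `datacube` function, and it ignores the special-case zones around Norway and Svalbard.

```python
import math


def utm_epsg(lon, lat):
    """EPSG code of the WGS84 UTM zone containing a longitude/latitude in degrees."""
    # Zones are 6 degrees wide and numbered 1 to 60 eastwards from 180 degrees west
    zone = min(int(math.floor((lon + 180) / 6)) + 1, 60)
    # 326xx codes are northern-hemisphere zones, 327xx are southern-hemisphere zones
    base = 32600 if lat >= 0 else 32700
    return f"EPSG:{base + zone}"


print(utm_epsg(-71.3, -29.75))  # prints: EPSG:32719
```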

In the example below, we load data for a set of `x` and `y` coordinates defined in WGS84 UTM zone 19S (`EPSG:32719`), and ensure that the `dc.load()` function accounts for this by including `crs="EPSG:32719"`:


In [6]:
# Note the new `x` and `y` coordinates and `crs` parameter
ds_custom_crs = dc.load(product="usgs_espa_ls8c1_sr",
                        time=("2020-01-01", "2020-12-31"),
                        x=(335713, 355713),
                        y=(6287592, 6307592),
                        crs="EPSG:32719",
                        output_crs = "EPSG:32719",
                        resolution = (-25, 25),
                        dask_chunks={"time": 1}
                       )

ds_custom_crs

[Output: an `xarray.Dataset` whose data variables are lazily loaded dask arrays of shape (time: 15, y: 801, x: 801), chunked as (1, 801, 801), with `int16`, `uint16` or `uint8` data types.]


### CRS reprojection
Certain applications may require that you output your data into a specific CRS.
You can reproject your output data by specifying the new `output_crs` and identifying the `resolution` required.

In this example, we will reproject our data to a new CRS (UTM Zone 34S, `EPSG:32734`) and resolution (250 x 250 m). Note that for most CRSs, the first resolution value is negative (e.g. `(-250, 250)`):

In [7]:
ds_reprojected = dc.load(product="usgs_espa_ls8c1_sr",
                         measurements=["red", "green", "blue"],
                         x=(-71.1, -71.5),
                         y=(-29.5, -30),
                         output_crs = "EPSG:32734",
                         time = ("2020-01-01", "2020-12-31"),
                         resolution = (-250, 250),
                         dask_chunks={"time": 1}
                        )

ds_reprojected

[Output: an `xarray.Dataset` with three `int16` data variables, each a lazily loaded dask array of shape (time: 32, y: 344, x: 467), chunked as (1, 344, 467).]


Note that the `crs` attribute in the *Attributes* section has changed to `EPSG:32734`. 
Due to the coarser 250 m resolution, there are also now fewer pixels on the `x` and `y` dimensions (e.g. `x: 467, y: 344` compared to `x: 1593, y: 2249` when the same area was loaded at 25 m resolution).


### Spatial resampling methods
When a product is re-projected to a different CRS and/or resolution, the new pixel grid may differ from the original input pixels by size, number and alignment.
It is therefore necessary to apply a spatial "resampling" rule that allocates input pixel values into the new pixel grid.

By default, `dc.load()` resamples pixel values using "nearest neighbour" resampling, which assigns each new pixel the value of the closest input pixel.
Depending on the type of data and the analysis being run, this may not be the most appropriate choice (e.g. for continuous data).
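
The difference between the two approaches can be seen on a tiny grid. The sketch below is a deliberately simplified, pure-Python illustration of downsampling by an integer factor. It is not how `datacube` or its underlying libraries implement resampling, and the `downsample` helper and class codes are made up for the example.

```python
def downsample(grid, factor, method):
    """Reduce a 2D grid by an integer factor using 'nearest' or 'average' resampling."""
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            if method == "nearest":
                # Copy one existing input pixel close to the centre of the new, larger pixel
                out_row.append(grid[r * factor + factor // 2][c * factor + factor // 2])
            elif method == "average":
                # Average all input pixels that fall inside the new, larger pixel
                block = [grid[r * factor + i][c * factor + j]
                         for i in range(factor) for j in range(factor)]
                out_row.append(sum(block) / len(block))
            else:
                raise ValueError(f"unsupported method: {method!r}")
        out.append(out_row)
    return out


# Hypothetical 4 x 4 grid of categorical class codes (e.g. 1 = water, 5 = forest)
classes = [
    [1, 1, 5, 5],
    [1, 5, 5, 5],
    [1, 1, 1, 5],
    [1, 1, 5, 5],
]

print(downsample(classes, 2, "nearest"))  # prints: [[5, 5], [1, 5]]
print(downsample(classes, 2, "average"))  # prints: [[2.0, 5.0], [1.0, 4.0]]
```

"nearest" only ever returns values that exist in the input, whereas "average" produces values such as 2.0 and 4.0 that are not valid class codes. Averaging is therefore better suited to continuous measurements than to categorical layers.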

The `resampling` parameter in `dc.load()` allows you to choose a custom resampling method from the following options: 

```
"nearest", "cubic", "bilinear", "cubic_spline", "lanczos", 
"average", "mode", "gauss", "max", "min", "med", "q1", "q3"
```

For example, we can request that all loaded data is resampled using "average" resampling:

In [8]:
# Note the additional `resampling` parameter
ds_averageresampling = dc.load(product="usgs_espa_ls8c1_sr",
                               measurements=["red", "green", "blue"],
                               x=(-71.1, -71.5),
                               y=(-29.5, -30),
                               output_crs = "EPSG:32719",
                               time = ("2020-01-01", "2020-12-31"),
                               resolution = (-250, 250),
                               dask_chunks={"time": 1},
                               resampling="average"
                              )

ds_averageresampling

[Output: an `xarray.Dataset` with three `int16` data variables, each a lazily loaded dask array of shape (time: 32, y: 226, x: 160), chunked as (1, 226, 160).]


You can also provide a Python dictionary to request a different sampling method for different measurements. 
This can be particularly useful when some measurements contain categorical data, which requires resampling methods such as "nearest" or "mode" that do not modify the input pixel values.

In the example below, we specify `resampling={"red": "nearest", "*": "average"}`, which will use "nearest" neighbour resampling for the `red` satellite band only. `"*": "average"` will apply "average" resampling for all other satellite bands:


In [9]:
ds_customresampling = dc.load(product="usgs_espa_ls8c1_sr",
                              measurements=["red", "green", "blue"],
                              x=(-71.1, -71.5),
                              y=(-29.5, -30),
                              output_crs = "EPSG:32719",
                              time = ("2020-01-01", "2020-12-31"),
                              resolution = (-250, 250),
                              dask_chunks={"time": 1},
                              resampling={"red": "nearest", "*": "average"}
                             )

ds_customresampling

[Output: an `xarray.Dataset` with three `int16` data variables, each a lazily loaded dask array of shape (time: 32, y: 226, x: 160), chunked as (1, 226, 160).]


> **Note**: For more information about spatial resampling methods, see the [following guide](https://rasterio.readthedocs.io/en/stable/topics/resampling.html)

## Loading data using the query dictionary syntax
It is often useful to re-use a set of query parameters to load data from multiple products.
To achieve this, we can load data using the "query dictionary" syntax.
This involves placing the query parameters we used to load data above inside a Python dictionary object which we can re-use for multiple data loads:

In [10]:
# Each parameter appears once; a repeated dictionary key would silently overwrite the earlier value
query = {"x": (-71.1, -71.5),
         "y": (-29.5, -30),
         "time": ("2020-01-01", "2020-12-31"),
         "output_crs": "EPSG:32719",
         "resolution": (-250, 250),
         "dask_chunks": {"time": 1}
        }


We can then use this query dictionary object as an input to `dc.load()`. 

> The `**` syntax below is Python's "keyword argument unpacking" operator.
This operator takes the named query parameters listed in the dictionary we created (e.g. `"x": (-71.1, -71.5)`), and "unpacks" them into the `dc.load()` function as new arguments. 
For more information about unpacking operators, refer to the [Python documentation](https://docs.python.org/3/tutorial/controlflow.html#unpacking-argument-lists)
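
The effect of keyword argument unpacking can be checked without a datacube connection. In the sketch below, `fake_load` is a hypothetical stand-in for `dc.load()` that simply reports the arguments it receives, and `example_query` is an illustrative subset of the query parameters.

```python
def fake_load(product, **kwargs):
    """Hypothetical stand-in for `dc.load()` that returns the arguments it was given."""
    return {"product": product, **kwargs}


example_query = {"x": (-71.1, -71.5),
                 "y": (-29.5, -30),
                 "time": ("2020-01-01", "2020-12-31")}

# Unpacking the dictionary is equivalent to writing each keyword argument by hand
unpacked = fake_load(product="usgs_espa_ls8c1_sr", **example_query)
explicit = fake_load(product="usgs_espa_ls8c1_sr",
                     x=(-71.1, -71.5),
                     y=(-29.5, -30),
                     time=("2020-01-01", "2020-12-31"))
print(unpacked == explicit)  # prints: True

# A modified copy of the query for another load; the original dictionary is unchanged
query_2019 = {**example_query, "time": ("2019-01-01", "2019-12-31")}
```

Building a modified copy in this way keeps the shared parameters in one place while allowing individual loads to differ.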

In [11]:
ds = dc.load(product="usgs_espa_ls8c1_sr",
             **query)

ds

Unnamed: 0,Array,Chunk
Bytes,2.31 MB,72.32 kB
Shape,"(32, 226, 160)","(1, 226, 160)"
Count,64 Tasks,32 Chunks
Type,int16,numpy.ndarray
"Array Chunk Bytes 2.31 MB 72.32 kB Shape (32, 226, 160) (1, 226, 160) Count 64 Tasks 32 Chunks Type int16 numpy.ndarray",160  226  32,

Unnamed: 0,Array,Chunk
Bytes,2.31 MB,72.32 kB
Shape,"(32, 226, 160)","(1, 226, 160)"
Count,64 Tasks,32 Chunks
Type,int16,numpy.ndarray

Unnamed: 0,Array,Chunk
Bytes,2.31 MB,72.32 kB
Shape,"(32, 226, 160)","(1, 226, 160)"
Count,64 Tasks,32 Chunks
Type,int16,numpy.ndarray
"Array Chunk Bytes 2.31 MB 72.32 kB Shape (32, 226, 160) (1, 226, 160) Count 64 Tasks 32 Chunks Type int16 numpy.ndarray",160  226  32,

Unnamed: 0,Array,Chunk
Bytes,2.31 MB,72.32 kB
Shape,"(32, 226, 160)","(1, 226, 160)"
Count,64 Tasks,32 Chunks
Type,int16,numpy.ndarray

Unnamed: 0,Array,Chunk
Bytes,2.31 MB,72.32 kB
Shape,"(32, 226, 160)","(1, 226, 160)"
Count,64 Tasks,32 Chunks
Type,int16,numpy.ndarray
"Array Chunk Bytes 2.31 MB 72.32 kB Shape (32, 226, 160) (1, 226, 160) Count 64 Tasks 32 Chunks Type int16 numpy.ndarray",160  226  32,

[Output: `xarray.Dataset` summary. Each data variable is a Dask array of shape (32, 226, 160) split into 32 chunks of shape (1, 226, 160) (64 tasks). The `int16` and `uint16` variables are 2.31 MB each (72.32 kB per chunk); the `uint8` variable is 1.16 MB (36.16 kB per chunk).]
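
The sizes reported in the output above follow directly from the array shape and data type, which makes it easy to estimate how much memory a load will need before running it. The sketch below is only illustrative: the `nbytes` helper is a hypothetical name, and the item sizes are the standard byte widths of the three data types shown in the output.

```python
import math

# Bytes per element for the data types that appear in the output above.
ITEMSIZE = {"int16": 2, "uint16": 2, "uint8": 1}


def nbytes(shape, dtype):
    """Uncompressed size in bytes of an array with this shape and dtype."""
    return math.prod(shape) * ITEMSIZE[dtype]


full = (32, 226, 160)   # (time, y, x) of the whole array
chunk = (1, 226, 160)   # one time step per Dask chunk

print(nbytes(full, "int16"))   # 2314240 bytes, reported as 2.31 MB
print(nbytes(chunk, "int16"))  # 72320 bytes, reported as 72.32 kB
print(nbytes(full, "uint8"))   # 1157120 bytes, reported as 1.16 MB
```

Because the size scales with the number of pixels, halving the number of pixels along both `y` and `x` reduces the memory needed by a factor of four.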


Query dictionaries can contain any set of parameters that would usually be provided to `dc.load()`:

In [12]:
query = {"x": (-71.1, -71.5),
         "y": (-29.5, -30),
         "time": ("2020-01-01", "2020-12-31"),
         "output_crs": "EPSG:32719",
         "resolution": (-500, 500),
         "dask_chunks": {"time": 1},
         "resampling": {"red": "nearest", "*": "average"}
        }

ds_ls8 = dc.load(product="usgs_espa_ls8c1_sr",
                 **query)

ds_ls8


[Output: `xarray.Dataset` summary with 10 data variables (seven `int16`, two `uint16`, one `uint8`). Each is a Dask array of shape (32, 113, 80) split into 32 chunks of shape (1, 113, 80) (64 tasks). The 16-bit variables are 578.56 kB each (18.08 kB per chunk); the `uint8` variable is 289.28 kB (9.04 kB per chunk).]
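
The `**query` syntax used in the cell above is ordinary Python dictionary unpacking, so a query dictionary can be reused and varied without retyping it. The sketch below illustrates this without needing a datacube connection: `fake_load` is a hypothetical stand-in for `dc.load()` that simply returns the arguments it receives, and the 2019 date range is an arbitrary example value.

```python
def fake_load(product, **kwargs):
    """Hypothetical stand-in for dc.load(): returns the arguments it received."""
    return {"product": product, **kwargs}


query = {
    "x": (-71.1, -71.5),
    "y": (-29.5, -30),
    "time": ("2020-01-01", "2020-12-31"),
    "output_crs": "EPSG:32719",
    "resolution": (-500, 500),
}

# `**query` expands the dictionary into keyword arguments.
request = fake_load(product="usgs_espa_ls8c1_sr", **query)

# Build a variation without modifying the original: the later key wins.
query_2019 = {**query, "time": ("2019-01-01", "2019-12-31")}
request_2019 = fake_load(product="usgs_espa_ls8c1_sr", **query_2019)

print(request["time"])       # ('2020-01-01', '2020-12-31')
print(request_2019["time"])  # ('2019-01-01', '2019-12-31')
print(query["time"])         # ('2020-01-01', '2020-12-31'), unchanged
```

The same "later key wins" rule applies inside a single dictionary literal: if a key such as `"time"` is written twice, Python silently keeps only the last value, so each parameter should appear once.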


## Other helpful tricks
### Loading data "like" another dataset
Another option for loading matching data from multiple products is to use `dc.load()`'s `like` parameter.
This copies the spatial and temporal extent and the CRS/resolution from an existing dataset, and uses these parameters to load new data from another product.

In the example below, we would load a WOfS dataset that exactly matches the `ds_ls8` dataset we loaded earlier. The cell is commented out because it will not work until more data is available in the Chile datacube:


In [13]:
# THIS WON'T WORK UNTIL WE GET MORE DATA IN THE CHILE DATACUBE

# ds_wofs = dc.load(product="ga_ls8c_wofs_2_annual_summary",
#                  like=ds_ls8)

# print(ds_wofs)
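
To make concrete what "extent, resolution and time range" means for an existing dataset, the sketch below derives those values from a set of coordinate values. It is not how `datacube` implements `like`; it is a toy illustration that assumes regularly spaced pixel-centre coordinates, and the `grid_summary` helper and all coordinate values are made up for the example.

```python
def grid_summary(x_coords, y_coords, times):
    """Summarise the grid described by a dataset's coordinate values.

    Coordinates are assumed to be regularly spaced pixel centres, so the
    outer extent lies half a pixel beyond the first and last coordinate.
    """
    x_res = x_coords[1] - x_coords[0]
    y_res = y_coords[1] - y_coords[0]
    half_x, half_y = abs(x_res) / 2, abs(y_res) / 2
    return {
        "x": (min(x_coords) - half_x, max(x_coords) + half_x),
        "y": (min(y_coords) - half_y, max(y_coords) + half_y),
        "time": (min(times), max(times)),
        "resolution": (y_res, x_res),  # (y, x) order, matching (-500, 500)
    }


# Made-up coordinates for a grid of 3 x 2 pixels, each 500 m across.
xs = [250250.0, 250750.0, 251250.0]
ys = [6700250.0, 6699750.0]          # y decreases from north to south
ts = ["2020-01-05", "2020-01-21", "2020-02-06"]

summary = grid_summary(xs, ys, ts)
print(summary["x"])           # (250000.0, 251500.0)
print(summary["y"])           # (6699500.0, 6700500.0)
print(summary["time"])        # ('2020-01-05', '2020-02-06')
print(summary["resolution"])  # (-500.0, 500.0)
```

Passing `like=ds_ls8` saves us from writing these values out by hand and guarantees that the two datasets line up pixel for pixel.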

### Adding a progress bar
When loading large amounts of data, it can be useful to view the progress of the data load.
The `progress_cbk` parameter in `dc.load()` allows us to add a progress bar that indicates how the load is progressing. In this example, we load one year of data (2020) from the `usgs_espa_ls8c1_sr` product with a progress bar.

The progress bar only works when Dask chunking is **disabled**, so the `dask_chunks` entry is commented out of the query below. To understand more about Dask, please see [Parallel processing with Dask](08_Parallel_processing_with_dask.ipynb).

In [14]:
query = {"x": (-71.1, -71.5),
         "y": (-29.5, -30),
         "time": ("2020-01-01", "2020-12-31"),
         "output_crs": "EPSG:32719",
         "resolution": (-500, 500),
         # "dask_chunks": {"time": 1},  # disabled so the progress bar works
         "resampling": {"red": "nearest", "*": "average"}
        }

ds_progress = dc.load(product="usgs_espa_ls8c1_sr",
                      progress_cbk=with_ui_cbk(),
                      **query)

ds_progress

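
If the widget-based progress bar is not available (for example, when running outside Jupyter), a plain-text callback may be an alternative. The sketch below assumes that `progress_cbk` accepts any callable taking two integers, the number of files processed so far and the total number of files; please check the `dc.load()` documentation for your datacube version. The `format_progress` and `text_progress` names are hypothetical, and the loop only simulates the calls a load would make.

```python
def format_progress(n_done, n_total):
    """Build a one-line progress message."""
    percent = 100 * n_done / n_total
    return f"Loaded {n_done} of {n_total} files ({percent:.0f}%)"


def text_progress(n_done, n_total):
    """Hypothetical plain-text callback: print progress after each file."""
    print(format_progress(n_done, n_total))


# Simulate the calls that would be made while reading 4 files.
for i in range(1, 5):
    text_progress(i, 4)
```

Under that assumption, the callback would be passed as `progress_cbk=text_progress` in place of `with_ui_cbk()`.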

## Recommended next steps

For more advanced information about working with Jupyter Notebooks or JupyterLab, you can explore the [JupyterLab documentation page](https://jupyterlab.readthedocs.io/en/stable/user/notebook.html).

To continue working through this beginner's guide, work through the notebooks in the following order:

1. [Jupyter Notebooks](01_Jupyter_notebooks.ipynb)
2. [Products and Measurements](02_Products_and_measurements.ipynb)
3. **Loading data (this notebook)**
4. [Plotting](04_Plotting.ipynb)
5. [Performing a basic analysis](05_Basic_analysis.ipynb)
6. [Introduction to numpy](06_Intro_to_numpy.ipynb)
7. [Introduction to xarray](07_Intro_to_xarray.ipynb)
8. [Parallel processing with Dask](08_Parallel_processing_with_dask.ipynb)

Once you have completed the above eight tutorials, join advanced users in exploring:

* The "Datasets" directory in the repository, where you can explore DE Africa products in depth.
* The "Frequently used code" directory, which contains a recipe book of common techniques and methods for analysing DE Africa data.
* The "Real-world examples" directory, which provides more complex workflows and analysis case studies.