## Landsat data loading

Encoding meaningful information in filenames or directory names is a common file storage pattern. We most recently encountered it when working with Landsat data. These data are stored in TIFF files, with each file representing one band. The band number is meaningful because it identifies the part of the spectrum that the band covers.


In [None]:
import intake
cat = intake.open_catalog('catalog.yml')
list(cat)

In this catalog, `l5` is set up as a schema on current master, and `l5_proposed` lays out a new way of defining a schema.

## Master

`l5` is an example that uses glob notation to iterate over similarly named files and pick up all the bands.

In [None]:
xa = cat.l5.read_chunked()
xa

Note that the band coordinate is just a list of 1s. From the filenames we can see that what we actually want is [1, 2, 3, 4, 5, 7].

In [None]:
ls ~/.intake/cache/0088b75722009b0a583f65974c60bd87/

If glob order were guaranteed, then we could set coords like this:

In [None]:
import xarray as xr
xa.assign_coords(**{'band': xr.DataArray([1, 2, 3, 4, 5, 7], dims=['band'])})

But since the glob order isn't guaranteed, we have no way of knowing that we are labeling the bands correctly. To guarantee that the bands are set correctly, we'd have to write a separate catalog entry for each file.

## Proposal
In this implementation, an arbitrary number of fields can be specified using Python format notation. These fields are added to the xarray object as coordinates along the dimension that we are concatenating on. Coordinates are just sets of labels along a particular dimension, so there can be many coordinates along the same dimension. By making the file fields coordinates rather than attributes, we allow each file to have a different value.
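To make the idea of "format notation in reverse" concrete, here is a minimal sketch using only the standard library. It is not the implementation used by the catalog driver. The helper names (`pattern_to_regex`, `parse_fields`), the pattern, and the filenames are all hypothetical and chosen for illustration only.

```python
import re
import string


def pattern_to_regex(pattern):
    """Build a regex (and per-field converters) from a format-style pattern.

    Hypothetical helper: only named fields are handled, and only the 'd'
    format spec is treated specially (parsed as an integer).
    """
    regex = ''
    converters = {}
    for literal, field, spec, _ in string.Formatter().parse(pattern):
        regex += re.escape(literal)
        if field is None:
            continue  # trailing literal text with no field after it
        if spec == 'd':
            regex += f'(?P<{field}>\\d+)'
            converters[field] = int
        else:
            regex += f'(?P<{field}>.+?)'
            converters[field] = str
    return re.compile(regex + r'\Z'), converters


def parse_fields(pattern, filename):
    """Return the field values encoded in filename according to pattern."""
    regex, converters = pattern_to_regex(pattern)
    match = regex.match(filename)
    if match is None:
        raise ValueError(f'{filename!r} does not match {pattern!r}')
    return {name: converters[name](value)
            for name, value in match.groupdict().items()}


# Hypothetical filenames, deliberately out of order to mimic an
# unordered glob result.
pattern = 'scene_{scene}_B{band:d}.TIF'
filenames = ['scene_A_B7.TIF', 'scene_A_B1.TIF', 'scene_A_B3.TIF']
bands = [parse_fields(pattern, f)['band'] for f in filenames]
```

Because each label is parsed from the file it describes, `bands` follows the files in whatever order they arrive, so the result no longer depends on glob order.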

In [None]:
l5_proposed = cat.l5_proposed
xa = l5_proposed.read_chunked()
xa

This means that we can select just one band using the `xarray.DataArray.sel` method. 

In [None]:
xa.sel(band=7)
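The same behavior can be reproduced without the Landsat files. The sketch below builds a small synthetic `DataArray` with two coordinates along the `band` dimension and then selects one band. The array contents, the `filename` coordinate, and the filenames themselves are made up for illustration.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the stacked bands: 6 bands of 2x2 pixels.
bands = [1, 2, 3, 4, 5, 7]
files = [f'hypothetical_B{b}.TIF' for b in bands]  # made-up filenames

da = xr.DataArray(
    np.zeros((len(bands), 2, 2)),
    dims=['band', 'y', 'x'],
    coords={
        'band': bands,                # index coordinate for the dimension
        'filename': ('band', files),  # second coordinate on the same dimension
    },
)

# Selecting by label drops the band dimension and keeps the matching
# filename as a scalar coordinate on the result.
one = da.sel(band=7)
```

After the selection, `one` is a two-dimensional array, and its `filename` coordinate still records which file the band came from.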

### Impact on visualizations

If the band information were available only in the filenames, we wouldn't be able to declare the plot in the catalog. Because the parsing happens on load, we can use declarative plotting in the catalog.

In [None]:
import hvplot.intake
intake.output_notebook()

l5_proposed.plot.band_image()