# Making QA masks with unpackqa and xarray

The `xarray` package lets you work with labelled arrays. Its data is stored in numpy arrays, so `unpackqa` works with it without modification.

In [None]:
import xarray as xr
import dask
import numpy as np
import pooch

import unpackqa

L8_QA_PIXEL_FILE = 'https://landsat.usgs.gov/sites/default/files/C2_Sample_Data/LC08_L1TP_140041_20130503_20200912_02_T1.zip'
L8_qa_product = 'LANDSAT_8_C2_L2_QAPixel'

file_path = pooch.retrieve(
    L8_QA_PIXEL_FILE,
    known_hash=None,
    processor=pooch.Unzip()
)

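# pooch.Unzip() returns a list of extracted file paths; pick the QA_PIXEL band.
my_file = [f for f in file_path if 'QA_PIXEL' in f][0]

# Note: `xr.open_rasterio` was deprecated and later removed from xarray.
# On recent versions, use `rioxarray.open_rasterio` (from the rioxarray package) instead.
ds = xr.open_rasterio(my_file, chunks={'x':2000,'y':2000})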

# Drop the band axis since the file has a single band.
ds = ds.isel(band=0)

### Flag values as a new dim
`unpack_to_array` can be used directly inside the xarray `apply_ufunc` function. With this setup you can take advantage of a parallel computing environment. Read more in the xarray docs: http://xarray.pydata.org/en/stable/user-guide/dask.html

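Here a new dim called `flag` is added, with the same length as the number of flags in the Landsat 8 QA_PIXEL band. The `flag` dim must be declared as an output core dim, and because dask cannot infer the size of a new dimension, its length is passed through `dask_gufunc_kwargs`.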

In [None]:
L8_flag_names = unpackqa.list_qa_flags(L8_qa_product)

flag_ds = xr.apply_ufunc(
    unpackqa.unpack_to_array,
    ds,
    kwargs = dict(product=L8_qa_product),
    output_core_dims = [['flag']],
    dask_gufunc_kwargs = dict(output_sizes={'flag':len(L8_flag_names)}),
    output_dtypes=[np.uint8],
    vectorize=False,
    dask='parallelized'
    )

# put labels on the flag coordinates
flag_ds['flag'] = L8_flag_names

print(flag_ds)
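
To see what the extra `flag` dim looks like without downloading any data, here is a minimal sketch on a tiny in-memory array. The `unpack_bits` function and the `bit0` to `bit3` labels are hypothetical stand-ins invented for this illustration; this is not how `unpackqa` is implemented, and real QA_PIXEL flags can span more than one bit. No dask is involved, so `output_sizes` is not needed here.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for unpackqa.unpack_to_array: split each integer
# into its lowest n_bits bits along a new trailing axis.
def unpack_bits(values, n_bits=4):
    shifts = np.arange(n_bits, dtype=values.dtype)
    return ((values[..., np.newaxis] >> shifts) & 1).astype(np.uint8)

# A 2x2 toy "QA band" with dims (y, x).
qa = xr.DataArray(
    np.array([[0, 1], [5, 10]], dtype=np.uint16),
    dims=("y", "x"),
)

# The function appends one axis, so it is declared as the output core dim.
toy_flags = xr.apply_ufunc(
    unpack_bits,
    qa,
    kwargs=dict(n_bits=4),
    output_core_dims=[["flag"]],
)

# Label the new dim, as done for the Landsat flags above.
toy_flags["flag"] = ["bit0", "bit1", "bit2", "bit3"]
print(toy_flags.dims)
print(toy_flags.sel(flag="bit0").values)
```

The core dim always ends up last in the result, which is why the function must return the new axis in the trailing position.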

### Flag values as new variables
Another option is to have each flag as its own data variable. That can be done with some rearranging.

In [None]:
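# Select each flag, drop the leftover scalar `flag` coordinate, and name the array after the flag.
# `drop_vars` replaces the deprecated `drop`.
flag_ds2 = [flag_ds.sel(flag=flag_name).drop_vars('flag').rename(flag_name) for flag_name in L8_flag_names]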
flag_ds2 = xr.merge(flag_ds2)
print(flag_ds2)
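
As a possible shortcut, `DataArray.to_dataset(dim=...)` splits an array along a labelled dim into one data variable per label. The sketch below shows this on a small made-up array; the flag names `cloud` and `water` and the values are placeholders, not real Landsat output.

```python
import numpy as np
import xarray as xr

# Toy array with dims (y, x, flag); names and values are placeholders.
toy = xr.DataArray(
    np.array([[[1, 0], [0, 1]], [[1, 1], [0, 0]]], dtype=np.uint8),
    dims=("y", "x", "flag"),
    coords={"flag": ["cloud", "water"]},
)

# One data variable per label along `flag`; the `flag` dim itself disappears.
toy_vars = toy.to_dataset(dim="flag")
print(toy_vars)
```

Whether this matches the `xr.merge` approach exactly on a given xarray version (attributes, coordinate handling) is worth checking before relying on it.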

### Loading and viewing flags
When the `chunks` argument is set for `xr.open_rasterio`, the data is read lazily and every later operation is only queued. The `load` method executes all pending computations and loads the result into memory.

In [None]:
flag_ds2.load()
print(flag_ds2)
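
The lazy-then-loaded behaviour can be checked on a small chunked array. This is a sketch on made-up data, assuming dask is installed; the variable name `bit0` is arbitrary.

```python
import dask.array
import numpy as np
import xarray as xr

# Chunking the array makes it dask-backed, so operations on it are lazy.
toy = xr.DataArray(
    np.arange(16, dtype=np.uint8).reshape(4, 4),
    dims=("y", "x"),
).chunk({"y": 2, "x": 2})

# Extract the lowest bit; nothing is computed yet.
lazy = (toy & 1).to_dataset(name="bit0")
is_lazy_before = isinstance(lazy["bit0"].data, dask.array.Array)

# load() runs the queued computation and replaces the data in place.
lazy.load()
is_lazy_after = isinstance(lazy["bit0"].data, dask.array.Array)

print(is_lazy_before, is_lazy_after)
```

Because `load` works in place, the dataset does not need to be reassigned afterwards.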