# Quickstart tutorial

This tutorial introduces the basic concepts of the new Python version of PolSARpro.

Note that the toolbox relies on Xarray. Here we only describe the basics required to run PolSARpro and refer the reader to https://docs.xarray.dev/en/stable/ for more details.

## Loading data

First we need to import some data. The demo dataset (TODO: host the data somewhere) has been prepared using SNAP and exported in the NetCDF-BEAM format. NetCDF is the format recommended by Xarray, as the two share the same data model.

Let's first open the data using the `open_netcdf_beam` function.

In [None]:
from polsarpro.io import open_netcdf_beam

# change to your data path
data_path = "/data/psp/test_files/SAN_FRANCISCO_ALOS1_slc.nc"
out_path = "/data/psp/res"
S = open_netcdf_beam(data_path)

The reader has recognized the variable `S` as a 2x2 Sinclair matrix and imported it into a structure called `xarray.Dataset`.
In a Jupyter notebook it may be inspected as:

In [None]:
S

Each channel is represented by a data variable and may be accessed using the dot notation:

In [None]:
# Access to the hh polarization channel
S.hh
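
To make this structure concrete, the sketch below builds a small synthetic dataset with a similar layout using plain Xarray. The variable names `hv`, `vh`, `vv` and the dimension name `x` are assumptions made for illustration (only `hh` and the `y` dimension appear in this tutorial), and random values stand in for real SLC data.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
shape = (4, 6)  # (y, x), tiny on purpose


def random_complex(shape):
    """Random complex64 array standing in for one SLC channel."""
    values = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    return values.astype("complex64")


# One data variable per polarization channel (channel names are assumed)
S_demo = xr.Dataset(
    {name: (("y", "x"), random_complex(shape)) for name in ("hh", "hv", "vh", "vv")}
)

# Dot notation and dictionary-style indexing return the same DataArray
hh = S_demo.hh
same = S_demo["hh"]
print(hh.dims, hh.dtype)
```

Dictionary-style indexing is the safer choice when a variable name could clash with an existing `Dataset` method or attribute.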

Each variable is a dask array, divided into smaller chunks that can be processed in parallel to improve performance.
Importantly, the data is not loaded into memory at this point. This is called lazy loading: processing is executed only when the result is required. More on that later.
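
The following sketch illustrates lazy evaluation on a synthetic dataset rather than on the demo file. It assumes dask is installed alongside Xarray; the chunk sizes and the variable name `hh` are chosen only for illustration.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"hh": (("y", "x"), np.ones((8, 8), dtype="complex64"))})

# chunk() wraps each variable in a dask array (requires dask)
lazy = ds.chunk({"y": 4, "x": 4})

# Building an expression only records a task graph; nothing is computed yet
power = abs(lazy.hh) ** 2
print(type(power.data))  # a dask array type, not a NumPy array
print(power.chunks)      # chunk sizes along each dimension

# compute() executes the graph and returns a NumPy-backed result
result = power.compute()
print(type(result.data))
```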

To convert the S matrix to other types, we can use utilities:

In [None]:
from polsarpro.util import S_to_T3

T3 = S_to_T3(S)

Let's now look at this new variable:

In [None]:
T3

The output has been automatically converted to a new polarimetric type `T3` representing the coherency matrix. Now the elements of the matrix are accessed as:

In [None]:
T3.m12
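
For readers unfamiliar with the coherency matrix, here is a minimal NumPy sketch of the textbook construction: the Pauli scattering vector $k$ is formed from the Sinclair channels and $T_3 = k k^H$ is computed per pixel. It is not PolSARpro's implementation; reciprocity (`hv == vh`) is assumed, the data are random, and the mapping of `m12` to the element in row 1, column 2 is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
shape = (3, 4)  # (y, x)


def random_complex(shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


# Synthetic scattering channels; reciprocity (hv == vh) is assumed
s_hh, s_hv, s_vv = (random_complex(shape) for _ in range(3))

# Pauli scattering vector k: one 3-vector per pixel, shape (y, x, 3)
k = np.stack([s_hh + s_vv, s_hh - s_vv, 2 * s_hv], axis=-1) / np.sqrt(2)

# Coherency matrix T3 = k k^H per pixel, shape (y, x, 3, 3)
T = k[..., :, None] * np.conj(k[..., None, :])

# Element in row 1, column 2 (assumed to correspond to `m12`)
m12 = T[..., 0, 1]
```

Without spatial averaging each per-pixel matrix has rank one, which is why a filter such as a boxcar is usually applied before any eigenvalue-based analysis.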

For storage optimization, PolSARpro takes advantage of the Hermitian structure of the matrix and stores only diagonal and upper elements.
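
The idea can be sketched in NumPy for a single 3x3 matrix: six stored elements are enough to rebuild all nine, because each lower element is the complex conjugate of its mirrored upper element. The names `m11` to `m33` follow the `m12` pattern shown above, but the full naming scheme and the helper `rebuild` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
full = a @ a.conj().T  # a Hermitian matrix to start from

# Keep only the diagonal and upper elements: 6 of the 9 entries
stored = {f"m{i + 1}{j + 1}": full[i, j] for i in range(3) for j in range(i, 3)}


def rebuild(stored):
    """Rebuild the full Hermitian matrix from its upper triangle (hypothetical helper)."""
    out = np.zeros((3, 3), dtype=complex)
    for name, value in stored.items():
        i, j = int(name[1]) - 1, int(name[2]) - 1
        out[i, j] = value
        if i != j:
            out[j, i] = np.conj(value)  # lower element is the conjugate
    return out


rebuilt = rebuild(stored)
print(sorted(stored))
```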

To ensure its algorithms are applied to valid polarimetric types, PolSARpro looks for a `poltype` attribute, which may be accessed as:

In [None]:
T3.poltype

A human-readable description may also be accessed as:

In [None]:
T3.description
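
Both values are ordinary Xarray attributes, so user code can perform a similar check. The sketch below uses a hypothetical helper, `require_poltype`, on a synthetic dataset; it is not part of PolSARpro, and the `"C3"` string is only an example of a non-matching value.

```python
import numpy as np
import xarray as xr


def require_poltype(ds, allowed):
    """Return the dataset's poltype, or raise if it is not in `allowed` (hypothetical helper)."""
    poltype = ds.attrs.get("poltype")
    if poltype not in allowed:
        raise ValueError(f"expected poltype in {sorted(allowed)}, got {poltype!r}")
    return poltype


# Synthetic dataset carrying the two attributes discussed above
demo = xr.Dataset(
    {"m11": (("y", "x"), np.ones((2, 2), dtype="float32"))},
    attrs={"poltype": "T3", "description": "demo only"},
)

kind = require_poltype(demo, {"T3"})
print(kind, "/", demo.description)
```

Reading `ds.attrs` explicitly avoids any ambiguity if a data variable happens to share the attribute's name.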

To compute the actual result and load the data into memory, one can simply use:

In [None]:
T3 = T3.compute()
T3

Now each variable is a NumPy array stored in memory. This mechanism is useful because it lets us define complex processing pipelines without storing intermediate variables in memory. It is even possible to write the result to disk without holding the whole dataset in memory. For instance, one may define:

In [None]:
from polsarpro.util import boxcar

# crop the data with isel and re-chunk to avoid errors when writing zarr
S_crop = S.isel(y=slice(5000, 12000)).chunk("auto")
T3 = S_to_T3(S_crop)
# as the zarr specification is still evolving, we pass consolidated=False to silence warnings
boxcar(T3, 5, 5).to_zarr(f"{out_path}/T3_box5x5.zarr", mode="w", consolidated=False)
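
A boxcar filter is a moving average over a rectangular window. The NumPy sketch below shows the principle on a single 2-D array with odd window sizes. The function `boxcar_mean` and its reflective edge padding are assumptions; how PolSARpro's `boxcar` handles borders and chunk boundaries is not described here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def boxcar_mean(img, size_y, size_x):
    """Moving average over an odd size_y x size_x window (edges padded by reflection)."""
    pad_y, pad_x = size_y // 2, size_x // 2
    padded = np.pad(img, ((pad_y, pad_y), (pad_x, pad_x)), mode="reflect")
    # View of every window, shape (y, x, size_y, size_x), without copying data
    windows = sliding_window_view(padded, (size_y, size_x))
    return windows.mean(axis=(-2, -1))


img = np.arange(49, dtype=float).reshape(7, 7)
out = boxcar_mean(img, 5, 5)
print(out.shape)
```

The same function works on complex arrays, which is what the off-diagonal elements of a coherency matrix require.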

Now let's apply the H/A/$\alpha$ decomposition to our $S$ matrix:

In [None]:
from polsarpro.decompositions import h_a_alpha

res = h_a_alpha(S_crop, boxcar_size=[5, 5]).compute()

In [None]:
res

As we can see, the new dataset has a specific poltype and stores the output parameters in labeled variables. As before, they can be accessed via the dot notation, e.g. `res.alpha`.
We can now plot the outputs using either matplotlib or Xarray's plotting capabilities:

In [None]:
res.alpha.plot.imshow()

In [None]:
res.entropy.plot.imshow()

In [None]:
res.anisotropy.plot.imshow()
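
For readers curious about what the three plotted parameters represent, here is a minimal NumPy sketch of the textbook Cloude-Pottier definitions applied to a single 3x3 coherency matrix. It is an illustration under stated assumptions, not PolSARpro's `h_a_alpha` implementation: the function name `h_a_alpha_single`, the eigenvalue floor of `1e-12`, and the output in degrees are all choices made here.

```python
import numpy as np


def h_a_alpha_single(T):
    """Entropy, anisotropy and mean alpha angle (degrees) of one 3x3 Hermitian matrix."""
    eigval, eigvec = np.linalg.eigh(T)            # eigenvalues in ascending order
    eigval = np.clip(eigval[::-1], 1e-12, None)   # descending, floored to avoid log(0)
    eigvec = eigvec[:, ::-1]                      # reorder columns to match
    p = eigval / eigval.sum()                     # pseudo-probabilities
    entropy = -np.sum(p * np.log(p)) / np.log(3)  # logarithm in base 3
    anisotropy = (eigval[1] - eigval[2]) / (eigval[1] + eigval[2])
    # alpha_i is derived from the first component of each eigenvector
    alpha_i = np.arccos(np.clip(np.abs(eigvec[0, :]), 0.0, 1.0))
    alpha = np.degrees(np.sum(p * alpha_i))
    return entropy, anisotropy, alpha


# Three equal eigenvalues: fully random scattering, entropy close to 1
h_rand, a_rand, _ = h_a_alpha_single(np.eye(3, dtype=complex))

# A single mechanism along the second Pauli component: low entropy, alpha near 90 degrees
h_dbl, a_dbl, alpha_dbl = h_a_alpha_single(np.diag([0, 1, 0]).astype(complex))
```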