# Analyze Dataset

An example of how to use the analyzer to analyze a **dataset**.

### Imports
For this example we only need **xarray** and the **analyze_dataset** function from **enstools-compression**.

In [None]:
import xarray
from enstools.compression.analyzer.analyzer import analyze_dataset

In [None]:
dataset_name = "air_temperature"
dataset = xarray.tutorial.open_dataset(dataset_name)
dataset

### Analyze dataset using default constraints
Use **analyze_dataset** to obtain the compression specification that satisfies the quality constraints while maximising the compression ratio.
If the `constrains` argument is not provided, the default constraints are used: `"correlation_I:5,ssim_I:2"`.


> **Note**:
>
> `correlation_I` is computed as `-log10(1 - pearson_correlation)`, i.e. the number of nines of correlation.
>
> `correlation_I:5` is equivalent to `correlation:0.99999`.
>
> Similarly, `ssim_I` is computed as `-log10(1 - ssim)`.

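The relation between an index such as `correlation_I` and the raw metric can be checked with a few lines of plain Python. This is only a sketch of the formula in the note above; `value_to_index` and `index_to_value` are hypothetical helper names and are not part of enstools-compression.

```python
import math


def value_to_index(value: float) -> float:
    """Return -log10(1 - value), the 'number of nines' of a metric in [0, 1)."""
    return -math.log10(1.0 - value)


def index_to_value(index: float) -> float:
    """Invert value_to_index: recover the raw metric from its index."""
    return 1.0 - 10.0 ** (-index)


# correlation_I:5 corresponds to a Pearson correlation of about 0.99999
print(index_to_value(5))

# ssim_I:2 corresponds to an SSIM of about 0.99
print(index_to_value(2))

# Going the other way: a correlation of 0.99999 has roughly five nines
print(value_to_index(0.99999))
```

Higher index values therefore mean stricter constraints: each additional unit removes another factor of ten from `1 - metric`.
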
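The function returns two dictionaries, one containing the best encoding for each variable and another containing the resulting metrics. With the default constraints, the call is `encoding, metrics = analyze_dataset(dataset=dataset)`.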

In [4]:
encoding

{'air': 'lossy,sz,abs,0.14'}

In [5]:
metrics

{'air': {'correlation_I': 5.052599536069354,
  'ssim_I': 3.8547548193909162,
  'compression_ratio': 4.5967042497831745}}
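
To read these results more easily, the specification string can be split into its parts and the indices converted back to raw metric values. The sketch below works on copies of the two dictionaries printed above. The field names (`mode`, `compressor`, `compressor_mode`, `parameter`) are an interpretation of the comma-separated string, and `parse_specification` is a hypothetical helper, not a function of enstools-compression.

```python
def parse_specification(spec: str) -> dict:
    """Split a specification such as 'lossy,sz,abs,0.14' into named fields.

    The field names are an assumption based on the strings shown above.
    """
    mode, compressor, compressor_mode, parameter = spec.split(",")
    return {
        "mode": mode,
        "compressor": compressor,
        "compressor_mode": compressor_mode,
        "parameter": float(parameter),
    }


# Copies of the outputs shown above
encoding = {"air": "lossy,sz,abs,0.14"}
metrics = {
    "air": {
        "correlation_I": 5.052599536069354,
        "ssim_I": 3.8547548193909162,
        "compression_ratio": 4.5967042497831745,
    }
}

for variable, spec in encoding.items():
    fields = parse_specification(spec)
    # Undo the -log10(1 - x) transform to recover the raw metrics
    correlation = 1.0 - 10.0 ** (-metrics[variable]["correlation_I"])
    ssim = 1.0 - 10.0 ** (-metrics[variable]["ssim_I"])
    print(variable, fields)
    print(f"  correlation = {correlation:.7f}, ssim = {ssim:.5f}")
    print(f"  compression ratio = {metrics[variable]['compression_ratio']:.2f}")
```

Both indices exceed the default thresholds (`correlation_I` above 5 and `ssim_I` above 2), which is what the analyzer is asked to guarantee.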

### Analyze dataset using custom constraints

To specify different constraints, pass them through the `constrains` argument:

In [6]:
encoding, metrics = analyze_dataset(dataset=dataset, constrains="correlation_I:3,ssim_I:1")

INFO: air lossy,zfp,accuracy,16  CR:7.9
INFO: air lossy,zfp,rate,1  CR:6.8
INFO: air lossy,zfp,precision,12  CR:7.8
INFO: air lossy,sz,abs,1.4  CR:11.9
INFO: air lossy,sz,rel,0.0195  CR:11.8
INFO: air lossy,sz,pw_rel,0.00586  CR:11.5
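
Each log line above reports one candidate specification together with its compression ratio (CR). As a rough illustration, the sketch below parses those lines and picks the candidate with the highest ratio. The log text is copied from the output above; `parse_candidates` is a hypothetical helper, and the assumption that the returned encoding is the highest-CR candidate is inferred from the stated goal of maximising compression ratios, not confirmed by the output shown here.

```python
import re

# Log output copied from the cell above
LOG = """\
INFO: air lossy,zfp,accuracy,16  CR:7.9
INFO: air lossy,zfp,rate,1  CR:6.8
INFO: air lossy,zfp,precision,12  CR:7.8
INFO: air lossy,sz,abs,1.4  CR:11.9
INFO: air lossy,sz,rel,0.0195  CR:11.8
INFO: air lossy,sz,pw_rel,0.00586  CR:11.5
"""

# "INFO: <variable> <specification>  CR:<ratio>"
PATTERN = re.compile(r"^INFO:\s+(\S+)\s+(\S+)\s+CR:([\d.]+)$")


def parse_candidates(log_text: str) -> list:
    """Return (variable, specification, compression_ratio) tuples."""
    candidates = []
    for line in log_text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            variable, spec, ratio = match.groups()
            candidates.append((variable, spec, float(ratio)))
    return candidates


candidates = parse_candidates(LOG)
best = max(candidates, key=lambda candidate: candidate[2])
print(best)  # ('air', 'lossy,sz,abs,1.4', 11.9)
```

With the relaxed constraints, every candidate reaches a higher compression ratio than the 4.6 obtained with the defaults.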
