# Using the {{ben}} LMDB Reader
This section shows how to use the {{ben}} reader. The reader wraps the Lightning Memory-Mapped Database Manager ({{lmdb}}) that stores the data in the background and exposes it as an indexable Python object.
:::{note}
Because the reader uses {{lmdb}}, which is not pickle-able, the object is not thread safe once it has been accessed for the first time. Using it only after forking is supported, however (e.g. accessing it in the `__getitem__` method of a PyTorch `Dataset`).
:::
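
One way to respect this constraint is to create the reader lazily, on the first item access inside each worker. The sketch below illustrates the pattern with a plain map-style dataset class and a stand-in reader; `LazyReaderDataset`, `reader_factory` and `make_fake_reader` are hypothetical names introduced here, and in practice the class would typically subclass `torch.utils.data.Dataset` and the factory would build a `BENLMDBReader`.

```python
from typing import Any, Callable, List, Optional, Sequence


class LazyReaderDataset:
    """Map-style dataset that creates its reader on first item access.

    Hypothetical sketch: a plain class keeps the example dependency-free.
    """

    def __init__(self, keys: Sequence[str], reader_factory: Callable[[], Any]):
        self.keys = list(keys)
        # Only the factory is stored, so nothing is opened at construction time.
        self._reader_factory = reader_factory
        self._reader: Optional[Any] = None

    def __len__(self) -> int:
        return len(self.keys)

    def __getitem__(self, idx: int) -> Any:
        if self._reader is None:
            # First access in this process, e.g. inside a forked worker.
            self._reader = self._reader_factory()
        return self._reader[self.keys[idx]]


open_calls: List[str] = []


def make_fake_reader() -> dict:
    """Stand-in for a real reader so the sketch runs without the dataset."""
    open_calls.append("opened")
    return {
        "patch_a": ("img_a", ["Pastures"]),
        "patch_b": ("img_b", ["Arable land"]),
    }


dataset = LazyReaderDataset(["patch_a", "patch_b"], make_fake_reader)
opened_before = len(open_calls)  # reader has not been created yet
first_item = dataset[0]          # triggers creation of the reader
opened_after = len(open_calls)   # reader has been created exactly once
```

Deferring creation this way means the main process never touches the database, so each worker opens its own handle after the fork.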

To use the reader, we have to create a `BENLMDBReader` object. Creating it requires four arguments: the directory containing the LMDB file as a `string`, a sequence of three `ints` giving the desired image size `(Channel, Height, Width)`, the bands to use, and the label type.

In [1]:
# remove-input
# remove-output
import matplotlib.pyplot as plt
import pathlib
from pprint import pprint

# locate the mock dataset shipped with the repository, relative to this notebook
my_data_path = str(
    (
        pathlib.Path("").resolve().parent.parent
        / "configilm" / "extra" / "mock_data" / "BigEarthNetEncoded.lmdb"
    ).resolve(strict=True)
)


In [2]:
# remove-output
from configilm.extra.BEN_lmdb_utils import BENLMDBReader

BEN_reader = BENLMDBReader(
    lmdb_dir=my_data_path,  # path to dataset
    image_size=(3, 120, 120),
    bands="RGB",
    label_type="old",
)
img, lbl = BEN_reader["S2B_MSIL2A_20180502T093039_82_40"]


We expect this reader to return images of size `3x120x120` in RGB, annotated with the "old" 43-label version. Images are returned as `torch` tensors and labels as a list of strings.

In [None]:
# remove-input
print(f"Size: {img.shape}")
print("Labels:")
pprint(lbl)
# images have 12 bits of radiometric resolution, so for display simply divide by 2^12
# this is not the best way to display them, but it is enough for a showcase
img /= 2**12

_ = plt.imshow(img.permute(1, 2, 0))
plt.axis('off')
plt.show()

## Selecting Bands
If we are now interested in a vegetation index, for example, we can create a reader that returns only the bands B08 and B04 needed for the index.

The vegetation index, a normalized difference of the two bands, is defined as
$
\begin{align*}
    VI = \frac{B08 - B04}{B08 + B04}
\end{align*}
$
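
As a quick numeric check of the formula, the following minimal sketch evaluates the index for single pixel values. The function name `vegetation_index` and the choice to return `0.0` when both bands are zero are assumptions made for this illustration, not part of the library.

```python
def vegetation_index(b08: float, b04: float) -> float:
    """Return (B08 - B04) / (B08 + B04) for a single pixel."""
    denominator = b08 + b04
    if denominator == 0:
        # Assumption for this sketch: treat pixels without signal as neutral.
        return 0.0
    return (b08 - b04) / denominator


vi_vegetated = vegetation_index(3000.0, 1000.0)  # B08 well above B04
vi_neutral = vegetation_index(1000.0, 1000.0)    # both bands equal
vi_empty = vegetation_index(0.0, 0.0)            # guarded division by zero
```

For non-negative band values the result lies between -1 and 1, with larger values when B08 dominates B04.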

In [None]:
BEN_reader = BENLMDBReader(
    lmdb_dir=my_data_path,  # path to dataset
    image_size=(2, 120, 120),
    bands=["B08", "B04"],
    label_type="old",
)
img, lbl = BEN_reader["S2B_MSIL2A_20180502T093039_82_40"]
veg_idx = (img[0]-img[1])/(img[0]+img[1])

The images returned by this reader have band B08 at index 0 and band B04 at index 1 of the channel dimension, following the order specified in the `bands` parameter. Note that the image size also has to be set to `(2, ...)`, as this is used to check the size after interpolation. Interpolation is already applied in the `Loader` using `torch.nn.functional.interpolate()` in bicubic mode with aligned corners.

In [None]:
# remove-input
# the index is already a normalized ratio, so no rescaling by 2**12 is needed

_ = plt.imshow(veg_idx)
plt.axis('off')
plt.show()

For convenience, some predefined configurations are available, so that not every band has to be listed explicitly. The available predefinitions and their respective bands are:

In [None]:
# remove-input
from configilm.extra.BEN_lmdb_utils import BAND_COMBINATION_PREDEFINTIONS
for k, v in BAND_COMBINATION_PREDEFINTIONS.items():
    s = "'" if isinstance(k, str) else " "  # quote string keys, pad other keys
    k = f"{s}{k}{s}"
    print(f"{k:>8}: {v}")
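
To make the relation between a shorthand and an explicit band list concrete, here is a minimal sketch of how such a lookup and the channel check on `image_size` could work. The table `EXAMPLE_PREDEFINITIONS`, the key `"EXAMPLE_NIR_RED"` and both helper functions are hypothetical and are not the library's implementation; the actual predefinitions are the ones printed by the cell above.

```python
from typing import Dict, List, Sequence, Union

# Hypothetical lookup table, for illustration only.
EXAMPLE_PREDEFINITIONS: Dict[Union[str, int], List[str]] = {
    "RGB": ["B04", "B03", "B02"],        # Sentinel-2 red, green, blue
    "EXAMPLE_NIR_RED": ["B08", "B04"],   # made-up key for this sketch
}

BandSpec = Union[str, int, Sequence[str]]


def resolve_bands(bands: BandSpec) -> List[str]:
    """Turn a shorthand key or an explicit list into a list of band names."""
    if isinstance(bands, (str, int)):
        return list(EXAMPLE_PREDEFINITIONS[bands])
    return list(bands)


def check_channels(bands: BandSpec, image_size: Sequence[int]) -> List[str]:
    """Check that the channel entry of image_size matches the band count."""
    resolved = resolve_bands(bands)
    if image_size[0] != len(resolved):
        raise ValueError(
            f"image_size expects {image_size[0]} channels, "
            f"but {len(resolved)} bands were requested"
        )
    return resolved


rgb_bands = check_channels("RGB", (3, 120, 120))
explicit_bands = check_channels(["B08", "B04"], (2, 120, 120))
```

Passing `["B08", "B04"]` together with `(3, 120, 120)` would raise a `ValueError` in this sketch, which mirrors why the channel count has to agree with the selected bands.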

## Label types
We can also request the labels in the "new" 19-label version as introduced in {cite:t}`BEN19labels`. Here we see that the label 'Water bodies' is converted into 'Inland waters', as expected.

In [None]:
BEN_reader = BENLMDBReader(
    lmdb_dir=my_data_path,  # path to dataset
    image_size=(3, 120, 120),
    bands="RGB",
    label_type="new",
)
img, lbl = BEN_reader["S2B_MSIL2A_20180502T093039_82_40"]
pprint(lbl)

If desired, the 19-label lists can also be converted into a 19-dimensional one-hot tensor. The conversion is uniform: each label always maps to the same position in the vector, so label vectors have the same ordering regardless of who creates them.

In [None]:
from configilm.extra.BEN_lmdb_utils import ben19_list_to_onehot
ben19_list_to_onehot(lbl)
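
The idea behind such a conversion can be sketched in a few lines of plain Python. This is not the implementation of `ben19_list_to_onehot`: the class list `EXAMPLE_CLASSES` is a small illustrative subset, the alphabetical ordering is an assumption, and `labels_to_vector` is a hypothetical helper that returns a list instead of a tensor.

```python
from typing import List, Sequence

# Illustrative subset of class names with an assumed (alphabetical) order;
# the real 19-class list and its order are defined by the library.
EXAMPLE_CLASSES: List[str] = sorted(
    ["Arable land", "Inland waters", "Pastures", "Urban fabric"]
)


def labels_to_vector(
    labels: Sequence[str], classes: Sequence[str] = EXAMPLE_CLASSES
) -> List[float]:
    """Encode a list of label names as a fixed-length 0/1 vector."""
    position = {name: i for i, name in enumerate(classes)}
    vector = [0.0] * len(classes)
    for name in labels:
        # Unknown label names raise a KeyError instead of being ignored.
        vector[position[name]] = 1.0
    return vector


vec_a = labels_to_vector(["Pastures", "Inland waters"])
vec_b = labels_to_vector(["Inland waters", "Pastures"])
```

Because the position of each class is fixed by the class list, `vec_a` and `vec_b` are identical even though the labels were given in a different order.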

## Mean and Standard Deviation
The reader objects also collect mean and standard deviation during initialization based on the chosen band configuration.

In [None]:
BEN_reader_1 = BENLMDBReader(
    lmdb_dir=my_data_path,  # path to dataset
    image_size=(2, 120, 120),
    bands=["B08", "B04"],
    label_type="old",
)
print(f"Mean 1: {BEN_reader_1.mean}")
print(f" Std 1: {BEN_reader_1.std}")

BEN_reader_2 = BENLMDBReader(
    lmdb_dir=my_data_path,  # path to dataset
    image_size=(3, 120, 120),
    bands=["B04", "B01", "B8A"],
    label_type="old",
)
print(f"Mean 2: {BEN_reader_2.mean}")
print(f" Std 2: {BEN_reader_2.std}")
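
A common use of such statistics is per-band normalization of the image values. The following minimal sketch shows the arithmetic on nested lists, assuming that `mean` and `std` hold one value per selected band in the same order as the bands; the helper `normalize_bands` and all numbers are made up for illustration and are not values from the dataset.

```python
from typing import List, Sequence


def normalize_bands(
    bands: Sequence[Sequence[float]],
    mean: Sequence[float],
    std: Sequence[float],
) -> List[List[float]]:
    """Apply (value - mean) / std separately to each band."""
    if not (len(bands) == len(mean) == len(std)):
        raise ValueError("bands, mean and std need one entry per band")
    return [
        [(value - m) / s for value in band]
        for band, m, s in zip(bands, mean, std)
    ]


# Made-up statistics and pixel values for two bands.
example_mean = [100.0, 50.0]
example_std = [10.0, 5.0]
example_pixels = [[100.0, 120.0], [45.0, 50.0]]

normalized = normalize_bands(example_pixels, example_mean, example_std)
```

With tensors the same operation is usually written with broadcasting, but the per-band pairing of statistics and channels stays the same.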