# Cleaning methods when loading optical bands
Let's take a look at the cleaning methods available for optical bands and at how long each of them may take.

<div class="alert alert-warning">

<strong>Warning:</strong>
The durations shown below may not be representative of your computer's performance.
Please take them as a hint about relative performance between constellations.

</div>

To summarize:
- `RAW` is fast and dirty: the band is opened with no pixel processing
- `NODATA` is used by default: it is still relatively fast and sets nodata outside the detector footprint
- `CLEAN` is the most complete method (used by default before version `0.11.0`), but it can be very slow. As defective pixels are relatively rare, it may be overkill for your usage.

Note that the `CLEAN_OPTICAL` keyword works with both the `load` and `stack` functions.
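
To make the difference between the three methods concrete, here is a minimal, library-free sketch on a toy one-dimensional band. The `ToyCleanMethod` enum, the `clean_band` helper, the `footprint` and `defective` masks and the use of `None` as nodata are all hypothetical illustrations, not EOReader's implementation.

```python
from enum import Enum


class ToyCleanMethod(Enum):
    """Hypothetical stand-in for the three cleaning methods."""
    RAW = "raw"
    NODATA = "nodata"
    CLEAN = "clean"


def clean_band(band, footprint, defective, method):
    """Return a cleaned copy of a toy band (a list of pixel values).

    footprint[i] is True when pixel i lies inside the detector footprint.
    defective[i] is True when the provider flags pixel i as defective.
    Invalid pixels are replaced by None, used here as the nodata value.
    """
    if method is ToyCleanMethod.RAW:
        return list(band)  # no pixel processing at all

    cleaned = []
    for value, inside, bad in zip(band, footprint, defective):
        if not inside:
            cleaned.append(None)  # NODATA and CLEAN: mask outside the footprint
        elif bad and method is ToyCleanMethod.CLEAN:
            cleaned.append(None)  # CLEAN only: also mask defective pixels
        else:
            cleaned.append(value)
    return cleaned


band = [0, 120, 135, 65535, 140]
footprint = [False, True, True, True, True]
defective = [False, False, False, True, False]

for method in ToyCleanMethod:
    print(method.name, clean_band(band, footprint, defective, method))
```

In this toy case, `NODATA` only masks the first pixel (outside the footprint), while `CLEAN` additionally masks the defective one.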


## Try with Landsat-8

Let's open a Landsat-8 OLI Collection 2 tile.
Landsat Collection 2 products manage their nodata and defective pixels through two flag files:
- `QA_PIXEL`
- `QA_RADSAT`

See more about these files [here](https://www.usgs.gov/core-science-systems/nli/landsat/landsat-collection-2-quality-assessment-bands)
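
Both files are bit-packed: each bit of a pixel value encodes one flag. As a rough sketch, and assuming the Collection 2 convention in which bit 0 of the pixel quality file marks fill (nodata) pixels, a flag can be tested with a shift and a bitwise AND. The `has_flag` helper and the sample values are illustrative only; check the USGS documentation for the exact bit layout of your product.

```python
FILL_BIT = 0  # assumption: bit 0 flags fill (nodata) pixels in Collection 2


def has_flag(qa_value: int, bit: int) -> bool:
    """Return True if the given bit is set in a bit-packed QA value."""
    return bool((qa_value >> bit) & 1)


# Illustrative QA values: 1 has only bit 0 set, 21824 has bit 0 unset
qa_values = [1, 21824, 21824, 1]
nodata_mask = [has_flag(value, FILL_BIT) for value in qa_values]
print(nodata_mask)
```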

In [1]:
# Imports
import os
from eoreader.reader import Reader
from eoreader.bands import GREEN
from eoreader.keywords import CLEAN_OPTICAL
from eoreader.products import CleanMethod

In [2]:
# Open the product
folder = os.path.join("/home", "ds2_db3", "CI", "eoreader", "optical")
path = os.path.join(folder, "LC08_L1TP_200030_20201220_20210310_02_T1.tar")
reader = Reader()
prod = reader.open(path)

### Time the RAW method
The `RAW` method is simple: just open the given tile with no pixel processing.

In [3]:
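%%timeit
# Load the GREEN band with the RAW method (no pixel cleaning)
prod.load(
    GREEN, 
    **{CLEAN_OPTICAL: CleanMethod.RAW}
)
# Clean the temporary files of the product between two runs
prod.clean_tmp()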

The slowest run took 40.67 times longer than the fastest. This could mean that an intermediate result is being cached.
1min 17s ± 1min 44s per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Time the NODATA method
Only the detector nodata is processed by the `NODATA` method.  
The bands will be set to `nodata` outside of the detector footprint (instead of keeping the raw nodata value)

In [4]:
%%timeit
prod.load(
    GREEN, 
    **{CLEAN_OPTICAL: CleanMethod.NODATA}
)
prod.clean_tmp()

The slowest run took 9.07 times longer than the fastest. This could mean that an intermediate result is being cached.
8.87 s ± 9.71 s per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Time the CLEAN method
Every defective pixel flagged by the provider is processed by the `CLEAN` method.
These pixels will be set to `nodata`.

In [5]:
%%timeit
prod.load(
    GREEN, 
    **{CLEAN_OPTICAL: CleanMethod.CLEAN}
)
prod.clean_tmp()

4.1 s ± 323 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
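
The large spreads reported for some of the runs above suggest that the first call pays a one-off cost (for instance filling a cache), which can hide the relative cost of each method. Outside a notebook, one way to check this is to time every call separately with the standard library. In the sketch below, `time_calls` and `fake_load` are hypothetical: `fake_load` merely simulates a slow first call and should be replaced by your own loading function.

```python
import time


def time_calls(func, repeat=5):
    """Call func `repeat` times and return each duration in seconds."""
    durations = []
    for _ in range(repeat):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


_cache = {}


def fake_load():
    """Hypothetical loader: slow on the first call, cached afterwards."""
    if "band" not in _cache:
        time.sleep(0.05)  # simulate the one-off cost of the first call
        _cache["band"] = [1, 2, 3]
    return _cache["band"]


durations = time_calls(fake_load)
first, rest = durations[0], durations[1:]
print(f"first call: {first:.3f} s, best of the rest: {min(rest):.6f} s")
```

Comparing the first duration with the following ones gives a fairer picture than a single mean when a cache is involved.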


____
## Try another product: Sentinel-2

Let's open a Sentinel-2 product *(processing baseline < 04.00, acquired roughly before the end of 2021, with flag files provided as vectors).*

Invalid pixels are retrieved from the following files:
- `DETFOO`: Detector footprint (nodata outside the detectors)
- `NODATA`: Pixel nodata (inside the detectors) (`QT_NODATA_PIXELS`)
- `DEFECT`: Defective pixels
- `SATURA`: Saturated Pixels
- `TECQUA`: Technical quality mask (`MSI_LOST`, `MSI_DEG`)
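
Conceptually, a cleaning method can be seen as combining a subset of these masks with a logical OR. The sketch below assumes that `NODATA` only relies on `DETFOO` while `CLEAN` uses every mask; this mapping and the `invalid_mask` helper are illustrative guesses, not taken from EOReader's code.

```python
# Assumed mapping between methods and the masks they rely on (hypothetical)
MASKS_PER_METHOD = {
    "RAW": [],
    "NODATA": ["DETFOO"],
    "CLEAN": ["DETFOO", "NODATA", "DEFECT", "SATURA", "TECQUA"],
}


def invalid_mask(masks, method, size):
    """OR together the toy masks used by a method (True = invalid pixel)."""
    combined = [False] * size
    for name in MASKS_PER_METHOD[method]:
        combined = [c or m for c, m in zip(combined, masks[name])]
    return combined


# Toy 4-pixel masks, True where the pixel is flagged as invalid
# (for DETFOO, True means "outside the detector footprint")
masks = {
    "DETFOO": [True, False, False, False],
    "NODATA": [False, False, False, False],
    "DEFECT": [False, True, False, False],
    "SATURA": [False, False, False, True],
    "TECQUA": [False, False, False, False],
}

for method in MASKS_PER_METHOD:
    print(method, invalid_mask(masks, method, size=4))
```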

<div class="alert alert-info">

<strong>Note:</strong> Open the 20 m bands to get array shapes comparable to Landsat-8.

</div>



In [6]:
# Open the product
path = os.path.join(folder, "S2B_MSIL2A_20200114T065229_N0213_R020_T40REQ_20200114T094749.SAFE")
prod = reader.open(path)

### Time the RAW method
The `RAW` method is simple: just open the given tile with no pixel processing.

In [7]:
%%timeit
prod.load(
    GREEN,
    pixel_size=20.,
    **{CLEAN_OPTICAL: CleanMethod.RAW}
)
prod.clean_tmp()

4.86 s ± 231 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Time the NODATA method
Only the detector nodata is processed by the `NODATA` method.  
The bands will be set to `nodata` outside of the detector footprint (instead of keeping the raw nodata value)

In [8]:
%%timeit
prod.load(
    GREEN,
    pixel_size=20.,
    **{CLEAN_OPTICAL: CleanMethod.NODATA}
)
prod.clean_tmp()

5.4 s ± 469 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Time the CLEAN method
Every defective pixel flagged by the provider is processed by the `CLEAN` method.
These pixels will be set to `nodata`.

In [9]:
%%timeit
prod.load(
    GREEN,
    pixel_size=20.,
    **{CLEAN_OPTICAL: CleanMethod.CLEAN}
)
prod.clean_tmp()

5.62 s ± 507 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


____
## Try with the latest Sentinel-2 baseline

Let's open a Sentinel-2 product *(processing baseline >= 04.00, acquired roughly after the end of 2021, with flag files provided as rasters).*

Invalid pixels are retrieved from the following file:
- `QUALIT`: Regrouping `TECQUA`, `DEFECT`, `NODATA`, `SATURA`

The nodata pixels (outside detector footprints) are now retrieved from null pixels, as a radiometric offset has been added.

See [here](https://sentinels.copernicus.eu/web/sentinel/-/copernicus-sentinel-2-major-products-upgrade-upcoming) for more information about the processing baseline update.
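
The consequence of the radiometric offset can be sketched as follows: since valid digital numbers are shifted away from zero, a null pixel can be read as nodata. The offset of -1000 and the quantification value of 10000 are the values usually reported for this baseline, but they are assumptions here and should be read from your product's metadata; `to_reflectance` is a hypothetical helper.

```python
import math

OFFSET = -1000          # assumed radiometric offset (read it from the metadata)
QUANTIFICATION = 10000  # assumed quantification value (read it from the metadata)


def to_reflectance(dn: int) -> float:
    """Convert a digital number to reflectance, mapping null pixels to NaN."""
    if dn == 0:
        return math.nan  # null pixel: treated as nodata
    return (dn + OFFSET) / QUANTIFICATION


digital_numbers = [0, 1000, 1500, 3000]
reflectances = [to_reflectance(dn) for dn in digital_numbers]
print(reflectances)
```

Note that a digital number of 1000 now corresponds to a reflectance of zero, which is why zero itself is free to encode nodata.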

<div class="alert alert-info">

<strong>Note:</strong> Open the 20 m bands to get array shapes comparable to Landsat-8.

</div>

In [10]:
# Open the product
path = os.path.join(folder, "S2B_MSIL2A_20210517T103619_N7990_R008_T30QVE_20211004T113819.SAFE")
prod = reader.open(path)

### Time the RAW method
The `RAW` method is simple: just open the given tile with no pixel processing.

In [11]:
%%timeit
prod.load(
    GREEN,
    pixel_size=20.,
    **{CLEAN_OPTICAL: CleanMethod.RAW}
)
prod.clean_tmp()

4.79 s ± 262 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Time the NODATA method
Only the detector nodata is processed by the `NODATA` method.  
The bands will be set to `nodata` outside of the detector footprint (instead of keeping the raw nodata value)

In [12]:
%%timeit
prod.load(
    GREEN,
    pixel_size=20.,
    **{CLEAN_OPTICAL: CleanMethod.NODATA}
)
prod.clean_tmp()

5.57 s ± 136 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Time the CLEAN method
Every defective pixel flagged by the provider is processed by the `CLEAN` method.
These pixels will be set to `nodata`.

In [None]:
%%timeit
prod.load(
    GREEN,
    pixel_size=20.,
    **{CLEAN_OPTICAL: CleanMethod.CLEAN}
)
prod.clean_tmp()