## Hackathon
* integrate custom functions for NA value & change_dtype
* fix RAM accumulation issue

## Remarks
#### Benchmarking
* need to ensure that results are the same as if they had been processed with native semantique alone
* can't be achieved 100% due to, among others:
    * NA value handling -> maybe this should be integrated as a mandatory part of recipes for TileHandler
    * preview function -> will be parsed differently as extent object (trim=True)
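
One way to make the equality check concrete is to map each result's NA value to NaN before comparing, so that differing NA conventions alone do not count as a mismatch. The sketch below works on plain NumPy arrays; the helper name `results_match` and its `*_nodata` arguments are hypothetical and not part of semantique or gsemantique.

```python
import numpy as np

def results_match(native, tiled, native_nodata=None, tiled_nodata=None, rtol=1e-6):
    """Compare two result arrays after mapping their NA values to NaN."""
    def to_nan(values, nodata):
        out = np.asarray(values, dtype="float64").copy()
        if nodata is not None and not np.isnan(nodata):
            out[out == nodata] = np.nan
        return out

    a = to_nan(native, native_nodata)
    b = to_nan(tiled, tiled_nodata)
    # differing shapes (e.g. caused by trimming) count as a mismatch
    if a.shape != b.shape:
        return False
    return bool(np.allclose(a, b, rtol=rtol, equal_nan=True))

native = np.array([[1.0, np.nan], [3.0, 4.0]])   # NA stored as NaN
tiled = np.array([[1.0, -9999.0], [3.0, 4.0]])   # NA stored as -9999

print(results_match(native, tiled))                      # False
print(results_match(native, tiled, tiled_nodata=-9999))  # True
```

The shape check matters here because a trimmed preview extent can yield a differently sized array even when the overlapping values agree.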

#### Black stripes
* problem of NA values during merge_spatial (rioxarray)
    * by default uses the NA value of the first input DataArray
    * if the NA value is not written, it will use rasterio's default NA value (which is 0)
* can be circumvented by explicitly making sure that the NA value is set
    * solution: udf - update_na
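
The effect can be reproduced with a toy mosaic in plain NumPy. This is a simplified "first valid pixel wins" model written for illustration, not rioxarray's or rasterio's actual merge code; the function `merge_first` and the tile layout are assumptions. Two overlapping tiles use NaN as padding: if the merge believes the NA value is 0, the NaN padding of the first tile is treated as valid data and blocks the second tile, leaving a stripe.

```python
import numpy as np

def merge_first(tiles, offsets, shape, nodata):
    """Toy mosaic: a pixel is written only where the mosaic is still NA
    and the incoming tile pixel is not NA (according to `nodata`)."""
    def is_na(arr):
        return np.isnan(arr) if np.isnan(nodata) else arr == nodata

    out = np.full(shape, nodata, dtype="float64")
    for tile, (row, col) in zip(tiles, offsets):
        window = out[row:row + tile.shape[0], col:col + tile.shape[1]]  # view on out
        writable = is_na(window) & ~is_na(tile)
        window[writable] = tile[writable]
    return out

# tile_a covers columns 0-2, tile_b covers columns 1-3; NaN marks padding
tile_a = np.array([[1.0, 1.0, np.nan], [1.0, 1.0, np.nan]])
tile_b = np.array([[np.nan, 2.0, 2.0], [np.nan, 2.0, 2.0]])
tiles, offsets = [tile_a, tile_b], [(0, 0), (0, 1)]

good = merge_first(tiles, offsets, shape=(2, 4), nodata=np.nan)
bad = merge_first(tiles, offsets, shape=(2, 4), nodata=0)

print(good[0])  # [1. 1. 2. 2.]
print(bad[0])   # [ 1.  1. nan  2.]
```

Writing the NA value explicitly on every tile before merging corresponds to the `nodata=np.nan` call in this model.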

#### Small polygons, Distributed features
* polygons smaller than the pixel size (spatial res) get omitted, points do not
* can raise ValueError: zero-size array, which is also the case for usual non-tiled execution
    * happens in case extent objects are defined for small objects
    * e.g. in case of polygon_small for 02_sreduce.json but not for 01_treduce.json
    * also possible due to the preview with parse_extent (trim=True)
* small grids - a lot of unneeded tiles need to be created & omitted (create_spatial_grid takes time & memory)
    * can be circumvented by just running precise_shp=True for large spatial chunk sizes
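
A minimal sketch of why a small polygon can vanish, assuming a pixel-centre rule for rasterization (the default in rasterio's `rasterize` when `all_touched` is False). Whether semantique applies exactly this rule internally is an assumption, and the helper `pixel_centre_mask` is hypothetical. Rectangles stand in for polygons to keep the example dependency-free.

```python
import numpy as np

def pixel_centre_mask(bounds, res, shape=(4, 4)):
    """Mark pixels whose centre lies within a rectangle (grid origin at 0, 0)."""
    minx, miny, maxx, maxy = bounds
    xs = (np.arange(shape[1]) + 0.5) * res  # x coordinates of pixel centres
    ys = (np.arange(shape[0]) + 0.5) * res  # y coordinates of pixel centres
    in_x = (xs >= minx) & (xs <= maxx)
    in_y = (ys >= miny) & (ys <= maxy)
    return in_y[:, None] & in_x[None, :]

values = np.arange(16, dtype="float64").reshape(4, 4)
large = pixel_centre_mask((20, 20, 280, 180), res=100)
small = pixel_centre_mask((110, 110, 140, 140), res=100)  # 30 m wide, pixel is 100 m

print(int(large.sum()))  # 6
print(int(small.sum()))  # 0

# reducing over the empty selection reproduces the zero-size error
try:
    values[small].max()
except ValueError as err:
    print(err)
```

A point, by contrast, always falls into exactly one pixel, which is consistent with points not being omitted.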

#### Vrt as output
* vrt covers way more than expected (400 x 400 still divided into 4 tiles) - the minimal enclosing extent is not chosen
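
The tile count can be checked with simple arithmetic. The sketch assumes a spatial chunk size of 256 pixels (the value passed to `TileHandler` in the tests of this notebook) and full-sized tiles on a regular grid; both are assumptions about how the grid is built, and `tiled_cover` is a hypothetical helper.

```python
import math

def tiled_cover(width_px, height_px, chunksize):
    """Return the number of tiles and the pixel extent they cover together."""
    n_cols = math.ceil(width_px / chunksize)
    n_rows = math.ceil(height_px / chunksize)
    return n_cols * n_rows, (n_cols * chunksize, n_rows * chunksize)

n_tiles, covered = tiled_cover(400, 400, chunksize=256)
print(n_tiles, covered)  # 4 (512, 512)
```

Under these assumptions a 400 x 400 extent is covered by 512 x 512 pixels, which would explain why the VRT is larger than the minimal enclosing extent.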

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import os

# must be set before geopandas is imported, otherwise it has no effect
os.environ['USE_PYGEOS'] = '0'

import geopandas as gpd
import json
import numpy as np
import pandas as pd
import planetary_computer as pc
import pystac
import pytz
import semantique as sq
import shutil
import urllib.request
import warnings
import zipfile

from datetime import datetime
from pystac import Catalog, get_stac_version
from pystac.extensions.eo import EOExtension
from pystac.extensions.label import LabelExtension
from pystac_client import Client
from semantique.processor.utils import parse_extent
from shapely.geometry import box
from pathlib import Path

from copy import deepcopy

from gsemantique.data.search import Finder
from gsemantique.process.scaling import TileHandler, TileHandlerParallel

In [10]:
from gsemantique.data.datasets import *
ds_catalog = DatasetCatalog()
ds_catalog.load()
ds_catalog.parse_as_table(keys=None)

Unnamed: 0,category,collection,copyright,endpoint,info,layout_bands,layout_file,layout_keys,provider,spatial_extent,src,temporal_extent,temporality,n_bands
0,SAR,sentinel-1-rtc,CC BY 4.0,https://planetarycomputer.microsoft.com/api/st...,Sentinel-1 represent radar imaging (SAR) satel...,"{'s1_amp_vv': {'name': 'vv', 'description': 'G...",c:\users\felix\repositories\gsemantique\gseman...,"[(Planet, reflectance, s1_amp_vv), (Planet, re...",Planet,"[[-180, -90, 180, 90]]",https://planetarycomputer.microsoft.com/datase...,"[[2014-10-10 00:28:21+00:00, None]]",s,4
1,multispectral,sentinel-2-l2a,Copernicus Sentinel Data Terms,https://planetarycomputer.microsoft.com/api/st...,The Sentinel-2 program provides global imagery...,"{'s2_band01': {'name': 'B01', 'description': '...",c:\users\felix\repositories\gsemantique\gseman...,"[(Planet, reflectance, s2_band01), (Planet, re...",Planet,"[[-180, -90, 180, 90]]",https://planetarycomputer.microsoft.com/datase...,"[[2015-06-27 10:25:31+00:00, None]]",s,13
2,multispectral,landsat-c2-l2,Public Domain (https://www.usgs.gov/emergency-...,https://planetarycomputer.microsoft.com/api/st...,"Landsat Collection 2 Level-2 Science Products,...","{'lndst_coastal': {'name': 'coastal', 'descrip...",c:\users\felix\repositories\gsemantique\gseman...,"[(Planet, reflectance, lndst_coastal), (Planet...",Planet,"[[-180.0, -90.0, 180.0, 90.0]]",https://planetarycomputer.microsoft.com/datase...,"[[1982-08-22 00:00:00+00:00, None]]",s,10
3,landcover,esa-worldcover,Creative Commons Attribution 4.0 International...,https://planetarycomputer.microsoft.com/api/st...,The European Space Agency (ESA) WorldCover pro...,"{'esa_lc': {'name': 'map', 'description': 'ESA...",c:\users\felix\repositories\gsemantique\gseman...,"[(Planet, classification, esa_lc)]",Planet,"[[-180.0, -60.0, 180.0, 83.0]]",https://planetarycomputer.microsoft.com/datase...,"[[2020-01-01 00:00:00+00:00, 2021-12-31 23:59:...",Y,1
4,landcover,io-lulc-annual-v02,Creative Commons BY-4.0,https://planetarycomputer.microsoft.com/api/st...,Time series of annual global maps of land use ...,"{'impact_lc': {'name': 'data', 'description': ...",c:\users\felix\repositories\gsemantique\gseman...,"[(Planet, classification, impact_lc)]",Planet,"[[-180, -90, 180, 90]]",https://planetarycomputer.microsoft.com/datase...,"[[2017-01-01 00:00:00+00:00, 2024-01-01 00:00:...",Y,1
5,DEM,nasadem,Public Domain (https://lpdaac.usgs.gov/data/da...,https://planetarycomputer.microsoft.com/api/st...,NASADEM provides global topographic data at 1 ...,"{'dem': {'name': 'elevation', 'description': '...",c:\users\felix\repositories\gsemantique\gseman...,"[(Planet, topography, dem)]",Planet,"[[-179.0, -56.0, 179.0, 61.0]]",https://planetarycomputer.microsoft.com/datase...,"[[2000-02-20 00:00:00+00:00, 2000-02-20 00:00:...",,1
6,DSM,cop-dem-glo-30,,https://planetarycomputer.microsoft.com/api/st...,The Copernicus DEM is a digital surface model ...,"{'dsm': {'name': 'data', 'description': 'Digit...",c:\users\felix\repositories\gsemantique\gseman...,"[(Planet, topography, dsm)]",Planet,"[[-180, -90, 180, 90]]",https://planetarycomputer.microsoft.com/datase...,"[[2021-04-22 00:00:00+00:00, 2021-04-22 00:00:...",,1
7,fire detection,modis-64A1-061,,https://planetarycomputer.microsoft.com/api/st...,The Terra and Aqua combined MCD64A1 Version 6....,"{'m_burn_date': {'name': 'Burn_Date', 'descrip...",c:\users\felix\repositories\gsemantique\gseman...,"[(Planet, burned_mapping, m_burn_date), (Plane...",Planet,"[[-180, -90, 180, 90]]",https://planetarycomputer.microsoft.com/datase...,"[[2000-11-01 00:00:00+00:00, None]]",M,3
8,fire detection,modis-14A2-061,,https://planetarycomputer.microsoft.com/api/st...,The Moderate Resolution Imaging Spectroradiome...,"{'w_burn_qa': {'name': 'QA', 'description': 'P...",c:\users\felix\repositories\gsemantique\gseman...,"[(Planet, burned_mapping, w_burn_qa), (Planet,...",Planet,"[[-180, -90, 180, 90]]",https://planetarycomputer.microsoft.com/datase...,"[[2000-02-18 00:00:00+00:00, None]]",D,2
9,hydrogeography,jrc-gsw,Copernicus Open Access Policy,https://planetarycomputer.microsoft.com/api/st...,Global surface water products from the Europea...,"{'change': {'name': 'change', 'description': '...",c:\users\felix\repositories\gsemantique\gseman...,"[(Planet, hydrogeography, change), (Planet, hy...",Planet,"[[-180.0, -56.0, 180.0, 78.0]]",https://planetarycomputer.microsoft.com/datase...,"[[1984-03-01 00:00:00+00:00, 2020-12-31 11:59:...",,4


In [6]:
# test both TileHandler and TileHandlerParallel
# disable reauth for tests

# set up parameters to be assessed
recipes = [x.as_posix() for x in Path("recipes").rglob("*.json")]
t_intervals = [["2017-06-01", "2017-07-01"], ["2017-01-01", "2017-07-01"]]
aoi_files = [x.as_posix() for x in Path("aois").rglob("*.geojson")]
aoi_processors = ["bbox", "shapes"] 
tile_handlers = ["single", "parallel"] 

# create test recipe
# always recommended to use change_dtype to reduce the size of outputs
# float32 (instead of float64) can be used in almost all cases
# this also allows no-data values to be tracked

def update_na(obj, track_types = True, na_value=None, **kwargs):
    """
    Updates NA values by...
        * converting existing NA values to specified ones
        * persisting the NA value as part of the rio metadata

    Note that it doesn't turn existing non-NA values into NA values. 
    For this functionality see the verb `assign`.
    """
    import numpy as np
    newobj = obj.copy(deep = True)
    # strings such as "np.nan" come from JSON recipes; only eval trusted recipes
    na_value = eval(na_value) if isinstance(na_value, str) else na_value
    # current NA value: the one in the rio metadata, else NaN for float data
    nodata = newobj.rio.nodata
    if nodata is None and newobj.dtype.kind == "f":
        nodata = np.nan
    if na_value is None:
        # nothing to convert, only persist an NA value that is not yet written
        if newobj.rio.nodata is None:
            newobj = newobj.rio.write_nodata(nodata)
        return newobj
    # convert existing NA values (NaN needs isnan since NaN != NaN)
    if nodata is not None:
        if np.isnan(nodata):
            is_na = np.isnan(newobj.values)
        else:
            is_na = newobj.values == nodata
        newobj.values = np.where(is_na, na_value, newobj.values)
    return newobj.rio.write_nodata(na_value)

def change_dtype(obj, track_types = True, dtype="float32", na_value=None, **kwargs):
    import semantique as sq
    # convert dtype (assign the converted array, not a DataArray, to .values)
    newobj = obj.copy(deep = True)
    newobj.values = newobj.values.astype(dtype)
    # track value types
    if track_types:
        newobj.sq.value_type = sq.processor.types.get_value_type(newobj)
    return newobj
    
class Tester:
    def __init__(
        self, 
        recipe, 
        aoi_file, 
        t_interval = ["2017-06-01", "2017-07-01"], 
        tile_handler = "single", 
        merge_mode = "merged",
        out_dir = False,
        res = 100,
        epsg = 3857
    ):
        self.recipe = recipe 
        self.t_interval = t_interval
        self.aoi_file = aoi_file
        self.tile_handler = tile_handler
        self.merge_mode = merge_mode
        self.out_dir = out_dir
        self.res = res
        self.epsg = epsg
        # parse space
        self.gdf = gpd.read_file(self.aoi_file).to_crs(4326)
        self.aoi = box(*self.gdf.total_bounds)
        self.space = sq.SpatialExtent(self.gdf)
        # execute workflow
        self._find_data()
        self._create_context()
        self._run_model()

    def _find_data(self):
        fdr = Finder(
            self.t_interval[0], 
            self.t_interval[1], 
            self.aoi, 
            "Planet", 
            "landsat-c2-l2", 
            ("Planet", "reflectance", "lndst_qa")
        )
        fdr.retrieve_params()
        fdr.retrieve_metadata()
        fdr.postprocess_metadata()
        self.fdr = fdr

    def _create_context(self):
        # init datacube
        with open(self.fdr.params_search["lfile"], "r") as file:
            dc = sq.datacube.STACCube(
                json.load(file), 
                src=self.fdr.item_coll,
                group_by_solar_day=True,
                dask_params=None,
            )
        # define spatio-temporal context vars
        time = sq.TemporalExtent(
            pd.Timestamp(self.fdr.params_search["t_start"]), 
            pd.Timestamp(self.fdr.params_search["t_end"])
        )
        space = sq.SpatialExtent(self.gdf)
        # if self.aoi_processor == "bbox":
        #     space = sq.SpatialExtent(
        #         gpd.GeoDataFrame(
        #             geometry=[box(*self.gdf.to_crs(self.epsg).total_bounds)], 
        #             crs=self.epsg
        #             )
        #         )
        # load mapping
        with open("mapping.json", "r") as file:
            rules = json.load(file)
        mapping = sq.mapping.Semantique(rules)
        # compose to context dict
        context = {
            "datacube": dc,
            "mapping": mapping,
            "space": space,
            "time": time,
            "crs": self.epsg,
            "tz": "UTC",
            "spatial_resolution": [-self.res, self.res],
            "caching": True,
            "track_types": False,
        }
        context = deepcopy(context)
        context["custom_verbs"] = {"change_dtype": change_dtype, "update_na": update_na}
        self.context = context

    def _run_model(self):
        if self.out_dir:
            # define output directory
            out_dir = os.path.splitext(os.path.split(os.path.normpath(self.recipe))[-1])[0]
            out_dir = f"{datetime.now().strftime('%Y%m%d_%H%M%S')}_{out_dir}"
            out_dir = f"results/{out_dir}"
            if os.path.exists(out_dir):
                shutil.rmtree(out_dir)
        else:
            out_dir = None
        
        # load recipe
        with open(self.recipe, "r") as file:
            recipe = json.load(file)
        recipe = sq.QueryRecipe(recipe)

        # just for debugging purposes
        if self.tile_handler == "None":
            with warnings.catch_warnings():
                warnings.simplefilter("ignore", UserWarning)
                context = self.context
                context["preview"] = False
                context["caching"] = False
                self.response = recipe.execute(**context)

        if self.tile_handler == "single":
            self.th = TileHandler(
                recipe, 
                chunksize_s=256,
                chunksize_t="2W",
                merge_mode=self.merge_mode, 
                out_dir=out_dir,
                reauth=False, 
                verbose=True, 
                **self.context
            )
            self.th.execute()
            # if self.merge_mode=="single":
            #     self.response = self.th.joint_res

        elif self.tile_handler == "parallel":
            self.th = TileHandlerParallel(
                recipe, 
                chunksize_s=128, 
                merge_mode=self.merge_mode, 
                out_dir=out_dir, 
                reauth=False, 
                verbose=True, 
                n_procs=os.cpu_count(), 
                **self.context
            )
            self.th.execute()
            if self.merge_mode=="single":
                self.response = self.th.joint_res
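
The recommendation to convert outputs to float32 can be illustrated on a single tile with NumPy alone. The tile content below is random and only serves as a stand-in for a real result; the second check shows the main caveat, namely that float32 cannot represent every integer above 2**24 exactly.

```python
import numpy as np

tile = np.random.default_rng(0).random((256, 256))  # float64 by default
tile32 = tile.astype("float32")

print(tile.nbytes, tile32.nbytes)  # 524288 262144

# float32 has a 24-bit significand: 2**24 + 1 rounds back to 2**24
print(np.float32(2**24 + 1) == np.float32(2**24))  # True
```

For counts or class labels an integer dtype such as int16 is smaller still, provided an NA value that fits into that dtype has been set beforehand.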

In [4]:
# remaining
# MultiProcessorTest

In [7]:
# successful tests?
tester = Tester( 
    recipe = 'recipes/01_treduce.json',
    aoi_file = "aois/polygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "merged",
    out_dir = True,
    res = 100,
    epsg = 32634
)

tester = Tester( 
    recipe = 'recipes/01_treduce.json',
    aoi_file = "aois/polygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "merged",
    out_dir = True,
    res = 100,
    epsg = 32634
)

tester = Tester( 
    recipe = 'recipes/01_treduce.json',
    aoi_file = "aois/polygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "vrt_shapes",
    out_dir = True,
    res = 100,
    epsg = 32634
)

# works (omits small polygon)
tester = Tester( 
    recipe = 'recipes/01_treduce.json',
    aoi_file = "aois/multipolygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "vrt_shapes",
    out_dir = True,
    res = 100,
    epsg = 32754
)

tester = Tester( 
    recipe = 'recipes/01_treduce.json',
    aoi_file = "aois/multipoint.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "vrt_shapes",
    out_dir = True,
    res = 100,
    epsg = 32634
)


tester = Tester( 
    recipe = 'recipes/01_treduce.json',
    aoi_file = "aois/polygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "merged",
    out_dir = True,
    res = 50,
    epsg = 32632
)

tester = Tester( 
    recipe = 'recipes/02_sreduce.json',
    aoi_file = "aois/polygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "merged",
    out_dir = True,
    res = 100,
    epsg = 32634
)

tester = Tester( 
    recipe = 'recipes/03_tconcatenated.json',
    aoi_file = "aois/polygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "merged",
    out_dir = True,
    res = 100,
    epsg = 32634
)

tester = Tester( 
    recipe = 'recipes/03_tconcatenated.json',
    aoi_file = "aois/multipolygon.geojson",
    t_interval = t_intervals[1],
    tile_handler = "single", 
    merge_mode = "merged",
    out_dir = True,
    res = 100,
    epsg = 32754
)

tester = Tester( 
    recipe = 'recipes/04_sconcatenated.json',
    aoi_file = "aois/polygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "merged", 
    out_dir = False,
    res = 100,
    epsg = 32634
)

tester = Tester(
    recipe = "recipes/05_tgrouped.json",
    aoi_file = "aois/multipolygon.geojson",
    t_interval = t_intervals[1],
    tile_handler = "single", 
    merge_mode = "merged",
    out_dir = True,
    res = 100,
    epsg = 32754
)

tester = Tester(
    recipe = "recipes/05_tgrouped.json",
    aoi_file = "aois/multipoint.geojson",
    t_interval = t_intervals[1],
    tile_handler = "single", 
    merge_mode = "merged",
    out_dir = True,
    res = 100,
    epsg = 32632
)

tester = Tester( 
    recipe = 'recipes/06_sgrouped.json',
    aoi_file = "aois/polygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "merged",
    out_dir = True,
    res = 100,
    epsg = 32634
)

tester = Tester( 
    recipe = 'recipes/06_sgrouped.json',
    aoi_file = "aois/multipolygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "merged",
    out_dir = True,
    res = 100,
    epsg = 32754
)

tester = Tester(
    recipe = "recipes/07_tmulti_grouped.json",
    aoi_file = "aois/polygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "vrt_shapes",
    out_dir = True,
    res = 100,
    epsg = 32634
)

tester = Tester(
    recipe = "recipes/08_tmulti_double_strat.json",
    aoi_file = "aois/polygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "vrt_shapes",
    out_dir = True,
    res = 100,
    epsg = 32634
)

tester = Tester( 
    recipe = 'recipes/09_udf.json',
    aoi_file = "aois/polygon.geojson",
    t_interval = t_intervals[0],
    tile_handler = "single", 
    merge_mode = "merged",
    out_dir = True,
    res = 100,
    epsg = 32634
)

Found: 26 datasets
The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

---------------------------------------
General layer info
---------------------------------------
layer     : dtype     crs   res        
---------------------------------------
obs_count : float64   32634 [-100, 100]
---------------------------------------

------------------------------------------
Scenario: 'merge' = None
------------------------------------------
layer     :  size     tile n    tile shape
------------------------------------------
obs_count : 0.00 Gb  4 tile(s)  (256, 256)
------------------------------------------
Total       0.00 Gb  4 tile(s)
------------------------------------------

------------------------------------------
Scenario: 'merge' = vrt_*
------

100%|██████████| 4/4 [00:01<00:00,  2.47it/s]


Found: 26 datasets
The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

---------------------------------------
General layer info
---------------------------------------
layer     : dtype     crs   res        
---------------------------------------
obs_count : float64   32634 [-100, 100]
---------------------------------------

------------------------------------------
Scenario: 'merge' = None
------------------------------------------
layer     :  size     tile n    tile shape
------------------------------------------
obs_count : 0.00 Gb  4 tile(s)  (256, 256)
------------------------------------------
Total       0.00 Gb  4 tile(s)
------------------------------------------

------------------------------------------
Scenario: 'merge' = vrt_*
------

100%|██████████| 4/4 [00:01<00:00,  2.67it/s]


Found: 26 datasets
The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

---------------------------------------
General layer info
---------------------------------------
layer     : dtype     crs   res        
---------------------------------------
obs_count : float64   32634 [-100, 100]
---------------------------------------

------------------------------------------
Scenario: 'merge' = None
------------------------------------------
layer     :  size     tile n    tile shape
------------------------------------------
obs_count : 0.00 Gb  4 tile(s)  (256, 256)
------------------------------------------
Total       0.00 Gb  4 tile(s)
------------------------------------------

------------------------------------------
Scenario: 'merge' = vrt_*
------

100%|██████████| 4/4 [00:01<00:00,  2.52it/s]


Found: 14 datasets
The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

---------------------------------------
General layer info
---------------------------------------
layer     : dtype     crs   res        
---------------------------------------
obs_count : float64   32754 [-100, 100]
---------------------------------------

------------------------------------------
Scenario: 'merge' = None
------------------------------------------
layer     :  size     tile n    tile shape
------------------------------------------
obs_count : 0.00 Gb  5 tile(s)  (256, 256)
------------------------------------------
Total       0.00 Gb  5 tile(s)
------------------------------------------

------------------------------------------
Scenario: 'merge' = vrt_*
------

100%|██████████| 5/5 [00:01<00:00,  3.02it/s]


Found: 31 datasets
The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

---------------------------------------
General layer info
---------------------------------------
layer     : dtype     crs   res        
---------------------------------------
obs_count : float64   32634 [-100, 100]
---------------------------------------

------------------------------------------
Scenario: 'merge' = None
------------------------------------------
layer     :  size     tile n    tile shape
------------------------------------------
obs_count : 0.00 Gb  5 tile(s)  (256, 256)
------------------------------------------
Total       0.00 Gb  5 tile(s)
------------------------------------------

------------------------------------------
Scenario: 'merge' = vrt_*
------

100%|██████████| 5/5 [00:01<00:00,  3.78it/s]


Found: 26 datasets
The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

-------------------------------------
General layer info
-------------------------------------
layer     : dtype     crs   res      
-------------------------------------
obs_count : float64   32632 [-50, 50]
-------------------------------------

------------------------------------------
Scenario: 'merge' = None
------------------------------------------
layer     :  size     tile n    tile shape
------------------------------------------
obs_count : 0.00 Gb  5 tile(s)  (256, 256)
------------------------------------------
Total       0.00 Gb  5 tile(s)
------------------------------------------

------------------------------------------
Scenario: 'merge' = vrt_*
------------------

100%|██████████| 5/5 [00:02<00:00,  2.06it/s]


Found: 26 datasets
preview() is currently only implemented for spatial outputs. Unless you are processing very dense timeseries and/or processing many features it's save to assume that the size of your output is rather small, so don't worry about the memory space.



100%|██████████| 3/3 [00:03<00:00,  1.20s/it]


Found: 26 datasets
The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

---------------------------------------
General layer info
---------------------------------------
layer     : dtype     crs   res        
---------------------------------------
obs_count : float64   32634 [-100, 100]
---------------------------------------

---------------------------------------------
Scenario: 'merge' = None
---------------------------------------------
layer     :  size     tile n     tile shape  
---------------------------------------------
obs_count : 0.00 Gb  4 tile(s)  (1, 256, 256)
---------------------------------------------
Total       0.00 Gb  4 tile(s)
---------------------------------------------

---------------------------------------------
Scenario

100%|██████████| 4/4 [00:01<00:00,  2.19it/s]


Found: 78 datasets
The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

---------------------------------------
General layer info
---------------------------------------
layer     : dtype     crs   res        
---------------------------------------
obs_count : float64   32754 [-100, 100]
---------------------------------------

---------------------------------------------
Scenario: 'merge' = None
---------------------------------------------
layer     :  size     tile n     tile shape  
---------------------------------------------
obs_count : 0.01 Gb  5 tile(s)  (6, 256, 256)
---------------------------------------------
Total       0.01 Gb  5 tile(s)
---------------------------------------------

---------------------------------------------
Scenario

100%|██████████| 5/5 [00:04<00:00,  1.08it/s]


Found: 26 datasets
preview() is currently only implemented for spatial outputs. Unless you are processing very dense timeseries and/or processing many features it's save to assume that the size of your output is rather small, so don't worry about the memory space.



100%|██████████| 3/3 [00:02<00:00,  1.01it/s]


Found: 78 datasets
The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

---------------------------------------
General layer info
---------------------------------------
layer     : dtype     crs   res        
---------------------------------------
obs_count : float64   32754 [-100, 100]
---------------------------------------

---------------------------------------------
Scenario: 'merge' = None
---------------------------------------------
layer     :  size     tile n     tile shape  
---------------------------------------------
obs_count : 0.01 Gb  5 tile(s)  (6, 256, 256)
---------------------------------------------
Total       0.01 Gb  5 tile(s)
---------------------------------------------

---------------------------------------------
Scenario

100%|██████████| 5/5 [00:04<00:00,  1.10it/s]


Found: 189 datasets
The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

---------------------------------------
General layer info
---------------------------------------
layer     : dtype     crs   res        
---------------------------------------
obs_count : float64   32632 [-100, 100]
---------------------------------------

---------------------------------------------
Scenario: 'merge' = None
---------------------------------------------
layer     :  size     tile n     tile shape  
---------------------------------------------
obs_count : 0.02 Gb  6 tile(s)  (6, 256, 256)
---------------------------------------------
Total       0.02 Gb  6 tile(s)
---------------------------------------------

---------------------------------------------
Scenari

100%|██████████| 6/6 [00:03<00:00,  1.55it/s]


Found: 26 datasets
preview() is currently only implemented for spatial outputs. Unless you are processing very dense timeseries and/or processing many features it's save to assume that the size of your output is rather small, so don't worry about the memory space.



100%|██████████| 3/3 [00:02<00:00,  1.11it/s]


Found: 14 datasets
preview() is currently only implemented for spatial outputs. Unless you are processing very dense timeseries and/or processing many features it's save to assume that the size of your output is rather small, so don't worry about the memory space.



100%|██████████| 3/3 [00:03<00:00,  1.06s/it]


Found: 26 datasets


  return np.nanmedian(x, axis = axis)


The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

--------------------------------------
General layer info
--------------------------------------
layer    : dtype     crs   res        
--------------------------------------
aoi_mask : float64   32634 [-100, 100]
comp     : float32   32634 [-100, 100]
--------------------------------------

--------------------------------------------
Scenario: 'merge' = None
--------------------------------------------
layer    :  size     tile n     tile shape  
--------------------------------------------
aoi_mask : 0.00 Gb  4 tile(s)     (256, 256)
comp     : 0.00 Gb  4 tile(s)  (3, 256, 256)
--------------------------------------------
Total      0.00 Gb  8 tile(s)
--------------------------------------------

---

  return np.nanmedian(x, axis = axis)
  return np.nanmedian(x, axis = axis)
  return np.nanmedian(x, axis = axis)
  return np.nanmedian(x, axis = axis)
100%|██████████| 4/4 [00:12<00:00,  3.20s/it]


Found: 26 datasets


  return np.nanmedian(x, axis = axis)


The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

--------------------------------------
General layer info
--------------------------------------
layer    : dtype     crs   res        
--------------------------------------
aoi_mask : float64   32634 [-100, 100]
comp     : float32   32634 [-100, 100]
--------------------------------------

--------------------------------------------
Scenario: 'merge' = None
--------------------------------------------
layer    :  size     tile n     tile shape  
--------------------------------------------
aoi_mask : 0.00 Gb  4 tile(s)     (256, 256)
comp     : 0.00 Gb  4 tile(s)  (3, 256, 256)
--------------------------------------------
Total      0.00 Gb  8 tile(s)
--------------------------------------------

---

  return np.nanmedian(x, axis = axis)
  return np.nanmedian(x, axis = axis)
  return np.nanmedian(x, axis = axis)
  return np.nanmedian(x, axis = axis)
100%|██████████| 4/4 [00:12<00:00,  3.19s/it]


Found: 26 datasets
The following numbers are rough estimations depending on the chosen strategy for merging the individual tile results. If merge='merged' is choosen the total size indicates a lower bound for how much RAM is required since the individual tile results will be stored in RAM before merging.

---------------------------------------
General layer info
---------------------------------------
layer     : dtype     crs   res        
---------------------------------------
obs_count : int16     32634 [-100, 100]
---------------------------------------

------------------------------------------
Scenario: 'merge' = None
------------------------------------------
layer     :  size     tile n    tile shape
------------------------------------------
obs_count : 0.00 Gb  4 tile(s)  (256, 256)
------------------------------------------
Total       0.00 Gb  4 tile(s)
------------------------------------------

------------------------------------------
Scenario: 'merge' = vrt_*
------

100%|██████████| 4/4 [00:20<00:00,  5.17s/it]
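
The sizes printed in the `'merge' = None` tables above are consistent with a simple uncompressed-size estimate in GiB (tiles x values per tile x bytes per value). The estimation code of `TileHandler` is not shown in this notebook, so the formula below is an assumption that merely reproduces the printed figures; `estimate_size_gib` is a hypothetical helper.

```python
import numpy as np

def estimate_size_gib(n_tiles, tile_shape, dtype):
    """Uncompressed size of all tiles of one layer in GiB."""
    n_values = n_tiles * int(np.prod(tile_shape))
    return n_values * np.dtype(dtype).itemsize / 2**30

print(f"{estimate_size_gib(4, (256, 256), 'float64'):.2f}")     # 0.00
print(f"{estimate_size_gib(5, (6, 256, 256), 'float64'):.2f}")  # 0.01
print(f"{estimate_size_gib(6, (6, 256, 256), 'float64'):.2f}")  # 0.02
```

With two decimals, small test areas like the ones used here round to zero, so the table only becomes informative for considerably larger extents or longer time series.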


In [6]:
# import geopandas as gpd
# polys = gpd.read_file("aois/multipolygon.geojson").to_crs(32754)
# polys.bounds["maxx"] - polys.bounds["minx"]
# polys.bounds["maxx"] - polys.bounds["minx"]