*Sen2+1Cube*: Luke McQuade, June 2023 

# *dpolcat* Performance Evaluation

As *dpolcat* may be used in large-scale analyses, it is useful to know its performance characteristics. So, let's measure the time and memory needed to process an entire Sentinel-1 scene. Our selected scene covers an area around Almaty, Kazakhstan: S1A_IW_GRDH_1SDV_20230730T124317_20230730T124342_049653_05F87E_rtc.

## Setup

In [1]:
%pip install memory-profiler
%load_ext memory_profiler

Note: you may need to restart the kernel to use updated packages.


In [2]:
import pystac_client
import planetary_computer
import stackstac

Automatically reload the module when changes are made.

In [3]:
%load_ext autoreload
%autoreload 2

In [4]:
import dpolcat as dp

## Helpers

In [5]:
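def get_item_by_id(item_collection, item_id: str):
    """Return the first item whose id equals item_id (raises StopIteration if none matches)."""
    return next(item for item in item_collection if item.id == item_id)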

## Evaluation

In [6]:
# Bounding box around Almaty: [min lon, min lat, max lon, max lat]
search_bbox = [76.733460, 43.085164, 77.112488, 43.385241]

catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)

search = catalog.search(
    collections=["sentinel-1-rtc"], bbox=search_bbox, datetime="2023-07-26/2023-08-07"
)
items = search.item_collection()
print(f"Found {len(items)} items")

test_item = get_item_by_id(items, "S1A_IW_GRDH_1SDV_20230730T124317_20230730T124342_049653_05F87E_rtc")
print(f"Selected {test_item.id}")

ds = stackstac.stack([test_item])

Found 3 items
Selected S1A_IW_GRDH_1SDV_20230730T124317_20230730T124342_049653_05F87E_rtc


The computation environment may not have enough memory to process an entire scene at once, so we use a fractional subset: a tile spanning a fraction of the scene height. As the computational cost scales approximately linearly with tile size, we should be able to estimate the total resource cost by scaling the results accordingly.
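To check ahead of time whether a tile is likely to fit in memory, the raw size of the loaded arrays can be estimated from their shape and element size. The following is a minimal sketch, assuming 8-byte (float64) values and hypothetical scene dimensions that are not taken from the scene above. It gives a lower bound only, since intermediate copies made during processing add to the peak.

```python
def array_mib(rows: int, cols: int, bytes_per_value: int = 8) -> float:
    """Raw size in MiB of a 2D array (assumption: 8 bytes per value, i.e. float64)."""
    return rows * cols * bytes_per_value / 2**20


# Hypothetical scene dimensions, for illustration only.
scene_rows, scene_cols = 20_000, 28_000
height_frac = 1 / 4

tile_rows = int(scene_rows * height_frac)
per_band_mib = array_mib(tile_rows, scene_cols)

# Two bands (VV and VH) are loaded for each tile.
print(f"Tile: {tile_rows} x {scene_cols}")
print(f"Per band: {per_band_mib:.0f} MiB | VV + VH: {2 * per_band_mib:.0f} MiB")
```

In the notebook itself, the `nbytes` attribute of the lazy arrays (for example `vv_lin_l.nbytes`) should give the same kind of estimate without loading any data.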

To determine the cost of each step in the process, we run the steps cumulatively (step 1 alone, then steps 1 and 2, then all three) and subtract, working backwards from the total costs, to obtain the cost of each individual step.

In [7]:
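# Fraction of the scene height to process in one tile.
# height_frac = 1 / 8
height_frac = 1 / 4
tile_height = int(ds.shape[2] * height_frac)  # ds dims: (time, band, y, x)
# Lazy (not yet loaded) VV and VH tiles from the first time step.
vv_lin_l = ds.sel(band="vv")[0][0:tile_height]
vh_lin_l = ds.sel(band="vh")[0][0:tile_height]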

### Loading source data into memory

In [8]:
%%timeit -r 3
vv_lin = vv_lin_l.compute()
vh_lin = vh_lin_l.compute()

7.23 s ± 90.1 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)


In [9]:
%%memit
vv_lin = vv_lin_l.compute()
vh_lin = vh_lin_l.compute()

peak memory: 4601.82 MiB, increment: 2527.32 MiB


### Loading and scaling

In [10]:
%%timeit -r 3
vv_sn = dp.scale_nice(vv_lin_l.compute())
vh_sn = dp.scale_nice(vh_lin_l.compute())

10.4 s ± 25.3 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)


In [11]:
%%memit
vv_sn = dp.scale_nice(vv_lin_l.compute())
vh_sn = dp.scale_nice(vh_lin_l.compute())

peak memory: 9504.31 MiB, increment: 4769.12 MiB


### Loading, scaling and categorization

In [None]:
%%timeit -r 3
cat_result = dp.categorize_xa(
    dp.scale_nice(
        vv_lin_l.compute()),
    dp.scale_nice(
        vh_lin_l.compute()))

In [None]:
%%memit
cat_result = dp.categorize_xa(
    dp.scale_nice(
        vv_lin_l.compute()),
    dp.scale_nice(
        vh_lin_l.compute()))

### Results

#### Timing
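
Based on a sample run of the above. The recorded values appear to come from a 1/8-height tile: they are roughly half the 1/4-height timings shown above, and they reproduce the summary table when scaled by 8. The scaling below therefore uses that fraction rather than the current `height_frac`.

In [None]:
# recorded cumulative timings, in seconds
t_steps_1_2_3 = 31.5
t_steps_1_2 = 4.95
t_steps_1 = 3.27

# fraction of the scene height used for the recorded run (assumed to be 1/8, see note above)
recorded_frac = 1 / 8

# calculated per-step timings
t_steps_2 = t_steps_1_2 - t_steps_1
t_steps_3 = t_steps_1_2_3 - t_steps_1_2

print(f"Tile size: {ds.sizes['x']}*{int(ds.sizes['y'] * recorded_frac)} | Scene size: {ds.sizes['x']}*{ds.sizes['y']}")
print(f"Step 1 - Reading:        {t_steps_1:.2f} s (tile) | {t_steps_1 / recorded_frac:.1f} s (est. scene)")
print(f"Step 2 - Scaling:        {t_steps_2:.2f} s (tile) | {t_steps_2 / recorded_frac:.1f} s (est. scene)")
print(f"Step 3 - Categorization: {t_steps_3:.2f} s (tile) | {t_steps_3 / recorded_frac:.1f} s (est. scene)")
print(f"Total:                   {t_steps_1_2_3:.2f} s (tile) | {t_steps_1_2_3 / recorded_frac:.1f} s (est. scene)")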


In summary, the estimated scene timings are:

| Step | Time |
| --- | --- |
| 1. Reading | 26s |
| 2. Scaling | 13s |
| 3. Categorization | 3m 32s |
| Total | 4m 12s |
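
As a cross-check, the table values can be reproduced from the recorded cumulative timings. This is a minimal sketch, assuming the recorded run used a 1/8-height tile (which is what the table values imply); `format_duration` is a hypothetical helper introduced here for illustration.

```python
# Cumulative timings in seconds, as recorded above.
t_steps_1 = 3.27
t_steps_1_2 = 4.95
t_steps_1_2_3 = 31.5

# Assumption: the recorded run used a 1/8-height tile.
recorded_frac = 1 / 8


def format_duration(seconds: float) -> str:
    """Format a duration such as '3m 32s', rounded to the nearest second."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"


# Per-step tile costs, obtained by subtracting consecutive cumulative timings.
tile_costs = {
    "1. Reading": t_steps_1,
    "2. Scaling": t_steps_1_2 - t_steps_1,
    "3. Categorization": t_steps_1_2_3 - t_steps_1_2,
    "Total": t_steps_1_2_3,
}

# Scale each tile cost up to the full scene.
scene_estimates = {
    step: format_duration(cost / recorded_frac) for step, cost in tile_costs.items()
}
for step, estimate in scene_estimates.items():
    print(f"{step}: {estimate}")
```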


#### Memory

Practically, overall peak memory usage is the most important metric here. It was previously measured at approximately 13GB for a 1/8-height tile and 21GB for a 1/4-height tile. Memory use is therefore not strictly proportional to tile size, but extrapolating linearly from the difference between those two measurements suggests that we need approximately 68GB of memory to process an entire scene.
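
The extrapolation can be sketched as a straight line through the two measurements. This is an illustrative calculation using the rounded figures quoted above, and `extrapolate_linear` is a hypothetical helper. With these rounded inputs the line gives about 69GB for the full scene, in line with the approximate figure above. The small gap presumably comes from rounding of the inputs.

```python
def extrapolate_linear(x0: float, y0: float, x1: float, y1: float, x: float) -> float:
    """Evaluate at x the straight line passing through (x0, y0) and (x1, y1)."""
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x - x0)


# Rounded peak memory measurements: (fraction of scene height, peak memory in GB).
frac_a, peak_a = 1 / 8, 13.0
frac_b, peak_b = 1 / 4, 21.0

# Estimated peak for the full scene (fraction 1.0) and the fixed overhead (fraction 0.0).
full_scene_gb = extrapolate_linear(frac_a, peak_a, frac_b, peak_b, 1.0)
overhead_gb = extrapolate_linear(frac_a, peak_a, frac_b, peak_b, 0.0)

print(f"Estimated full-scene peak: {full_scene_gb:.0f} GB")
print(f"Implied fixed overhead:    {overhead_gb:.0f} GB")
```

The non-zero intercept is why simply multiplying the 1/4-height figure by 4 would overestimate the requirement.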