# Adding generic post processing steps

This notebook walks through how to add your own post-processing functions to the pipeline.


This notebook is expected to be run in an environment where the pipeline components are deployed.

### Imports

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
!pwd

/Users/catherinewanjiru/projects/climate/geospatial-studio-pipelines/docs/notebooks


Install the post_process package

In [3]:
%pip install -e ../../components/postprocess-generic-single/post_process


Obtaining file:///Users/catherinewanjiru/projects/climate/geospatial-studio-pipelines/components/postprocess-generic-single/post_process
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: post_process
  Building editable for post_process (pyproject.toml) ... done
  Created wheel for post_process: filename=post_process-0.1.0-0.editable-py3-none-any.whl size=2973 sha256=a52014c6662b11d1eafdb10cc6902af25a005b12b11efd3a9426113308858d65
  Stored in directory: /private/var/folders/dm/1tlmdc8d5jd71fgt6jkwrq0c0000gn/T/pip-ephem-wheel-cache-g2fwmfst/wheels/13/02/49/ae196bdbc8c7b3123948f540d67cf24c7ede92b543cf60c190
Successfully built post_process
Installing collected packages: post_process
  Attempting uninstall: post_process
    Found existing ins

In [4]:
import numpy as np
import xarray as xr
import rioxarray as rio

from post_process.post_process import post_process
from post_process.sdk import add_step, download_plugins_to_local
from post_process.discovery import load_fs_plugins
from post_process.registry import POST_PROCESS_REGISTRY


## Adding new step in the Registry

### With local files

This assumes the user has a Python script whose functions carry the `@register_step` decorator, so that they can be registered.
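To make the decorator idea concrete, here is a minimal, self-contained sketch of how a decorator-based registry can work. The `REGISTRY` dictionary and the `register_step` defined here are illustrative stand-ins written for this example; the real decorator in the `post_process` package may take different arguments and behave differently.

```python
from typing import Callable, Dict

# Hypothetical stand-in for the package's registry (not the real POST_PROCESS_REGISTRY).
REGISTRY: Dict[str, Callable] = {}


def register_step(name: str) -> Callable[[Callable], Callable]:
    """Return a decorator that stores the decorated function under `name`."""

    def decorator(func: Callable) -> Callable:
        REGISTRY[name] = func
        return func  # the function itself is left unchanged

    return decorator


# A user script would decorate its step function like this.
@register_step("user_masking_local")
def user_masking_local(img, img_path: str = "", out_path: str = "", data: str = ""):
    print("Running user masking registered step")
    return img


print(list(REGISTRY.keys()))
```

Importing a module that contains decorated functions is enough to populate the registry, which is why loading the plugin files is the registration step.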

In [5]:
# User uploads a script with @register_step decorator
masking_script_path = "user_masking_local_testing.py"
add_step(script_path=masking_script_path, allow_version_suffix=False)

2025-12-03 11:27:30,916 - post_process.logging - INFO - sdk.py:85 - add_step - ✅ Step 'user_masking_local_testing' added to /Users/catherinewanjiru/projects/climate/geospatial-studio-pipelines/components/postprocess-generic-single/post_process/post_process/generic successfully.
 Run `load_fs_plugins` to register the function


In [6]:
# Now register every post-process step found in the plugins directory
load_fs_plugins(directory="../../components/postprocess-generic-single/post_process/post_process/generic/")

2025-12-03 11:27:30,939 - post_process.logging - INFO - discovery.py:17 - load_fs_plugins - Loaded post-process plugin from ../../components/postprocess-generic-single/post_process/post_process/generic/user_masking_local_testing.py as module user_masking_local_testing
2025-12-03 11:27:30,940 - post_process.logging - INFO - discovery.py:17 - load_fs_plugins - Loaded post-process plugin from ../../components/postprocess-generic-single/post_process/post_process/generic/example_post_processing.py as module example_post_processing
2025-12-03 11:27:30,941 - post_process.logging - INFO - discovery.py:17 - load_fs_plugins - Loaded post-process plugin from ../../components/postprocess-generic-single/post_process/post_process/generic/__init__.py as module __init__
2025-12-03 11:27:30,995 - post_process.logging - INFO - discovery.py:17 - load_fs_plugins - Loaded post-process plugin from ../../components/postprocess-generic-single/post_process/post_process/generic/im2poly_regularize.py as module i

When we list the registered steps, they should appear.

In [7]:
# Confirm that the step is available in the registry. 
# This lists all the other functions in the directory.
list(POST_PROCESS_REGISTRY.keys())

['user_masking_local',
 'example_post_processing',
 'im2poly_regularize',
 'masking']

With the functions registered, we can now run the `post_process` function, which discovers all the registered steps and runs only the ones listed in the activated steps passed as `steps_config`.
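The "discover everything, run only what is activated" behaviour can be pictured with the small sketch below. It is an illustration under assumptions, not the package's implementation: `run_activated_steps` and the toy registry are hypothetical, and the real `post_process` may chain step outputs or return a different structure.

```python
from typing import Any, Callable, Dict, List


def run_activated_steps(
    img: Any,
    steps_config: List[Dict[str, Any]],
    registry: Dict[str, Callable],
) -> Dict[str, Any]:
    """Run only the steps named in `steps_config`, looking each one up in `registry`."""
    outputs: Dict[str, Any] = {}
    for step in steps_config:
        name = step["name"]
        func = registry.get(name)
        if func is None:
            print(f"Post-process step '{name}' not found, skipping.")
            continue
        # Pass the step's params as keyword arguments.
        outputs[name] = func(img, **step.get("params", {}))
    return outputs


# Toy registry with two steps; only one of them is activated below.
toy_registry = {
    "double": lambda img, factor=2: img * factor,
    "negate": lambda img: -img,
}
toy_config = [{"name": "double", "params": {"factor": 3}}]

toy_outputs = run_activated_steps(2, toy_config, toy_registry)
print(toy_outputs)
```

Here `negate` is registered but never runs, because it is absent from the configuration list.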

#### Test that the function works

With the script registered in the pipeline, let us try running it.

In [8]:
activated_local_steps = [
    {
        "name": "user_masking_local",
        "params": {
            "img_path": "",
            "out_path": "",
            "data": "",
        },
    },
]

Create an in-memory sample raster to use in the functions.

In [9]:
data = np.random.randint(0, 256, size=(1, 256, 256)).astype('int16')
xds = xr.DataArray(data, dims=('band', 'y', 'x'), name='sample')
# attach CRS so it behaves like a raster for rioxarray/post-processing
xds.rio.write_crs("EPSG:4326", inplace=True)
print("Created in-memory sample xds:", xds.shape)

Created in-memory sample xds: (1, 256, 256)


Clear the previously registered steps from the registry, since the `post_process` entrypoint registers them again.

In [10]:
POST_PROCESS_REGISTRY.clear()
POST_PROCESS_REGISTRY

{}

Run the post-process function. Notice in the printed output how it runs the previously defined scripts/steps.

In [11]:
# Since this function is expected to be the entrypoint for post-processing,
# by default the steps are registered afresh.
post_process_outputs = post_process(
    img=xds,
    steps_config=activated_local_steps,
    plugins_dir="../../components/postprocess-generic-single/post_process/post_process/generic/",
)

2025-12-03 11:27:31,198 - post_process.logging - INFO - discovery.py:17 - load_fs_plugins - Loaded post-process plugin from ../../components/postprocess-generic-single/post_process/post_process/generic/user_masking_local_testing.py as module user_masking_local_testing
2025-12-03 11:27:31,199 - post_process.logging - INFO - discovery.py:17 - load_fs_plugins - Loaded post-process plugin from ../../components/postprocess-generic-single/post_process/post_process/generic/example_post_processing.py as module example_post_processing
2025-12-03 11:27:31,200 - post_process.logging - INFO - discovery.py:17 - load_fs_plugins - Loaded post-process plugin from ../../components/postprocess-generic-single/post_process/post_process/generic/__init__.py as module __init__
2025-12-03 11:27:31,201 - post_process.logging - INFO - discovery.py:17 - load_fs_plugins - Loaded post-process plugin from ../../components/postprocess-generic-single/post_process/post_process/generic/im2poly_regularize.py as module i

Requested steps: [{'name': 'user_masking_local', 'params': {'img_path': '', 'out_path': '', 'data': ''}}]
Post-process step 'user_masking_local'  found.
Running user masking registered step


### With downloadable python script files

Here, the assumption is that the user has a Python script at a downloadable URL. We will use the `download_plugins_to_local` function to download the script into the folder of the deployed post-processing component.

This function downloads the files and places them in that folder. <br> Overwriting existing files is not allowed by default. To enable this, set
```
allow_version_suffix=True
```
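As a rough illustration of the "download, but do not overwrite by default" idea, the sketch below fetches a file into a target folder and refuses to replace an existing file unless explicitly asked. The helper `download_without_overwrite`, its `overwrite` flag, and its status strings are hypothetical; this is not how `download_plugins_to_local` is implemented, and it does not model `allow_version_suffix`.

```python
import tempfile
import urllib.request
from pathlib import Path


def download_without_overwrite(
    url: str, dest_dir: Path, filename: str, overwrite: bool = False
) -> dict:
    """Fetch `url` into `dest_dir/filename`, keeping any existing file unless `overwrite` is set."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / filename
    if target.exists() and not overwrite:
        return {"filename": filename, "status": "skipped_existing", "path": str(target)}
    with urllib.request.urlopen(url) as response:
        target.write_bytes(response.read())
    return {"filename": filename, "status": "downloaded", "path": str(target)}


# Use a local file:// URL so the example runs without network access.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source_plugin.py"
    source.write_text("print('hello from plugin')\n")
    plugins_dir = Path(tmp) / "plugins"

    first = download_without_overwrite(source.as_uri(), plugins_dir, "plugin.py")
    second = download_without_overwrite(source.as_uri(), plugins_dir, "plugin.py")
    print(first["status"], second["status"])
```

The second call leaves the existing file untouched, which mirrors the default behaviour described above.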

In [12]:
results, saved_plugins = download_plugins_to_local(
    verify_ssl=False, # Whether to verify SSL certificates
    plugins_list=[
        {
            "url": "https://s3.us-east.cloud-object-storage.appdomain.cloud/geospatial-studio-example-data/example_post_processing_scripts/user_masking_testing.py?response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=1c6317aac0054b9890797f09f217b54e%2F20251202%2Fus-east%2Fs3%2Faws4_request&X-Amz-Date=20251202T142413Z&X-Amz-Expires=259200&X-Amz-SignedHeaders=host&X-Amz-Signature=833373fd68b5266ad79e2edc23666ace1a58bb886c45179f504c312592f49c7e",
            "filename": "user_masking_testing.py",
        },
        {
            "url": "https://s3.us-east.cloud-object-storage.appdomain.cloud/geospatial-studio-example-data/example_post_processing_scripts/cloud_masking_testing.py?response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=1c6317aac0054b9890797f09f217b54e%2F20251202%2Fus-east%2Fs3%2Faws4_request&X-Amz-Date=20251202T142349Z&X-Amz-Expires=259200&X-Amz-SignedHeaders=host&X-Amz-Signature=f625ebade9ce400c0535c216f7ee18fbf589b9ba6bc3c481ed74ca9826322d40",
            "filename": "cloud_masking_testing.py",
        },
    ] # List of plugins to download
)

2025-12-03 11:27:31,218 - post_process.logging - INFO - sdk.py:200 - download_plugins_to_local -  ⬇️  Downloading plugin from https://s3.us-east.cloud-object-storage.appdomain.cloud/geospatial-studio-example-data/example_post_processing_scripts/user_masking_testing.py?response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=1c6317aac0054b9890797f09f217b54e%2F20251202%2Fus-east%2Fs3%2Faws4_request&X-Amz-Date=20251202T142413Z&X-Amz-Expires=259200&X-Amz-SignedHeaders=host&X-Amz-Signature=833373fd68b5266ad79e2edc23666ace1a58bb886c45179f504c312592f49c7e to /tmp/post_process/user_masking_testing.py
2025-12-03 11:27:32,717 - post_process.logging - INFO - sdk.py:85 - add_step - ✅ Step 'user_masking_testing' added to /Users/catherinewanjiru/projects/climate/geospatial-studio-pipelines/components/postprocess-generic-single/post_process/post_process/generic successfully.
 Run `load_fs_plugins` to register the function
2025-12-03 11:27:32,717 - post_proces

In [13]:
results

[{'entry': {'url': 'https://s3.us-east.cloud-object-storage.appdomain.cloud/geospatial-studio-example-data/example_post_processing_scripts/user_masking_testing.py?response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=1c6317aac0054b9890797f09f217b54e%2F20251202%2Fus-east%2Fs3%2Faws4_request&X-Amz-Date=20251202T142413Z&X-Amz-Expires=259200&X-Amz-SignedHeaders=host&X-Amz-Signature=833373fd68b5266ad79e2edc23666ace1a58bb886c45179f504c312592f49c7e',
   'filename': 'user_masking_testing.py'},
  'status': 'writen_to_disk',
  'filename': 'user_masking_testing.py',
  'path': '/tmp/post_process/user_masking_testing.py'},
 {'entry': {'url': 'https://s3.us-east.cloud-object-storage.appdomain.cloud/geospatial-studio-example-data/example_post_processing_scripts/cloud_masking_testing.py?response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=1c6317aac0054b9890797f09f217b54e%2F20251202%2Fus-east%2Fs3%2Faws4_request&

In [14]:
saved_plugins

[{'entry': {'url': 'https://s3.us-east.cloud-object-storage.appdomain.cloud/geospatial-studio-example-data/example_post_processing_scripts/user_masking_testing.py?response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=1c6317aac0054b9890797f09f217b54e%2F20251202%2Fus-east%2Fs3%2Faws4_request&X-Amz-Date=20251202T142413Z&X-Amz-Expires=259200&X-Amz-SignedHeaders=host&X-Amz-Signature=833373fd68b5266ad79e2edc23666ace1a58bb886c45179f504c312592f49c7e',
   'filename': 'user_masking_testing.py'},
  'status': 'writen_to_disk',
  'filename': 'user_masking_testing.py',
  'path': '/tmp/post_process/user_masking_testing.py'},
 {'entry': {'url': 'https://s3.us-east.cloud-object-storage.appdomain.cloud/geospatial-studio-example-data/example_post_processing_scripts/cloud_masking_testing.py?response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=1c6317aac0054b9890797f09f217b54e%2F20251202%2Fus-east%2Fs3%2Faws4_request&
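Because each result is a dictionary with `entry`, `status`, `filename`, and `path` keys, a quick way to check a batch of downloads is to group filenames by status. The sketch below uses stand-in data: the URLs are placeholders, and the second entry with a `"failed"` status is hypothetical, since the only status visible in the output above is `'writen_to_disk'`.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Stand-in data shaped like the `results` output above (placeholder URLs).
results_example: List[Dict[str, Any]] = [
    {
        "entry": {
            "url": "https://example.com/user_masking_testing.py",
            "filename": "user_masking_testing.py",
        },
        "status": "writen_to_disk",  # spelling as it appears in the output
        "filename": "user_masking_testing.py",
        "path": "/tmp/post_process/user_masking_testing.py",
    },
    {
        "entry": {
            "url": "https://example.com/other_plugin.py",
            "filename": "other_plugin.py",
        },
        "status": "failed",  # hypothetical status, for illustration only
        "filename": "other_plugin.py",
        "path": None,
    },
]


def group_by_status(results: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each status string to the list of filenames that reported it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for item in results:
        grouped[item["status"]].append(item["filename"])
    return dict(grouped)


summary = group_by_status(results_example)
print(summary)
```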

After downloading the files to the generic post-processing folder, we can now register them using the `load_fs_plugins` function.

In [15]:
load_fs_plugins()

When we list the registered steps, they should appear.

In [16]:
print(list(POST_PROCESS_REGISTRY.keys()))

['user_masking_local', 'example_post_processing', 'im2poly_regularize', 'masking']


Clear the previously registered steps from the registry, since the `post_process` entrypoint registers them again.

In [17]:
POST_PROCESS_REGISTRY.clear()
POST_PROCESS_REGISTRY

{}

Then, as before, we can run the post-processing steps.

In [18]:
activated_downloaded_steps = [
    {
        "name": "cloud_masking_testing",
        "params": {
            "img_path": "test_images/big_bbox_pred.tif",
            "out_path": "",
            "data": "",
        },
    },
    {
        "name": "user_masking",
        "params": {
            "img_path": "test_images/big_bbox_pred.tif",
            "out_path": "",
            "data": "",
        },
    },
]
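Before running a configuration like this, it can help to confirm that every requested step name is actually registered. The helper below is a hypothetical convenience, not part of the `post_process` package; it works on any dict-like registry, and with the real registry it would only be meaningful after the plugins have been loaded.

```python
from typing import Any, Dict, List, Mapping


def find_missing_steps(
    steps_config: List[Dict[str, Any]], registry: Mapping[str, Any]
) -> List[str]:
    """Return the requested step names that are not present in `registry`."""
    return [step["name"] for step in steps_config if step["name"] not in registry]


# Toy registry standing in for the loaded plugins.
toy_registry = {
    "cloud_masking_testing": lambda img, **params: img,
    "user_masking": lambda img, **params: img,
}
requested_steps = [
    {"name": "cloud_masking_testing", "params": {}},
    {"name": "user_masking", "params": {}},
    {"name": "typo_step", "params": {}},  # hypothetical misspelt step name
]

missing = find_missing_steps(requested_steps, toy_registry)
print(missing)
```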

In [19]:
post_process_outputs = post_process(
    img=xds,
    steps_config=activated_downloaded_steps,
    plugins_dir="../../components/postprocess-generic-single/post_process/post_process/generic/",
)

2025-12-03 11:27:33,995 - post_process.logging - INFO - discovery.py:17 - load_fs_plugins - Loaded post-process plugin from ../../components/postprocess-generic-single/post_process/post_process/generic/user_masking_testing.py as module user_masking_testing
2025-12-03 11:27:33,996 - post_process.logging - INFO - discovery.py:17 - load_fs_plugins - Loaded post-process plugin from ../../components/postprocess-generic-single/post_process/post_process/generic/user_masking_local_testing.py as module user_masking_local_testing
2025-12-03 11:27:33,996 - post_process.logging - INFO - discovery.py:17 - load_fs_plugins - Loaded post-process plugin from ../../components/postprocess-generic-single/post_process/post_process/generic/example_post_processing.py as module example_post_processing
2025-12-03 11:27:33,997 - post_process.logging - INFO - discovery.py:17 - load_fs_plugins - Loaded post-process plugin from ../../components/postprocess-generic-single/post_process/post_process/generic/__init__.

Requested steps: [{'name': 'cloud_masking_testing', 'params': {'img_path': 'test_images/big_bbox_pred.tif', 'out_path': '', 'data': ''}}, {'name': 'user_masking', 'params': {'img_path': 'test_images/big_bbox_pred.tif', 'out_path': '', 'data': ''}}]
Post-process step 'cloud_masking_testing'  found.
Running cloud masking registered step
Post-process step 'user_masking'  found.
Running user masking registered step
