# 003-Running-inferences-with-extra-pipeline-steps

Download the [003-Running-inferences-with-extra-pipeline-steps](003-Running-inferences-with-extra-pipeline-steps.ipynb) notebook and try it out.

## Introduction
This notebook is aimed at users with minimal prior knowledge of the Geospatial Studio. It walks through the most important functions of the Geospatial Studio SDK, focusing on running inference with extra post-processing pipeline steps.

For more information about the Geospatial Studio see the docs page: [Geospatial Studio Docs](https://terrastackai.github.io/geospatial-studio)

For more information about the Geospatial Studio SDK and all the functions available through it, see the SDK docs page: [Geospatial Studio SDK Docs](https://terrastackai.github.io/geospatial-studio-toolkit)

In [1]:
# Install extra requirements
! pip install boto3


In [2]:
%load_ext autoreload
%autoreload 2

In [2]:
# import the required packages

import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

from geostudio import Client
from geostudio import gswidgets

## Connecting to the platform
First, we set up the connection to the platform backend. To do this we need the base URL of the Studio UI and an API key.

To get an API key:
1. Go to the Geospatial Studio UI page and navigate to the "Manage your API keys" link.
2. This opens a pop-up window where you can generate, access, and delete your API keys. NB: every user is limited to a maximum of two active API keys at any one time.

Store the API key and the Geostudio UI base URL in a local credentials file, for example `/User/bob/.geostudio_config_file`. You can do this with:

```bash
echo "GEOSTUDIO_API_KEY=<paste_api_key_here>" > .geostudio_config_file
echo "BASE_STUDIO_UI_URL=<paste_ui_base_url_here>" >> .geostudio_config_file
```

Copy and paste the path to this credentials file into the call below.
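If you prefer to create the credentials file from Python rather than the shell, the sketch below writes the same two `KEY=value` lines and reads them back as a sanity check. The helper names `write_geostudio_config` and `read_geostudio_config` are hypothetical and not part of the SDK, the base URL is a placeholder, and the parsing only illustrates the file layout; it is not a description of how `Client` reads the file.

```python
import tempfile
from pathlib import Path


def write_geostudio_config(path, api_key, base_url):
    """Write the two KEY=value lines used by the shell commands above."""
    lines = [
        f"GEOSTUDIO_API_KEY={api_key}",
        f"BASE_STUDIO_UI_URL={base_url}",
    ]
    Path(path).write_text("\n".join(lines) + "\n")


def read_geostudio_config(path):
    """Parse KEY=value lines into a dict (illustrative only)."""
    config = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        config[key] = value
    return config


# Demonstration with placeholder values in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = Path(tmp_dir) / ".geostudio_config_file"
    write_geostudio_config(
        config_path, "<paste_api_key_here>", "https://studio.example.com"
    )
    demo_config = read_geostudio_config(config_path)

print(sorted(demo_config))
```

Reading the file back before creating the client is a cheap way to catch a missing key or a stray typo in the variable names.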


In [3]:
#############################################################
# Initialize Geostudio client using a geostudio config file
#############################################################
gfm_client = Client(geostudio_config_file=".geostudio_config_file")


Using api key and base urls from geostudio config file


## Running inference

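This section runs inference with extra custom post-processing steps that are applied after the model prediction. Two requests are submitted: the first uses a pre-built registered step (`im2poly_regularize`), and the second additionally downloads user-supplied post-processing scripts through the `download_scripts` option.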

In [5]:
# Replace with your tune_id

tune_id = "geotune-4kyeuubgnuk8kjdlnjdrsl"

In [6]:
# define payload
request_prebuilt_steps_payload = {
    "description": "prebuilt steps inference",
    "location": "austin",
    "fine_tuning_id": tune_id,  # replace this with the ID of your tune
    "model_display_name": "geofm-sandbox-models",  # Keep as is
    "spatial_domain": {  # provide downloadable link to the file(s) to run inference on.
        "urls": [
            "https://ibm.box.com/shared/static/kpvuc75q0j29rvf3w2qiamgczuvuymmd.tif"
        ]
    },
    "temporal_domain": ["2025-11-24"],
    # Provide this
    "geoserver_push": [
        # Keep these 2 layers as is.
        {
            "z_index": 0,
            "workspace": "geofm",
            "layer_name": "input_rgb",
            "file_suffix": "",
            "display_name": "Input image (RGB)",
            "filepath_key": "model_input_original_image_rgb",
            "geoserver_style": {
                "rgb": [
                    {
                        "label": "RedChannel",
                        "channel": 1,
                        "maxValue": 255,
                        "minValue": 0,
                    },
                    {
                        "label": "GreenChannel",
                        "channel": 2,
                        "maxValue": 255,
                        "minValue": 0,
                    },
                    {
                        "label": "BlueChannel",
                        "channel": 3,
                        "maxValue": 255,
                        "minValue": 0,
                    },
                ]
            },
            "visible_by_default": "True",
        },
        {
            "z_index": 1,
            "workspace": "geofm",
            "layer_name": "pred",
            "file_suffix": "",
            "display_name": "Model prediction",
            "filepath_key": "model_output_image",
            "geoserver_style": {
                "segmentation": [
                    {
                        "color": "#7d7247",
                        "label": "no-buildings",
                        "opacity": 0,
                        "quantity": "0",
                    },
                    {
                        "color": "#390c8c",
                        "label": "buildings",
                        "opacity": 1,
                        "quantity": "1",
                    },
                ]
            },
            "visible_by_default": "True",
        },
        # Add all layers to push created by your step(s).
        {
            "z_index": 1,
            "workspace": "geofm",
            "layer_name": "im2poly",  # Name of the registered step
            "file_suffix": "gpkg",  # Suffix of the generated file
            "display_name": "im2poly_regularize_image",  # Display name of the Step
            "filepath_key": "model_output_im2poly_regularize_image",  # Should be <model_output_{step_registered_name}_image>
            "geoserver_style": {},  # for shape files, keep this empty
            "visible_by_default": "True",
        },
    ],
    "post_processing": {
        "cloud_masking": "False",
        "regularization": "False",
        # Defining the extra post-processing steps
        "regularization_custom": [
            {
                "name": "im2poly_regularize",  # step_registered_name
                "params": {  # params of the function
                    "geoserver_suffix_extension": "gpkg",  # Expected
                    "params": {
                        "simplify_tolerance": 2,
                        "parallel_threshold": 2.0,
                        "allow_45_degree": "True",
                    },
                },
            }
        ],
    },
}

In [7]:
inference_response = gfm_client.try_out_tune(tune_id=tune_id, data=request_prebuilt_steps_payload)
inference_response

{'spatial_domain': {'bbox': [],
  'polygons': [],
  'tiles': [],
  'urls': ['https://ibm.box.com/shared/static/kpvuc75q0j29rvf3w2qiamgczuvuymmd.tif']},
 'temporal_domain': ['2025-11-24'],
 'fine_tuning_id': 'geotune-4kyeuubgnuk8kjdlnjdrsl',
 'maxcc': 100,
 'model_display_name': 'geofm-sandbox-models',
 'description': 'prebuilt steps inference',
 'location': 'austin',
 'geoserver_layers': None,
 'demo': None,
 'model_id': '6eeed629-5206-4c1a-8188-58ad754fd235',
 'inference_output': None,
 'id': '84e8b661-d568-4314-8b31-1827352215c0',
 'active': True,
 'created_by': 'Catherine.Wanjiru@ibm.com',
 'created_at': '2025-12-04T09:29:17.137150Z',
 'updated_at': '2025-12-04T09:29:17.457513Z',
 'status': 'PENDING',
 'tasks_count_total': 1,
 'tasks_count_success': 0,
 'tasks_count_failed': 0,
 'tasks_count_stopped': 0,
 'tasks_count_waiting': 1}
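The custom-step parts of the payload above follow a naming pattern: the step's `filepath_key` is `model_output_{step_registered_name}_image`, and the same file suffix appears both in the layer entry and in `geoserver_suffix_extension`. The sketch below builds both fragments from one set of inputs so they stay consistent. The helper names `custom_step_layer` and `custom_step_entry` are hypothetical and not part of the SDK; the field names and values are copied from the payload in this notebook.

```python
def custom_step_layer(step_name, layer_name, file_suffix, z_index):
    """Build the geoserver_push entry for the output of a custom step."""
    return {
        "z_index": z_index,
        "workspace": "geofm",
        "layer_name": layer_name,
        "file_suffix": file_suffix,
        "display_name": f"{step_name}_image",
        # Convention from the payload comment: model_output_{step}_image
        "filepath_key": f"model_output_{step_name}_image",
        "geoserver_style": {},  # kept empty for vector outputs
        "visible_by_default": "True",
    }


def custom_step_entry(step_name, file_suffix, **step_params):
    """Build the regularization_custom entry for the same step."""
    return {
        "name": step_name,
        "params": {
            "geoserver_suffix_extension": file_suffix,
            "params": dict(step_params),
        },
    }


layer = custom_step_layer("im2poly_regularize", "im2poly", "gpkg", z_index=1)
step = custom_step_entry(
    "im2poly_regularize",
    "gpkg",
    simplify_tolerance=2,
    parallel_threshold=2.0,
    allow_45_degree="True",
)
print(layer["filepath_key"])
```

The resulting dictionaries can be appended to `geoserver_push` and `post_processing["regularization_custom"]` respectively.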

In [8]:
# define payload
request_download_scripts_payload = {
    "description": "download scripts",
    "location": "austin",
    "fine_tuning_id": tune_id,  # replace this with the ID of your tune
    "model_display_name": "geofm-sandbox-models",  # Keep as is
    "spatial_domain": {  # provide downloadable link to the file(s) to run inference on.
        "urls": [
            "https://ibm.box.com/shared/static/kpvuc75q0j29rvf3w2qiamgczuvuymmd.tif"
        ]
    },
    "temporal_domain": ["2025-11-24"],
    # Provide this
    "geoserver_push": [
        # Keep these 2 layers as is.
        {
            "z_index": 0,
            "workspace": "geofm",
            "layer_name": "input_rgb",
            "file_suffix": "",
            "display_name": "Input image (RGB)",
            "filepath_key": "model_input_original_image_rgb",
            "geoserver_style": {
                "rgb": [
                    {
                        "label": "RedChannel",
                        "channel": 1,
                        "maxValue": 255,
                        "minValue": 0,
                    },
                    {
                        "label": "GreenChannel",
                        "channel": 2,
                        "maxValue": 255,
                        "minValue": 0,
                    },
                    {
                        "label": "BlueChannel",
                        "channel": 3,
                        "maxValue": 255,
                        "minValue": 0,
                    },
                ]
            },
            "visible_by_default": "True",
        },
        {
            "z_index": 1,
            "workspace": "geofm",
            "layer_name": "pred",
            "file_suffix": "",
            "display_name": "Model prediction",
            "filepath_key": "model_output_image",
            "geoserver_style": {
                "segmentation": [
                    {
                        "color": "#7d7247",
                        "label": "no-buildings",
                        "opacity": 0,
                        "quantity": "0",
                    },
                    {
                        "color": "#390c8c",
                        "label": "buildings",
                        "opacity": 1,
                        "quantity": "1",
                    },
                ]
            },
            "visible_by_default": "True",
        },
        # Add all layers to push created by your step(s).
        {
            "z_index": 1,
            "workspace": "geofm",
            "layer_name": "im2poly",  # Name of the registered step
            "file_suffix": "gpkg",  # Suffix of the generated file
            "display_name": "im2poly_regularize_image",  # Display name of the Step
            "filepath_key": "model_output_im2poly_regularize_image",  # Should be <model_output_{step_registered_name}_image>
            "geoserver_style": {},  # for shape files, keep this empty
            "visible_by_default": "True",
        },
    ],
    "post_processing": {
        "cloud_masking": "False",
        "regularization": "False",
        "download_scripts": {
            "activated": "True",
            "verify_ssl": "False",
            "plugins_list": [
                {
                    "url": "https://s3.us-east.cloud-object-storage.appdomain.cloud/geospatial-studio-example-data/example_post_processing_scripts/user_masking_testing.py?response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=1c6317aac0054b9890797f09f217b54e%2F20251203%2Fus-east%2Fs3%2Faws4_request&X-Amz-Date=20251203T103901Z&X-Amz-Expires=259200&X-Amz-SignedHeaders=host&X-Amz-Signature=ca85cef43562cf6bc88d20a8f7463a311a0cdde8d53288ef552acc38b86a3849",
                    "filename": "user_masking_testing.py",
                },
                {
                    "url": "https://s3.us-east.cloud-object-storage.appdomain.cloud/geospatial-studio-example-data/example_post_processing_scripts/cloud_masking_testing.py?response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=1c6317aac0054b9890797f09f217b54e%2F20251203%2Fus-east%2Fs3%2Faws4_request&X-Amz-Date=20251203T104802Z&X-Amz-Expires=259200&X-Amz-SignedHeaders=host&X-Amz-Signature=4f702cb5424c1a8438acc5a0865142e95c58240ca9d122cc322016d1878ba1b8",
                    "filename": "cloud_masking_testing.py",
                },
            ],
        },
        # Defining the extra post-processing steps
        "regularization_custom": [
            {
                "name": "im2poly_regularize",  # step_registered_name
                "params": {  # params of the function
                    "geoserver_suffix_extension": "gpkg",  # Expected
                    "params": {
                        "simplify_tolerance": 2,
                        "parallel_threshold": 2.0,
                        "allow_45_degree": "True",
                    },
                },
            }
        ],
    },
}

In [9]:
inference_response = gfm_client.try_out_tune(tune_id=tune_id, data=request_download_scripts_payload)
inference_response


{'spatial_domain': {'bbox': [],
  'polygons': [],
  'tiles': [],
  'urls': ['https://ibm.box.com/shared/static/kpvuc75q0j29rvf3w2qiamgczuvuymmd.tif']},
 'temporal_domain': ['2025-11-24'],
 'fine_tuning_id': 'geotune-4kyeuubgnuk8kjdlnjdrsl',
 'maxcc': 100,
 'model_display_name': 'geofm-sandbox-models',
 'description': 'download scripts',
 'location': 'austin',
 'geoserver_layers': None,
 'demo': None,
 'model_id': '6eeed629-5206-4c1a-8188-58ad754fd235',
 'inference_output': None,
 'id': 'ee82608d-39cf-4101-b39f-653579b1b314',
 'active': True,
 'created_by': 'Catherine.Wanjiru@ibm.com',
 'created_at': '2025-12-04T09:29:19.158507Z',
 'updated_at': '2025-12-04T09:29:19.495407Z',
 'status': 'PENDING',
 'tasks_count_total': 1,
 'tasks_count_success': 0,
 'tasks_count_failed': 0,
 'tasks_count_stopped': 0,
 'tasks_count_waiting': 1}
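The script URLs in `plugins_list` carry S3-style presigned query parameters (`X-Amz-Date`, `X-Amz-Expires`), so they stop working after a fixed time. Assuming the standard meaning of those parameters (signing time in UTC, and validity in seconds counted from it), the sketch below computes when such a URL expires. The function names and the shortened `example.com` URL are hypothetical; only the two parameter values are taken from the payload above.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_url_expiry(url):
    """Return the UTC datetime at which a presigned URL stops being valid."""
    query = parse_qs(urlsplit(url).query)
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    valid_for = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return signed_at + valid_for


def is_expired(url, now=None):
    """True if the presigned URL has already expired at time `now`."""
    now = now or datetime.now(timezone.utc)
    return now >= presigned_url_expiry(url)


# Shortened, hypothetical URL reusing the date and expiry values from above.
example_url = (
    "https://example.com/bucket/user_masking_testing.py"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20251203T103901Z"
    "&X-Amz-Expires=259200"
    "&X-Amz-SignedHeaders=host"
)
expiry = presigned_url_expiry(example_url)
print(expiry.isoformat())  # 2025-12-06T10:39:01+00:00
```

Checking `is_expired(url)` for each plugin URL before submitting the request helps rule out a stale link as the cause of a failed post-processing step; in that case, generate fresh URLs for your own scripts.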

## Monitor inference status and progress

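After submitting the request, we can poll the inference service to check its progress and to get the output details once it's complete (this can take a few minutes, depending on the request size and the current service load). Because `inference_response` was reassigned in the previous cell, the calls below track the second (download scripts) request.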

In [10]:
# Poll inference status
r = gfm_client.poll_inference_until_finished(inference_id=inference_response['id'])


COMPLETED - 289 seconds
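For readers who want to see what a polling loop of this kind involves, here is a minimal, generic sketch. It is not the SDK's implementation of `poll_inference_until_finished`. The function `poll_until_finished`, its parameters, and the set of terminal statuses are assumptions (only `PENDING` and `COMPLETED` appear in this notebook), and the status source is stubbed so the example runs without a backend.

```python
import time

# Assumed terminal states; only PENDING and COMPLETED are shown in this notebook.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "STOPPED"}


def poll_until_finished(get_status, poll_interval=10.0, timeout=1800.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call get_status() repeatedly until it returns a terminal status.

    Returns (status, elapsed_seconds); raises TimeoutError after `timeout`.
    """
    start = clock()
    while True:
        status = get_status()
        elapsed = clock() - start
        if status in TERMINAL_STATUSES:
            return status, elapsed
        if elapsed >= timeout:
            raise TimeoutError(f"still {status} after {elapsed:.0f} seconds")
        sleep(poll_interval)


# Stubbed status source standing in for the inference service.
fake_statuses = iter(["PENDING", "PENDING", "COMPLETED"])
final_status, waited = poll_until_finished(
    get_status=lambda: next(fake_statuses),
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(final_status)
```

With a real client, `get_status` could be something like `lambda: gfm_client.get_inference(inference_id=inference_response["id"])["status"]`, since `get_inference` returns a dictionary with a `status` field.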


In [11]:
response = gfm_client.get_inference(inference_id=inference_response['id'])
response

{'spatial_domain': {'bbox': [],
  'polygons': [],
  'tiles': [],
  'urls': ['https://ibm.box.com/shared/static/kpvuc75q0j29rvf3w2qiamgczuvuymmd.tif']},
 'temporal_domain': ['2025-11-24'],
 'fine_tuning_id': 'geotune-4kyeuubgnuk8kjdlnjdrsl',
 'maxcc': 100,
 'model_display_name': 'geofm-sandbox-models',
 'description': 'download scripts',
 'location': 'austin',
 'geoserver_layers': {'predicted_layers': [{'uri': 'geofm:ee82608d-39cf-4101-b39f-653579b1b314-input_rgb',
    'display_name': 'Input image (RGB)',
    'sld_body': '<?xml version="1.0" encoding="UTF-8"?>\n    <StyledLayerDescriptor xmlns="http://www.opengis.net/sld" xmlns:ogc="http://www.opengis.net/ogc" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/sld\n    http://schemas.opengis.net/sld/1.0.0/StyledLayerDescriptor.xsd" version="1.0.0">\n        <NamedLayer>\n            <Name>geofm:ee82608d-39cf-4101-b39f-653579b1b314-input_rgb</Name>\n

You can now view the finished inference in the UI.

## Accessing inference outputs
Once an inference run is completed, the inputs and outputs of each task in the inference are packaged into a zip file and uploaded to a URL from which you can download them.

To access the inference task files:
1. Get the inference tasks list
2. Identify the specific inference task you want to view
3. Download task output files

In [12]:
# Get the inference tasks list
inf_tasks_res = gfm_client.get_inference_tasks(response["id"])
inf_tasks_res

{'inference_id': 'ee82608d-39cf-4101-b39f-653579b1b314',
 'status': 'COMPLETED',
 'tasks': [{'inference_id': 'ee82608d-39cf-4101-b39f-653579b1b314',
   'task_id': 'ee82608d-39cf-4101-b39f-653579b1b314-task_0',
   'inference_folder': '/data/ee82608d-39cf-4101-b39f-653579b1b314',
   'status': 'FINISHED',
   'pipeline_steps': [{'status': 'FINISHED',
     'end_time': '2025-12-04T09:31:18',
     'process_id': 'url-connector',
     'start_time': '2025-12-04T09:30:31',
     'step_number': 0},
    {'status': 'FINISHED',
     'end_time': '2025-12-04T09:32:17',
     'process_id': 'terratorch-inference',
     'start_time': '2025-12-04T09:31:31',
     'step_number': 1},
    {'status': 'FINISHED',
     'end_time': '2025-12-04T09:32:55',
     'process_id': 'postprocess-generic',
     'start_time': '2025-12-04T09:32:20',
     'step_number': 2},
    {'status': 'FINISHED',
     'end_time': '2025-12-04T09:34:07',
     'process_id': 'push-to-geoserver',
     'start_time': '2025-12-04T09:33:00',
     'ste

In [13]:
df = gfm_client.inference_task_status_df(response["id"])


display(df.style.map(gswidgets.color_inference_tasks_by_status))

| | task_id | terratorch-inference | url-connector | postprocess-generic | push-to-geoserver |
|---|---|---|---|---|---|
| 0 | ee82608d-39cf-4101-b39f-653579b1b314-task_0 | FINISHED | FINISHED | FINISHED | FINISHED |


In [14]:
gswidgets.view_inference_process_timeline(gfm_client, inference_id = response["id"])
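The timeline widget is interactive, but the same information is available in the `pipeline_steps` records returned by `get_inference_tasks`. As a small sketch, the hypothetical helper `step_durations` below computes how long each step took from its `start_time` and `end_time`. The sample records are the first three steps from the task output shown earlier; in practice you would pass `inf_tasks_res["tasks"][0]["pipeline_steps"]`.

```python
from datetime import datetime


def step_durations(pipeline_steps):
    """Map each step's process_id to its duration in seconds, in step order."""
    durations = {}
    for step in sorted(pipeline_steps, key=lambda s: s["step_number"]):
        start = datetime.fromisoformat(step["start_time"])
        end = datetime.fromisoformat(step["end_time"])
        durations[step["process_id"]] = (end - start).total_seconds()
    return durations


# First three steps copied from the task output earlier in this notebook.
sample_steps = [
    {"status": "FINISHED", "end_time": "2025-12-04T09:31:18",
     "process_id": "url-connector", "start_time": "2025-12-04T09:30:31",
     "step_number": 0},
    {"status": "FINISHED", "end_time": "2025-12-04T09:32:17",
     "process_id": "terratorch-inference", "start_time": "2025-12-04T09:31:31",
     "step_number": 1},
    {"status": "FINISHED", "end_time": "2025-12-04T09:32:55",
     "process_id": "postprocess-generic", "start_time": "2025-12-04T09:32:20",
     "step_number": 2},
]

durations = step_durations(sample_steps)
for process_id, seconds in durations.items():
    print(f"{process_id}: {seconds:.0f} s")
```

For these records the steps take 47, 46, and 35 seconds respectively, which shows at a glance where the pipeline spends its time.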

Next, identify the task you want to view from the response above and check that its status is `FINISHED`. Then set the `selected_task` variable below to the task number at the end of the task ID string. For example, if `task_id` is "6d1149fa-302d-4612-82dd-5879fc06081d-task_0", `selected_task` would be 0.

In [16]:
# Select a task to view

selected_task = 0 
selected_task_id = f"{inf_tasks_res['inference_id']}-task_{selected_task}"

In [17]:
selected_task_id

'ee82608d-39cf-4101-b39f-653579b1b314-task_0'
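The task ID is simply the inference ID followed by `-task_<number>`. If you want to go the other way and recover the task number from a task ID, a small helper like the one below would do it. The name `split_task_id` is hypothetical and not part of the SDK.

```python
def split_task_id(task_id):
    """Split '<inference_id>-task_<n>' into (inference_id, n)."""
    inference_id, separator, task_number = task_id.rpartition("-task_")
    if not separator or not task_number.isdigit():
        raise ValueError(f"unexpected task id format: {task_id!r}")
    return inference_id, int(task_number)


example_inference_id, example_task_number = split_task_id(
    "ee82608d-39cf-4101-b39f-653579b1b314-task_0"
)
print(example_inference_id, example_task_number)
```

Using `rpartition` splits on the last occurrence of `-task_`, so the hyphens inside the inference ID are left untouched.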

In [None]:
# Download task output files

gswidgets.fileDownloaderTasks(client=gfm_client, task_id=selected_task_id, just_tifs=False)

Just tiffs: False

