Conversation
Pull request overview
This pull request adds support for multi-step geospatial processing workflows by implementing a recursive sub-graph architecture. The implementation allows users to specify complex queries that involve multiple sequential operations (e.g., "buffer layer X by 50m AND calculate statistics from that buffer").
Changes:
- Added task decomposition, dependency analysis, and error handling for multi-step workflows
- Introduced new Pydantic schemas for task definitions, decomposition, and error analysis
- Extended ProcessingState with fields for task queue, results tracking, and workflow metadata
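To make the new schema surface concrete, here is a rough sketch of what a task definition and decomposition might look like. The field names are assumptions based on this overview; the PR defines these as Pydantic models, but plain dataclasses are used here to keep the sketch self-contained:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TaskDefinition:
    # Hypothetical fields: one step in the decomposed workflow.
    task_id: int
    description: str
    depends_on: Optional[int] = None  # task_id of a prerequisite step, if any

@dataclass
class TaskDecomposition:
    # Hypothetical container for the full task queue of a multi-step query.
    is_multi_step: bool
    tasks: list = field(default_factory=list)

# "buffer layer X by 50m AND calculate statistics from that buffer"
plan = TaskDecomposition(
    is_multi_step=True,
    tasks=[
        TaskDefinition(1, "Buffer layer X by 50m"),
        TaskDefinition(2, "Calculate statistics from the buffer", depends_on=1),
    ],
)
```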
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| prompts/system.py | Added prompts for task decomposition, dependency analysis, and multi-step error handling; improved formatting of parameter gathering prompt |
| agents/schemas.py | Added TaskDefinition, TaskDecomposition, DependencyInjection, TaskResult, and ErrorAnalysis schemas; minor formatting improvements to existing schemas |
| agents/states.py | Extended ProcessingState with multi-task workflow fields (task_queue, task_results, current_task_index, completed_tasks, is_multi_step) |
| agents/multi_step_processing.py | New file implementing multi-step processing graph with nodes for decomposition, dependency analysis, task execution, error handling, and workflow finalization |
| agents/graph.py | Added multi_step parameter to build_unified_graph to toggle between single-step and multi-step processing modes |
| agents/__init__.py | Exported new schemas and build_multi_step_processing_graph function |
```python
state["_current_task_id"] = task_id
state["_previous_outputs"] = previous_outputs

_logger.info(
    f"NODE: analyze_dependencies_node END -> {len(previous_outputs)} outputs available"
)
return state
```
Direct state mutation: The code directly mutates the input state dictionary by assigning to state["_current_task_id"] and state["_previous_outputs"]. Node functions in LangGraph should not mutate the input state but instead return a new state dictionary. This should be changed to return {**state, "_current_task_id": task_id, "_previous_outputs": previous_outputs} at the end of the function (line 235).
Suggested change:
```diff
-state["_current_task_id"] = task_id
-state["_previous_outputs"] = previous_outputs
 _logger.info(
     f"NODE: analyze_dependencies_node END -> {len(previous_outputs)} outputs available"
 )
-return state
+return {
+    **state,
+    "_current_task_id": task_id,
+    "_previous_outputs": previous_outputs,
+}
```
```python
for layer in output_layers:
    previous_outputs[f"task_{prev_task_id}_output"] = layer
    _logger.debug(f"  Available from task {prev_task_id}: {layer}")
```
Logic error in output layer tracking: When a task produces multiple output layers, this loop overwrites the same dictionary key f"task_{prev_task_id}_output" for each layer, resulting in only the last layer being preserved. If multiple outputs need to be tracked, consider using indexed keys like f"task_{prev_task_id}_output_{idx}" or storing all layers in a list. Alternatively, if only one output per task is expected, use a break statement or take only the first layer.
Suggested change:
```diff
-for layer in output_layers:
-    previous_outputs[f"task_{prev_task_id}_output"] = layer
-    _logger.debug(f"  Available from task {prev_task_id}: {layer}")
+for idx, layer in enumerate(output_layers):
+    # Preserve original key for the first output to maintain compatibility,
+    # and use indexed keys for any additional outputs from the same task.
+    if idx == 0:
+        key = f"task_{prev_task_id}_output"
+    else:
+        key = f"task_{prev_task_id}_output_{idx}"
+    previous_outputs[key] = layer
+    _logger.debug(f"  Available from task {prev_task_id}: {layer} (as {key})")
```
```python
graph.set_entry_point("route")
graph.add_edge("route", "decompose")
graph.add_edge("decompose", "analyze_deps")
graph.add_edge("analyze_deps", "discover")
```
Missing conditional routing after route node: The graph unconditionally proceeds from route to decompose and through the processing pipeline, even for non-processing tasks. While the nodes check is_processing_task and skip processing, they still execute unnecessary logic. Consider adding a conditional edge after the route node to bypass the processing pipeline entirely for non-processing tasks, going directly to the llm node for general queries. This would match the pattern used in the single-step processing graph.
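A minimal sketch of the conditional edge this comment suggests. The node names ("decompose", "llm") and the is_processing_task flag follow the review text, but the wiring shown is illustrative rather than the project's actual graph:

```python
def route_after_entry(state: dict) -> str:
    # Pick the next node after "route": enter the multi-step pipeline only
    # for processing tasks, otherwise go straight to the general LLM node.
    if state.get("is_processing_task"):
        return "decompose"
    return "llm"

# With LangGraph this would replace the unconditional edge, e.g.:
# graph.add_conditional_edges(
#     "route", route_after_entry, {"decompose": "decompose", "llm": "llm"}
# )
```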
```python
from langgraph.checkpoint.memory import MemorySaver
from langchain_core.messages import BaseMessage, AIMessage, ToolMessage

from agents.multi_step_processing import build_multi_step_processing_graph
```
Inconsistent import style: this line uses an absolute import while the surrounding imports use relative imports (lines 14, 16, 17). For consistency with the codebase conventions, this should be changed to a relative import: from .multi_step_processing import build_multi_step_processing_graph
```python
    f"LLM extracted: {parameters}"
)
_logger.error(error_msg)
return {"error_message": error_msg}
```
Inconsistent state handling: this return statement doesn't preserve the existing state (missing **state), unlike the error handling at lines 512-515. When validation fails for missing required parameters, the function should return {**state, "error_message": error_msg} to preserve all existing state fields.
Suggested change:
```diff
-return {"error_message": error_msg}
+return {**state, "error_message": error_msg}
```
```python
task_results = state.get("task_results", {})
task_results[current_task_id] = {
    "task_id": current_task_id,
    "success": True,
    "output_layers": output_layers,
    "execution_result": result,
    "error": None,
}
```
Potential state mutation issue: The code gets task_results from state and directly mutates it by assigning to task_results[current_task_id]. In LangGraph's state management, this could lead to unintended side effects if the state dictionary is reused. Instead, create a new dictionary: task_results = {**state.get("task_results", {}), current_task_id: {...}} or create a copy before mutation: task_results = dict(state.get("task_results", {})).
```python
task_results = state.get("task_results", {})
task_results[current_task_id] = {
    "task_id": current_task_id,
    "success": False,
    "output_layers": [],
    "execution_result": None,
    "error": str(e),
}
```
Same state mutation issue as above: task_results is retrieved from state and mutated in place. Create a copy first, e.g. task_results = dict(state.get("task_results", {})), and return the new dictionary.
```python
    3. Finalize: Summarize all tasks and results
    """
    import json
    from typing import List, Dict, Any, Optional
```
Imports of 'List', 'Dict', and 'Optional' are not used.
Suggested change:
```diff
-from typing import List, Dict, Any, Optional
+from typing import Any
```
```python
from langchain_core.messages import (
    BaseMessage,
    AIMessage,
    HumanMessage,
    SystemMessage,
    ToolMessage,
)
```
Import of 'BaseMessage' is not used.
```python
    ParameterGathering,
    TaskDecomposition,
    DependencyInjection,
    TaskResult,
```
Import of 'TaskResult' is not used.
Suggested change:
```diff
-    TaskResult,
```
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 8 comments.
```python
from langgraph.checkpoint.memory import MemorySaver
from langchain_core.messages import BaseMessage, AIMessage, ToolMessage

from agents.multi_step_processing import build_multi_step_processing_graph
```
Import style is inconsistent with the rest of the codebase. This line uses an absolute import from agents.multi_step_processing while other imports in this file and throughout the codebase use relative imports (e.g., from .states, from ..tools). This should be changed to from .multi_step_processing import build_multi_step_processing_graph to maintain consistency with the established pattern.
Suggested change:
```diff
-from agents.multi_step_processing import build_multi_step_processing_graph
+from .multi_step_processing import build_multi_step_processing_graph
```
```python
if current_idx < len(tasks):
    current_task = tasks[current_idx]
    current_task_id = current_task.get("task_id", current_idx + 1)
```
Variable current_task_id is not used.
Suggested change:
```diff
-current_task_id = current_task.get("task_id", current_idx + 1)
```
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Refactor multi-step processing logic to enhance clarity and functionality. Update routing and error handling for better task management.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.
```python
task_results = state.get("task_results", {})
task_results[current_task_id] = {
    "task_id": current_task_id,
    "success": True,
    "output_layers": output_layers,
    "execution_result": result,
    "error": None,
}
```
Similar state mutation issue: task_results is retrieved from state and then modified in-place. This should create a new dictionary instead.
Use the same pattern:
```python
task_results = dict(state.get("task_results", {}))
task_results[current_task_id] = {...}
```
@copilot open a new pull request to apply changes based on this feedback
```python
global llm
llm = llm_instance
```
This module uses a global llm variable pattern: node functions defined at module level access a global llm that is only initialized when build_multi_step_processing_graph is called (lines 1031-1032). This differs from the pattern in agents/processing.py, which uses nested functions and closures to access llm safely.
While this pattern works, it has potential issues:
- Functions will fail if called before the graph is built
- The dependency on global state is not obvious from function signatures
- It could lead to issues in multi-threaded or test scenarios
Consider refactoring to match the closure-based pattern from processing.py, or at minimum add a module-level docstring explaining this dependency.
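A sketch of the closure-based alternative the comment describes, modeled loosely on the pattern attributed to processing.py. The function name and node body here are illustrative, not the project's actual code:

```python
def build_graph_with_closures(llm_instance):
    # The LLM is captured by the node closures below, so no module-level
    # global is needed and nodes cannot run before the LLM exists.
    llm = llm_instance

    def decompose_node(state: dict) -> dict:
        # Node sees `llm` through its enclosing scope.
        response = llm.invoke(state["query"])
        return {**state, "decomposition": response}

    # In the real module these nodes would be registered on a StateGraph;
    # returning them directly keeps the sketch self-contained.
    return {"decompose": decompose_node}
```

This also makes the dependency explicit in the builder's signature, which helps in tests where a stub LLM can be passed in.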
@copilot open a new pull request to apply changes based on this feedback
```python
for key, value in output_obj.items():
    if isinstance(value, str) and value not in ["TEMPORARY_OUTPUT"]:
```
The output layer extraction logic iterates through all values in the result dictionary and treats any string value (except "TEMPORARY_OUTPUT") as an output layer. This could incorrectly extract non-layer strings from the result.
For example, if the result contains {"OUTPUT": "layer1", "status": "completed", "method": "buffer"}, all three strings would be extracted as output layers.
Consider filtering more carefully by checking specific known output keys, or verifying that the string value looks like a layer name or file path:
```python
# Known output parameter names
OUTPUT_KEYS = ["OUTPUT", "output", "OUTPUT_LAYER", "output_layer", "RESULT", "result"]
for key, value in output_obj.items():
    if key in OUTPUT_KEYS and isinstance(value, str) and value not in ["TEMPORARY_OUTPUT"]:
        output_layers.append(value)
```

Suggested change:
```diff
-for key, value in output_obj.items():
-    if isinstance(value, str) and value not in ["TEMPORARY_OUTPUT"]:
+# Only consider known output-related keys to avoid treating
+# unrelated string fields (e.g. status, method) as layers.
+output_keys = ["OUTPUT", "output", "OUTPUT_LAYER", "output_layer", "RESULT", "result"]
+for key, value in output_obj.items():
+    if (
+        key in output_keys
+        and isinstance(value, str)
+        and value not in ["TEMPORARY_OUTPUT"]
+    ):
```
```python
task_results = state.get("task_results", {})
task_results[current_task_id] = {
    "task_id": current_task_id,
    "success": False,
    "output_layers": [],
    "execution_result": None,
    "error": str(e),
}
```
The task_results dictionary is being mutated in-place by retrieving it from state and then modifying it. This can cause issues with state management in LangGraph. The pattern should be to create a new dictionary for task_results.
Change to:
```python
task_results = dict(state.get("task_results", {}))
task_results[current_task_id] = {...}
return {
    **state,
    "task_results": task_results,
    ...
}
```

This same issue exists in execute_node_multi (lines 601, 627), execute_llm_task_node (lines 957, 980), and other places where task_results is modified.
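Since the same copy-then-return pattern is requested in several places, it could be factored into a small helper. The function name and result fields below mirror the snippets in these comments but are illustrative:

```python
def record_task_failure(state: dict, current_task_id, error) -> dict:
    # Copy first so the state dict held by the checkpointer is never
    # mutated in place; only the returned dict carries the new entry.
    task_results = dict(state.get("task_results", {}))
    task_results[current_task_id] = {
        "task_id": current_task_id,
        "success": False,
        "output_layers": [],
        "execution_result": None,
        "error": str(error),
    }
    return {**state, "task_results": task_results}
```

Each call site would then become a single return statement, which also makes the no-mutation rule harder to violate accidentally.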
```python
task_results = state.get("task_results", {})
task_results[current_task_id] = {
    "task_id": current_task_id,
    "success": True,
    "output_layers": [],
    "execution_result": response_content,
    "error": None,
}
```
Similar state mutation issue: task_results is being modified in-place. Create a new dictionary copy before modifying.
```python
task_results = dict(state.get("task_results", {}))
task_results[current_task_id] = {...}
```

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@iamtekson I've opened a new pull request, #48, to work on those changes. Once the pull request is ready, I'll request review from you.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@iamtekson I've opened a new pull request, #49, to work on those changes. Once the pull request is ready, I'll request review from you.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: iamtekson <39838116+iamtekson@users.noreply.github.com>
Fix state mutation in task_results handling
