Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add Toolset to tooling architecture #9043

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

feat: Add Toolset to tooling architecture #9043

wants to merge 8 commits into from

Conversation

vblagoje
Copy link
Member

Why:

Enhances the tooling infrastructure by introducing a new abstraction, the Toolset class, to facilitate the grouping and management of related tool functionalities, supporting dynamic tool loading and registration.

What:

  • Added a new Toolset class to manage collections of tools and enable dynamic loading of tool configurations.
  • Modified __init__.py to include the Toolset in the exposed API.
  • Added comprehensive tests for the Toolset class to ensure its functionality.

How can it be used:

The Toolset class can be used to group related tools into a cohesive unit, allowing for easier management and potential dynamic loading from external sources:

def add_numbers(a: int, b: int) -> int:
    return a + b

def subtract_numbers(a: int, b: int) -> int:
    return a - b

# Create the tools
add_tool = Tool(
    name="add",
    description="Add two numbers",
    parameters={
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
        "required": ["a", "b"],
    },
    function=add_numbers,
)

subtract_tool = Tool(
    name="subtract",
    description="Subtract two numbers",
    parameters={
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
        "required": ["a", "b"],
    },
    function=subtract_numbers,
)

# Create a toolset
math_toolset = Toolset([add_tool, subtract_tool])

# Create a complete pipeline
pipeline = Pipeline()
pipeline.add_component("llm", OpenAIChatGenerator(model="gpt-4o-mini", tools=math_toolset))
pipeline.add_component("tool_invoker", ToolInvoker(tools=math_toolset))
pipeline.add_component(
    "adapter",
    OutputAdapter(
        template="{{ initial_msg + initial_tool_messages + tool_messages }}",
        output_type=list[ChatMessage],
        unsafe=True,
    ),
)
pipeline.add_component("response_llm", OpenAIChatGenerator(model="gpt-4o-mini"))

# Connect the components
pipeline.connect("llm.replies", "tool_invoker.messages")
pipeline.connect("llm.replies", "adapter.initial_tool_messages")
pipeline.connect("tool_invoker.tool_messages", "adapter.tool_messages")
pipeline.connect("adapter.output", "response_llm.messages")

# Test addition through the pipeline
user_input = "What is 5 plus 3?"
user_input_msg = ChatMessage.from_user(text=user_input)

result = pipeline.run({"llm": {"messages": [user_input_msg]}, "adapter": {"initial_msg": [user_input_msg]}})

# Verify the complete flow worked
pipe_result = result["response_llm"]["replies"][0].text
assert "8" in pipe_result or "eight" in pipe_result

How did you test it:

Tests were written to validate the Toolset functionality, including registration of tools, serialization/deserialization, and integration within a pipeline. Specific methods like ToolInvoker were used to test tool invocation. Tests confirm that tools can be dynamically loaded, executed, and serialized correctly.

Notes for the reviewer:

Please pay special attention to the dynamic loading capabilities and serialization logic within the Toolset. Ensure there are no name conflicts when registering tools and that the integration tests cover all intended use cases.

@github-actions github-actions bot added topic:tests type:documentation Improvements on the docs labels Mar 15, 2025
@coveralls
Copy link
Collaborator

coveralls commented Mar 15, 2025

Pull Request Test Coverage Report for Build 14043355544

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.04%) to 90.098%

Totals Coverage Status
Change from base Build 14043167436: -0.04%
Covered Lines: 9900
Relevant Lines: 10988

💛 - Coveralls

@vblagoje vblagoje marked this pull request as ready for review March 18, 2025 17:25
@vblagoje vblagoje requested review from a team as code owners March 18, 2025 17:25
@vblagoje vblagoje requested review from dfokina and anakin87 and removed request for a team March 18, 2025 17:25
@vblagoje
Copy link
Member Author

Perfect, it landed on you @anakin87 🙏 Let's see how we can move forward with the Toolset - as simple as it can get, yet enable dynamic tool discovery

@anakin87
Copy link
Member

Sorry for the wait, @vblagoje.
I like the idea and will review this PR tomorrow.

@vblagoje
Copy link
Member Author

No problem @anakin87 - as long as we get it into next release - if possible, no rush, let's nail the abstraction correctly.

Copy link
Member

@anakin87 anakin87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea, but something seems off with the types.

from haystack.tools import Toolset, Tool
from haystack.components.tools import ToolInvoker

def add_numbers(a: int, b: int) -> int:
    return a + b

add_tool = Tool(
    name="add",
    description="Add two numbers",
    parameters={
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
        "required": ["a", "b"],
    },
    function=add_numbers,
)

math_toolset = Toolset([add_tool])

tool_invoker = ToolInvoker(math_toolset)

Running mypy, I get
error: Argument 1 to "ToolInvoker" has incompatible type "Toolset"; expected "list[Tool]" [arg-type].

Could you check and iterate on the idea?

@vblagoje
Copy link
Member Author

vblagoje commented Mar 20, 2025

Correct @anakin87 somehow it didn't get picked up by mypy when I was running lint/types tests. We have two options:

  1. Leave as is - and wrap tools init param in ToolInvoker and XYZChatGenerator with list(toolset)
  2. Rewrite Toolset as subset of list
  3. Add Union on Toolset in chat generator and tool invoker init

As everything will still work even if you simply pass toolset instance to ToolInvoker and XYZChatGenerator - I'm kind leaning to approach 1) or 3), wdyt?

@anakin87
Copy link
Member

I am not fully understanding options 1 and 2.
Could you please explain/show some code examples?

@vblagoje
Copy link
Member Author

Ah, no problem:

  1. Wrap toolset with list
# Create a toolset with the math tool
math_toolset = Toolset([add_tool])

# Create a complete pipeline that can use the toolset
pipeline = Pipeline()
pipeline.add_component("llm", OpenAIChatGenerator(model="gpt-3.5-turbo", tools=list(math_toolset)))
pipeline.add_component("tool_invoker", ToolInvoker(tools=list(math_toolset)))
  1. Extend list
class Toolset(list):
...
  1. Union - that's easy, just change tools init param to tools: Optional[Union[List[Tool], Toolset]] = None,

@anakin87
Copy link
Member

Ah, sorry... It was simple.
Perhaps ask also Bilge but option 1 looks good to me.

@vblagoje
Copy link
Member Author

vblagoje commented Mar 20, 2025

@bilgeyucel wdyt, anyone else with a preference feedback? cc @julian-risch @sjrl

@vblagoje
Copy link
Member Author

Updated to include list wrapper around all Toolset instances. wdyt @julian-risch

@anakin87
Copy link
Member

I like this simple solution, but perhaps we should also think to YAML support...
@vblagoje WDYT?

@vblagoje
Copy link
Member Author

Ah good point @anakin87 - so then Union option looks more appropriate?

@vblagoje
Copy link
Member Author

vblagoje commented Mar 27, 2025

@anakin87 Quick heads-up (also shared with @julian-risch): I realized we likely need to change the init param in the chat generator and tool invoker to accept an optional Union[List[Tool], Toolset]].

The current Toolset serde approach can cause issues when serializing at time A and deserializing at time B—tools in Toolset might have been removed or changed in the meantime. That means deserialization might deserialize unavailable or changed tools and generaly behave incorrectly.

So, we should not always serialize the the Tools in the Toolset as we do now in this PR. Instead, we sometimes need to treat it as an endpoint/resolver (e.g., MCP server, OpenAPI URI, etc.) and only serde that endpoint descriptor. Also, since there could be different types of Toolset, we should include the type in the serialized form. All in all we might by default treat Toolset as simple container of some tools added - as is now in PR, allow serializing of these tools but also allow Toolset subclasses to serde themselves into some endpoint descriptor for example, and not strictly list of Tool instances. Hope I'm making sense 😃

Not ideal, but seems like the correct path—unless you see a better one. Can you double-check this reasoning and approach against our tooling architecture?
cc @mpangrazzi

@anakin87
Copy link
Member

anakin87 commented Mar 27, 2025

@vblagoje At this point, it seems that the requirements have evolved and that this simple abstraction may not the best fit.

I think we should discuss this further with a clear use-case, more details and possibly a POC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic:tests type:documentation Improvements on the docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants