Skip to content

How to test agents? integration tests or evals? #2981

@AlexEnrique

Description

@AlexEnrique

Question

Hi. I am trying to test my agents without doing so manually, but I still didn't understand how to do it well.

I need to run live tests, hitting the LLM API.

Here are some cases I need to test:

  • Test with different inputs if a tool is called as expected. Maybe I need to mock a tool
  • Test the impacts of changes in instructions with complete workflows, for different inputs (many tool calls, testing a whole conversation)
  • Given some message histories that caused unexpected model behavior exceptions, I would like to write tests for changes in the tools and prompts in order to see if the issue was fixed
  • write tests to validade the agent against new workflows, observing the message history exchanged (could be through logfire)
  • Evaluate the impact of switching models, if the agent continue to perform as expected

Could someone help me figure out how to perform these tests?

Additional Context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    questionFurther information is requested

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions