Question
Hi. I am trying to test my agents without doing so manually, but I still haven't figured out how to do it well.
I need to run live tests, hitting the LLM API.
Here are some cases I need to test:
- Test, for different inputs, whether a tool is called as expected. I may need to mock a tool (see the first sketch below)
- Test the impact of instruction changes on complete workflows, for different inputs (many tool calls, testing a whole conversation)
- Given some message histories that caused unexpected model behavior exceptions, write tests for changes to the tools and prompts to check whether the issue was fixed
- Write tests to validate the agent against new workflows, observing the exchanged message history (possibly through Logfire)
- Evaluate the impact of switching models, i.e. whether the agent continues to perform as expected (see the second sketch below)
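
For the first case, here is roughly what I have in mind: a minimal sketch assuming `capture_run_messages` and the message part types work the way I think they do. The `weather_agent`, the `get_weather` tool, and the prompts are made-up examples, not real code from my project:

```python
import pytest
from pydantic_ai import Agent, capture_run_messages
from pydantic_ai.messages import ModelResponse, ToolCallPart

# hypothetical agent and tool, just for illustration
weather_agent = Agent(
    'openai:gpt-4o',
    system_prompt='Use the get_weather tool to answer weather questions.',
)

@weather_agent.tool_plain
def get_weather(city: str) -> str:
    """Stubbed tool body so the live run stays deterministic."""
    return f'Sunny in {city}'

@pytest.mark.parametrize('prompt', [
    'What is the weather in Paris?',
    'Will it rain in Tokyo today?',
])
def test_tool_is_called(prompt: str):
    # live run against the real API; only the tool implementation is stubbed
    with capture_run_messages() as messages:
        weather_agent.run_sync(prompt)

    # collect the names of all tools the model actually called
    called = [
        part.tool_name
        for message in messages
        if isinstance(message, ModelResponse)
        for part in message.parts
        if isinstance(part, ToolCallPart)
    ]
    assert 'get_weather' in called
```

Is this a reasonable approach, or is there a more idiomatic way to assert on tool calls?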
Could someone help me figure out how to perform these tests?
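
For the model-switching case, this is what I imagined, assuming `Agent.override` accepts a model name and that the result attribute is `data` (or `output`, depending on the version). The model names are just examples:

```python
import pytest
from pydantic_ai import Agent

# same hypothetical agent as above, repeated so this test file is self-contained
weather_agent = Agent(
    'openai:gpt-4o',
    system_prompt='Use the get_weather tool to answer weather questions.',
)

@weather_agent.tool_plain
def get_weather(city: str) -> str:
    return f'Sunny in {city}'

@pytest.mark.parametrize('model_name', [
    'openai:gpt-4o',
    'anthropic:claude-3-5-sonnet-latest',
])
def test_same_workflow_across_models(model_name: str):
    # override swaps the model for this test only; the agent definition is unchanged
    with weather_agent.override(model=model_name):
        result = weather_agent.run_sync('What is the weather in Berlin?')
    # the canned tool output should appear in the final answer
    # (result.data on older releases, result.output on newer ones)
    assert 'Sunny' in result.data
```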
Additional Context
No response