Retail LLM agent optimization and evaluation showcase built on top of tau2-bench, focused on execution-chain improvements, route comparison, and reproducible benchmark demos.
Updated Apr 15, 2026 - Python