Structured Outputs Functional #18

Merged
5aumit merged 6 commits into master from dev on May 23, 2026
5aumit commented May 23, 2026

This pull request introduces several improvements and additions across documentation, user interface, agent middleware, and development workflow. The main highlights are the addition of a comprehensive manual test checklist for MLflow agent workflows, enhancements to the CLI welcome banner and user message formatting, the introduction of agent middleware for structured tool responses, and a minor update to the project banner.

Documentation & Testing

  • Added a detailed manual test checklist (2026-05-21-mlflow-agent-manual-test-checklist-design.md) covering real-world MLflow agent scenarios, including acceptance criteria and prompt templates for manual verification.

User Interface Enhancements

  • Improved the CLI welcome banner in print_welcome with a purple-to-blue gradient and glow effect for a more visually appealing introduction.
  • Added print_user to clearly display user messages with a "You" tag for better distinction between user and agent output.
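A purple-to-blue gradient like the one described for the banner can be produced by interpolating one color per banner line. The following is a minimal sketch, not the code from this PR: the helper names `hex_to_rgb` and `gradient` and the two endpoint colors are assumptions chosen for illustration.

```python
def hex_to_rgb(color: str) -> tuple[int, int, int]:
    """Convert '#rrggbb' into an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def gradient(start: str, end: str, steps: int) -> list[str]:
    """Return `steps` hex colors linearly interpolated from start to end."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if steps == 1:
        return [start.lower()]
    r0, g0, b0 = hex_to_rgb(start)
    r1, g1, b1 = hex_to_rgb(end)
    colors = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 on the first line, 1.0 on the last
        rgb = (
            round(r0 + (r1 - r0) * t),
            round(g0 + (g1 - g0) * t),
            round(b0 + (b1 - b0) * t),
        )
        colors.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return colors


# Hypothetical endpoints: a purple and a blue; one color per banner line.
colors = gradient("#a855f7", "#3b82f6", 6)
```

Each resulting string can then be used in a per-line style such as `f"bold {color}"`, which is the form the banner code in this PR passes to `Text.append`.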

Agent Middleware & Structured Output

  • Introduced agent middleware (agent_middleware.py) that enforces a strict JSON schema for tool responses and provides custom error handling, ensuring agent outputs are consistently structured and user-friendly.
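To illustrate what enforcing a strict schema on a response can look like, here is a small standard-library sketch. The actual BlockResponse schema is not shown on this page, so the shape used here (a top-level `blocks` list whose items have a `type` and a string `content`) and the allowed type names are purely hypothetical.

```python
import json

# Hypothetical block types; the real BlockResponse schema may differ.
ALLOWED_BLOCK_TYPES = ("text", "table", "code")


def validate_block_response(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the payload passed."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["top-level value must be an object"]

    # Strict mode: reject keys the schema does not define.
    errors = [f"unexpected key: {key}" for key in sorted(set(payload) - {"blocks"})]

    blocks = payload.get("blocks")
    if not isinstance(blocks, list) or not blocks:
        errors.append("'blocks' must be a non-empty list")
        return errors

    for index, block in enumerate(blocks):
        if not isinstance(block, dict):
            errors.append(f"blocks[{index}] must be an object")
            continue
        if block.get("type") not in ALLOWED_BLOCK_TYPES:
            errors.append(f"blocks[{index}].type is not allowed")
        if not isinstance(block.get("content"), str):
            errors.append(f"blocks[{index}].content must be a string")
    return errors
```

Returning a list of problems rather than raising on the first one lets the caller feed every violation back to the model in a single retry message.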

Development & Workflow

  • Updated the CLI entrypoint in run_agent.sh to use src/app.py instead of the previous script, aligning with the new application structure.

Project Branding

  • Updated the project banner image in README.md to a new version for improved branding.

5aumit added 6 commits May 5, 2026 01:32
- Langfuse sessions now correctly show root input and output
- Simplified the Langfuse session code
- Session (short term) memory works now, and agent can answer questions about previous queries too.
…de_metrics option; introduce raw_count_runs_per_experiment and corresponding tool wrapper
…app entry point and improve error handling in tool wrappers
…g list_runs_tool to include metrics, and adding error handling middleware; introduce manual test checklist for MLflow agent workflows.

Copilot AI left a comment


Pull request overview

This PR updates the MLflow agent CLI to support structured, schema-driven assistant outputs, adds Langfuse tracing setup utilities, and improves the CLI UI/UX and documentation around manual testing.

Changes:

  • Added a BlockResponse JSON schema + middleware/hooks to support structured agent outputs and friendlier tool error handling.
  • Refactored MLflow run listing to reduce token usage (single-experiment runs, optional metric/param previews) and added a “count runs per experiment” tool.
  • Introduced Langfuse tracing setup utilities, a new CLI entrypoint (src/app.py), and updated CLI rendering + docs/assets.

Reviewed changes

Copilot reviewed 11 out of 14 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| src/mlflow_tools/schemas.py | Updates tool arg schemas (single experiment run listing, metrics inclusion flag, count-runs params). |
| src/mlflow_tools/data_access.py | Implements updated MLflow tool behaviors, adds count-runs helper/tool, adjusts tool wrapper return serialization. |
| src/llm/tracing.py | Adds centralized Langfuse setup helper used by the CLI agent. |
| src/app.py | New CLI entrypoint wrapper that runs the agent and attempts to finish/flush Langfuse on interrupt. |
| src/agent/langchain_agent.py | Wires structured response strategy, middleware, tracing metadata, and improves interactive input/output loop. |
| src/agent/console_ui.py | Adds user message printing and BlockResponse rendering; updates welcome banner styling. |
| src/agent/agent_middleware.py | Adds BlockResponse JSON schema and tool-call middleware for error handling/schema forcing. |
| run_agent.sh | Points the CLI launcher to src/app.py. |
| README.md | Updates project banner image reference. |
| docs/superpowers/specs/2026-05-21-mlflow-agent-manual-test-checklist-design.md | Adds a comprehensive manual test checklist for MLflow agent workflows. |
| diary.md | Updates project notes/diary entries. |
| .gitignore | Ignores additional agent/workflow artifacts. |
| src/agent/__init__.py | Present as part of the agent package (no content changes). |


Comment thread src/agent/console_ui.py
Comment on lines +60 to +71

# Add glow layer (dim version as shadow)
banner_glow = "\n".join(banner_lines)
t.append(banner_glow + "\n\n\n", style="dim #5533ff")

# Overlay gradient banner on top
t_banner = Text()
for line, color in zip(banner_lines, colors):
    t_banner.append(line + "\n", style=f"bold {color}")

t = Text()
for line, color in zip(banner_lines, colors):
Comment thread src/app.py
Comment on lines +3 to +4
sys.path.append('./src/agent') # Add agent directory to path for imports
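A relative entry such as `'./src/agent'` is resolved against the current working directory, so the import only works when the script is launched from the repository root. One possible alternative, sketched below, derives the directory from the entrypoint file itself. The helper name `add_agent_dir_to_path` is hypothetical, and the sketch assumes `agent/` sits next to the entrypoint, as the file list in this PR suggests.

```python
import sys
from pathlib import Path


def add_agent_dir_to_path(entrypoint: str) -> str:
    """Put <directory of entrypoint>/agent on sys.path and return that path."""
    agent_dir = str(Path(entrypoint).resolve().parent / "agent")
    if agent_dir not in sys.path:  # avoid duplicate entries on repeated calls
        sys.path.insert(0, agent_dir)
    return agent_dir
```

Inside `src/app.py` this would be called as `add_agent_dir_to_path(__file__)`, which yields the same directory regardless of where the launcher script is invoked from.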

Comment thread src/app.py
Comment on lines +8 to +21
try:
    main()
except KeyboardInterrupt:
    print("\nInterrupted. Finishing Langfuse run if present...")
    try:
        if lf_run is not None:
            if hasattr(lf_run, 'finish'):
                lf_run.finish()
            elif fuse_client is not None and hasattr(fuse_client, 'finish_run'):
                fuse_client.finish_run(conversation_id)
        if fuse_client is not None:
            fuse_client.flush()
    except Exception:
        pass
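The interrupt handling above can also be expressed with `try`/`finally`, so that cleanup runs on every exit path and a failing cleanup is reported instead of silently swallowed. This is a generic sketch under stated assumptions, not the PR's code: `run_with_cleanup` is a hypothetical helper, and `cleanup` stands in for whatever flush call the tracing client provides.

```python
def run_with_cleanup(main, cleanup) -> int:
    """Run main(), always call cleanup(), and return a shell-style exit code."""
    exit_code = 0
    try:
        main()
    except KeyboardInterrupt:
        print("\nInterrupted.")
        exit_code = 130  # conventional exit status after SIGINT (128 + 2)
    finally:
        try:
            cleanup()
        except Exception as exc:
            # Surface the failure rather than hiding it behind a bare `pass`.
            print(f"Cleanup failed: {exc!r}")
    return exit_code
```

An entrypoint could then end with `sys.exit(run_with_cleanup(main, flush_fn))`, where `flush_fn` is whichever callable flushes pending traces.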
Comment thread src/llm/tracing.py
def setup_langfuse(config):
    """
    Set up Langfuse tracing based on config and environment variables.
    Returns: langfuse_handler, fuse_client, conversation_id, lf_run, FLUSH_PER_QUERY
Comment on lines 87 to 100
def raw_list_runs(
    experiment_ids: List[str],
    experiment_id: str,
    status: Optional[List[str]] = None,
    start_time: Optional[int] = None,
    end_time: Optional[int] = None,
    order_by: Optional[str] = None,
    max_results: int = 100,
    include_metrics: bool = False,
) -> List[Dict[str, Any]]:
    """Return summarized runs for given experiments.

    Note: MLflowClient.search_runs accepts experiment_ids and order_by; more complex
    filtering can be added later.
    """
Comment on lines +127 to +138
# New: Count runs per experiment ID
def raw_count_runs_per_experiment(experiment_ids: List[str]) -> Dict[str, int]:
    """Return a dict of experiment_id -> number of runs."""
    counts = {}
    for exp_id in experiment_ids:
        try:
            runs = client.search_runs([exp_id], max_results=50000)
            counts[exp_id] = len(runs)
        except MlflowException as e:
            logging.error(f"Error counting runs for experiment {exp_id}: {e}")
            counts[exp_id] = -1
    return counts
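Counting with a single large `max_results` request holds every run of an experiment in memory at once and stops counting at that limit. A paged variant is sketched below. It assumes the client's `search_runs` accepts a `page_token` argument and returns a list-like page exposing a `token` attribute for the next page, which is how MLflow's `MlflowClient.search_runs` is documented to behave. The function name `count_runs_paged` and the page size are illustrative.

```python
def count_runs_paged(client, experiment_id: str, page_size: int = 1000) -> int:
    """Count the runs in one experiment by walking result pages."""
    total = 0
    token = None
    while True:
        page = client.search_runs(
            [experiment_id],
            max_results=page_size,
            page_token=token,
        )
        total += len(page)
        # A missing or empty token means there are no further pages.
        token = getattr(page, "token", None)
        if not token:
            return total
```

Only one page of runs is alive at a time, and the total is not capped by a fixed `max_results` value.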
Comment on lines +57 to +59
return ToolMessage(
    content=f"Tool error: Please check your input and try again. ({str(e)})",
    tool_call_id=request.tool_call["id"]
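Embedding `str(e)` in the message sent back to the model is helpful for bad arguments but can leak internal details when the failure is unexpected. The sketch below separates the two cases using only the standard library. It is framework-agnostic and returns plain dicts rather than `ToolMessage` objects; the function name `call_tool_safely` and the result keys are assumptions for illustration.

```python
import logging

logger = logging.getLogger("agent.tools")


def call_tool_safely(tool, arguments: dict, tool_call_id: str) -> dict:
    """Call tool(**arguments) and convert failures into structured results."""
    name = getattr(tool, "__name__", "tool")
    try:
        result = tool(**arguments)
        return {"tool_call_id": tool_call_id, "status": "ok", "content": str(result)}
    except (TypeError, ValueError) as exc:
        # Likely bad arguments: the model can correct these, so include details.
        logger.warning("Tool %s rejected its input: %s", name, exc)
        content = f"Tool error: please check your input and try again. ({exc})"
    except Exception:
        # Unexpected failure: log the traceback, but keep internals out of the reply.
        logger.exception("Tool %s failed unexpectedly", name)
        content = "Tool error: the tool failed unexpectedly. Try a different approach."
    return {"tool_call_id": tool_call_id, "status": "error", "content": content}
```

The same split could be applied inside the middleware by choosing the message text per exception type before constructing the tool message.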
Comment on lines +190 to +192
# User pressed Ctrl-C; continue the loop to allow graceful exit
print("\nInterrupted. Goodbye!")
continue
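In the excerpt above, `continue` returns to the top of the loop after "Goodbye!" has been printed, so the prompt appears again. If the intent is to leave the loop on Ctrl-C, which is an assumption about the desired behavior, `break` does that. The sketch below uses a hypothetical `chat_loop` helper with injected `read_line` and `handle` callables so it can be exercised without a terminal.

```python
def chat_loop(read_line, handle) -> int:
    """Read queries until Ctrl-C, EOF, or 'exit'; return how many were handled."""
    handled = 0
    while True:
        try:
            line = read_line("You: ")
        except (KeyboardInterrupt, EOFError):
            print("\nInterrupted. Goodbye!")
            break  # leave the loop; `continue` would prompt again
        if line.strip().lower() in {"exit", "quit"}:
            break
        handle(line)
        handled += 1
    return handled
```

In an interactive session this would be invoked as `chat_loop(input, run_query)`, where `run_query` is whatever function sends the text to the agent.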
5aumit merged commit cc6fa41 into master on May 23, 2026
1 check passed
5aumit mentioned this pull request on May 23, 2026