docs(agent-workflow): deduplicate judge env var block (#130) #159
Closed
Dongbumlee wants to merge 1 commit into
Re-validated docs/tutorial-agent-workflow.md against current develop
(291 lines). Single doc-only drift fixed:
- Section 3 'Initialize AgentOps' set both AZURE_OPENAI_DEPLOYMENT
and AZURE_AI_MODEL_DEPLOYMENT_NAME to the same value
('gpt-4o-mini'). _model_config() reads them as fallbacks of each
other - setting both is redundant. Reduced to one, added a
one-line note explaining the alias (same change as PR #158 for
the Copilot-skills tutorial).
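The alias behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual body of _model_config(): the helper name resolve_deployment and its env parameter are hypothetical, and only the two environment variable names and the "or" fallback come from this PR.

```python
import os
from typing import Mapping, Optional


def resolve_deployment(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Hypothetical helper: return the judge deployment name from either alias."""
    # Fall back to the real process environment when no mapping is supplied.
    env = os.environ if env is None else env
    # The second variable is consulted only when the first is unset or empty,
    # so setting both to the same value adds nothing.
    return env.get("AZURE_OPENAI_DEPLOYMENT") or env.get("AZURE_AI_MODEL_DEPLOYMENT_NAME")
```

Under this sketch, setting either variable alone to 'gpt-4o-mini' resolves to the same deployment, which is why the tutorial only needs one of them.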
Other claims verified against the code without re-deploying the
container app:
- The documented agentops.yaml shape (request_field, response_field,
tool_calls_field at top level, plus thresholds for tool_call_accuracy
/ intent_resolution / task_adherence) parses cleanly via
AgentOpsConfig.model_validate.
- All six documented thresholds match real evaluators in
src/agentops/core/evaluators.py (default thresholds at lines 82,
90, 181, 194, 212, 228).
- The dataset shape (input + expected + tool_definitions + tool_calls)
triggers the agent-evaluator set per the docstring in
src/agentops/core/evaluators.py.
- 'agentops workflow generate --kinds pr --force' still exists
(--force here is generate's, not the deprecated skills install
--force).
- 'agentops doctor --severity-fail critical' is valid.
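The dataset-shape rule checked above can be sketched as below. This is an illustration under stated assumptions, not the code in src/agentops/core/evaluators.py: the function name select_evaluators and the caller-supplied base list are hypothetical. Only the trigger keys (tool_calls, tool_definitions) and the three agent evaluator names are taken from the docstring quoted in the PR description.

```python
from typing import Any, Dict, Iterable, List

# Agent evaluator names as listed in the docstring quoted in this PR.
AGENT_EVALUATORS = ["ToolCallAccuracy", "IntentResolution", "TaskAdherence"]


def select_evaluators(rows: Iterable[Dict[str, Any]], base: List[str]) -> List[str]:
    """Hypothetical helper: append agent evaluators when any row carries tool data."""
    has_tool_data = any(
        "tool_calls" in row or "tool_definitions" in row for row in rows
    )
    # Return a new list in both branches so the caller's base list is never mutated.
    return base + AGENT_EVALUATORS if has_tool_data else list(base)
```

A row shaped like the tutorial's dataset (input + expected + tool_definitions + tool_calls) would select the agent evaluators in this sketch, while a row with only input and expected would not.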
Refs #130.
This was referenced May 14, 2026
Closes #130.
Summary
Doc-only validation pass for docs/tutorial-agent-workflow.md against current develop (291 lines). One drift fixed.
Drift fixed
Section 3 'Initialize AgentOps' set both AZURE_OPENAI_DEPLOYMENT and AZURE_AI_MODEL_DEPLOYMENT_NAME to the same value. _model_config() in src/agentops/pipeline/runtime.py reads them as fallbacks of each other (os.getenv(AZURE_OPENAI_DEPLOYMENT) or os.getenv(AZURE_AI_MODEL_DEPLOYMENT_NAME)); setting both is redundant. Same fix shipped in PR #158 for the Copilot-skills tutorial.
What I verified without re-deploying the Container App
Tutorial Sections 1-2 build and deploy a FastAPI tool-calling agent to Azure Container Apps (ACA). That is heavy to re-spin on every validation pass, so I validated the AgentOps-side claims directly against the code:
- agentops.yaml shape parses cleanly: AgentOpsConfig.model_validate(...) succeeded with all six top-level fields (request_field, response_field, tool_calls_field, thresholds, etc.)
- All six documented thresholds match real evaluators: grep -n score_key src/agentops/core/evaluators.py shows coherence, fluency, tool_call_accuracy, intent_resolution, task_adherence, avg_latency_seconds all present with matching default thresholds
- Dataset shape (input + expected + tool_definitions + tool_calls) triggers the agent-evaluator set: the docstring in src/agentops/core/evaluators.py says 'If rows include tool_calls or tool_definitions: add agent evaluators (ToolCallAccuracy, IntentResolution, TaskAdherence).'
- agentops workflow generate --kinds pr --force works: --help shows --force is still a valid workflow generate option (distinct from the deprecated skills install --force)
- agentops doctor --severity-fail critical works
Tests
Full suite: 346 passed, 1 skipped (with the pre-existing test_cli_platform_invalid_value_fails deselected; that is a Click 8.2 stderr issue on develop, unrelated).
Note for reviewers
Branched directly off current develop. No dependencies on other PRs.