
docs(http-agent): deduplicate judge env var block (#129)#161

Merged
Dongbumlee merged 1 commit into develop from fix/issue-129-http-agent-rev2 on May 14, 2026

Conversation

@Dongbumlee
Collaborator

Closes #129.

Summary

Doc-only validation pass for docs/tutorial-http-agent.md against current develop (374 lines). One drift fixed.

Drift fixed

| Issue | Evidence |
| --- | --- |
| Prerequisites set both `AZURE_OPENAI_DEPLOYMENT` and `AZURE_AI_MODEL_DEPLOYMENT_NAME` to the same value. | `_model_config()` reads them as fallbacks of each other, so setting both is redundant. Same fix shipped in PRs #158 / #159 for the Copilot-skills and agent-workflow tutorials. |

Runtime verification (hybrid testing, Option C)

The ACA deploy in Section 2 is heavy infra, so I used a two-part approach:

Part 1 — Azure CLI flag check (covers Section 2)

az containerapp up on current containerapp extension v1.3.0b4 still accepts every documented flag:

| Flag | Present in `--help` |
| --- | --- |
| `--name` / `-n` | ✅ |
| `--resource-group` / `-g` | ✅ |
| `--location` / `-l` | ✅ |
| `--source` | ✅ |
| `--target-port` | ✅ |
| `--ingress` (external/internal) | ✅ |
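The flag check reduces to grepping a captured `az containerapp up --help` dump for each documented flag. A minimal sketch (HELP_TEXT below is an abridged, hand-written stand-in, not real az output):

```python
# Abridged stand-in for `az containerapp up --help` output (hypothetical text).
HELP_TEXT = """
    --name -n           : The name of the Containerapp.
    --resource-group -g : Name of resource group.
    --location -l       : Location.
    --source            : Local directory path to upload.
    --target-port       : The application port used for ingress traffic.
    --ingress           : Ingress type. Allowed values: external, internal.
"""

DOCUMENTED_FLAGS = ["--name", "--resource-group", "--location",
                    "--source", "--target-port", "--ingress"]

# Any documented flag absent from the help text would show up here.
missing = [flag for flag in DOCUMENTED_FLAGS if flag not in HELP_TEXT]
print("missing flags:", missing)  # missing flags: []
```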

Part 2 — Local FastAPI stand-in (covers Sections 1, 3–6 end-to-end)

Wrote the tutorial's exact app.py and ran uvicorn app:app --host 127.0.0.1 --port 8000. All three documented /chat paths returned the documented JSON shape:

```jsonc
// lookup
{"text":"Order ORD-12345 is in transit and expected to arrive tomorrow.","tool_calls":[{"type":"tool_call","tool_call_id":"lookup_1","name":"lookup_order","arguments":{"order_id":"ORD-12345"}}]}
// refund
{"text":"I started a refund for ORD-77821 because it arrived broken.","tool_calls":[{"type":"tool_call","tool_call_id":"refund_1","name":"refund_order","arguments":{"order_id":"ORD-77821","reason":"arrived broken"}}]}
// greeting
{"text":"Hello! I can help with order status, refunds, or connecting you to support.","tool_calls":[]}
```
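The tutorial's full app.py isn't reproduced in this PR, but the routing behind those three responses can be sketched as plain Python (a hypothetical stand-in for the FastAPI handler, matching only the documented response shape):

```python
import re

def chat(message: str) -> dict:
    """Minimal sketch of the /chat handler logic: classify the message
    and emit the documented {"text": ..., "tool_calls": [...]} shape."""
    order = re.search(r"ORD-\d+", message)
    if order and "refund" in message.lower():
        oid = order.group()
        return {
            "text": f"I started a refund for {oid}.",
            "tool_calls": [{
                "type": "tool_call",
                "tool_call_id": "refund_1",
                "name": "refund_order",
                "arguments": {"order_id": oid, "reason": "arrived broken"},
            }],
        }
    if order:
        oid = order.group()
        return {
            "text": f"Order {oid} is in transit.",
            "tool_calls": [{
                "type": "tool_call",
                "tool_call_id": "lookup_1",
                "name": "lookup_order",
                "arguments": {"order_id": oid},
            }],
        }
    # Greetings and anything else: no tool calls.
    return {"text": "Hello! I can help with order status, refunds, "
                    "or connecting you to support.",
            "tool_calls": []}

print(chat("Where is ORD-12345?")["tool_calls"][0]["name"])  # lookup_order
```

In the real tutorial this logic sits behind a FastAPI POST route served by uvicorn; the sketch keeps only the response-shape contract that AgentOps consumes.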

agentops eval run against http://127.0.0.1:8000/chat produced all six documented metrics:

| Metric | Score | Pass |
| --- | --- | --- |
| coherence | 4.000 | ✅ |
| fluency | 3.000 | ✅ |
| tool_call_accuracy | 5.000 | ✅ |
| intent_resolution | 4.000 | ✅ |
| task_adherence | 0.333 | ❌ real signal: the synthetic agent doesn't really adhere |
| avg_latency_seconds | 0.003 | ✅ |

Tool calls were correctly extracted from the HTTP response via tool_calls_field: tool_calls. The greeting row (no tools) correctly showed tool_call_accuracy=n/a. results.json + report.md written under the documented timestamped folder + latest/ mirror.
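The extraction contract can be sketched as follows (`extract_tool_calls` is a hypothetical helper illustrating the `tool_calls_field` behaviour described above, not AgentOps' real code):

```python
def extract_tool_calls(response: dict, tool_calls_field: str = "tool_calls"):
    """Read the configured field from the HTTP response body.
    An empty or missing list means the row has no tool calls, so
    tool_call_accuracy is reported as n/a for that row."""
    calls = response.get(tool_calls_field) or []
    return calls if calls else None  # None -> metric shown as n/a

lookup = {"text": "Order ORD-12345 is in transit.",
          "tool_calls": [{"name": "lookup_order"}]}
greeting = {"text": "Hello!", "tool_calls": []}

print(extract_tool_calls(lookup)[0]["name"])  # lookup_order
print(extract_tool_calls(greeting))           # None (n/a row)
```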

Out of scope

ACA provisioning itself wasn't exercised. AgentOps doesn't change behaviour based on whether the URL points at ACA or localhost; it only requires that the URL is reachable and returns the documented JSON shape. Anyone who wants the full deploy verified should run Section 2 verbatim; the doc makes no claims about ACA-specific behaviour that the local run cannot exercise.

Tests

Full suite: 346 passed, 1 skipped (with the pre-existing test_cli_platform_invalid_value_fails deselected — Click 8.2 stderr issue on develop, unrelated).

Note for reviewers

Branched directly off current develop. No code changes, no dependencies on other PRs.

Re-validated docs/tutorial-http-agent.md against current develop
(374 lines). One doc-only drift fixed.

The Prerequisites block set both AZURE_OPENAI_DEPLOYMENT and
AZURE_AI_MODEL_DEPLOYMENT_NAME to the same value ('gpt-4o-mini').
_model_config() reads them as fallbacks of each other - setting both
is redundant. Reduced to one with a one-line alias note. The same
fix shipped in PRs #158 (#133) and #159 (#130).

End-to-end verification (hybrid approach):

  * Section 2 az CLI: 'az containerapp up' on current
    'containerapp' extension v1.3.0b4 still accepts every documented
    flag (--name, --resource-group, --location, --source,
    --target-port, --ingress external).
  * Sections 1, 3-6: ran the tutorial's exact FastAPI app.py
    locally at http://127.0.0.1:8000/chat (skipping the ACA deploy
    in Section 2) and pointed AgentOps at it.

    - 'agentops eval run' produced all six documented metrics
      (coherence, fluency, tool_call_accuracy, intent_resolution,
      task_adherence, avg_latency_seconds).
    - tool_calls extracted correctly via 'tool_calls_field: tool_calls'
      (row 0/1: tool_call_accuracy=5.00). Greeting row (no tools)
      shows tool_call_accuracy=n/a as expected.
    - 'results.json' and 'report.md' written under
      .agentops/results/<timestamp>/ and mirrored to /latest/.
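The documented output layout (timestamped folder plus a latest/ mirror) can be sketched like this (`write_results` is a hypothetical helper, not AgentOps' actual writer):

```python
import json
import shutil
import tempfile
import time
from pathlib import Path

def write_results(root: Path, results: dict, report_md: str) -> Path:
    """Write results.json + report.md under <root>/<timestamp>/,
    then mirror that folder to <root>/latest/."""
    run_dir = root / time.strftime("%Y%m%d-%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "results.json").write_text(json.dumps(results, indent=2))
    (run_dir / "report.md").write_text(report_md)
    latest = root / "latest"
    if latest.exists():
        shutil.rmtree(latest)           # replace the previous mirror
    shutil.copytree(run_dir, latest)
    return run_dir

root = Path(tempfile.mkdtemp()) / ".agentops" / "results"
run_dir = write_results(root, {"coherence": 4.0}, "# Report\n")
print(sorted(p.name for p in run_dir.iterdir()))  # ['report.md', 'results.json']
print((root / "latest" / "results.json").exists())  # True
```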

The ACA deploy itself wasn't exercised here, only the documented az
CLI flag set; full ACA provisioning was deemed out of scope for
doc-validation since AgentOps doesn't change behaviour based on
ACA vs local HTTP URL.

Refs #129.
@Dongbumlee Dongbumlee merged commit 3af28f7 into develop May 14, 2026
2 of 9 checks passed
