14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,20 @@ This format follows [Keep a Changelog](https://keepachangelog.com/) and adheres

## [Unreleased]

### Changed
- **Skill and tutorial guidance now requires `Cognitive Services OpenAI User` as a prerequisite RBAC role.**
The `agentops-workflow` skill, `tutorial-prompt-agent-quickstart.md`,
`tutorial-end-to-end.md`, and `docs/ci-github-actions.md` now instruct users
to grant the OIDC/CI service principal **both** Foundry User on the Foundry
project **and** Cognitive Services OpenAI User on the underlying Azure AI
Services account that hosts the evaluator model deployment. Foundry
`azure_ai_evaluator` graders impersonate the OIDC principal to call OpenAI;
without the OpenAI User role they fail with a 401 `PermissionDenied` and
every cloud eval metric returns `null`, blocking the first PR run. The skill
now emits the matching `az role assignment create` commands for both roles
(role ids `53ca6127-db72-4b80-b1b0-d745d6d5456d` for Foundry User and
`5e0bd9bd-7b93-4f28-af87-19fc36ad61bd` for Cognitive Services OpenAI User)
before dispatching the workflow.

### Fixed
- **Cloud eval surfaces grader execution errors instead of silent nulls.**
When a Foundry `azure_ai_evaluator` grader fails to execute (most
20 changes: 17 additions & 3 deletions docs/ci-github-actions.md
@@ -125,9 +125,23 @@ from GitHub Actions runs. See
[Microsoft's WIF docs](https://learn.microsoft.com/azure/active-directory/workload-identities/workload-identity-federation-create-trust?pivots=identity-wif-apps-methods-azp).

For Foundry prompt-agent gates, the same app registration / service principal
also needs **Foundry User** on the Foundry project or Foundry resource. Azure
`Reader` is not enough because the eval step calls Foundry data-plane APIs such
as `Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
needs **two** Azure RBAC roles before the first workflow run. Both are required
and the eval step fails silently (every metric returns `null`) if only one is
in place:

- **Foundry User** on the Foundry project or Foundry resource. Azure `Reader`
is not enough because the eval step calls Foundry data-plane APIs such as
`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
- **Cognitive Services OpenAI User** on the underlying Azure AI Services
account that hosts the evaluator model deployment. Foundry `azure_ai_evaluator`
graders impersonate the OIDC principal to call OpenAI; without this role
they fail with a 401 `PermissionDenied` on
`Microsoft.CognitiveServices/accounts/OpenAI/deployments/chat/completions/action`
and every metric returns `null` in the cloud eval report. AgentOps lifts that
error into `results.json` and the orchestrator's "0 usable metric scores"
warning so you can see the cause in CI logs, but the workflow still fails the
gate. The role ids are `53ca6127-db72-4b80-b1b0-d745d6d5456d` (Foundry User)
and `5e0bd9bd-7b93-4f28-af87-19fc36ad61bd` (Cognitive Services OpenAI User).
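
As a minimal sketch of the two grants described above, the following dry-run helper prints one `az role assignment create` command per role instead of running it. The function name `print_grant` and the placeholder defaults are illustrative assumptions, not part of AgentOps; the role ids are the ones listed above, and the object id and scopes must be replaced with your own values.

```shell
#!/bin/sh
# Dry-run sketch: prints the role-assignment commands instead of executing them.
# print_grant and the placeholder defaults are hypothetical, for illustration only.

FOUNDRY_USER_ROLE="53ca6127-db72-4b80-b1b0-d745d6d5456d"   # Foundry User
OPENAI_USER_ROLE="5e0bd9bd-7b93-4f28-af87-19fc36ad61bd"    # Cognitive Services OpenAI User

# print_grant <sp-object-id> <role-id> <scope>
print_grant() {
  printf 'az role assignment create --assignee-object-id %s --assignee-principal-type ServicePrincipal --role %s --scope %s\n' "$1" "$2" "$3"
}

# Values come from the environment; the defaults are placeholders to fill in.
SP_OBJECT_ID="${SP_OBJECT_ID:-<sp-object-id>}"
FOUNDRY_SCOPE="${FOUNDRY_SCOPE:-<foundry-scope>}"
AI_ACCOUNT_SCOPE="${AI_ACCOUNT_SCOPE:-<ai-services-account-scope>}"

# Foundry User goes on the Foundry project (or resource) scope.
print_grant "$SP_OBJECT_ID" "$FOUNDRY_USER_ROLE" "$FOUNDRY_SCOPE"
# Cognitive Services OpenAI User goes on the AI Services account scope.
print_grant "$SP_OBJECT_ID" "$OPENAI_USER_ROLE" "$AI_ACCOUNT_SCOPE"
```

Printing the commands first keeps the role assignment an explicit, reviewable step; run the printed lines only after an owner or admin has approved them.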

The generated eval and doctor workflows install AgentOps telemetry support.
When `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT` is set, AgentOps first tries to
18 changes: 12 additions & 6 deletions docs/tutorial-end-to-end.md
@@ -428,8 +428,11 @@ this Foundry prompt-agent repo.
Create or connect the GitHub repo if needed, create the `dev` environment, wire
Azure OIDC, set AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini as a GitHub `dev`
environment variable or equivalent Azure DevOps pipeline variable, verify the
OIDC principal has Foundry User access, and show me the plan before changing
GitHub or Azure.
OIDC principal has **both** Foundry User access on the dev Foundry project
**and** Cognitive Services OpenAI User access on the underlying Azure AI
Services account that hosts the evaluator model (both are required — without
the OpenAI User role, every cloud eval metric returns null), and show me the
plan before changing GitHub or Azure.
```

That value is not an `agentops init` answer. It tells the Foundry cloud eval
@@ -578,10 +581,13 @@ workflows running for this Foundry agent repo.

Extend the PR/dev setup if it already exists, wire Azure OIDC for the `qa` and
`production` environments, confirm required Actions variables such as
AZURE_OPENAI_DEPLOYMENT, verify the OIDC principals have Foundry User access,
and keep deploy placeholders unless this repo already has an azd deployment
path. Show me the plan before changing GitHub or Azure, and call out anything
that needs owner/admin permission.
AZURE_OPENAI_DEPLOYMENT, verify the OIDC principals have **both** Foundry User
access on each Foundry project **and** Cognitive Services OpenAI User on the
underlying AI Services account hosting the evaluator model (both are required
— without the OpenAI User role, every cloud eval metric returns null), and
keep deploy placeholders unless this repo already has an azd deployment path.
Show me the plan before changing GitHub or Azure, and call out anything that
needs owner/admin permission.
```

Use this moment in the video to connect the four repos: Foundry Toolkit creates
25 changes: 19 additions & 6 deletions docs/tutorial-prompt-agent-quickstart.md
@@ -718,9 +718,12 @@ This may be a brand-new folder with no Git repo or GitHub remote yet.
Keep the scope to the PR gate and dev deploy only: create or connect the
GitHub repo if needed, wire Azure OIDC and required Actions
variables/secrets, create only the `dev` environment, verify the OIDC
principal has Foundry User access on the **dev** Foundry project, and
do not set up `qa`, `production`, scheduled Doctor, or hosted
deployment workflows yet.
principal has **both** Foundry User access on the **dev** Foundry project
**and** Cognitive Services OpenAI User on the underlying Azure AI Services
account that hosts the evaluator model (both roles are required — without
the OpenAI User role, the Foundry cloud graders fail with a 401 and every
metric comes back null), and do not set up `qa`, `production`, scheduled
Doctor, or hosted deployment workflows yet.

The dev Foundry project endpoint is in `.azure/dev/.env`; the sandbox
endpoint is local-only and must not be added to CI.
@@ -738,9 +741,19 @@ it skips:
- Set Actions variables `AZURE_TENANT_ID`, `AZURE_SUBSCRIPTION_ID`,
`AZURE_CLIENT_ID`, `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT` (the dev
endpoint), and `APPLICATIONINSIGHTS_CONNECTION_STRING` if available.
- Verify the OIDC principal has **Foundry User** access on the dev
Foundry project. Reader alone is not enough for the data-plane calls
the prompt-agent staging and eval steps make.
- Verify the OIDC principal has **two** Azure RBAC roles before the first
run. Both are required and the eval step fails silently (every metric
returns `null`) if only one is in place:
- **Foundry User** on the dev Foundry project — Reader alone is not
enough for the data-plane calls the prompt-agent staging and eval steps
make.
- **Cognitive Services OpenAI User** on the underlying Azure AI Services
account that hosts the evaluator model deployment. Foundry
`azure_ai_evaluator` graders impersonate the OIDC principal to call
OpenAI; without this role they fail with a 401 `PermissionDenied`. The
AgentOps cloud-results parser lifts that error into `results.json` so
you can see the cause in the artifact, but the workflow still fails
the gate.
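
One way to perform the verification above is sketched below. The helper `missing_roles` is a hypothetical name, not an AgentOps command: it reads role definition ids on stdin and prints the name of each required role that is absent. It assumes the ids are those documented for the two roles (`53ca6127-db72-4b80-b1b0-d745d6d5456d` for Foundry User, `5e0bd9bd-7b93-4f28-af87-19fc36ad61bd` for Cognitive Services OpenAI User), and it checks only that a role is assigned somewhere, not that it is assigned at the right scope.

```shell
#!/bin/sh
# missing_roles (hypothetical helper): reads role definition ids, one per line,
# on stdin and prints the name of every required role that is not among them.
# It does not validate the scope of each assignment.
missing_roles() {
  assigned="$(cat)"
  for pair in \
    "53ca6127-db72-4b80-b1b0-d745d6d5456d:Foundry User" \
    "5e0bd9bd-7b93-4f28-af87-19fc36ad61bd:Cognitive Services OpenAI User"
  do
    id="${pair%%:*}"     # text before the first colon: the role id
    name="${pair#*:}"    # text after the first colon: the role name
    case "$assigned" in
      *"$id"*) ;;                    # role id found in the input
      *) printf '%s\n' "$name" ;;    # role id absent: report the role name
    esac
  done
}

# Example usage (requires the Azure CLI and a signed-in session):
#   az role assignment list --assignee <sp-object-id> --all \
#     --query "[].roleDefinitionId" -o tsv | missing_roles
```

Empty output means both role ids were found; any printed name is a role to grant before the first run.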

## 13. First green PR → merge → dev deploy

66 changes: 49 additions & 17 deletions plugins/agentops/skills/agentops-workflow/SKILL.md
@@ -100,22 +100,40 @@ by discovering the whole Azure subscription.
`repo:<owner>/<repo>:environment:dev`. Do not assume branch or
`pull_request` subjects without reading the workflow.
9. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app /
service principal has Foundry data-plane access. It needs **Foundry User**
(role id `53ca6127-db72-4b80-b1b0-d745d6d5456d`, formerly Azure AI User) at
the Foundry project scope, or at the Foundry resource scope if that is the
team's standard. Azure **Reader** is not enough; without this role the eval
step fails on
`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
10. If the Foundry RBAC assignment is missing, do not run the workflow yet.
Show the exact GitHub OIDC client ID / service principal, desired role, and
target Foundry scope, then ask the user to approve the role assignment or
service principal has **two** RBAC assignments. Both are required; the eval
step fails silently (every metric returns `null`) if only one is in place.
1. **Foundry User** on the Foundry project (or the Foundry resource scope
if that is the team's standard). Role id
`53ca6127-db72-4b80-b1b0-d745d6d5456d` (formerly Azure AI User). Without
this the candidate-staging step fails on
`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
2. **Cognitive Services OpenAI User** on the underlying Azure AI Services
account that hosts the evaluator model deployment
(typically the parent account of the Foundry project). Role id
`5e0bd9bd-7b93-4f28-af87-19fc36ad61bd`. Without this the Foundry
`azure_ai_evaluator` graders fail with a 401 `PermissionDenied` on
`Microsoft.CognitiveServices/accounts/OpenAI/deployments/chat/completions/action`
and every metric comes back `null` in the cloud eval report. AgentOps now
lifts that error into `results.json` and the orchestrator's "0 usable
metric scores" warning so the cause is visible in CI logs, but the
workflow still fails the gate. Grant this role **before** the first run.
Azure **Reader** is not enough for either step.
10. If either RBAC assignment is missing, do not run the workflow yet.
Show the exact GitHub OIDC client ID / service principal, desired role, and
target scope (project for Foundry User, AI Services account for Cognitive
Services OpenAI User), then ask the user to approve the role assignment or
get an Azure/Foundry admin to grant it. After assignment, read it back or ask
the user to confirm before dispatching the workflow.
When the user approves and you know the Foundry scope, use the role id to
avoid rename drift:
When the user approves and you know the scopes, use the role ids to avoid
rename drift:
- `az ad sp show --id <AZURE_CLIENT_ID> --query id -o tsv`
- `az role assignment list --assignee <sp-object-id> --scope <foundry-scope> --include-inherited`
- `az role assignment create --assignee-object-id <sp-object-id> --assignee-principal-type ServicePrincipal --role 53ca6127-db72-4b80-b1b0-d745d6d5456d --scope <foundry-scope>`
- `az role assignment create --assignee-object-id <sp-object-id> --assignee-principal-type ServicePrincipal --role 5e0bd9bd-7b93-4f28-af87-19fc36ad61bd --scope <ai-services-account-scope>`
The AI Services account scope looks like
`/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<ai-account-name>`
and can be derived from
`az cognitiveservices account list --resource-group <foundry-project-rg> --query "[?kind=='AIServices'].id" -o tsv`.
11. Ask before creating or updating GitHub repos, GitHub environments,
variables/secrets, Entra app registrations/service principals, federated
credentials, managed identities, or Azure RBAC assignments.
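
When the Foundry project scope is already known, the AI Services account scope from step 10 can sometimes be derived without another lookup. The sketch below assumes the project resource id has the shape `<account-scope>/projects/<project-name>`, which matches the "parent account of the Foundry project" case but should be confirmed against your own resource ids. The function name `account_scope_from_project` is hypothetical.

```shell
#!/bin/sh
# account_scope_from_project (hypothetical helper): strips a trailing
# /projects/<name> segment from a Foundry project resource id.
# Assumption: the project is a child resource of the AI Services account.
account_scope_from_project() {
  case "$1" in
    */providers/Microsoft.CognitiveServices/accounts/*/projects/*)
      # Remove the shortest suffix that starts with /projects/
      printf '%s\n' "${1%/projects/*}" ;;
    *)
      # Already an account scope, or an unrecognized shape: return unchanged
      printf '%s\n' "$1" ;;
  esac
}
```

If the evaluator model deployment lives in a different account than the project's parent, this shortcut does not apply and the `az cognitiveservices account list` query remains the safer route.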
@@ -304,11 +322,21 @@ Then configure Workload Identity Federation on the Azure side
environment** the workflows will run from. See
`docs/ci-github-actions.md` for the exact `az` commands.

Also grant the same app registration / service principal **Foundry User** on the
Foundry project or Foundry resource before the first workflow run. The PR gate
uses Foundry data-plane APIs to read prompt agents; Azure `Reader` only proves
ARM access and will still fail the eval step with
`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
Also grant the same app registration / service principal **two** Azure
RBAC roles before the first workflow run; both are required and the eval
step fails silently (every metric returns `null`) if only one is in place:

1. **Foundry User** on the Foundry project or Foundry resource. The PR gate
uses Foundry data-plane APIs to read prompt agents; Azure `Reader` only
proves ARM access and will still fail the eval step with
`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
2. **Cognitive Services OpenAI User** on the underlying Azure AI Services
account that hosts the evaluator model deployment. Without this, Foundry
`azure_ai_evaluator` graders fail with a 401 `PermissionDenied` on the
OpenAI `chat/completions/action` data action and every metric returns
`null` in the cloud eval report. AgentOps surfaces that error in
`results.json` and the orchestrator's "0 usable metric scores" warning,
but the workflow still fails the gate — fix the role before the run.

Tell the user that CI evals emit `agentops.eval.*` telemetry and scheduled
Doctor runs emit `agentops.agent.finding.*` telemetry when App Insights is
@@ -319,7 +347,11 @@ Monitor deep links.

Already done in Step 2 - the `agentops-azure` service connection
handles auth. Make sure the underlying service principal or managed
identity has the **Foundry User** role on the Foundry project or resource.
identity has **both** the **Foundry User** role on the Foundry project (or
Foundry resource) **and** the **Cognitive Services OpenAI User** role on the
underlying Azure AI Services account that hosts the evaluator model. Both
are required; without the OpenAI User role the Foundry graders fail with a
401 `PermissionDenied` and every cloud eval metric returns `null`.

## Step 4 - Use azd for deployment

66 changes: 49 additions & 17 deletions src/agentops/templates/skills/agentops-workflow/SKILL.md
@@ -100,22 +100,40 @@ by discovering the whole Azure subscription.
`repo:<owner>/<repo>:environment:dev`. Do not assume branch or
`pull_request` subjects without reading the workflow.
9. Before triggering a Foundry prompt-agent workflow, make sure the OIDC app /
service principal has Foundry data-plane access. It needs **Foundry User**
(role id `53ca6127-db72-4b80-b1b0-d745d6d5456d`, formerly Azure AI User) at
the Foundry project scope, or at the Foundry resource scope if that is the
team's standard. Azure **Reader** is not enough; without this role the eval
step fails on
`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
10. If the Foundry RBAC assignment is missing, do not run the workflow yet.
Show the exact GitHub OIDC client ID / service principal, desired role, and
target Foundry scope, then ask the user to approve the role assignment or
service principal has **two** RBAC assignments. Both are required; the eval
step fails silently (every metric returns `null`) if only one is in place.
1. **Foundry User** on the Foundry project (or the Foundry resource scope
if that is the team's standard). Role id
`53ca6127-db72-4b80-b1b0-d745d6d5456d` (formerly Azure AI User). Without
this the candidate-staging step fails on
`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
2. **Cognitive Services OpenAI User** on the underlying Azure AI Services
account that hosts the evaluator model deployment
(typically the parent account of the Foundry project). Role id
`5e0bd9bd-7b93-4f28-af87-19fc36ad61bd`. Without this the Foundry
`azure_ai_evaluator` graders fail with a 401 `PermissionDenied` on
`Microsoft.CognitiveServices/accounts/OpenAI/deployments/chat/completions/action`
and every metric comes back `null` in the cloud eval report. AgentOps now
lifts that error into `results.json` and the orchestrator's "0 usable
metric scores" warning so the cause is visible in CI logs, but the
workflow still fails the gate. Grant this role **before** the first run.
Azure **Reader** is not enough for either step.
10. If either RBAC assignment is missing, do not run the workflow yet.
Show the exact GitHub OIDC client ID / service principal, desired role, and
target scope (project for Foundry User, AI Services account for Cognitive
Services OpenAI User), then ask the user to approve the role assignment or
get an Azure/Foundry admin to grant it. After assignment, read it back or ask
the user to confirm before dispatching the workflow.
When the user approves and you know the Foundry scope, use the role id to
avoid rename drift:
When the user approves and you know the scopes, use the role ids to avoid
rename drift:
- `az ad sp show --id <AZURE_CLIENT_ID> --query id -o tsv`
- `az role assignment list --assignee <sp-object-id> --scope <foundry-scope> --include-inherited`
- `az role assignment create --assignee-object-id <sp-object-id> --assignee-principal-type ServicePrincipal --role 53ca6127-db72-4b80-b1b0-d745d6d5456d --scope <foundry-scope>`
- `az role assignment create --assignee-object-id <sp-object-id> --assignee-principal-type ServicePrincipal --role 5e0bd9bd-7b93-4f28-af87-19fc36ad61bd --scope <ai-services-account-scope>`
The AI Services account scope looks like
`/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<ai-account-name>`
and can be derived from
`az cognitiveservices account list --resource-group <foundry-project-rg> --query "[?kind=='AIServices'].id" -o tsv`.
11. Ask before creating or updating GitHub repos, GitHub environments,
variables/secrets, Entra app registrations/service principals, federated
credentials, managed identities, or Azure RBAC assignments.
@@ -304,11 +322,21 @@ Then configure Workload Identity Federation on the Azure side
environment** the workflows will run from. See
`docs/ci-github-actions.md` for the exact `az` commands.

Also grant the same app registration / service principal **Foundry User** on the
Foundry project or Foundry resource before the first workflow run. The PR gate
uses Foundry data-plane APIs to read prompt agents; Azure `Reader` only proves
ARM access and will still fail the eval step with
`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
Also grant the same app registration / service principal **two** Azure
RBAC roles before the first workflow run; both are required and the eval
step fails silently (every metric returns `null`) if only one is in place:

1. **Foundry User** on the Foundry project or Foundry resource. The PR gate
uses Foundry data-plane APIs to read prompt agents; Azure `Reader` only
proves ARM access and will still fail the eval step with
`Microsoft.CognitiveServices/accounts/AIServices/agents/read`.
2. **Cognitive Services OpenAI User** on the underlying Azure AI Services
account that hosts the evaluator model deployment. Without this, Foundry
`azure_ai_evaluator` graders fail with a 401 `PermissionDenied` on the
OpenAI `chat/completions/action` data action and every metric returns
`null` in the cloud eval report. AgentOps surfaces that error in
`results.json` and the orchestrator's "0 usable metric scores" warning,
but the workflow still fails the gate — fix the role before the run.

Tell the user that CI evals emit `agentops.eval.*` telemetry and scheduled
Doctor runs emit `agentops.agent.finding.*` telemetry when App Insights is
@@ -319,7 +347,11 @@ Monitor deep links.

Already done in Step 2 - the `agentops-azure` service connection
handles auth. Make sure the underlying service principal or managed
identity has the **Foundry User** role on the Foundry project or resource.
identity has **both** the **Foundry User** role on the Foundry project (or
Foundry resource) **and** the **Cognitive Services OpenAI User** role on the
underlying Azure AI Services account that hosts the evaluator model. Both
are required; without the OpenAI User role the Foundry graders fail with a
401 `PermissionDenied` and every cloud eval metric returns `null`.

## Step 4 - Use azd for deployment
