feat: add AI model quota preflight validation check #7672
rajeshkamal5050 merged 11 commits into main
Add a new local-preflight check (ai_model_quota) that detects Microsoft.CognitiveServices/accounts/deployments resources in the Bicep snapshot and validates quota availability before provisioning.

The check:
- Extracts model name, SKU, capacity, and location from deployment resources in the Bicep snapshot
- Resolves usage names from the Azure AI model catalog (not constructed)
- Queries GetAiUsages API to compare requested capacity vs remaining quota
- Returns warnings for deployments exceeding quota or using invalid model names/versions not found in the catalog
- Falls back to AZURE_LOCATION or resource group location when the snapshot doesn't resolve resourceGroup().location

Framework improvements:
- PreflightCheckFn now returns []PreflightCheckResult (multiple findings)
- Snapshot for RG deployments now receives --location from AZURE_LOCATION
- Added ResourceService.GetResourceGroup() for RG location lookup

Fixes #5432

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
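As a rough illustration of the framework change described above, the sketch below shows a check function type that returns a slice of findings and a runner that concatenates them. The struct fields and the `runChecks` helper are simplified assumptions for illustration, not the actual azd definitions; only the diagnostic IDs come from this PR.

```go
package main

import "fmt"

// PreflightCheckResult is a simplified, hypothetical stand-in for the azd type.
type PreflightCheckResult struct {
	DiagnosticID string // e.g. "ai_model_quota_exceeded"
	Message      string
}

// PreflightCheckFn returns zero or more findings instead of a single result.
type PreflightCheckFn func() []PreflightCheckResult

// runChecks (hypothetical) runs every check and collects all findings in order.
func runChecks(checks []PreflightCheckFn) []PreflightCheckResult {
	results := []PreflightCheckResult{}
	for _, check := range checks {
		results = append(results, check()...)
	}
	return results
}

func main() {
	quota := func() []PreflightCheckResult {
		return []PreflightCheckResult{
			{DiagnosticID: "ai_model_quota_exceeded", Message: "deployment A exceeds quota"},
			{DiagnosticID: "ai_model_not_found", Message: "deployment B uses an unknown model"},
		}
	}
	clean := func() []PreflightCheckResult { return nil }

	for _, r := range runChecks([]PreflightCheckFn{quota, clean}) {
		fmt.Printf("%s: %s\n", r.DiagnosticID, r.Message)
	}
}
```

A single check can now report one finding per offending deployment, which a single-result signature could not express.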
Pull request overview
Adds an ai_model_quota local preflight check to detect Cognitive Services model deployments in the Bicep snapshot and warn when quota would be exceeded (or when the model/SKU/version cannot be resolved from the catalog), plus supporting framework changes and tests.
Changes:
- Extend the local preflight framework to allow checks to return multiple findings and improve snapshot location resolution for RG-scoped deployments.
- Add AI model deployment extraction from the snapshot and a quota/catalog validation check in the Bicep provider.
- Add unit + functional tests and new Bicep samples to exercise RG/subscription scoped scenarios and edge cases.
Reviewed changes
Copilot reviewed 15 out of 21 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| cli/azd/pkg/infra/provisioning/bicep/local_preflight.go | Updates preflight check contract to return multiple results; adds cognitive deployment extraction and passes env-based location into snapshot opts for RG scope. |
| cli/azd/pkg/infra/provisioning/bicep/bicep_provider.go | Wires the new ai_model_quota check into preflight and implements quota/catalog validation + RG location fallback. |
| cli/azd/pkg/azapi/resource_service.go | Adds ResourceService.GetResourceGroup() to support RG location lookup fallback. |
| cli/azd/pkg/infra/provisioning/bicep/ai_model_quota_check_test.go | Adds unit tests for snapshot resource extraction and usage-name resolution logic. |
| cli/azd/pkg/infra/provisioning/bicep/local_preflight_test.go | Updates existing tests to the new “multiple results” check signature. |
| cli/azd/pkg/infra/provisioning/bicep/role_assignment_check_test.go | Updates role assignment preflight check tests to the new check signature and result shape. |
| cli/azd/test/functional/preflight_quota_test.go | Adds functional tests (with recordings) to validate warnings and location handling across scopes. |
| cli/azd/test/functional/testdata/samples/ai-quota/README.md | Documents the new functional test samples and their parameter/env-var mappings. |
| cli/azd/test/functional/testdata/samples/ai-quota/rg-deployment/azure.yaml | Adds RG-scoped sample project definition for functional tests. |
| cli/azd/test/functional/testdata/samples/ai-quota/rg-deployment/infra/main.bicep | Adds RG-scoped Bicep template that deploys an AI Services account + model deployments. |
| cli/azd/test/functional/testdata/samples/ai-quota/rg-deployment/infra/main.parameters.json | Adds parameter file mapping env vars into the RG-scoped sample template. |
| cli/azd/test/functional/testdata/samples/ai-quota/sub-deployment/azure.yaml | Adds subscription-scoped sample project definition for functional tests. |
| cli/azd/test/functional/testdata/samples/ai-quota/sub-deployment/infra/main.bicep | Adds subscription-scoped template that creates an RG and deploys AI resources via a module. |
| cli/azd/test/functional/testdata/samples/ai-quota/sub-deployment/infra/ai-resources.bicep | Adds module that defines the AI Services account and model deployments. |
| cli/azd/test/functional/testdata/samples/ai-quota/sub-deployment/infra/main.parameters.json | Adds parameter file mapping env vars into the subscription-scoped sample template. |
- Initialize results to non-nil empty slice to distinguish 'no findings' from 'checks skipped' in telemetry
- Aggregate capacity by usage name so multiple deployments sharing the same quota pool are checked against combined demand
- Use effective capacity (not raw dep.Capacity) in warning messages
- Use t.Context() instead of context.Background() in tests
- Use %q and include subscription ID in GetResourceGroup error message

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Treat missing usage API entries as 0 remaining quota instead of silently skipping validation
- Match on ModelFormat in resolveUsageName to avoid cross-format name collisions in the model catalog
- Fix test comment to accurately describe exit behavior

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Break long lines in bicep_provider.go and test file to satisfy the 125-char lll linter rule

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Remove snapshot --location override for RG deployments to avoid resolving resourceGroup().location incorrectly when the selected RG is in a different region than AZURE_LOCATION
- Prefer RG location lookup over AZURE_LOCATION in fallback order
- Sort locations for deterministic warning output order
- Handle missing SKU/version gracefully in ai_model_not_found warning

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Remove unused envLocation field from localArmPreflight struct
- Update README table to match actual functional test coverage

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ings

The Bicep snapshot for RG-scoped deployments needs --location to resolve resourceGroup().location. Without it, the snapshot fails or produces unresolved expressions, causing the preflight check to be silently skipped. The location is resolved from the actual resource group (if it already exists) or AZURE_LOCATION (for new resource groups).

Also adds explicit 'azd env set AZURE_LOCATION' in RG functional tests to ensure the location is in the azd environment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
spboyer left a comment
Solid preflight check with good aggregation logic and thorough test coverage (unit + functional with recordings). All 20 prior copilot-bot threads are properly resolved. Two issues to address:
- bicep_provider.go:2642 — doc comment on resolveResourceTenantPrincipalId truncated during rebase (3 lines lost)
- bicep_provider.go:2449 — resolveResourceGroupLocation called twice per preflight run (redundant API call)
jongio left a comment
Four findings the automated reviewer missed:
- The doc comment on `resolveResourceTenantPrincipalId` got truncated - three middle lines were accidentally deleted, leaving a broken sentence.
- `launch.json` has personal dev config changes (hardcoded path, debug target swap) that shouldn't ship.
- `resolveResourceGroupLocation` is called twice (once in `validatePreflight`, once in `checkAiModelQuota`) - each hitting Azure API for the same RG metadata. The resolved location could be passed through `validationContext`.
- `resolveResourceGroupLocation` resolves `ResourceService` via `serviceLocator` even though it's already a direct field on `BicepProvider`.
Side note: several of the bot's existing comments are factually incorrect - the code already handles aggregation (requiredByUsage map), deterministic iteration (slices.Sorted), and ModelFormat matching (checked when non-empty).
- Restore truncated doc comment on resolveResourceTenantPrincipalId
- Remove duplicate resolveResourceGroupLocation call by passing the resolved location through validationContext.EnvLocation
- Use p.resourceService directly instead of p.serviceLocator.Resolve

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Remove .vscode/.copilot-worktree-launch and revert personal changes to cli/azd/.vscode/launch.json.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Use nil-safe convert.ToValueWithDefault for GetResourceGroup response fields to prevent potential panics
- Add early return guard in resolveResourceGroupLocation when subscriptionId is empty

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jongio left a comment
Solid work - the author addressed all prior feedback. Implementation follows codebase conventions, quota aggregation logic is correct, and test coverage via functional recordings looks good. A couple non-blocking suggestions below.
- Add log line when deployment is skipped due to unresolved location
- Use case-insensitive comparison for version matching (consistency)
- Add TestResolveUsageName_FormatFiltering unit test for format filtering and version case-insensitivity

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
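The format filtering and case-insensitive version matching mentioned in this commit could look roughly like the sketch below. The `catalogEntry` struct, the `findUsageName` function, and all model, format, and usage names are hypothetical placeholders; the real `resolveUsageName` works against the Azure AI model catalog types, which are not shown in this conversation. Treating name and SKU as case-insensitive is also an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// catalogEntry is a hypothetical, flattened view of one model catalog row.
type catalogEntry struct {
	Name      string
	Format    string
	Version   string
	SKU       string
	UsageName string
}

// findUsageName returns the usage name of the first entry matching the
// deployment. Comparisons ignore case; format is only compared when the
// deployment specifies one.
func findUsageName(catalog []catalogEntry, name, format, version, sku string) (string, bool) {
	for _, e := range catalog {
		if !strings.EqualFold(e.Name, name) ||
			!strings.EqualFold(e.Version, version) ||
			!strings.EqualFold(e.SKU, sku) {
			continue
		}
		if format != "" && !strings.EqualFold(e.Format, format) {
			continue
		}
		return e.UsageName, true
	}
	return "", false
}

func main() {
	catalog := []catalogEntry{
		{Name: "model-x", Format: "FormatA", Version: "2024-01-Preview", SKU: "Standard", UsageName: "FormatA.Standard.model-x"},
		{Name: "model-x", Format: "FormatB", Version: "2024-01-Preview", SKU: "Standard", UsageName: "FormatB.Standard.model-x"},
	}
	// Same model name in two formats: the format filter picks the right row,
	// and the lowercase "preview" still matches.
	usage, ok := findUsageName(catalog, "model-x", "FormatB", "2024-01-preview", "Standard")
	fmt.Println(usage, ok)
}
```

Without the format filter, the first row would win for any deployment named `model-x`, which is the cross-format collision the earlier commit set out to avoid.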
wbreza left a comment
Code Review — PR #7672: AI Model Quota Preflight Validation
Verdict: 💬 Comments — Feature concept is solid. Several items to discuss, particularly around test recording size and quota aggregation logic.
Findings
🟠 1. Test recordings are very large (~5.5MB) — can we reduce?
The 6 recording YAML files total ~5.5MB / 16,689 lines. Each RG-scoped recording is ~5,206 lines (~1.56MB), while subscription-scoped ones are ~357 lines (~100KB).
The bulk comes from the CognitiveServices model catalog response (~607KB per recording) which includes costs, SKU variants, capabilities, deprecation timelines, and rate limits — none of which the preflight logic actually inspects. The preflight check only needs model name, existence, and quota/usage numbers.
The 3 RG-scoped recordings appear to contain nearly identical responses — are these different test scenarios that happen to hit the same API responses? If so, could we:
- Strip the recordings to only include fields the preflight logic reads?
- Or share a common base recording and parameterize the differences?
This would significantly reduce the PR footprint. Not blocking, but worth investigating.
🟠 2. Quota not aggregated across deployments sharing the same model
Each deployment is validated independently against remaining quota. If deployment A requests 60 units and deployment B requests 60 units, both pass individually against 100 remaining — but the combined 120 exceeds quota. Consider aggregating required capacity per usageName before checking against remaining quota.
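A minimal sketch of the aggregation suggested here, using the 60 + 60 versus 100 example from the finding. The `deployment` struct, the `exceededUsages` function, and the pool names are hypothetical; only the idea of summing into a per-usage-name map before comparing is taken from the discussion.

```go
package main

import (
	"fmt"
	"sort"
)

// deployment is a hypothetical, simplified view of a model deployment.
type deployment struct {
	Name      string
	UsageName string // quota pool the deployment draws from
	Capacity  float64
}

// exceededUsages sums requested capacity per usage name and returns the
// usage names whose combined demand is greater than the remaining quota.
func exceededUsages(deps []deployment, remaining map[string]float64) []string {
	requiredByUsage := map[string]float64{}
	for _, d := range deps {
		requiredByUsage[d.UsageName] += d.Capacity
	}

	var exceeded []string
	for usage, required := range requiredByUsage {
		if required > remaining[usage] {
			exceeded = append(exceeded, usage)
		}
	}
	sort.Strings(exceeded) // map iteration order is random; sort for stable output
	return exceeded
}

func main() {
	deps := []deployment{
		{Name: "A", UsageName: "pool-1", Capacity: 60},
		{Name: "B", UsageName: "pool-1", Capacity: 60},
		{Name: "C", UsageName: "pool-2", Capacity: 10},
	}
	remaining := map[string]float64{"pool-1": 100, "pool-2": 100}
	fmt.Println(exceededUsages(deps, remaining)) // prints [pool-1]
}
```

Deployments A and B each fit individually, but their combined demand of 120 is flagged against the shared pool.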
🟠 3. Zero/negative capacity handling is inconsistent
When dep.Capacity <= 0, it's coerced to 1 for the quota comparison but reported as the original value in the warning message. This could produce confusing diagnostics like "Requested capacity: 0, but deployment requires 1 unit." Consider either warning about invalid capacity separately, or using consistent values in both comparison and message.
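One way to keep the comparison and the message consistent is to compute the coerced value once and use it everywhere. The sketch below assumes integer capacities and a hypothetical message format; `effectiveCapacity` and `quotaWarning` are illustrative names, not the functions in the PR.

```go
package main

import "fmt"

// effectiveCapacity returns the capacity used for quota checks.
// Assumption: non-positive values are coerced to 1, as the finding describes.
func effectiveCapacity(requested int) int {
	if requested <= 0 {
		return 1
	}
	return requested
}

// quotaWarning compares and reports using the same effective value, so the
// message never shows a capacity that differs from the one that was checked.
func quotaWarning(name string, requested, remaining int) (string, bool) {
	capacity := effectiveCapacity(requested)
	if capacity <= remaining {
		return "", false
	}
	msg := fmt.Sprintf("deployment %q requires capacity %d but only %d remains",
		name, capacity, remaining)
	return msg, true
}

func main() {
	msg, warn := quotaWarning("chat", 0, 0)
	fmt.Println(warn, msg)
}
```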
🟡 4. Missing quota entry defaults to zero remaining
If a usage name isn't in the quota API response (API incompleteness, regional restrictions), usageMap[usageName] defaults to float64 zero. This could trigger a false warning. Consider checking existence explicitly and emitting a distinct "quota data unavailable" warning instead.
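The explicit existence check suggested here maps onto Go's two-value map lookup. The sketch below is an assumption about shape only: `checkUsage` and its return strings are hypothetical, and it shows the reviewer's "distinct warning" option. The author's reply later in this thread states that the PR instead treats a missing entry as 0 remaining and warns.

```go
package main

import "fmt"

// checkUsage (hypothetical) classifies a capacity request against the usage map.
// The two-value lookup separates "entry missing" from "entry present with 0".
func checkUsage(usageMap map[string]float64, usageName string, required float64) string {
	remaining, found := usageMap[usageName]
	switch {
	case !found:
		return "quota data unavailable"
	case required > remaining:
		return "quota exceeded"
	default:
		return "ok"
	}
}

func main() {
	usage := map[string]float64{"pool-1": 100, "pool-empty": 0}
	fmt.Println(checkUsage(usage, "pool-1", 50))     // entry present, fits
	fmt.Println(checkUsage(usage, "pool-empty", 1))  // entry present, truly zero
	fmt.Println(checkUsage(usage, "pool-absent", 1)) // no entry returned by the API
}
```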
🟡 5. Nil vs empty slice semantics in validate()
nil return from the check function could mean "skipped" or "ran with no findings." If telemetry distinguishes these, initialize the results slice to empty ([]PreflightCheckResult{}) rather than nil when checks actually execute.
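The distinction matters because a nil slice and an empty slice behave the same under `len` and `range` but differ under a nil comparison and under JSON encoding. The following sketch uses a hypothetical one-field result type and helper names to show the difference; how azd telemetry actually serializes results is not shown in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PreflightCheckResult is a simplified, hypothetical stand-in for the azd type.
type PreflightCheckResult struct {
	Message string
}

// skipped models a check that did not run: it returns a nil slice.
func skipped() []PreflightCheckResult { return nil }

// ranClean models a check that ran and found nothing: non-nil, length 0.
func ranClean() []PreflightCheckResult { return []PreflightCheckResult{} }

// encode returns the JSON form of the results, or an error string.
func encode(results []PreflightCheckResult) string {
	b, err := json.Marshal(results)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	a, b := skipped(), ranClean()
	fmt.Println(a == nil, len(a), encode(a)) // prints: true 0 null
	fmt.Println(b == nil, len(b), encode(b)) // prints: false 0 []
}
```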
🟡 6. RG-scoped deployments don't resolve actual RG location
Falls back to AZURE_LOCATION env var, which may differ from the resource group's actual region. Consider using GetResourceGroup() to look up the actual RG location for more accurate quota validation.
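A fallback chain along these lines could be sketched as follows, assuming the order described in the commits above: a location already resolved in the snapshot, then the resource group's actual location, then AZURE_LOCATION. The `lookupFn` type stands in for a call such as `GetResourceGroup()`, and every name and signature here is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical error for a resource group that does not exist yet.
var errNotFound = errors.New("resource group not found")

// lookupFn is a hypothetical stand-in for a resource group location lookup.
type lookupFn func(resourceGroup string) (string, error)

// resolveLocation prefers the snapshot value, then the resource group's own
// location, and only then the environment's AZURE_LOCATION value.
func resolveLocation(snapshotLocation, resourceGroup, envLocation string, lookup lookupFn) string {
	if snapshotLocation != "" {
		return snapshotLocation
	}
	if resourceGroup != "" && lookup != nil {
		if loc, err := lookup(resourceGroup); err == nil && loc != "" {
			return loc
		}
	}
	return envLocation
}

func main() {
	existing := func(string) (string, error) { return "westus3", nil }
	missing := func(string) (string, error) { return "", errNotFound }

	fmt.Println(resolveLocation("", "rg-app", "eastus2", existing)) // prints westus3
	fmt.Println(resolveLocation("", "rg-new", "eastus2", missing))  // prints eastus2
}
```

With this order, an existing resource group in a different region than AZURE_LOCATION is validated against its own region's quota.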
🔵 7. Minor: GetResourceGroup() needs nil checks on SDK response pointers
Azure SDK responses may have nil pointer fields. Add guards before dereferencing resp.ID, resp.Name, resp.Location.
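A nil-safe dereference helper is one common way to add such guards. The generic `valueOrDefault` below is a hypothetical illustration; the PR itself uses `convert.ToValueWithDefault`, whose exact signature is not shown in this conversation, and `resourceGroupResponse` is a made-up stand-in for the SDK response type.

```go
package main

import "fmt"

// valueOrDefault (hypothetical) returns *ptr, or fallback when ptr is nil.
func valueOrDefault[T any](ptr *T, fallback T) T {
	if ptr == nil {
		return fallback
	}
	return *ptr
}

// resourceGroupResponse mimics an SDK response whose fields are pointers.
type resourceGroupResponse struct {
	ID       *string
	Name     *string
	Location *string
}

func main() {
	name := "rg-app"
	resp := resourceGroupResponse{Name: &name} // ID and Location left nil

	// Dereferencing resp.Location directly would panic; the helper does not.
	fmt.Println(valueOrDefault(resp.Name, ""))
	fmt.Println(valueOrDefault(resp.Location, "unknown"))
}
```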
What's Good
- ✅ Feature concept is solid — catching quota issues before provisioning saves real user time
- ✅ PreflightCheckFn returning []PreflightCheckResult enables multi-warning checks cleanly
- ✅ Nil aiModelService check is good defensive programming
- ✅ Good test scenario coverage (default capacity, invalid model, invalid version, different location)
- ✅ Both RG-scoped and subscription-scoped deployments tested
Nice work on the preflight framework evolution — the multi-result pattern is a good design choice. The quota validation will save users a lot of wasted provisioning time. 👍
Response to @wbreza's review

Thanks for the thorough review Wallace! Here's the status on each finding:

- 🟠 1. Large test recordings: Good point. This isn't specific to this PR; it's a general issue with how the recorder captures full API responses. Created #7699 to investigate recording size reduction across all functional tests.
- 🟠 2. Quota not aggregated: Already addressed in iteration 2. The check now aggregates required capacity per usage name.
- 🟠 3. Zero/negative capacity: Already addressed in iteration 1. The check now uses the effective capacity.
- 🟡 4. Missing quota entry defaults to 0: Already addressed in iteration 2. This is intentional conservative behavior: if the quota API doesn't return an entry for a usage name, we treat it as 0 remaining and warn, rather than silently skipping.
- 🟡 5. Nil vs empty slice: Already addressed in iteration 1.
- 🟡 6. RG location resolution: Already addressed.
- 🔵 7. Nil checks on SDK response: Already addressed.

All 7 items are either already fixed in prior iterations or tracked as follow-up (#7699 for recordings).
wbreza left a comment
Re-Review — PR #7672: AI Model Quota Preflight Validation
Verdict: ✅ Approve
All 7 findings from my previous review have been verified as addressed in the current code:
- ✅ Test recordings: Acknowledged as functional test fixtures
- ✅ Quota aggregation: Now uses `requiredByUsage[usageName] += effectiveCapacity` to sum across deployments before comparison
- ✅ Zero/negative capacity: Coerced effective value used consistently in both comparison and reporting
- ✅ Missing quota entry: Uses `remaining, found := usageMap[...]` with explicit existence check
- ✅ Nil vs empty slice: Results initialized to `[]PreflightCheckResult{}` with semantic comment
- ✅ RG location resolution: New `resolveResourceGroupLocation()` calls `GetResourceGroup()`, falls back to `AZURE_LOCATION`
- ✅ GetResourceGroup nil safety: Uses `convert.ToValueWithDefault()` for safe pointer dereferencing
Fresh scan of all core Go files found no additional issues. Quota aggregation loop, deduplication logic, and error handling are all correct.
Nice work on the iterative improvements, Victor — the preflight framework evolution and quota validation will save real user time. 👍
Agent-Logs-Url: https://github.com/Azure/azure-dev/sessions/35135691-9b5c-4458-8cc1-3ec8534b0753 Co-authored-by: rajeshkamal5050 <11532743+rajeshkamal5050@users.noreply.github.com>
* Initial plan
* Create changelog for azd 1.23.16 (#7694)
  Agent-Logs-Url: https://github.com/Azure/azure-dev/sessions/644ac742-2540-4fbe-bd0d-093c1a90b997
  Co-authored-by: rajeshkamal5050 <11532743+rajeshkamal5050@users.noreply.github.com>
* Fix changelog: remove hard-coded tool names from --fail-on-prompt entry
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add PR #7672 (AI model quota preflight check) to changelog
  Agent-Logs-Url: https://github.com/Azure/azure-dev/sessions/35135691-9b5c-4458-8cc1-3ec8534b0753
  Co-authored-by: rajeshkamal5050 <11532743+rajeshkamal5050@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: rajeshkamal5050 <11532743+rajeshkamal5050@users.noreply.github.com>
Co-authored-by: Rajesh Kamal <rajeshkamal@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Add a new local-preflight check (`ai_model_quota`) that detects Azure Cognitive Services model deployments in the Bicep snapshot and validates quota availability before provisioning, catching quota issues early instead of failing at deploy time.

What it does
- Detects `Microsoft.CognitiveServices/accounts/deployments` resources in the Bicep snapshot
- Queries the `GetAiUsages` API to compare requested capacity vs remaining quota
- Warns when a deployment exceeds the remaining quota (`ai_model_quota_exceeded`)
- Warns when a model name or version is not found in the catalog (`ai_model_not_found`)

Framework improvements
- `PreflightCheckFn` now returns `[]PreflightCheckResult` (supports multiple findings per check)
- Added `ResourceService.GetResourceGroup()` for looking up RG location when resource locations are not resolved in the snapshot
- Prefers the RG location lookup over `AZURE_LOCATION`, which may not match the selected resource group's region

Test coverage

Limitations / Follow-up

Fixes #5432