[K9VULN-14660] fix(agentless-azure): harden agentless azure setup #185
Merged
mohamed-challal merged 20 commits on May 12, 2026
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e4657b04b7
tedkahwaji approved these changes on May 11, 2026
tedkahwaji reviewed on May 11, 2026
parsons90 approved these changes on May 12, 2026
Summary

Hardens the Azure Agentless Scanner setup against the failure modes we expect to hit once it leaves pre-release: re-runs from a fresh Cloud Shell, repeated deploys with different `SCANNER_RESOURCE_GROUP` values, partial cleanups that leave Azure soft-deleted resources behind, and delegated-permission enterprise environments. Also roughly halves first-deploy time by parallelising the slowest phases.

The plumbing change underneath is a new per-install identifier (`install-id = sha256(scanner_sub | rg)[:12]`) used to derive Storage Account and Key Vault names, plus an Azure-tag-based discovery mechanism (`DatadogAgentlessScanner=true`) that replaces local state as the source of truth for "is there already a deployment in this subscription?".

Changes by theme

Robustness for re-runs and edge cases
- `fix(agentless-azure): handle unpurged secret key during re-setup with a different RG`: recover Azure soft-deleted Key Vaults instead of crashing with `ConflictError` on re-deploy.
- `fix(agentless-azure): handle resource group mismatch on deploy/destroy`: fail loudly (no silent overwrite) with actionable guidance when the user changes `SCANNER_RESOURCE_GROUP` between runs.
- `agentless azure: derive install_id and discover existing deployments via RG tag`: tag-based RG discovery on both `deploy` and `destroy`: `ConfigurationError` with mismatch guidance
- `agentless azure: scope storage account and key vault names to install_id`: resource names now derive from `(scanner_sub, rg)`, so two RGs in the same subscription will not collide on a future multi-install iteration. Drops the legacy SA-RG lookup fallback.

Performance

First-deploy wall-clock on Cloud Shell drops roughly in half.
- `perf(agentless-azure): parallelize lookup checks and resource creation`: Storage Account and Key Vault control-plane work runs concurrently.
- `perf(agentless-azure): run preflight checks in parallel`
- `perf(agentless-azure): bump Terraform parallelism 10 -> 20`
- `fix(agentless-azure): skip Key Vault secret retries when Secrets Officer already exists`

Permissions / enterprise delegation
- `add(agentless-azure): make permissions check softer in preflight if RG already exists`: when the RG is pre-created (admin-provisioned enterprise pattern), the resource-creation actions are probed at RG scope rather than subscription scope, so engineers with RG-only Contributor pass preflight.
- `add(agentless-azure): add roleDefinitions/write permission in preflight`: the Terraform `roles` module creates a custom scanning role whose `assignableScopes` covers every scan-target; surfacing this in preflight turns a confusing mid-apply 403 into an actionable error at the start of the run.

Observability
- `fix(agentless): mark active workflow step FAILED when Azure deploy exits with error`: failures now flip the in-progress step to `FAILED` on the workflow-status API; the Datadog UI setup-progress timeline no longer spins forever when, e.g., `terraform apply` exits with insufficient permissions.

Cleanup / housekeeping
- `refactor(agentless-azure): remove dead code`
- `docs(agentless-azure): update readme`
- `build(agentless-azure): re-build dist scripts`: refreshed `dist/azure_agentless_setup.pyz`.

Risk & compatibility
- … `deploy` fast (this is the intended behaviour; the alternative would be a silent pick).
- … `DatadogAgentlessScanner=true` tag the script and Terraform module already apply on resource creation. The script never re-tags resources it didn't create (admin-pre-created RGs stay untagged and require the env var on subsequent re-runs).
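
Illustrative sketch of the naming scheme from the Summary, for reviewers who want to see `install-id = sha256(scanner_sub | rg)[:12]` concretely. This is not the code shipped in the PR. It assumes the `|` in the formula is a literal separator between the two values, and the function names and the `ddscan` / `dd-scan-` prefixes are hypothetical placeholders chosen only to stay within Azure's 24-character limit for Storage Account and Key Vault names.

```python
import hashlib


def derive_install_id(scanner_subscription: str, resource_group: str) -> str:
    """Return the first 12 hex characters of sha256("<subscription>|<rg>").

    Assumption: the two inputs are joined with a literal "|" before hashing.
    """
    key = f"{scanner_subscription}|{resource_group}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()[:12]


def derive_resource_names(install_id: str) -> dict:
    """Build per-install resource names from the install id.

    The prefixes are hypothetical. Storage Account names allow only lowercase
    letters and digits (3-24 chars); Key Vault names allow alphanumerics and
    hyphens (3-24 chars), so a lowercase hex suffix is valid for both.
    """
    return {
        "storage_account": f"ddscan{install_id}",  # 6 + 12 = 18 characters
        "key_vault": f"dd-scan-{install_id}",      # 8 + 12 = 20 characters
    }


if __name__ == "__main__":
    subscription = "00000000-0000-0000-0000-000000000000"  # placeholder id
    for rg in ("scanner-rg-a", "scanner-rg-b"):
        install_id = derive_install_id(subscription, rg)
        print(rg, install_id, derive_resource_names(install_id))
```

Because the id depends on both the subscription and the resource group, two resource groups in the same subscription get different Storage Account and Key Vault names, which matches the collision-avoidance goal described under "Robustness for re-runs and edge cases".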
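
The "run preflight checks in parallel" item under Performance can be pictured with a generic thread-pool pattern. The sketch below is an assumption about the general shape only, not the implementation in this PR: `run_checks_in_parallel` and the example check names are hypothetical, and each check is modelled as a zero-argument callable that raises on failure.

```python
from concurrent.futures import ThreadPoolExecutor


def run_checks_in_parallel(checks, max_workers=8):
    """Run independent zero-argument checks concurrently.

    Returns a dict mapping each check name to None on success or to the
    exception it raised, so every failure can be reported in one pass.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the checks overlap in time.
        futures = {name: pool.submit(fn) for name, fn in checks.items()}
        for name, future in futures.items():
            try:
                future.result()
                results[name] = None
            except Exception as exc:  # collect the failure, keep going
                results[name] = exc
    return results


if __name__ == "__main__":
    def check_cli_available():
        """Hypothetical check that succeeds."""

    def check_role_definitions_write():
        """Hypothetical check that fails with a permission error."""
        raise PermissionError("missing roleDefinitions/write")

    outcome = run_checks_in_parallel({
        "cli": check_cli_available,
        "roleDefinitions/write": check_role_definitions_write,
    })
    failed = {name: err for name, err in outcome.items() if err is not None}
    print(failed)
```

Collecting exceptions instead of stopping at the first one lets a preflight step list every missing permission together, which fits the goal of surfacing actionable errors at the start of the run.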