Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 36 additions & 2 deletions .claude/skills/review-staged/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -92,15 +92,49 @@ The review does not assume docs will be updated later — a user-visible surface
- **Not excuses**: "it's preview/opt-in/temporary", "Microsoft Learn will cover it", "you can see it in the diff". The CHANGELOG is the release's source of truth.

### Cross-Cutting Contract Checks
Five checks that static per-file analysis tends to miss — each requires tracing across multiple files or comparing code against docs/descriptions:
Nine checks that static per-file analysis tends to miss — each requires tracing across multiple files or comparing code against docs/descriptions:

- **Return-value null semantics (Rule N)**: When a method documents that `null` (or a sentinel) carries a special meaning (e.g., "null = verified, safe to persist"), grep every call site and verify the producer returns null in exactly the documented cases. A path that returns non-null when the contract says "verified" silently breaks the caller's persistence gate.
- **CHANGELOG vs code numeric consistency (Rule O)**: After reading CHANGELOG `[Unreleased]`, extract every numeric claim (interval, retry count, timeout). Grep production code for the corresponding literals. Flag any mismatch — the CHANGELOG and the code must agree.
- **Swallowed `OperationCanceledException` (Rule P)**: For every `catch (OperationCanceledException)` that returns a value instead of rethrowing, verify the swallow is intentional and documented. In long-running interactive flows (setup, consent polling), a swallowed cancellation means Ctrl+C has no effect.
- **Test-only escape hatch declared `public` (Rule Q)**: When a property/field named `*ForTests*` or `*TestOverride*` is declared `public` in a production assembly, check the `.csproj` for `InternalsVisibleTo`. If present, `public` is unnecessary and widens the security surface — flag as MEDIUM and suggest `internal`.
- **`--help` text accuracy (Rule R)**: When the diff changes how a command surfaces output (new URL handoff, removed PowerShell path, added fallback), read every description string in the same file and its parent command. If the description still names an output form that no longer matches the code, flag as MEDIUM.

Full detection rules and real examples are in `.claude/agents/pr-code-reviewer.md` Step 9, Rules N through R.
**Added 2026-05-29 after PR #432 Copilot review surfaced gaps Rules N–R didn't catch. See [.codereviews/why-review-staged-missed-copilot-findings-20260529T190832Z.md](../../codereviews/why-review-staged-missed-copilot-findings-20260529T190832Z.md) for the full miss analysis.**

- **Branch-wide stale-mechanism sweep (Rule S)**: When the staged diff replaces mechanism X with mechanism Y (e.g., PowerShell fallback → az rest, `api://{appId}` → bare appId GUID, `per-app admin-consent URLs` → `az ad sp create`), enumerate the terms being replaced and `grep -rn` the **entire branch** (not just the staged delta) for each term. For every hit outside the staged delta, classify:
- Stale comment (FIX — code/comment lie about behavior).
- Stale user-facing log/warning/exception text (FIX — operators read these in production).
- Stale doc comment / XML doc (FIX — IDE surfaces these to consumers).
- Intentional historical reference (KEEP — e.g., commit message archaeology, CHANGELOG `[Released]` entries).

This is the analog of "rename hygiene" applied to behavior changes. It catches the family of issues where the implementation moves but the surrounding documentation lies. **Treat each surviving reference as MEDIUM if user-facing (LogWarning, LogError, exception messages, Warnings collection entries), LOW if internal-only (private comments).**

Six of the nine unique findings in PR #432's Copilot review were of this exact shape — all stale `"PowerShell fallback"` and `"api://{appId}"` references on lines outside the staged delta.

- **Test-class parallelism safety (Rule T)**: For every test class in the diff (or anywhere on the branch), check whether the class reads or writes any static property or field whose name matches the pattern `*ForTests*` or `*TestOverride*` (these are the conventional escape hatches in this codebase). If yes, the class **must** carry either:
- `[Collection("Sequential")]` attribute, OR
- `[CollectionDefinition(..., DisableParallelization = true)]` for a class-specific collection it owns.

Without one of these, xUnit may run the class in parallel with other classes that also mutate the same static state, producing flaky cross-class races. **Severity: MEDIUM.** The fix is one line. Two PR #432 findings (`BatchPermissionsOrchestratorTests`, `BatchPermissionsOrchestratorMissingSpTests`) were exactly this — both mutated `BypassSpProvisioningForTests` without `[Collection]` until Copilot flagged it.

Implementation: `grep -rn "BypassConsentChecksForTests\|BypassSpProvisioningForTests\|OpenUrlOverrideForTests\|<OtherKnownForTestsName>" <test_files>`. For each hit, read the enclosing test class and verify the attribute is present.

- **Branch-scope completeness checkpoint (Rule U)**: **The single biggest miss category in the PR #432 review.** Before declaring the review complete, list every file that appears in the **branch diff** (`git diff $(git merge-base HEAD origin/main)...HEAD --name-only`), not just the staged delta. The review must touch each file in that list at least to the extent of running the other rules against it.

In practice this means: if the staged delta is 4 files but the branch diff is 22 files, the review surface is 22, not 4. Prior `/review-staged` runs do **not** absolve the current run of covering the rest of the branch — assume nothing has been covered until you've explicitly read it in this run.

- **Hardening-bypass detection (Rule V)**: **Added 2026-05-29 after PR #432's second Copilot review caught a regression Rules N–U missed.** When the staged diff hardens an entry-point helper by making a parameter required (so the compiler forces it through one path), the lower-level function it delegates to typically still keeps the parameter optional — and **any direct caller of the lower-level function bypasses the hardening**. The hardening only protects the path that goes through the helper.

Detection: when the diff includes a signature change of the shape "parameter X was optional with `= null` default, now required (no default)", identify the **lower-level function** the helper delegates to. If that function still has X optional, run `grep -rn "<LowerLevelFunctionName>" --include="*.cs" src/<ProductionPath>/` (production callers only) and verify each direct caller passes X. Any caller that doesn't is a recurrence of the exact bug the hardening was meant to prevent.

Concrete PR #432 example: commit `7a1e317` made `mcpScopesByAudience` required on `ExecuteBatchPermissionsStepAsync` so the AllSubcommand and NonDwBlueprintSetupOrchestrator entry points couldn't forget it. But `ConfigureAllPermissionsAsync` (the lower-level orchestrator method) still had `knownMcpAudienceAppIds` optional, and `PermissionsSubcommand.ConfigureMcpPermissionsAsync` called it directly — bypassing the hardening and re-introducing the AADSTS500011 V2 routing regression. Copilot flagged it as HIGH; the skill missed it.

**Severity: HIGH** when the bypassed call site is in production code (live regression risk). LOW when it's a test fixture (tests routinely use null defaults). Treat the parameter-becoming-required signature change in the diff as a trigger: the moment you see one, follow the call chain down and grep direct callers of the lower-level function.

Implementation: at the start of the review, emit the list of branch-level files and treat them as the review surface. At the end, verify each file was read or explicitly justified as "no rule applies."

Full detection rules and real examples are in `.claude/agents/pr-code-reviewer.md` Step 9, Rules N through V.

### Context Awareness
The skill differentiates between:
Expand Down
11 changes: 9 additions & 2 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,8 @@ Agents provisioned before this release need `Agent365.Observability.OtelWrite` g
**Option B — CLI** (`a365 setup admin`) has been removed in this release. Use Option A above, or copy the PowerShell instructions printed in the `a365 setup all` summary output.

### Added
- `--skip-sp-provisioning` option on `setup all` — skips the interactive in-line provisioning of missing resource service principals (issue #429). Default: setup prompts per-resource and runs `az ad sp create --id <appId>` using the operator's `az login`. With this flag, missing SPs are excluded from the consent URL and listed in the Action Required block with the `az` command and a per-SP consent URL. Implicitly enabled when stdin is redirected (CI / pipe scenarios).
- Action Required block in the `setup all` summary now lists missing service principals — each entry shows the resource, pending scopes, the `az ad sp create` command, and the per-SP consent URL needed to complete provisioning.
- `logs export [command] [--output <dir>]` — exports a redacted copy of a CLI diagnostic log safe to share with Microsoft support. Redacts JWT tokens, email addresses, OS-path usernames, and tenant-specific GUIDs; replaces identical values with consistent aliases so log correlation is preserved. Preserves diagnostic IDs that aren't sensitive but are useful for debugging — `TraceId`, `CorrelationId`, Microsoft Graph `request-id` and `client-request-id` values, and well-known public Microsoft / Agent 365 resource appIds (such as the Microsoft Graph appId `00000003-0000-0000-c000-000000000000`). Omit `[command]` to export all available logs at once.
- `setup blueprint --show-secret` — displays the blueprint client secret stored in `a365.generated.config.json` in plaintext without re-running any setup steps. On Windows, decryption requires the same machine and user account that ran setup (DPAPI). When no secret is found, the command prints instructions to run `a365 setup blueprint --agent-name <name>`.
- Blueprint client secret is now printed to the terminal at creation time with a "copy this value now" warning. Use `a365 setup blueprint --show-secret` to retrieve it afterwards.
Expand All @@ -49,6 +51,11 @@ Agents provisioned before this release need `Agent365.Observability.OtelWrite` g
- `--secret-lifetime-months <N>` option (and matching `secretLifetimeMonths` field in the `--input-file` JSON) on `develop-mcp register-external-mcp-server` — controls the lifetime of the client secrets created on the A365Proxy and RemoteProxy Entra apps. Valid range `1-24`; omit to use the Graph default (~2 years). Calendar-aware (uses `DateTimeOffset.AddMonths`, so Jan 31 + 1 month → Feb 28/29). Added so tenants with an `appManagementPolicies` cap on client-secret lifetime — previously a hard failure inside `CreateEntraAppsAsync` with a generic "Failed to create secret" message — can fit registration inside their tenant's policy. When Graph rejects the requested (or default) lifetime with a tenant-policy error, the CLI now emits an actionable error naming the flag and the attempted value (e.g. `Tenant Entra ID policy rejected the requested 12-month lifetime ... Pass --secret-lifetime-months N with a smaller value (e.g. --secret-lifetime-months 3) that fits inside your tenant's appManagementPolicies cap.`) instead of the previous generic failure.

### Fixed
- `setup all` now exits silently on Ctrl+C instead of printing `ERROR: Setup failed: A task was canceled.` followed by a misleading partial summary.
- `setup all --m365` no longer fails with `AADSTS650053` because the Messaging Bot scope was hard-coded to scopes the resource SP does not publish (issue #429).
- `setup all` no longer fails with `AADSTS650053` for any drift between requested scopes and what a resource SP actually publishes (issue #429). Unpublished scopes are filtered out before building the consent URL; per-resource warnings surface what was dropped.
- `setup all` no longer fails with `AADSTS650052` ("organization lacks a service principal for ...") when a resource SP is missing in the tenant (issue #429). Setup now prompts per-resource and provisions the SP via `az ad sp create`. Declined / failed paths are surfaced in the Action Required block.
- `setup all` no longer fails with `AADSTS650053` or `AADSTS500011` for V2 MCP per-server audiences (issue #429). Scopes now route to the correct per-audience resource instead of collapsing onto the WorkIQ Tools URI.
- `setup all` admin-consent pre-check no longer opens the browser unnecessarily when consent already exists for every required scope. The pre-check now uses `az rest` (mirroring the post-consent polling path) because the CLI's MSAL token cannot read `/v1.0/oauth2PermissionGrants` after the removal of `DelegatedPermissionGrant.Read.All` from the CLI client app — previously every re-run returned empty from the Graph check and opened a browser that waited up to 180 seconds for a no-op consent. The pre-check also filters `consentType eq 'AllPrincipals'` so a leftover `Principal`-scoped grant from an earlier `--authmode obo` run cannot falsely satisfy the tenant-wide check. App IDs are validated as GUIDs before being interpolated into the OData filter.
- `setup all` summary row "Blueprint Permission Grants" no longer shows `granted` when admin consent could not be auto-verified. When the browser consent completed but the poll timed out, the row now shows "unverified" and an Action Required item with the re-grant URL is printed so the operator can verify or retry.
- Admin consent polling now uses `az rest` (the Azure CLI token) to detect grants, the same path used by `a365 query-entra`. Previously the poll used an MSAL delegated token that lacked `DelegatedPermissionGrant.Read.All`, causing every poll to return empty and the timeout to expire even when consent had already been granted.
Expand Down Expand Up @@ -93,9 +100,9 @@ Agents provisioned before this release need `Agent365.Observability.OtelWrite` g
- `setup blueprint --m365` now prints a note when passed alone — the flag only takes effect with `--endpoint-only` or `--update-endpoint`; otherwise use `setup all --m365`.
- Graph error bodies in `[DBG]` logs compressed to `{code}: {message}` instead of the full JSON envelope.
- `a365.config.json` and `a365.generated.config.json` are no longer mirrored into the machine-global config folder (`%LocalAppData%\Microsoft.Agents.A365.DevTools.Cli` on Windows, `~/.config/a365` on Linux/macOS). Config is read from and written to the project directory only. Cleanup commands no longer fall back to `a365.generated.config.json` in the global folder when no project-local copy exists. MSAL token caches, CLI logs, and the version/notice caches continue to live in the global folder.
- `setup all` and `setup permissions` now automatically execute the S2S app role assignment PowerShell script when a Global Administrator runs the CLI and the programmatic Graph API path fails (the delegated token does not carry `AppRoleAssignment.ReadWrite.All`). Requires PowerShell 7+ (`pwsh`) and the `Microsoft.Graph.Authentication` / `Microsoft.Graph.Applications` modules; run `a365 setup requirements` to check and auto-install. On success, the "Action Required: S2S app role (PowerShell)" block is suppressed. If the modules are missing, the block is still printed with a note to run `a365 setup requirements`.
- `setup all` and `setup permissions` now issue the S2S app role assignment and delegated `AllPrincipals` OAuth2 consent via `az rest` against the operator's existing `az login` session, replacing the previous `Connect-MgGraph` PowerShell fallback (issue #429). `pwsh` and the Microsoft.Graph PowerShell modules are no longer required for this path. The per-prompt `[y/N]` confirmation is unchanged.
- "Blueprint Permission Grants" row in the `setup all` summary now reports `already granted` (vs `granted`) when the run was fully idempotent — no new `oauth2PermissionGrant` was POSTed for delegated consent and no new `appRoleAssignment` was POSTed for S2S. Surfaces the distinction between "consent was captured in this run" and "consent existed before this run" so re-runs visually indicate no work was needed.
- PowerShell S2S fallback subprocess (`pwsh`) is now capped at 5 minutes and runs with `-ExecutionPolicy Bypass`. On Windows it also runs with `PSModulePath` / `DOTNET_ROOT` / `DOTNET_ROOT_X64` / `DOTNET_STARTUP_HOOKS` removed from the child environment to avoid `[Assembly with same name is already loaded]` failures when the parent dotnet host's paths collide with pwsh's own assembly resolution. The script pins the latest installed `Microsoft.Graph.Authentication` and `Microsoft.Graph.Applications` modules by absolute path and exits with code 2 when either module is missing.
- `setup all`: fallback consent prompt now explains the ambiguity (declined vs error) and offers to grant the permissions programmatically via `az login`; per-audience permission lists and consent URLs show per-server MCP names (e.g. `mcp_MailTools`) instead of a generic `Agent 365 Tools` for every audience.

### Removed
- `a365 config` command family (`config init`, `config display`, `config permissions`) — replaced by `a365 setup all --agent-name` and `a365 setup permissions custom`.
Expand Down
Loading
Loading