
Add LLM Gateway diagnostics surface (BYO config troubleshooting)#1105

Merged: denispetre merged 1 commit into main from chore/llmgw-diagnostics on May 28, 2026

@denispetre


Summary

Adds Diagnose-mode coverage for the LLM Gateway / BYO LLM slice. The slice currently scores 30/20/10% (Build/Operate/Diagnose) on the Coding Agents Scorecard — Build and Operate are measurement artifacts (the previous BYO move already restored the underlying surface), but Diagnose is genuinely missing. This PR addresses the Diagnose gap.

Two complementary moves

1. `uipath-platform` — Diagnostics section on byo-connections.md

  • New `## Diagnostics` section between Validation and Typical Flow.
  • Four recipes: re-probe a failing BYO config (`get --force-refresh` / idempotent `update`), tenant-wide audit of dead connections (`list --include-connection-details` filtered on `connectionState != Enabled`), catalog-drift detection, AI Trust Layer policy override check.
  • Frontmatter `description` adds `audit / re-probe / troubleshoot` verbs (867/1024 chars).
  • Frontmatter `when_to_use` adds three diagnostic trigger phrases.
  • Body "When to Use This Skill" gains a dedicated LLM Gateway diagnostic bullet.
  • Task Navigation gets a "Diagnose / audit / re-probe a BYO LLM configuration" row pointing at the new `#diagnostics` anchor.
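The tenant-wide audit recipe boils down to "keep every connection whose state is not `Enabled`". A minimal Python sketch of that same selection, applied to already-parsed CLI output, is shown here for illustration. It assumes the results sit under a top-level `Data` key, as the filter expression quoted later in this PR implies; the `name` field, the `Disabled` value, and the sample rows are hypothetical.

```python
def dead_connections(payload: dict) -> list[dict]:
    """Return entries whose connectionState is anything other than 'Enabled'."""
    # Assumption: rows live under a top-level "Data" key.
    # Rows with no connectionState at all are also reported, which matches
    # how a "!=" comparison against a missing value behaves in JMESPath.
    return [
        row
        for row in payload.get("Data", [])
        if row.get("connectionState") != "Enabled"
    ]


# Hypothetical sample; real output may carry different fields and values.
sample = {
    "Data": [
        {"name": "openai-prod", "connectionState": "Enabled"},
        {"name": "azure-eu", "connectionState": "Disabled"},
        {"name": "bedrock-dev"},
    ]
}

for row in dead_connections(sample):
    print(row["name"], row.get("connectionState", "<missing>"))
```

Doing the selection client-side like this can be handy when the audit result feeds a script rather than a terminal, but it is only an equivalent of the CLI filter if the envelope assumption holds.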

2. `uipath-troubleshoot` — new `products/llm-gateway/` folder

This is the bigger lever per the scorecard methodology: `uipath-troubleshoot` is credited toward Diagnose for any product whose state is reachable via `uip`. LLM GW state is reachable (six CLI verbs read it), but the troubleshoot skill had no playbooks for it.

  • `overview.md` — service model, dependencies (IS, AI Trust Layer, agents), CLI surface, explicit "what the CLI does NOT expose" callout.
  • `summary.md` — three-playbook index.
  • `playbooks/byo-connection-dead.md` — high confidence; cross-links to the IS `connection-auth-expired` playbook for OAuth recovery.
  • `playbooks/validation-probe-failed.md` — medium; covers `isAvailable` / `isCompatible` / `isModelNameSimilar` / identifier mismatch / catalog drift.
  • `playbooks/byo-routing-bypassed.md` — medium; covers `enabled: false`, multi-mapping holes, AI Trust Layer policy override, identity mismatch.
  • Top-level router (`references/summary.md`) gains a `## LLM Gateway` section between Integration Service and UI Automation.
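As a rough illustration of how the `validation-probe-failed` playbook's flags could drive triage, the Python sketch below lists which of the three probe flags came back false. Where these flags appear in the CLI output is an assumption here (a flat dictionary), and the sample values are hypothetical; the playbook itself remains the source for what each failure means.

```python
PROBE_FLAGS = ("isAvailable", "isCompatible", "isModelNameSimilar")


def failed_probe_flags(probe: dict) -> list[str]:
    """Return the probe flags that are explicitly False, in a fixed order."""
    # Assumption: the flags are booleans on a flat dictionary.
    # Flags that are absent are not reported as failures.
    return [flag for flag in PROBE_FLAGS if probe.get(flag) is False]


# Hypothetical probe result.
probe = {"isAvailable": True, "isCompatible": False, "isModelNameSimilar": False}
print(failed_probe_flags(probe))
```

Reporting only explicit `False` values keeps a partially populated result from being mistaken for a failed probe.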

What this doesn't fix

The gateway does not expose per-request invocation logs via CLI. There is no `uip llm-configuration logs`, no historical probe-result query, no per-call routing audit. Both the overview and the playbooks document this structural ceiling explicitly — the CLI ask is now visible from the user-facing side.

Expected scorecard impact

  • LLM GW Build: unchanged (~95%) — surface was already fully covered.
  • LLM GW Operate: unchanged (~90%).
  • LLM GW Diagnose: ~25% → ~60% on next scorecard re-run (direct credit for diagnose verbs on `uipath-platform`, plus cross-product credit through `uipath-troubleshoot`).

Test plan

  • `hooks/validate-skill-descriptions.sh` passes.
  • All relative links in new/edited files resolve.
  • No existing tests modified.
  • Owner of `refresh-skills-coverage` re-runs the coverage analysis once this is merged.

🤖 Generated with Claude Code

Adds Diagnose-mode coverage for the LLM Gateway product slice:

- uipath-platform: new Diagnostics section in byo-connections.md with four
  recipes (re-probe a failing config, tenant-wide audit of dead connections,
  trace cross-reference, AI Trust Layer policy override check). Frontmatter
  description and when_to_use surface diagnose/audit/re-probe verbs; body
  bullet documents the workflow; Task Navigation has a dedicated row.

- uipath-troubleshoot: new products/llm-gateway/ folder with overview, summary,
  and three playbooks (byo-connection-dead high-confidence, validation-probe-
  failed medium, byo-routing-bypassed medium). Router updated to surface the
  new product between Integration Service and UI Automation.

The gateway does not expose per-request invocation logs via CLI — diagnosis is
limited to current state + trace evidence. Both the overview and the playbooks
name this structural ceiling explicitly so the CLI ask is documented.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions

github-actions Bot commented May 28, 2026


Claude finished @denispetre's task in 3m 43s


PR Review: Add LLM Gateway diagnostics surface

  • Read review criteria and project conventions
  • Read full diff and existing PR conversation
  • Review each changed file in full context
  • Check skill-specific requirements (links, CODEOWNERS, cross-refs, secrets)
  • Post review findings

Summary

Adds Diagnose-mode coverage for LLM Gateway / BYO LLM across two skills: a `## Diagnostics` section in `uipath-platform`'s `byo-connections.md` with four quick recipes, and a full `products/llm-gateway/` folder in `uipath-troubleshoot` with three structured playbooks, an overview, and a summary index. Clean execution: well-structured, prescriptive, and properly integrated into the existing routing.

Change-by-Change Review

1. skills/uipath-platform/SKILL.md (+4/−2)

Severity: OK

Frontmatter `description` adds `audit / re-probe / troubleshoot` verbs (884 chars, well under the 1024 cap). `when_to_use` adds three diagnostic trigger phrases. Body gains a diagnostic bullet and a Task Navigation row pointing at the new `#diagnostics` anchor. All changes are scoped and correct.
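For illustration, the headroom against the 1024-character cap can be computed with a few lines of Python. This is a sketch only and not the implementation of `hooks/validate-skill-descriptions.sh`; it assumes the cap counts Unicode characters rather than bytes, and the function name is hypothetical.

```python
DESCRIPTION_CAP = 1024  # cap cited in this review


def description_headroom(description: str, cap: int = DESCRIPTION_CAP) -> int:
    """Return how many characters remain before the cap (negative if over)."""
    # Assumption: length is measured in characters, not encoded bytes.
    return cap - len(description)


# Placeholder text with the character count reported in this review.
print(description_headroom("x" * 884))
```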

2. skills/uipath-platform/references/llmgateway/byo-connections.md (+63)

Severity: OK

New `## Diagnostics` section inserted between Validation and Typical Flow. Four recipes: re-probe (`get --force-refresh`), idempotent `update` re-validation, catalog-drift detection (`list-product-configs`), AI Trust Layer policy check. Plus tenant audit, trace cross-reference, and an explicit "what CLI does NOT expose" callout. Commands include `--output json` consistently. Plugin cross-reference to `uipath-governance` uses correct `/uipath:` notation.

3. skills/uipath-troubleshoot/references/products/llm-gateway/overview.md (+64)

Severity: Low

Two code blocks missing language identifiers:

  • Line 9: Organization Model diagram uses bare ``` — should be ```text per content-quality rules (ASCII diagrams get text).
  • Line 39: CLI summary uses bare ``` — should be ```text (this is a reference table, not executable bash).

Additionally, line 52: `uip llm-configuration byo-connections update <id> --<same-fields>` is the only entry in the "Key commands for troubleshooting" list that omits `--output json`, while all five other entries include it. Minor inconsistency.
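Bare opening fences like the two flagged above can be found mechanically. The Python sketch below is a simplified check, not the repo's actual content-quality tooling: it only understands three-backtick fences (no tildes, no longer fences), and the sample document is hypothetical.

```python
FENCE = "`" * 3  # built at runtime so this example nests cleanly in Markdown


def bare_fence_lines(markdown: str) -> list[int]:
    """Return 1-based line numbers of opening fences with no language tag."""
    bare = []
    inside = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if inside:
            # Only a fence with no trailing text closes the block.
            if stripped == FENCE:
                inside = False
            continue
        if stripped.startswith(FENCE):
            inside = True
            if stripped == FENCE:  # opening fence without an info string
                bare.append(number)
    return bare


# Hypothetical document: one bare block, one tagged block.
doc = "\n".join([
    "# Overview",
    FENCE,
    "Org -> Tenant",
    FENCE,
    "",
    FENCE + "bash",
    "echo ok",
    FENCE,
])
print(bare_fence_lines(doc))
```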


4. skills/uipath-troubleshoot/references/products/llm-gateway/playbooks/byo-connection-dead.md (+59)

Severity: Medium

Well-structured playbook (Context → Investigation → Resolution). Cross-links to IS connection-auth-expired playbook correctly (link target verified).

One issue, at line 59: the periodic audit command is missing `--output json`:

`uip llm-configuration byo-connections list --include-connection-details --output-filter "Data[?connectionState!='Enabled']"`

The equivalent command in `byo-connections.md:84-85` correctly includes `--output json` before `--output-filter`. `--output-filter` applies a JMESPath expression to JSON output; without `--output json`, the behavior is undefined or, at minimum, inconsistent with the repo convention that `--output json` is always explicit when output is parsed.
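The convention described above lends itself to a simple lint. The following Python sketch flags any command string that uses `--output-filter` without an explicit `--output json`. It is an illustration under stated assumptions: the function name is hypothetical, and only the space-separated `--output json` form is recognized.

```python
import shlex


def missing_json_output(command: str) -> bool:
    """True when a command uses --output-filter without '--output json'."""
    tokens = shlex.split(command)
    if "--output-filter" not in tokens:
        return False
    # Look for the two-token sequence "--output", "json".
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "--output" and value == "json":
            return False
    return True


FILTER = "\"Data[?connectionState!='Enabled']\""
BASE = "uip llm-configuration byo-connections list --include-connection-details"

flagged = f"{BASE} --output-filter {FILTER}"
fixed = f"{BASE} --output json --output-filter {FILTER}"

print(missing_json_output(flagged), missing_json_output(fixed))
```

Using `shlex.split` keeps the quoted JMESPath expression as a single token, so a stray `json` inside a filter would not be mistaken for the output flag's value.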


5. skills/uipath-troubleshoot/references/products/llm-gateway/playbooks/validation-probe-failed.md (+54)

Severity: OK

Clean medium-confidence playbook. Covers `isAvailable` / `isCompatible` / `isModelNameSimilar` failure modes with distinct investigation paths. Cross-links to `byo-connection-dead.md` for auth-related probe failures. All commands include `--output json`.

6. skills/uipath-troubleshoot/references/products/llm-gateway/playbooks/byo-routing-bypassed.md (+86)

Severity: OK

The most comprehensive playbook — covers enabled: false, incomplete multi-mapping, AI Trust Layer override, and identity mismatch. Five investigation steps with clear branching logic. Resolution section covers each root cause. Correctly identifies the structural ceiling (no per-request logs) and routes to support ticket.

7. skills/uipath-troubleshoot/references/products/llm-gateway/summary.md (+9)

Severity: OK

Clean index. Links to overview and all three playbooks. Confidence levels match playbook frontmatter. All links verified.

8. skills/uipath-troubleshoot/references/summary.md (+9)

Severity: OK

New ## LLM Gateway section inserted between Integration Service and UI Automation — alphabetically consistent with existing ordering. Description correctly notes the CLI limitation. All links verified.

What's Missing

  • CODEOWNERS for `llm-gateway/` subfolder under troubleshoot: The existing `/skills/uipath-troubleshoot/` entry covers the new files, but the `uipath-platform` side has a dedicated CODEOWNERS line for `/skills/uipath-platform/references/llmgateway/` with the LLM Gateway team (@denispetre @vstoleru-uipath @dragosvelcea). Consider adding `/skills/uipath-troubleshoot/references/products/llm-gateway/` with the same owners so future changes to these playbooks route to domain experts. (Low: the parent entry provides coverage.)

Area Ratings

| Area | Status | Notes |
| --- | --- | --- |
| Frontmatter | OK | Description 884/1024 chars, valid YAML, diagnostic verbs added |
| E2E Tests | N/A | No new skill added; existing tests untouched; diagnostic content is reference material |
| Skill Body | OK | `uipath-platform` SKILL.md changes are scoped and correct |
| References & Assets | Minor issues | Missing `--output json` on one audit command; two code blocks without language identifiers |
| Repo Hygiene | OK | No secrets, no cross-skill file refs, all links resolve, no unrelated changes |

Issues for Manual Review

  • CLI command accuracy: The `uip gov aops-policy deployed-policy resolve` command appears in multiple files. Verify that the `--product AITrustLayer --license-type <type> --tenant <name>` flags match the current CLI surface; this command is outside the `llm-configuration` family and may have different flag semantics.
  • `--output-filter` JMESPath expressions: The `Data[?connectionState!='Enabled']` and `Data[?product=='...' && operationGroupName=='...']` filters assume a specific JSON envelope shape. Worth confirming against current CLI output.
  • `uip is connections edit` verb (`byo-connection-dead.md:47`): Verify this is the current verb for re-authenticating an IS connection; some CLI versions may use `update` instead of `edit`.
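One way to confirm the envelope assumption behind the filter expressions is to check a captured JSON payload for the fields they reference before trusting the filter result. The Python sketch below does that. The function name and sample payloads are hypothetical, and the top-level `Data` list is exactly the assumption under test, not a confirmed property of the CLI.

```python
def envelope_problems(payload: object, fields: list[str]) -> list[str]:
    """Describe how a payload deviates from the shape the filters assume."""
    data = payload.get("Data") if isinstance(payload, dict) else None
    if not isinstance(data, list):
        return ["top-level 'Data' list is missing"]
    problems = []
    for index, row in enumerate(data):
        for field in fields:
            if not isinstance(row, dict) or field not in row:
                problems.append(f"Data[{index}] lacks '{field}'")
    return problems


# Hypothetical payloads: one matching the assumed envelope, one not.
ok = {"Data": [{"connectionState": "Enabled"}]}
partial = {"Data": [{"connectionState": "Enabled"}, {"name": "x"}]}
other = {"items": []}

print(envelope_problems(ok, ["connectionState"]))
print(envelope_problems(partial, ["connectionState"]))
print(envelope_problems(other, ["connectionState"]))
```

An empty result means the filter's field references would resolve on every row; any message points at a row where the filter would silently compare against a missing value.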

Conclusion

Clean, well-structured addition. The two-skill approach (platform-side quick recipes plus troubleshoot-side structured playbooks) is the right design: each skill is self-contained and serves its audience. One Medium fix needed (missing `--output json` on the audit command in `byo-connection-dead.md:59`) and two Low polish items (code block language identifiers and a minor `--output json` inconsistency in `overview.md`). Otherwise ready to merge.

@vstoleru-uipath vstoleru-uipath left a comment


LGTM

@denispetre denispetre merged commit 79485c4 into main May 28, 2026
18 checks passed
@denispetre denispetre deleted the chore/llmgw-diagnostics branch May 28, 2026 13:18

4 participants