fix(case-management): updated sdd phase 0 interview to ask for task/connector inputs by abhiram-vad · Pull Request #1166 · UiPath/skills

abhiram-vad · 2026-06-01T05:36:33Z

No description provided.

github-actions · 2026-06-01T05:36:47Z

Claude finished @abhiram-vad's task in 2m 59s

Code Review: fix(case-management): updated sdd phase 0 interview to ask for task/connector inputs

Summary

Refactors two Phase 0 → case test tasks (candidate-interview, loan-origination) from full e2e (Phase 0 + build + validate) to Phase 0-only integration tests that stop at sdd.md approval. Adds a new sdd_check.py mechanical checker for SDD markdown, updates the skill's phase-0-interview.md with "Buildability musts" and "Schema discovery" sections, and expands sdd-generation-rules.md with ExceptionStage entry-condition distinctness and Interrupting-flag rules. Also hardens case_check.py: iter_tasks now tolerates flat data.tasks arrays, while a new assert_tasks_nested guard rejects that shape explicitly.

Change-by-Change Review

1. `skills/uipath-maestro-case/references/phase-0-interview.md`

Severity: Medium

Good additions. The "Buildability musts" section (lines 137–145) and "Schema discovery" section (lines 308–329) add critical interview-time guidance that prevents unbuildable SDDs. The resource-resolution table (lines 260–268) and schema-discovery command table (lines 313–319) are well-structured for agent consumption.

Issue — Duplication with sdd-generation-rules.md: The "Buildability musts" item 1 (line 141) repeats nearly the full text of sdd-generation-rules.md §Logical integrity rule 5 (line 692 in the rules file). Both describe ExceptionStage entry-condition distinctness with the same examples (selected-stage-completed/selected-stage-exited, user-selected-stage, wait-for-connector), Interrupting semantics, and terminal vs. return exit modes. The interview file's item 1 should be a bridge sentence + link — e.g., "per lane, capture how it fires (see sdd-generation-rules.md § Logical integrity rule 5)" — not a parallel copy that will drift.

2. `skills/uipath-maestro-case/references/sdd-generation-rules.md`

Severity: OK

Solid expansion. Rule 5 now requires distinct entry conditions per ExceptionStage with the specific CASE_MGMT_SECONDARY_STAGE_ENTRY_RULES_DUPLICATE error code — actionable for the agent. New rule 6 classifies Interrupting semantics with the hard constraint that return-to-origin requires Interrupting: Yes. Finalization step 12 is updated to include the Interrupting-flag check. All internally consistent.
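The two rules can be sketched as a small check. This is a minimal illustration under stated assumptions, not the code in sdd_check.py: the `ExceptionLane` record, its field names, and the example condition strings are hypothetical stand-ins for however the SDD actually represents an ExceptionStage.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ExceptionLane:
    """Hypothetical record for one ExceptionStage described in an SDD."""
    name: str
    entry_condition: str  # illustrative string, e.g. "user-selected-stage"
    interrupting: bool    # the SDD's "Interrupting: Yes/No" flag
    exit_mode: str        # "terminal" or "return" (assumed vocabulary)


def check_exception_lanes(lanes: list[ExceptionLane]) -> list[str]:
    """Return readable violations of the two rules described in the review."""
    problems: list[str] = []

    # Rule 5: every ExceptionStage needs its own, distinct entry condition.
    by_condition: dict[str, list[str]] = defaultdict(list)
    for lane in lanes:
        by_condition[lane.entry_condition].append(lane.name)
    for condition, names in by_condition.items():
        if len(names) > 1:
            problems.append(
                f"duplicate entry condition {condition!r}: {', '.join(names)}"
            )

    # Rule 6: a lane that returns to its origin must be interrupting.
    for lane in lanes:
        if lane.exit_mode == "return" and not lane.interrupting:
            problems.append(
                f"{lane.name}: return-to-origin requires Interrupting: Yes"
            )

    return problems
```

Running it on two lanes that share an entry condition, one of which returns without interrupting, would report both problems, which is roughly the situation the CASE_MGMT_SECONDARY_STAGE_ENTRY_RULES_DUPLICATE error guards against at build time.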

3. `tests/tasks/uipath-maestro-case/_shared/sdd_check.py` (new file)

Severity: Low

Well-structured mechanical checker covering 6 checks (mapping integrity, lineage closure, task-type enum, per-gate rule legality, conditions presence, interrupting semantics). The rule-type sets correctly mirror the schema contract. A few minor observations:

Line 96–98: The lineage-closure producer regex r"\b([A-Za-z]\w*)\s*=\s*(?!=)" is broad: it will match any assignment-like pattern in markdown prose (e.g., Marks Stage Complete = Yes). This could produce false passes, where a variable is counted as "produced" because of a prose match and a real lineage gap goes unreported. Acceptable for a v1 heuristic, but worth monitoring.
Line 160: set(cells[0]) <= set("-: ") — clever header-row filter, but would also match cells like :-- (markdown alignment). Fine in practice since those are header separators anyway.
Line 215: The < 3 stage-section floor is reasonable but fragile — a 2-stage SDD is technically valid in the platform even if unusual. Low-priority.
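The first two observations are easy to reproduce in isolation. The snippet below uses only the regex and the set comparison quoted above; the variable name `riskScore` and the helper `is_separator_cell` are made up for illustration, and how sdd_check.py consumes the results is not shown here.

```python
import re

# Producer pattern as quoted in the review of sdd_check.py.
PRODUCER = re.compile(r"\b([A-Za-z]\w*)\s*=\s*(?!=)")

# A plausible assignment (hypothetical variable name) is captured.
assignment_hits = PRODUCER.findall("riskScore = 42")

# A prose fragment is captured too: the word just before "=" is taken.
prose_hits = PRODUCER.findall("Marks Stage Complete = Yes")

# An equality comparison is skipped because of the (?!=) lookahead.
comparison_hits = PRODUCER.findall("riskScore == 42")


def is_separator_cell(cell: str) -> bool:
    """Set comparison quoted in the review: only '-', ':' and spaces allowed."""
    return set(cell) <= set("-: ")


print(assignment_hits)           # ['riskScore']
print(prose_hits)                # ['Complete']
print(comparison_hits)           # []
print(is_separator_cell(":--"))  # True
print(is_separator_cell("Stage"))  # False
```

One side effect of the set comparison is that an empty cell also passes, since the empty set is a subset of any set. Whether that matters depends on how the checker treats empty first cells.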

4. `tests/tasks/uipath-maestro-case/_shared/case_check.py`

Severity: OK

iter_tasks now tolerates flat (mis-nested) Task[] arrays by yielding dict entries directly, while the new assert_tasks_nested function rejects that shape explicitly when called. Good defensive separation — callers that just need to iterate tolerate the shape; callers that need correctness assert it. Error message is clear and actionable.
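The tolerate-versus-assert split might look roughly like the sketch below. It is not the code from case_check.py: the signatures are guesses, and it assumes the correct shape of `data.tasks` is a list of lists of task dicts, with the flat shape being a plain list of dicts.

```python
from typing import Any, Iterable, Iterator

Stage = dict[str, Any]  # assumed: a stage node carrying data.tasks


def iter_tasks(stages: Iterable[Stage]) -> Iterator[dict[str, Any]]:
    """Yield every task dict, tolerating a flat (mis-nested) tasks array."""
    for stage in stages:
        for entry in stage.get("data", {}).get("tasks") or []:
            if isinstance(entry, dict):
                yield entry  # flat shape: the entry is itself a task
            elif isinstance(entry, list):
                yield from (task for task in entry if isinstance(task, dict))


def assert_tasks_nested(stages: Iterable[Stage]) -> None:
    """Raise if any stage stores its tasks as a flat list of dicts."""
    for stage in stages:
        for index, entry in enumerate(stage.get("data", {}).get("tasks") or []):
            if not isinstance(entry, list):
                raise AssertionError(
                    f"stage {stage.get('id', '?')!r}: data.tasks[{index}] is "
                    f"{type(entry).__name__}, expected a nested list of tasks"
                )
```

With this split, a checker that only needs to walk tasks keeps working on a malformed caseplan, while a checker whose verdict depends on structure can call the assertion first and fail with a pointed message.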

5. `tests/tasks/uipath-maestro-case/_shared/rule_variable_check.py`

Severity: OK

Clean addition — imports assert_tasks_nested and calls it at the top of main(). Correct integration.

6. `tests/tasks/uipath-maestro-case/phase_0_to_case/candidate_interview/candidate_interview.yaml`

Severity: Medium

Tag changed from e2e → integration: appropriate since the test now stops at sdd.md and doesn't build/validate a caseplan.
max_iterations: 1 and reduced max_turns/turn_timeout: sensible for a Phase 0-only run.
Success criteria replaced: validate and rule_variable_check.py criteria replaced with sdd_check.py + llm_judge + grep checks. Good split of mechanical vs. domain coherence.
The llm_judge prompt is detailed and well-structured with specific scoring criteria.

Issue — Orphaned check_candidate_interview.py: The file tests/tasks/uipath-maestro-case/phase_0_to_case/candidate_interview/check_candidate_interview.py still exists on disk but is no longer referenced by any success criterion. It should be deleted in this PR to avoid confusion.

Issue — sdd.md path expects cwd: The file_exists path is sdd.md (cwd-relative), while the old tests used CandidateInterview/CandidateInterview/sdd.md. The initial_prompt still says Name the solution and case project "CandidateInterview" — if the skill creates a solution directory structure, the sdd.md would land inside that directory, not cwd. However, the prompt now also says "no solution, no caseplan" so this is likely intentional. Just worth confirming the Phase 0 skill writes sdd.md to cwd when no solution exists.

7. `tests/tasks/uipath-maestro-case/phase_0_to_case/loan_origination/check_loan_origination.py`

Severity: OK

The terminal-exception case-exit check (lines 199–222) is corrected: it no longer requires marksCaseComplete=true for terminal exceptions, correctly recognizing that selected-stage-* case-exits use marksCaseComplete=false per platform semantics. The assert_tasks_nested guard is added at the top of main(). The error message on line 219–222 is improved with clearer guidance.

8. `tests/tasks/uipath-maestro-case/phase_0_to_case/loan_origination/loan_origination.yaml`

Severity: Medium

Same structural changes as candidate_interview.yaml — e2e → integration, Phase 0-only scope, sdd_check.py + llm_judge criteria. Same observations apply (path consistency, good mechanical/domain split).

Note: check_loan_origination.py is still present and modified in this PR (not orphaned) — it's used by other tests or future Phase 0 → build tests. However, it's no longer referenced from loan_origination.yaml, so if no other task YAML references it, it may become orphaned too.
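Whether a checker script is still referenced can be verified mechanically. The helper below is a rough sketch, not part of the repo: `find_unreferenced_checkers` is a hypothetical name, it only looks at `*.yaml` files, and it treats any occurrence of the script's file name in a YAML (even in a comment) as a reference.

```python
from pathlib import Path


def find_unreferenced_checkers(tasks_root: Path) -> list[Path]:
    """List check_*.py files whose file name appears in no task YAML."""
    # Concatenate all task YAML text once, then do plain substring lookups.
    yaml_text = "\n".join(
        path.read_text(encoding="utf-8")
        for path in sorted(tasks_root.rglob("*.yaml"))
    )
    return sorted(
        script
        for script in tasks_root.rglob("check_*.py")
        if script.name not in yaml_text
    )
```

Calling it with `Path("tests/tasks/uipath-maestro-case")` from the repository root would list candidates for deletion. Scripts imported by other Python files rather than named in YAML would need a separate check.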

What's Missing

Orphaned check_candidate_interview.py: Should be deleted — no YAML references it after this PR.
Passing-run claim: Per .claude/rules/test-writing.md §Workflow step 5, the PR should include a passing-run claim for the modified tasks. Not present in the PR description.
PR description: The PR body says "No description provided." Given the scope of changes (skill reference updates + test strategy rewrite + new shared checker), a description explaining the motivation would help reviewers.

Area Ratings

| Area | Status | Notes |
| --- | --- | --- |
| Frontmatter | OK | No skill SKILL.md changes; YAML frontmatter in tasks is valid |
| E2E Tests | OK | `e2e` tests exist elsewhere; downgrade to `integration` is appropriate for Phase 0-only |
| Skill Body | OK | Reference updates are accurate and well-structured |
| References & Assets | Medium | Content duplication between phase-0-interview.md buildability-musts item 1 and sdd-generation-rules.md rule 5 |
| Repo Hygiene | Medium | Orphaned `check_candidate_interview.py` should be removed |

Issues for Manual Review

Confirm that sdd_check.py's lineage-closure regex (line 96–98) doesn't produce false passes on real SDD outputs: the \b([A-Za-z]\w*)\s*=\s*(?!=) pattern may match markdown prose assignments and count them as producers.
Confirm that Phase 0 writes sdd.md to cwd (not inside a solution subdirectory) when the prompt says "no solution, no caseplan" — the success criteria paths depend on this.
The llm_judge criterion with pass_threshold: 1.0 requires a perfect 1.0 score — verify this is intentional vs. allowing 0.5 for "mostly coherent with minor gaps."

Conclusion

Good refactoring that isolates Phase 0 testing from the full build pipeline and adds a solid mechanical SDD checker. Two actionable items: (1) delete the orphaned check_candidate_interview.py, and (2) consider bridging the duplicated ExceptionStage entry-condition content in phase-0-interview.md to avoid drift from the authoritative copy in sdd-generation-rules.md. The check_loan_origination.py fix for terminal-exception case-exit semantics (marksCaseComplete=false) is a genuine bug fix: the old check wrongly failed valid terminal-exception case-exits.

github-actions · 2026-06-01T05:36:47Z

Claude finished @abhiram-vad's task in 2m 53s

Coder-eval task lint (advisory)

2 task YAMLs changed (both modified); verdict counts: 0 Critical, 0 High, 0 Medium, 2 Low, 0 OK.

Rubric: .claude/commands/lint-task.md. This check is advisory and never blocks merge.

Evidence of passing run

❌ High — PR body does not claim the changed tasks have been run and passed. Please edit the PR description to add a line like:

Ran skill-case-phase-0-candidate-interview and skill-case-phase-0-loan-origination locally and they passed.

Per-task lint

`tests/tasks/uipath-maestro-case/phase_0_to_case/candidate_interview/candidate_interview.yaml` — verdict: Low

Issues:

[Low] Could pass for the wrong reason (lines 79–149): no skill_triggered or command_executed criterion verifies the agent actually loaded the uipath-maestro-case skill. The sdd_check.py mechanical check is robust enough to make gaming hard, but a sufficiently capable LLM could theoretically produce a valid SDD without ever invoking the skill. (See Theme 1.)

Suggested fixes:

Add a skill_triggered criterion: skill_triggered: uipath-maestro-case, weight ~1.0. This confirms the agent activated the skill, not just produced a plausible artifact independently.

`tests/tasks/uipath-maestro-case/phase_0_to_case/loan_origination/loan_origination.yaml` — verdict: Low

Issues:

[Low] Could pass for the wrong reason (lines 73–143): same as candidate_interview — no criterion verifies skill invocation. (See Theme 1.)

Suggested fixes:

Same fix: add skill_triggered: uipath-maestro-case.

Within-PR duplicates

No duplicate clusters detected. Both tasks share a YAML scaffold but exercise materially different domains (hiring pipeline vs. commercial loan origination) with distinct stages, exception lanes, conditional gates, and domain-specific llm_judge prompts. This is good scaffold reuse.

Themes

Theme 1 (Low): Both tasks lack a skill_triggered criterion. The sdd_check.py mechanical check (variable mapping, lineage, rule legality, task-type enum, interrupting semantics) raises the bar high enough that gaming is unlikely, but confirming skill activation is cheap insurance.

Conclusion

⚠ 2 task(s) have issues, max severity Low (theme-captured). Evidence of passing run is missing (High). Advisory only — not blocking merge.

Notes on the changes themselves: The refactoring from e2e (full Phase 0 → build → validate) to integration (Phase 0 only: interview → sdd.md → approve) is well-scoped. The new sdd_check.py shared helper is a strong deterministic validator. The llm_judge at weight 3.0 (31.6% of total) is well-balanced against the equally-weighted mechanical check. Run limits are properly placed under run_limits:, not agent:. No sandbox or CLI verb issues.
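The 31.6% figure can be sanity-checked with a one-line share calculation. The total weight of 9.5 used below is an inference from that percentage, not a number stated in the task YAMLs, and the split of the remaining weight across the grep and file checks is unknown.

```python
def weight_share(weight: float, total: float) -> float:
    """Percentage of the total score carried by one criterion."""
    return 100.0 * weight / total


JUDGE_WEIGHT = 3.0   # stated in the lint comment
IMPLIED_TOTAL = 9.5  # assumption: back-solved from "31.6% of total"

judge_share = round(weight_share(JUDGE_WEIGHT, IMPLIED_TOTAL), 1)
# With an equally weighted mechanical check, the two together carry:
combined_share = round(weight_share(2 * JUDGE_WEIGHT, IMPLIED_TOTAL), 1)

print(judge_share)     # 31.6
print(combined_share)  # 63.2
```

Under that assumption, the remaining criteria would share a weight of 3.5 between them.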

song-zhao-25

Stamp it, and please merge after verifying the sdd output

abhiram-vad · 2026-06-01T21:25:50Z

Screen.Recording.2026-06-01.at.12.40.05.PM.mp4

updated sdd phase 0 interview to ask for task/connector inputs

f3eb559

abhiram-vad requested review from charlesliu9, jundayin and song-zhao-25 as code owners June 1, 2026 05:36

song-zhao-25 approved these changes Jun 1, 2026

charlesliu9 approved these changes Jun 1, 2026

abhiram-vad merged commit 41751cc into main Jun 2, 2026
16 checks passed

abhiram-vad deleted the fix/case-sdd-interview branch June 2, 2026 19:19

abhiram-vad merged 1 commit into main from fix/case-sdd-interview
