fix(hitl): align HITL skill and tests to v1.0 node schema #529

Merged
dushyant-uipath merged 1 commit into main from fix/hitl-v1-hitl-skill
May 3, 2026

Conversation

@dushyant-uipath
Collaborator

Summary

  • output not result: Updates all skill docs and eval tests to use $vars.<nodeId>.output (v1.0 renamed the outputId from result to output)
  • cancelled/timeout handles removed: Cleaned up all references to non-existent handles; v1.0 exposes only completed
  • fieldId vs variable rule documented: Added Critical Rule 9 to SKILL.md and a callout in hitl-node-quickform.md — output object keys are the field id, not field.variable
  • 4 new developer-mistake quality tests guarding the most common HITL authoring errors:
    • quality_08: agent uses field ID (not variable name) for output access
    • quality_09: agent infers correct types (boolean/number/date) instead of defaulting to text
    • quality_10: binding on input fields only — output fields must have no binding
    • quality_11: inOut fields are pre-filled via binding AND appear in $vars.<nodeId>.output after submit
  • TEST_PLAN.md rewritten with v1.0 invariants table, variable binding accuracy section, real-world consequence coverage map

Files changed

All files are under skills/uipath-human-in-the-loop/ and tests/tasks/uipath-human-in-the-loop/ (Dushyant-owned).

Test plan

  • quality_08 — ContractReview: field ID vs variable name access
  • quality_09 — ExpenseReview: boolean/number/date type inference
  • quality_10 — SupplierOnboarding: binding direction correctness
  • quality_11 — RCAReview: inOut pre-fill + downstream output access
  • Existing quality_04, quality_07, e2e_01, e2e_03 pass with updated .output paths

@github-actions
Contributor

github-actions Bot commented May 3, 2026

Claude finished @dushyant-uipath's task in 2m 58s.


Summary

Aligns the HITL skill docs and eval tests to the v1.0 node schema: renames result to output, removes cancelled/timeout handles, documents the fieldId vs variable distinction (Critical Rule 9), and adds 4 new quality tests guarding common developer mistakes. Also updates cross-file links to the new shared/ paths in uipath-maestro-flow.

Change-by-Change Review

1. skills/uipath-human-in-the-loop/SKILL.md

Severity: OK
Two clean changes: adds Critical Rule 9 (fieldId not variable for output access) and updates the reporting template from .result to .output. Rule 9 is well-written and addresses a real agent mistake pattern.

2. skills/uipath-human-in-the-loop/references/hitl-node-quickform.md

Severity: OK
Comprehensive v1.0 alignment: all result-to-output changes are correct. Cross-file links updated to new shared/ paths (cli-commands.md, variables-and-expressions.md); verified both targets exist. Definition entry updated: version: "1.0", shape: "square", cancelled/timeout handles removed, input handle constraints removed, outputDefinition keys corrected. The fieldId callout and runtime variables table are accurate and well-placed. The binding rule row correctly generalizes HITL outputId to output for v1.0.

3. skills/uipath-human-in-the-loop/references/hitl-node-apptask.md

Severity: OK
Edge wiring correctly updated to completed-only with clear "no cancelled or timeout handles in v1.0" note. Variables table and status description updated. Clean.

4. skills/uipath-human-in-the-loop/references/hitl-node-coded-action-app.md

Severity: High
The result-to-output rename and the status description were correctly updated, but line 323 still reads:

wire completed, cancelled, timeout

This contradicts v1.0 (only completed available) and conflicts with the identical section updated in hitl-node-apptask.md:348. An agent following this reference will attempt to wire non-existent handles.

Fix: Change line 323 from:

**Edge wiring** — wire `completed`, `cancelled`, `timeout`. See [hitl-node-quickform.md](...) for edge format.

to:

**Edge wiring** — wire `completed` (only handle available in v1.0). See [hitl-node-quickform.md](...) for edge format.
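
A stale-handle check like the one this finding calls for can be sketched in Python. This is a minimal sketch, not part of the PR: the function name `stale_handle_lines` and the choice to scan only lines containing "Edge wiring" are assumptions made for illustration. Scanning only those lines avoids flagging legitimate mentions of the timeout duration property.

```python
# Handles that the review says no longer exist in v1.0.
REMOVED_HANDLES = ("cancelled", "timeout")


def stale_handle_lines(markdown_text: str) -> list:
    """Return (line_number, handles) for edge-wiring lines that still
    mention a removed handle as inline code, e.g. `cancelled`."""
    hits = []
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        if "Edge wiring" not in line:
            continue  # assumption: only edge-wiring lines are checked
        found = [h for h in REMOVED_HANDLES if f"`{h}`" in line]
        if found:
            hits.append((number, found))
    return hits


before = "**Edge wiring**: wire `completed`, `cancelled`, `timeout`."
after = "**Edge wiring**: wire `completed` (only handle available in v1.0)."
print(stale_handle_lines(before))
print(stale_handle_lines(after))
```

Run against each reference file, a non-empty result would point at the exact line to fix.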


5. tests/tasks/uipath-human-in-the-loop/TEST_PLAN.md

Severity: OK
Thorough rewrite: v1.0 invariants table, variable binding accuracy section, coverage map with real-world consequences, gap priority updates. Correctly removes the quality_08_timeout_handle gap (timeout handle no longer exists in v1.0). Status updated from Draft to Active.

6. tests/tasks/uipath-human-in-the-loop/e2e_01_invoice_approval_greenfield.yaml

Severity: OK
Clean .result to .output update in success criteria.

7. tests/tasks/uipath-human-in-the-loop/e2e_03_gdpr_compliance_greenfield.yaml

Severity: OK
Correctly updated: description no longer mentions timeout handle wiring, handles_wired hardcoded to ["completed"], file_contains check changed from "timeout" to "completed". The timeout_configured field (ISO 8601 duration on node input) is correctly preserved — it's the timeout duration property, not the handle.
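
To illustrate why the duration property survives while the handle does not, here is a small Python sketch that validates an ISO 8601 duration string. It is illustrative only: the regular expression is a simplified pattern that covers common forms such as PT24H or P1D, and the helper name `is_iso8601_duration` is an assumption, not something the skill or the test harness defines.

```python
import re

# Simplified ISO 8601 duration pattern (assumption: covers the common
# PnYnMnWnDTnHnMnS forms, not every variant in the standard).
DURATION = re.compile(
    r"P(?!$)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?"
    r"(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?"
)


def is_iso8601_duration(text: str) -> bool:
    """True if text looks like a duration value such as 'PT24H'."""
    return DURATION.fullmatch(text) is not None


print(is_iso8601_duration("PT24H"))    # a duration value on the node input
print(is_iso8601_duration("timeout"))  # a handle name, not a duration
```

The point of the contrast: a value like PT24H configures how long the task may wait, while "timeout" as a handle name referred to an outgoing edge that v1.0 does not expose.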

8. tests/tasks/uipath-human-in-the-loop/quality_04_all_handles.yaml

Severity: OK
All .result to .output changes are correct and consistent.

9. tests/tasks/uipath-human-in-the-loop/quality_07_runtime_vars.yaml

Severity: OK
All .result to .output changes are correct and consistent.

10. tests/tasks/uipath-human-in-the-loop/quality_08_variable_binding_fieldid.yaml (new)

Severity: OK
Well-designed test: forces a field ID vs variable name distinction by giving id: "approved" / variable: "legalApproval" and validating the agent accesses .output.approved not .output.legalApproval. Includes both positive (file_contains with .output.approved) and negative (excludes with .output.legalApproval) checks. task_id, tags, and structure all follow conventions.
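
The distinction this test enforces can be sketched in a few lines of Python. This is an illustration only: the field dictionary shape, the node id `contractReview`, and the helper names are assumptions made for the example, not the skill's actual schema or the eval harness's API. Only the id/variable pair and the .output access path come from the review.

```python
# Hypothetical field definition; only the id/variable pair is from the review.
FIELD = {"id": "approved", "variable": "legalApproval"}


def output_path(node_id: str, field: dict) -> str:
    """Build the downstream expression, keyed by field id (not field['variable'])."""
    return f"$vars.{node_id}.output.{field['id']}"


def uses_field_id(flow_text: str, node_id: str, field: dict) -> bool:
    """Positive check: the id-based path is present.
    Negative check: the variable-based path is absent."""
    wrong = f"$vars.{node_id}.output.{field['variable']}"
    return output_path(node_id, field) in flow_text and wrong not in flow_text


print(output_path("contractReview", FIELD))
```

Pairing the positive and negative checks matters because a flow that contains both paths would pass a positive-only test while still being wrong at runtime.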

11. tests/tasks/uipath-human-in-the-loop/quality_09_dev_mistake_wrong_type.yaml (new)

Severity: Medium ⚠️
The test description claims it tests "boolean for yes/no decisions, number for amounts, and date for deadlines." The prompt includes a submittedDate input field described as "should be a DATE" and the report.json template asks for "submittedDate": "<type used>" — but there is no assertion in success_criteria that validates field_types.submittedDate equals "date". The claimed date-type coverage is not actually enforced.

Fix: Add an assertion to the json_check block:

      - expression: "field_types.submittedDate"
        operator: equals
        expected: "date"
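
For readers unfamiliar with this assertion shape, the following Python sketch shows one plausible way an expression/operator/expected triple could be evaluated against report.json. The evaluation semantics here are an assumption (a dotted path walked through nested dictionaries, with only the equals operator handled); the real harness may behave differently.

```python
def resolve(report: dict, expression: str):
    """Walk a dotted path such as 'field_types.submittedDate' through nested dicts."""
    value = report
    for key in expression.split("."):
        value = value[key]
    return value


def check(report: dict, assertion: dict) -> bool:
    """Evaluate one assertion; only 'equals' is sketched here."""
    if assertion["operator"] != "equals":
        raise ValueError(f"unsupported operator: {assertion['operator']}")
    return resolve(report, assertion["expression"]) == assertion["expected"]


assertion = {
    "expression": "field_types.submittedDate",
    "operator": "equals",
    "expected": "date",
}
# Hypothetical report contents for illustration.
print(check({"field_types": {"submittedDate": "date"}}, assertion))
print(check({"field_types": {"submittedDate": "text"}}, assertion))
```

Without this assertion, a report that records "text" for submittedDate would still pass, which is the coverage gap the finding describes.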


12. tests/tasks/uipath-human-in-the-loop/quality_10_dev_mistake_binding_direction.yaml (new)

Severity: OK
Thorough test covering binding direction correctness. Good use of both file_contains (positive binding patterns) and json_check (report validation). max_turns: 90 is reasonable given the complex flow structure.
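
The binding direction rule can be expressed as a small validator. This is a sketch under stated assumptions: the field dictionaries, the `direction` values, and the example field ids and binding strings are hypothetical stand-ins for whatever the real node schema uses. The rule itself (inputs and inOut fields carry a binding, output fields do not) is taken from the PR description.

```python
def binding_errors(fields: list) -> list:
    """Return a message for every field whose binding contradicts its direction."""
    errors = []
    for field in fields:
        has_binding = "binding" in field
        direction = field.get("direction")
        if direction in ("input", "inOut") and not has_binding:
            errors.append(f"{field['id']}: {direction} field is missing a binding")
        elif direction == "output" and has_binding:
            errors.append(f"{field['id']}: output field must not have a binding")
    return errors


# Hypothetical fields for a supplier onboarding form.
fields = [
    {"id": "supplierName", "direction": "input", "binding": "=vars.supplierName"},
    {"id": "approved", "direction": "output", "binding": "=vars.approved"},
]
print(binding_errors(fields))
```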

13. tests/tasks/uipath-human-in-the-loop/quality_11_inout_field_access.yaml (new)

Severity: OK
Well-structured test for inOut direction semantics. Correctly validates that inOut fields have binding (pre-fill) AND appear in output. The mock data ("Network timeout caused by misconfigured firewall rule") is realistic and domain-appropriate.
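
The inOut round trip can be pictured with a toy model in Python. Everything structural here is assumed for illustration: bindings are modelled as plain variable names, the field ids are invented, and the output is assumed to contain output and inOut fields keyed by field id. Only the mock root-cause text and the pre-fill-then-output behaviour come from the review.

```python
def prefill(fields: list, variables: dict) -> dict:
    """Initial form values: each bound field starts with its variable's value."""
    return {f["id"]: variables[f["binding"]] for f in fields if "binding" in f}


def collect_output(fields: list, form_values: dict) -> dict:
    """Assumed output shape: output and inOut fields, keyed by field id."""
    return {
        f["id"]: form_values.get(f["id"])
        for f in fields
        if f["direction"] in ("output", "inOut")
    }


# Hypothetical RCA review fields.
fields = [
    {"id": "rootCause", "direction": "inOut", "binding": "draftRootCause"},
    {"id": "approved", "direction": "output"},
]
variables = {"draftRootCause": "Network timeout caused by misconfigured firewall rule"}

form = prefill(fields, variables)    # reviewer sees the draft pre-filled
form["approved"] = True              # reviewer submits a decision
print(collect_output(fields, form))  # both ids are available downstream
```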

What's Missing

Area Ratings

| Area | Status | Notes |
| --- | --- | --- |
| Frontmatter | OK | No changes to frontmatter, existing frontmatter valid |
| E2E Tests | OK | 4 new quality tests well-structured; existing tests correctly updated |
| Skill Body | OK | Critical Rule 9 is well-placed and accurate |
| References & Assets | Issue | coded-action-app.md:323 still references removed handles |
| Repo Hygiene | OK | CODEOWNERS present, no secrets, changes scoped to HITL skill |

Issues for Manual Review

  • v1.0 schema accuracy: The outputDefinition in the definition entry was changed to "source": "=result.Action" for status. Verify this matches the actual HITL runtime's output contract — an incorrect source expression would cause runtime variable resolution failures.
  • typeVersion on node vs version on definition: The node JSON example uses "typeVersion": "1.0.0" (line 82 of quickform.md) while the definition uses "version": "1.0". Confirm these are intentionally different fields with different version granularity.
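
If the two version fields are meant to describe the same release at different granularity, a consistency check could look like the Python sketch below. This is an assumption about intent, not a statement of how the runtime compares versions: the helper names are hypothetical, and padding the shorter version with zeros is one possible convention.

```python
def parse_version(text: str) -> tuple:
    """Split a dotted version such as '1.0.0' into integer components."""
    return tuple(int(part) for part in text.split("."))


def same_release(type_version: str, definition_version: str) -> bool:
    """Compare versions after padding the shorter one with zeros,
    so '1.0.0' and '1.0' are treated as the same release."""
    a, b = parse_version(type_version), parse_version(definition_version)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a == b


print(same_release("1.0.0", "1.0"))
print(same_release("1.1.0", "1.0"))
```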

Conclusion

Solid, well-scoped PR that correctly aligns the HITL skill to v1.0. The new quality tests are thoughtfully designed and cover real agent failure patterns. Two issues to fix before merge:

  1. High: Update hitl-node-coded-action-app.md:323 to remove cancelled/timeout handle references — this is the last inconsistent spot in the v1.0 migration
  2. Medium: Add the missing submittedDate date-type assertion to quality_09 to match the test's claimed coverage

  • Read review criteria and project conventions
  • Analyze full diff (origin/main...HEAD)
  • Read existing PR conversation
  • Review each changed file in full context
  • Check skill-specific requirements (links, cross-refs, secrets, CODEOWNERS)
  • Post review findings

- Update SKILL.md, hitl-node-quickform.md, hitl-node-apptask.md,
  hitl-node-coded-action-app.md to use output (not result) outputId,
  status source =result.Action, shape square, version 1.0, and document
  fieldId-vs-variable access rule
- Remove stale cancelled/timeout handle references (v1.0 has completed only)
- Update e2e_01, e2e_03, quality_04, quality_07 tests for .output path
  and remove timeout handle wiring requirement from e2e_03
- Add quality_08: fieldId vs variable access (ContractReview); negative
  check uses run_command + grep -L instead of unsupported excludes field
- Add quality_09: correct field types — boolean/number/date vs text default
  (ExpenseReview)
- Add quality_10: binding direction — inputs must have binding, outputs must
  not (SupplierOnboarding)
- Add quality_11: inOut pre-fill + downstream output access (RCAReview)
- Rewrite TEST_PLAN.md with v1.0 invariants, variable binding accuracy
  section, and real-world consequence coverage map
@dushyant-uipath force-pushed the fix/hitl-v1-hitl-skill branch from c957ed7 to e95cdcf on May 3, 2026 09:55
@github-actions
Contributor

github-actions Bot commented May 3, 2026

Claude finished @dushyant-uipath's task in 2m 34s.


Review: PR #529 — fix(hitl): align HITL skill and tests to v1.0 node schema

  • Read review criteria and project conventions
  • Read full diff (origin/main...HEAD)
  • Read existing PR conversation
  • Review each changed file in full context
  • Check skill-specific requirements (links, cross-refs, secrets, CODEOWNERS)
  • Post review findings

Summary

Aligns the HITL skill docs and eval tests to the v1.0 node schema: renames result to output, removes cancelled/timeout handles, documents the fieldId vs variable distinction (Critical Rule 9), and adds 4 new quality tests guarding common developer mistakes.

Change-by-Change Review

1. skills/uipath-human-in-the-loop/SKILL.md

Severity: OK
Adds Critical Rule 9 (fieldId not variable for output access) and updates the reporting template from .result to .output. Rule 9 is well-written: it addresses a real agent mistake pattern with a clear explanation of the variable property's actual role.

2. skills/uipath-human-in-the-loop/references/hitl-node-quickform.md

Severity: OK
Comprehensive v1.0 alignment:

  • All result-to-output changes correct (variables.nodes entries, outputDefinition, binding rule row, runtime variables table, code example)
  • Definition entry updated: version: "1.0", shape: "square", input handle constraints removed, cancelled/timeout handles removed from handleConfiguration
  • outputDefinition keys corrected: "result" renamed to "output" with "var": "output", status source changed to "=result.Action"
  • Cross-file links updated to new shared/ paths — verified both cli-commands.md and variables-and-expressions.md exist at the target paths
  • fieldId callout and runtime variables table accurately document the v1.0 access pattern

3. skills/uipath-human-in-the-loop/references/hitl-node-apptask.md

Severity: OK
Edge wiring correctly updated to completed-only with explicit "no cancelled or timeout handles in v1.0" note. Removed the multi-edge JSON example showing cancelled/timeout edges. Runtime variables table and variables.nodes description both updated from result to output.

4. skills/uipath-human-in-the-loop/references/hitl-node-coded-action-app.md

Severity: OK
All four changes are correct: edge wiring updated to completed only, variables.nodes description uses output, runtime variables table uses output and updated status description. The prior automated review flagged line 323 — it's fixed in this commit.

5. tests/tasks/uipath-human-in-the-loop/TEST_PLAN.md

Severity: OK
Thorough rewrite with v1.0 invariants table, variable binding accuracy section, coverage map with real-world consequences, and gap priority updates. Status updated from Draft to Active. Correctly removes the quality_08_timeout_handle gap (handle no longer exists). File inventory accurately reflects all present and missing test files.

6. tests/tasks/uipath-human-in-the-loop/e2e_01_invoice_approval_greenfield.yaml

Severity: OK
Clean .result to .output update in the file_contains success criterion (line 96).

7. tests/tasks/uipath-human-in-the-loop/e2e_03_gdpr_compliance_greenfield.yaml

Severity: OK
Correctly updated: handles_wired hardcoded to ["completed"], file_contains check changed from "timeout" to "completed". The timeout_configured field (ISO 8601 duration on node input) is correctly preserved — it's the timeout duration property, not the handle.

8. tests/tasks/uipath-human-in-the-loop/quality_04_all_handles.yaml

Severity: OK
All .result to .output changes are correct and consistent.

9. tests/tasks/uipath-human-in-the-loop/quality_07_runtime_vars.yaml

Severity: OK
All .result to .output changes are correct and consistent.

10. tests/tasks/uipath-human-in-the-loop/quality_08_variable_binding_fieldid.yaml (new)

Severity: OK
Well-designed test: forces a field ID vs variable name distinction by giving id: "approved" / variable: "legalApproval" and validating the agent accesses .output.approved not .output.legalApproval. Includes both positive (file_contains with .output.approved) and negative (grep -L legalApproval) checks. Task ID, tags, and structure all follow conventions.
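
The negative check relies on grep -L, which lists files that contain no match for the pattern. A rough Python equivalent for a fixed string is sketched below. The function name `files_without` and the file name in the usage lines are hypothetical; how the harness interprets the run_command output is not shown in this PR.

```python
from pathlib import Path


def files_without(needle: str, paths: list) -> list:
    """Return the paths whose text does not contain needle,
    similar to `grep -L` with a fixed-string pattern."""
    return [
        str(path)
        for path in paths
        if needle not in Path(path).read_text(encoding="utf-8")
    ]


# Usage (hypothetical file name): the check passes when the flow file
# is listed, meaning the wrong key never appears in it.
# clean = files_without("legalApproval", ["ContractReview.flow"])
# passed = "ContractReview.flow" in clean
```

The design point is that the file appearing in the result is the success signal, which inverts the usual "match found" logic of a positive file_contains check.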

11. tests/tasks/uipath-human-in-the-loop/quality_09_dev_mistake_wrong_type.yaml (new)

Severity: OK
Tests boolean, number, and date type inference. The prior automated review flagged a missing submittedDate assertion; it is present in this commit (lines 88-90). All four type assertions are covered (approved → boolean, amount → number, approved_amount → number, submittedDate → date).

12. tests/tasks/uipath-human-in-the-loop/quality_10_dev_mistake_binding_direction.yaml (new)

Severity: OK
Thorough test covering binding direction correctness. Good use of both file_contains (positive binding patterns) and json_check (report validation). Validates that input fields have binding and that approved_has_binding is false. max_turns: 90 is reasonable given the complex flow structure with decision routing.

13. tests/tasks/uipath-human-in-the-loop/quality_11_inout_field_access.yaml (new)

Severity: OK
Well-structured test for inOut direction semantics. Correctly validates that inOut fields have binding (pre-fill) AND appear in output. The mock data is realistic and domain-appropriate. Tests both the direction: "inOut" declaration and the downstream access pattern.

What's Missing

Nothing identified. All relative links resolve, CODEOWNERS is in place, no secrets or cross-skill references beyond the expected uipath-maestro-flow dependency. The two issues from the prior automated review (coded-action-app.md:323 handles and quality_09 date assertion) are both fixed in this commit.

Area Ratings

| Area | Status | Notes |
| --- | --- | --- |
| Frontmatter | OK | No changes to frontmatter, existing frontmatter valid |
| E2E Tests | OK | 4 new quality tests well-structured; existing tests correctly updated |
| Skill Body | OK | Critical Rule 9 well-placed and accurate |
| References & Assets | OK | All three reference files consistently updated for v1.0 |
| Repo Hygiene | OK | CODEOWNERS present, no secrets, changes scoped to HITL skill |

Issues for Manual Review

  • v1.0 schema accuracy: The outputDefinition in the definition entry uses "source": "=result" for output and "source": "=result.Action" for status. Verify these match the actual HITL runtime's output contract — an incorrect source expression would cause runtime variable resolution failures.
  • typeVersion vs version granularity: The node JSON example uses "typeVersion": "1.0.0" (line 82 of quickform.md) while the definition uses "version": "1.0". Confirm these are intentionally different fields with different version granularity — typeVersion on the node instance vs version on the definition entry.

Conclusion

Clean, well-scoped PR. All result-to-output renames are consistent across skill docs and tests. The cancelled/timeout handle cleanup is complete, with no stale references remaining. Critical Rule 9 (fieldId vs variable) is a valuable addition that addresses a real agent failure pattern. The 4 new quality tests are thoughtfully designed with both positive and negative assertions. The two issues flagged by the prior automated review are already fixed in this commit. No issues found; this is ready to merge.
