Workflow primitives expansion: closed-world completeness, ChatInput upgrade, HITL, and demos#12

Merged
loning merged 40 commits into dev from feature/primitives on Feb 26, 2026

eanzhao commented on Feb 23, 2026

Summary

This branch introduces a major workflow capability upgrade against dev, centered on:

  1. Significant primitives expansion (with closed-world composition and Turing-completeness proof paths).
  2. /api/chat & /api/ws/chat ChatInput enhancement to support inline workflowYaml.
  3. While/Loop runtime fixes and run-scoped execution correctness.
  4. Human-in-the-loop primitives and protocol support.
  5. New workflow demos (CLI + Web) and broad documentation/test coverage.

Why

  • Make workflows expressive enough to support deterministic closed-world orchestration patterns.
  • Support both registry workflows and inline ad-hoc workflow YAML via API.
  • Improve runtime robustness for concurrent runs, retries/timeouts, branching, and resumable interactions.
  • Provide practical demos and reference docs to accelerate adoption.

Key Changes

1) Workflow primitives expansion + closed-world completeness

  • Expanded and standardized core primitive module pack (26 registrations including aliases/internal loop):
    • src/workflow/Aevatar.Workflow.Core/WorkflowCoreModulePack.cs
  • Added new/extended primitive modules such as:
    • switch, race, map_reduce, evaluate, reflect, guard, delay, emit, cache, wait_signal, human_input, human_approval
    • plus strengthened while, foreach, parallel, workflow_call, workflow_loop
  • Introduced primitive catalog + alias canonicalization + closed-world blocked policy:
    • src/workflow/Aevatar.Workflow.Core/Primitives/WorkflowPrimitiveCatalog.cs
  • Added expression engine and expression-driven control/data evaluation:
    • src/workflow/Aevatar.Workflow.Core/Expressions/WorkflowExpressionEvaluator.cs
  • Added closed-world/Turing-completeness artifacts:
    • test/Aevatar.Integration.Tests/WorkflowTuringCompletenessTests.cs
    • workflows/turing-completeness/counter-addition.yaml
    • workflows/turing-completeness/minsky-inc-dec-jz.yaml
    • tools/ci/workflow_closed_world_guards.sh
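
For readers unfamiliar with the Minsky-style argument behind minsky-inc-dec-jz.yaml, the idea can be sketched outside the workflow engine. The following is a minimal Python illustration, not the repository's implementation: the tuple-based instruction layout, the register names, and the `run_counter_machine` helper are all assumptions made for this example. It shows that increment, decrement, and jump-if-zero are enough to express addition, which is the kind of program counter-addition.yaml presumably encodes.

```python
from typing import Dict, List, Tuple

# Hypothetical instruction layout for this sketch:
#   ("inc", reg) | ("dec", reg) | ("jz", reg, target) | ("halt",)
Instr = Tuple


def run_counter_machine(program: List[Instr], registers: Dict[str, int],
                        max_steps: int = 10_000) -> Dict[str, int]:
    """Interpret a tiny INC/DEC/JZ register machine with a step budget."""
    regs = dict(registers)
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program) or program[pc][0] == "halt":
            return regs
        op = program[pc]
        if op[0] == "inc":
            regs[op[1]] = regs.get(op[1], 0) + 1
            pc += 1
        elif op[0] == "dec":
            # Counters never go below zero in a Minsky machine.
            regs[op[1]] = max(0, regs.get(op[1], 0) - 1)
            pc += 1
        elif op[0] == "jz":
            # Jump when the register is zero, otherwise fall through.
            pc = op[2] if regs.get(op[1], 0) == 0 else pc + 1
        else:
            raise ValueError(f"unknown opcode: {op[0]}")
    raise RuntimeError("step budget exhausted")


# a := a + b. Register "z" is never written, so ("jz", "z", 0) acts as an
# unconditional jump back to the loop head.
ADD_PROGRAM = [
    ("jz", "b", 4),   # 0: if b == 0, go to halt
    ("dec", "b"),     # 1
    ("inc", "a"),     # 2
    ("jz", "z", 0),   # 3: always jumps to 0
    ("halt",),        # 4
]

result = run_counter_machine(ADD_PROGRAM, {"a": 2, "b": 3})
print(result)  # {'a': 5, 'b': 0}
```

The step budget is a deliberate choice in this sketch: a Turing-complete instruction set can loop forever, so any host that runs such programs needs some bound or timeout.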

2) /api/chat ChatInput and run-start semantics

  • ChatInput now accepts workflowYaml:
    • src/workflow/Aevatar.Workflow.Application.Abstractions/Runs/WorkflowCapabilityApiModels.cs
  • Chat run request model upgraded with inline YAML:
    • src/workflow/Aevatar.Workflow.Application.Abstractions/Runs/WorkflowChatRunModels.cs
  • API and WebSocket parsing updated accordingly:
    • src/workflow/Aevatar.Workflow.Infrastructure/CapabilityApi/ChatEndpoints.cs
    • src/workflow/Aevatar.Workflow.Infrastructure/CapabilityApi/ChatWebSocketCommandParser.cs
  • Resolver now supports inline YAML parse/validate/name-match flow and actor binding rules:
    • src/workflow/Aevatar.Workflow.Application/Runs/WorkflowRunActorResolver.cs
    • src/workflow/Aevatar.Workflow.Application.Abstractions/Runs/IWorkflowRunActorPort.cs
    • src/workflow/Aevatar.Workflow.Infrastructure/Runs/WorkflowRunActorPort.cs
  • Added explicit API errors:
    • INVALID_WORKFLOW_YAML
    • WORKFLOW_NAME_MISMATCH
    • mapped in src/workflow/Aevatar.Workflow.Infrastructure/CapabilityApi/ChatRunStartErrorMapper.cs
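
As a rough client-side illustration of the request change, the sketch below builds a chat request body that carries inline YAML and recognizes the two new error codes. Only `workflowYaml`, `INVALID_WORKFLOW_YAML`, and `WORKFLOW_NAME_MISMATCH` come from this PR; the `prompt` and `workflow` field names, the helper functions, and the YAML content are assumptions for the example and may not match the real contract.

```python
import json
from typing import Optional

# Error codes named in this PR for inline-YAML run starts.
INLINE_YAML_ERROR_CODES = frozenset(
    {"INVALID_WORKFLOW_YAML", "WORKFLOW_NAME_MISMATCH"}
)


def build_chat_body(prompt: str,
                    workflow: Optional[str] = None,
                    workflow_yaml: Optional[str] = None) -> str:
    """Serialize a chat request; optional fields are omitted when unset."""
    body = {"prompt": prompt}            # "prompt" is a hypothetical field name
    if workflow is not None:
        body["workflow"] = workflow      # "workflow" is a hypothetical field name
    if workflow_yaml is not None:
        body["workflowYaml"] = workflow_yaml  # field introduced by this PR
    return json.dumps(body)


def is_inline_yaml_error(code: str) -> bool:
    """True when an API error code points at the inline YAML, not the run."""
    return code in INLINE_YAML_ERROR_CODES


# Placeholder YAML; the real workflow schema is described in docs/WORKFLOW.md.
HYPOTHETICAL_YAML = "name: demo\nsteps: []\n"

request_body = build_chat_body("hello", workflow="demo",
                               workflow_yaml=HYPOTHETICAL_YAML)
print(request_body)
```

A caller could use `is_inline_yaml_error` to decide whether to surface the problem to whoever authored the YAML rather than retrying the request.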

3) WhileModule / WorkflowLoop runtime fixes

  • WhileModule refactored to run-scoped state, expression-based continuation, and sub-parameter propagation:
    • src/workflow/Aevatar.Workflow.Core/Modules/WhileModule.cs
  • WorkflowLoopModule strengthened with:
    • run-level correlation (run_id)
    • retry / on_error / timeout handling
    • branch-aware next-step routing
    • variable/evaluation integration and closed-world runtime guard
    • src/workflow/Aevatar.Workflow.Core/Modules/WorkflowLoopModule.cs
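
To make "run-scoped state" concrete, here is a minimal Python sketch of a while-loop tracker whose state is keyed by run and step, so that two concurrent runs of the same workflow cannot see each other's iteration counters. The class and method names are hypothetical, and the real WhileModule is C# and evaluates workflow expressions rather than Python callables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class WhileState:
    iteration: int = 0


class RunScopedWhile:
    """Tracks loop progress per (run_id, step_id) instead of per step."""

    def __init__(self, max_iterations: int) -> None:
        self._max = max_iterations
        self._states: Dict[Tuple[str, str], WhileState] = {}

    def should_continue(self, run_id: str, step_id: str,
                        condition: Callable[[int], bool]) -> bool:
        key = (run_id, step_id)
        state = self._states.setdefault(key, WhileState())
        if state.iteration >= self._max or not condition(state.iteration):
            # The loop is finished for this run only; drop its state.
            del self._states[key]
            return False
        state.iteration += 1
        return True

    def active_runs(self) -> int:
        return len(self._states)


loop = RunScopedWhile(max_iterations=5)
under_two = lambda i: i < 2  # stand-in for an expression-based condition

first_a = loop.should_continue("run-a", "while-1", under_two)
first_b = loop.should_continue("run-b", "while-1", under_two)
rest_a = [loop.should_continue("run-a", "while-1", under_two) for _ in range(2)]
print(first_a, first_b, rest_a, loop.active_runs())  # True True [True, False] 1
```

Finishing the loop in run-a leaves run-b's counter untouched, which is the property the composite key is meant to provide.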

4) Human-in-the-loop workflow support

  • Protocol/events upgraded:
    • run_id added to core workflow execution events
    • new events: WorkflowSuspendedEvent, WorkflowResumedEvent, WaitingForSignalEvent, SignalReceivedEvent
    • src/workflow/Aevatar.Workflow.Abstractions/workflow_execution_messages.proto
  • Added HITL modules:
    • src/workflow/Aevatar.Workflow.Core/Modules/HumanInputModule.cs
    • src/workflow/Aevatar.Workflow.Core/Modules/HumanApprovalModule.cs
    • src/workflow/Aevatar.Workflow.Core/Modules/WaitSignalModule.cs
  • Projection side support:
    • src/workflow/Aevatar.Workflow.Projection/Reducers/WorkflowSuspendedEventReducer.cs

5) Workflow demo suite and docs

  • New CLI demo project + workflow samples:
    • demos/Aevatar.Demos.Workflow
    • includes YAML workflows numbered 01 to 47, covering data, control, composition, and HITL cases.
  • New Web demo project:
    • demos/Aevatar.Demos.Workflow.Web (UI + custom demo modules + interactive execution).
  • Major docs additions:
    • docs/WORKFLOW.md
    • docs/WORKFLOW_PRIMITIVES.md
    • docs/architecture/workflow-closed-world-turing-completeness.md

API Impact

  • Request contract change:
    • POST /api/chat and WS command payload now support workflowYaml.
  • Error contract extension:
    • INVALID_WORKFLOW_YAML
    • WORKFLOW_NAME_MISMATCH
  • Workflow execution message schema extended with run_id and suspension/signal event types.

Compatibility Notes

  • This is a broad feature branch (large diff vs dev), not a narrow patch.
  • Runtime behavior is intentionally stricter in workflow validation/start paths when input YAML or workflow binding is invalid.
  • For mixed-version deployment, ensure workflow execution proto consumers are upgraded consistently with new fields/events.

Test Plan

  • Targeted module/runtime tests:
    • dotnet test test/Aevatar.Integration.Tests/Aevatar.Integration.Tests.csproj --filter "FullyQualifiedName~WorkflowLoopModuleCoverageTests|FullyQualifiedName~WorkflowAdditionalModulesCoverageTests|FullyQualifiedName~WorkflowValidatorCoverageTests|FullyQualifiedName~WorkflowTuringCompletenessTests" --nologo
  • API/run-start behavior tests:
    • dotnet test test/Aevatar.Workflow.Application.Tests/Aevatar.Workflow.Application.Tests.csproj --nologo
    • dotnet test test/Aevatar.Workflow.Host.Api.Tests/Aevatar.Workflow.Host.Api.Tests.csproj --nologo
  • Core primitive/parser/expression tests:
    • dotnet test test/Aevatar.Workflow.Core.Tests/Aevatar.Workflow.Core.Tests.csproj --nologo

- Introduced new orchestration patterns: Concurrent, Handoff, Sequential, and Vote, enabling flexible multi-agent coordination.
- Enhanced observability with OpenTelemetry GenAI conventions for tracing agent activities.
- Added comprehensive documentation for the workflow engine, detailing design, implementation, and usage examples.
- Included new reports on MAF improvements and OpenViking research to support ongoing development and integration efforts.
eanzhao changed the title from "New orchestration patterns" to "New workflow primitives: race, select, cache, ..." on Feb 23, 2026
eanzhao and others added 14 commits February 25, 2026 10:59
… aelf:aevatarAI/aevatar into feature/primitives
- Introduced `WorkflowSuspendedEventReducer` to handle workflow suspension events in the projection process.
- Updated `WorkflowExecutionProjectionServiceTests` and `WorkflowExecutionReadModelProjectorTests` to include the new reducer in the reducer list.
- Added a test case in `WorkflowExecutionReadModelProjectorTests` to verify the application of the `WorkflowSuspendedEvent`, ensuring correct handling of suspension metadata and event projection.
…t improvements

- Added RunId parameter to various workflow modules, including AssignModule, CheckpointModule, ConditionalModule, and others, to improve traceability and context during execution.
- Updated ChatSessionKeys to support creation of session IDs that include RunId and attempt number, ensuring unique identification across concurrent executions.
- Refactored data structures in modules like CacheModule, MapReduceModule, and ParallelFanOutModule to utilize composite keys (RunId, StepId) for better state management.
- Enhanced error handling and logging to include RunId, facilitating easier debugging and monitoring of workflow execution.
- Overall improvements to maintainability and clarity in the workflow execution process.
- Introduced closed-world mode for workflows, enabling a subset of core primitives to achieve Turing completeness without external dependencies.
- Implemented runtime and validator contracts to ensure expression evaluation, branching semantics, and state management are properly handled in closed-world mode.
- Added integration tests for closed-world workflows, including Minsky-style programs and counter addition, to validate functionality and performance.
- Created comprehensive documentation detailing the closed-world primitives, their usage, and the underlying architecture.
- Implemented CI guards to enforce closed-world mode constraints and ensure compliance with the defined semantics.
- Introduced a comprehensive markdown document outlining the architecture of the Workflow LLM streaming capability, covering end-to-end execution paths, session semantics, and component responsibilities.
- Added an audit scorecard for the LLM streaming architecture, detailing the audit scope, validation results, and scoring based on architectural constraints and testing coverage.
- This commit enhances the documentation framework, providing clarity on the architecture and facilitating future audits and improvements.
- Updated `AGENTS.md` to include guidelines for using compact layouts in `sequenceDiagram` and prohibited fixed-width styles that could distort the diagrams.
- Added CSS styles to `WORKFLOW_LLM_STREAMING_ARCHITECTURE.md` for better responsiveness of Mermaid diagrams, ensuring they fit within their containers and support horizontal scrolling when necessary.
- Improved participant naming in sequence diagrams for clarity and consistency across the documentation.

These changes aim to improve the visual representation of workflows and enhance the overall documentation quality.
- Removed the Query Service participant from the sequence diagram in `WORKFLOW_LLM_STREAMING_ARCHITECTURE.md` for clarity.
- Updated the documentation to reflect the integration of `DeltaToolCall` in the streaming process, ensuring that both text and tool call events are published correctly.
- Enhanced the audit scorecard to include successful test executions and improved scoring metrics, reflecting the current state of the architecture.
- Introduced new interfaces and implementations for state snapshot emission, improving the overall architecture and ensuring better handling of workflow states.

These changes aim to streamline the architecture documentation and improve the clarity and functionality of the Workflow LLM streaming capabilities.
…d enhanced documentation

- Renamed the scorecard to indicate it is a review version.
- Added new architecture validation checks, including various guards for architecture, routing, and test stability.
- Updated evidence references in the documentation to reflect the latest code changes and test coverage.
- Improved overall scoring from 98 to 99, indicating enhanced architecture verification and testing outcomes.

These changes aim to provide a clearer and more accurate representation of the Workflow LLM streaming architecture's current state and its validation processes.
…y WebSocket frames

- Updated the WebSocket protocol to handle both text and binary message types, allowing for more versatile communication.
- Refactored the command parser to accommodate binary frames and ensure proper response message types are maintained.
- Improved documentation to reflect changes in WebSocket handling and updated audit scorecard to indicate full support for text/binary interactions.
- Enhanced test coverage for WebSocket command parsing and execution, ensuring robust handling of both frame types.

These changes aim to improve the flexibility and reliability of the Workflow LLM streaming architecture, facilitating better integration of multimodal data streams.
…ing and message types

- Updated event type handling in `WorkflowRunEventContracts` to utilize constants from `WorkflowRunEventTypes`, enhancing maintainability and reducing hardcoded strings.
- Introduced `WorkflowCapabilityMessageTypes` for better organization of message types used in WebSocket communications.
- Enhanced `ChatWebSocketProtocol` and related classes to support structured message envelopes, improving clarity and type safety in WebSocket interactions.
- Updated documentation to reflect changes in event types and message handling, ensuring consistency across the architecture.
- Improved test coverage for event handling and WebSocket command parsing, ensuring robust functionality and adherence to new standards.

These changes aim to streamline the architecture and enhance the reliability of the Workflow LLM streaming capabilities.
- Moved `ChatInput` and `ChatWsCommand` models from `Application.Abstractions` to `Infrastructure/CapabilityApi`, clarifying the boundary between host protocols and application contracts.
- Introduced `ChatCapabilityMessageTypes` for better organization of message types related to chat commands.
- Updated documentation to reflect changes in model locations and architecture layers, ensuring consistency and clarity in the Workflow LLM streaming architecture.

These changes aim to enhance the structure and maintainability of the chat command handling within the architecture.
…efault

- Added a new environment variable `COVERAGE_GENERATED_FILE_FILTERS` to specify default filters for excluding generated files during coverage checks.
- Updated the `coverage_quality_guard.sh` script to utilize these filters in the report generation process, improving the accuracy of coverage metrics.
- Revised the README to reflect the changes in file filtering for generated files, providing clearer guidance on the script's functionality.

These changes aim to enhance the quality of coverage reports by ensuring that generated files do not skew the results.
eanzhao changed the title from "New workflow primitives: race, select, cache, ..." to "Workflow primitives: fail-fast + run_id normalization + test hardening" on Feb 26, 2026
eanzhao and others added 3 commits February 26, 2026 14:08
- Updated the Aevatar workflow YAML documentation to include canonical schema, closed-world mode, and detailed descriptions of roles and parameters.
- Introduced multiple demo workflows showcasing various event modules, including CSV to Markdown conversion, JSON value extraction, and role-level event handling.
- Added edge-case demos for human input, approval processes, and signal handling to illustrate practical use cases and workflow interactions.
- Implemented new event modules for CSV markdown conversion and JSON path extraction, enhancing the functionality of the workflow system.
…rences

- Introduced new unit tests for the GenAI observability middleware, covering scenarios for setting provider tags based on valid and invalid metadata, handling errors, and ensuring sensitive data is included when enabled.
- Added a new project reference for the Aevatar.AI.Projection module in the test project to support the new tests.
- Created new test files for AI projection appliers and in-memory projection graph store, enhancing test coverage for projection functionalities.
- Updated project references in CQRS projection core tests to include the InMemory projection provider, ensuring all necessary dependencies are accounted for.

These changes aim to improve the robustness of the observability middleware and enhance the overall test coverage across the project.
eanzhao changed the title from "Workflow primitives: fail-fast + run_id normalization + test hardening" to "Workflow primitives expansion: closed-world completeness, ChatInput upgrade, HITL, and demos" on Feb 26, 2026
loning and others added 15 commits February 26, 2026 14:28
- Introduced new unit tests for the InMemoryProjectionGraphStore, covering scenarios for listing nodes and edges by owner, handling missing parameters, and ensuring proper behavior with empty collections.
- Added tests for the ProjectionSessionEventHub, validating argument handling and ensuring correct behavior when publishing and subscribing to events with invalid inputs.
- Created a new test file for ProjectionPortBase, enhancing coverage for query and lifecycle services, including validation of constructor arguments and method behaviors under various conditions.

These changes aim to improve test coverage and ensure the reliability of projection functionalities across the project.
…ne YAML

- Introduced a comprehensive audit report detailing the architecture review of the Workflow execution chain and Inline YAML validation.
- Documented audit scope, input sources, verification results, and overall scoring, highlighting critical issues and recommendations.
- Identified three P1 blocking issues and one P2 major issue, emphasizing the need for immediate fixes before merging.
- Provided a detailed breakdown of findings, including specific code evidence and suggested remediation steps for each identified issue.
- Included a regression testing matrix to ensure coverage of newly identified scenarios and maintain workflow integrity.
…ion improvements

- Added a new `DemoWorkflowModulePack` to encapsulate demo modules for CSV markdown conversion, JSON path extraction, and template processing.
- Implemented role-level event handling in `DemoCsvMarkdownModule`, `DemoJsonPickModule`, and `DemoTemplateModule` to ensure proper processing of ChatRequest events.
- Introduced validation for workflow definitions at runtime, ensuring that only known step types are utilized and enhancing error handling for invalid workflows.
- Updated CI workflow to streamline regression testing and coverage checks, improving overall stability and reliability of the build process.
- Enhanced documentation for LLM streaming architecture and audit scorecard, providing clearer insights into the system's capabilities and performance metrics.
…ecard

- Added a detailed markdown report for the PR review of the Workflow Runtime and Inline YAML, outlining the audit scope, verification results, and scoring.
- Documented the closure of four critical issues, including three P1 and one P2, with evidence and remediation steps.
- Enhanced the evaluation criteria and provided a regression testing matrix to ensure comprehensive coverage of identified scenarios.
- Updated the GuardModule and WorkflowLoopModule to reflect changes in metadata handling and step evaluation logic.
- Introduced new tests to validate the behavior of while loops and next step transitions, ensuring robustness in workflow execution.
…lta semantics

- Replaced `StreamToolCallAccumulator` with `StreamingToolCallAccumulator` in `RoleGAgent` and `ChatRuntime` to enhance streaming capabilities.
- Updated `MEAILLMProvider` and `TornadoLLMProvider` to utilize new delta conversion methods, ensuring proper handling of tool calls with missing IDs.
- Added unit tests to validate behavior when tool call IDs appear late in the streaming process, ensuring robust functionality and adherence to new standards.

These changes aim to improve the reliability and clarity of tool call management across the chat and LLM provider components.
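
The late-ID case can be illustrated with a short Python sketch. It assumes that streamed tool-call fragments carry a positional index and that the accumulator correlates fragments by that index rather than by ID; the commit message does not say how `StreamingToolCallAccumulator` does this, so the types and the `accumulate` function here are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class ToolCallDelta:
    index: int                  # position of the call within the response
    id: Optional[str] = None    # may be missing on early fragments
    name: Optional[str] = None
    arguments: str = ""         # partial JSON text


@dataclass
class ToolCall:
    id: Optional[str] = None
    name: Optional[str] = None
    arguments: str = ""


def accumulate(deltas: Iterable[ToolCallDelta]) -> List[ToolCall]:
    """Merge fragments by index so an ID that arrives late is still attached."""
    calls: Dict[int, ToolCall] = {}
    for delta in deltas:
        call = calls.setdefault(delta.index, ToolCall())
        if delta.id:
            call.id = delta.id
        if delta.name:
            call.name = delta.name
        call.arguments += delta.arguments
    return [calls[i] for i in sorted(calls)]


stream = [
    ToolCallDelta(index=0, name="search", arguments='{"q":'),
    ToolCallDelta(index=0, arguments='"aevatar"}'),
    ToolCallDelta(index=0, id="call_1"),  # the ID shows up last
]
merged = accumulate(stream)
print(merged)
```

Keying on the index means no fragment has to be dropped or buffered separately while the ID is still unknown.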
- Updated `ValidateWorkflowDefinition` in `WorkflowGAgent` to include built-in canonical types for improved workflow validation.
- Modified `ConnectorCallModule` to ensure the correct assignment of `RunId`, enhancing the reliability of connector requests.

These changes aim to strengthen the validation process and improve the handling of connector requests within the workflow system.
- Introduced a new `WaitSignalTimeoutFiredEvent` and `WorkflowStepTimeoutFiredEvent` to improve event handling for signal timeouts and workflow step timeouts.
- Updated `WaitSignalModule` to require `step_id` for disambiguating multiple waiters on the same signal, ensuring correct processing of concurrent waiters.
- Refactored `WorkflowLoopModule` to normalize `run_id` and handle timeouts more effectively, improving reliability in workflow execution.
- Added unit tests to validate the new timeout handling and ensure correct behavior when multiple waiters are present for the same signal.

These changes aim to enhance the robustness and clarity of signal and timeout management within the workflow system.
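
Why `step_id` matters for concurrent waiters can be shown with a minimal Python sketch. The registry below is hypothetical and simplifies the real module: it treats a delivery without `step_id` as valid only when exactly one step is waiting, which is an assumption of this example rather than documented behavior.

```python
from typing import Dict, Optional, Set, Tuple


class SignalWaiters:
    """Tracks which steps of which run are waiting on a named signal."""

    def __init__(self) -> None:
        self._waiters: Dict[Tuple[str, str], Set[str]] = {}

    def wait(self, run_id: str, signal: str, step_id: str) -> None:
        self._waiters.setdefault((run_id, signal), set()).add(step_id)

    def deliver(self, run_id: str, signal: str,
                step_id: Optional[str] = None) -> Optional[str]:
        """Return the step that received the signal, or None if undeliverable."""
        waiting = self._waiters.get((run_id, signal), set())
        if step_id is None:
            if len(waiting) != 1:
                return None  # no waiter, or ambiguous without a step_id
            step_id = next(iter(waiting))
        if step_id not in waiting:
            return None
        waiting.remove(step_id)
        return step_id


waiters = SignalWaiters()
waiters.wait("run-1", "payment", "step-a")
waiters.wait("run-1", "payment", "step-b")

ambiguous = waiters.deliver("run-1", "payment")            # two waiters, no step_id
targeted = waiters.deliver("run-1", "payment", "step-b")   # explicit target
remaining = waiters.deliver("run-1", "payment")            # only step-a is left
print(ambiguous, targeted, remaining)  # None step-b step-a
```

Scoping the lookup by `run_id` as well keeps a signal sent to one run from waking a waiter in another.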
- Introduced a new markdown document detailing the PR review audit for workflow concurrent run correlation, including audit scope, verification results, and scoring.
- Updated `WaitingForSignalEvent` to include `run_id` for improved event handling and traceability.
- Enhanced `WaitSignalModule` and `WorkflowLoopModule` to enforce `run_id` checks, ensuring robust handling of concurrent workflows.
- Refactored `MakerRecursiveModule` and `MakerVoteModule` to propagate `run_id` through events, maintaining isolation across runs.
- Added comprehensive unit tests to validate the new behavior and ensure correct propagation of `run_id` in various scenarios.

These changes aim to enhance the reliability and clarity of workflow execution and auditing processes.
- Introduced a new script `workflow_runid_guard.sh` to enforce the requirement that `RunId` must be explicitly set in `StepRequestEvent` and `StepCompletedEvent` initializers.
- Updated `architecture_guards.sh` to include the execution of the new run-id guard as part of the CI workflow.

These changes aim to enhance the validation of workflow events and ensure proper handling of run identifiers in the workflow system.
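
The kind of check such a guard performs can be approximated in a few lines of Python. This is only a sketch of the idea, not a port of `workflow_runid_guard.sh`: the regular expressions and the `find_missing_run_id` helper are assumptions, and the pattern deliberately ignores initializers that contain nested braces.

```python
import re
from typing import List

# Matches C# object initializers such as `new StepRequestEvent { ... }`.
# Nested braces inside the initializer are not handled by this sketch.
_INITIALIZER = re.compile(
    r"new\s+(StepRequestEvent|StepCompletedEvent)\s*(?:\(\s*\))?\s*\{([^{}]*)\}",
    re.DOTALL,
)
_RUN_ID_ASSIGNMENT = re.compile(r"\bRunId\s*=")


def find_missing_run_id(source: str) -> List[str]:
    """Return the event type of every initializer that never assigns RunId."""
    return [
        match.group(1)
        for match in _INITIALIZER.finditer(source)
        if not _RUN_ID_ASSIGNMENT.search(match.group(2))
    ]


SAMPLE = '''
var ok = new StepRequestEvent { StepId = "s1", RunId = runId };
var bad = new StepCompletedEvent { StepId = "s1", Success = true };
'''

violations = find_missing_run_id(SAMPLE)
print(violations)  # ['StepCompletedEvent']
```

A CI wrapper would run this over the source tree and exit non-zero when the list is not empty.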
…workflow

- Updated `WorkflowRunActorResolver` to prevent in-place reconfiguration of already bound actors when inline YAML is provided, ensuring isolated actor creation for new runs.
- Enhanced `WorkflowImplicitModuleDependencyExpander` to include implicit mapping of `cache` to `llm_call`, addressing previously identified dependency gaps.
- Added regression tests to validate the new behavior and ensure proper handling of workflow configurations.
- Achieved a score of 96/100 in the remediation review, recommending merge based on successful closure of critical issues.

These changes aim to improve the reliability and consistency of workflow execution with inline YAML and cache dependencies.
- Introduced multiple unit tests for the `CacheModule`, covering scenarios such as metadata handling, cache key shortening, and behavior when multiple requests share the same cache key.
- Enhanced `WorkflowImplicitModuleDependencyExpander` to ensure that implicit modules correctly include `llm_call` for various workflow types, improving dependency management.
- Added tests to validate the behavior of the `ResolveOrCreateAsync` method in `WorkflowRunActorResolver`, ensuring proper error handling for missing actors and unsupported actor types.

These changes aim to strengthen the testing framework and improve the reliability of workflow execution and module interactions.
- Introduced comprehensive unit tests for `ChatSessionKeys`, validating the creation of session IDs with various inputs and ensuring proper exception handling for invalid cases.
- Added tests for `ProjectionRuntimeCoverage`, verifying the resolution of metadata from providers and handling of missing providers, along with compensation event dispatching in the projection store.
- Enhanced the testing framework to improve reliability and coverage of workflow execution and projection functionalities.
…eryService

- Introduced new tests for the `WaitSignalModule`, covering scenarios for timeout handling, step completion, and behavior when no payload paths are present.
- Added a test for `WorkflowExecutionProjectionQueryService` to ensure it returns an empty root subgraph when the actor ID is null.
- Enhanced the testing framework to improve coverage and reliability of workflow execution and projection functionalities.
loning merged commit 897f276 into dev on Feb 26, 2026
8 checks passed