Workflow primitives expansion: closed-world completeness, ChatInput upgrade, HITL, and demos #12
Merged
- Introduced new orchestration patterns: Concurrent, Handoff, Sequential, and Vote, enabling flexible multi-agent coordination.
- Enhanced observability with OpenTelemetry GenAI conventions for tracing agent activities.
- Added comprehensive documentation for the workflow engine, detailing design, implementation, and usage examples.
- Included new reports on MAF improvements and OpenViking research to support ongoing development and integration efforts.
b126fbe to 782a833
b088ab7 to 4aedfcd
48fef22 to 1fe89ea
… aelf:aevatarAI/aevatar into feature/primitives
- Introduced `WorkflowSuspendedEventReducer` to handle workflow suspension events in the projection process.
- Updated `WorkflowExecutionProjectionServiceTests` and `WorkflowExecutionReadModelProjectorTests` to include the new reducer in the reducer list.
- Added a test case in `WorkflowExecutionReadModelProjectorTests` to verify the application of the `WorkflowSuspendedEvent`, ensuring correct handling of suspension metadata and event projection.
…t improvements
- Added RunId parameter to various workflow modules, including AssignModule, CheckpointModule, ConditionalModule, and others, to improve traceability and context during execution.
- Updated ChatSessionKeys to support creation of session IDs that include RunId and attempt number, ensuring unique identification across concurrent executions.
- Refactored data structures in modules like CacheModule, MapReduceModule, and ParallelFanOutModule to utilize composite keys (RunId, StepId) for better state management.
- Enhanced error handling and logging to include RunId, facilitating easier debugging and monitoring of workflow execution.
- Overall improvements to maintainability and clarity in the workflow execution process.
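The composite-key idea in this commit can be sketched as follows. This is a minimal Python illustration, not the project's C# code: `StepStateStore`, `create_session_id`, and the `scope:run:step:aN` string format are hypothetical names and conventions chosen for the example.

```python
from dataclasses import dataclass, field


def create_session_id(scope_id: str, run_id: str, step_id: str, attempt: int = 1) -> str:
    """Build a session ID that includes the run and attempt (hypothetical format)."""
    for name, value in (("scope_id", scope_id), ("run_id", run_id), ("step_id", step_id)):
        if not value or not value.strip():
            raise ValueError(f"{name} must be non-empty")
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return f"{scope_id}:{run_id}:{step_id}:a{attempt}"


@dataclass
class StepStateStore:
    """Per-step state keyed by (run_id, step_id) instead of step_id alone."""

    _pending: dict = field(default_factory=dict)

    def put(self, run_id: str, step_id: str, state: dict) -> None:
        self._pending[(run_id, step_id)] = state

    def take(self, run_id: str, step_id: str):
        # Returns None when this run has no state for the step.
        return self._pending.pop((run_id, step_id), None)
```

With a key of `step_id` alone, two concurrent runs of the same workflow would overwrite each other's entry; the tuple key keeps them apart.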
- Introduced closed-world mode for workflows, enabling a subset of core primitives to achieve Turing completeness without external dependencies.
- Implemented runtime and validator contracts to ensure expression evaluation, branching semantics, and state management are properly handled in closed-world mode.
- Added integration tests for closed-world workflows, including Minsky-style programs and counter addition, to validate functionality and performance.
- Created comprehensive documentation detailing the closed-world primitives, their usage, and the underlying architecture.
- Implemented CI guards to enforce closed-world mode constraints and ensure compliance with the defined semantics.
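For readers unfamiliar with Minsky-style programs, the sketch below shows a generic counter-machine interpreter and a counter-addition program in Python. It does not reproduce the YAML workflows added by this commit (their contents are not shown on this page); the tuple-based instruction encoding and the `run_minsky` name are assumptions made for illustration.

```python
def run_minsky(program, registers, max_steps=10_000):
    """Interpret a counter machine with inc, dec, jz, and halt instructions."""
    regs = dict(registers)
    pc = 0
    steps = 0
    while 0 <= pc < len(program):
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted")
        steps += 1
        op = program[pc]
        kind = op[0]
        if kind == "inc":
            regs[op[1]] = regs.get(op[1], 0) + 1
            pc += 1
        elif kind == "dec":
            # Counters never go below zero.
            regs[op[1]] = max(0, regs.get(op[1], 0) - 1)
            pc += 1
        elif kind == "jz":
            # Jump to op[2] when the register is zero, else fall through.
            pc = op[2] if regs.get(op[1], 0) == 0 else pc + 1
        elif kind == "halt":
            break
        else:
            raise ValueError(f"unknown instruction: {kind}")
    return regs


# Counter addition: move b into a, one unit at a time.
ADD = [
    ("jz", "b", 4),     # 0: when b is zero we are done
    ("dec", "b"),       # 1
    ("inc", "a"),       # 2
    ("jz", "zero", 0),  # 3: unconditional jump (register "zero" is never set)
    ("halt",),          # 4
]
```

Running `run_minsky(ADD, {"a": 2, "b": 3})` leaves `a` at 5 and `b` at 0. Increment, decrement, and jump-if-zero over unbounded counters are the classic minimal instruction set for Turing completeness, which is why a small closed set of workflow primitives can suffice.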
- Introduced a comprehensive markdown document outlining the architecture of the Workflow LLM streaming capability, covering end-to-end execution paths, session semantics, and component responsibilities.
- Added an audit scorecard for the LLM streaming architecture, detailing the audit scope, validation results, and scoring based on architectural constraints and testing coverage.
- This commit enhances the documentation framework, providing clarity on the architecture and facilitating future audits and improvements.
- Updated `AGENTS.md` to include guidelines for using compact layouts in `sequenceDiagram` and to prohibit fixed-width styles that could distort the diagrams.
- Added CSS styles to `WORKFLOW_LLM_STREAMING_ARCHITECTURE.md` for better responsiveness of Mermaid diagrams, ensuring they fit within their containers and support horizontal scrolling when necessary.
- Improved participant naming in sequence diagrams for clarity and consistency across the documentation.
These changes aim to improve the visual representation of workflows and enhance the overall documentation quality.
- Removed the Query Service participant from the sequence diagram in `WORKFLOW_LLM_STREAMING_ARCHITECTURE.md` for clarity.
- Updated the documentation to reflect the integration of `DeltaToolCall` in the streaming process, ensuring that both text and tool call events are published correctly.
- Enhanced the audit scorecard to include successful test executions and improved scoring metrics, reflecting the current state of the architecture.
- Introduced new interfaces and implementations for state snapshot emission, improving the overall architecture and ensuring better handling of workflow states.
These changes aim to streamline the architecture documentation and improve the clarity and functionality of the Workflow LLM streaming capabilities.
…d enhanced documentation
- Renamed the scorecard to indicate it is a review version.
- Added new architecture validation checks, including various guards for architecture, routing, and test stability.
- Updated evidence references in the documentation to reflect the latest code changes and test coverage.
- Improved overall scoring from 98 to 99, indicating enhanced architecture verification and testing outcomes.
These changes aim to provide a clearer and more accurate representation of the Workflow LLM streaming architecture's current state and its validation processes.
…y WebSocket frames
- Updated the WebSocket protocol to handle both text and binary message types, allowing for more versatile communication.
- Refactored the command parser to accommodate binary frames and ensure proper response message types are maintained.
- Improved documentation to reflect changes in WebSocket handling and updated audit scorecard to indicate full support for text/binary interactions.
- Enhanced test coverage for WebSocket command parsing and execution, ensuring robust handling of both frame types.
These changes aim to improve the flexibility and reliability of the Workflow LLM streaming architecture, facilitating better integration of multimodal data streams.
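One way to read "proper response message types are maintained" is that a command arriving in a binary frame gets its reply in a binary frame, and likewise for text. The Python sketch below illustrates that reading only. It assumes binary frames carry UTF-8 encoded JSON, and the `type` and `payload` field names, `ParsedCommand`, `parse_frame`, and `encode_reply` are all hypothetical rather than taken from `ChatWebSocketCommandParser`.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedCommand:
    type: str
    payload: dict
    reply_as_binary: bool  # remember the inbound frame kind for the reply


def parse_frame(frame) -> ParsedCommand:
    """Parse a text (str) or binary (bytes) frame holding a JSON command."""
    is_binary = isinstance(frame, (bytes, bytearray))
    if is_binary:
        try:
            text = bytes(frame).decode("utf-8")
        except UnicodeDecodeError as exc:
            raise ValueError("binary frame is not valid UTF-8") from exc
    elif isinstance(frame, str):
        text = frame
    else:
        raise TypeError("frame must be str or bytes")
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError("frame is not valid JSON") from exc
    if not isinstance(doc, dict) or not isinstance(doc.get("type"), str):
        raise ValueError("command must be an object with a string 'type'")
    return ParsedCommand(doc["type"], doc.get("payload") or {}, is_binary)


def encode_reply(command: ParsedCommand, body: dict):
    """Encode the reply using the same frame kind the command arrived in."""
    text = json.dumps(body)
    return text.encode("utf-8") if command.reply_as_binary else text
```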
…ing and message types
- Updated event type handling in `WorkflowRunEventContracts` to utilize constants from `WorkflowRunEventTypes`, enhancing maintainability and reducing hardcoded strings.
- Introduced `WorkflowCapabilityMessageTypes` for better organization of message types used in WebSocket communications.
- Enhanced `ChatWebSocketProtocol` and related classes to support structured message envelopes, improving clarity and type safety in WebSocket interactions.
- Updated documentation to reflect changes in event types and message handling, ensuring consistency across the architecture.
- Improved test coverage for event handling and WebSocket command parsing, ensuring robust functionality and adherence to new standards.
These changes aim to streamline the architecture and enhance the reliability of the Workflow LLM streaming capabilities.
- Moved `ChatInput` and `ChatWsCommand` models from `Application.Abstractions` to `Infrastructure/CapabilityApi`, clarifying the boundary between host protocols and application contracts.
- Introduced `ChatCapabilityMessageTypes` for better organization of message types related to chat commands.
- Updated documentation to reflect changes in model locations and architecture layers, ensuring consistency and clarity in the Workflow LLM streaming architecture.
These changes aim to enhance the structure and maintainability of the chat command handling within the architecture.
…efault
- Added a new environment variable `COVERAGE_GENERATED_FILE_FILTERS` to specify default filters for excluding generated files during coverage checks.
- Updated the `coverage_quality_guard.sh` script to utilize these filters in the report generation process, improving the accuracy of coverage metrics.
- Revised the README to reflect the changes in file filtering for generated files, providing clearer guidance on the script's functionality.
These changes aim to enhance the quality of coverage reports by ensuring that generated files do not skew the results.
- Updated the Aevatar workflow YAML documentation to include canonical schema, closed-world mode, and detailed descriptions of roles and parameters.
- Introduced multiple demo workflows showcasing various event modules, including CSV to Markdown conversion, JSON value extraction, and role-level event handling.
- Added edge-case demos for human input, approval processes, and signal handling to illustrate practical use cases and workflow interactions.
- Implemented new event modules for CSV markdown conversion and JSON path extraction, enhancing the functionality of the workflow system.
…rences
- Introduced new unit tests for the GenAI observability middleware, covering scenarios for setting provider tags based on valid and invalid metadata, handling errors, and ensuring sensitive data is included when enabled.
- Added a new project reference for the Aevatar.AI.Projection module in the test project to support the new tests.
- Created new test files for AI projection appliers and in-memory projection graph store, enhancing test coverage for projection functionalities.
- Updated project references in CQRS projection core tests to include the InMemory projection provider, ensuring all necessary dependencies are accounted for.
These changes aim to improve the robustness of the observability middleware and enhance the overall test coverage across the project.
…to simplify call chain handling.
d514739 to aeeeda6
- Introduced new unit tests for the InMemoryProjectionGraphStore, covering scenarios for listing nodes and edges by owner, handling missing parameters, and ensuring proper behavior with empty collections.
- Added tests for the ProjectionSessionEventHub, validating argument handling and ensuring correct behavior when publishing and subscribing to events with invalid inputs.
- Created a new test file for ProjectionPortBase, enhancing coverage for query and lifecycle services, including validation of constructor arguments and method behaviors under various conditions.
These changes aim to improve test coverage and ensure the reliability of projection functionalities across the project.
…ne YAML
- Introduced a comprehensive audit report detailing the architecture review of the Workflow execution chain and Inline YAML validation.
- Documented audit scope, input sources, verification results, and overall scoring, highlighting critical issues and recommendations.
- Identified three P1 blocking issues and one P2 major issue, emphasizing the need for immediate fixes before merging.
- Provided a detailed breakdown of findings, including specific code evidence and suggested remediation steps for each identified issue.
- Included a regression testing matrix to ensure coverage of newly identified scenarios and maintain workflow integrity.
…ion improvements
- Added a new `DemoWorkflowModulePack` to encapsulate demo modules for CSV markdown conversion, JSON path extraction, and template processing.
- Implemented role-level event handling in `DemoCsvMarkdownModule`, `DemoJsonPickModule`, and `DemoTemplateModule` to ensure proper processing of ChatRequest events.
- Introduced validation for workflow definitions at runtime, ensuring that only known step types are utilized and enhancing error handling for invalid workflows.
- Updated CI workflow to streamline regression testing and coverage checks, improving overall stability and reliability of the build process.
- Enhanced documentation for LLM streaming architecture and audit scorecard, providing clearer insights into the system's capabilities and performance metrics.
…ecard
- Added a detailed markdown report for the PR review of the Workflow Runtime and Inline YAML, outlining the audit scope, verification results, and scoring.
- Documented the closure of four critical issues, including three P1 and one P2, with evidence and remediation steps.
- Enhanced the evaluation criteria and provided a regression testing matrix to ensure comprehensive coverage of identified scenarios.
- Updated the GuardModule and WorkflowLoopModule to reflect changes in metadata handling and step evaluation logic.
- Introduced new tests to validate the behavior of while loops and next step transitions, ensuring robustness in workflow execution.
…atar into feature/primitives
…lta semantics
- Replaced `StreamToolCallAccumulator` with `StreamingToolCallAccumulator` in `RoleGAgent` and `ChatRuntime` to enhance streaming capabilities.
- Updated `MEAILLMProvider` and `TornadoLLMProvider` to utilize new delta conversion methods, ensuring proper handling of tool calls with missing IDs.
- Added unit tests to validate behavior when tool call IDs appear late in the streaming process, ensuring robust functionality and adherence to new standards.
These changes aim to improve the reliability and clarity of tool call management across the chat and LLM provider components.
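The late-ID problem can be illustrated with a small accumulator that merges deltas by stream position rather than by ID. This Python sketch is not the `StreamingToolCallAccumulator` from the commit: the `ToolCallDelta` shape (index, optional id, optional name, argument fragment) and the synthetic fallback ID are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolCallDelta:
    index: int                 # position of the tool call within the response
    id: Optional[str] = None   # may be missing on early chunks
    name: Optional[str] = None
    arguments: str = ""        # fragment of the JSON argument string


class ToolCallAccumulator:
    """Merge streamed deltas by index so a late ID still lands on the right call."""

    def __init__(self) -> None:
        self._calls = {}

    def add(self, delta: ToolCallDelta) -> None:
        call = self._calls.setdefault(
            delta.index, {"id": None, "name": "", "arguments": ""}
        )
        if delta.id:
            call["id"] = delta.id
        if delta.name and not call["name"]:
            call["name"] = delta.name
        call["arguments"] += delta.arguments

    def finish(self) -> list:
        result = []
        for index in sorted(self._calls):
            call = dict(self._calls[index])
            if call["id"] is None:
                # Hypothetical fallback when the provider never sent an ID.
                call["id"] = f"call_{index}"
            result.append(call)
        return result
```

Keying on the ID alone would split one tool call into two entries whenever the first chunks arrive without it, which is the failure mode the commit's tests appear to target.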
…tar into feature/primitives
- Updated `ValidateWorkflowDefinition` in `WorkflowGAgent` to include built-in canonical types for improved workflow validation.
- Modified `ConnectorCallModule` to ensure the correct assignment of `RunId`, enhancing the reliability of connector requests.
These changes aim to strengthen the validation process and improve the handling of connector requests within the workflow system.
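Validation against a set of known step types might look like the Python sketch below. It is illustrative only: the `type` and `children` keys are a hypothetical step shape, `BUILT_IN_TYPES` lists only primitive names that appear elsewhere in this PR (not necessarily the full canonical set), and `validate_step_types` is not the signature of `ValidateWorkflowDefinition`.

```python
# Primitive names mentioned in this PR; possibly not the full canonical set.
BUILT_IN_TYPES = frozenset({
    "switch", "race", "map_reduce", "evaluate", "reflect", "guard", "delay",
    "emit", "cache", "wait_signal", "human_input", "human_approval",
    "while", "foreach", "parallel", "workflow_call", "workflow_loop", "llm_call",
})


def validate_step_types(steps, extra_types=()):
    """Return a list of error strings; an empty list means every type is known."""
    known = BUILT_IN_TYPES | {t.lower() for t in extra_types}
    errors = []

    def visit(step_list, path):
        for i, step in enumerate(step_list):
            where = f"{path}[{i}]"
            step_type = str(step.get("type", "")).strip().lower()
            if not step_type:
                errors.append(f"{where}: missing step type")
            elif step_type not in known:
                errors.append(f"{where}: unknown step type '{step_type}'")
            # Nested steps (hypothetical 'children' key) are checked recursively.
            visit(step.get("children") or [], f"{where}.children")

    visit(steps, "steps")
    return errors
```

Passing module-provided types through `extra_types` keeps the built-in set closed while still letting packs such as the demo modules register their own step types.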
- Introduced `WaitSignalTimeoutFiredEvent` and `WorkflowStepTimeoutFiredEvent` to improve event handling for signal timeouts and workflow step timeouts.
- Updated `WaitSignalModule` to require `step_id` for disambiguating multiple waiters on the same signal, ensuring correct processing of concurrent waiters.
- Refactored `WorkflowLoopModule` to normalize `run_id` and handle timeouts more effectively, improving reliability in workflow execution.
- Added unit tests to validate the new timeout handling and ensure correct behavior when multiple waiters are present for the same signal.
These changes aim to enhance the robustness and clarity of signal and timeout management within the workflow system.
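The disambiguation rule described here can be sketched as a small registry of waiters. This Python example is an assumption-laden illustration, not `WaitSignalModule` itself: `SignalWaiters`, its method names, and the choice to return `None` for an ambiguous delivery are all hypothetical.

```python
class SignalWaiters:
    """Track steps waiting on a signal, scoped by run and signal name."""

    def __init__(self) -> None:
        self._waiters = {}  # (run_id, signal_name) -> set of step_ids

    def register(self, run_id: str, signal_name: str, step_id: str) -> None:
        if not step_id:
            raise ValueError("step_id is required")
        self._waiters.setdefault((run_id, signal_name), set()).add(step_id)

    def deliver(self, run_id: str, signal_name: str, step_id=None):
        """Return the step_id that was resumed, or None if nothing matched."""
        key = (run_id, signal_name)
        waiting = self._waiters.get(key, set())
        if step_id is None:
            # Without a step_id, delivery is only safe when exactly one step waits.
            if len(waiting) != 1:
                return None
            step_id = next(iter(waiting))
        if step_id not in waiting:
            return None
        waiting.discard(step_id)
        if not waiting:
            del self._waiters[key]
        return step_id
```

A timeout for a specific waiter could reuse `deliver` with the same triple, so that a timer firing for one step never completes a different step waiting on the same signal.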
- Introduced a new markdown document detailing the PR review audit for workflow concurrent run correlation, including audit scope, verification results, and scoring.
- Updated `WaitingForSignalEvent` to include `run_id` for improved event handling and traceability.
- Enhanced `WaitSignalModule` and `WorkflowLoopModule` to enforce `run_id` checks, ensuring robust handling of concurrent workflows.
- Refactored `MakerRecursiveModule` and `MakerVoteModule` to propagate `run_id` through events, maintaining isolation across runs.
- Added comprehensive unit tests to validate the new behavior and ensure correct propagation of `run_id` in various scenarios.
These changes aim to enhance the reliability and clarity of workflow execution and auditing processes.
- Introduced a new script `workflow_runid_guard.sh` to enforce the requirement that `RunId` must be explicitly set in `StepRequestEvent` and `StepCompletedEvent` initializers.
- Updated `architecture_guards.sh` to include the execution of the new run-id guard as part of the CI workflow.
These changes aim to enhance the validation of workflow events and ensure proper handling of run identifiers in the workflow system.
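The kind of check such a guard performs can be approximated in a few lines. The Python sketch below is a naive stand-in and does not reflect how `workflow_runid_guard.sh` is actually written: it scans C# source text for object initializers of the two event types and reports those without a `RunId =` assignment, ignoring comments and string literals. The function name is hypothetical.

```python
import re

EVENT_NEW = re.compile(r"\bnew\s+(StepRequestEvent|StepCompletedEvent)\b")
RUN_ID_ASSIGN = re.compile(r"\bRunId\s*=")


def find_missing_run_id(source: str):
    """Return (line_number, event_type) for initializers that never set RunId."""
    violations = []
    for match in EVENT_NEW.finditer(source):
        open_idx = source.find("{", match.end())
        stmt_end = source.find(";", match.end())
        if open_idx == -1 or (stmt_end != -1 and stmt_end < open_idx):
            # No object initializer before the statement ends.
            body = ""
        else:
            # Walk to the matching closing brace of the initializer.
            depth, i = 0, open_idx
            while i < len(source):
                if source[i] == "{":
                    depth += 1
                elif source[i] == "}":
                    depth -= 1
                    if depth == 0:
                        break
                i += 1
            body = source[open_idx + 1:i]
        if not RUN_ID_ASSIGN.search(body):
            line = source.count("\n", 0, match.start()) + 1
            violations.append((line, match.group(1)))
    return violations
```

A CI wrapper would run this over the workflow source tree and fail the build when the returned list is non-empty.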
…workflow
- Updated `WorkflowRunActorResolver` to prevent in-place reconfiguration of already bound actors when inline YAML is provided, ensuring isolated actor creation for new runs.
- Enhanced `WorkflowImplicitModuleDependencyExpander` to include implicit mapping of `cache` to `llm_call`, addressing previously identified dependency gaps.
- Added regression tests to validate the new behavior and ensure proper handling of workflow configurations.
- Achieved a score of 96/100 in the remediation review, recommending merge based on successful closure of critical issues.
These changes aim to improve the reliability and consistency of workflow execution with inline YAML and cache dependencies.
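Implicit dependency expansion amounts to a closure over a "this step type also needs that module" map. In the Python sketch below, only the `cache` to `llm_call` entry comes from this commit; the function name, the dictionary shape, and the transitive behaviour are assumptions for illustration.

```python
# Only the cache -> llm_call mapping is stated in the commit message.
IMPLICIT_DEPENDENCIES = {
    "cache": {"llm_call"},
}


def expand_modules(step_types, implicit=None):
    """Return the step types plus every module they implicitly depend on."""
    mapping = IMPLICIT_DEPENDENCIES if implicit is None else implicit
    required = set()
    stack = list(step_types)
    while stack:
        current = stack.pop()
        if current in required:
            continue  # already handled; also protects against cycles
        required.add(current)
        stack.extend(mapping.get(current, ()))
    return required
```

For a workflow that declares only `cache` and `delay`, `expand_modules(["cache", "delay"])` yields `cache`, `delay`, and `llm_call`, so the module a cache miss falls back to is installed even though no step names it directly.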
- Introduced multiple unit tests for the `CacheModule`, covering scenarios such as metadata handling, cache key shortening, and behavior when multiple requests share the same cache key.
- Enhanced `WorkflowImplicitModuleDependencyExpander` to ensure that implicit modules correctly include `llm_call` for various workflow types, improving dependency management.
- Added tests to validate the behavior of the `ResolveOrCreateAsync` method in `WorkflowRunActorResolver`, ensuring proper error handling for missing actors and unsupported actor types.
These changes aim to strengthen the testing framework and improve the reliability of workflow execution and module interactions.
- Introduced comprehensive unit tests for `ChatSessionKeys`, validating the creation of session IDs with various inputs and ensuring proper exception handling for invalid cases.
- Added tests for `ProjectionRuntimeCoverage`, verifying the resolution of metadata from providers and handling of missing providers, along with compensation event dispatching in the projection store.
- Enhanced the testing framework to improve reliability and coverage of workflow execution and projection functionalities.
…eryService
- Introduced new tests for the `WaitSignalModule`, covering scenarios for timeout handling, step completion, and behavior when no payload paths are present.
- Added a test for `WorkflowExecutionProjectionQueryService` to ensure it returns an empty root subgraph when the actor ID is null.
- Enhanced the testing framework to improve coverage and reliability of workflow execution and projection functionalities.
Summary
This branch introduces a major workflow capability upgrade against `dev`, centered on:
- `/api/chat` & `/api/ws/chat` `ChatInput` enhancement to support inline `workflowYaml`.
Why
Key Changes
1) Workflow primitives expansion + closed-world completeness
- `src/workflow/Aevatar.Workflow.Core/WorkflowCoreModulePack.cs`
- `switch`, `race`, `map_reduce`, `evaluate`, `reflect`, `guard`, `delay`, `emit`, `cache`, `wait_signal`, `human_input`, `human_approval`
- `while`, `foreach`, `parallel`, `workflow_call`, `workflow_loop`
- `src/workflow/Aevatar.Workflow.Core/Primitives/WorkflowPrimitiveCatalog.cs`
- `src/workflow/Aevatar.Workflow.Core/Expressions/WorkflowExpressionEvaluator.cs`
- `test/Aevatar.Integration.Tests/WorkflowTuringCompletenessTests.cs`
- `workflows/turing-completeness/counter-addition.yaml`
- `workflows/turing-completeness/minsky-inc-dec-jz.yaml`
- `tools/ci/workflow_closed_world_guards.sh`
2) `/api/chat` ChatInput and run-start semantics
- `ChatInput` now accepts `workflowYaml`:
- `src/workflow/Aevatar.Workflow.Application.Abstractions/Runs/WorkflowCapabilityApiModels.cs`
- `src/workflow/Aevatar.Workflow.Application.Abstractions/Runs/WorkflowChatRunModels.cs`
- `src/workflow/Aevatar.Workflow.Infrastructure/CapabilityApi/ChatEndpoints.cs`
- `src/workflow/Aevatar.Workflow.Infrastructure/CapabilityApi/ChatWebSocketCommandParser.cs`
- `src/workflow/Aevatar.Workflow.Application/Runs/WorkflowRunActorResolver.cs`
- `src/workflow/Aevatar.Workflow.Application.Abstractions/Runs/IWorkflowRunActorPort.cs`
- `src/workflow/Aevatar.Workflow.Infrastructure/Runs/WorkflowRunActorPort.cs`
- `INVALID_WORKFLOW_YAML`
- `WORKFLOW_NAME_MISMATCH`
- `src/workflow/Aevatar.Workflow.Infrastructure/CapabilityApi/ChatRunStartErrorMapper.cs`
3) WhileModule / WorkflowLoop runtime fixes
- `WhileModule` refactored to run-scoped state, expression-based continuation, and sub-parameter propagation:
- `src/workflow/Aevatar.Workflow.Core/Modules/WhileModule.cs`
- `WorkflowLoopModule` strengthened with:
- … `run_id`)
- `src/workflow/Aevatar.Workflow.Core/Modules/WorkflowLoopModule.cs`
4) Human-in-the-loop workflow support
- `run_id` added to core workflow execution events
- `WorkflowSuspendedEvent`, `WorkflowResumedEvent`, `WaitingForSignalEvent`, `SignalReceivedEvent`
- `src/workflow/Aevatar.Workflow.Abstractions/workflow_execution_messages.proto`
- `src/workflow/Aevatar.Workflow.Core/Modules/HumanInputModule.cs`
- `src/workflow/Aevatar.Workflow.Core/Modules/HumanApprovalModule.cs`
- `src/workflow/Aevatar.Workflow.Core/Modules/WaitSignalModule.cs`
- `src/workflow/Aevatar.Workflow.Projection/Reducers/WorkflowSuspendedEventReducer.cs`
5) Workflow demo suite and docs
- `demos/Aevatar.Demos.Workflow` `01` to `47` YAML workflows covering data/control/composition/HITL cases.
- `demos/Aevatar.Demos.Workflow.Web` (UI + custom demo modules + interactive execution).
- `docs/WORKFLOW.md`
- `docs/WORKFLOW_PRIMITIVES.md`
- `docs/architecture/workflow-closed-world-turing-completeness.md`
API Impact
- `POST /api/chat` and WS command payload now support `workflowYaml`.
- `INVALID_WORKFLOW_YAML`
- `WORKFLOW_NAME_MISMATCH`
- `run_id` and suspension/signal event types.
Compatibility Notes
- … `dev`), not a narrow patch.
Test Plan
- `dotnet test test/Aevatar.Integration.Tests/Aevatar.Integration.Tests.csproj --filter "FullyQualifiedName~WorkflowLoopModuleCoverageTests|FullyQualifiedName~WorkflowAdditionalModulesCoverageTests|FullyQualifiedName~WorkflowValidatorCoverageTests|FullyQualifiedName~WorkflowTuringCompletenessTests" --nologo`
- `dotnet test test/Aevatar.Workflow.Application.Tests/Aevatar.WorkflowApplication.Tests.csproj --nologo`
- `dotnet test test/Aevatar.Workflow.Host.Api.Tests/Aevatar.Workflow.Host.Api.Tests.csproj --nologo`
- `dotnet test test/Aevatar.Workflow.Core.Tests/Aevatar.Workflow.Core.Tests.csproj --nologo`