Add inline workflow bundle routing and YAML validation gate for chat runs #21

Merged: eanzhao merged 22 commits into dev from feature/workflow-call on Mar 5, 2026

eanzhao commented Feb 27, 2026

Problem

  • Chat run entry accepted a single inline YAML and had inconsistent normalization across HTTP, command, and WebSocket paths.
  • Dynamic workflow execution could proceed with malformed YAML fences.
  • workflow_call sub-workflow resolution relied only on registry names and could not consume inline workflow bundles.

Solution

  • Add a shared ChatRunRequestNormalizer and use it in /api/chat, command endpoint, and WebSocket coordinator.
  • Expand request model from workflowYaml to workflowYamls and support inline bundle parsing/validation in WorkflowRunActorResolver.
  • Add FileBackedWorkflowNameCatalog for file-backed workflow existence checks before run startup.
  • Extend ConfigureWorkflowEvent and workflow state with inline_workflow_yamls so parent and child workflow calls can share inline definitions.
  • Add workflow_yaml_validate module and wire it into built-in auto and auto_review flows before approval and execution.
  • Tighten DynamicWorkflowModule extraction and validation by selecting the last fenced YAML block and running parser/validator checks before reconfigure.
  • Update demo UI handling and Host/API capability docs to align with validation-first behavior.
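The "last fenced YAML block" rule in the DynamicWorkflowModule bullet can be illustrated with a short Python sketch. This is not the PR's C# implementation: the function name, the regular expression, and the choice to treat an unterminated fence as "no block" are all assumptions made for illustration. Parsing and validating the extracted YAML would follow as a separate step.

```python
import re
from typing import Optional

# Built programmatically so the example nests cleanly inside Markdown.
FENCE = "`" * 3

# A complete block: an opening fence tagged yaml/yml, a body, and a closing
# fence on its own line. An unterminated fence does not match at all.
_YAML_BLOCK = re.compile(
    rf"^{FENCE}(?:yaml|yml)[ \t]*\r?\n(.*?)^{FENCE}[ \t]*\r?$",
    re.DOTALL | re.MULTILINE | re.IGNORECASE,
)


def extract_last_yaml_block(text: str) -> Optional[str]:
    """Return the body of the last complete fenced YAML block, or None."""
    blocks = _YAML_BLOCK.findall(text)
    if not blocks:
        return None
    body = blocks[-1].strip()
    # An empty block is treated like a missing one (assumption).
    return body or None
```

Taking the last block rather than the first means that a model response which drafts a workflow, critiques it, and then emits a corrected version is read as the corrected version.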

Impact

  • API input shape changes from workflowYaml to workflowYamls (array). Clients should migrate.
  • Prompt-only chat requests normalize to auto when neither workflow name nor inline YAML bundle is provided.
  • Inline multi-workflow scenarios can now resolve workflow_call targets from actor state.
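For clients migrating to the new input shape, the change and the prompt-only fallback can be sketched in Python as below. The field names `workflowYaml`, `workflowYamls`, and `workflow` come from this PR; the `prompt` field, the helper names, and the decision to fold a legacy `workflowYaml` value into the array are assumptions, and the PR does not say whether the server still accepts the legacy field. A later commit in this PR narrows the `auto` fallback to the case where a new actor is created, which this sketch does not model.

```python
from typing import Any, Dict

DEFAULT_WORKFLOW = "auto"


def migrate_chat_payload(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Rewrite a legacy single-YAML payload into the array-based shape."""
    migrated = {k: v for k, v in payload.items() if k != "workflowYaml"}
    yamls = list(payload.get("workflowYamls") or [])
    legacy = payload.get("workflowYaml")
    if isinstance(legacy, str) and legacy.strip():
        # Keep the legacy document first so it stays the entry workflow.
        yamls.insert(0, legacy)
    migrated["workflowYamls"] = [
        y for y in yamls if isinstance(y, str) and y.strip()
    ]
    return migrated


def normalize_chat_run(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Fall back to `auto` when neither a name nor inline YAML is given."""
    normalized = migrate_chat_payload(payload)
    workflow = (normalized.get("workflow") or "").strip()
    if not workflow and not normalized["workflowYamls"]:
        workflow = DEFAULT_WORKFLOW
    normalized["workflow"] = workflow or None
    return normalized
```

Running the same normalization on every entry path (HTTP, command, WebSocket) is what removes the inconsistency described under Problem.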

Testing

  • Updated test/Aevatar.Workflow.Application.Tests/WorkflowApplicationLayerTests.cs for:
    • fallback behavior with inline YAML bundles
    • actor resolution with inline-bundle precedence and invalid YAML handling
    • updated port signatures carrying inline workflow maps
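The "inline-bundle precedence" case can be pictured with the following Python sketch: a `workflow_call` target is looked up in the inline bundle first and in the registry second. The function name, the error type, and the exact-match (case-sensitive) lookup are assumptions; the PR's resolver is C# and may differ in all three.

```python
from typing import Mapping


class WorkflowNotFoundError(LookupError):
    """Raised when neither the inline bundle nor the registry has the name."""


def resolve_workflow_yaml(
    name: str,
    inline_workflow_yamls: Mapping[str, str],
    registry: Mapping[str, str],
) -> str:
    """Return the YAML for `name`, preferring inline definitions."""
    key = name.strip()
    # Inline definitions shadow registered workflows of the same name.
    if key in inline_workflow_yamls:
        return inline_workflow_yamls[key]
    if key in registry:
        return registry[key]
    raise WorkflowNotFoundError(
        f"workflow '{key}' is neither defined inline nor registered"
    )
```

Because the inline map travels with the workflow state, a child workflow started through `workflow_call` can perform the same lookup as its parent.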

Verification Checklist

  • dotnet build aevatar.slnx --nologo
  • dotnet test test/Aevatar.Workflow.Application.Tests/Aevatar.Workflow.Application.Tests.csproj --nologo
  • dotnet test test/Aevatar.Workflow.Host.Api.Tests/Aevatar.Workflow.Host.Api.Tests.csproj --nologo

Docs

  • docs/workflow-chat-ws-api-capability.md
  • src/workflow/Aevatar.Workflow.Host.Api/CHAT_API_CAPABILITIES.md
  • src/workflow/Aevatar.Workflow.Host.Api/README.md

…g management

- Introduced a new extension model for RoleGAgent, allowing behavior customization without creating derived classes.
- Added standard events for app state and config management: SetRoleAppConfigEvent and SetRoleAppStateEvent.
- Updated RoleGAgentState to include app state and config properties, enabling event sourcing recovery.
- Enhanced documentation in ROLE.md to outline the new extension conventions and codec/versioning rules.
- Added comprehensive tests to validate the new functionality and ensure backward compatibility.
- Added environment variables for API keys in the CI configuration to support external integrations.
- Updated project references in the Aevatar.Demos.Workflow project to include new application and infrastructure modules.
- Implemented multilevel workflow calling with new YAML definitions for sub-workflows, enhancing the workflow execution capabilities.
- Improved the web application UI with new features for workflow interaction and logging, including state persistence and detailed log entries.
- Enhanced documentation for workflow primitives and added a comprehensive audit scorecard for uncommitted changes.

These changes aim to improve the overall functionality, usability, and maintainability of the workflow system.

… hotplug design

- Created a comprehensive document outlining the capabilities of the Workflow Chat API, detailing endpoints, input models, and automatic orchestration features.
- Introduced a design document for the Aevatar Event Module hotplug extension, discussing the background, proposed solutions, and lifecycle management for dynamic module loading.
- Added a new Workflow YAML validation module to enhance workflow processing and error handling.
- Implemented a normalization utility for chat run requests to streamline input handling and workflow name resolution.
- Established a file-backed workflow name catalog for improved workflow name management.

These changes aim to enhance the documentation and modularity of the workflow system, improving usability and extensibility.

- Introduced a new `WaitSignalTimeoutFiredEvent` and updated `SignalReceivedEvent` to include `step_id`, improving the granularity of signal handling.
- Refactored `WaitSignalModule` to utilize a `PendingSignalKey` structure, allowing for better management of pending signals with step context.
- Updated `WorkflowLoopModule` to handle step timeouts more effectively, ensuring that timeouts are associated with the correct step and run context.
- Enhanced `MakerRecursiveModule` and `MakerVoteModule` to incorporate run ID in their processing, ensuring consistency across workflow executions.
- Added comprehensive tests to validate the new signal handling and timeout mechanisms, ensuring robustness in concurrent scenarios.

These changes aim to improve the reliability and clarity of workflow execution, particularly in handling signals and timeouts, thereby enhancing overall system performance.
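One way to picture why a step-scoped key helps is the Python sketch below. The type name `PendingSignalKey` comes from the commit message above, but its fields (`run_id`, `signal_name`, `step_id`) and the surrounding `PendingSignals` class are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PendingSignalKey:
    # Field set is assumed; the real structure may carry more or less.
    run_id: str
    signal_name: str
    step_id: str


class PendingSignals:
    """Tracks steps that are waiting on a signal or on its timeout."""

    def __init__(self) -> None:
        self._waiting: Dict[PendingSignalKey, str] = {}

    def wait(self, key: PendingSignalKey, note: str) -> None:
        self._waiting[key] = note

    def complete(self, key: PendingSignalKey) -> Optional[str]:
        # Used for both "signal received" and "timeout fired": whichever
        # arrives first removes the entry, the later one gets None and
        # can be ignored as stale.
        return self._waiting.pop(key, None)
```

With the step and run in the key, two steps waiting on the same signal name do not consume each other's signals, and a timeout that fires after its step has already completed finds no entry to act on.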
…w examples

- Introduced ergonomic aliases for connector calls, including `foreach_llm`, `map_reduce_llm`, and semantic aliases for HTTP methods, improving usability and clarity in workflow definitions.
- Added new examples demonstrating the use of ergonomic aliases in YAML workflows, showcasing their application in various scenarios.
- Updated documentation to reflect the new aliases and their normalization to canonical primitives, enhancing the understanding of workflow primitives.
- Included a new demo for connector integration using a local CLI connector, illustrating practical usage within workflows.

These changes aim to improve the developer experience and streamline the integration of connectors in workflow definitions.

eanzhao requested a review from loning on March 4, 2026, 03:15
loning added 15 commits on March 5, 2026, 01:02

- Removed the file-backed workflow name catalog, streamlining workflow name resolution to utilize registered workflows (both built-in and file-loaded).
- Updated the `ChatInput` model and related endpoints to reflect changes in workflow name handling, ensuring clarity in the use of `workflow` and `workflowYamls`.
- Enhanced the normalization process for chat run requests, ensuring that the absence of `workflow` and `workflowYamls` defaults to `auto` only when creating new actors.
- Revised documentation across multiple files to align with the new workflow semantics and improve clarity for developers.

These changes aim to enhance the usability and maintainability of the workflow system, ensuring a more consistent and intuitive experience for users.

- Introduced a comprehensive blueprint for the complete rearchitecture of the GAgent configuration system, outlining key decisions and architectural constraints.
- Removed legacy configuration models and established a new single source of truth for configuration profiles, enhancing clarity and reducing update complexity.
- Defined new domain models and responsibilities, including `ConfigProfile`, `InstanceConfigBinding`, and `RunConfigLease`, to streamline configuration management.
- Documented migration strategies and risks associated with the transition to the new architecture, ensuring a clear path for implementation.

These changes aim to improve the robustness and maintainability of the GAgent configuration system, addressing existing complexities and enhancing overall system performance.

- Introduced a new document outlining the proposed architectural changes for the actor runtime message activation mechanism, emphasizing a shift to message-driven activation.
- Removed the `RestoreAllAsync` method from `IActorRuntime` and eliminated the default behavior of host-level restoration during startup.
- Updated the framework to ensure that recovery strategies are implemented internally within the runtime, rather than being exposed at the abstraction or host levels.
- Documented the implications of these changes on the local and Orleans runtime implementations, ensuring clarity on the new activation semantics and responsibilities.

These changes aim to enhance the clarity and maintainability of the actor runtime, aligning it with the new architectural principles of message-driven activation.

- Removed the `ActorRestoreHostedService` and its associated configuration, streamlining the actor runtime initialization process.
- Updated `IActorRuntime` to eliminate the `RestoreAllAsync` method, focusing on actor lifecycle management without external restoration dependencies.
- Enhanced the `LocalActorRuntime` to ensure actors are materialized on demand, improving efficiency and clarity in actor retrieval.
- Revised documentation to reflect changes in actor management and runtime responsibilities, ensuring alignment with the new architecture.

These changes aim to simplify the actor runtime and improve its maintainability by reducing unnecessary complexity in the restoration process.

- Introduced `Aevatar.Foundation.Runtime.Implementations.Local`, providing a local in-process implementation of the actor runtime.
- Added `LocalActor`, `LocalActorRuntime`, and `LocalActorPublisher` to manage actor lifecycle, event routing, and stream subscriptions.
- Implemented dependency injection extensions for seamless integration of the local runtime into existing services.
- Updated documentation to reflect the new local runtime capabilities and usage patterns, enhancing clarity for developers.

These changes aim to improve the flexibility and performance of the Aevatar framework by enabling a robust local execution environment for actors.

- Overhauled the GAgent configuration system, eliminating legacy models and establishing a zero-manifest architecture to streamline configuration management.
- Removed `AgentManifest` and `IAgentManifestStore` from `Foundation.Abstractions`, ensuring that configuration profiles are solely managed by the host and infrastructure.
- Introduced a local activation index for lazy materialization of actors, enhancing performance while maintaining clear separation of concerns.
- Updated documentation to reflect the new architecture and configuration strategies, ensuring clarity for developers and aligning with the revised activation semantics.

These changes aim to improve the robustness and maintainability of the GAgent framework by addressing existing complexities and enhancing overall system performance.

- Incremented document version to `v5` and refined architectural details regarding class default configuration management.
- Introduced requirements for cluster-wide management and hot reload capabilities for class defaults, ensuring seamless updates without runtime restarts.
- Enhanced documentation to clarify the new configuration constraints and responsibilities, emphasizing the importance of a unified configuration source and event-driven updates.
- Updated consistency semantics to ensure that class defaults are applied effectively across instances in a distributed environment.

These changes aim to improve the clarity and robustness of the actor runtime's configuration management, aligning with the latest architectural principles.

- Eliminated the `IAgentManifestStore` and associated `AgentManifest` from the configuration system, transitioning to a zero-manifest architecture for improved clarity and performance.
- Updated `GAgentBase` and related classes to remove all references to manifest persistence, ensuring that configuration is solely managed through event sourcing and runtime defaults.
- Introduced a local activation index to enhance actor materialization efficiency while maintaining separation of concerns.
- Revised documentation to reflect the new architecture and configuration strategies, ensuring clarity for developers.

These changes aim to streamline the GAgent framework, enhancing maintainability and performance by addressing legacy complexities.

- Incremented document version to `v7` and expanded architectural details regarding the unified configuration management for `GAgentBase<TState, TConfig>`.
- Introduced new responsibilities for configuration merging and event sourcing, clarifying the roles of class defaults and state overrides.
- Enhanced documentation to reflect the updated configuration strategies, including hot reload capabilities and the prohibition of independent configuration persistence.
- Revised lifecycle semantics to ensure effective configuration updates without requiring process restarts, improving overall system responsiveness.

These changes aim to further refine the actor runtime's configuration management, ensuring clarity and robustness in line with the latest architectural principles.

- Introduced `AIAgentConfigOverrides` message to encapsulate configuration options for AI agents, allowing for more granular control over agent settings.
- Refactored `RoleGAgentState` to utilize the new configuration overrides, improving the clarity and maintainability of state management.
- Updated `AIGAgentBase` to merge class defaults with state overrides, ensuring effective configuration updates during state changes.
- Enhanced dependency injection to support class defaults provider, streamlining configuration retrieval and application.
- Revised tests to validate the new configuration structure and ensure proper functionality of state overrides.

These changes aim to improve the flexibility and robustness of the AIGAgent framework, aligning with the latest architectural principles for configuration management.
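The merge of class defaults with state overrides described in this commit (the merged result is named `EffectiveConfig` in a later commit of this PR) can be sketched in Python as follows. The concrete fields (`model`, `temperature`, `max_tokens`) and the rule "an unset override falls through to the class default" are assumptions; the actual `AIAgentConfigOverrides` message may define different fields and semantics.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class AgentConfig:
    # Hypothetical class-level defaults.
    model: str
    temperature: float
    max_tokens: int


@dataclass(frozen=True)
class AgentConfigOverrides:
    # None means "not overridden"; any other value wins over the default.
    model: Optional[str] = None
    temperature: Optional[float] = None
    max_tokens: Optional[int] = None


def effective_config(
    defaults: AgentConfig, overrides: AgentConfigOverrides
) -> AgentConfig:
    """Merge per-instance overrides on top of class defaults."""
    updates = {
        f.name: getattr(overrides, f.name)
        for f in fields(overrides)
        if getattr(overrides, f.name) is not None
    }
    return replace(defaults, **updates)
```

Checking for `None` rather than truthiness matters here: an explicit override of `0.0` or `0` is a real value and should not silently fall back to the default.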
- Replaced `ConfigureRoleAgentEvent` with `InitializeRoleAgentEvent` to clarify the distinction between initialization parameters and configuration settings.
- Removed deprecated app configuration and state events, streamlining the initialization process for `RoleGAgent`.
- Updated `RoleGAgent` and related classes to utilize the new initialization event, enhancing clarity and maintainability.
- Revised documentation to reflect the changes in initialization semantics and configuration boundaries, ensuring alignment with architectural principles.
- Enhanced tests to validate the new initialization flow and ensure proper functionality of the `RoleGAgent`.

These changes aim to improve the robustness and clarity of the `RoleGAgent` framework, aligning with the latest architectural standards for configuration management.

- Removed `SetRoleName` method and `RoleAgentInitialization` class from `IRoleAgent` interface to simplify the initialization process.
- Updated `RoleGAgent` to utilize `InitializeRoleAgentEvent` for handling initialization, enhancing clarity and maintainability.
- Revised related tests to reflect changes in initialization flow and ensure proper functionality of the `RoleGAgent`.

These changes aim to streamline the role agent's initialization and improve the overall architecture of the agent framework.

- Replaced instances of ConfigureWorkflowEvent with BindWorkflowDefinitionEvent across multiple files to improve clarity in workflow binding semantics.
- Updated related methods and documentation to reflect the new event structure, ensuring consistency in workflow management.
- Revised tests to validate the changes and ensure proper functionality of the workflow binding process.

These changes aim to enhance the clarity and maintainability of the workflow system by establishing a more descriptive event for binding workflow definitions.

- Renamed `Config` to `EffectiveConfig` across multiple files to clarify its role as the merged configuration of class defaults and state overrides.
- Updated related methods and documentation to reflect the new naming and ensure consistency in configuration management.
- Revised tests to validate the changes and ensure proper functionality of the `EffectiveConfig` implementation.

These changes aim to enhance clarity and maintainability in the GAgent framework's configuration management.

- Removed the RoleExtensionsInput class and integrated its functionality directly into RoleConfigurationInput, streamlining the normalization process for event modules and routes.
- Updated RoleGAgentFactory and WorkflowParser to utilize a new PreferTopLevelText method for prioritizing top-level event fields over extensions, enhancing clarity in event configuration.
- Revised tests to reflect changes in the normalization logic and ensure proper functionality of event field binding.

These changes aim to improve the maintainability and clarity of role configuration management within the Aevatar framework.

loning commented Mar 4, 2026

I separated configure from initialize, pushed the manifest down into LocalActor, and moved LocalActor out into its own project to avoid confusion.

- Updated the starting version in the CreateEvents method from 2 to 1 to ensure proper handling of optimistic concurrency conflicts during event appending.
- Revised the test to validate that an InvalidOperationException is thrown with the expected message when a conflict occurs.

This change aims to enhance the reliability of the integration tests for the event store by accurately simulating concurrency scenarios.
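The optimistic concurrency check that this test exercises can be illustrated with a small in-memory Python sketch. The class and method names are hypothetical, and the convention that the expected version equals the number of events already in the stream is an assumption; the PR's event store is C# and signals the conflict with an InvalidOperationException.

```python
from typing import Dict, List


class ConcurrencyConflictError(RuntimeError):
    """Raised when the caller's expected version is out of date."""


class InMemoryEventStore:
    def __init__(self) -> None:
        self._streams: Dict[str, List[str]] = {}

    def append(
        self, stream_id: str, events: List[str], expected_version: int
    ) -> int:
        """Append events if the stream is still at `expected_version`."""
        stream = self._streams.setdefault(stream_id, [])
        current_version = len(stream)
        if expected_version != current_version:
            raise ConcurrencyConflictError(
                f"expected version {expected_version}, "
                f"but stream is at {current_version}"
            )
        stream.extend(events)
        return len(stream)
```

A test can provoke the conflict deterministically by appending once and then appending again with the same, now stale, expected version.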
eanzhao merged commit 8fab1a1 into dev on Mar 5, 2026
8 checks passed
eanzhao added a commit that referenced this pull request on May 8, 2026
Address review batch on PR #562 (10 inline comments). All in files I have
recent ownership of and require no architectural shifts:

- #16 (blocker, security): ssh_exec is now opt-in via NyxIdToolOptions.
  EnableSshExecTool. Hosts that haven't wired the approval middleware no
  longer see the tool by default. Mainnet host opts in (Lark bot needs it).
- #21 (major, bug): code_execute keeps the modern /execute + {language,
  script} contract, but on a NyxID-proxy upstream 404 it retries the legacy
  /run + {language, code} contract so deployments still pinned to old
  chrono-sandbox-service builds keep working.
- #22 (major, bug): SkillRegistry.IsFresh now exempts SkillSource != Remote
  from TTL — local skills are baked in at registration and don't need
  expiring; prior behavior dropped them from use_skill after the first 5min.
- #18 (major, bug): TurnRunner.TryResolveSenderBindingAsync narrows the
  catch to transient infra errors (Http/Timeout/IO/JSON) and surfaces
  non-transient (logic, NRE, serialization) at Error level so ops can
  distinguish "sender unbound" from "binding store broken".
- #19 (major, bug): ConversationReplyGenerator narrows the
  sender-route-fallback catch to transient errors via
  IsRetryableSenderRouteFailure. Programmer errors no longer cost an LLM
  round on retry.
- #29 + #30 (minor): inbox runtime gives metadata enrichment its own 15s
  budget separate from the LLM run, surfacing
  errorCode=llm_reply_metadata_timeout when scope/UserConfig lookup is
  slow. ResolveFallbackTimeout treats ResponseTimeoutSeconds<=0 as "no
  timeout" rather than silently snapping back to 120s.
- #12 (minor): ConversationGAgent's stream-chunk and final-stream-chunk
  edits run under a 10s CTS now; the failure path already uses one. A hung
  relay can no longer pin the actor turn forever.
- #27 (minor, security): ConstantTimeEquals docstring tightened — removed
  the "for future callers" line and added a SCOPE comment that this helper
  is rebuild-admin-only and shouldn't be promoted to internal/public
  without replacing its length-leak with a length-padding scheme.
- #23 (major, bug): CLI ornn skills slug default → ornn-api (matches the
  registered slug; bare "ornn" is the SPA frontend that returns HTML).

Build clean (NyxId / Skills / NyxidChat / Mainnet hosts), 30 AI tests +
15 inbox runtime tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 participants