Deployed pipelines #3920
Merged
Conversation
- Add core direct execution engine that can run ZenML pipelines locally
- Implement step-by-step execution with proper artifact handling
- Add support for parameter injection and step output resolution
- Include comprehensive logging and error handling
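As a rough illustration of the engine described above, a direct-execution loop might look something like the sketch below. All names and data structures here are assumptions for illustration, not the PR's actual `DirectExecutionEngine` code:

```python
import logging
from typing import Any, Callable, Dict, List, Optional, Tuple

logger = logging.getLogger(__name__)

# Each step: (step name, callable, wiring of kwarg name -> upstream step name,
# or None when the value comes from the request parameters).
Step = Tuple[str, Callable[..., Any], Dict[str, Optional[str]]]


def run_pipeline(steps: List[Step], parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Run steps in order, injecting parameters and upstream step outputs."""
    outputs: Dict[str, Any] = {}
    for name, fn, wiring in steps:
        # Inject request parameters and upstream outputs as keyword arguments.
        kwargs = {
            arg: outputs[src] if src is not None else parameters[arg]
            for arg, src in wiring.items()
        }
        logger.info("Running step '%s'", name)
        try:
            outputs[name] = fn(**kwargs)
        except Exception:
            logger.exception("Step '%s' failed", name)
            raise
    return outputs
```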
This update removes the complex output resolution logic from the DirectExecutionEngine, allowing it to directly use the step output returned by the step function. This change simplifies the code and improves performance by eliminating unnecessary exception handling for output resolution. Additionally, comprehensive logging has been maintained to ensure clarity in the execution process. No functional changes are introduced, and the code remains backward compatible.
This update introduces a new method, `_resolve_step_output`, to handle the resolution of specific outputs from a step's return value. The method accommodates various output formats, including single values, dictionaries, and tuples/lists, improving the flexibility and robustness of output handling.
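Based on that description, a minimal sketch of such a resolver might look like the following. This is hypothetical code, not the actual `_resolve_step_output` implementation from the commit:

```python
from typing import Any


def _resolve_step_output(return_value: Any, output_name: str, output_index: int) -> Any:
    """Resolve a named output from a step's raw return value."""
    if isinstance(return_value, dict):
        # Steps may return a mapping of output names to values.
        return return_value[output_name]
    if isinstance(return_value, (tuple, list)):
        # Multiple outputs returned positionally.
        return return_value[output_index]
    # Single-output steps return the value directly.
    return return_value
```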
This commit introduces a new example demonstrating a conversational AI chat agent pipeline that integrates with ZenML's serving infrastructure. The pipeline allows for real-time chat applications, utilizing OpenAI's API to generate responses based on user input. The README.md has been updated to include this new example, along with a brief overview of its features and usage instructions.

New files:
- `examples/serving/chat_agent_pipeline.py`: implementation of the chat agent pipeline.
- Updates to `examples/serving/README.md` to document the new example.
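A condensed, hypothetical version of such a pipeline is sketched below; the model name and exact step wiring are assumptions, and the real implementation lives in `examples/serving/chat_agent_pipeline.py`:

```python
from zenml import pipeline, step


@step
def generate_reply(message: str) -> str:
    # The real example calls OpenAI's chat completions API, reading the
    # OPENAI_API_KEY environment variable. Model name is an assumption.
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": message}],
    )
    return response.choices[0].message.content or ""


@pipeline
def chat_agent_pipeline(message: str) -> str:
    return generate_reply(message)
```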
This update introduces Docker settings for the chat and weather agent pipelines, allowing them to utilize the OpenAI API key from environment variables. Additionally, the pipeline decorators have been updated to include these settings. Also, CORS middleware has been added to the FastAPI application to enable frontend access, with a note to restrict origins in production for security. Enhancements to the parameter schema extraction in the PipelineServingService have been implemented, improving the extraction of function signatures and parameter types. New request and response models for pipeline execution and chat interface have been added to the pipeline endpoints. Fixes #3904
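The CORS addition follows the standard FastAPI pattern; a minimal sketch, carrying the production caveat noted above, looks like this:

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # NOTE: restrict to known frontend origins in production
    allow_methods=["*"],
    allow_headers=["*"],
)
```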
This commit introduces a comprehensive framework for managing capture policies in ZenML's pipeline serving. It includes five distinct capture modes to control the observability of requests, balancing privacy and performance. Additionally, step-level capture annotations have been implemented, allowing for fine-grained control over which inputs and outputs are captured for each step. This enhancement provides users with the ability to specify capture behavior directly in their pipeline definitions. New documentation has been added to explain the capture policies and their configurations, along with examples demonstrating their usage in both pipeline and step contexts. Fixes #3911
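Since this description does not spell out the identifiers introduced by the PR, the following is a purely illustrative sketch of what five capture modes could look like; every name and value below is an invented placeholder:

```python
from enum import Enum


class CaptureMode(str, Enum):
    FULL = "full"          # capture all step inputs and outputs (placeholder)
    INPUTS = "inputs"      # capture step inputs only (placeholder)
    OUTPUTS = "outputs"    # capture step outputs only (placeholder)
    METADATA = "metadata"  # capture timing/metadata, no payloads (placeholder)
    NONE = "none"          # capture nothing, maximum privacy (placeholder)
```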
This commit introduces a new dependency injection system for ZenML's serving components, enhancing modularity and testability. Key changes include the creation of a `ServingContainer` class to manage service instances and their initialization order. The FastAPI application now utilizes dependency injection for accessing services like `PipelineServingService`, `JobRegistry`, and `StreamManager`. Additionally, several global service instances have been removed to streamline the architecture, and the lifespan management of the FastAPI application has been improved. This refactor lays the groundwork for better service management and easier testing.
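As a hedged illustration of the pattern described (not the PR's actual `ServingContainer`), the dependency-injection wiring might look roughly like this; the service classes are stubs standing in for the real ones:

```python
from fastapi import Depends, FastAPI


# Stubs standing in for the real serving services.
class PipelineServingService: ...
class JobRegistry: ...
class StreamManager: ...


class ServingContainer:
    """Holds service singletons and controls their initialization order."""

    def __init__(self) -> None:
        self.job_registry = JobRegistry()
        self.stream_manager = StreamManager()
        self.serving_service = PipelineServingService()


container = ServingContainer()
app = FastAPI()


def get_serving_service() -> PipelineServingService:
    # FastAPI dependency that replaces a global service instance.
    return container.serving_service


@app.get("/health")
def health(service: PipelineServingService = Depends(get_serving_service)) -> dict:
    return {"status": "ok", "service": type(service).__name__}
```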
schustmi approved these changes on Sep 26, 2025.
Describe changes
This PR introduces pipeline deployment (i.e. serving a pipeline as an HTTP endpoint) capability to ZenML.
Deployer Stack Components
A new Deployer stack component type is introduced with 3 different implementations / flavors:

- `docker`: manages pipeline deployments as locally running Docker containers
- `gcp`: manages pipeline deployments as GCP Cloud Run instances
- `aws`: manages pipeline deployments as AWS App Runner instances

Add a Deployer to your stack to unlock the ability to deploy pipelines and snapshots:
Example:
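The original snippet is not reproduced here; as a minimal, hypothetical sketch, deploying a pipeline once a Deployer (e.g. the `docker` flavor) is registered on the active stack might look like this. The `deploy` method name and its arguments are assumptions based on this description, not a confirmed API:

```python
from zenml import pipeline, step


@step
def greet(name: str) -> str:
    return f"Hello, {name}!"


@pipeline
def greeting_pipeline(name: str = "world") -> str:
    return greet(name)


if __name__ == "__main__":
    # With a Deployer in the active stack, this would create a long-running
    # HTTP deployment instead of a one-off pipeline run (assumed API).
    deployment = greeting_pipeline.deploy(deployment_name="greeting")
```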
Deployments
Introducing Deployments as entities resulting from deploying pipelines. These are representations of the HTTP servers used to run pipelines in online / inference mode. They can be managed independently of pipelines and the active stack.
Examples:
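The original example snippets are not reproduced here; as one hedged illustration, invoking a running deployment over HTTP might look like the following, where the endpoint path, port, and payload shape are all assumptions:

```python
import requests

resp = requests.post(
    "http://localhost:8000/invoke",  # assumed endpoint of a local deployment
    json={"parameters": {"name": "Ada"}},  # assumed request payload shape
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```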
Pipeline Deployment DevX
It is possible to deploy any existing pipeline without modification. The real utility, however, comes from parameterized pipeline steps: the steps that expect input parameters are the ones that dictate the signature of the HTTP endpoint, as sketched below.
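A hedged example with illustrative names: step parameters that are not produced by upstream steps become request parameters of the HTTP endpoint.

```python
from zenml import pipeline, step


@step
def forecast(city: str, days: int = 3) -> str:
    return f"{days}-day forecast for {city}: sunny"


@pipeline
def weather_pipeline(city: str, days: int = 3) -> str:
    # `city` and `days` have no upstream producers, so they define the
    # deployment's endpoint signature (e.g. {"city": "...", "days": 3}).
    return forecast(city=city, days=days)
```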
Two new pipeline-level hooks are supported for initialization and cleanup. These hooks are only executed once per deployment, during server startup and shutdown. The initialization hook may also take input parameters and return a value which is preserved as the global state:
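A hypothetical sketch of these hooks follows; the `on_init` / `on_cleanup` parameter names are assumptions inferred from the description above:

```python
from zenml import pipeline, step


def load_resources(model_name: str = "default-model") -> dict:
    # Executed once at server startup; the return value is preserved as
    # global state for the lifetime of the deployment.
    return {"model_name": model_name}


def release_resources() -> None:
    # Executed once at server shutdown.
    pass


@step
def answer(question: str) -> str:
    return f"Answering: {question}"


# Hook parameter names below are assumptions, not the confirmed API.
@pipeline(on_init=load_resources, on_cleanup=release_resources)
def qa_pipeline(question: str) -> str:
    return answer(question)
```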
Pre-requisites
Please ensure you have done the following:
- My branch is based on `develop` and the open PR is targeting `develop`. If your branch wasn't based on `develop`, read the Contribution guide on rebasing branch to `develop`.

Types of changes