@stefannica commented Sep 2, 2025

Describe changes

This PR introduces pipeline deployment (i.e. serving a pipeline as an HTTP endpoint) capability to ZenML.

Deployer Stack Components

A new Deployer stack component type is introduced, with three different implementations (flavors):

  • docker: manages pipeline deployments as locally running Docker containers
  • gcp: manages pipeline deployments as GCP Cloud Run instances
  • aws: manages pipeline deployments as AWS App Runner instances

Add a Deployer to your stack to unlock the ability to deploy pipelines and snapshots:

Example:

# Register a Docker deployer and add it to a stack
zenml deployer register docker -f docker
zenml stack register docker-deployer -o default -a default -D docker --set

# Deploy a pipeline or an existing snapshot as an HTTP endpoint
zenml pipeline deploy --name my-endpoint my_module.my_pipeline
zenml pipeline snapshot deploy --name my-endpoint my-snapshot

curl -X POST "http://localhost:8000/invoke" \
  -H "Content-Type: application/json" \
  -d '{"parameters": {"city": "Munich"}}'

or

zenml pipeline endpoint invoke my-endpoint --param-one=value-one --param-two=value-two

Deployments

Deployments are introduced as the entities that result from deploying pipelines. They represent the HTTP servers used to run pipelines in online / inference mode and can be managed independently of pipelines and the active stack.

Examples:

zenml deployment list
zenml deployment describe my-endpoint
zenml deployment invoke my-endpoint --param-one=value-one --param-two=value-two
zenml deployment logs -f my-endpoint
zenml deployment deprovision my-endpoint

Pipeline Deployment DevX

Any existing pipeline can be deployed without modification. However, the real utility comes from parameterized pipeline steps:

from zenml import pipeline

@pipeline
def weather_agent_pipeline(city: str = "London") -> str:
    weather_data = get_weather(city=city)
    result = analyze_weather_with_llm(weather_data=weather_data, city=city)
    return result
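
For context, a minimal sketch of the get_weather step this pipeline calls (the body is a hypothetical stub; a real step would query a weather API):

from typing import Dict

from zenml import step

@step
def get_weather(city: str) -> Dict[str, float]:
    # Hypothetical stub for illustration only; a real implementation
    # would call a weather API for the given city.
    return {"temperature_celsius": 21.0, "humidity_percent": 40.0}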

The input parameters expected by the pipeline steps are what dictate the signature of the HTTP endpoint:

curl -X POST "http://localhost:8000/invoke" \
  -H "Content-Type: application/json" \
  -d '{"parameters": {"city": "Munich"}}'

Two new pipeline-level hooks are supported for initialization and cleanup. These hooks are executed only once per deployment, during server startup and shutdown respectively. The initialization hook may also take input parameters and return a value, which is preserved as the global pipeline state:

from typing import Any, Dict

import openai
from zenml import get_step_context, pipeline, step

def init_hook(**kwargs: Any) -> openai.OpenAI:
    # The kwargs are supplied via on_init_kwargs below.
    return openai.OpenAI(**kwargs)

@step
def analyze_weather_with_llm(weather_data: Dict[str, float], city: str) -> str:
    step_context = get_step_context()
    llm = step_context.pipeline_state
    ...

@pipeline(
    on_init=init_hook,
    on_init_kwargs={"url": "openapi.org/..."},
)
def weather_agent_pipeline(city: str = "London") -> str:
    weather_data = get_weather(city=city)
    result = analyze_weather_with_llm(weather_data=weather_data, city=city)
    return result
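
The cleanup counterpart is not shown in this description. A minimal sketch, assuming a symmetric on_cleanup pipeline argument (the argument name is an assumption, not confirmed above):

def cleanup_hook() -> None:
    # Assumed hook: release resources held in the pipeline state,
    # e.g. close HTTP client connections, once at server shutdown.
    ...

@pipeline(
    on_init=init_hook,
    on_cleanup=cleanup_hook,  # assumed argument name, symmetric to on_init
)
def weather_agent_pipeline(city: str = "London") -> str:
    ...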

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • I have added tests to cover my changes.
  • I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop, read the contribution guide on rebasing a branch to develop.
  • IMPORTANT: I made sure that my changes are reflected properly in the following resources:
    • ZenML Docs
    • Dashboard: Needs to be communicated to the frontend team.
    • Templates: Might need adjustments (that are not reflected in the template tests) in case of non-breaking changes and deprecations.
    • Projects: Depending on the version dependencies, different projects might get affected.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other (add details above)

stefannica and others added 30 commits August 25, 2025 22:54
- Add core direct execution engine that can run ZenML pipelines locally
- Implement step-by-step execution with proper artifact handling
- Add support for parameter injection and step output resolution
- Include comprehensive logging and error handling
This update removes the complex output resolution logic from the DirectExecutionEngine, allowing it to directly use the step output returned by the step function. This change simplifies the code and improves performance by eliminating unnecessary exception handling for output resolution.

Additionally, comprehensive logging has been maintained to ensure clarity in the execution process.

No functional changes are introduced, and the code remains backward compatible.
This update introduces a new method, `_resolve_step_output`, to handle the resolution of specific outputs from a step's return value. The method accommodates various output formats, including single values, dictionaries, and tuples/lists, improving the flexibility and robustness of output handling.
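
As a rough illustration of the resolution logic this commit describes, a minimal sketch (the signature and names are hypothetical, not the actual ZenML implementation):

from typing import Any

def _resolve_step_output(return_value: Any, output_name: str, output_index: int) -> Any:
    # Steps may return their outputs keyed by name in a dictionary.
    if isinstance(return_value, dict):
        return return_value[output_name]
    # Multiple outputs may be returned positionally as a tuple or list.
    if isinstance(return_value, (tuple, list)):
        return return_value[output_index]
    # Single-output step: the return value is the output itself.
    return return_value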
This commit introduces a new example demonstrating a conversational AI chat agent pipeline that integrates with ZenML's serving infrastructure. The pipeline allows for real-time chat applications, utilizing OpenAI's API for generating responses based on user input.

Additionally, the README.md has been updated to include this new example, along with a brief overview of its features and usage instructions.

New files:
- `examples/serving/chat_agent_pipeline.py`: Implementation of the chat agent pipeline.
- Updates to `examples/serving/README.md` to document the new example.
This update introduces Docker settings for the chat and weather agent pipelines, allowing them to utilize the OpenAI API key from environment variables. Additionally, the pipeline decorators have been updated to include these settings.

Also, CORS middleware has been added to the FastAPI application to enable frontend access, with a note to restrict origins in production for security (see the sketch after this commit message).

Enhancements to the parameter schema extraction in the PipelineServingService have been implemented, improving the extraction of function signatures and parameter types.

New request and response models for pipeline execution and chat interface have been added to the pipeline endpoints.

Fixes #3904
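
A sketch of the CORS setup mentioned above, using FastAPI's standard middleware (the permissive origins list is illustrative and should be restricted in production):

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # restrict to known frontend origins in production
    allow_methods=["*"],
    allow_headers=["*"],
)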
This commit introduces a comprehensive framework for managing capture policies in ZenML's pipeline serving. It includes five distinct capture modes to control the observability of requests, balancing privacy and performance.

Additionally, step-level capture annotations have been implemented, allowing for fine-grained control over which inputs and outputs are captured for each step. This enhancement provides users with the ability to specify capture behavior directly in their pipeline definitions.

New documentation has been added to explain the capture policies and their configurations, along with examples demonstrating their usage in both pipeline and step contexts.

Fixes #3911
This commit introduces a new dependency injection system for ZenML's serving components, enhancing modularity and testability. Key changes include the creation of a `ServingContainer` class to manage service instances and their initialization order. The FastAPI application now utilizes dependency injection for accessing services like `PipelineServingService`, `JobRegistry`, and `StreamManager`.

Additionally, several global service instances have been removed to streamline the architecture, and the lifespan management of the FastAPI application has been improved. This refactor lays the groundwork for better service management and easier testing.
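
A minimal sketch of the dependency-injection pattern this commit describes; only the class names come from the commit message, the wiring is illustrative:

from fastapi import Depends, FastAPI

class PipelineServingService:
    # Placeholder standing in for the real service class.
    def run(self, parameters: dict) -> dict:
        return {"status": "ok", "parameters": parameters}

class ServingContainer:
    # Illustrative container managing service instances and their
    # initialization order.
    def __init__(self) -> None:
        self.serving_service = PipelineServingService()

container = ServingContainer()

def get_serving_service() -> PipelineServingService:
    return container.serving_service

app = FastAPI()

@app.post("/invoke")
def invoke(
    payload: dict,
    service: PipelineServingService = Depends(get_serving_service),
) -> dict:
    return service.run(payload.get("parameters", {}))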
@stefannica stefannica merged commit 495f1f3 into develop Sep 26, 2025
42 of 44 checks passed
@stefannica stefannica deleted the feature/served-pipelines branch September 26, 2025 13:50
Labels: enhancement, internal, run-slow-ci