0.95.0
Known Issues
- Enabling debug logs while using a GCP, Azure, or S3 artifact log store causes a deadlock at the end of a pipeline or step run. Fixed in 0.95.1.
Breaking Changes
- PR #4844: ZenML now supports Python 3.14, and environments using the `local` or `server` extras must also accommodate the SQLModel upgrade from 0.18.0 to 0.38.0. If you depend on those extras, review and update any pinned SQLModel-related dependencies before upgrading.
- PR #4900: Local MLflow tracking now uses a SQLite backend by default when no `tracking_uri` is configured. New tracking metadata is stored in `<LOCAL_ARTIFACT_STORE>/mlflow.db` and artifacts under the local artifact store, so users relying on the previous default local MLflow layout or behavior should update their local setup and migration expectations.
- PR #4790: ZenML now requires `opentelemetry-sdk==1.40.0` instead of 1.38.0. If your environment pins OpenTelemetry packages, update them to compatible versions before upgrading ZenML.
- PR #4875: Step and pipeline hooks have been reworked into a new lifecycle-based hook system with persisted hook invocation records. If you use hooks or related internal APIs, review your existing integrations and update them to the new hook semantics and lifecycle events.
- PR #4919: ZenML server rate limiting no longer trusts raw `X-Forwarded-For` headers by default. If you run ZenML behind an ingress or reverse proxy, make sure proxy header handling is explicitly configured so login rate limiting continues to use the correct client IPs.
- PR #4459: CLI `list` commands now return the newest items first by default instead of the oldest first. If you have scripts or workflows that assumed the previous ordering, update them to explicitly sort or handle the new default order.
- PR #4566: The deprecated singular `tag` field has been removed from `TaggableFilters`. Update any API or client code to use the supported tag filtering format instead of passing a single `tag` value.
- PR #4950: Pipeline execution may now raise different exception types depending on how step futures are awaited. If you catch exceptions around pipeline execution, review and update your error-handling logic to account for `StepExecutionException` being raised in implicit await scenarios.
- PR #4867: ZenML now requires `modal>=1.4.0,<2.0.0` when using the Modal integration.
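To illustrate why the rate-limiting change (PR #4919) matters behind a proxy, here is a minimal sketch of resolving a client IP while honoring `X-Forwarded-For` only when the direct peer is a known proxy. It is not ZenML's implementation or configuration: `TRUSTED_PROXIES`, `is_trusted`, and `client_ip` are hypothetical names used purely for illustration.

```python
import ipaddress
from typing import Optional

# Hypothetical allow-list of proxy networks (an assumption, not a ZenML setting).
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def is_trusted(ip: str) -> bool:
    """Return True if ip is inside a trusted proxy network (ValueError if invalid)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in TRUSTED_PROXIES)


def client_ip(peer_ip: str, forwarded_for: Optional[str]) -> str:
    """Pick the address to rate-limit on."""
    # Ignore the header entirely unless the direct peer is a trusted proxy.
    if not forwarded_for or not is_trusted(peer_ip):
        return peer_ip
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    # Walk from the right: the first hop that is not a trusted proxy is the client.
    for hop in reversed(hops):
        try:
            if not is_trusted(hop):
                return hop
        except ValueError:
            # Malformed header entry: fall back to the socket peer address.
            return peer_ip
    return peer_ip
```

A spoofed header sent directly to the server is ignored because the peer is not in the allow-list, which is the failure mode that trusting raw headers would leave open.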
New ways to run code and pipelines
This release expands how you can execute work in ZenML, from async Python to arbitrary commands and new remote execution backends.
- Define steps and hooks with `async def`; ZenML now runs async functions on a fresh event loop for both normal and dynamic pipeline usage. PR #4913
- Run arbitrary commands as pipeline steps with `CommandStep(...)`, including non-Python commands and Python callables that do not require ZenML in the execution environment. PR #4904
- Invoke deployments asynchronously: a new deployment endpoint can submit a pipeline run and return immediately instead of waiting for completion. PR #4906
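As a rough picture of what "runs async functions on a fresh event loop" means for `async def` steps and hooks, the sketch below uses only the standard library. It is not ZenML's code: `run_on_fresh_loop` and `double` are hypothetical names, and the sketch assumes no event loop is already running in the calling thread.

```python
import asyncio
from typing import Any, Awaitable, Callable


def run_on_fresh_loop(fn: Callable[..., Awaitable[Any]], *args: Any) -> Any:
    """Run an async function to completion on a newly created event loop."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(fn(*args))
    finally:
        # Close the loop so each call starts from a clean state.
        loop.close()


async def double(x: int) -> int:
    """Stand-in for an async step body."""
    await asyncio.sleep(0)  # yield control once, as real async I/O would
    return x * 2


result = run_on_fresh_loop(double, 21)
```

Because every call gets its own loop, state such as pending tasks does not leak from one invocation to the next.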
Sandboxes and Modal execution
ZenML now includes the core sandbox abstraction for isolated execution, plus new backend support for Kubernetes and Modal-based workloads.
- Added the core `Sandbox` stack component abstraction for running untrusted or generated code in isolated sessions, including a built-in `local` flavor for subprocess-based execution. PR #4866
- Added a `kubernetes` sandbox flavor where each sandbox session runs in a dedicated Kubernetes pod, with streamed command execution and support for re-attaching to running sessions. PR #4926
- Added a Modal orchestrator flavor so complete ZenML pipelines can run on Modal, using Modal sandboxes for orchestration and step execution. PR #4915
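The idea behind subprocess-based execution, as used by the `local` flavor described above, can be sketched with the standard library alone. This is not the `Sandbox` component's API: `run_isolated` is a hypothetical helper, and a plain subprocess gives process separation and a timeout but is not a security boundary on its own.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 5.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a separate interpreter process."""
    return subprocess.run(
        # -I starts Python in isolated mode (ignores env vars and user site-packages).
        [sys.executable, "-I", "-c", code],
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str
        timeout=timeout,      # raise TimeoutExpired if the snippet hangs
    )


result = run_isolated("print(1 + 1)")
```

The caller inspects `result.returncode`, `result.stdout`, and `result.stderr` rather than sharing memory with the executed code.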
Integrations and deployment improvements
Several integrations and deployment paths are more flexible and production-ready.
- Kubernetes deployments now merge `pod_settings.resources` into the deployment template context, making it possible to set pod resource limits required by cluster policies such as OPA Gatekeeper constraints. PR #4523
- Databricks-managed MLflow deployments now support machine-to-machine OAuth authentication via service principals. PR #4947
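To make the `pod_settings.resources` change concrete, the sketch below shows the kind of Kubernetes `requests`/`limits` mapping involved and a simplified merge into a template context. The plain dictionaries, the `merge_resources` helper, and the `resources` context key are assumptions for illustration only, not ZenML's internal code.

```python
from typing import Any, Dict


def merge_resources(
    context: Dict[str, Any], pod_settings: Dict[str, Any]
) -> Dict[str, Any]:
    """Return a copy of the template context with pod resources merged in."""
    merged = dict(context)
    resources = pod_settings.get("resources")
    if resources:
        # Hypothetical context key; the real template variable may differ.
        merged["resources"] = resources
    return merged


# Standard Kubernetes resource quantities, as a policy engine might require.
pod_settings = {
    "resources": {
        "requests": {"cpu": "500m", "memory": "512Mi"},
        "limits": {"cpu": "1", "memory": "1Gi"},
    }
}

context = merge_resources({"name": "my-deployment"}, pod_settings)
```

With limits present in the rendered manifest, admission policies that reject pods lacking resource limits no longer block the deployment.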
Performance and scalability
Common list and hydration operations should be faster and more reliable on larger ZenML deployments.
- Improved list endpoint ordering so descending sorts can use matching index scans instead of forcing expensive mixed-direction database sorts. PR #4890
- Added targeted database indexes for common pagination and hydration query patterns across pipeline runs, snapshots, step configurations, step runs, and artifact versions. PR #4942
- Adjusted request timeout behavior so only deduplicated/cacheable requests may return a timeout or backpressure response while work continues in the background. PR #4942
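The ordering improvement from PR #4890 can be illustrated with SQLite's query planner. This is a sketch under assumptions: the `runs` table, its columns, and the index name are hypothetical, and ZenML's actual schema and database backend may behave differently in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (id TEXT, created TEXT)")
# Composite index on the sort column plus a tiebreaker column.
conn.execute("CREATE INDEX ix_runs_created_id ON runs (created, id)")


def plan(order_by: str) -> str:
    """Return the query plan details for a list query with the given ordering."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT created, id FROM runs ORDER BY {order_by}"
    ).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


# Both columns descending: the index can be walked backwards, no extra sort.
matching = plan("created DESC, id DESC")
# Mixed directions: the index order no longer matches, so a sort step is needed.
mixed = plan("created DESC, id ASC")
```

Printing `matching` and `mixed` shows the difference: only the mixed-direction plan includes a temporary B-tree sort step, which is the cost the matching tiebreaker direction avoids.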
Security and permissions
This release tightens authorization checks around API keys, stack deployments, secrets, and tag-resource relationships.
- Service-account API key validation now handles omitted internal verification values and client-provided key values consistently, while preserving internal re-authentication behavior. PR #4920
- `GET /api/v1/stack-deployment/stack` now verifies READ permissions for both the returned stack and its associated service connector before returning deployment metadata. PR #4917
- Secret reference resolution now prevents users from attaching private secrets owned by others, or internal ZenML-managed secrets, to their own resources. PR #4923
- Tag-resource endpoints now require UPDATE permissions on the referenced resource before tag relationships can be created or deleted, including batch operations. PR #4927
- Tag-resource RBAC enforcement now lives in the RBAC store layer for more consistent behavior, and tag reads remain broadly available as server-wide resources. PR #4938
Fixed
- Fixed several dynamic pipeline edge cases around retries, stopping runs, and isolated step launch states (PR #4916):
  - Step failures that happen while launching a retry are now detected.
  - Steps no longer move to `RETRYING` if the run is already `STOPPING` or `STOPPED`.
  - Runs that fail while `STOPPING` now transition to `STOPPED` instead of `FAILED`.
  - Isolated steps now use `PROVISIONING` while they are being launched.
- Fixed an `IndexError` when step inputs annotated as bare `list` or `tuple` received multiple input artifacts. ZenML now loads each artifact using its stored data type, matching behavior for `Any` or unannotated inputs. PR #4929
- Fixed GKE Kubernetes API endpoint selection in the GCP service connector by only using the DNS endpoint when it allows external traffic; otherwise, ZenML falls back to the IP-based endpoint. PR #4934
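The status rules from the dynamic pipeline fixes above (PR #4916) can be summarized as a small decision sketch. The `Status` enum and both helper functions are hypothetical and only encode the two rules stated in the notes; they are not ZenML's execution-status types or logic.

```python
from enum import Enum


class Status(str, Enum):
    """Hypothetical subset of statuses named in the release notes."""

    RUNNING = "running"
    RETRYING = "retrying"
    STOPPING = "stopping"
    STOPPED = "stopped"
    FAILED = "failed"


def may_retry(run_status: Status, retries_left: int) -> bool:
    """A failed step may enter RETRYING only if the run is not being stopped."""
    if run_status in (Status.STOPPING, Status.STOPPED):
        return False
    return retries_left > 0


def run_status_after_failure(run_status: Status) -> Status:
    """A run that fails while STOPPING ends as STOPPED instead of FAILED."""
    if run_status is Status.STOPPING:
        return Status.STOPPED
    return Status.FAILED
```

For example, `may_retry(Status.STOPPING, 3)` is `False` even though retries remain, and `run_status_after_failure(Status.STOPPING)` yields `Status.STOPPED`.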
What's Changed
- Add version 0.94.5 to legacy docs by @github-actions[bot] in #4884
- Support python 3.14 by @schustmi in #4844
- Document GCP private workspace connectivity by @safoinme in #4842
- Split the SDK reference into ZenML and Kitaru sections by @htahir1 in #4896
- Fix time format in dependabot.yml by @strickvl in #4898
- Bump the minor-and-patch group with 7 updates by @dependabot[bot] in #4891
- Fix docs link checker domain policies by @strickvl in #4901
- Upgrade Modal step operator to support `modal>=1` by @strickvl in #4038
- Bump the minor-and-patch group with 6 updates by @dependabot[bot] in #4909
- Fix breaking CI checks by @Json-Andriopoulos in #4900
- Improve tiebreaker sort direction in list calls by @schustmi in #4890
- Merge `pod_settings.resources` into deployment template context by @eliottiti in #4523
- Fix Skypilot Kubernetes orchestrator ignoring down and idle_minutes_to_autostop settings by @XnetLoL in #4704
- k8s deployer: select Service on deployment-id, not overridable app label by @joaquinhuigomez in #4748
- Fix SkyPilot docker-run env vars lost under sudo by @demian-overflow in #4777
- Show more info on recoverable pending pods by @Json-Andriopoulos in #4887
- Updating the `opentelemetry-sdk` dependency by @bcdurak in #4790
- Add dependency audit workflow and remediate flagged dependencies by @strickvl in #4805
- Fix timing out test child pipelines test by @Json-Andriopoulos in #4914
- refactor: renamed structlog helper and introduced a new clear contextvar helper by @amitvikramraj in #4918
- Misc dynamic pipeline fixes by @schustmi in #4916
- Improved step and pipeline hooks by @schustmi in #4875
- Add Sandbox stack component (core abstraction) by @htahir1 in #4866
- Fix broken hooks docs link failing the absolute-links check by @htahir1 in #4925
- Tighten API key validity checks by @stefannica in #4920
- Add missing authorization checks to stack deployment endpoint by @stefannica in #4917
- Bump JulienKode/team-labeler-action from 2.0.2 to 3.0.0 by @dependabot[bot] in #4910
- Prevent referencing private and internal secrets internally. by @stefannica in #4923
- Fix IndexError for bare list step input annotations by @schustmi in #4929
- Fix the rate-limiting by @stefannica in #4919
- Enforce update permissions for tag-resource endpoints by @mosskappa in #4927
- Enable async steps and hooks by @schustmi in #4913
- Fix unit test by @stefannica in #4933
- Fix default sorting for CLI commands to return newest first by @MukeshK17 in #4459
- Kubernetes sandbox by @schustmi in #4926
- Fix the GCP service connector DNS endpoint access by @stefannica in #4934
- Link to correct hooks page by @schustmi in #4940
- updated base Dockerfile to include 'otel' in ZenML package installation dependencies by @amitvikramraj in #4932
- Fix local sandbox unit tests on Windows by @htahir1 in #4937
- Support snapshots are source types for platform triggers by @Json-Andriopoulos in #4924
- Async deployment invocation by @schustmi in #4906
- Flaky tests: Fix CI tests v2 by @Json-Andriopoulos in #4935
- Update reviewer agent guidance by @strickvl in #4939
- Run arbitrary (non-)python commands as steps by @schustmi in #4904
- Fix hook test by @schustmi in #4948
- Advanced filters: Phase 1 - new ops, multiple filters by @bcdurak in #4566
- Bump actions/github-script from 8.0.0 to 9.0.0 by @dependabot[bot] in #4945
- Bump the minor-and-patch group with 5 updates by @dependabot[bot] in #4944
- Normalize REST store URL before moving credentials by @Json-Andriopoulos in #4951
- Improve internal future awaiting error handling by @schustmi in #4950
- Add DB indexes and request timeout fix by @safoinme in #4942
- Print pytest rerun failures by @schustmi in #4952
- Support databricks oauth m2m for MLFlow by @Json-Andriopoulos in #4947
- Add Modal orchestrator by @strickvl in #4915
- Add Modal Sandbox flavor by @htahir1 in #4867
- Use uv as a default when fetching installed python packages by @schustmi in #4955
- Improved tag resource RBAC checks and consistent RBAC access to tags by @stefannica in #4938
- Only show OSS dashboard warning for non-default projects by @qubeena07 in #4833
- Prepare release 0.95.0 by @github-actions[bot] in #4956
New Contributors
- @eliottiti made their first contribution in #4523
- @XnetLoL made their first contribution in #4704
- @demian-overflow made their first contribution in #4777
- @mosskappa made their first contribution in #4927
- @MukeshK17 made their first contribution in #4459
- @qubeena07 made their first contribution in #4833
Full Changelog: 0.94.6...0.95.0