13 changes: 13 additions & 0 deletions doc/source/serve/advanced-guides/advanced-autoscaling.md
@@ -710,3 +710,16 @@ Programmatic configuration of application-level autoscaling policies through `se
:::{note}
When you specify both a deployment-level policy and an application-level policy, the application-level policy takes precedence. Ray Serve logs a warning if you configure both.
:::

:::{warning}
### Gotchas and limitations

Ray Serve fully supports a custom policy as long as it's simple, self-contained Python code that relies only on the standard library. Once the policy depends on other custom modules or packages, you need to bundle those modules into the Docker image or environment. Ray Serve uses `cloudpickle` to serialize custom policies, and `cloudpickle` doesn't vendor transitive dependencies: if your policy inherits from a superclass in another module or imports custom packages, those must exist in the target environment. Environment parity also matters, because differences in Python version, `cloudpickle` version, or library versions can affect deserialization.

#### Alternatives for complex policies

When your custom autoscaling policy has complex dependencies or you want better control over versioning and deployment, consider these alternatives:

- **Contribute to Ray Serve**: If your policy is general-purpose and might benefit others, consider contributing it to Ray Serve as a built-in policy by opening a feature request or pull request on the [Ray GitHub repository](https://github.com/ray-project/ray/issues). The recommended location for the implementation is `python/ray/serve/autoscaling_policy.py`.
- **Ensure dependencies in your environment**: Make sure that the external dependencies are installed in your Docker image or environment.
:::
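
The failure mode described in this warning can be reproduced with the standard library alone. The sketch below is an illustration, not Ray Serve code: it assumes standard-library `pickle` in place of `cloudpickle`, and `hypothetical_policy_pkg` and `scale_policy` are made-up names. A function that lives in an importable module is serialized by reference (module name plus attribute name), so loading it in an environment without that module fails.

```python
import pickle
import sys
import types

# Hypothetical module standing in for a custom package that holds a policy.
MODULE_NAME = "hypothetical_policy_pkg"
module = types.ModuleType(MODULE_NAME)
exec(
    "def scale_policy(current_replicas):\n    return current_replicas + 1\n",
    module.__dict__,
)
sys.modules[MODULE_NAME] = module

# The payload records the module and function names, not the function body.
payload = pickle.dumps(module.scale_policy)
assert MODULE_NAME.encode() in payload

# Same environment: the module is importable, so loading succeeds.
restored = pickle.loads(payload)
print(restored(2))

# Simulate an image or environment that lacks the package.
del sys.modules[MODULE_NAME]
try:
    pickle.loads(payload)
    load_error = None
except ModuleNotFoundError as exc:
    load_error = exc
print(type(load_error).__name__)
```

The second load fails because nothing in the payload carries the function's code, which is why the dependency has to be installed wherever the policy is deserialized.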
15 changes: 15 additions & 0 deletions doc/source/serve/advanced-guides/custom-request-router.md
@@ -156,3 +156,18 @@ You can customize the emission of these statistics by overriding `record_routing
in the definition of the deployment class. The custom request router can then get the
updated routing stats by looking up the `routing_stats` attribute of the running
replicas and use it in the routing policy.


:::{warning}
## Gotchas and limitations

Ray Serve fully supports a custom router as long as it's simple, self-contained Python code that relies only on the standard library. Once the router depends on other custom modules or packages, you need to bundle those modules into the Docker image or environment. Ray Serve uses `cloudpickle` to serialize custom routers, and `cloudpickle` doesn't vendor transitive dependencies: if your router inherits from a superclass in another module or imports custom packages, those must exist in the target environment. Environment parity also matters, because differences in Python version, `cloudpickle` version, or library versions can affect deserialization.

### Alternatives for complex routers

When your custom request router has complex dependencies or you want better control over versioning and deployment, you have several alternatives:

- **Use built-in routers**: Consider using the routers shipped with Ray Serve. These are well-tested, production-ready, and guaranteed to work across different environments.
- **Contribute to Ray Serve**: If your router is general-purpose and might benefit others, consider contributing it to Ray Serve as a built-in router by opening a feature request or pull request on the [Ray GitHub repository](https://github.com/ray-project/ray/issues). The recommended location for the implementation is `python/ray/serve/_private/request_router/`.
- **Ensure dependencies in your environment**: Make sure that the external dependencies are installed in your Docker image or environment.
:::
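
One way to act on the last bullet is a preflight check that runs inside the target image before deployment. The sketch below is an assumption for illustration, not a Ray Serve API: `missing_modules` and `hypothetical_router_pkg` are made-up names, and the check relies only on `importlib.util.find_spec` from the standard library.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that can't be located in this environment."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec raises if a parent package is missing or the name is malformed.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# "json" ships with Python; the second name is a hypothetical custom package.
print(missing_modules(["json", "hypothetical_router_pkg"]))
```

A check like this only confirms that a module can be found. It doesn't verify that the installed version matches the one used when the router was serialized.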
28 changes: 26 additions & 2 deletions python/ray/serve/config.py
@@ -243,7 +243,19 @@ def _serialize_request_router_cls(self) -> None:

     def get_request_router_class(self) -> Callable:
         """Deserialize the request router from cloudpickled bytes."""
-        return cloudpickle.loads(self._serialized_request_router_cls)
+        try:
+            return cloudpickle.loads(self._serialized_request_router_cls)
+        except (ModuleNotFoundError, ImportError) as e:
+            raise ImportError(
+                f"Failed to deserialize custom request router: {e}\n\n"
+                "This typically happens when the router depends on external modules "
+                "that aren't available in the current environment. To fix this:\n"
+                " - Ensure all dependencies are installed in your Docker image or environment\n"
+                " - Package your router as a Python package and install it\n"
+                " - Place the router module in PYTHONPATH\n\n"
+                "For more details, see: https://docs.ray.io/en/latest/serve/advanced-guides/"
+                "custom-request-router.html#gotchas-and-limitations"
+            ) from e


DEFAULT_METRICS_INTERVAL_S = 10.0
@@ -313,7 +325,19 @@ def is_default_policy_function(self) -> bool:

     def get_policy(self) -> Callable:
         """Deserialize policy from cloudpickled bytes."""
-        return cloudpickle.loads(self._serialized_policy_def)
+        try:
+            return cloudpickle.loads(self._serialized_policy_def)
+        except (ModuleNotFoundError, ImportError) as e:
+            raise ImportError(
+                f"Failed to deserialize custom autoscaling policy: {e}\n\n"
+                "This typically happens when the policy depends on external modules "
+                "that aren't available in the current environment. To fix this:\n"
+                " - Ensure all dependencies are installed in your Docker image or environment\n"
+                " - Package your policy as a Python package and install it\n"
+                " - Place the policy module in PYTHONPATH\n\n"
+                "For more details, see: https://docs.ray.io/en/latest/serve/advanced-guides/"
+                "advanced-autoscaling.html#gotchas-and-limitations"
+            ) from e


@PublicAPI(stability="stable")
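
Both hunks in `config.py` apply the same pattern: catch the import failure raised during deserialization and re-raise it with remediation guidance. That pattern can be tried outside Ray Serve. The following is a minimal sketch under stated assumptions: it uses standard-library `pickle` in place of `cloudpickle`, and `load_with_hint`, `Router`, and `hypothetical_router_pkg` are illustrative names that aren't part of `ray.serve`.

```python
import pickle
import sys
import types
from typing import Any


def load_with_hint(payload: bytes, kind: str) -> Any:
    """Deserialize payload, adding guidance if a dependency is missing."""
    try:
        return pickle.loads(payload)
    except ImportError as e:  # also covers ModuleNotFoundError, a subclass
        raise ImportError(
            f"Failed to deserialize custom {kind}: {e}\n\n"
            "Ensure all dependencies are installed in your Docker image or environment."
        ) from e


# Build a payload that references a hypothetical module, then remove the module.
module = types.ModuleType("hypothetical_router_pkg")
exec("class Router:\n    pass\n", module.__dict__)
sys.modules[module.__name__] = module
payload = pickle.dumps(module.Router)
del sys.modules[module.__name__]

try:
    load_with_hint(payload, "request router")
except ImportError as err:
    wrapped = err  # keep a reference; "err" is cleared when the block exits

# "from e" keeps the original exception reachable for debugging.
print(type(wrapped.__cause__).__name__)
print(str(wrapped).splitlines()[0])
```

Because `ModuleNotFoundError` is a subclass of `ImportError`, catching `ImportError` alone handles both cases. Listing both, as the diff does, behaves the same way and makes the intent explicit.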