[serve] Fix AttributeError when request_router is None in update_deployment_config#63180
Conversation
There was a problem hiding this comment.
Code Review
This pull request addresses a potential AttributeError in AsyncioRouter.update_deployment_config by ensuring the request_router is initialized before accessing its replicas. It also includes regression tests for various initialization states. The review feedback identifies a need to explicitly shut down AsyncioRouter instances in the new tests to prevent background task leaks.
|
Thanks for the review! I've addressed all three Medium Priority comments in the follow-up commit Each of the three test cases in await router.shutdown()
router.long_poll_client.stop()after the assertion, consistent with the teardown pattern in the existing |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 87306cc. Configure here.
…nfig
When AsyncioRouter.update_deployment_config() is called before the
request_router has been lazily initialised (e.g. request_router_class
is None, or the first config update arrives before any replica is
assigned), the self.request_router property returns None.
The previous code unconditionally evaluated:
len(self.request_router.curr_replicas)
which raises AttributeError: 'NoneType' object has no attribute
'curr_replicas'.
Fix: cache the property result in a local variable and fall back to 0
when it is None. Zero is semantically correct because no replicas are
active at that point, so MetricsManager should not trigger a
scaled-to-zero optimised push.
Signed-off-by: chenshi5012 <chenshi5012@163.com>
87306cc to
4a013be
Compare
|
@cursoragent review again |
|
Unable to authenticate your request. Please make sure to connect your GitHub account to Cursor. Go to Cursor |
…oyment_config (ray-project#63180) ## Description `AsyncioRouter.update_deployment_config()` unconditionally evaluates `len(self.request_router.curr_replicas)` at the end of the method. The `request_router` property performs lazy initialisation and returns `None` when `_request_router_class` is `None` and `_request_router` has not yet been initialised. In that state the call raises: ``` AttributeError: 'NoneType' object has no attribute 'curr_replicas' ``` **Trigger conditions** 1. `AsyncioRouter` is constructed without a `request_router_class` (the parameter is `Optional`), and the Controller's first long-poll config push arrives before any replica is assigned — the lazy-init path is never entered, so the property returns `None`. 2. A live deployment is hot-updated to remove a custom `request_router_class`; `update_deployment_config()` overwrites `self._request_router_class` with the new value before the guard `if self._request_router:` is evaluated, leaving both fields `None`. 3. Race condition: the Controller pushes a new config between construction and the first call to `assign_request()` that would trigger lazy init. **Fix** Cache the property result in a local variable and fall back to `0` when it is `None`. Zero is semantically correct: no replicas are active yet, so `MetricsManager` should not trigger a scaled-to-zero optimised push. ```python # before curr_num_replicas=len(self.request_router.curr_replicas), # after _router = self.request_router curr_num_replicas = len(_router.curr_replicas) if _router is not None else 0 ``` ## Related issues None — this is a self-contained bug fix. ## Checks - [x] I've signed off every commit with `Signed-off-by` - [x] I've run `scripts/format.sh` to lint the changes in this PR - [x] I've included any doc changes needed for this PR Signed-off-by: chenshi5012 <chenshi5012@163.com>
…oyment_config (ray-project#63180) ## Description `AsyncioRouter.update_deployment_config()` unconditionally evaluates `len(self.request_router.curr_replicas)` at the end of the method. The `request_router` property performs lazy initialisation and returns `None` when `_request_router_class` is `None` and `_request_router` has not yet been initialised. In that state the call raises: ``` AttributeError: 'NoneType' object has no attribute 'curr_replicas' ``` **Trigger conditions** 1. `AsyncioRouter` is constructed without a `request_router_class` (the parameter is `Optional`), and the Controller's first long-poll config push arrives before any replica is assigned — the lazy-init path is never entered, so the property returns `None`. 2. A live deployment is hot-updated to remove a custom `request_router_class`; `update_deployment_config()` overwrites `self._request_router_class` with the new value before the guard `if self._request_router:` is evaluated, leaving both fields `None`. 3. Race condition: the Controller pushes a new config between construction and the first call to `assign_request()` that would trigger lazy init. **Fix** Cache the property result in a local variable and fall back to `0` when it is `None`. Zero is semantically correct: no replicas are active yet, so `MetricsManager` should not trigger a scaled-to-zero optimised push. ```python # before curr_num_replicas=len(self.request_router.curr_replicas), # after _router = self.request_router curr_num_replicas = len(_router.curr_replicas) if _router is not None else 0 ``` ## Related issues None — this is a self-contained bug fix. ## Checks - [x] I've signed off every commit with `Signed-off-by` - [x] I've run `scripts/format.sh` to lint the changes in this PR - [x] I've included any doc changes needed for this PR Signed-off-by: chenshi5012 <chenshi5012@163.com>
…oyment_config (ray-project#63180) ## Description `AsyncioRouter.update_deployment_config()` unconditionally evaluates `len(self.request_router.curr_replicas)` at the end of the method. The `request_router` property performs lazy initialisation and returns `None` when `_request_router_class` is `None` and `_request_router` has not yet been initialised. In that state the call raises: ``` AttributeError: 'NoneType' object has no attribute 'curr_replicas' ``` **Trigger conditions** 1. `AsyncioRouter` is constructed without a `request_router_class` (the parameter is `Optional`), and the Controller's first long-poll config push arrives before any replica is assigned — the lazy-init path is never entered, so the property returns `None`. 2. A live deployment is hot-updated to remove a custom `request_router_class`; `update_deployment_config()` overwrites `self._request_router_class` with the new value before the guard `if self._request_router:` is evaluated, leaving both fields `None`. 3. Race condition: the Controller pushes a new config between construction and the first call to `assign_request()` that would trigger lazy init. **Fix** Cache the property result in a local variable and fall back to `0` when it is `None`. Zero is semantically correct: no replicas are active yet, so `MetricsManager` should not trigger a scaled-to-zero optimised push. ```python # before curr_num_replicas=len(self.request_router.curr_replicas), # after _router = self.request_router curr_num_replicas = len(_router.curr_replicas) if _router is not None else 0 ``` ## Related issues None — this is a self-contained bug fix. ## Checks - [x] I've signed off every commit with `Signed-off-by` - [x] I've run `scripts/format.sh` to lint the changes in this PR - [x] I've included any doc changes needed for this PR Signed-off-by: chenshi5012 <chenshi5012@163.com> Signed-off-by: anindyam1969 <amukherjee@kinetica.com>
…oyment_config (ray-project#63180) ## Description `AsyncioRouter.update_deployment_config()` unconditionally evaluates `len(self.request_router.curr_replicas)` at the end of the method. The `request_router` property performs lazy initialisation and returns `None` when `_request_router_class` is `None` and `_request_router` has not yet been initialised. In that state the call raises: ``` AttributeError: 'NoneType' object has no attribute 'curr_replicas' ``` **Trigger conditions** 1. `AsyncioRouter` is constructed without a `request_router_class` (the parameter is `Optional`), and the Controller's first long-poll config push arrives before any replica is assigned — the lazy-init path is never entered, so the property returns `None`. 2. A live deployment is hot-updated to remove a custom `request_router_class`; `update_deployment_config()` overwrites `self._request_router_class` with the new value before the guard `if self._request_router:` is evaluated, leaving both fields `None`. 3. Race condition: the Controller pushes a new config between construction and the first call to `assign_request()` that would trigger lazy init. **Fix** Cache the property result in a local variable and fall back to `0` when it is `None`. Zero is semantically correct: no replicas are active yet, so `MetricsManager` should not trigger a scaled-to-zero optimised push. ```python # before curr_num_replicas=len(self.request_router.curr_replicas), # after _router = self.request_router curr_num_replicas = len(_router.curr_replicas) if _router is not None else 0 ``` ## Related issues None — this is a self-contained bug fix. ## Checks - [x] I've signed off every commit with `Signed-off-by` - [x] I've run `scripts/format.sh` to lint the changes in this PR - [x] I've included any doc changes needed for this PR Signed-off-by: chenshi5012 <chenshi5012@163.com>

Description
AsyncioRouter.update_deployment_config()unconditionally evaluateslen(self.request_router.curr_replicas)at the end of the method.The
request_routerproperty performs lazy initialisation and returnsNonewhen_request_router_classisNoneand_request_routerhas not yet been initialised. In that state the call raises:
Trigger conditions
AsyncioRouteris constructed without arequest_router_class(theparameter is
Optional), and the Controller's first long-poll configpush arrives before any replica is assigned — the lazy-init path is
never entered, so the property returns
None.request_router_class;update_deployment_config()overwritesself._request_router_classwith the new value before the guardif self._request_router:is evaluated, leaving both fieldsNone.and the first call to
assign_request()that would trigger lazy init.Fix
Cache the property result in a local variable and fall back to
0whenit is
None. Zero is semantically correct: no replicas are active yet,so
MetricsManagershould not trigger a scaled-to-zero optimised push.Related issues
None — this is a self-contained bug fix.
Checks
Signed-off-byscripts/format.shto lint the changes in this PR