[serve] Fix AttributeError when request_router is None in update_deployment_config #63180

Merged: abrarsheikh merged 2 commits into ray-project:master from chenshi5012:fix/serve-router-null-deref-update-deployment-config on May 8, 2026

@chenshi5012
Contributor

@chenshi5012 chenshi5012 commented May 7, 2026

Description

AsyncioRouter.update_deployment_config() unconditionally evaluates
len(self.request_router.curr_replicas) at the end of the method.
The request_router property performs lazy initialisation and returns
None when _request_router_class is None and _request_router
has not yet been initialised. In that state the call raises:

AttributeError: 'NoneType' object has no attribute 'curr_replicas'

Trigger conditions

  1. AsyncioRouter is constructed without a request_router_class (the
    parameter is Optional), and the Controller's first long-poll config
    push arrives before any replica is assigned — the lazy-init path is
    never entered, so the property returns None.
  2. A live deployment is hot-updated to remove a custom
    request_router_class; update_deployment_config() overwrites
    self._request_router_class with the new value before the guard
    if self._request_router: is evaluated, leaving both fields None.
  3. Race condition: the Controller pushes a new config between construction
    and the first call to assign_request() that would trigger lazy init.
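
The state these trigger conditions describe can be reproduced in isolation. The following is a minimal sketch, not Ray's actual code: `ToyRouter` and `ToyRequestRouter` are hypothetical stand-ins, and the shape of the lazy-init property is an assumption based only on the description above.

```python
from typing import Optional, Type


class ToyRequestRouter:
    """Hypothetical stand-in for a request router; it only tracks replicas."""

    def __init__(self) -> None:
        self.curr_replicas: dict = {}


class ToyRouter:
    """Hypothetical, simplified model of the lazy-init state described above."""

    def __init__(
        self, request_router_class: Optional[Type[ToyRequestRouter]] = None
    ) -> None:
        self._request_router_class = request_router_class
        self._request_router: Optional[ToyRequestRouter] = None

    @property
    def request_router(self) -> Optional[ToyRequestRouter]:
        # Assumed behaviour: build the router lazily, but only when a class
        # is available. With no class and no instance, this returns None.
        if self._request_router is None and self._request_router_class is not None:
            self._request_router = self._request_router_class()
        return self._request_router


def unguarded_replica_count(router: ToyRouter) -> int:
    # Mirrors the pre-fix expression: dereferences the property unconditionally.
    return len(router.request_router.curr_replicas)


def try_unguarded(router: ToyRouter) -> str:
    # Report either the count or the error text instead of raising.
    try:
        return str(unguarded_replica_count(router))
    except AttributeError as exc:
        return f"AttributeError: {exc}"
```

Calling `try_unguarded(ToyRouter())` exercises the failing path, while `try_unguarded(ToyRouter(ToyRequestRouter))` goes through lazy init and returns a count.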

Fix

Cache the property result in a local variable and fall back to 0 when
it is None. Zero is semantically correct: no replicas are active yet,
so MetricsManager should not trigger a scaled-to-zero optimised push.

# before
curr_num_replicas=len(self.request_router.curr_replicas),

# after
_router = self.request_router
curr_num_replicas = len(_router.curr_replicas) if _router is not None else 0
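
The guarded expression can also be checked on its own. This is a small sketch under stated assumptions: `safe_replica_count` and `ToyRequestRouter` are hypothetical names introduced here for illustration, and only the conditional expression itself comes from the PR.

```python
from typing import Iterable, Optional


class ToyRequestRouter:
    """Hypothetical stand-in exposing only the curr_replicas attribute."""

    def __init__(self, replicas: Iterable[str] = ()) -> None:
        self.curr_replicas = list(replicas)


def safe_replica_count(router: Optional[ToyRequestRouter]) -> int:
    # Same conditional as the "after" snippet: read the router once,
    # then fall back to 0 when it has not been initialised.
    return len(router.curr_replicas) if router is not None else 0
```

Reading the property into a local first means it is evaluated once, so the `None` check and the `len()` call see the same object.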

Related issues

None — this is a self-contained bug fix.

Checks

  • I've signed off every commit with Signed-off-by
  • I've run scripts/format.sh to lint the changes in this PR
  • I've included any doc changes needed for this PR

@chenshi5012 chenshi5012 requested a review from a team as a code owner May 7, 2026 03:28
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request addresses a potential AttributeError in AsyncioRouter.update_deployment_config by ensuring the request_router is initialized before accessing its replicas. It also includes regression tests for various initialization states. The review feedback identifies a need to explicitly shut down AsyncioRouter instances in the new tests to prevent background task leaks.

Comment thread python/ray/serve/tests/unit/test_router.py Outdated
Comment thread python/ray/serve/tests/unit/test_router.py Outdated
Comment thread python/ray/serve/tests/unit/test_router.py Outdated
@chenshi5012
Contributor Author

Thanks for the review! I've addressed all three Medium Priority comments in the follow-up commit 87306cc.

Each of the three test cases in TestUpdateDeploymentConfigNullRouter now calls:

await router.shutdown()
router.long_poll_client.stop()

after the assertion, consistent with the teardown pattern in the existing setup_router fixture.
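
For illustration, the teardown pattern might look like the sketch below. `FakeRouter` and `FakeLongPollClient` are hypothetical stand-ins rather than Ray classes, and wrapping the assertion in `try`/`finally` is a design choice made for this sketch, not something the PR states it does. It ensures cleanup still runs if the assertion fails.

```python
import asyncio


class FakeLongPollClient:
    """Hypothetical stand-in that records whether stop() was called."""

    def __init__(self) -> None:
        self.stopped = False

    def stop(self) -> None:
        self.stopped = True


class FakeRouter:
    """Hypothetical stand-in with an uninitialised request router."""

    def __init__(self) -> None:
        self.long_poll_client = FakeLongPollClient()
        self.request_router = None
        self.shut_down = False

    def replica_count_after_update(self) -> int:
        # Guarded count, as in the fix: 0 when the router is still None.
        router = self.request_router
        return len(router.curr_replicas) if router is not None else 0

    async def shutdown(self) -> None:
        self.shut_down = True


async def check_null_router_update() -> FakeRouter:
    router = FakeRouter()
    try:
        assert router.replica_count_after_update() == 0
    finally:
        # Same two teardown calls as quoted above, run even on failure.
        await router.shutdown()
        router.long_poll_client.stop()
    return router
```

Running `asyncio.run(check_null_router_update())` returns the router with both teardown flags set.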


@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 87306cc.

Comment thread python/ray/serve/tests/unit/test_router.py Outdated
@ray-gardener ray-gardener Bot added the serve (Ray Serve Related Issue) and community-contribution (Contributed by the community) labels May 7, 2026
…nfig

When AsyncioRouter.update_deployment_config() is called before the
request_router has been lazily initialised (e.g. request_router_class
is None, or the first config update arrives before any replica is
assigned), the self.request_router property returns None.

The previous code unconditionally evaluated:
    len(self.request_router.curr_replicas)
which raises AttributeError: 'NoneType' object has no attribute
'curr_replicas'.

Fix: cache the property result in a local variable and fall back to 0
when it is None. Zero is semantically correct because no replicas are
active at that point, so MetricsManager should not trigger a
scaled-to-zero optimised push.

Signed-off-by: chenshi5012 <chenshi5012@163.com>
@chenshi5012 chenshi5012 force-pushed the fix/serve-router-null-deref-update-deployment-config branch from 87306cc to 4a013be Compare May 7, 2026 11:30
@chenshi5012
Contributor Author

@cursoragent review again

@cursor

cursor Bot commented May 7, 2026

Unable to authenticate your request. Please make sure to connect your GitHub account to Cursor.

@abrarsheikh abrarsheikh added the go (add ONLY when ready to merge, run all tests) label May 7, 2026
@abrarsheikh abrarsheikh merged commit 3eb5a1d into ray-project:master May 8, 2026
6 checks passed
chillCode404 pushed a commit to chillCode404/ray-contrib that referenced this pull request May 9, 2026
dancingactor pushed a commit to dancingactor/ray that referenced this pull request May 13, 2026
am-kinetica pushed a commit to kineticadb/ray that referenced this pull request May 14, 2026
Signed-off-by: anindyam1969 <amukherjee@kinetica.com>
Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026
Labels

community-contribution Contributed by the community go add ONLY when ready to merge, run all tests serve Ray Serve Related Issue

2 participants