fix(resources): honor caller-supplied imageName in Live* resources #339

Open
deanq wants to merge 2 commits into main from deanq/ae-3153-byoi-endpoint-image
deanq (Member) commented May 22, 2026

Summary

  • LiveServerlessMixin and the Live*/CpuLive* subclasses silently discarded a caller-supplied imageName: the property setter was a no-op, and the @model_validator(mode="before") hooks unconditionally rewrote data["imageName"] to the Flash runtime image. As a result, Endpoint(image=...) client mode was unreachable under FLASH_IS_LIVE_PROVISIONING=true: the user's image was discarded, Flash's wrapper was deployed instead, and jobs returned empty envelopes.
  • Treat the Flash runtime image as a default, not a lock: only assign data["imageName"] when the caller didn't supply one. Drop the property override so reads/writes use Pydantic's normal field machinery (keeps model_dump, drift hashes, and setattr consistent).
  • Endpoint.__init__ now logs an info line when the caller passes image= so the image selection is observable in logs.
  • Refs AE-3153.
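
A minimal sketch of the "default, not lock" change summarized above, written against plain dicts because a mode="before" validator receives the raw input mapping. The function names and the FLASH_RUNTIME_IMAGE placeholder are hypothetical stand-ins, not the repository's code, and treating an empty or None imageName as "not supplied" is an assumption.

```python
# Hypothetical placeholder; in the PR the value comes from get_image_name(...).
FLASH_RUNTIME_IMAGE = "example/flash-runtime:placeholder"


def lock_image(data: dict) -> dict:
    """Old behavior: always overwrite imageName, discarding caller input."""
    return {**data, "imageName": FLASH_RUNTIME_IMAGE}


def default_image(data: dict) -> dict:
    """New behavior: fill in imageName only when the caller did not supply one."""
    if data.get("imageName"):
        return data
    return {**data, "imageName": FLASH_RUNTIME_IMAGE}


caller_input = {"name": "vllm", "imageName": "runpod/worker-v1-vllm:v2.18.1"}

locked = lock_image(caller_input)        # caller's image is replaced
defaulted = default_image(caller_input)  # caller's image is kept
fallback = default_image({"name": "decorator-mode"})  # no image, default applies
```

Under these assumptions the only behavioral difference is the override path: both functions agree when no image is supplied.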

Test plan

  • make quality-check passes (2696 unit + 53 integration tests, coverage 85.4%).
  • New / updated tests cover both paths for all four Live* classes:
    • default path: no imageName -> Flash runtime image
    • override path: caller imageName="custom/..." -> stored verbatim
  • Decorator-mode regression check: LiveServerless(name=...) (no image) still resolves to runpod/flash:py3.12-*.
  • Manual smoke: Endpoint(name=..., image="runpod/worker-v1-vllm:v2.18.1", gpu=...) under FLASH_IS_LIVE_PROVISIONING=true provisions a template whose imageName matches the supplied value.

LiveServerlessMixin and the Live*/CpuLive* subclasses previously locked
imageName to the Flash runtime image:

- The imageName property setter was a no-op (silently dropped user input).
- The @model_validator(mode="before") hooks unconditionally rewrote
  data["imageName"] to get_image_name(<type>, python_version).

Together they made Endpoint(image=...) client-mode unreachable under
FLASH_IS_LIVE_PROVISIONING=true: the user's image was discarded and
Flash's wrapper was deployed instead, returning empty envelopes with no
warning.

Treat the Flash runtime image as a *default*, not a *lock*:

- _apply_default_live_image() only writes data["imageName"] when the
  caller did not supply one.
- Drop the imageName property override on the mixin; reads/writes go
  through Pydantic's normal field machinery so model_dump, drift hashes,
  and setattr stay consistent.
- Endpoint.__init__ logs an info line when the caller supplies image= so
  the substitution choice is observable.
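
To illustrate why the no-op setter was a problem, here is a small sketch in plain Python. Both classes are hypothetical stand-ins (the real mixin is a Pydantic model, which is not reproduced here), and the image strings are placeholders.

```python
class LockedImage:
    """Stand-in for the old mixin: the setter accepts a value and drops it."""

    def __init__(self) -> None:
        self._image = "example/flash-runtime:placeholder"

    @property
    def imageName(self) -> str:
        return self._image

    @imageName.setter
    def imageName(self, value: str) -> None:
        # No-op: the assignment raises no error, but nothing is stored.
        pass


class PlainImage:
    """Stand-in for the new behavior: an ordinary attribute, no override."""

    def __init__(self) -> None:
        self.imageName = "example/flash-runtime:placeholder"


locked = LockedImage()
locked.imageName = "custom/image:tag"  # silently ignored

plain = PlainImage()
plain.imageName = "custom/image:tag"   # stored as written
```

The assignment on LockedImage succeeds without any error or warning, which is the silent-drop behavior the bullets above describe.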

Update three existing tests that asserted the old "locked" behavior to
cover both the default-fallback path and the BYO-image override path
across all four Live* classes.

Refs AE-3153

Copilot AI (Contributor) left a comment

Pull request overview

This PR fixes Live* serverless resource handling so that a caller-supplied imageName is honored under live provisioning, instead of being silently overwritten by the Flash runtime image. This restores Endpoint(image=...) client-mode behavior when FLASH_IS_LIVE_PROVISIONING=true while keeping the Flash runtime image as the default for decorator-mode usage.

Changes:

  • Updated Live* resource model validators to only default imageName when the caller didn’t supply one, and removed the no-op imageName property override so Pydantic field behavior is preserved.
  • Added an info log in Endpoint.__init__ when image= is provided to make image selection observable.
  • Updated/added unit and integration tests to cover both default and override imageName paths for LiveServerless/CpuLiveServerless/LiveLoadBalancer/CpuLiveLoadBalancer.
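
The logging change could look roughly like the sketch below. EndpointSketch, the logger name, and the message wording are all hypothetical; only the idea of emitting one info line when image= is supplied comes from the PR.

```python
import logging
from typing import Optional

log = logging.getLogger("endpoint_sketch")  # hypothetical logger name


class EndpointSketch:
    """Hypothetical stand-in for Endpoint; only the image logging is modeled."""

    def __init__(self, name: str, image: Optional[str] = None) -> None:
        self.name = name
        self.image = image
        if image is not None:
            # Illustrative wording, not the PR's actual log line.
            log.info("endpoint %r: using caller-supplied image %r", name, image)


class ListHandler(logging.Handler):
    """Collects formatted messages so the behavior can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


handler = ListHandler()
log.addHandler(handler)
log.setLevel(logging.INFO)

EndpointSketch("with-image", image="custom/image:tag")  # logs one info line
EndpointSketch("decorator-mode")                        # logs nothing
```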

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/unit/resources/test_live_serverless.py | Updates unit tests to validate imageName defaulting vs override for Live* resources. |
| tests/integration/test_lb_remote_execution.py | Adjusts integration coverage to verify LB Live* default image plus override behavior. |
| tests/integration/test_cpu_disk_sizing.py | Updates integration tests to reflect default vs override semantics for LiveServerless variants. |
| src/runpod_flash/endpoint.py | Logs when a user-supplied image= is provided to Endpoint. |
| src/runpod_flash/core/resources/live_serverless.py | Changes Live* defaulting logic to honor caller imageName and removes the “locked image” property override. |

Comment thread src/runpod_flash/core/resources/live_serverless.py Outdated
Comment thread tests/unit/resources/test_live_serverless.py
Address Copilot review on PR #339:
- Guard _apply_default_live_image against non-dict input in
  @model_validator(mode="before"), matching network_volume.py:109 pattern.
  Prevents AttributeError when Pydantic passes a model instance during
  revalidation or nested construction.
- Update test docstrings to reflect "default unless overridden" semantics
  (replacing stale "locked image" wording).
- Add regression test exercising LiveServerless.model_validate on a model
  instance to cover the guard path.
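
A sketch of the non-dict guard described in this commit, using a plain class in place of a Pydantic model instance. The function names, ModelInstance, and the image placeholder are hypothetical; the real helper is _apply_default_live_image, whose body is not shown on this page.

```python
FLASH_RUNTIME_IMAGE = "example/flash-runtime:placeholder"  # hypothetical


class ModelInstance:
    """Stand-in for an already-built model passed back in during revalidation."""

    def __init__(self, imageName: str) -> None:
        self.imageName = imageName


def apply_default_unguarded(data):
    # Assumes a dict; a model instance has no .get(), so this raises AttributeError.
    if not data.get("imageName"):
        data = {**data, "imageName": FLASH_RUNTIME_IMAGE}
    return data


def apply_default_guarded(data):
    # Guard: anything that is not a dict is returned untouched.
    if not isinstance(data, dict):
        return data
    if not data.get("imageName"):
        data = {**data, "imageName": FLASH_RUNTIME_IMAGE}
    return data


instance = ModelInstance("custom/image:tag")

try:
    apply_default_unguarded(instance)
    unguarded_error = None
except AttributeError as exc:
    unguarded_error = exc

passthrough = apply_default_guarded(instance)  # same object, unchanged
```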