Skip to content

fix: reject multiple @remote functions on the same queue-based resour…#234

Open
KAJdev wants to merge 2 commits intomainfrom
zeke/ae-2275-flash-multi-function-routing-only-wires-first-function
Open

fix: reject multiple @remote functions on the same queue-based resour…#234
KAJdev wants to merge 2 commits intomainfrom
zeke/ae-2275-flash-multi-function-routing-only-wires-first-function

Conversation

@KAJdev
Copy link
Contributor

@KAJdev KAJdev commented Feb 27, 2026

Two @remote functions on the same queue-based resource config silently routes the second to the first function's handler

  • adds _queue_resource_owners dict tracking which queue-based resource config (by id()) has already been claimed by a @remote function
  • when a second @remote tries to use the same config, raises ValueError with a clear message naming both functions and suggesting a separate resource config

@KAJdev KAJdev force-pushed the zeke/ae-2275-flash-multi-function-routing-only-wires-first-function branch from 9937a9d to afd7754 Compare February 27, 2026 22:19
@runpod-Henrik
Copy link

QA Report

Status: WARN
PR: #234 — fix: reject multiple @Remote functions on the same queue-based resource config
Agent: test-qa (PR mode)

Scope

Full Run (escalated) — PR modifies client.py, the P0 @remote decorator (3 execution modes, 185 lines). Any change here warrants full-suite validation.

Quality Gate

Check Result
Format (ruff format --check) FAILsrc/runpod_flash/client.py needs reformatting. This will fail CI.
Lint (ruff check) PASS — all checks passed

Test Results

Area Tests Passed Failed Skipped
Full suite (no coverage) 1298 1285 12 1
Full suite (with coverage) 1298 1284 12-14 1
New tests only 4 4 0 0

All 12 failures are pre-existing (known P4 issues on main):

  • test_class_execution_integration.py — 2 failures (AttributeError on _class_type / _constructor_args)
  • test_remote_concurrency.py — 10 failures (unhashable dict / not awaitable)

Coverage: 73.11-73.21% (threshold: 65%) — PASS

Flaky Test Check

New test file test_queue_resource_one_function.py passed 3/3 consecutive runs in serial mode and 3/3 runs with -n 4 parallel workers. No flakiness detected.

Coverage Impact

No meaningful coverage regression. client.py coverage unchanged at ~48% for the new test file's scope (the new guard lines are exercised by the 4 new tests).

PR Diff Analysis

  • No bare except
  • No print statements
  • No mutable default arguments
  • No hardcoded secrets
  • No missing await
  • LB resources correctly exempted via if not is_lb_resource: guard
  • Uses object.__setattr__ to bypass Pydantic model validation (correct — BaseResource config is not frozen but uses Pydantic v2 validation)
  • Approach changed from module-level _queue_resource_owners dict (per PR description) to per-object _remote_function_name attribute (per actual diff). The attribute-on-object approach is better: no module-level mutable state, no id() identity tracking, no cross-test leakage risk.

Design observations:

  1. All 3 @Remote modes covered correctly. The check runs after the LB route handler early-return (line 220-224) and before local/stub/class modes (lines 245+). All queue-based paths hit it.
  2. Class-based @Remote: Classes have __name__, so hasattr(func_or_class, "__name__") is True. The check works for classes too.
  3. Cross-repo impact: None. This is a client-side validation that fires at decoration time. No protocol/manifest/image changes. flash-worker and flash-examples are unaffected (flash-examples should never have two @remote on the same config — if they do, they have a latent bug this now surfaces).
  4. Test isolation: The old test_live_serverless_stub.py shared a module-level dummy_config = LiveServerless(...). The PR correctly changes this to a _fresh_config() factory function, avoiding cross-test contamination from the new per-object attribute.

Test Quality (PR adds tests)

4 tests in test_queue_resource_one_function.py:

  1. test_second_remote_on_same_config_raises — happy path: second decorator raises ValueError
  2. test_separate_configs_are_independent — separate configs don't interfere
  3. test_lb_resource_allows_multiple_functions — LB exemption works
  4. test_error_message_includes_both_function_names — error message quality

Missing coverage (non-blocking):

  • No test for class-based @remote on queue resource (classes also pass through this path)
  • No test for local=True mode with duplicate config (the check fires before the local guard, so it would raise — is that intended?)
  • No test verifying the attribute name _remote_function_name doesn't collide with Pydantic field validation or _hashed_fields

Style notes:

  • Tests use @patch.dict(os.environ, {}, clear=True) correctly to isolate from RUNPOD_ENDPOINT_ID
  • AAA pattern followed
  • Good error message assertions with regex matching

Known Issue Check

  • The 12 pre-existing integration failures match the known P4 count (2 class_execution + 10 concurrency)
  • test_file_locking flaky failures (seen in one coverage run) are also pre-existing

Recommendation

BLOCKruff format --check fails on client.py. CI will reject this PR. The author needs to run ruff format src/runpod_flash/client.py before merge. Once formatting is fixed, this is a clean MERGE.

The implementation approach (per-object attribute instead of module-level dict) is sound and avoids the test isolation concerns mentioned in the PR description. The test coverage is adequate for the feature, with minor gaps noted above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants