feat: implicit flash endpoint resolution + CLI overhaul#324
feat: implicit flash endpoint resolution + CLI overhaul#324
Conversation
|
Promptless prepared a documentation update related to this change. Triggered by runpod/flash PR #324 Updated CLI documentation for the |
runpod-Henrik
left a comment
There was a problem hiding this comment.
QA Review — PR 324
Finding 1: get_flash_context() docstring contradicts the code
flash_context.py — the docstring says:
precedence:
1. FLASH_IS_LIVE_PROVISIONING=true forces live (flash dev)
2. FLASH_APP + FLASH_ENV both set -> sentinel
3. anything else -> live flow
The code does not fall through to a "live flow" for case 3. It returns None, which causes the caller (client.py, endpoint.py) to raise RuntimeError. Any developer reading this docstring will expect fallback — there is none.
Finding 2: Breaking change not documented in the PR description
Before this PR, a @remote-decorated function called outside of flash dev would fall through to ResourceManager for dynamic provisioning. After this PR, the same call raises RuntimeError("no flash context for endpoint '...'"). This is a behaviour change for any user who:
- calls
@remotefunctions programmatically withoutflash dev - has code that uses
ResourceManagerfor ad-hoc provisioning
The PR description mentions the new sentinel flow but does not say the ResourceManager fallback is removed. If this removal is intentional, it needs a migration note.
Finding 3: Partial env var config gives a misleading error
If a user sets FLASH_APP but not FLASH_ENV (or vice versa), get_flash_context() returns None — same as no env vars at all — and the caller raises:
RuntimeError: no flash context for endpoint '...'. either:
- use 'flash dev' for local development
- set FLASH_APP and FLASH_ENV to target a deployed environment
The error gives no indication which env var is missing. A user who half-configured their environment sees the same message as a user who configured nothing. Adding "FLASH_APP is set but FLASH_ENV is missing" (or vice versa) to the error would avoid a debugging round-trip.
Finding 4: _handle_sentinel_response silently passes through unexpected response shapes
flash_sentinel.py:
output = data.get("output", data)
if isinstance(output, dict) and "error" in output:
raise RuntimeError(...)
return outputIf the sentinel returns a response with neither a status nor an output key (e.g., an unexpected shape from ai-api), data.get("output", data) returns the full raw dict. If that dict happens not to contain an "error" key, it's returned to the caller as if it were a successful result. The user gets back an unexpected dict with no error raised. The QB path should assert the expected keys are present rather than falling through silently.
Finding 5: LB sentinel path has no error handling for FAILED status
The QB path routes through _handle_sentinel_response() which checks data.get("status") == "FAILED". The LB path (sentinel_lb_request) calls response.raise_for_status() and returns response.json() directly — no status check, no error extraction. A FAILED response from ai-api with HTTP 200 and {"status": "FAILED", "error": "..."} body would be returned as a raw dict to the LB caller with no exception raised.
Question: What does _normalize_resource_name() assume about resource names?
endpoint.py passes _normalize_resource_name(self.name) as the sentinel endpoint name — stripping a live- prefix and -fb suffix if present. Are deployed endpoint names always generated with these affixes by the platform, making user-set resource config names unreliable as sentinel targets? Or does this normalization only apply to a subset of resources? If a user names a resource live-worker, normalization produces worker — potentially routing to the wrong endpoint.
No issues with the core sentinel flow mechanics. The header-based routing design is clear, sentinel_qb_execute + _args_to_kwargs is straightforward, and the new test coverage for sentinel vs live dispatch paths is solid.
| raises: | ||
| RuntimeError: if remote execution fails | ||
| """ | ||
| body: Dict[str, Any] = {"method": request.method_name} |
There was a problem hiding this comment.
I think this doesn't work if decorated classes require constructors, because this only forwards method kwargs, i think those need to get passed as well. this fails for any classes w/ a constructor that has args
Two major changes:
Flash now resolves deployed endpoints by app/environment/endpoint name instead of requiring endpoint IDs. Uses
FLASH_APPandFLASH_ENVenv vars withX-Flash-App,X-Flash-Environment,X-Flash-Endpointheaders routed through ai-api middleware.Redesigns output for
flash dev,flash deploy,flash app,flash env, andflash undeployto be clean, consistent, and useful for real development workflows.The main flow is now

flash devand then

flash deployin order to deploy all of your codeif you want to call your code programmatically, instead of via ai-api, you can just modify your script to call the function and then run it
e.g.
running this script will hit your deployed worker.
the
FLASH_APPandFLASH_ENVenvironment variables override the app and environment assumptions.