SEG-52: add v2 async helpers (submit_async, run_async, AsyncJob) by shrey-rajvanshi · Pull Request #1 · segmind/segmind-python

shrey-rajvanshi · 2026-06-13T09:09:25Z

Summary

Adds v2 async support to the Python SDK — submit-and-poll for any heimdall `/v2/{slug}` model — so callers can stop wrapping their own request/poll loops around `requests.post`.

```python
import segmind

# One-shot: submit + poll until COMPLETED

result = segmind.run_async("seedance-1-pro", prompt="A sunset", timeout=300)

# Or split for finer control (parallelism, request_id tracking)

job = segmind.submit_async("seedance-1-pro", prompt="A sunset")
print(job.request_id)
result = job.wait(timeout=300)
```

Defaults: `interval=1.0s`, `timeout=600s`. Override per call for very slow models (Veo, Seedance video) or use webhooks (SEG-93) instead.

What's in

New module `segmind/v2.py` — `AsyncJob`, `InferenceFailed`, `InferenceTimeout`, internal `submit()` / `run()` / `_v2_base()`.
`SegmindClient.submit_async()` and `.run_async()` methods on the existing client.
Module-level `segmind.submit_async()` / `segmind.run_async()` resolving through the lazy default client.
11 respx-mocked unit tests covering submit, wait, polling progression, FAILED, TIMEOUT, one-shot, staging URL derivation, and module-level exports.

What's deliberately NOT in

Polling-hints consumption — SEG-243 was cancelled; this is a plain client. Caller tunes `timeout`/`interval` per call.
Async/await variant (`httpx.AsyncClient`) — separate ticket if/when wanted; the SDK is otherwise sync.
Webhook helpers — SEG-93 is the heimdall side; SDK webhook surface is a different concern.

Design notes

Exception names skip the `Error` suffix (`InferenceFailed`/`InferenceTimeout`). Reads naturally in caller code; per-file `# ruff: noqa: N818`.
`_v2_base()` derives the v2 prefix from `client.base_url` — callers using `api-latest.segmind.com/v1` for staging automatically get the matching `/v2` host with no extra config.
Loud-fail on a 2xx submit missing `request_id` / `status_url` / `response_url` instead of polling forever on a missing URL. Server contract is to always return all three.
`FAILED` path fetches the full result body (not just the status one) so the exception carries metrics + request_id alongside the error string.
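
The base-URL derivation in the design notes can be pictured roughly as follows. This is a minimal sketch, not the SDK's actual `_v2_base()`: the helper name `derive_v2_base`, the `example.test` host, and the rule of swapping a trailing `v1` path segment for `v2` are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def derive_v2_base(base_url: str) -> str:
    """Hypothetical helper: map a v1 base URL to the matching v2 base URL."""
    parts = urlsplit(base_url)
    # Split the path into non-empty segments so a trailing slash is ignored.
    segments = [segment for segment in parts.path.split("/") if segment]
    # Assumption: the API version is the last path segment ("v1").
    if segments and segments[-1] == "v1":
        segments[-1] = "v2"
    else:
        segments.append("v2")
    path = "/" + "/".join(segments)
    # Keep scheme and host untouched so a staging override stays on staging.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


staging_v2 = derive_v2_base("https://api-latest.segmind.com/v1")
```

Because only the path changes, a caller who overrides `base_url` for staging stays on the same host for v2 calls.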

Smoke

11 / 11 unit tests pass via `pytest tests/test_v2.py`.
`ruff check` clean.
`black --check` clean.
Live one-shot vs `api-latest.segmind.com/v2/mock-inference` (sleep=1, credits=1e-6) returned `COMPLETED` with `inference_time=1.013s`.
Live timeout path (`sleep=5, timeout=1`) raised `InferenceTimeout` cleanly with the expected request_id in the message.

Tickets

Parent: SEG-52 "SDK async" (Phase 1 — Async core).
Cancelled sibling: SEG-243 (server-side polling hints — decided we don't need them; client picks defaults).
Related: SEG-93 (webhooks, for slow models).

Adds support for the v2 submit-then-poll inference path:

    client.submit_async(slug, **params) -> AsyncJob
    client.run_async(slug, **params, timeout=600, interval=1.0) -> dict
    AsyncJob.wait(timeout, interval) -> dict
    AsyncJob.status() / AsyncJob.result()

Plus module-level shortcuts segmind.submit_async / segmind.run_async that resolve through the lazy default SegmindClient.

Design choices:

* 1.0s poll interval, 600s timeout defaults. No consumption of server-side polling hints (SEG-243 was cancelled in favour of a plain client). Callers tune timeout/interval per call for slow models, or use webhooks (SEG-93).
* Two new exceptions, InferenceFailed + InferenceTimeout, both subclassing the existing SegmindError so callers can broad-catch. Names deliberately omit the 'Error' suffix for natural reading (per-file ruff noqa).
* _v2_base() derives the v2 prefix from client.base_url so callers who override for staging (api-latest.segmind.com/v1) keep working without a separate v2 base_url.
* If a 2xx submit response lacks request_id/status_url/response_url we raise SegmindError immediately rather than polling forever on a missing URL.

Tests (11/11 pass, respx-mocked, no network):

* submit returns AsyncJob with the right URLs
* submit propagates 4xx via raise_for_status
* submit raises on missing request_id in a 2xx body
* wait returns result on COMPLETED
* wait polls through QUEUED -> PROCESSING -> COMPLETED
* wait raises InferenceFailed on FAILED with server error string
* wait raises InferenceTimeout when deadline elapses
* run_async one-shot end-to-end
* v2 URL derives correctly from a staging base_url override
* module-level run_async uses the lazy default client

Live smoke against api-latest.segmind.com /v2/mock-inference:

* run_async(sleep=1) -> status=COMPLETED, inference_time=1.013s
* run_async(sleep=5, timeout=1) -> InferenceTimeout raised cleanly

Linear: SEG-52 (parent, Phase 1 - Async core).
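
The loud-fail rule for an incomplete 2xx submit body could look something like the sketch below. It is an illustration only: `SegmindError` here is a local stand-in for the SDK's existing base exception, and `validate_submit_body`, `REQUIRED_SUBMIT_KEYS`, and the `example.test` URLs are hypothetical names that do not come from the PR.

```python
class SegmindError(Exception):
    """Local stand-in for the SDK's existing base exception."""


# The three fields the PR says the server contract always returns.
REQUIRED_SUBMIT_KEYS = ("request_id", "status_url", "response_url")


def validate_submit_body(body: dict) -> dict:
    """Hypothetical check: fail fast if a 2xx submit body cannot be polled."""
    missing = [key for key in REQUIRED_SUBMIT_KEYS if not body.get(key)]
    if missing:
        # Raising here avoids entering a poll loop with no URL to poll.
        raise SegmindError("submit response missing: " + ", ".join(missing))
    return body


complete = validate_submit_body(
    {
        "request_id": "req-123",
        "status_url": "https://example.test/v2/requests/req-123/status",
        "response_url": "https://example.test/v2/requests/req-123",
    }
)
```

Checking all three keys up front means the failure names every missing field at once instead of surfacing them one poll at a time.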

gemini-code-assist

Code Review

This pull request introduces v2 async inference capabilities to the Segmind Python SDK, adding submit_async and run_async methods alongside an AsyncJob class for polling and retrieving results. The feedback focuses on optimizing the polling loop in AsyncJob.wait to avoid sleeping past the timeout deadline, removing the unused _TERMINAL_STATES constant, and exposing SegmindError at the package level to simplify exception handling and unit tests.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-13T09:10:58Z

+        deadline = time.monotonic() + timeout
+        while True:
+            status_body = self.status()
+            state = status_body.get("status")
+
+            if state == "COMPLETED":
+                return self.result()
+
+            if state == "FAILED":
+                # /status carries the error for FAILED; pull the full body
+                # so the exception caller has metrics + request_id alongside.
+                final = self.result()
+                err = final.get("error") or status_body.get("error")
+                raise InferenceFailed(detail=err, response_body=final)
+
+            if time.monotonic() >= deadline:
+                raise InferenceTimeout(
+                    request_id=self.request_id,
+                    elapsed_s=timeout,
+                )
+
+            time.sleep(interval)


The current polling loop can sleep past the deadline and make an unnecessary HTTP request after the timeout has already expired. Additionally, if the timeout is reached during the sleep, the loop will still perform another status check before raising InferenceTimeout. We can optimize this by checking the deadline at the start of the loop, checking if the remaining time is exceeded before sleeping, and capping the sleep interval to the remaining time.

    deadline = time.monotonic() + timeout
    while True:
        if time.monotonic() >= deadline:
            raise InferenceTimeout(
                request_id=self.request_id,
                elapsed_s=timeout,
            )

        status_body = self.status()
        state = status_body.get("status")

        if state == "COMPLETED":
            return self.result()

        if state == "FAILED":
            # /status carries the error for FAILED; pull the full body
            # so the exception caller has metrics + request_id alongside.
            final = self.result()
            err = final.get("error") or status_body.get("error")
            raise InferenceFailed(detail=err, response_body=final)

        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise InferenceTimeout(
                request_id=self.request_id,
                elapsed_s=timeout,
            )

        time.sleep(min(interval, remaining))
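
The deadline handling suggested in this comment can be exercised without any network by injecting the clock and the sleep function. The sketch below is a standalone approximation under stated assumptions: `poll_until_done`, `PollTimeout`, and the `clock`/`sleep` parameters are hypothetical and are not part of the SDK, and `get_state` stands in for `AsyncJob.status()`.

```python
import time
from typing import Callable


class PollTimeout(Exception):
    """Stand-in for InferenceTimeout in this sketch."""


def poll_until_done(
    get_state: Callable[[], str],
    timeout: float,
    interval: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until a terminal state, never sleeping past the deadline."""
    deadline = clock() + timeout
    while True:
        # Check first so no request is issued once the deadline has passed.
        if clock() >= deadline:
            raise PollTimeout(f"no terminal state within {timeout}s")

        state = get_state()
        if state in ("COMPLETED", "FAILED"):
            return state

        remaining = deadline - clock()
        if remaining <= 0:
            raise PollTimeout(f"no terminal state within {timeout}s")
        # Cap the sleep so the loop wakes up at the deadline, not after it.
        sleep(min(interval, remaining))
```

With `timeout=2.5` and `interval=1.0`, a job that never finishes is polled three times and the final sleep is shortened to 0.5 seconds, so the timeout is raised at the deadline rather than up to one interval later.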

gemini-code-assist · 2026-06-13T09:10:58Z

+# Status strings reported by the v2 status endpoint. Anything outside
+# this set is treated as in-progress (forward-compat with future states).
+_TERMINAL_STATES = ("COMPLETED", "FAILED")


The constant _TERMINAL_STATES is defined but never used anywhere in the module. It should be removed to keep the codebase clean and maintainable.

gemini-code-assist · 2026-06-13T09:10:58Z

+from segmind.v2 import (
+    DEFAULT_POLL_INTERVAL_S,
+    DEFAULT_POLL_TIMEOUT_S,
+    AsyncJob,
+    InferenceFailed,
+    InferenceTimeout,
+)


Exposing SegmindError at the package level (segmind.SegmindError) is highly recommended so that users of the SDK can easily import and catch the base exception class without needing to import from internal modules. This also avoids workarounds in the test suite.

Suggested change

+from segmind.exceptions import SegmindError
 from segmind.v2 import (
     DEFAULT_POLL_INTERVAL_S,
     DEFAULT_POLL_TIMEOUT_S,
     AsyncJob,
     InferenceFailed,
     InferenceTimeout,
 )

gemini-code-assist · 2026-06-13T09:10:58Z

+    "DEFAULT_POLL_INTERVAL_S",
+    "DEFAULT_POLL_TIMEOUT_S",
+    "AsyncJob",
+    "InferenceFailed",
+    "InferenceTimeout",


Add SegmindError to `__all__` to explicitly export it as part of the public API.

Suggested change

     "DEFAULT_POLL_INTERVAL_S",
     "DEFAULT_POLL_TIMEOUT_S",
     "AsyncJob",
     "InferenceFailed",
     "InferenceTimeout",
+    "SegmindError",

gemini-code-assist · 2026-06-13T09:10:58Z

+        return_value=httpx.Response(401, json={"error": "Invalid API key"})
+    )
+
+    with pytest.raises(segmind.SegmindError if hasattr(segmind, "SegmindError") else Exception):


With SegmindError exposed at the package level, we can simplify this assertion and remove the conditional hasattr check.

Suggested change

-    with pytest.raises(segmind.SegmindError if hasattr(segmind, "SegmindError") else Exception):
+    with pytest.raises(segmind.SegmindError):

Self-review pass before the tester session reports back.

Simplicity:

* Remove _TERMINAL_STATES constant; it was defined but never referenced.
* Drop the SEG-243 self-reference from the module docstring; replace with product-facing 'use larger timeout/interval for slow models, or webhooks for fire-and-forget'.

Optimization:

* FAILED path no longer makes a second HTTP round-trip to the response URL. heimdall's /v2/requests/{id}/status already carries the error string on FAILED (SEG-97), so we can build the InferenceFailed exception from the status body alone.
* Rename InferenceFailed.response_body -> .status_body to reflect that it now holds the status payload, not the full result. Callers who want server metrics on failure can still call AsyncJob.result() themselves after catching.

Test:

* test_wait_raises_inference_failed_on_failed asserts via result_route.called == False that the optimization holds; any regression that re-introduces the extra GET fails this test.

All 11 tests still pass; ruff + black clean.

shrey-rajvanshi · 2026-06-13T09:46:18Z

Self-review pass — pushed `2c34e83`

Three cleanups before the tester session reports back. All are pure simplifications, with no public-API surface changes other than one field rename on the failure exception.

1. Removed unused `_TERMINAL_STATES`

The constant was defined and never referenced; `wait()` checks `state == "COMPLETED"` / `state == "FAILED"` directly. Deleted.

2. FAILED path no longer does an extra HTTP round-trip

Before:
```python
if state == "FAILED":
    final = self.result()  # extra GET /v2/requests/{id}
    err = final.get("error") or status_body.get("error")
    raise InferenceFailed(detail=err, response_body=final)
```

After:
```python
if state == "FAILED":
    raise InferenceFailed(
        detail=status_body.get("error"),
        status_body=status_body,
    )
```

`/v2/requests/{id}/status` already carries the error string on FAILED (heimdall SEG-97), so the second GET was redundant. Saves one round-trip per failure path.

Trade-off: the exception now carries the status payload (not the full result body). Callers who want server-side metrics on failure can still call `AsyncJob.result()` themselves after catching. The simpler default felt right for a no-over-engineering pass.

Field rename: `InferenceFailed.response_body` → `InferenceFailed.status_body`. The updated test guards the single-request behaviour with `assert result_route.called is False`, so any regression that re-introduces the extra GET fails the suite.
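
For callers who still want server-side metrics on failure, the opt-in pattern described in the trade-off might look like the sketch below. Everything here is a stand-in so the example runs offline: `InferenceFailed` mirrors only the `detail` and `status_body` fields named in this comment, while `FakeJob`, `run_with_metrics`, the error string, and the `metrics` payload are invented for illustration.

```python
class InferenceFailed(Exception):
    """Stand-in carrying the two fields named in this PR comment."""

    def __init__(self, detail, status_body):
        super().__init__(detail)
        self.detail = detail
        self.status_body = status_body


class FakeJob:
    """Hypothetical stand-in for AsyncJob; makes no network calls."""

    request_id = "req-123"

    def __init__(self):
        self.result_calls = 0

    def wait(self, timeout=600, interval=1.0):
        # Simulate a job whose status endpoint reports FAILED.
        status_body = {"status": "FAILED", "error": "worker crashed"}
        raise InferenceFailed(detail=status_body.get("error"), status_body=status_body)

    def result(self):
        # Counts calls so the extra round-trip is visible to the caller.
        self.result_calls += 1
        return {"status": "FAILED", "error": "worker crashed", "metrics": {"queue_time": 0.2}}


def run_with_metrics(job):
    """Wait for the job; on failure, opt in to one extra fetch for metrics."""
    try:
        return job.wait(timeout=300)
    except InferenceFailed as exc:
        full = job.result()  # the extra GET is now the caller's choice
        return {"error": exc.detail, "metrics": full.get("metrics")}


job = FakeJob()
outcome = run_with_metrics(job)
```

The extra request happens only in the `except` branch, so callers who do not need metrics pay nothing for it.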

3. Module docstring polish

Dropped the `SEG-243 cancelled` reference — internal noise for SDK readers. Replaced with product-facing guidance.

Verified

11 / 11 tests pass.
`ruff check` + `black --check` clean.
Live sanity against api-latest: a bogus slug surfaces as `SegmindError(404, "Model information not found")` at submit time (caught by `raise_for_status`), confirming the FAILED-state path is reachable separately for a queued-then-failed task.

What I considered but did NOT change

`submit_response` field on `AsyncJob` — keeps forward-compat for new server response keys; cheap to carry.
`_v2_base()` URL derivation — robust already; cosmetic micro-rewrite not worth the loss of clarity.
First-poll latency: the loop calls `status()` immediately (no initial sleep), so a fast model costs at most one extra poll interval when its result has not landed by the first round-trip.
`result()` caching — YAGNI; users rarely call it twice.

…ED + nits

Tester-session findings on SEG-52 (scenarios 5 and #4 nit). Reproduced end-to-end against api-latest.

Headline bug: InferenceFailed was unreachable for worker-side failures.

* heimdall returns HTTP 422 on /v2/requests/{id}/status when the task is in terminal FAILED state, while still carrying the {status: 'FAILED', error: '...'} body. The SDK's _request -> raise_for_status raised SegmindError(422) before wait() could inspect the body, so the FAILED branch never fired.
* Fix: add AsyncJob._fetch_terminal_tolerant(url) which uses the underlying httpx client directly (no raise_for_status). If the body announces status COMPLETED or FAILED, return it as a valid payload regardless of HTTP code; otherwise fall through to the existing raise_for_status so genuine 401/404/5xx still surface as SegmindError. status() and result() both route through the helper.
* _TERMINAL_STATES constant re-added (now used by the helper).
* Two new tests:
  - 4xx-with-FAILED-body -> InferenceFailed
  - genuine 401 with no terminal body -> SegmindError(401), NOT a wrapped InferenceFailed
* Live verified end-to-end:
  - sleep=99999 -> InferenceFailed('Validation error...sleep must be between 0 and 900.0 seconds')
  - bad slug (submit-time) -> SegmindError(404, 'Model information not found')
  - happy path unchanged -> COMPLETED, inference_time=1.013s

Nits from the tester comment:

* InferenceTimeout.elapsed_s was stamped as the *configured* timeout arg, not real wall time. Now computed via time.monotonic() - start (live: timeout=2.0 -> elapsed_s=2.264, including last poll's sleep).
* wait() docstring now notes that the result dict shape is model-dependent (status/output/metrics are reliable; the rest is model-specific).

Tests: 13/13 pass. ruff + black clean.

Tester verdict: 'requires one change before ship'; this commit addresses that change.

Host URL leak (staging returns prod URLs) is heimdall-side, not SDK; will note as a separate follow-up.
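
The terminal-tolerant decision described in this commit message can be sketched independently of httpx. This is an approximation under assumptions, not the SDK's `_fetch_terminal_tolerant`: `FakeResponse`, `parse_terminal_tolerant`, and the local `SegmindError` constructor signature are hypothetical, and the real helper works on an httpx response rather than a pre-parsed body.

```python
from dataclasses import dataclass
from typing import Optional

TERMINAL_STATES = ("COMPLETED", "FAILED")


class SegmindError(Exception):
    """Local stand-in for the SDK's base exception."""

    def __init__(self, status_code: int, message: str):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code


@dataclass
class FakeResponse:
    """Hypothetical minimal response: HTTP code plus parsed JSON body."""

    status_code: int
    body: Optional[dict] = None


def parse_terminal_tolerant(resp: FakeResponse) -> Optional[dict]:
    """Accept a terminal body even on 4xx; otherwise apply normal error handling."""
    body = resp.body
    # A body announcing COMPLETED or FAILED is a valid payload at any HTTP code.
    if isinstance(body, dict) and body.get("status") in TERMINAL_STATES:
        return body
    # No terminal body: genuine HTTP errors still surface as SegmindError.
    if resp.status_code >= 400:
        message = body.get("error", "request failed") if isinstance(body, dict) else "request failed"
        raise SegmindError(resp.status_code, message)
    return body


failed = parse_terminal_tolerant(FakeResponse(422, {"status": "FAILED", "error": "bad input"}))
```

Inspecting the body before the status code is what lets a 422 carrying a FAILED payload reach the FAILED branch of `wait()`, while a 401 with no terminal body is still raised as a transport-level error.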

@v4

The Documentation workflow auto-failed on this PR because actions/upload-pages-artifact@v2 transitively pulls in actions/upload-artifact@v3, which GitHub auto-fails as of 2024-04-16. The if: refs/heads/main gates don't help: the deprecation check runs at workflow parse time, before any step runs.

Bump to the current major versions, all of which use actions/upload-artifact@v4 internally:

    actions/setup-python          @v4 -> @v5
    actions/configure-pages       @v3 -> @v5
    actions/upload-pages-artifact @v2 -> @v3
    actions/deploy-pages          @v2 -> @v4

No behaviour change in the steps themselves; the docs build job still runs on PR (build smoke) and the deploy steps still gate on github.ref == 'refs/heads/main'.

Back-merged origin/main (PR #1 v2-async + the docs.yml deprecated-actions CI fix) into this branch so build-and-deploy passes — it was red only because the branch predated the Pages-actions bump now on main. Bump __version__ 1.0.0 -> 1.1.0: 1.0.0 is already on PyPI, and main gained the v2 async feature (submit_async / run_async / AsyncJob) since 1.0.0 — a minor bump. This branch also carries the SEG-319 X-Initiator: SDK-PY change, so 1.1.0 ships both. Full suite green after the merge.

* feat(client): send X-Initiator: SDK-PY so SDK traffic is attributable (SEG-319)

  The SDK sent X-Initiator: segmind-python-sdk/0.1.0, which spot-backend's SQS worker rejects (not in InitiatorType) and coerces to OTHERS, so SDK calls are indistinguishable from raw requests/curl in the DB.

  Send the stable token X-Initiator: SDK-PY instead. Heimdall passes it through verbatim on the sync path (-> SDK-PY) and suffixes -V2 on the v2-async path (-> SDK-PY-V2). Both are added to InitiatorType in the paired spot-backend PR.

  Version detail stays in the User-Agent header (segmind-python-sdk/0.1.0), which heimdall logs, so we don't lose version telemetry.

  Updated test_http_client_headers assertions accordingly. Full suite: 256 passed, 7 skipped.

* chore(release): back-merge main + bump version to 1.1.0

  Back-merged origin/main (PR #1 v2-async + the docs.yml deprecated-actions CI fix) into this branch so build-and-deploy passes; it was red only because the branch predated the Pages-actions bump now on main.

  Bump __version__ 1.0.0 -> 1.1.0: 1.0.0 is already on PyPI, and main gained the v2 async feature (submit_async / run_async / AsyncJob) since 1.0.0, which is a minor bump. This branch also carries the SEG-319 X-Initiator: SDK-PY change, so 1.1.0 ships both. Full suite green after the merge.

* chore(release): use patch bump 1.0.1 (not 1.1.0)

---------

Co-authored-by: Shrey Kant Rajvanshi <shrey@segmind.com>

gemini-code-assist Bot reviewed Jun 13, 2026


Shrey Kant Rajvanshi added 2 commits June 13, 2026 15:26

shrey-rajvanshi merged commit 104f176 into main Jun 14, 2026
6 checks passed

shrey-rajvanshi merged 4 commits into main from shrey/seg-52-v2-async
