Merged
LoRA adapter checkpoints are standalone PEFT exports: the trainer does not XOR-compress them, regardless of checkpoint_type. Sending incremental_snapshot_metadata for a LoRA hotload crashes the serving container with a KeyError during delta decompression, because raw adapter safetensors files carry no per-tensor Adler32 checksums in their metadata.

When lora_rank > 0, WeightSyncer now forces checkpoint_type="base" and returns None for the incremental metadata, so every hotload is FULL.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
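The guard described above can be sketched roughly as follows. This is a minimal illustration, not the actual WeightSyncer code: `HotloadPlan`, `plan_hotload`, the `"delta"` value, and the shape of the metadata dict are all hypothetical names chosen for the example. Only `lora_rank`, `checkpoint_type="base"`, and the `None` metadata come from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HotloadPlan:
    # Hypothetical container for what the syncer would send to the serving side.
    checkpoint_type: str
    incremental_metadata: Optional[dict]


def plan_hotload(lora_rank: int,
                 requested_checkpoint_type: str,
                 previous_snapshot: Optional[str]) -> HotloadPlan:
    """Decide between a FULL hotload and an incremental one (illustrative only)."""
    if lora_rank > 0:
        # LoRA adapters are standalone exports with no delta checksums,
        # so always fall back to a FULL hotload with no incremental metadata.
        return HotloadPlan(checkpoint_type="base", incremental_metadata=None)
    if requested_checkpoint_type == "base" or previous_snapshot is None:
        # Nothing to diff against: also a FULL hotload.
        return HotloadPlan(checkpoint_type="base", incremental_metadata=None)
    # Full-parameter training may still use incremental snapshots.
    # The metadata key below is an assumption made for this sketch.
    return HotloadPlan(
        checkpoint_type=requested_checkpoint_type,
        incremental_metadata={"previous_snapshot": previous_snapshot},
    )
```

With this shape, a caller that asks for an incremental checkpoint while training a LoRA adapter still gets a FULL hotload, which is the behavior the fix is meant to guarantee.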
stainless-app Bot pushed a commit that referenced this pull request on Apr 17, 2026
* feat(ci): add mock server tests, e2e tests, coverage, and security scanning

Add CI infrastructure needed for stable release readiness:
- test-mock-server job: Prism mock server + multi-Python (3.9/3.11/3.13) matrix to unskip 884 API resource tests via RUN_MOCK_SERVER_TESTS env var
- test-coverage job: pytest-cov with 60% threshold and artifact upload
- ci-e2e.yml: daily e2e tests against live Fireworks API with tests/functional/ suite covering chat completions, streaming, async, multi-turn, raw response, legacy completions, and model listing
- ci-security.yml: weekly pip-audit dependency vulnerability scan
- post-publish.yml: smoke test after PyPI publish on 3 Python versions
- conftest.py hook to dynamically remove mock server skip markers
- pytest-cov and pip-audit added to dev dependencies

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add fireworks.client compat shim and split training extras

Backwards compatibility:
- Add fireworks/client/__init__.py that re-exports Fireworks and AsyncFireworks from the top-level package with a DeprecationWarning. This preserves the old `from fireworks.client import Fireworks` import path used by langchain-fireworks, instructor, and internal services.

Training extras split:
- Add `training-sdk` optional extra with only the deps the SDK actually imports at runtime (tinker + requests). The SDK never imports torch, transformers, datasets, tiktoken, numpy, or wandb.
- Keep `training` extra as the full set for cookbook/notebook workflows. `pip install fireworks-ai[training-sdk]` saves ~5GB vs `[training]`.

Downstream compat tests:
- Add test_downstream_compat.py verifying old import paths, new import paths, client interface surface, and deprecation warning behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add acreate() compat alias for langchain and instructor

langchain-fireworks and instructor both call `client.chat.completions.acreate()`, a method that existed on the 0.x SDK but not the Stainless-generated 1.x SDK (which uses async-native `create()` instead).

Add `acreate` as a deprecated alias on both `AsyncCompletionsResource` classes (chat and legacy completions) via a patch in `lib/_legacy_compat.py`. The alias calls `create()` and emits a DeprecationWarning.

This means langchain-fireworks and instructor will work unchanged once they bump their `<1.0.0` upper bound: no code changes needed on their side beyond the version constraint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add downstream compat unit and e2e tests for langchain/instructor patterns

Unit tests (respx-mocked) verify exact calling patterns used by langchain-fireworks and instructor: sync/async create, acreate alias, isinstance dispatch via old import paths, response.model_dump(), and streaming chunk.delta access. E2e tests replicate the same patterns against the live Fireworks API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use realistic base_url=None in downstream compat unit tests

langchain-fireworks defaults fireworks_api_base to None, so the SDK uses its hardcoded absolute inference URLs (e.g. https://api.fireworks.ai/inference/v1/chat/completions). Updated mocks to match this real-world behavior instead of overriding base_url.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: fix ruff target, disable prerelease mode, remove --pre from README

- Ruff target-version py38 → py39 to match requires-python >= 3.9
- release-please: versioning default, prerelease false (ready for stable 1.0.0 release)
- README: drop --pre flag from install instructions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix ruff lint errors (import sorting, unused import, lambda arg)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: remove .coverage from tracking

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve pyright strict mode errors in test files and legacy compat

- Add type annotations to _patch_acreate_aliases (Any params/return)
- Add pyright pragmas to test files that exercise monkey-patched acreate
- Use type: ignore for runtime-patched attribute access (acreate)
- Annotate streaming chunk lists as list[object] for type narrowing
- Inline kwargs in e2e test to avoid dict[str, str|int] unpacking issue
- Use isinstance(content, str) instead of assert content for type guard

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add union-attr to type ignore for mypy compatibility

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review comments for downstream compat

- Filter DeprecationWarnings by message text (stacklevel=2 attributes to caller, not fireworks.client module)
- Relax version assertion to startswith("1.") for alpha/rc compat
- Add Prism health check with curl -sf failure detection in CI
- Replace sleep-based PyPI wait with retry loop in post-publish
- Loosen tinker pin from ==0.15.0 to >=0.15.0
- Remove per-test warnings.catch_warnings blocks (use global filter)
- Add test_async_acreate_streaming covering langchain _astream() path

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use curl -s (not -sf) for Prism health check

Prism returns 404 for GET / since it only serves spec endpoints. The -f flag causes curl to fail on HTTP errors. We only need to check that Prism is reachable (any response = server is up).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): bump aiohttp to 3.13.5 and tighten coverage gate to 70%

- aiohttp 3.13.3 → 3.13.5 in lock files (resolves 10 CVEs: CVE-2026-34513 through CVE-2026-34520, CVE-2026-34525, CVE-2026-22815)
- Scope coverage to SDK infrastructure + compat shims; exclude Stainless autogen (resources/, types/) and training/ (which has its own test suite)
- Raise --cov-fail-under from 60 to 70; local run gets 71.20%

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
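The deprecated-alias pattern from the acreate() commit, together with the review note about filtering DeprecationWarnings by message text, might look roughly like the sketch below. It is an illustration under assumptions: `AsyncCompletionsStandIn`, `patch_acreate_alias`, `call_acreate_and_capture`, and the warning wording are hypothetical and do not reproduce the contents of `lib/_legacy_compat.py`.

```python
import asyncio
import warnings


class AsyncCompletionsStandIn:
    """Hypothetical stand-in for an async resource class; not the SDK's real class."""

    async def create(self, **kwargs):
        # Echo the arguments so the example needs no network access.
        return {"echo": kwargs}


def patch_acreate_alias(cls):
    """Attach a deprecated `acreate` alias that forwards to `create`."""

    async def acreate(self, *args, **kwargs):
        warnings.warn(
            "acreate() is deprecated; use create() instead",  # assumed wording
            DeprecationWarning,
            stacklevel=2,
        )
        return await self.create(*args, **kwargs)

    cls.acreate = acreate
    return cls


patch_acreate_alias(AsyncCompletionsStandIn)


def call_acreate_and_capture():
    """Call the alias and return its result plus any DeprecationWarning messages."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = asyncio.run(AsyncCompletionsStandIn().acreate(model="m"))
    messages = [
        str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)
    ]
    return result, messages
```

Matching on the message text rather than on the originating module follows the review comment above: with stacklevel=2 the warning is attributed to the caller, so a module-based filter would not reliably catch it.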
Hecate0821 added a commit that referenced this pull request on Apr 22, 2026
* feat: downstream compat shims, CI gates, and test coverage (#62)

* perf(client): optimize file structure copying in multipart requests

* fix(compat): make acreate(stream=True) iterable without await (#123)

langchain-fireworks does `async for chunk in async_client.acreate(stream=True, ...)` without an intermediate await. The previous shim was an `async def`, so the call returned a coroutine, raising `TypeError: 'async for' requires an object with __aiter__`.

Wrap the underlying coroutine in `_AsyncCreateProxy`, which implements both `__await__` (for `await acreate(...)`) and `__aiter__` (for direct iteration). Both patterns now work; verified live against api.fireworks.ai with langchain-fireworks 1.1.0 and instructor 1.14.5.

The previous test_async_acreate_streaming did `stream = await acreate(...)` first, so it never exercised langchain's actual no-await pattern. Replaced with two tests, one per pattern.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* sdk: add load_adapter client method for HF PEFT warm-start (#122)

* Add load_adapter method to FiretitanTrainingClient

Exposes the server-side /api/v1/load_adapter endpoint (defined in firetitan tinker_api.py) to Python callers. Returns a Future that resolves when the HF PEFT adapter has been loaded into the current LoRA session. Weights-only load: matches V1 WithInputPeftAddon warm-start semantics (fresh optimizer, fresh LR, fresh data cursor).

The stainless-generated client does not have a typed method for this endpoint yet; call the endpoint directly via client.post() and wrap the UntypedAPIFuture response in _APIFuture for polling. Pattern mirrors save_weights_for_sampler_ext.

Unblocks cookbook-side HF warm-start plumbing and the control-plane Model-to-GcsUri resolver that depends on it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* review: fix adapter_path stripping and misleading docstring

Code review findings:
- Strip leading/trailing whitespace from adapter_path before sending. Prior: validation used .strip() but the raw value was forwarded, so " gs://bucket " would pass validation and fail server-side with a less clear error.
- Correct the docstring's Raises: section. The method doesn't raise RuntimeError; server rejections surface as httpx errors through the returned Future. Document that explicitly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Trim load_adapter docstring

Remove internal / cross-repo references from the public-facing API docstring. Keep only the contract: what the method does, what args it accepts, what it returns, what it raises.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix load_adapter result type to LoadAdapterResponse

Initial implementation cast the _APIFuture to UntypedAPIFuture, which requires a request_id field. The server's final response (after the LoadAdapterOp completes) is LoadAdapterResponse ({model_id, adapter_path, type: 'load_adapter'}), which is missing request_id, so Pydantic validation errored out on completion.

Define a local LoadAdapterResponse matching the server schema and route _APIFuture through it. Initial POST still casts to UntypedAPIFuture since that IS the enqueue response shape.

Caught by a cookbook-direct end-to-end run on pyroworks against a real Qwen3-4B LoRA adapter: trainer spun up, load_adapter POST succeeded, but the client polling raised on the response shape. With this fix, same flow completes: adapter loaded in 5.4s, three optim steps run with loss 4.30 -> 1.66 (well below random-init territory).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Trim load_adapter docstrings and simplify input guard

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix import sort order (ruff I001)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(rollouts): add BYOT RL rollout guides

* ci: add Sync Text Completion OpenAPI workflow

* release: 1.0.0-alpha.63

---------

Co-authored-by: Yufei (Benny) Chen <1585539+benjibc@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: stainless-app[bot] <142633134+stainless-app[bot]@users.noreply.github.com>
Co-authored-by: Chengxi Li <114854555+Hecate0821@users.noreply.github.com>
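To illustrate the `__await__` plus `__aiter__` technique described in the acreate(stream=True) fix, here is a small self-contained sketch. It is not the SDK's `_AsyncCreateProxy`: the class `AsyncCreateProxy` and the helpers `fake_stream`, `fake_create`, `acreate`, and `demo` are hypothetical stand-ins so the example runs without a network connection.

```python
import asyncio


class AsyncCreateProxy:
    """Wrap a coroutine so callers can either `await` it or `async for` over it."""

    def __init__(self, coro):
        self._coro = coro

    def __await__(self):
        # Supports `stream = await acreate(...)`.
        return self._coro.__await__()

    def __aiter__(self):
        # Supports `async for chunk in acreate(...)` with no intermediate await.
        return self._iterate()

    async def _iterate(self):
        stream = await self._coro
        async for chunk in stream:
            yield chunk


async def fake_stream():
    # Hypothetical stand-in for a streaming response.
    for token in ("a", "b", "c"):
        yield token


async def fake_create(stream=False):
    # Hypothetical stand-in for an async create(): resolves to a stream or a value.
    return fake_stream() if stream else "done"


def acreate(**kwargs):
    # A plain `def`, so the proxy is returned immediately instead of a bare coroutine.
    return AsyncCreateProxy(fake_create(**kwargs))


async def demo():
    direct = [chunk async for chunk in acreate(stream=True)]
    awaited = [chunk async for chunk in await acreate(stream=True)]
    plain = await acreate()
    return direct, awaited, plain
```

The key design point is that `acreate` is a regular function returning an object with both protocols, so the no-await iteration pattern and the await-first pattern share one code path.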
Automated Release PR
1.0.0-alpha.58 (2026-04-03)
Full Changelog: v1.0.0-alpha.57...v1.0.0-alpha.58
Bug Fixes
This pull request is managed by Stainless's GitHub App.
The semver version number is based on included commit messages. Alternatively, you can manually set the version number in the title of this pull request.
For a better experience, it is recommended to use either rebase-merge or squash-merge when merging this pull request.