
release: 1.0.0-alpha.58 #62

Merged
Hecate0821 merged 3 commits into main from
release-please--branches--main--changes--next
Apr 3, 2026

@stainless-app stainless-app Bot commented Apr 3, 2026

Automated Release PR

1.0.0-alpha.58 (2026-04-03)

Full Changelog: v1.0.0-alpha.57...v1.0.0-alpha.58

Bug Fixes

  • training-sdk: disable delta hotload for LoRA adapters (#114) (4cfd1e7)

This pull request is managed by Stainless's GitHub App.

The semver version number is based on included commit messages. Alternatively, you can manually set the version number in the title of this pull request.

For a better experience, it is recommended to use either rebase-merge or squash-merge when merging this pull request.

🔗 Stainless website
📚 Read the docs
🙋 Reach out for help or questions

stainless-app Bot and others added 3 commits April 3, 2026 18:32
LoRA adapter checkpoints are standalone PEFT exports; the trainer
does not XOR-compress them regardless of checkpoint_type. Sending
incremental_snapshot_metadata for LoRA hotloads causes the serving
container to crash with a KeyError during delta decompression, because
raw adapter safetensors have no per-tensor Adler32 checksums in their
metadata.

When lora_rank > 0, WeightSyncer now forces checkpoint_type="base" and
returns None for incremental metadata, so every hotload is FULL.
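
The guard described above can be sketched roughly as follows. This is a minimal illustration, not the actual WeightSyncer code: the function name `resolve_hotload_plan`, its signature, and the `"delta"` value in the usage note are assumptions; only `lora_rank`, `checkpoint_type`, `"base"`, and the `None` metadata come from the commit message.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_hotload_plan(
    lora_rank: int,
    checkpoint_type: str,
    incremental_snapshot_metadata: Optional[Dict[str, Any]],
) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Return the (checkpoint_type, incremental metadata) pair to send.

    Hypothetical helper: LoRA adapters are never delta-compressed, so any
    LoRA run is forced onto the full ("base") hotload path.
    """
    if lora_rank > 0:
        # No incremental metadata means the serving side performs a FULL load.
        return "base", None
    # Full-parameter runs keep whatever the caller requested.
    return checkpoint_type, incremental_snapshot_metadata
```

With these assumptions, `resolve_hotload_plan(16, "delta", {...})` yields `("base", None)`, while a run with `lora_rank == 0` passes its arguments through unchanged.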

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

stainless-app Bot commented Apr 3, 2026

🧪 Testing

To try out this version of the SDK:

pip install 'https://pkg.stainless.com/s/fireworks-ai-python/4cfd1e7b3865802299c52f36c902731672633a4b/fireworks_ai-1.0.0a57-py3-none-any.whl'

Expires at: Sun, 03 May 2026 21:01:04 GMT
Updated at: Fri, 03 Apr 2026 21:01:04 GMT

@Hecate0821 Hecate0821 merged commit 2e7818d into main Apr 3, 2026
7 checks passed

stainless-app Bot commented Apr 3, 2026

stainless-app Bot pushed a commit that referenced this pull request Apr 17, 2026
* feat(ci): add mock server tests, e2e tests, coverage, and security scanning

Add CI infrastructure needed for stable release readiness:

- test-mock-server job: Prism mock server + multi-Python (3.9/3.11/3.13)
  matrix to unskip 884 API resource tests via RUN_MOCK_SERVER_TESTS env var
- test-coverage job: pytest-cov with 60% threshold and artifact upload
- ci-e2e.yml: daily e2e tests against live Fireworks API with
  tests/functional/ suite covering chat completions, streaming, async,
  multi-turn, raw response, legacy completions, and model listing
- ci-security.yml: weekly pip-audit dependency vulnerability scan
- post-publish.yml: smoke test after PyPI publish on 3 Python versions
- conftest.py hook to dynamically remove mock server skip markers
- pytest-cov and pip-audit added to dev dependencies
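
A conftest.py hook of the kind mentioned in the list above might look roughly like this. It is a sketch under stated assumptions, not the repository's hook: the skip-reason fragment and the `"1"` value for `RUN_MOCK_SERVER_TESTS` are guesses, and it only handles markers applied directly to a test function (pytest's `own_markers`), not class- or module-level ones.

```python
import os
from typing import Any, List

# Assumption: the generated tests are skipped with a reason that mentions
# the mock server, e.g. @pytest.mark.skip(reason="... mock server ...").
SKIP_REASON_FRAGMENT = "mock server"


def pytest_collection_modifyitems(config: Any, items: List[Any]) -> None:
    """Drop mock-server skip markers when RUN_MOCK_SERVER_TESTS is enabled."""
    # Assumption: the variable counts as enabled when it is set to "1".
    if os.environ.get("RUN_MOCK_SERVER_TESTS") != "1":
        return
    for item in items:
        # Keep every marker except skip markers whose reason matches.
        item.own_markers[:] = [
            marker
            for marker in item.own_markers
            if not (
                marker.name == "skip"
                and SKIP_REASON_FRAGMENT in str(marker.kwargs.get("reason", "")).lower()
            )
        ]
```

Leaving the variable unset keeps the markers in place, so a plain local `pytest` run still skips the tests that need the mock server.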

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add fireworks.client compat shim and split training extras

Backwards compatibility:
- Add fireworks/client/__init__.py that re-exports Fireworks and
  AsyncFireworks from the top-level package with a DeprecationWarning.
  This preserves the old `from fireworks.client import Fireworks` import
  path used by langchain-fireworks, instructor, and internal services.
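
One way such a shim can be built is with a module-level `__getattr__` (PEP 562), sketched below on a synthetic module so the example runs without the SDK installed. The class bodies, the module name, and the warning text are placeholders, and the real `fireworks/client/__init__.py` may emit its warning differently (for example at import time).

```python
import types
import warnings


class Fireworks:
    """Placeholder for the real top-level client class."""


class AsyncFireworks:
    """Placeholder for the real top-level async client class."""


def make_legacy_module() -> types.ModuleType:
    """Build a module that re-exports the clients and warns on access."""
    legacy = types.ModuleType("fireworks_client_sketch")  # hypothetical name
    exports = {"Fireworks": Fireworks, "AsyncFireworks": AsyncFireworks}

    def module_getattr(name: str) -> object:
        if name in exports:
            warnings.warn(
                f"Importing {name} from the legacy path is deprecated; "
                "import it from the top-level package instead.",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the importing code
            )
            return exports[name]
        raise AttributeError(f"module {legacy.__name__!r} has no attribute {name!r}")

    # A function stored as __getattr__ in a module's namespace is called
    # for attributes that normal lookup cannot find.
    legacy.__getattr__ = module_getattr
    return legacy
```

Because the re-export returns the same class object, `isinstance` checks against either import path keep working.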

Training extras split:
- Add `training-sdk` optional extra with only the deps the SDK actually
  imports at runtime (tinker + requests). The SDK never imports torch,
  transformers, datasets, tiktoken, numpy, or wandb.
- Keep `training` extra as the full set for cookbook/notebook workflows.
  `pip install fireworks-ai[training-sdk]` saves ~5GB vs `[training]`.

Downstream compat tests:
- Add test_downstream_compat.py verifying old import paths, new import
  paths, client interface surface, and deprecation warning behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add acreate() compat alias for langchain and instructor

langchain-fireworks and instructor both call
`client.chat.completions.acreate()` — a method that existed on the 0.x
SDK but not the Stainless-generated 1.x SDK (which uses async-native
`create()` instead).

Add `acreate` as a deprecated alias on both `AsyncCompletionsResource`
classes (chat and legacy completions) via a patch in `lib/_legacy_compat.py`.
The alias calls `create()` and emits a DeprecationWarning.
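
The aliasing idea can be illustrated with a stand-in class. This is a sketch, not the contents of `lib/_legacy_compat.py`: the stub `AsyncCompletionsResource`, the helper name `patch_acreate_alias`, and the warning text are assumptions made for the example.

```python
import asyncio
import warnings
from typing import Any, Dict


class AsyncCompletionsResource:
    """Stand-in for the generated resource class; only create() is modeled."""

    async def create(self, **kwargs: Any) -> Dict[str, Any]:
        return {"received": kwargs}


def patch_acreate_alias(resource_cls: Any) -> None:
    """Attach a deprecated acreate() that forwards to create()."""

    async def acreate(self: Any, **kwargs: Any) -> Any:
        warnings.warn(
            "acreate() is deprecated; call create() instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return await self.create(**kwargs)

    resource_cls.acreate = acreate


patch_acreate_alias(AsyncCompletionsResource)


async def demo() -> Dict[str, Any]:
    # Old-style call site: awaits acreate() exactly like the 0.x SDK allowed.
    return await AsyncCompletionsResource().acreate(model="some-model")
```

Patching the class rather than instances means every client created afterwards picks up the alias without further setup.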

This means langchain-fireworks and instructor will work unchanged once
they bump their `<1.0.0` upper bound — no code changes needed on their
side beyond the version constraint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add downstream compat unit and e2e tests for langchain/instructor patterns

Unit tests (respx-mocked) verify exact calling patterns used by
langchain-fireworks and instructor: sync/async create, acreate alias,
isinstance dispatch via old import paths, response.model_dump(), and
streaming chunk.delta access. E2e tests replicate the same patterns
against the live Fireworks API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use realistic base_url=None in downstream compat unit tests

langchain-fireworks defaults fireworks_api_base to None, so the SDK
uses its hardcoded absolute inference URLs (e.g.
https://api.fireworks.ai/inference/v1/chat/completions). Updated mocks
to match this real-world behavior instead of overriding base_url.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: fix ruff target, disable prerelease mode, remove --pre from README

- Ruff target-version py38 → py39 to match requires-python >= 3.9
- release-please: versioning default, prerelease false — ready for
  stable 1.0.0 release
- README: drop --pre flag from install instructions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix ruff lint errors (import sorting, unused import, lambda arg)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: remove .coverage from tracking

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve pyright strict mode errors in test files and legacy compat

- Add type annotations to _patch_acreate_aliases (Any params/return)
- Add pyright pragmas to test files that exercise monkey-patched acreate
- Use type: ignore for runtime-patched attribute access (acreate)
- Annotate streaming chunk lists as list[object] for type narrowing
- Inline kwargs in e2e test to avoid dict[str, str|int] unpacking issue
- Use isinstance(content, str) instead of assert content for type guard

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add union-attr to type ignore for mypy compatibility

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review comments for downstream compat

- Filter DeprecationWarnings by message text (stacklevel=2 attributes
  to caller, not fireworks.client module)
- Relax version assertion to startswith("1.") for alpha/rc compat
- Add Prism health check with curl -sf failure detection in CI
- Replace sleep-based PyPI wait with retry loop in post-publish
- Loosen tinker pin from ==0.15.0 to >=0.15.0
- Remove per-test warnings.catch_warnings blocks (use global filter)
- Add test_async_acreate_streaming covering langchain _astream() path

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use curl -s (not -sf) for Prism health check

Prism returns 404 for GET / since it only serves spec endpoints.
The -f flag causes curl to fail on HTTP errors. We only need to
check that Prism is reachable (any response = server is up).
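
The same "any response means the server is up" logic, expressed in Python instead of curl, could look like the sketch below. The function name `server_is_up` and the timeout are illustrative; the CI job itself uses curl.

```python
import urllib.error
import urllib.request


def server_is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the server answers at all, even with a 4xx/5xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an error status (such as 404), so it is
        # running. This mirrors curl -s, which does not fail on HTTP errors.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: nothing is listening.
        return False
```

`HTTPError` has to be caught before `URLError` because it is a subclass; swapping the order would turn a 404 into "not reachable", which is the behaviour the `-f` flag caused.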

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): bump aiohttp to 3.13.5 and tighten coverage gate to 70%

- aiohttp 3.13.3 → 3.13.5 in lock files (resolves 10 CVEs: CVE-2026-34513
  through CVE-2026-34520, CVE-2026-34525, CVE-2026-22815)
- Scope coverage to SDK infrastructure + compat shims; exclude Stainless
  autogen (resources/, types/) and training/ (which has its own test suite)
- Raise --cov-fail-under from 60 to 70; local run gets 71.20%

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@stainless-app stainless-app Bot mentioned this pull request Apr 17, 2026
Hecate0821 added a commit that referenced this pull request Apr 22, 2026
* feat: downstream compat shims, CI gates, and test coverage (#62)

* perf(client): optimize file structure copying in multipart requests

* fix(compat): make acreate(stream=True) iterable without await (#123)

langchain-fireworks does `async for chunk in async_client.acreate(stream=True, ...)`
without an intermediate await. The previous shim was an `async def`, so the
call returned a coroutine, raising `TypeError: 'async for' requires an object
with __aiter__`.

Wrap the underlying coroutine in `_AsyncCreateProxy`, which implements both
`__await__` (for `await acreate(...)`) and `__aiter__` (for direct iteration).
Both patterns now work; verified live against api.fireworks.ai with
langchain-fireworks 1.1.0 and instructor 1.14.5.

The previous test_async_acreate_streaming did `stream = await acreate(...)`
first — it never exercised langchain's actual no-await pattern. Replaced with
two tests, one per pattern.
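
A proxy that supports both calling patterns can be sketched as follows. This is an illustration of the technique rather than the SDK's `_AsyncCreateProxy`: the class name `AsyncCreateProxySketch`, the `fake_create` coroutine, and the chunk values are made up for the example.

```python
import asyncio
from typing import Any, AsyncIterator, Coroutine, Generator, List, Tuple


class AsyncCreateProxySketch:
    """Wrap a coroutine so it can be awaited or iterated directly."""

    def __init__(self, coro: Coroutine[Any, Any, Any]) -> None:
        self._coro = coro

    def __await__(self) -> Generator[Any, None, Any]:
        # Supports: result = await acreate(...)
        return self._coro.__await__()

    def __aiter__(self) -> AsyncIterator[Any]:
        # Supports: async for chunk in acreate(stream=True, ...)
        return self._iterate()

    async def _iterate(self) -> AsyncIterator[Any]:
        stream = await self._coro
        async for chunk in stream:
            yield chunk


async def fake_create(stream: bool = False) -> Any:
    """Stand-in for the real create(): returns a stream or a full response."""

    async def chunks() -> AsyncIterator[str]:
        for piece in ("Hel", "lo"):
            yield piece

    return chunks() if stream else "Hello"


def acreate(**kwargs: Any) -> AsyncCreateProxySketch:
    # A plain function (not async def), so the call returns the proxy itself.
    return AsyncCreateProxySketch(fake_create(**kwargs))


async def demo() -> Tuple[List[str], List[str], str]:
    direct = [c async for c in acreate(stream=True)]         # no await
    awaited = [c async for c in await acreate(stream=True)]  # await first
    plain = await acreate(stream=False)
    return direct, awaited, plain
```

The key design point is that `acreate` is a regular function: an `async def` would hand back a bare coroutine, which has no `__aiter__`.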

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* sdk: add load_adapter client method for HF PEFT warm-start (#122)

* Add load_adapter method to FiretitanTrainingClient

Exposes the server-side /api/v1/load_adapter endpoint (defined in
firetitan tinker_api.py) to Python callers. Returns a Future that
resolves when the HF PEFT adapter has been loaded into the current
LoRA session. Weights-only load — matches V1 WithInputPeftAddon
warm-start semantics (fresh optimizer, fresh LR, fresh data cursor).

The stainless-generated client does not have a typed method for this
endpoint yet; call the endpoint directly via client.post() and wrap
the UntypedAPIFuture response in _APIFuture for polling. Pattern
mirrors save_weights_for_sampler_ext.

Unblocks cookbook-side HF warm-start plumbing and the control-plane
Model-to-GcsUri resolver that depends on it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* review: fix adapter_path stripping and misleading docstring

Code review findings:
- Strip leading/trailing whitespace from adapter_path before sending.
  Prior: validation used .strip() but the raw value was forwarded,
  so "  gs://bucket  " would pass validation and fail server-side
  with a less clear error.
- Correct the docstring's Raises: section. The method doesn't raise
  RuntimeError; server rejections surface as httpx errors through
  the returned Future. Document that explicitly.
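
The stripping fix in the first item amounts to validating and forwarding the same cleaned value. A minimal sketch, assuming a helper named `normalize_adapter_path` and a `ValueError` for empty input (neither is confirmed by the source):

```python
def normalize_adapter_path(adapter_path: str) -> str:
    """Return the stripped path, rejecting empty or whitespace-only input.

    Hypothetical helper: the point is that the value that gets validated
    is also the value that gets sent to the server.
    """
    cleaned = adapter_path.strip()
    if not cleaned:
        # Assumption: the real method may raise a different exception type.
        raise ValueError("adapter_path must be a non-empty string")
    return cleaned
```

With this shape, `"  gs://bucket  "` is forwarded as `"gs://bucket"` instead of failing server-side.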

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Trim load_adapter docstring

Remove internal / cross-repo references from the public-facing API
docstring. Keep only the contract: what the method does, what args
it accepts, what it returns, what it raises.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix load_adapter result type to LoadAdapterResponse

Initial implementation cast the _APIFuture to UntypedAPIFuture, which
requires a request_id field. The server's final response (after the
LoadAdapterOp completes) is LoadAdapterResponse — {model_id,
adapter_path, type: 'load_adapter'} — missing request_id, so Pydantic
validation errored out on completion.

Define a local LoadAdapterResponse matching the server schema and
route _APIFuture through it. Initial POST still casts to
UntypedAPIFuture since that IS the enqueue response shape.
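
The shape mismatch can be reproduced with a stdlib-only stand-in for the Pydantic models. The dataclasses and the `parse_as` helper below are hypothetical; the field names follow the commit message, and the payload values are invented.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class EnqueueResponseSketch:
    """Shape of the initial POST response: only a request_id is required."""

    request_id: str


@dataclass(frozen=True)
class LoadAdapterResponseSketch:
    """Shape of the final response once the load operation completes."""

    model_id: str
    adapter_path: str
    type: str


def parse_as(model: Any, payload: Mapping[str, Any]) -> Any:
    """Strictly build `model` from `payload`, failing on missing fields."""
    names = [f.name for f in fields(model)]
    missing = [name for name in names if name not in payload]
    if missing:
        raise ValueError(f"{model.__name__} is missing required fields: {missing}")
    return model(**{name: payload[name] for name in names})


# Invented example of a final payload; note that it has no request_id.
final_payload = {
    "model_id": "example-model",
    "adapter_path": "gs://example-bucket/adapter",
    "type": "load_adapter",
}
```

Parsing `final_payload` as `EnqueueResponseSketch` fails on the missing `request_id`, which mirrors the validation error described above, while `LoadAdapterResponseSketch` accepts it.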

Caught by a cookbook-direct end-to-end run on pyroworks against a
real Qwen3-4B LoRA adapter — trainer spun up, load_adapter POST
succeeded, but the client polling raised on the response shape. With
this fix, same flow completes: adapter loaded in 5.4s, three optim
steps run with loss 4.30 -> 1.66 (well below random-init territory).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Trim load_adapter docstrings and simplify input guard

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix import sort order (ruff I001)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(rollouts): add BYOT RL rollout guides

* ci: add Sync Text Completion OpenAPI workflow

* release: 1.0.0-alpha.63

---------

Co-authored-by: Yufei (Benny) Chen <1585539+benjibc@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: stainless-app[bot] <142633134+stainless-app[bot]@users.noreply.github.com>
Co-authored-by: Chengxi Li <114854555+Hecate0821@users.noreply.github.com>