chore: Prepare for release (0a2e9115c308)#1940

Merged
Luca Forstner (lforst) merged 27 commits into release from prepare-release/0a2e9115c308
May 4, 2026

Conversation


@github-actions github-actions Bot commented May 4, 2026

Prepares a release by updating changelogs and package versions, and synchronizing everything to the release branch.

github-actions Bot and others added 27 commits April 20, 2026 21:49
Synchronizes the main branch with the release branch. Changed files
should generally only be package versions, changeset files, and
changelogs.

---------

Co-authored-by: Luca Forstner <luca.forstner@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Moves us away from discontinued models for testing.
We were ignoring streamed tool calls.

Fixes
#1846
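
For context, a minimal sketch of why streamed tool calls are easy to drop: in a streamed response, a tool call typically arrives as fragments that have to be stitched together before it can be recorded. The chunk shape below assumes an OpenAI-style delta format, and `ToolCallDelta`, `ToolCall`, and `collectToolCalls` are hypothetical names for illustration, not the SDK's actual code.

```typescript
// Assumed OpenAI-style streaming delta; other providers use different shapes.
interface ToolCallDelta {
  index: number; // position of the tool call within the message
  id?: string; // usually only present on the first fragment
  function?: { name?: string; arguments?: string };
}

interface ToolCall {
  id: string;
  name: string;
  arguments: string; // JSON text, complete only after the stream ends
}

// Hypothetical helper: merge fragments into complete tool calls.
function collectToolCalls(deltas: ToolCallDelta[]): ToolCall[] {
  const byIndex = new Map<number, ToolCall>();
  for (const delta of deltas) {
    let call = byIndex.get(delta.index);
    if (call === undefined) {
      call = { id: "", name: "", arguments: "" };
      byIndex.set(delta.index, call);
    }
    // id and name replace earlier values; argument fragments are concatenated.
    call.id = delta.id ?? call.id;
    call.name = delta.function?.name ?? call.name;
    call.arguments += delta.function?.arguments ?? "";
  }
  // Return calls ordered by their index in the message.
  return Array.from(byIndex.entries())
    .sort((a, b) => a[0] - b[0])
    .map((entry) => entry[1]);
}
```

Only once the stream has finished is `arguments` guaranteed to be parseable JSON, so any logging that inspects individual chunks in isolation would miss the call.
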
Now looks like this

<img width="288" height="301" alt="Screenshot 2026-04-22 at 13 32 24"
src="https://github.com/user-attachments/assets/26f96a2c-659f-4a42-8429-f91fb034a22f"
/>
### Summary

This PR adds dataset snapshot and environment tag support to the JS SDK.
See feature spec here:
braintrustdata/braintrust-spec#14

### Background

This change adds two friendlier ways to reference dataset versions:

- Snapshots, which are stable human-readable names for a specific
dataset version
- Environment tags, which are movable aliases like ppe or production
that can be repointed over time

These are still just ways of referring to a concrete dataset version
(xact_id). The SDK resolves snapshot names and environment tags down to
the underlying xact_id before experiment or eval registration, so we
keep the existing reproducibility guarantees while making version
selection much easier to use.
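
To make the resolution step concrete, here is a rough sketch of how a selector could be reduced to a concrete version before registration. Everything in it is hypothetical: `DatasetSelector`, `DatasetRegistry`, and `resolveDatasetVersion` are illustrative names, the in-memory maps stand in for whatever lookup the backend provides, and the `xact-...` ids are made up.

```typescript
// Hypothetical selector type: exactly one way of naming a dataset version.
type DatasetSelector =
  | { kind: "version"; xactId: string }
  | { kind: "snapshot"; name: string }
  | { kind: "environment"; tag: string };

// Stand-in for server-side lookups (assumption: both map to an xact_id).
interface DatasetRegistry {
  snapshots: Map<string, string>; // snapshot name -> xact_id
  environments: Map<string, string>; // environment tag -> xact_id
}

// Reduce any selector to the concrete xact_id it currently refers to.
function resolveDatasetVersion(
  selector: DatasetSelector,
  registry: DatasetRegistry,
): string {
  switch (selector.kind) {
    case "version":
      // Already concrete, nothing to resolve.
      return selector.xactId;
    case "snapshot": {
      const xactId = registry.snapshots.get(selector.name);
      if (xactId === undefined) {
        throw new Error(`Unknown snapshot: ${selector.name}`);
      }
      return xactId;
    }
    case "environment": {
      const xactId = registry.environments.get(selector.tag);
      if (xactId === undefined) {
        throw new Error(`Unknown environment tag: ${selector.tag}`);
      }
      return xactId;
    }
  }
}
```

Because the experiment would record the resolved `xact_id` rather than the selector, repointing an environment tag later does not change what an already registered experiment ran against.
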

### This PR adds:
- SDK support for initializing datasets by:
  - explicit version (xact_id)
  - snapshot name
  - environment tag
- Resolution of snapshot and environment selectors to a concrete dataset
version internally before eval / experiment registration
- SDK helpers for dataset snapshots, including:
  - create
  - list
  - update via register/upsert for the current dataset version
  - patch snapshot metadata by id
  - delete
- restore and restore/preview to return the dataset head to the state at
a particular version
- Dev server support for forwarding dataset version and environment when
resolving datasets for remote evals
- Tests and example coverage for the new version-selection paths
We should also wipe `util/dist`
Stores _internal_btql filters for experiment datasets in the experiment
metadata.

Right now we don’t persist those filter options, so we lose the ability
to reconstruct the exact subset of rows an experiment ran against. If we
save them with the experiment, we can recreate the same row set later
instead of having to guess.

This unlocks a few useful things:
- Re-running an experiment on the exact same data it originally saw.
- Showing the BTQL filter used by an experiment in the Braintrust UI.
- Anything else that depends on reconstructing the precise rows that
were initially fed into an experiment.
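
As a rough illustration of the idea, the sketch below stores the filter next to the dataset version and reapplies it later to recover the same rows. It is not the SDK's implementation: the `datasetFilter` metadata key, the record shape, and all function names are assumptions, and a real BTQL filter is reduced here to a simple "row has this tag" stand-in so the example stays self-contained.

```typescript
// Hypothetical row and filter shapes, for illustration only.
type Row = { id: string; tags: string[] };
type StoredFilter = { hasTag: string }; // stand-in for a BTQL expression

interface ExperimentRecord {
  name: string;
  datasetVersion: string; // concrete xact_id the experiment ran against
  metadata: { datasetFilter?: StoredFilter }; // assumed metadata key
}

// Apply the stand-in filter; no filter means every row is included.
function applyFilter(rows: Row[], filter?: StoredFilter): Row[] {
  if (filter === undefined) return rows;
  const tag = filter.hasTag;
  return rows.filter((row) => row.tags.includes(tag));
}

// Persist the filter with the experiment instead of discarding it.
function registerExperiment(
  name: string,
  datasetVersion: string,
  filter?: StoredFilter,
): ExperimentRecord {
  return {
    name,
    datasetVersion,
    metadata: filter === undefined ? {} : { datasetFilter: filter },
  };
}

// Later: rebuild the exact row set from the stored version and filter.
function reconstructRows(record: ExperimentRecord, rowsAtVersion: Row[]): Row[] {
  return applyFilter(rowsAtVersion, record.metadata.datasetFilter);
}
```

The version alone pins the dataset contents, and the stored filter pins the subset, so the two together are enough to reproduce the input rows without guessing.
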
The version being a prerelease actually tripped up our release process
because we didn't define a tag. In general, we should probably not have
rc versions in the `package.json` files either.
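
One way a release script could guard against this is to derive a dist-tag from the version string before publishing. This is only a sketch under assumptions: `distTagFor` is a hypothetical helper, not part of this repo's release tooling, and the fallback tag `next` is an arbitrary choice.

```typescript
// Hypothetical helper: pick an npm dist-tag from a semver version string.
function distTagFor(version: string): string {
  // Build metadata ("+...") is not part of the prerelease, so drop it first.
  const withoutBuild = version.split("+")[0] ?? version;
  const dash = withoutBuild.indexOf("-");
  // No "-" means a normal release, which belongs on the default tag.
  if (dash === -1) return "latest";
  // Use the first dot-separated prerelease identifier, e.g. "rc" in "rc.1".
  const identifier = withoutBuild.slice(dash + 1).split(".")[0] ?? "";
  // Purely numeric or empty identifiers make poor tag names; fall back.
  return /^[A-Za-z][A-Za-z-]*$/.test(identifier) ? identifier : "next";
}
```

The result could then be passed to `npm publish --tag <tag>`, so a prerelease never lands on `latest` by accident.
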
Automated regeneration of SDK types.

Co-authored-by: braintrust-bot[bot] <215900051+braintrust-bot[bot]@users.noreply.github.com>
For some reason the model started responding weirdly.
I don't like that Vitest is spamming CI summaries.
Fixes
#1919

This can probably still be improved, but it is a first iteration.

<img width="288" height="295" alt="Screenshot 2026-05-04 at 10 58 19"
src="https://github.com/user-attachments/assets/32321a58-807a-4512-aa91-ae9b519136a4"
/>
@lforst Luca Forstner (lforst) merged commit ecaee2e into release May 4, 2026
46 of 49 checks passed
@lforst Luca Forstner (lforst) deleted the prepare-release/0a2e9115c308 branch May 4, 2026 21:33

5 participants