feat: dataset versioning #1837
Conversation
const newNormalized = normalizeClass(newClass);

// Check if normalized versions are similar (one contains significant portion of the other)
const similarityThreshold = Math.min(500, oldNormalized.length * 0.5);
This heuristic fails when a change adds substantially to the method bodies of a class, so I added validation through the TS parser as a fallback for that case. The heuristic should be faster, so I'm keeping it for the normal case.
Luca Forstner (@lforst) we really need to split this file up 😭
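The fast-path/fallback arrangement described in this thread could look roughly like the sketch below. This is only an illustration under assumptions: the body of `normalizeClass`, the prefix-containment test, and the `classesMatch` and `structuralFallback` names are hypothetical, and the parser-backed check is passed in as a callback rather than implemented. Only `normalizeClass` (by name) and the `similarityThreshold` expression come from the diff.

```typescript
// Hypothetical normalizer: strips all whitespace so formatting changes are ignored.
function normalizeClass(source: string): string {
  return source.replace(/\s+/g, "");
}

// Fast path: treat the classes as matching if a leading slice of one
// normalized source appears inside the other. Slow path: defer to a
// caller-supplied structural check (for example, one backed by a parser).
function classesMatch(
  oldClass: string,
  newClass: string,
  structuralFallback: (oldSrc: string, newSrc: string) => boolean,
): boolean {
  const oldNormalized = normalizeClass(oldClass);
  const newNormalized = normalizeClass(newClass);
  const similarityThreshold = Math.floor(
    Math.min(500, oldNormalized.length * 0.5),
  );
  const oldPrefix = oldNormalized.slice(0, similarityThreshold);
  const newPrefix = newNormalized.slice(0, similarityThreshold);
  if (
    similarityThreshold > 0 &&
    (newNormalized.includes(oldPrefix) || oldNormalized.includes(newPrefix))
  ) {
    return true;
  }
  // The cheap check was inconclusive, so pay for the slower validation.
  return structuralFallback(oldClass, newClass);
}
```

Keeping the string check first means the slower validation only runs when the cheap comparison cannot confirm a match.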
});
args["dataset_id"] = datasetSelection.datasetId;
if (datasetSelection.datasetVersion !== undefined) {
  args["dataset_version"] = datasetSelection.datasetVersion;
This changes how this worked before: we used to have an `} else {` branch that did `args["dataset_version"] = await (dataset as AnyDataset).version();`. Is that intentional? I guess we do save on having to make the `dataset.version()` call.
No, that should get fixed so this still works for subclasses and for cases where the version is pinned manually. I updated `serializeDatasetForExperiment()` so it always falls back to `.version()` if we don't resolve the version through one of the other selections.
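The fallback order agreed on in this thread can be sketched as follows. This is a minimal illustration, not the SDK's actual `serializeDatasetForExperiment()`: the `DatasetSelection` and `VersionedDataset` shapes and the `resolveDatasetArgs` helper are hypothetical names assumed here. Only `datasetId`, `datasetVersion`, and `.version()` appear in the diff and comments.

```typescript
// Hypothetical shapes for illustration; the real SDK types may differ.
interface DatasetSelection {
  datasetId: string;
  datasetVersion?: string;
}

interface VersionedDataset {
  version(): Promise<string | undefined>;
}

// Build the dataset args sent at experiment registration.
// An explicitly resolved version wins; otherwise fall back to
// dataset.version() so subclasses and manually pinned versions keep working.
async function resolveDatasetArgs(
  selection: DatasetSelection,
  dataset: VersionedDataset,
): Promise<Record<string, string>> {
  const args: Record<string, string> = { dataset_id: selection.datasetId };
  if (selection.datasetVersion !== undefined) {
    args["dataset_version"] = selection.datasetVersion;
  } else {
    const version = await dataset.version();
    if (version !== undefined) {
      args["dataset_version"] = version;
    }
  }
  return args;
}
```

The extra `.version()` call is only paid when no version was resolved through another selection, which keeps the saving the reviewer mentions for the common case.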
const snapshots = await getDatasetSnapshots({ state, datasetId });
const match = snapshots.find((snapshot) => snapshot.name === snapshotName);
can we add a backend endpoint to do this instead? Feels like a lot to scan through all the datasets client-side.
Yes, this should already be supported. I pulled apart `listSnapshots()` and `getSnapshot()` so the lookup now uses the single-snapshot path properly.
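To illustrate the difference the reviewer is pointing at, here is a minimal in-memory sketch. `SnapshotStore`, the `Snapshot` fields, and the method bodies are hypothetical stand-ins for a backend, not the SDK's implementation. Only the names `listSnapshots()` and `getSnapshot()` come from the thread.

```typescript
// Hypothetical snapshot record; the real DatasetSnapshot type may carry more fields.
interface Snapshot {
  name: string;
  xactId: string;
}

// In-memory stand-in for the backend, keyed by dataset id, then snapshot name.
class SnapshotStore {
  private byDataset = new Map<string, Map<string, Snapshot>>();

  add(datasetId: string, snapshot: Snapshot): void {
    const snapshots =
      this.byDataset.get(datasetId) ?? new Map<string, Snapshot>();
    snapshots.set(snapshot.name, snapshot);
    this.byDataset.set(datasetId, snapshots);
  }

  // Returns every snapshot for a dataset; callers must filter themselves.
  listSnapshots(datasetId: string): Snapshot[] {
    const snapshots = this.byDataset.get(datasetId);
    return snapshots ? Array.from(snapshots.values()) : [];
  }

  // Looks up a single snapshot by name without returning the full list.
  getSnapshot(datasetId: string, name: string): Snapshot | undefined {
    return this.byDataset.get(datasetId)?.get(name);
  }
}

const store = new SnapshotStore();
store.add("ds-1", { name: "baseline", xactId: "1000" });
store.add("ds-1", { name: "release", xactId: "2000" });

// Before: fetch everything, then scan client-side.
const scanned = store
  .listSnapshots("ds-1")
  .find((snapshot) => snapshot.name === "release");

// After: ask for exactly one snapshot by name.
const direct = store.getSnapshot("ds-1", "release");
```

Both paths find the same record; the second avoids transferring and scanning the whole list when only one name is needed.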
dataset_id: string;
dataset_version?: string;
dataset_environment?: string;
dataset_snapshot_name?: string;
Do we need to pass `dataset_snapshot_name` into the remote evals created in braintrust-sdk-javascript/js/dev/server.ts (line 312 in 343634f)? (I lack a lot of context w/ remote evals so lmk if I'm off the mark!)
Discussed a bit offline - the remote eval path needs more API changes before it can be added in the SDK.
  xactId: string;
};

type DatasetSnapshotLookup =
Can we export this? It's used by `public async getSnapshot`.
---

- (feat) Add dataset snapshot/environment selection support to `init()` and `initDataset()`, including snapshot CRUD helpers and `DatasetSnapshot` type exports.
- (feat) Update `braintrust/dev` to respect `dataset_version` and `dataset_environment` when resolving datasets for evals.
feel free to also add a little extra detail here, like an example code snippet!
Summary
This PR adds dataset snapshot and environment tag support to the JS SDK. See feature spec here: braintrustdata/braintrust-spec#14
Background
This change adds two friendlier ways to reference dataset versions: snapshot names and environment tags.
These are still just ways of referring to a concrete dataset version (xact_id). The SDK resolves snapshot names and environment tags down to the underlying xact_id before experiment or eval registration, so we keep the existing reproducibility guarantees while making version selection much easier to use.
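The resolution step described above can be sketched as follows. This is an illustration under assumptions rather than the SDK's code: the `VersionSelector` union, the `VersionIndex` lookup maps, and `resolveXactId` are hypothetical names, and a real implementation would query the backend instead of in-memory maps.

```typescript
// Hypothetical selector: a caller pins a dataset by exactly one of these.
type VersionSelector =
  | { kind: "version"; xactId: string }
  | { kind: "snapshot"; name: string }
  | { kind: "environment"; tag: string };

// Hypothetical lookup tables standing in for backend queries.
interface VersionIndex {
  snapshots: Map<string, string>; // snapshot name -> xact_id
  environments: Map<string, string>; // environment tag -> xact_id
}

// Resolve any selector to a concrete xact_id before registration.
function resolveXactId(selector: VersionSelector, index: VersionIndex): string {
  switch (selector.kind) {
    case "version":
      // Already concrete; nothing to resolve.
      return selector.xactId;
    case "snapshot": {
      const xactId = index.snapshots.get(selector.name);
      if (xactId === undefined) {
        throw new Error(`Unknown snapshot: ${selector.name}`);
      }
      return xactId;
    }
    case "environment": {
      const xactId = index.environments.get(selector.tag);
      if (xactId === undefined) {
        throw new Error(`Unknown environment: ${selector.tag}`);
      }
      return xactId;
    }
  }
}

const index: VersionIndex = {
  snapshots: new Map([["baseline", "1000"]]),
  environments: new Map([["production", "2000"]]),
};

const fromSnapshot = resolveXactId({ kind: "snapshot", name: "baseline" }, index);
const fromEnvironment = resolveXactId({ kind: "environment", tag: "production" }, index);
```

Because registration only ever sees the resolved `xact_id`, the friendlier selectors do not change what an experiment is recorded against.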
This PR adds: