fix(datasets): add schema arguments to create_dataset #1457
Important
Add `input_schema` and `expected_output_schema` parameters to `create_dataset()` in `client.py` for JSON Schema validation.
- `input_schema` and `expected_output_schema` are new optional parameters of `create_dataset()` in `client.py`, used for JSON Schema validation of dataset items.
- `create_dataset()` now includes these schemas in `CreateDatasetRequest`.
- `test_create_dataset_item()` and `test_get_dataset_runs()` in `test_datasets.py` now use dictionary inputs instead of JSON strings.
This description was created automatically for 030b82b. It updates automatically as commits are pushed.
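The difference between a dictionary input and a JSON string input can be illustrated with a small sketch. It uses a hand-rolled check that covers only `type: object` and `required`; it is not a full JSON Schema validator and not the validation Langfuse performs. The helper name `matches_object_schema` and the sample schema are hypothetical, chosen only for illustration.

```python
import json
from typing import Any, Dict


def matches_object_schema(value: Any, schema: Dict[str, Any]) -> bool:
    """Hypothetical helper: checks only that `value` is a dict
    containing every key listed under the schema's `required`."""
    if not isinstance(value, dict):
        return False
    return all(key in value for key in schema.get("required", []))


# Illustrative schema, not taken from the PR.
schema = {"type": "object", "required": ["text"]}

as_dict = {"text": "hello world"}
as_json_string = json.dumps(as_dict)  # a str that merely contains JSON

dict_ok = matches_object_schema(as_dict, schema)
string_ok = matches_object_schema(as_json_string, schema)
```

Under this simplified check the dictionary passes and the serialized string does not, because the string is a `str` value and not an object, even though `json.loads` would turn it back into the same dictionary.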
Disclaimer: Experimental PR review
Greptile Overview
Greptile Summary
This PR extends the `create_dataset()` method with two new optional parameters: `input_schema` and `expected_output_schema`. These parameters allow users to specify JSON Schemas that will be used to validate dataset items when they are created.
- `input_schema` parameter to validate dataset item inputs
- `expected_output_schema` parameter to validate dataset item expected outputs
The implementation follows existing codebase patterns, using camelCase aliases when constructing the `CreateDatasetRequest` Pydantic model, consistent with how other aliased fields (like `sourceTraceId`, `sourceObservationId`) are handled elsewhere in the codebase.
Confidence Score: 5/5
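The snake_case to camelCase mapping described above can be sketched with plain dictionaries. This is a stand-in under stated assumptions, not the SDK's code: `build_create_dataset_body` is a hypothetical helper, and the key names `inputSchema` and `expectedOutputSchema` are assumed from the camelCase convention the summary mentions. The actual change constructs a `CreateDatasetRequest` Pydantic model instead.

```python
from typing import Any, Dict, Optional


def build_create_dataset_body(
    name: str,
    *,
    input_schema: Optional[Dict[str, Any]] = None,
    expected_output_schema: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: map snake_case arguments to camelCase keys.

    Schemas that are not provided are left out of the body entirely.
    """
    body: Dict[str, Any] = {"name": name}
    if input_schema is not None:
        body["inputSchema"] = input_schema  # assumed alias name
    if expected_output_schema is not None:
        body["expectedOutputSchema"] = expected_output_schema  # assumed alias name
    return body


# Illustrative schemas, not taken from the PR.
question_schema = {
    "type": "object",
    "properties": {"question": {"type": "string"}},
    "required": ["question"],
}
answer_schema = {"type": "string"}

with_schemas = build_create_dataset_body(
    "qa-dataset",
    input_schema=question_schema,
    expected_output_schema=answer_schema,
)
without_schemas = build_create_dataset_body("plain-dataset")
```

Because both parameters are optional, a call that passes only `name` produces the same body as before the change in this sketch, so existing callers would be unaffected.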
Important Files Changed
File Analysis
`input_schema` and `expected_output_schema` optional parameters added to the `create_dataset()` method, following existing codebase patterns for schema validation on dataset items.
Sequence Diagram
sequenceDiagram
    participant User
    participant Langfuse Client
    participant CreateDatasetRequest
    participant Langfuse API
    User->>Langfuse Client: create_dataset(name, input_schema, expected_output_schema)
    Langfuse Client->>CreateDatasetRequest: Create request body with schemas
    Langfuse Client->>Langfuse API: POST /datasets
    Langfuse API-->>Langfuse Client: Dataset (with validation schemas)
    Langfuse Client-->>User: Dataset object