Collaborator

@CalebCourier CalebCourier commented Feb 20, 2024

Summary

Adds the ability to generate structured data where the outermost layer is a List/Array. This can be helpful when using LLMs to generate synthetic data, mimic REST APIs, etc.

This PR also removes an exception that was raised during validation when the data was not a dictionary. The data now enters schema validation regardless of its type. If the type is wrong, schema validation catches it and the result is a SkeletonReAsk. Addresses #594.
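The behavior change described above can be sketched roughly as follows. This is a stdlib-only illustration, not the Guardrails implementation: `ReAskSketch` and `check_top_level` are hypothetical names, and the real SkeletonReAsk may carry different fields.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class ReAskSketch:
    """Hypothetical stand-in for a SkeletonReAsk; not the Guardrails class."""

    incorrect_value: Any
    reason: str


def check_top_level(data: Any, expected: type) -> Union[Any, ReAskSketch]:
    """Return data unchanged when its outermost type matches, else a reask.

    No exception is raised for non-dict input; a mismatch is reported
    as a value the caller can act on.
    """
    if isinstance(data, expected):
        return data
    return ReAskSketch(
        incorrect_value=data,
        reason=f"expected {expected.__name__}, got {type(data).__name__}",
    )


# A top-level list passes through; a dict where a list is expected yields a reask.
list_result = check_top_level([{"field": "hello, world!"}], list)
dict_result = check_top_level({"field": "hello, world!"}, list)
```

The point of the sketch is only that a type mismatch becomes a returned value rather than a raised exception, so the caller can decide whether to re-prompt.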

Usage

A list structure can be specified as follows:

The Desired List Structure

[
  { "field": "hello, world!" }
]

Using Pydantic

from typing import List
from guardrails import Guard
from pydantic import BaseModel

class Foo(BaseModel):
    field: str

guard = Guard.from_pydantic(output_class=List[Foo])
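To make the target shape concrete, here is a small stdlib-only sketch of parsing a top-level JSON array into typed items. It does not use Guardrails or Pydantic: `FooRecord` is a hypothetical dataclass standing in for the `Foo` model, and `parse_foo_list` is an illustrative helper, not part of this PR. It raises on malformed input purely to keep the example short.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class FooRecord:
    """Stdlib stand-in for the Foo model above (hypothetical name)."""

    field: str


def parse_foo_list(raw: str) -> List[FooRecord]:
    """Parse a JSON array of objects into FooRecord items.

    Raises ValueError if the outermost value is not a list or if an item
    is not an object with a string "field".
    """
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("outermost value must be a JSON array")
    items = []
    for index, entry in enumerate(data):
        if not isinstance(entry, dict) or not isinstance(entry.get("field"), str):
            raise ValueError(f"item {index} must be an object with a string 'field'")
        items.append(FooRecord(field=entry["field"]))
    return items


foos = parse_foo_list('[{"field": "hello, world!"}]')
```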

Using RAIL

<rail version="0.1">
  <output type="list">
    <object>
      <string name="field" description="Any random string value" />
    </object>
  </output>
</rail>
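As a rough illustration of what the `type="list"` attribute conveys, the spec above can be inspected with Python's standard XML parser. This is only a sketch: `read_output_type` is a hypothetical helper, and Guardrails' own RAIL parsing may work differently.

```python
import xml.etree.ElementTree as ET
from typing import Optional

RAIL_SPEC = """\
<rail version="0.1">
  <output type="list">
    <object>
      <string name="field" description="Any random string value" />
    </object>
  </output>
</rail>
"""


def read_output_type(rail_xml: str) -> Optional[str]:
    """Return the type attribute of the <output> element, or None if unset."""
    root = ET.fromstring(rail_xml)
    output = root.find("output")
    if output is None:
        raise ValueError("RAIL spec has no <output> element")
    return output.get("type")


output_type = read_output_type(RAIL_SPEC)

# Collect the names of the string fields declared for each list item.
item_fields = [node.get("name") for node in ET.fromstring(RAIL_SPEC).iter("string")]
```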

TODO

  • Unit Tests
    • Rail.output_type
    • JsonSchema.from_xml
    • JsonSchema.from_pydantic
    • validator_base.check_refrain
    • validator_base.filter_in_schema
  • Integration Tests
    • from_pydantic with List[BaseModel]
    • from_rail with output.type = "list"
  • Notebooks?
  • Documentation?

@CalebCourier
Collaborator Author

Needs cross-version test fixes. I can pick this up midweek.

@ShreyaR ShreyaR marked this pull request as ready for review March 5, 2024 08:27
@ShreyaR ShreyaR requested a review from zsimjee March 5, 2024 08:28
@zsimjee zsimjee merged commit 5ce1d29 into main Mar 5, 2024
@CalebCourier CalebCourier deleted the list-support branch May 17, 2024 16:50