Skip to content

Conversation

@apascal07
Copy link
Collaborator

@apascal07 apascal07 commented May 26, 2024

Summary of changes:

  • Flattened input schemas so the dev UI can auto-fill values for required fields.
  • Made Part fields exported because inferJSONSchema wasn't generating schema for them correctly which failed validation.
  • Added input validation pre-JSON unmarshaling to runJSON for both flows and actions. This is necessary because the unmarshaling throws away unknown/invalid fields and we want to enforce clean correct inputs.
  • Added input/output validation to the core Run method for actions and Execute method for flows.
  • Added output conformance to Generate methods for model actions.
  • Removed special case for empty struct (void) input because it failed input validation in conflicting ways between action input and flow input. No special case works fine.

To dos:

  • Config.OutputSchema needs to be *jsonschema.Schema instead of map[string]any. Tried replacing it but it seemed to cause a panic, but Config.InputSchema is already that type so there must be more to it...
  • Doesn't support null as an input right now; seems to convert it to an object at some point.
  • Needs a friendlier API for providing output schema when starting from a struct.
  • Input validation errors at the runJSON stage display terribly (unreadably, need to look in terminal) in dev UI right now since they run outside of the tracing and dev UI only reads from tracing right now.

@apascal07 apascal07 marked this pull request as ready for review May 26, 2024 05:20
@apascal07 apascal07 requested review from jba and pavelgj May 26, 2024 05:20
Copy link
Contributor

@jba jba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Alex, great stuff. Especially the extensive tests.

return text, nil
})

r := &jsonschema.Reflector{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should design an API to do this for people. Like maybe another field in dotprompt.Config that you can set to a value to reflect, instead of setting InputSchema. WDYT?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good idea. This PR is getting big so I added it to the 3 other TODOs to do in a follow up.

@apascal07 apascal07 marked this pull request as draft May 27, 2024 14:24
@apascal07 apascal07 changed the title [Go] Added model output conformance. [Go] [Draft] Added model output conformance. May 27, 2024
@apascal07 apascal07 changed the title [Go] [Draft] Added model output conformance. [Go] [Draft] Added input/output validation for flows and actions and model output conformance. May 27, 2024
@apascal07 apascal07 marked this pull request as ready for review May 28, 2024 21:40
@apascal07 apascal07 changed the title [Go] [Draft] Added input/output validation for flows and actions and model output conformance. [Go] Added input/output validation for flows and actions and model output conformance. May 28, 2024
Copy link
Contributor

@jba jba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, thanks for all this. Sorry it's such a beast.
And also sorry for our API updates, which will give you some merge headaches.
I would try to get this in quickly even if incomplete, to avoid future merge pain.

text string // valid for kind∈{text,blob}
toolRequest *ToolRequest // valid for kind==partToolRequest
toolResponse *ToolResponse // valid for kind==partToolResponse
Kind partKind `json:"kind,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's generally a bad idea to have an exported field of unexported type. It means users can't easily work with the field. For example, there is no way for someone to write a function that returns a value assignable to Part.Kind.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, missed this. Unfortunately, had to undo all of this nice visibility control here because there's no way to generate the schema correctly any other way from what I can tell.

},
}
candidate, err := validCandidate(candidate, outputSchema)
assert.NoError(t, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Case in point: what does this line actually do?
The doc for NoError says "NoError asserts that a function returned no error," but what does "assert" actually mean? I had to go and read the source code for the package.

It turns out that this line ultimately calls t.Errorf. That is not the best choice here, in my opinion. It is pointless to continue testing if the JSON isn't valid, so t.Fatalf should be called.

In this case, it doesn't matter much. But it would be a more serious bug in a case like this:

somePtrValue, err := f()
assert.NoError(t, err)
assert.Equal(somePtrValue.Field, want)

Here, somePtrValue is probably nil if err is non-nil, so the third line panics.

If you wrote out the tests and t.Error/Fatal calls explicitly, it would be more work for you, the writer, but less work for every reader to check whether your code was correct.

return fmt.Errorf("data is not valid JSON: %w", err)
}

schemaLoader := gojsonschema.NewBytesLoader(schemaBytes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so you have to convert a jsonschema.Schema into a gojsonschema.Schema? That's unfortunate. Maybe @ianlancetaylor can help with that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at it but this is likely the best we can do for now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... this is not ideal. The package you were using is the better (and much newer) of the two but it doesn't provide any validation.

@apascal07
Copy link
Collaborator Author

Again, thanks for all this. Sorry it's such a beast. And also sorry for our API updates, which will give you some merge headaches. I would try to get this in quickly even if incomplete, to avoid future merge pain.

No worries! Thanks for the thorough review, this guidance was very helpful.

@apascal07 apascal07 requested a review from jba May 29, 2024 01:46
@apascal07 apascal07 enabled auto-merge (squash) May 29, 2024 16:36
@apascal07 apascal07 disabled auto-merge May 29, 2024 16:36
@apascal07 apascal07 enabled auto-merge (squash) May 29, 2024 16:36
@apascal07 apascal07 merged commit 78a55e2 into main May 29, 2024
@apascal07 apascal07 deleted the ap-go-schema branch May 29, 2024 16:37
ianlancetaylor pushed a commit that referenced this pull request May 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants