New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
jsonschema: Switch to map based encoding #76
Conversation
@rs, @yanfali. This provides the following public interface for custom validators to implement: type Builder interface {
// JSONSchema should return a map containing JSON Schema Draft 4 properties that can be set based on
// FieldValidator data. Application specific properties can be added as well, but should not conflict with any
// legal JSON Schema keys.
JSONSchema() (map[string]interface{}, error)
} This is, based on "Solution 2" from #35, even you preferred solution 1 there @rs. The internal logic is also switched to rely on maps. As far as I can see, this solution has the following benefits over solution 1:
It has the following disadvantages compared to solution 1:
What do you think of this approach? UPDATE: Renamed Formatter interface -> Builder based on review comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I promised you a more complex encoding for benchmarking, I'll try and get to that this evening. It's definitely much cleaner, not sure how much slower it will be so it would be good to get some numbers.
// | | | "type": "number", | ||
// | | | "minimum": 0E+00 | ||
// | | | "minimum": 0, | ||
// | | | "type": "number" | ||
// | | }, | ||
// | | "foobar": {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a fan of this. Shouldn't it at least output "type": "object", I understand the spec may allow this but it feels inconsistent with rest of document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is still doing that (at the end). It's just that the keys will now be ordered alphabetically, so I had to reorder the keys/values in this example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the example. See comment below
const ( | ||
// schemaSizeHint is the maximum number of keys that can be added to a schema map based on schema.Schema. | ||
schemaSizeHint = 7 | ||
// fieldSizeHint is the maximum number of keys that can be added to a schema map based on schema.Field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this comment matches the way you use it in the JSONSchema methods. It's actually the minimum and then you add a local value. I think it would be clearer if you updated this comment and used constants within each encoder file so it was clearer what it was you were trying to optimize here.
I'm also not sure how much value we get out of the specifying this hint, as many of the fields we may add are optional. The default allocator size is probably sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a hard comment to write, which is never a good sign I guess.
It's not the minimum, but the minimum maximum in a way. The minimum number of fields is 0. This number is the maximum number of properties that can be added directly from the Field
struct, excluding the filed.Validator attribute, which is what I am trying to state...
I will see if there is a better way.
I'm also not sure how much value we get out of the specifying this hint, as many of the fields we may add are optional. The default allocator size is probably sufficient.
I will try without a size hints, and then rerun the benchmarks:-)
// | | | "type": "number", | ||
// | | | "minimum": 0E+00 | ||
// | | | "minimum": 0, | ||
// | | | "type": "number" | ||
// | | }, | ||
// | | "foobar": {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like this much. I know it's permitted by the spec but it feels like we should be stricter when we output. At the very least "type": "object".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, it is still outputing the same, just ordered differently, and a different formating for the numbers (e.g. 0 instead of 0E+00).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry comment is about foobar example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, foobar. That is what we should encode if the field.Validator is set to nil. We could change the example, but I think it's the correct think to write in this situation. If it's set to nil, we can't enforce it to be an object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the example to include a new description @yanfali. Hopefully it's at least a bit more clear from the example what's going on now...
if err != nil { | ||
return err | ||
} | ||
fm, err := formatter.JSONSchema() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/fm/fieldMap/
The terseness hurts the readability.
if err != nil { | ||
return err | ||
} | ||
formatField(fm, field) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this feels a bit backward, adding the meta fields after having created the map but making JSONSchema responsible for allocating the memory is a direct consequence of this. I would suggest renaming formatField. Maybe encodeFields and encodeField? Also I suggest reversing the order of the the parameters; it's often a nicer API that passes in read-only parameters in first and mutable ones last.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we come up with a new name, we should probably change that consistently, e.g. also for formatSchema.
I am not sure encodeFields is better though, as it's no longer encoding data. I will try to think of something...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite right, ignore me.
// encodeValidator writes JSON Schema keys and values based on v, without the outer curly braces, to w. Note | ||
// that not all FieldValidator types are supported at the moment. | ||
func encodeValidator(w io.Writer, v schema.FieldValidator) (err error) { | ||
func validatorFormatter(v schema.FieldValidator) (f Formatter, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
early return would probably be slightly clearer. I'm wary of relying on initialized and named returned values for this use case as it may trip up people later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will change it do use early returns.
if v == nil { | ||
return nil | ||
f = formatterFunc(nilFormatter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return formatterFunc(nilFormatter), nil
} | ||
switch t := v.(type) { | ||
case Formatter: | ||
f = t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can leave out this use case if you just return early above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. The case above is returning a formatter when filed.Validator == nil.
This case is implementing #35 by allowing custom Validators to implement the Formatter interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also added a test for this case now btw.
case Formatter: | ||
f = t | ||
case *schema.Bool: | ||
f = (*boolFormatter)(t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return (*boolFormatter)(t), nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I will change all of them:-)
default: | ||
return ErrNotImplemented | ||
err = ErrNotImplemented |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return nil, ErrNotImplemented
Maybe builder then. Formatter suggests final output
…On Sun, Jan 1, 2017, 12:38 Sindre Røkenes Myren ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In schema/encoding/jsonschema/schema.go
<#76>:
>
-func (fw *fieldWriter) resetPropertiesCount() {
- fw.propertiesCount = 0
-}
+ for key, field := range fields {
+ if field.Required {
+ required = append(required, key)
+ }
+ formatter, err := validatorFormatter(field.Validator)
I call it formatter in line with the ResponseForamtter interface
<http://rest-layer.io/#custom-response-formatter-sender> from rest-layer.
It's not an encoder anymore, as it doesn't encode the data, only format the
map which will be encoded later.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#76>, or mute the thread
<https://github.com/notifications/unsubscribe-auth/ACK5U7SVMTPyTReFLkNlx_xwkvioSDt1ks5rOA7CgaJpZM4LYt8C>
.
|
I find setting null in json tends to be worthless. Much better to not set
the field at all and use undefined in JS or OK pattern in go.
I still think for nil it's much better to set the type for consistency at
the receiver when emitting as it's explicit. Implicit behavior tends to get
people doing different things because of interpretation. I'm arguing set
the type to object for the nil case.
…On Sun, Jan 1, 2017, 12:50 Sindre Røkenes Myren ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In schema/encoding/jsonschema/example_test.go
<#76>:
> // | | },
// | | "foo": {
- // | | | "type": "number",
- // | | | "minimum": 0E+00
+ // | | | "minimum": 0,
+ // | | | "type": "number"
// | | },
// | | "foobar": {}
ah, foobar. That is what we should encode if the field.Validator is set to
nil. We could change the example, but I think it's the correct think to
write in this situation. If it's set to nil, we can't enforce it to be an
object.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#76>, or mute the thread
<https://github.com/notifications/unsubscribe-auth/ACK5U9QgZIL3c94Ky0kD9YRgOx4GblWWks5rOBGogaJpZM4LYt8C>
.
|
True, but that is besides the point...
Maybe you misunderstand? I meen if The behaviour as it stands is identical to what it was before this change. The only viable alternative would be to raise an error if we don't want to support Validator set to nil in the |
On another topic, the last change addresses most of the review comments. I will do one more commit soon where I remove the size hints. and rerun the benchmarks. I will leave it as one commit for now and rebase later, in case you want to add/run one more benchmark @yanfali. |
5d54a2a
to
2ec21a2
Compare
Removing the size hint appears to be a good call @yanfali. Compared to having the size hints in, we get the same number of allocations and a lower number of bytes allocated without size hints for the Student schema. As for the reported speedup, that's within the margin of error, but at least it leaves the code a bit simpler:
|
I will check what the schema package does today. If it does indeed raise an error, we should do the same in the jsonschema package for now at least. UPDATE: Yep, nil Validators appears valid (just like nil Handlers or any other schema.Field value that you may or may not set). Test code: https://gist.github.com/smyrman/ea15603ec2c4adcf727b4b651184692a |
So if a field is defined as nil in schema we are to output a JSONSchema was empty object which is a stand-in for any type? I can sort of understand that interpretation but it doesn't help the consumer of that schema understand that this field member is used for which is effectively ignore this field? This is the one major weakness of JSON as a specification, lack of comments. If we're going to do that, I would suggest we explicitly put a "description" that describes this behavior for this field. As a consumer of such an API I would be grateful for explicit documentation explaining this rather than leaving me to guess what it means. This wouldn't violate the JSONSchema spec and would be a reasonable compromise. I use JSONSchema as a way to communicate intent to other humans, not just machines. I also feel this better conveys the intent of leaving the Validator nil. |
func (fw *fieldWriter) resetPropertiesCount() { | ||
fw.propertiesCount = 0 | ||
} | ||
for key, field := range fields { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/key/fieldName/
} | ||
func addFields(m map[string]interface{}, fields schema.Fields) error { | ||
props := make(map[string]interface{}, len(fields)) | ||
required := make([]string, 0, len(fields)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
required := []string{}
should be fine.
ew.writeString(`"type": "string", "format": "date-time"`) | ||
return ew.err | ||
func (v timeBuilder) JSONSchema() (map[string]interface{}, error) { | ||
m := make(map[string]interface{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avoid the allocation, package private var and then return it as is.
var timeJSONSchema map[string]interface{}
func init() {
timeJSONSchema = make(map[string]interface{})
timeJSONSchema["type"] = "string"
timeJSONSchema["format"] = "data-time"
)
Then simply return timeJSONSchema, nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That won't quite work. If new fields are added later, they would be added to the global var. E.g. if one timeJSONSchema field is set to be required, all later instances would as well.
I can do what @rs suggests below instead, and do initialization on the allocation line.
ew.writeString(`"type": "array"`) | ||
func (v arrayBuilder) JSONSchema() (map[string]interface{}, error) { | ||
m := make(map[string]interface{}) | ||
m["type"] = "array" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer:
m := map[string]interface{}{
"type": "array",
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, just lazy when I removed the sizeHint for the allocation. I will get it done.
_, ew.err = ew.w.Write([]byte(s)) | ||
} | ||
// builderFunc is an adapter that allows pure functions to implement the Builder interface. | ||
type builderFunc func() (map[string]interface{}, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we expose this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently I don't see any way end-users could implement a jsonschema.Builder
other than letting a FieldValidator
implementation also implement this interface, so I don't see any way in which BuilderFunc could be used by end-users....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PS! I am happy to expose it later, if we can find a real end-user use-case for it.
) | ||
|
||
func (v objectBuilder) JSONSchema() (map[string]interface{}, error) { | ||
m := make(map[string]interface{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, prefer map[string]interface{}{}
. Only use make when you want to pre-allocate.
abc4aff
to
b52e5af
Compare
Not necessarily an empty object, just an object without the UPDATE: To clarify: "if a field is defined as nil" -- I assumed you meant "if a field's Validator is defined as nil". The "filed" itself can't be set to nil, as
I understand it's confusing, but if we set a default description for some fields and not for others, that becomes an inconsistent and unexpected behaviour. Instead the one specifying the schema should add a fitting description to all of his fields. Hopefully it should be possible to understand that if the "type" property is not set in JSON, it means that any type can be passed in. UPDATE: I have changed the example test to include such descriptions btw, which should make it more obvious what's going on. UPDATE: -- Also, if we where to set a default description, which I don't think we should, that should happen in the |
b52e5af
to
dc7fbdf
Compare
OK, rebased to one commit as I don't see it's going to be beneficial to do any more benchmarking specifically for the sizeHint removal... |
dc7fbdf
to
6598b35
Compare
Updated the docs as well. |
I like this solution better finaly. GG. I would rename the method of the |
6598b35
to
ffd2077
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking really good now. Thanks!
ffd2077
to
4f57745
Compare
Let's merge #78 first, and I will run the benchmarks. |
4f57745
to
68d6b45
Compare
Benchmark results are here! To summarize, we see that:
Details:
Command used to produce benchmarks: $ go test -run=NONE -bench=. -benchmem -cpu 1 Comparison done using benchmp. |
68d6b45
to
caba50c
Compare
} | ||
``` | ||
|
||
To easier extend a `FieldValidator` from the `schema` package, you can call `ValidatorBuilder` inside `BuildJSONSchem()`: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have now exposed ValidatorBuilder to be a public function. Let me know if we should do this differently.
caba50c
to
0af3f21
Compare
OK, ready. |
Those numbers are pretty good for the realistic use case. I have no objection to merging. |
Provides a public interface for rs#35, and switches the internal encoding logic to rely on maps, as propsed in rs#51. This makes the implementation simpler and more straight forward. Compared to the poposed solution 1 in rs#35 that would rely on passing an io.Writer to an encoder method, this solution will more easily handle a corner-case where a custom validator writes zero bytes to the io.Writer. The reduction in complexity does not come for free though. This looks to be 5x slower for very tiny schemas, and aproximately 2x slower for more complex schemas. As jsonschema is used as API documentation, and can be easily cached, this performance loss should still be quite acceptable.
0af3f21
to
2000cdc
Compare
rebased again due to new commits to master (no changes). @rs, are you OK with merging this? Or do you want to consider an alternate implementation? |
Provides a public interface for #35, and switches the internal encoding
logic to rely on maps, as propsed in #51. This makes the implementation
simpler and more straight forward.
Compared to the poposed solution 1 in #35 that would rely on passing an
io.Writer to an encoder method, this solution will more easily handle a
corner-case where a custom validator writes zero bytes to the io.Writer.
The reduction in complexity does not come for free though. This looks to
be 4x slower for very tiny schemas, and aproximately 2x slower for small
schemas. As jsonschema is used as API documentation, and can be easily
cached, this performance loss should still be quite acceptable.