json schema draft 04 support #17

Open
sjaeckel opened this Issue Jan 31, 2014 · 23 comments

7 participants
@sjaeckel
Contributor

sjaeckel commented Jan 31, 2014

Hi

I'd love to use this library, and I'd like to know whether there are concrete plans for when draft 04 will be supported.
Otherwise I'll go on based on draft 03.

Thanks in advance.
Steffen

@penduin

Member

penduin commented Jan 31, 2014

I do plan to add draft 4 support, and I really would like to say "within a
month" or something, but I'm just not sure I'll have the time. When I do,
draft 3 will continue to be supported, either as a build option or maybe a
setting on the document or some such.

I'll let you know once I get cracking; I'd sure love to have somebody
field-testing draft 4 schema once it's implemented.

@penduin penduin self-assigned this Feb 3, 2014

@Relequestual

Relequestual commented Jun 22, 2015

Have you made any progress on this? I'm looking at using it as a way to bring draft 4 support to Perl. The author of the existing Perl module will not be adding draft 4 support.

@penduin

Member

penduin commented Jun 22, 2015

Draft 4 support is still in-progress, but usable. There is a lot of overlap with v3, which all works, and several v4-specific features are supported too. Supporting new json-schema hasn't had a lot of priority, but I've put in a few things and taken several patches, which I'm always thrilled to do. ;^)

In particular, the hyper-schema and referenced-document stuff in v4 has not yet been touched in WJElement; I'd say that's its biggest (but not only) draft-4 blind spot.

It would be really fun to see this work its way into Perl. Would you like to tackle any of the remaining v4 implementation? I remain confident I'll get around to doing it myself, but I can't make any timeline promises. I will say this though: we'll be presenting some Netmail projects including WJElement at a conference in about a month, so my motivation to try and sneak in some extra work on it is in a good place right now. :^)

I'll try to do a better job of keeping this thread updated as more draft-4 schema features are supported.

later,
-Owen

@Relequestual

Relequestual commented Jun 22, 2015

Hey Owen... Slight problem in that I don't know any C.

I've had a bit of a chat with a colleague and I think we're most likely going to tackle updating the Perl module to complete draft 3 and then add draft 4 support, as we find time.

A few others and I have been vocal recently about json-schema and how we can get things going again to push for draft-5. It seems there are some leadership issues, so we're seeing if we can reinstate some previous devs on the project and get it moving again. We're also dividing up the work into sections. There may be a final RFC yet! =]

@penduin

Member

penduin commented Jun 22, 2015

No problem at all. We'll all keep doing what we can do as we have
time, eh? :^)

I confess I have more or less dropped off the scene after initially
being rather active in json-schema discussions. The fact is, draft 3
was more than sufficient for my purposes (and Netmail's) and many of
the discussions, at some point anyway, were so academic or
implementation-specific that I basically stopped paying attention.

An RFC would be very exciting, and would of course be great motivation
for this and plenty of other projects to go all-in and write up
complete json-schema support. I don't have a ton of time to
contribute, and fear I can't do much about leadership problems, but is
there anything I and/or the WJElement project might do to help?

@Relequestual

Relequestual commented Jun 23, 2015

Indeed!
I dunno, there seem to be a number of people more than willing to help out. How much help they can give, I do not know. I'm one such person. Are you subscribed to the Google group? I think that's where most of the important stuff will happen. That and the GitHub project issues.

@petehug

Contributor

petehug commented Sep 2, 2015

Draft-04 is fully supported by https://github.com/petehug/wjelement-cpp, which is based on WJElement.

@penduin

Member

penduin commented Sep 2, 2015

Nice work! I notice your schema stuff is a complete re-implementation rather than a patch we could pull back into WJE. WJElement's draft 4 support has continued to improve, though your implementation is more complete. Having that out there as a reference will be helpful when I do find the time to fill in the cracks of our own schema 4 support.

Micah and I are not OO people, but we're glad to see you've opened up WJElement to a world of C++ developers. Again, nice work!

@petehug

Contributor

petehug commented Sep 27, 2015

Thanks. Yes, I decided to totally re-implement Schema-04 support. I wanted bindings between verified elements and the relevant schema element(s). This makes it possible to build form generators.

It's been a while since I branched WJElement, and I primarily did so because it didn't differentiate between rational and integral numbers. Another area of concern is that the cost of adding array and object members increases proportionally with the number of elements already added. Have improvements been made in these areas over the past 15 months?

@penduin

Member

penduin commented Sep 30, 2015

WJElement hasn't changed in the areas you branched for. We've had some ideas kicking around for optimized array access but nothing's in yet.

We haven't looked into rational numbers - if anything it feels like we have too many ways to deal with number data already. :^) I'm curious to know, what is your use case for which WJE's numbers don't cut it?

@petehug

Contributor

petehug commented Oct 1, 2015

IMHO, the fact that WJElement sequentially accesses array and object collections is a real concern. I'm using my library mainly to implement web services and at times, collections can become monstrous. The sequential access to collections makes lookups so expensive I'm considering adding direct access to my WJElement branch.

WRT handling integers, consider these JSON examples:

  1. {"value":1.1}
  2. {"value":1.0}
  3. {"value":1.}
  4. {"value":1}

If I parse these using WJElement, the type of each "value" is WJR_TYPE_NUMBER. In JSON Schema Draft 04 (JSD4), this is the correct type for 1, 2 and 3 but not 4.

I'm not sure how you implemented JSD4, but I simply let WJElement parse the document before my schema validators spring into action. IOW, my validators only start working if the document is valid JSON (in fact, another small change I made to WJElement in my branch was that any JSON parsing error throws an exception).

So if "value" is declared an "integer" in the schema, only sample 4 should validate. Without the addition of WJR_TYPE_INTEGER, the "value" validator would have to swallow samples 2 and 3 and thereby fail the JSD4 specs. The addition of WJR_TYPE_INTEGER ensures that the type of "value" in sample 4 is WJR_TYPE_INTEGER while all others are WJR_TYPE_NUMBER.
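The distinction described above can be sketched as a purely lexical check on the number token. This is a minimal illustration, not code from WJElement or wjelement-cpp: the `demo_number_type` enum and `demo_classify_number` function are hypothetical names, and the input is assumed to already be a number token.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type tags for this sketch; they mirror the names used in
 * the thread but are not taken from the WJElement headers. */
typedef enum {
    DEMO_TYPE_NUMBER,
    DEMO_TYPE_INTEGER
} demo_number_type;

/* Classify the raw text of a JSON number by its lexical form: a token
 * with a fraction ('.') or exponent ('e'/'E') is a number, anything
 * else is an integer. The token is assumed to be a number already. */
demo_number_type demo_classify_number(const char *token)
{
    assert(token != NULL);
    /* strpbrk returns NULL when none of the listed characters occur. */
    return strpbrk(token, ".eE") ? DEMO_TYPE_NUMBER : DEMO_TYPE_INTEGER;
}
```

Applied to the four samples, only `1` would be classified as an integer; `1.1`, `1.0` and `1.` all contain a `.` and stay numbers.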

@minego

Member

minego commented Oct 1, 2015

On 10/01/15 13:55, Pete Hug wrote:

> IMHO, the fact that WJElement sequentially accesses array and object
> collections is a real concern. I'm using my library mainly to implement
> web services and at times, collections can become monstrous. The
> sequential access to collections makes lookups so expensive I'm
> considering adding direct access to my WJElement branch.

There are pointers to elements exposed on the WJElement. On large arrays
they tend to be faster than using the selectors.

For example, the normal way to access members of an array (which can be
slow on large arrays):

    l = NULL;
    while ((value = _WJEString(doc, "names[]", WJE_GET, &l, NULL))) {
        printf("Name: %s\n", value);
    }

vs using the pointers, which is faster:

    for (e = WJEArray(doc, "names[0]", WJE_GET); e; e = e->next) {
        printf("Name: %s\n", WJEString(e, NULL, WJE_GET, NULL));
    }

Each WJElement has a next, prev, parent and child pointer. You should
not modify these directly, but you can read them directly.

For an array it would be even faster to include an array of pointers
directly on the WJElement pointing to each of the children. This would
require a bit more memory but I think it is worth it and would allow
direct access to any element by index.

I've added this to my personal todo list (although I wouldn't complain
if I get a pull request to add it before I get it done.)
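The array-of-pointers idea can be sketched independently of WJElement. In this minimal illustration, `demo_node` is a hypothetical stand-in for an element with a `next` sibling link (it is not the real WJElement struct), and the index is built as a separate table rather than stored on the element.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an element with sibling links; this is not
 * the real WJElement struct. */
typedef struct demo_node {
    struct demo_node *next;
    int value;
} demo_node;

/* Sequential lookup: walks the sibling list, O(n) per access. */
demo_node *demo_nth_by_walk(demo_node *first, size_t index)
{
    demo_node *e = first;
    while (e && index--) {
        e = e->next;
    }
    return e;
}

/* Build a side table of pointers to every child so later lookups are
 * O(1). Returns NULL on allocation failure or for an empty list; the
 * caller frees the table. It must be rebuilt if the list changes. */
demo_node **demo_build_index(demo_node *first, size_t *count)
{
    size_t n = 0;
    demo_node *e;
    demo_node **table;

    assert(count != NULL);
    for (e = first; e; e = e->next) {
        n++;
    }
    *count = n;
    if (n == 0) {
        return NULL;
    }
    table = malloc(n * sizeof *table);
    if (!table) {
        return NULL;
    }
    n = 0;
    for (e = first; e; e = e->next) {
        table[n++] = e;
    }
    return table;
}
```

The trade-off is the one named above: one extra pointer per child in exchange for constant-time access by index, plus the bookkeeping needed to keep the table in sync when children are added or removed.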

> WRT handling integers, consider these JSON examples:
>
>   1. {"value":1.1}
>   2. {"value":1.0}
>   3. {"value":1.}
>   4. {"value":1}
>
> If I parse these using WJElement, the type of each "value" is
> WJR_TYPE_NUMBER. In JSON Schema Draft 04 (JSD4), this is the correct
> type for 1, 2 and 3 but not 4.

For backwards compatibility I would probably leave the WJR_TYPE_NUMBER
value for all of these, but add another flag on the WJElement to
indicate if it was originally an integer or not.
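A minimal sketch of that flag-based approach, assuming a hypothetical `demo_number` element and `demo_matches_schema_type` helper (neither exists in WJElement): the stored type stays a plain number, and a boolean recorded at parse time decides whether the schema type "integer" is satisfied.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical element for this sketch; field and function names are
 * illustrative and not part of WJElement. */
typedef struct {
    double value;        /* parsed numeric value */
    bool   was_integer;  /* true if the source text had no fraction or exponent */
} demo_number;

/* Every demo_number satisfies schema type "number"; only those flagged
 * at parse time satisfy "integer". */
bool demo_matches_schema_type(const demo_number *n, const char *schema_type)
{
    assert(n != NULL && schema_type != NULL);
    if (strcmp(schema_type, "number") == 0) {
        return true;
    }
    if (strcmp(schema_type, "integer") == 0) {
        return n->was_integer;
    }
    return false;
}
```

Existing callers that only look at the type would see no change, since both `1` and `1.0` still report as numbers; only a schema validator needs to consult the flag.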

> I'm not sure how you implemented JSD4, but I simply let WJElement parse
> the document before my schema validators spring into action. IOW, my
> validators only start working if the document is valid JSON (in fact,
> another small change I made to WJElement in my branch was that any JSON
> parsing error throws an exception).

If you'd like to share that patch I'd love to see it. I've been planning
to add a callback to the WJReader layer that will be called for each
warning and error with as much extra information as possible.

@penduin

Member

penduin commented Oct 1, 2015

WJE does parse before validating schema, but doesn't do validation on
the json itself - it just loads everything it can. That's another
area we've talked about adding error reporting.

There may yet be bugs in number/integer validation (particularly v4,
which is incompletely supported), I'll look into that soon.

@petehug

Contributor

petehug commented Oct 2, 2015

minego wrote:
> There are pointers to elements exposed on the WJElement. On large arrays they tend to be faster than using the selectors

I realise that and incidentally, this is what my WJPP::Node::iterator uses. The iterator is super lightweight and very efficient. But efficient traversal is only one requirement for an indexed collection. Equally important is efficient random access, and that is where the problem is.

I found Troy D. Hanson's uthash library (https://github.com/troydhanson/uthash). It is a header-only C library (macro based). It ships (amongst other things) uthash.h and utarray.h. I've played a little with it and can say Troy has done a fine job: it is blindingly fast. The only thing I didn't like very much was that the hash collection didn't order elements on insert, but it does provide high-speed sort ops. I haven't played with utarray.h yet. I'm not a big fan of macros, but I guess if the inevitable code replication only occurs in the WJElement library it wouldn't be a big worry. However, changes to WJElement would be significant. If I find a bit of time I'll give it a shot.
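To illustrate what keyed random access buys over a sequential walk, here is a minimal hand-rolled index. It deliberately does not use uthash: it is a plain fixed-size open-addressing table, and `demo_index`, `demo_index_put` and `demo_index_get` are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_SLOTS 64  /* fixed capacity, power of two; a real table would grow */

/* Hypothetical key-to-pointer index; plain open addressing, not uthash.
 * Keys are borrowed, not copied, and deletion is not supported. */
typedef struct {
    const char *keys[DEMO_SLOTS];   /* NULL marks an empty slot */
    void       *items[DEMO_SLOTS];
    size_t      used;
} demo_index;

/* FNV-1a string hash (32-bit). */
static unsigned long demo_hash(const char *s)
{
    unsigned long h = 2166136261UL;
    while (*s) {
        h ^= (unsigned char)*s++;
        h = (h * 16777619UL) & 0xFFFFFFFFUL;
    }
    return h;
}

/* Insert or overwrite; returns 0 on success, -1 when the table is full. */
int demo_index_put(demo_index *ix, const char *key, void *item)
{
    size_t i, probe;

    assert(ix != NULL && key != NULL);
    i = demo_hash(key) & (DEMO_SLOTS - 1);
    for (probe = 0; probe < DEMO_SLOTS; probe++) {
        if (ix->keys[i] == NULL) {
            ix->keys[i] = key;
            ix->items[i] = item;
            ix->used++;
            return 0;
        }
        if (strcmp(ix->keys[i], key) == 0) {
            ix->items[i] = item;
            return 0;
        }
        i = (i + 1) & (DEMO_SLOTS - 1);  /* linear probing */
    }
    return -1;
}

/* Expected O(1) lookup by name; returns NULL when the key is absent. */
void *demo_index_get(const demo_index *ix, const char *key)
{
    size_t i, probe;

    assert(ix != NULL && key != NULL);
    i = demo_hash(key) & (DEMO_SLOTS - 1);
    for (probe = 0; probe < DEMO_SLOTS; probe++) {
        if (ix->keys[i] == NULL) {
            return NULL;
        }
        if (strcmp(ix->keys[i], key) == 0) {
            return ix->items[i];
        }
        i = (i + 1) & (DEMO_SLOTS - 1);
    }
    return NULL;
}
```

A zero-initialized `demo_index` is an empty table. Like the uthash behaviour described above, this table keeps no insertion order, so ordered traversal would still rely on the sibling pointers.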

> For backwards compatibility I would probably leave the WJR_TYPE_NUMBER value for all of these, but add another flag on the WJElement to indicate if it was originally an integer or not.

I can understand that sentiment and it probably isn't wrong. There is really a bit of a problem between JSD4 and JSON itself in that the JSON specifications don't mention an integer type but JSD4 does. I don't have an easy answer, but I doubt that a "what-the-original-value-was" flag will cut the mustard. All my changes are wrapped in #ifdef WJE_DISTINGUISH_INTEGER_TYPE compiler directives, so they must be explicitly enabled to take effect. I doubt there is anything else that can be done easily to ensure backward compatibility.
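A minimal sketch of that kind of compile-time opt-in. Only the macro name `WJE_DISTINGUISH_INTEGER_TYPE` comes from the comment above; the `demo_kind` enum and `demo_kind_of` function are hypothetical and do not reflect how the wjelement++ branch is actually written.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type enum for this sketch. WJE_DISTINGUISH_INTEGER_TYPE is
 * the macro named in the comment; the enum and function are illustrative. */
typedef enum {
    DEMO_KIND_NUMBER
#ifdef WJE_DISTINGUISH_INTEGER_TYPE
    , DEMO_KIND_INTEGER   /* only exists when the feature is compiled in */
#endif
} demo_kind;

/* Without the macro every number token reports DEMO_KIND_NUMBER, so
 * existing callers see no change in behaviour. */
demo_kind demo_kind_of(const char *token)
{
    assert(token != NULL);
#ifdef WJE_DISTINGUISH_INTEGER_TYPE
    if (strpbrk(token, ".eE") == NULL) {
        return DEMO_KIND_INTEGER;
    }
#endif
    return DEMO_KIND_NUMBER;
}
```

Building with `-DWJE_DISTINGUISH_INTEGER_TYPE` would enable the integer branch; a default build keeps the old single-type behaviour.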

> If you'd like to share that patch I'd love to see it. I've been planning to add a callback to the WJReader layer that will be called for each warning and error with as much extra information as possible.

The WJElement version I modified (branch wjelement++, https://github.com/petehug/wjelement/tree/wjelement++) was branched a while back, but with the right tools you should have no difficulty seeing what I'd done.

@minego

Member

minego commented Oct 2, 2015

I will look at the uthash library. You are right that random access is
important. The current solution is not sufficient for large objects.

@handrews

handrews commented May 22, 2018

Note that the spec is up to draft-07, with draft-08 on the way. There wasn't a draft-05, though: the meta-schema numbering goes from draft-04 straight to draft-06.

@penduin

Member

penduin commented May 22, 2018

@toleressea

toleressea commented May 22, 2018

@handrews

handrews commented May 22, 2018

@penduin I totally get not wanting to update unless there's a clear need.

For now, I can add you to the implementation list as "intending to update", which might get you some attention and volunteers (EDIT: I had originally said I'd add you to the old draft-04 list, but if you're open to patches we should make you more visible).

For an overview of changes, you can look at our guidelines for migrating from older drafts.

Note that it is totally fine to offer core/validation support without hyper-schema support. This is why they are now separate specifications. Hyper-Schema changed dramatically between draft-04 and draft-07 in response to the limited adoption and widespread confusion around draft-04 hyper-schema. I would not recommend implementing Hyper-Schema earlier than draft-07 at this point.

In terms of draft-08 things that might affect how you want to move forward, particularly regarding $ref, the major accepted proposals are as follows. Some of these are really deep in the guts of how JSON Schema works so if they don't make sense, you can ask on the slack channel, or just wait for the spec which should be more clear than the GitHub discussions:

  • json-schema-org/json-schema-spec#523 (allowing keywords alongside $ref by changing the conceptual model for $ref from "as if it were replaced by the reference target" to "has identical results as evaluating the reference target on its own")
  • json-schema-org/json-schema-spec#396 recommended result and error formats
  • json-schema-org/json-schema-spec#530 optional formalized annotation (e.g. title, default, examples) collection process
  • json-schema-org/json-schema-spec#556 unevaluatedProperties, probably optional (it does what a lot of people think happens when you combine additionalProperties and *Of keywords)
  • splitting dependencies into two keywords for the two separate behaviors that it currently offers (already committed to master, although we may further tweak the keyword names before we're done)
@penduin

Member

penduin commented May 23, 2018

@handrews

handrews commented May 23, 2018

You're welcome! I'm trying to do more community outreach to get things moving again now that the current spec team has a track record of delivering regular updates towards finalizing the spec.

@penduin

Member

penduin commented May 24, 2018

@handrews

handrews commented May 24, 2018

@penduin It probably won't be finalized soon, although I am hoping that draft-08 will be the last really big update that affects things broadly. Part of draft-08 is a formal notion of different vocabularies (which exists informally starting with draft-04 with validation and hyper-schema as separate specifications). That way we can fence off the core and validation parts and get those into the standardization process.

draft-06 and -07 are starting to get implemented and used fairly widely, and the list of incompatibilities with draft-04 is quite small. They are a good way to get to a "modern" base without worrying about the more substantial draft-08 changes. Then you can wait until whatever tweaks we get to the draft-08 changes stabilize (I'm sure we'll get some feedback requiring a draft-09).

I expect draft-06/-07 to be the new plateau of support for the near term, with a bit more of a gap before folks start adopting draft-08 or later. So I encourage getting at least to draft-06 for now. But unlike what happened after draft-04, we're still actively moving forward, and you can keep an eye on draft-08+ for when the new stuff settles.
