Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce redundancy in generated schema #1011

Closed
1 of 3 tasks
marcprux opened this issue Jan 27, 2016 · 33 comments
Closed
1 of 3 tasks

Reduce redundancy in generated schema #1011

marcprux opened this issue Jan 27, 2016 · 33 comments
Milestone

Comments

@marcprux
Copy link
Member

Some structures in the generated vega-lite-schema.json are repeated multiple times. For example, the timeUnit enumeration is repeated verbatim 13 times throughout the schema. Detecting this redundancy and emitting repeated structures to a shared defs section of the schema would make it much more compact.

TODO:

  • add enum support to typescript-json-schema
  • change build process to generate schema using new typescript-json-schema
  • remove old schema generation process
@marcprux
Copy link
Member Author

I would be interested in working on this, but I was first wondering if it should be part of #927.

@kanitw
Copy link
Member

kanitw commented Feb 1, 2016

Yes, this should be a part of #927. In addition, our vl.schema.instantiate method, which instantiate default values for the spec, also uses default property in the schema. Current vl.schema.instantiate does not support traversing defs section of the schema. So that's another part of work to be done for this to work.

@kanitw kanitw added this to the 1.x.x Patches & Minor Features milestone Feb 1, 2016
@domoritz
Copy link
Member

domoritz commented Feb 1, 2016

@mprudhom It would be awesome if you could look into this! In addition to using defs, we think that the json schema should be generated from typescript rather than manually by hand. In fact, I worked a bit on https://github.com/YousefED/typescript-json-schema to add support for union types and it's almost ready to be used for our purposes. What's missing right now is support for enums.

I suggest that we first look into the ts -> json schema generation (which can use defs) and then modify the instantiate method to use references.

If you want to talk about this in more detail, we could discuss on skype.

@marcprux
Copy link
Member Author

marcprux commented Feb 3, 2016

@domoritz Are you sure that typescript-json-schema supports storing shared schemas in a separate definitions section of the schema? Glancing over the code I don't see anywhere that will generate any $ref schema attributes.

@domoritz
Copy link
Member

domoritz commented Feb 3, 2016

Sorry, I didn't realize this hasn't been merged yet. There is a branch referenced from YousefED/typescript-json-schema#3. It only needs a few more changes to be ready.

@domoritz
Copy link
Member

domoritz commented Feb 3, 2016

Relevant #927

@domoritz
Copy link
Member

@mprudhom would you like to work on this? It would be incredibly helpful for us!

@marcprux
Copy link
Member Author

@domoritz I'll take a stab at it. Will it be OK for vega-lite to rely on my own fork of typescript-json-schema? As you mention, it will at least need support for enums.

@domoritz
Copy link
Member

Sure! Although @ YousefED has been extremely responsive so I think we will be able to merge quickly.

Since we are refactoring some things around the schema in VL right now, the first step would be to finish ref support (YousefED/typescript-json-schema#3) and then make sure that enums work.

Also note that TS 1.8 added support for string literals and we might use them some time in the future. I don't think they are treated as enums internally and may produce horrible code with a lot of oneOfs.

@marcprux
Copy link
Member Author

I implemented support for enums in typescript-json-schema. You can take a look at the generated schema by installing my branch:

npm install 'glimpseio/typescript-json-schema#%24ref-support'

Then generating the new schema:

node_modules/typescript-json-schema/bin/typescript-json-schema src/schema/schema.ts Spec > vega-lite-schema2.json

Most of the sample specs pass validation:

node_modules/z-schema/bin/z-schema vega-lite-schema2.json examples/specs/*.json

@domoritz Before I proceed, can you take a look at the TODO list I put in this issue's description and let me know if it sounds good to you?

@domoritz
Copy link
Member

Fantastic! The validation errors seem to be mostly because the ts schema is not complete. We use any in a couple of places and the compiler assumes that objects are expected.

migrate "const TIMEUNITS" to "enum TimeUnit" in src/timeunit.ts

Looks like the only difficulty here is that vl shorthand uses the actual array. But to be honest, I think we can just change that code and always output the complete time unit. In fact, I think its necessary now that we have yearmonth. cc @kanitw

We are planning to extract the shorthand into its own repo anyway. I'll see whether we can speed that up.

update string literal references to use new enum types across vega-lite

What does this mean? We don't use string literals in vl yet but once TS supports microsoft/TypeScript#6554, we might. Check out https://github.com/Microsoft/TypeScript/wiki/What's-new-in-TypeScript#string-literal-types.

Before we change too much in vl, let's try to get YousefED/typescript-json-schema#3 and your changes into the main repo. This means

  • Add command line options to use refs
  • Add command line option to not create a ref for the root. This should be the default.
  • Maybe even fix the logic so that two entities with the same name but in different namespaces don't create the same ref entity.
  • Send a PR

Should we move these before the changes in vl?

Also, just as a side note, we are changing the way default values are used so the instantiate methods or the schema in vega lite are probably not necessary any more. This simplifies our life and makes your contributions to schema generation from interfaces even more important. See #1119

@marcprux
Copy link
Member Author

OK, I've submitted a PR with CLI support for using refs & root refs at YousefED/typescript-json-schema#15. You can see it in action by grabbing my branch and running:

./bin/typescript-json-schema -useRef -noRootRef ../vega-lite-schema/src/schema/schema.ts Spec 

Is the name-conflict issue an existent problem in the vega-lite schema? I don't notice any cases where it overlapping schema references were being created.

@domoritz
Copy link
Member

Is the name-conflict issue an existent problem in the vega-lite schema?

Not that I know of.

@marcprux
Copy link
Member Author

In that case, I'll defer implementing it in order to keep the PR as simple as possible.

Should I submit a vega-lite PR now to use the new schema generation against my branch of typescript-json-schema, or should we wait to see if #1119 gets merged soon?

@domoritz
Copy link
Member

Sounds good. I'll look into #1119 and the vl shorthand issues today. I do t think #1119 will give too many conflicts so I think you can go ahead.

Thanks a lot by the way!

@marcprux
Copy link
Member Author

Note that one shortcoming of the new schema generation process is that typescript-json-schema doesn't do anything special with class extensions, so FieldDef, ChannelDefWithScale, ChannelDefWithLegend, OrderChannelDef, and PositionChannelDef are all generated as unrelated schemas with all the properties of the subclass and all superclasses. This isn't incorrect per se, but it does mean there are redundant property definitions.

Since JSON Schema Draft 4 doesn't have any notion of schema extension, the only way to model the extension is to use an allOf schema. For example, OrderChannelDef would be allOf FieldDef and an object with the property sort. This could be accomplished with more enhancements to the typescript-json-schema project.

However, I wonder if the FieldDef class extensions are even worth it, given that all the extension properties are optional. Why not just have a single FieldDef that has optional scale, sort, axis, and legend fields that will be ignored if they are assigned to a channel that doesn't support the encoding? That would allow a client of vega-lite to create a FieldDef prototype for a data field that they can experiment with assigning to different encoding channels without having to go through the trouble of converting the FieldDef to the appropriate subclass.

@marcprux
Copy link
Member Author

@domoritz The new schema generation PR is at #1130, and it is passing (aside from an unrelated test failure in "vl.fieldDef.cardinality()"). The example spec are all being validated against the the new schema generation. Let me know if you would like me to proceed with migrating the old schema generation to the new, or if you would like to handle that yourself, as it will touch quite a lot of code.

@domoritz
Copy link
Member

Note that one shortcoming of the new schema generation process is that typescript-json-schema doesn't do anything special with class extensions, so FieldDef, ChannelDefWithScale, ChannelDefWithLegend, OrderChannelDef, and PositionChannelDef are all generated as unrelated schemas with all the properties of the subclass and all superclasses. This isn't incorrect per se, but it does mean there are redundant property definitions.

I think it's fine to have repetition for now. In the future, the join schema generator could use allOf but for now it's fine to have repetition.

However, I wonder if the FieldDef class extensions are even worth it, given that all the extension properties are optional. Why not just have a single FieldDef that has optional scale, sort, axis, and legend fields that will be ignored if they are assigned to a channel that doesn't support the encoding? That would allow a client of vega-lite to create a FieldDef prototype for a data field that they can experiment with assigning to different encoding channels without having to go through the trouble of converting the FieldDef to the appropriate subclass.

We want the validation to fail rather than silently accept ignored properties.

@domoritz The new schema generation PR is at #1130, and it is passing (aside from an unrelated test failure in "vl.fieldDef.cardinality()"). The example spec are all being validated against the the new schema generation. Let me know if you would like me to proceed with migrating the old schema generation to the new, or if you would like to handle that yourself, as it will touch quite a lot of code.

Amazing! Sorry about the test failure. It's fixed on master now. Give us some time to look at your PR and then we will decide. We want to migrate all defaults to the config in the next 1 to 2 days and avoid merge conflicts but your help is greatly appreciated.

Thanks again.

@domoritz
Copy link
Member

Awesome work @mprudhom! I made a few changes to your pr and created #1131. Also, I made some changes to the typescript to json schema compiler (YousefED/typescript-json-schema#20).

We migrated the defaults into the config (thanks @kanitw) so now we don't need the json schema instantiation any more. What remains now is

  • Delete the json schemas in vl
  • Copy the descriptions from the docs into comments (the descriptions in the json schema may be out of date in some places)
  • Potentially generate documentation from the json schema. See Generate docs from schema #1133

I'm also having some issues with the compiler. Can you have a look at these? You can just take the changes from #1131 and create a new PR. Maybe start with one file before you do the complete move.

foo/typescript-json-schema/node_modules/typescript/lib/typescript.js:39567
                throw e;
                ^

TypeError: Cannot read property 'replace' of undefined
    at Object.normalizeSlashes (foo/typescript-json-schema/node_modules/typescript/lib/typescript.js:1383:20)
    at writeReferencePath (foo/typescript-json-schema/node_modules/typescript/lib/typescript.js:31876:90)
    at foo/typescript-json-schema/node_modules/typescript/lib/typescript.js:30442:29
    at Object.forEach (foo/typescript-json-schema/node_modules/typescript/lib/typescript.js:938:30)
    at foo/typescript-json-schema/node_modules/typescript/lib/typescript.js:30434:20
    at Object.forEach (foo/typescript-json-schema/node_modules/typescript/lib/typescript.js:938:30)
    at emitDeclarations (foo/typescript-json-schema/node_modules/typescript/lib/typescript.js:30427:12)
    at getDeclarationDiagnosticsFromFile (foo/typescript-json-schema/node_modules/typescript/lib/typescript.js:30393:13)
    at onSingleFileEmit (foo/typescript-json-schema/node_modules/typescript/lib/typescript.js:6380:13)
    at Object.forEachExpectedEmitFile (foo/typescript-json-schema/node_modules/typescript/lib/typescript.js:6369:21)

@marcprux
Copy link
Member Author

OK, I re-opened my original PR and updated it by migrating most of the schema attributes into comments & annotations on the typescript properties. I haven't eliminated the old schema yet (now renamed to "schemaOLD") because I don't know what you want to do about the current instantiate() usage in Model.ts. But once that is resolved, all the old schema exports can be ripped out.

Since #1130 touches quite a lot of files, so I urge priority in reviewing it.

P.S. the compiler errors seemed to have something to do with not being able to find the typescript-json-schema package, I'm guessing maybe because the node package cache in Travis is out of date; restoring the reference to YousefED/typescript-json-schema fixes it for the time being.

@domoritz
Copy link
Member

Excellent. I merged master and ripped out everything that was left around the old schema. We don't need instantiate any more.

Somehow the schema generation is not working with 0.0.5

> vega-lite@0.9.3 schema /Users/domoritz/Developer/UW/vega-lite
> typescript-json-schema src/schema/schema.ts Spec > vega-lite-schema.json

/Users/domoritz/Developer/UW/vega-lite/node_modules/typescript/lib/typescript.js:39567
                throw e;
                ^

TypeError: Cannot read property 'replace' of undefined

I think that is the last step left before we can merge all of this (pending review from @kanitw of course).

@domoritz
Copy link
Member

See #1131

@marcprux
Copy link
Member Author

@domoritz Try using the complete path to the typescript-json-schema repo and command in package.json. That's how I fixed it last time. I suspect it might be a npm module caching issue with the Travis CI server; this should get around it for the time being.

26,27c26
<     "schema": "tsc && npm run schema:only",
<     "schema:only": "node_modules/typescript-json-schema/bin/typescript-json-schema src/schema/schema.ts Spec > vega-lite-schema.json",
---
>     "schema": "typescript-json-schema src/schema/schema.ts Spec > vega-lite-schema.json",
61c60
<     "typescript-json-schema": "YousefED/typescript-json-schema",
---
>     "typescript-json-schema": "^0.0.5",

@domoritz
Copy link
Member

I see the same error on my machine, not just on travis. I also tried changing my package.son to what you wrote above and reinstalled tjs but the error is still the same. It may have to do with the changes in 0.0.5 but I can't tell what the issue is.

@domoritz
Copy link
Member

@marcprux
Copy link
Member Author

I can reproduce it too. Looks like maybe a newline issue with the script in typescript-json-schema, and this fixes it for me:

dos2unix ./node_modules/typescript-json-schema/bin/typescript-json-schema

@domoritz
Copy link
Member

Hmm, I wish it worked.

>>> dos2unix ./node_modules/typescript-json-schema/bin/typescript-json-schema                                                                                    11:18
dos2unix: converting file ./node_modules/typescript-json-schema/bin/typescript-json-schema to Unix format...
>>> node_modules/typescript-json-schema/bin/typescript-json-schema src/schema/schema.ts Spec > vega-lite-schema.json                                             11:18
foo/vega-lite/node_modules/typescript/lib/typescript.js:39567
                throw e;
                ^

TypeError: Cannot read property 'replace' of undefined
    at Object.normalizeSlashes (foo/vega-lite/node_modules/typescript/lib/typescript.js:1383:20)
    at writeReferencePath (foo/vega-lite/node_modules/typescript/lib/typescript.js:31876:90)
    at foo/vega-lite/node_modules/typescript/lib/typescript.js:30442:29
    at Object.forEach (foo/vega-lite/node_modules/typescript/lib/typescript.js:938:30)
    at foo/vega-lite/node_modules/typescript/lib/typescript.js:30434:20
    at Object.forEach (foo/vega-lite/node_modules/typescript/lib/typescript.js:938:30)
    at emitDeclarations (foo/vega-lite/node_modules/typescript/lib/typescript.js:30427:12)
    at getDeclarationDiagnosticsFromFile (foo/vega-lite/node_modules/typescript/lib/typescript.js:30393:13)
    at onSingleFileEmit (foo/vega-lite/node_modules/typescript/lib/typescript.js:6380:13)
    at Object.forEachExpectedEmitFile (foo/vega-lite/node_modules/typescript/lib/typescript.js:6369:21)

@domoritz
Copy link
Member

I'm on mac by the way. Not sure whether that's the issue.

@domoritz
Copy link
Member

Converting all files does not fix the issue. I also can't get it to work on travis.

@marcprux
Copy link
Member Author

Odd, now I am seeing that error too. If I force the typescript-json-schema dependency to use typescript 1.7.5 instead of 1.8.0 then it works.

@domoritz
Copy link
Member

That's annoying because we use ts 1.8 in vega-lite. Could you have a look at this?

I submitted a pr to fix the ts version to 1.7 (YousefED/typescript-json-schema#22) for now so at least we can get the pr in and don't risk many merge conflicts.

@marcprux
Copy link
Member Author

I tracked it about as far as I am able and made an issue for it at YousefED/typescript-json-schema#23. In order to move forward for the purposes of getting this PR merged, I recommend using a fork of typescript-json-schema that depends on typescript 1.7.5 instead of 1.8.0 and hope that @YousefED is able to figure out what the underlying problem with 1.8.0 is.

@domoritz
Copy link
Member

Yes, I've done that in 4c2d974

kanitw added a commit that referenced this issue Feb 17, 2016
Generate schema from TypeScript classes (#1011 & #927)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants