Replies: 2 comments
-
|
IMO the major reason, aside from semantic fidelity, to have generators keep representations across frameworks as close as possible to the schema is that the entire purpose of generating representations in different frameworks is interoperability across them: they are all "the same thing" in different modalities. Yes, different frameworks have different capabilities, that's a given, so interop will always be slightly lossy. For "true" interop we also need generation to be bidirectional - if I already have my schema expressed in JSON Schema, or as pydantic models, I should be able to juice that back into LinkML - but that's obviously a ways off and a more difficult problem. What would really make that impossible is if, rather than pushing desired features into the schema layer, we left them hanging around in the generators, so that the question of interoperability explodes from "mapping domain to domain" to "mapping (domain * all the possible overrides) to (domain * all the possible overrides)." Even if we did have a really awesome means of recording all those overrides as a set of transformation options (I'm a big fan of the linkml map idea, of course), that still implies that any interoperability layer has to accommodate the product of all those extra possibilities, and that we come up with ways to represent not only the schema but also the transforms applied in every framework. That is possible in frameworks like Python and pydantic, where we can just stick a private dict in the module, or in JSON Schema with some annotation object, but not really with e.g. SQL DDL and others. Things like unwinding the inheritance hierarchy are, to me, part of adapting to other frameworks, so as a much more mildly held opinion, IMO that would be great to flatten out as a single layer as being different frameworks (e.g. 
This intersects with another longstanding problem: the tooling, fluidity of modifications, and versionability (is that a word?) of the metamodel. Two things are true: a) we don't want to go hog wild and make the metamodel a moving target, and b) we do need to make changes in the metamodel just as easy as changes in the generators, to avoid the temptation of a quick hack. To make this possible, it needs to be possible to say e.g. "my schema is tied to version x of the metamodel, so I want to use the tooling for version x of the metamodel." For that we need to move the linkml model artifacts out of So, briefly, IMO we should have a standard that generators do not mutate the schema. Over time, deprecate the places where they do by moving any desired behaviors into the metamodel. And to make that feasible, make the metamodel easier and safer to mutate. Thank you for attending my TED talk |
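To put rough numbers on the "domain * all the possible overrides" explosion described in the comment above, here is a small counting sketch. The framework names and the number of override flags are made up purely for illustration and do not correspond to real generator options.

```python
from itertools import combinations, product

# Hypothetical framework names and option counts, chosen only for illustration.
frameworks = ["linkml", "json_schema", "pydantic", "sql_ddl"]
override_flags = 3  # independent on/off generator overrides per framework


def variants(framework: str, n_flags: int) -> list[tuple]:
    """Every (framework, flag settings) combination a generator could emit."""
    return [(framework, flags) for flags in product([False, True], repeat=n_flags)]


# Without overrides: one mapping per pair of frameworks.
plain_pairs = list(combinations(frameworks, 2))

# With overrides: one mapping per pair of variants from different frameworks.
all_variants = [v for f in frameworks for v in variants(f, override_flags)]
variant_pairs = [
    (a, b) for a, b in combinations(all_variants, 2) if a[0] != b[0]
]

print(len(plain_pairs))    # 6
print(len(variant_pairs))  # 384
```

With only three boolean overrides per generator, six framework-to-framework mappings become 384 variant-to-variant mappings, which is the product the interoperability layer would have to account for.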
-
|
On the more specific question of schema-level extra data, I continued the prior issue over here: #1595. Just linking for discoverability between multiple discussions of related topics. |
-
Historically, some generators have offered options that change the semantics of the schema. Classic examples include the options in jsonschemagen and pydanticgen that allow extra attributes. There are a number of objections to this:
However, pragmatically there may be reasons to want this kind of behavior. Sometimes this is a short-term measure while we wait for sufficient expressivity in the metamodel. There might also be reasons particular to the target formalism: e.g. when converting to strict relational DDL, not all schema features are supported, so there may be tradeoffs to be made downstream of the schema itself.
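To make concrete what "changing the semantics" means for the extra-attributes case, here is a minimal sketch in plain Python. The `to_json_schema` and `accepts` functions and the toy class layout are hypothetical and are not LinkML's actual generator API; the only real piece is the JSON Schema keyword `additionalProperties`. The same source definition accepts or rejects the same record depending purely on a generator flag.

```python
# Hypothetical sketch: none of these names come from LinkML's real generators.

def to_json_schema(cls: dict, allow_extra: bool = False) -> dict:
    """Render a toy class definition as a JSON-Schema-like dict."""
    return {
        "type": "object",
        "properties": {name: {"type": t} for name, t in cls["attributes"].items()},
        "required": sorted(cls.get("required", [])),
        # The generator flag, not the schema, decides this keyword.
        "additionalProperties": allow_extra,
    }


def accepts(json_schema: dict, instance: dict) -> bool:
    """Check only required keys and extra keys; types are ignored for brevity."""
    if any(key not in instance for key in json_schema["required"]):
        return False
    extras = set(instance) - set(json_schema["properties"])
    return json_schema["additionalProperties"] or not extras


person = {"attributes": {"id": "string", "name": "string"}, "required": ["id"]}
record = {"id": "p1", "nickname": "Al"}  # "nickname" is not declared in the schema

strict = to_json_schema(person)
loose = to_json_schema(person, allow_extra=True)

print(accepts(strict, record))  # False
print(accepts(loose, record))   # True
```

Nothing in `person` records which of the two behaviors was intended, so a consumer holding only the schema cannot tell which set of instances is valid.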
Examples of transforms include: