Use framing instead of json-schema to specify tree structure #128

Closed
cboettig opened this issue Mar 30, 2017 · 6 comments

@cboettig
Member

Since JSON-LD is linked data, many different JSON structures can correspond to the same set of information (i.e. the same raw collection of triples). When consuming the data programmatically, it is of course really helpful to have the JSON in a sensible tree-like structure (e.g. role should be a branch of an agent). To guarantee a predictable layout, rather than putting this obligation on the creator of the codemeta.json by insisting that the JSON validate against some schema, we can simply provide a "frame" that transforms whatever data is provided into the desired tree structure, so that our program can always access the data in the same way (e.g. code$agent[[1]]$role). Since this approach must work with whatever data is provided, there is no notion of enforcing 'required' fields. On the other hand, additional fields in the codemeta.json that the frame does not recognize can be either included automatically or ignored, depending on how we write the frame. For illustrative purposes I've written this example frame to drop anything we don't explicitly request:

e.g. here is a playground example which uses a simple frame to filter and transform a flat representation of codemeta json-ld triples describing the agents into a json tree containing only requested fields and nesting the agent properties.
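
To make the idea concrete without a JSON-LD processor, here is a rough Python sketch of the kind of reshaping a frame performs: a flat list of nodes linked by `@id` references is turned into a nested tree, and anything the frame does not ask for is dropped. This is only an illustration of the concept, not the JSON-LD framing algorithm itself; the `apply_frame` helper, the simplified frame layout, and the sample data are all hypothetical.

```python
# Hypothetical flat input: two nodes linked by an "@id" reference.
flat = {
    "@graph": [
        {
            "@id": "_:code",
            "@type": "SoftwareSourceCode",
            "name": "codemetar",
            "license": "GPL-3",
            "agent": [{"@id": "_:a1"}],
        },
        {
            "@id": "_:a1",
            "@type": "Person",
            "name": "Jane Doe",
            "role": "maintainer",
            "email": "jane@example.org",
        },
    ]
}

# Simplified stand-in for a frame: the root type to select, plus the
# properties to keep at each level. Real JSON-LD frames are richer than this.
frame = {
    "@type": "SoftwareSourceCode",
    "name": {},
    "agent": {"name": {}, "role": {}},
}


def apply_frame(flat_doc, frame):
    """Nest referenced nodes and keep only the properties named in the frame."""
    index = {node["@id"]: node for node in flat_doc["@graph"]}

    def embed(node, subframe):
        out = {}
        for key, child in subframe.items():
            # Keywords such as "@type" select nodes; they are not copied here.
            # Properties missing from the data are skipped, never enforced.
            if key.startswith("@") or key not in node:
                continue
            out[key] = shape(node[key], child)
        return out

    def shape(value, child):
        if isinstance(value, list):
            return [shape(item, child) for item in value]
        if isinstance(value, dict) and "@id" in value:
            # Replace the reference with the node it points to.
            return embed(index.get(value["@id"], value), child)
        return value

    roots = [n for n in flat_doc["@graph"] if n.get("@type") == frame.get("@type")]
    return [embed(root, frame) for root in roots]


tree = apply_frame(flat, frame)
print(tree[0]["agent"][0]["role"])
```

With the tree in this shape, the Python equivalent of code$agent[[1]]$role is `tree[0]["agent"][0]["role"]`, and the unrequested `license` and `email` fields are gone.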

@cboettig
Member Author

cboettig commented Apr 5, 2017

A few examples of this in R are now shown in the draft vignette on jsonld-framing for the codemetar package: https://codemeta.github.io/codemetar/articles/JSON-LD-framing.html

@mfenner

mfenner commented Apr 17, 2017

@cboettig can you explain the use cases where framing would help? In my understanding framing is a linked data concept, needed for unambiguously converting a JSON tree into a graph. I am not sure how central this is to a software metadata crosswalk, which in my understanding should be possible without knowing anything about linked data.

@cboettig
Member Author

@mfenner Thanks for all your help on the issues, great to have your experience & perspective on these things.

Re framing: the use case would be as an alternative to expecting users to validate their own codemeta.json files against our schema. I'm just trying to play by the book: we've declared that codemeta.json is linked data, so I believe that it is reasonable for a user to assume they can do the usual json-ld things, like using a flat graph with reference ids or a nested graph, or extending the data model with other linked data strings and not have to validate against a specific tree format of our schema. I think this makes things simpler for a user without knowledge of linked data too, in that they likewise don't have to worry about validation and the formatting is a bit more forgiving.

Of course this means that someone consuming the codemeta.json has the extra responsibility of first applying the frame rather than just assuming the data already matches the schema. Ultimately that gives such developers more flexibility, but I acknowledge it shifts some burden from those generating the JSON to those consuming it, which is always the trade-off.

I'm not set on framing, it just seemed like it was the json-ld way of solving the problem we were addressing with json-schema, and folks seemed pretty keen on going with json-ld and not json schema or xml schema as our data model.

@mfenner

mfenner commented Apr 17, 2017

@cboettig I didn't know that you made the decision that codemeta is linked data. Is there a GitHub issue or other place where I can read up on the reasoning behind this? Obviously JSON-LD describes linked data, but you can also use JSON-LD without going all the way down this path. If I am not mistaken, most metadata standards relevant for software are not linked data.

@cboettig
Member Author

@mfenner Thanks. No, I don't think we ever fleshed out so precisely the positives and negatives of going with linked data vs some other route. Going into the workshop we seemed to have a consensus that we were going to be writing a json-ld context file; maybe the rest is just my confusion, but it seemed to me that this meant the codemeta.json files were going to declare and use that context and thus be valid json-ld. I don't have much experience in this, so I'm not familiar with examples that go about defining JSON-LD context files but not "going all the way down this path".

I don't think I'm trying to push people down any particular path; we have the schema file already and I think that's not a bad thing. I don't see creating some example frames as being particularly prescriptive, if anything I thought the general idea of this approach was to be less prescriptive and more flexible. To me at least it looked like we had written some linked data so it made sense to try and operate on it as linked data.

@cboettig
Member Author

I've added examples of how framing provides similar checks to validation here: https://codemeta.github.io/codemetar/articles/validation-in-json-ld.html

I think JSON-LD eliminates some of the issues associated with working with JSON without a validator, which sidesteps, rather than really answers, some otherwise challenging issues where we really didn't have consensus (e.g. we never had consensus on what properties should be "required"; JSON-LD leaves this up to the application developer).

Of course there are still some very good use cases for a check, e.g. a user wants to make sure they don't have a typo in some property name in their file, or that they didn't include some term that is outside of the codemeta context by mistake (though they can extend codemeta at will, by adding their own additional context). A check that the input file compacts all names (no : in compacted property names) does this.
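
A minimal Python sketch of that last check, assuming the document has already been compacted against the codemeta context by a real JSON-LD processor (that step is not shown). The `uncompacted_keys` helper and the sample document are hypothetical; the function only walks the compacted JSON and reports property names that still contain a `:`.

```python
def uncompacted_keys(node, path=""):
    """Return the paths of property names that still contain ':'."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@context":
                # Context definitions legitimately contain IRIs; skip them.
                continue
            here = f"{path}/{key}"
            # JSON-LD keywords start with "@" and are not property names.
            if not key.startswith("@") and ":" in key:
                found.append(here)
            found.extend(uncompacted_keys(value, here))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            found.extend(uncompacted_keys(item, f"{path}[{i}]"))
    return found


# Hypothetical compacted output: two names did not reduce to plain terms.
compacted = {
    "@type": "SoftwareSourceCode",
    "name": "codemetar",
    "schema:nmae": "misspelled property",
    "agent": [{"name": "Jane Doe", "ex:shoeSize": 42}],
}

problems = uncompacted_keys(compacted)
print(problems)
```

An empty list would mean every property name compacted to a plain term. Anything reported is either a typo or a term from outside the context, and which of the two it is remains a judgment for the user.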
