Type for defining a Data Schema? #713

Closed
bshamblen opened this issue Aug 6, 2015 · 5 comments
@bshamblen

Definitions already exist for Dataset and DataCatalog, which help define a collection of data records or a catalog of those collections. I've looked through almost all of the definitions on schema.org, but I can't find one to identify a data schema (the structure of the data that's stored within a Dataset).

Without getting too much into the details, I need to create a few CommunicateAction events that have an about value with a @type referring to a database schema. Does a definition already exist that serves this purpose? If not, do you have any suggestions for how best to represent a data schema as the subject for an action?

Note: I currently have a way to create JSON-schema documents for the schemas in question.
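To make the question concrete, here is a minimal sketch of the kind of action being described, built in Python and serialized as JSON-LD. Since no dedicated type for a data schema exists, it falls back to CreativeWork for the `about` value; that fallback, the helper name `communicate_action_about_schema`, and the example.com URL are assumptions for illustration, not an endorsed schema.org pattern.

```python
import json


def communicate_action_about_schema(schema_name, schema_url):
    """Build a CommunicateAction whose `about` value describes a data schema."""
    return {
        "@context": "http://schema.org",
        "@type": "CommunicateAction",
        "about": {
            # Assumption: CreativeWork stands in for the missing
            # data-schema type that this issue asks about.
            "@type": "CreativeWork",
            "name": schema_name,
            # Hypothetical URL of the JSON-schema document.
            "url": schema_url,
        },
    }


action = communicate_action_about_schema(
    "EmailMessage", "http://example.com/json-schema/EmailMessage/v2"
)
print(json.dumps(action, indent=2))
```

The open question is what `@type` should replace CreativeWork so that consumers can tell the subject is a schema rather than a generic document.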

@danbri
Contributor

danbri commented Aug 6, 2015

Interesting, this theme came up earlier in the week with @vholland who has also been looking into related (but different) ideas for going deeper into the contents of datasets - #688 . There is also some related work at W3C around tabular data like CSVs - again see #688 for pointers.

If you want to define schema.org-like structures for your dataset, you could look at using the same basic approach that schema.org uses: define a bunch of types and properties. But it sounds like you are already using JSON-schema, so you just want to point to that. Analogously, XML datasets might want to point to their DTDs, XML Schemas etc. I can see this being a useful thing, let's figure something out...

@bshamblen
Author

Thanks for taking the time to get back to me. In my case, I'm less interested in identifying the data that's contained within a dataset than in being able to identify the schemas themselves.

The project that I'm currently working on is a platform which allows developers to define, document, and share schemas. Developers can collaborate, generate source code, and automatically spin up APIs based on public or private schemas.

For example, if there were a subtype of CreativeWork called DataSchema, developers could post a schema for storing email messages. Once it is published, we could post this JSON-LD snippet on our website, allowing search engines to catalog such schemas and make them easier for developers to find.

{
    "@context": "http://schema.org",
    "@type": "DataSchema",
    "name": "EmailMessage",
    "description": "Defines all of the fields necessary to store and search parsed email messages. Based on RFC 822 and RFC 2045. Supports meeting requests and attachments.",
    "version": 2,
    "mainEntityOfPage": "http://example.com/schema/EmailMessage",
    "discussionUrl": "http://example.com/schema/EmailMessage/discuss",
    "schemaVersion": "http://example.com/json-schema/EmailMessage/v2",
    "aggregateRating": {
        "@type": "AggregateRating",
        "reviewCount": "1",
        "ratingValue": "5"
    },
    "review": [
        {
            "@type": "Review",
            "author": "Brian",
            "datePublished": "2015-07-08",
            "description": "I spun up a new email app using this schema in just a few minutes. Thanks!",
            "name": "Excellent schema for any email app",
            "reviewRating": {
                "@type": "Rating",
                "ratingValue": "5"
            }
        }
    ]
}

Note: schemaVersion could be an external reference to any schema document (not sure if this is the correct use of this property).

Imagine starting a new project and being able to quickly locate schemas that have already been defined, documented, and reviewed for almost every feature in a new app. It would be a huge time saver for developers. We just need a way to clearly identify them.
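As a rough sketch of the publishing step, the snippet could be embedded in a page as a JSON-LD script element. The helper name `jsonld_script_tag` is hypothetical, and `DataSchema` remains the proposed type from this comment rather than an existing schema.org type.

```python
import json


def jsonld_script_tag(doc):
    """Serialize a JSON-LD document into an HTML script element."""
    payload = json.dumps(doc, indent=2)
    # Escape "</" so a string value cannot close the script element early;
    # "\/" is a valid JSON escape for "/", so the data is unchanged.
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


schema_doc = {
    "@context": "http://schema.org",
    "@type": "DataSchema",  # proposed type, not defined by schema.org
    "name": "EmailMessage",
    "mainEntityOfPage": "http://example.com/schema/EmailMessage",
}
print(jsonld_script_tag(schema_doc))
```

Generating the element from a dictionary, instead of hand-writing the JSON, also avoids syntax slips such as unbalanced braces or unquoted keys.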

@danbri
Contributor

danbri commented Aug 2, 2016

We added a 'variablesMeasured' property in the pending extension.

See #1083 (comment)

http://pending.webschemas.org/variablesMeasured

It isn't on the main site yet but will launch (still in 'pending' i.e. pending.schema.org) with the upcoming 3.1 release.

@danbri
Contributor

danbri commented Aug 10, 2016

I've tagged this issue "closed and noted", noting the suggestion in our overview issue #2 to aid findability and for future review. In the interests of keeping a manageable number of open issues I'll go ahead and close it now, but discussions are still very much welcomed here. Thanks again!

@danbri danbri closed this as completed Aug 10, 2016
@joshsh
Contributor

joshsh commented Oct 6, 2016

For the sake of sensor dataset schemas, perhaps the property/value classes could be aligned with ssn:Property. The properties of a dataset could then be interrelated with sensors, features of interest, etc. as well as actual observations contained in the dataset. See also #1391.
