JSON-LD Context Error in "genre"? #2234

Closed
iherman opened this issue Apr 25, 2019 · 8 comments

@iherman
commented Apr 25, 2019

The documentation for genre says that the expected value is of type Text or URL. However, the JSON-LD context file says:

"genre": {
    "@id": "schema:genre",
    "@type": "@id"
},

which forces the value to always be interpreted as a URL. This may also lead to syntactically incorrect URLs, and is certainly not in line with the documentation...
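To make the effect concrete, here is a minimal Python sketch of what type coercion does to a plain string when the term is typed as an identifier. It is a simplified stand-in, not a JSON-LD processor: `coerce_value`, `GENRE_TERM`, and the base IRI `http://example.org/page` are all hypothetical names chosen for illustration.

```python
from urllib.parse import urljoin

# Hypothetical term definition mirroring the context entry quoted above.
GENRE_TERM = {"@id": "schema:genre", "@type": "@id"}


def coerce_value(term_definition, value, base_iri):
    """Simplified stand-in for JSON-LD type coercion of a string value."""
    if isinstance(value, str) and term_definition.get("@type") == "@id":
        # The string is treated as an IRI reference and resolved against the base,
        # even when it was meant as plain text.
        return {"@id": urljoin(base_iri, value)}
    # Without "@type": "@id" the string stays a plain literal.
    return {"@value": value}


# Assumed base IRI of the page carrying the markup (illustrative only).
coerced = coerce_value(GENRE_TERM, "Late Renaissance", "http://example.org/page")
print(coerced)
```

The text value ends up under `@id` with a space inside the resulting IRI, which is the kind of syntactically incorrect URL described above.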

@BigBlueHat @RichardWallis

@BigBlueHat

Contributor

commented Apr 25, 2019

@iherman fwiw, this happens in at least a few other places within the context file. There's also an issue for genre specifically: #1473. If you dig around, you'll find this has come up often over the years...and now might be a good time to fix it. 😄

Using this particular situation as a proof case, here's what could change and how it would help things be more clearly expressed.

If we change genre's definition to just...

{
  "genre": "schema:genre"
}

and then update (or add) examples that show the two variations...

  • one for Text:
{
  "@context": "http://schema.org",
  "@type": "Painting",
  "name": "The Madonna with the Long Neck",
  "genre": "Late Renaissance"
}
  • one for URL:
{
  "@context": "http://schema.org",
  "@type": "Painting",
  "name": "The Madonna with the Long Neck",
  "genre": {
    "@id": "http://vocab.getty.edu/aat/300021143"
  }
}

...then we get the advantage of not encumbering the "text only" folks with @value for the (perhaps) more common/simpler cases, but still provide an explanation/example to those who know they want to draw the distinction.
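As a rough illustration of the two variations under the plain definition, the Python sketch below shows how each value form would be read once the term no longer carries a `@type`. The function name `expand_plain_term_value` is hypothetical, and the logic is a deliberately reduced view of JSON-LD value expansion rather than a full implementation.

```python
def expand_plain_term_value(value):
    """Simplified view of how a value is read when the term has no "@type"."""
    if isinstance(value, dict) and "@id" in value:
        # An explicit node reference stays a reference (the "URL" case).
        return {"@id": value["@id"]}
    # A bare string stays a plain literal (the "Text" case).
    return {"@value": value}


text_genre = expand_plain_term_value("Late Renaissance")
url_genre = expand_plain_term_value({"@id": "http://vocab.getty.edu/aat/300021143"})
print(text_genre)
print(url_genre)
```

Authors of text values write a bare string, and only those who want a link need the `{"@id": ...}` wrapper.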

This has been proposed elsewhere in the issues...somewhere, so perhaps we should make an issue for this approach (on its own) and run this to the finish line? 😃

@RichardWallis

Contributor

commented Apr 26, 2019

As @BigBlueHat identifies, this is an issue that has come up a few times and is potentially relevant for many properties. This might be a time to apply a more generic fix.

As we have in excess of 2,000 terms in the vocabulary, the context file is, by necessity, produced programmatically. Just modifying the definition for genre, although possible, is not really a practical proposition.

Looking at the code that produces the JSON-LD context file (human readable version), the logic for properties is as follows:

  1. If the rangeIncludes includes schema:URL, the entry will be of the following form:
    "menu": { "@id": "schema:menu", "@type": "@id"}
  2. Else if the rangeIncludes includes schema:Date, the entry will be of the following form:
    "startDate": { "@id": "schema:startDate", "@type": "Date"}
  3. Else if the rangeIncludes includes schema:DateTime, the entry will be of the following form:
    "startTime": { "@id": "schema:startTime", "@type": "DateTime"}
  4. Else the entry will be of the following form:
    "step": { "@id": "schema:step"}

From the foregoing discussion, am I correct in assuming that if we tweak the test for the first three of these steps to be:
if the rangeIncludes only includes....
... it will generically solve the problem under discussion?
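The four rules and the proposed "only includes" tweak can be sketched in Python as follows. This is not the generator's actual code: `context_entry`, its `only` flag, and the representation of rangeIncludes as a set of strings are assumptions made for illustration.

```python
def context_entry(name, range_includes, only=False):
    """Build one context entry from a property's rangeIncludes set.

    only=False follows the four rules listed above ("includes").
    only=True applies the proposed tweak ("only includes").
    """
    def matches(type_name):
        if only:
            return set(range_includes) == {type_name}
        return type_name in range_includes

    entry = {"@id": f"schema:{name}"}
    if matches("schema:URL"):
        entry["@type"] = "@id"
    elif matches("schema:Date"):
        entry["@type"] = "Date"
    elif matches("schema:DateTime"):
        entry["@type"] = "DateTime"
    # Otherwise the entry carries only "@id", as in rule 4.
    return entry


genre_range = {"schema:Text", "schema:URL"}
print(context_entry("genre", genre_range))             # current behaviour
print(context_entry("genre", genre_range, only=True))  # proposed tweak
```

Under the tweak, genre drops its `@type` because its range is not URL alone, while a property whose range is only schema:URL keeps `"@type": "@id"`.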

If it does, would it then invalidate many examples that would require updating and/or supplementing?

Whilst in that section of code, are there any other data types that deserve similar treatment?

@gkellogg Will JSON-LD 1.1 have significant impacts or provide opportunities in this area?

/cc @danbri

@gkellogg

Contributor

commented Apr 26, 2019

JSON-LD 1.1 won't change any of the basics for typing terms.

Another option might be to start with a check to see if rangeIncludes includes schema:Text, and not use @type in that case.
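A minimal Python sketch of that ordering, assuming rangeIncludes is available as a set of strings; `context_entry_text_first` is a hypothetical name, not a function in the schema.org tooling.

```python
def context_entry_text_first(name, range_includes):
    """Emit an untyped entry whenever schema:Text is among the allowed ranges."""
    entry = {"@id": f"schema:{name}"}
    if "schema:Text" in range_includes:
        # Text is allowed, so leave the term untyped and let strings stay literals.
        return entry
    if "schema:URL" in range_includes:
        entry["@type"] = "@id"
    elif "schema:Date" in range_includes:
        entry["@type"] = "Date"
    elif "schema:DateTime" in range_includes:
        entry["@type"] = "DateTime"
    return entry


print(context_entry_text_first("genre", {"schema:Text", "schema:URL"}))
print(context_entry_text_first("url", {"schema:URL"}))
```

Compared with an "only includes" test, this needs a single extra check at the top and leaves the remaining rules untouched.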

Given the Travis tools, among others, the simplest thing might be to create a branch that does this, and see what issues the lint stage finds.

@danbri

Contributor

commented Apr 26, 2019

@gkellogg

Contributor

commented Apr 26, 2019

> does json-ld define any subtypes of Property that we could use to make this distinction explicitly?

JSON-LD doesn't really define types or subtypes; properties are keys in an object, which can have information such as @type, @container and so forth associated with them in order to determine how to process the content.

I think it's outside the bounds of the JSON-LD spec to describe a micro-syntax for a string value. If it did, though, you might say that if a property has @type: @id and its value does not have the form of an IRI (perhaps an absolute IRI), for example because it contains whitespace, it should be interpreted as a string value instead. I suspect that this is what Google may do when looking at actual data in the wild, but I don't see how we could reasonably describe that. (Perhaps a @type: @figure-it-out type could be a catch-all?)
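Purely as an illustration of such a heuristic, and not something the JSON-LD spec or any known consumer is documented to do, a lenient reader could be sketched in Python like this. Both function names are hypothetical.

```python
from urllib.parse import urlsplit


def looks_like_absolute_iri(value):
    """Rough check: no whitespace anywhere and a scheme such as "http:" present."""
    if any(ch.isspace() for ch in value):
        return False
    return bool(urlsplit(value).scheme)


def lenient_coerce(value):
    """Treat IRI-looking strings as references and everything else as text."""
    if looks_like_absolute_iri(value):
        return {"@id": value}
    return {"@value": value}


print(lenient_coerce("http://vocab.getty.edu/aat/300021143"))
print(lenient_coerce("Late Renaissance"))
```

The check is easy to fool: a text value containing a colon and no spaces could still be mistaken for an IRI, which hints at why this is hard to specify.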

@RichardWallis

Contributor

commented Apr 29, 2019

For the moment, then, I'm going to follow @gkellogg's [nicely simple] suggestion of not outputting a @type if rangeIncludes includes schema:Text.

I'll do it on a new branch to see what effects there are...

@RichardWallis

Contributor

commented Apr 29, 2019

Created an updated version of the context file with the simple change.

No difference in the number of lint failures.

@iherman @gkellogg output attached - is it more like what you expect?

jsonldcontext.json.txt

RichardWallis self-assigned this Apr 30, 2019

RichardWallis pushed a commit that referenced this issue Apr 30, 2019

@BigBlueHat

Contributor

commented May 2, 2019

Change looks good overall @RichardWallis! Thanks.

List of 53 terms which can have URLs (or objects) as their values (i.e. they have `"@type": "@id"` set):
jsonldcontext.json.txt|1108 col 82| "actionableFeedbackPolicy": { "@id": "schema:actionableFeedbackPolicy", "@type": "@id"},
jsonldcontext.json.txt|1118 col 62| "additionalType": { "@id": "schema:additionalType", "@type": "@id"},
jsonldcontext.json.txt|1129 col 54| "afterMedia": { "@id": "schema:afterMedia", "@type": "@id"},
jsonldcontext.json.txt|1216 col 56| "beforeMedia": { "@id": "schema:beforeMedia", "@type": "@id"},
jsonldcontext.json.txt|1219 col 70| "benefitsSummaryUrl": { "@id": "schema:benefitsSummaryUrl", "@type": "@id"},
jsonldcontext.json.txt|1302 col 62| "codeRepository": { "@id": "schema:codeRepository", "@type": "@id"},
jsonldcontext.json.txt|1306 col 52| "colleague": { "@id": "schema:colleague", "@type": "@id"},
jsonldcontext.json.txt|1336 col 54| "contentUrl": { "@id": "schema:contentUrl", "@type": "@id"},
jsonldcontext.json.txt|1344 col 68| "correctionsPolicy": { "@id": "schema:correctionsPolicy", "@type": "@id"},
jsonldcontext.json.txt|1414 col 60| "discussionUrl": { "@id": "schema:discussionUrl", "@type": "@id"},
jsonldcontext.json.txt|1419 col 64| "diversityPolicy": { "@id": "schema:diversityPolicy", "@type": "@id"},
jsonldcontext.json.txt|1420 col 80| "diversityStaffingReport": { "@id": "schema:diversityStaffingReport", "@type": "@id"},
jsonldcontext.json.txt|1421 col 60| "documentation": { "@id": "schema:documentation", "@type": "@id"},
jsonldcontext.json.txt|1430 col 56| "downloadUrl": { "@id": "schema:downloadUrl", "@type": "@id"},
jsonldcontext.json.txt|1443 col 56| "duringMedia": { "@id": "schema:duringMedia", "@type": "@id"},
jsonldcontext.json.txt|1460 col 50| "embedUrl": { "@id": "schema:embedUrl", "@type": "@id"},
jsonldcontext.json.txt|1489 col 58| "ethicsPolicy": { "@id": "schema:ethicsPolicy", "@type": "@id"},
jsonldcontext.json.txt|1549 col 58| "gameLocation": { "@id": "schema:gameLocation", "@type": "@id"},
jsonldcontext.json.txt|1588 col 46| "hasMap": { "@id": "schema:hasMap", "@type": "@id"},
jsonldcontext.json.txt|1606 col 78| "healthPlanMarketingUrl": { "@id": "schema:healthPlanMarketingUrl", "@type": "@id"},
jsonldcontext.json.txt|1628 col 44| "image": { "@id": "schema:image", "@type": "@id"},
jsonldcontext.json.txt|1632 col 52| "inCodeSet": { "@id": "schema:inCodeSet", "@type": "@id"},
jsonldcontext.json.txt|1633 col 66| "inDefinedTermSet": { "@id": "schema:inDefinedTermSet", "@type": "@id"},
jsonldcontext.json.txt|1657 col 54| "installUrl": { "@id": "schema:installUrl", "@type": "@id"},
jsonldcontext.json.txt|1674 col 52| "isBasedOn": { "@id": "schema:isBasedOn", "@type": "@id"},
jsonldcontext.json.txt|1675 col 58| "isBasedOnUrl": { "@id": "schema:isBasedOnUrl", "@type": "@id"},
jsonldcontext.json.txt|1711 col 58| "labelDetails": { "@id": "schema:labelDetails", "@type": "@id"},
jsonldcontext.json.txt|1737 col 48| "license": { "@id": "schema:license", "@type": "@id"},
jsonldcontext.json.txt|1751 col 42| "logo": { "@id": "schema:logo", "@type": "@id"},
jsonldcontext.json.txt|1759 col 66| "mainEntityOfPage": { "@id": "schema:mainEntityOfPage", "@type": "@id"},
jsonldcontext.json.txt|1762 col 40| "map": { "@id": "schema:map", "@type": "@id"},
jsonldcontext.json.txt|1764 col 42| "maps": { "@id": "schema:maps", "@type": "@id"},
jsonldcontext.json.txt|1765 col 50| "masthead": { "@id": "schema:masthead", "@type": "@id"},
jsonldcontext.json.txt|1794 col 96| "missionCoveragePrioritiesPolicy": { "@id": "schema:missionCoveragePrioritiesPolicy", "@type": "@id"},
jsonldcontext.json.txt|1817 col 64| "noBylinesPolicy": { "@id": "schema:noBylinesPolicy", "@type": "@id"},
jsonldcontext.json.txt|1902 col 54| "paymentUrl": { "@id": "schema:paymentUrl", "@type": "@id"},
jsonldcontext.json.txt|1941 col 64| "prescribingInfo": { "@id": "schema:prescribingInfo", "@type": "@id"},
jsonldcontext.json.txt|1987 col 74| "publishingPrinciples": { "@id": "schema:publishingPrinciples", "@type": "@id"},
jsonldcontext.json.txt|2024 col 56| "relatedLink": { "@id": "schema:relatedLink", "@type": "@id"},
jsonldcontext.json.txt|2041 col 54| "replyToUrl": { "@id": "schema:replyToUrl", "@type": "@id"},
jsonldcontext.json.txt|2078 col 46| "sameAs": { "@id": "schema:sameAs", "@type": "@id"},
jsonldcontext.json.txt|2085 col 54| "screenshot": { "@id": "schema:screenshot", "@type": "@id"},
jsonldcontext.json.txt|2087 col 52| "sdLicense": { "@id": "schema:sdLicense", "@type": "@id"},
jsonldcontext.json.txt|2116 col 54| "serviceUrl": { "@id": "schema:serviceUrl", "@type": "@id"},
jsonldcontext.json.txt|2124 col 64| "significantLink": { "@id": "schema:significantLink", "@type": "@id"},
jsonldcontext.json.txt|2125 col 66| "significantLinks": { "@id": "schema:significantLinks", "@type": "@id"},
jsonldcontext.json.txt|2140 col 52| "speakable": { "@id": "schema:speakable", "@type": "@id"},
jsonldcontext.json.txt|2204 col 52| "targetUrl": { "@id": "schema:targetUrl", "@type": "@id"},
jsonldcontext.json.txt|2213 col 58| "thumbnailUrl": { "@id": "schema:thumbnailUrl", "@type": "@id"},
jsonldcontext.json.txt|2232 col 56| "trackingUrl": { "@id": "schema:trackingUrl", "@type": "@id"},
jsonldcontext.json.txt|2252 col 74| "unnamedSourcesPolicy": { "@id": "schema:unnamedSourcesPolicy", "@type": "@id"},
jsonldcontext.json.txt|2256 col 40| "url": { "@id": "schema:url", "@type": "@id"},
jsonldcontext.json.txt|2289 col 94| "verificationFactCheckingPolicy": { "@id": "schema:verificationFactCheckingPolicy", "@type": "@id"},
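A list like the one above can also be produced by loading the context as JSON and filtering on the term definitions. The Python sketch below assumes the file is a JSON document with a top-level `@context` object; `SAMPLE_CONTEXT` is a hypothetical three-term excerpt and `terms_typed_as_id` is an illustrative name.

```python
import json

# Hypothetical excerpt standing in for the attached jsonldcontext.json.txt.
SAMPLE_CONTEXT = json.loads("""
{
  "@context": {
    "genre": { "@id": "schema:genre" },
    "sameAs": { "@id": "schema:sameAs", "@type": "@id" },
    "url": { "@id": "schema:url", "@type": "@id" }
  }
}
""")


def terms_typed_as_id(context_document):
    """Return the names of terms whose definition sets "@type" to "@id"."""
    terms = context_document["@context"]
    return sorted(
        name
        for name, definition in terms.items()
        # Skip string-valued entries such as prefix declarations.
        if isinstance(definition, dict) and definition.get("@type") == "@id"
    )


print(terms_typed_as_id(SAMPLE_CONTEXT))
```

Running the same function over the contexts generated before and after the change would show which terms lost their `@id` typing.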

I guess the next step is probably example clean-up and showing how to use "genre": {"@id": "...url..."} when one wants to use the "or URL" part of the "Text or URL" items. I suppose we can use schemaorg.owl to build that list and then hunt down examples that need updating? I'm happy to provide examples where needed.
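One way to find the affected properties, sketched in Python under the assumption that the range information has already been extracted into a plain mapping of property name to allowed types (how to pull that out of schemaorg.owl is not shown here). `text_or_url_properties` and `SAMPLE_RANGES` are hypothetical names.

```python
def text_or_url_properties(ranges):
    """Return properties whose allowed ranges include both Text and URL."""
    wanted = {"schema:Text", "schema:URL"}
    return sorted(name for name, types in ranges.items() if wanted <= set(types))


# Hypothetical hand-written sample; a real run would cover the whole vocabulary.
SAMPLE_RANGES = {
    "genre": ["schema:Text", "schema:URL"],
    "url": ["schema:URL"],
    "name": ["schema:Text"],
}

print(text_or_url_properties(SAMPLE_RANGES))
```

The resulting names are the ones whose examples would be worth checking for URL values written as bare strings.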

Thanks again for this work, @RichardWallis!
