Members of DayOfWeek should belong to the schema.org ontology #921
Comments
The pattern is supported. I would rather not include all enumerations from GR into schema.org; it just expands the size of the vocabulary. |
What do you mean? Can we use Text instead of GoodRelations URIs? |
No, you should use GoodRelations URIs. |
FWIW Google's initially-released specifications for use of this property on their Local Business overview page specified the expected value as text, and provided an example which reflected this. I say "initially-released" as the page, which initially returned a 404 from the link that appeared in the Google Developers Structured Data site, is now again returning a 404, so the screenshot displayed below is derived from the Internet Archive copy of Dec. 3, 2015. It, of course, remains to be seen whether they'll use the plain text enumerations as pictured when they reactivate the pages or modify this to use the GR URIs. |
@Aaranged Google's Local Business is actually what triggered this issue :-) (and that's where the snippet was coming from) Note that we could improve the situation a little bit by adding That being said, I am not sure I like this approach. My background is RDF and I personally don't mind using several ontologies at the same time. But the appeal in Schema.org is that we do not need to do that. So I would prefer not relying on external ontologies for important things.
In practice, as soon as Schema.org links to GoodRelations, people have to consider the union of the two ontologies. In any case, I expect |
+1. I don't think this is an enumeration that's going to get out of hand, as I doubt we'll see a week ever consisting of more than these seven days. :) Failing this, a) usage of DayOfWeek should be supported with an example; b) documentation on enumerations should be added to schema.org documentation. IMO b) should be pursued regardless of what's done with this specific enumeration. Adding enumerations is anything but intuitive, and better documentation would assist webmasters in providing better markup. Confusing matters further is that the blog post on external enumerations is linked from the documentation page. This might lead one to believe this is general information on how to use enumerations while its subject matter is, of course, how to use enumerations when they don't exist in the schema.org core. |
I would support adding Monday/Tuesday/Wednesday/Thursday/Friday into schema.org. These are common and useful concepts. Doesn't mean we would need to bring in all the other GR enumerations. At this stage I can only offer anecdata, but it feels like webmasters often trip over the idea of enumerated property values being URLs, and we'll often see "True" or "true" in published data instead of the URL http://schema.org/True. One possible cleanup rule that schema.org consumers might need could be: "if a string value, e.g. "True", is unexpected and found where an enumeration is expected, and if the string exactly matches a term from within that enumeration, then fix it to the appropriate URL". So if you find an http://schema.org/isAccessibleForFree property and its value is "foo", you don't know what to do with it. But if you find "True", then normalizing it to be the URL http://schema.org/True would seem reasonable. We could do this more easily for days of the week if they were internally enumerated within the site. Whether we really want to get into writing down such hacks is another matter; however, I will note the precedent in http://schema.org/docs/datamodel.html which already sets an expectation that consumers will figure out how to accept 'strings' where 'things' ought really to be provided. The site currently says:
For non-enumerable things like Country, Person etc. it is not obvious how to take a string and get a URL; for enumerated things like True, False we could choose to be more explicit. |
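A minimal sketch of the cleanup rule described above, for illustration only. The `ENUM_TERMS` table and the function name `fix_enum_string` are assumptions made for this example, and the `dayOfWeek` entry assumes the day names have been added to schema.org as proposed in this thread.

```python
SCHEMA = "http://schema.org/"

# Illustrative table: which enumeration terms a property expects.
# This is an assumption for the sketch, not data published by schema.org.
ENUM_TERMS = {
    "isAccessibleForFree": {"True", "False"},
    "dayOfWeek": {
        "Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday",
    },
}


def fix_enum_string(prop, value):
    """Return the enumeration URL when `value` exactly matches a term
    expected by `prop`; otherwise return `value` unchanged."""
    if value in ENUM_TERMS.get(prop, set()):
        return SCHEMA + value
    # Unknown strings such as "foo" are left alone: we cannot guess.
    return value
```

Under exact matching, "True" is rewritten to the URL while "foo" and the lowercase "true" are passed through untouched; whether case variants should also match is a separate decision.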
That's also my experience.
Let's look at https://developers.google.com/structured-data/local-businesses/. The documentation says that If Google really wants to promote the use of |
Picking this up again, ... I propose 1.) we add 'Monday' through 'Sunday' as enumerations into schema.org. 2.) that schema.org documents (in datamodel.html or some other way) a standard way for the long URIs of its enumerated values to be written instead as simple strings based on the last part of the URI. |
Thanks :-) It's hard to have an idea of priorities around here...
What do we actually gain by having days as URIs? Especially if you say that we should use simple String instead. It seems to me that datatypes already do that job very well e.g. So my proposal is to fully embrace the use of well-defined strings by
That is basically standardizing Google's own implementation for |
@betehess - re priorities, as with all consensus-oriented collaborative projects it is easier to discuss rough goals than to successfully predict exact timelines. Historically all changes to the site have been by consensus of the full steering group based on public discussion and debate. We usually lose some momentum around vacation periods but http://schema.org/docs/releases.html should give a rough feel for the typical pace. As for progress on a particular topic, that depends on someone championing an issue - as you're doing here. When you put together your analysis late last year I added it to the issue tracking our next release - #911 (these are findable via #1). How much of that we manage to cover depends on how much agreement we can achieve here. I'll respond to the technical points separately. |
My experience is similar to @betehess's; while namespaces make sense to folks with RDF backgrounds, many authors get confused. For clarity, we should add the days of the week values to schema.org. (I am not advocating adding all of GoodRelations. To @mfhepp's point, that is a lot of new vocabulary.) I am not sure there is much to be gained by creating a datatype. At the end of the day, even with well-defined values, authors will confuse them, so automated tools need to be somewhat liberal in understanding "Monday", "monday", "MONday", and "http://schema.org/Monday" are the same thing. |
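As a rough illustration of what "somewhat liberal" could mean for a consumer, the sketch below maps the four spellings mentioned above to one URL. The function name `normalize_day` and the choice to return `None` for anything unrecognised are assumptions of this example, not agreed behaviour.

```python
SCHEMA = "http://schema.org/"
DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday")

# Lowercased day name -> canonical spelling.
_BY_LOWER = {day.lower(): day for day in DAYS}


def normalize_day(raw):
    """Map "Monday", "monday", "MONday" or "http://schema.org/Monday"
    to the canonical URL; return None when the input is not recognised."""
    text = raw.strip()
    # Accept the full URL form by stripping the namespace first.
    if text.lower().startswith(SCHEMA):
        text = text[len(SCHEMA):]
    day = _BY_LOWER.get(text.lower())
    return SCHEMA + day if day else None
```

Translated strings such as "Lundi" are deliberately not handled here and come back as `None`.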
It would be great if accessing http://schema.org/Monday results in a |
Talking of translations and "MONday" etc., I should also note that a number of sites have been publishing non-English simple strings for the daysOfWeek values. I don't have hard stats to share at this point, am digging deeper to try to get a sense for how widespread the errors are. |
I'm concerned about the ontological meaning of this set "Sunday" ... "Saturday". In OpenCyc, each of these constants is shorthand for a set of days. To wit, the OpenCyc collection "day of the week" lists the instances Sunday, Tuesday, Wednesday, Thursday, Monday, Friday, Saturday. Monday (http://sw.opencyc.org/concept/Mx4rvVjW85wpEbGdrcN5Y29ycA) is described as: A collection of CalendarDays and an instance of DayOfWeekType. David Whitten |
I would say that in the context of schema.org, enumerated values are perfectly fine (as they have been in GoodRelations for almost a decade now). We do not need to argue about their ontological essence (which could be a painful and fruitless effort). An axiomatic theory that maps these constructs to a notion of time is not needed for the data processing tasks typically executed on the basis of schema.org markup. Martin martin hepp http://www.heppnetz.de
|
I think we are touching a nerve with the I see a few issues with @danbri's idea to write somewhere in the documentation how one can go between the two worlds. That is out-of-band information, and it won't easily accommodate alternate spellings and translations. I believe we can provide a lexical-to-value mapping where the domain of values is the instances of A few examples:
Note that the type coercion would be automatic in JSON-LD if we update the schema.org context accordingly. Defining such a mapping is pretty easy. It would be described in http://schema.org/DayOfWeek itself. The mapping MUST be 100% implementable or that will hurt interoperability, where one implementation would support French translation while others would not. So that means that we would still need a systematic way to handle something like If that idea has enough supporters, I am happy to update #923 accordingly. |
Is the idea that JSON-LD will allow us to write "dayOfWeek": "Monday", ... and map it transparently to the URL-based version? What would the context file need to contain to express this in a standard way? /cc @lanthaler |
@betehess re "write somewhere in the documentation how one can go between the two worlds." note that schema.org has had this kind of rule from the very start, on the principle that making things easier for webmasters and slightly harder for consumers was a good tradeoff. From http://schema.org/docs/datamodel.html
The idea here is to just make this rule a little more explicit, so that the result in terms of triples can be written down and shared rather than left under-determined. |
The URI will be relative to the base URL, not http://schema.org/. Are you against type-coercion? It seems much more flexible to me.
That datamodel document only tells data consumers such as search engines to be liberal in what they accept from others. In practice, they will do it anyway because yes, "some data is better than none". The perspective of a data producer is very different. If I want my data to be understood by others, I will always prefer the way that lets me be as precise as possible so that I won't be in that grey area, not knowing how my data will be interpreted. From that point of view, being too permissive will hurt interoperability. In practice, I agree that String in lieu of |
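To make the remark about the base URL concrete, here is a simplified sketch of the two ways a JSON-LD processor can expand a bare string such as "Monday": document-relative (as with `"@type": "@id"` coercion) and vocabulary-relative (as with `"@type": "@vocab"` coercion). It is a reduction of the JSON-LD IRI expansion rules, not a processor; the function names and the example page URL are assumptions.

```python
from urllib.parse import urljoin

SCHEMA_VOCAB = "http://schema.org/"


def expand_as_id(value, base):
    """Document-relative expansion: relative references are resolved
    against the URL of the page carrying the markup."""
    return urljoin(base, value)


def expand_as_vocab(value, vocab=SCHEMA_VOCAB):
    """Vocabulary-relative expansion: a bare term is appended to the
    vocabulary IRI; absolute IRIs pass through unchanged."""
    if "://" in value:
        return value
    return vocab + value


page = "http://example.com/shop/hours.html"  # hypothetical page URL
print(expand_as_id("Monday", page))     # http://example.com/shop/Monday
print(expand_as_vocab("Monday"))        # http://schema.org/Monday
```

With document-relative expansion the short string ends up under the publisher's own site, which is why plain "Monday" does not transparently become the schema.org URL under `@id` coercion.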
I'm still not understanding the substance of your proposal - is the idea dependent on using datatype URIs on strings like "Monday"? Can it work with simple plain strings? BTW on the Google side, there is now a note on https://developers.google.com/structured-data/local-businesses/ indicating a commitment to consuming the official schema.org notation, and to tracking the outcome of these discussions. I hope that helps clarify the situation there. |
Agreed, this is confusing because we are trying to tackle several issues at the same time. One issue was that the URLs were not in the schema.org namespace. I think we agreed to move them under the schema.org prefix and make them an enumeration. Then we started talking about strings.
Definitely, and I think we have to make it work with strings because Google will "continue to accept both variations for backwards compatibility". So we should back-port strings into the spec. Also, enough people said there was value in addressing the string use-case. My proposal here is to use typed strings instead of "simple plain strings". Because the data can easily be coerced to the expected type, there should be no impact on most users as Google recommended to use JSON-LD for that (all of their examples would still be valid with no modification). |
Where does this leave Microdata and RDFa? Microdata is on millions of sites and has no datatyping. Would we say that the short string form is JSON-LD only? |
Alright, let me try to summarize how we got there: I had two goals initially: 1. getting the I think we all agreed about 1. Now for 2, maybe this is just not needed anymore as implementations are willing to switch to that approach. That would basically work with all syntaxes. |
Perhaps we are getting close here? I continue to believe that schema.org consumers in practice will need to map strings to enumerated URIs in the following condition, and that they are worth writing down for the sake of encouraging multiple implementations.
This could be couched in terms of property values instead of triples, but I believe captures the common case of seeing "True", "true", "OrderInTransit" etc as plain strings. |
Just an idea to formalize the "strings as substitute for enumerated values approach": We could define an annotation property in schema.org that holds a string that can be used in lieu of the full URI, e.g. for http://schema.org/True etc. (and actually all instances of http://schema.org/Enumeration and its subclasses. Like so
This would define a canonical string shortcut for popular enumerated values. And a consuming client could add a rule like: Such can be done easily in SPARQL in LOD worlds. It would be a cheap way to formalize the string shortcut approach for some or all such values. Martin |
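The consuming-client rule could look roughly like the sketch below, written in Python rather than SPARQL. The annotation data is invented for illustration: the `CANONICAL` table stands in for whatever annotation property schema.org might define, and `build_index` and `apply_rule` are hypothetical names.

```python
# Hypothetical annotation data: enumeration instance URL ->
# (enumeration type URL, canonical string shortcut).
CANONICAL = {
    "http://schema.org/True": ("http://schema.org/Boolean", "True"),
    "http://schema.org/False": ("http://schema.org/Boolean", "False"),
    "http://schema.org/OrderInTransit": (
        "http://schema.org/OrderStatus", "OrderInTransit"),
}


def build_index(annotations):
    """Invert the annotations so a (type, string) pair finds its URL."""
    return {
        (enum_type, text): url
        for url, (enum_type, text) in annotations.items()
    }


def apply_rule(index, expected_type, literal):
    """If `literal` is the canonical string of an instance of
    `expected_type`, return that instance's URL; else return `literal`."""
    return index.get((expected_type, literal), literal)
```

Keying the index on the expected enumeration type keeps a shortcut such as "True" from being rewritten where a different enumeration is expected.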
If we want to support strings, both approaches have their merits but I am not sure yet which one I prefer :-) But thinking more about all that, I think it's actually very hard to accept strings and entities at the same time. Consider the following example:
{
  "dayOfWeek": [
    "Monday",
    "http://schema.org/Tuesday",
    { "@id": "http://schema.org/Wednesday" }
  ]
}
Here, I would expect So maybe in the case of very well-identified entities, we should only accept URIs to avoid all those weird cases? To me, supporting both at the same time looks more and more confusing. Maybe all those considerations about accepting strings or not will go away if URIs become the canonical approach, and this is reflected in all examples on schema.org and other documentation. |
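One way to see the ambiguity in the example above is to classify each of the three values under the two possible readings of plain strings. This is a toy classifier, not a JSON-LD processor; the function name `interpret` and the `coerce_strings_to_id` flag are assumptions of the sketch.

```python
def interpret(value, coerce_strings_to_id):
    """Classify one dayOfWeek value as ("iri", ...) or ("literal", ...).

    `coerce_strings_to_id` models a context that treats plain strings
    as identifiers; without it, plain strings stay text literals.
    """
    if isinstance(value, dict) and "@id" in value:
        return ("iri", value["@id"])
    if isinstance(value, str):
        kind = "iri" if coerce_strings_to_id else "literal"
        return (kind, value)
    raise TypeError(f"unsupported value: {value!r}")


values = [
    "Monday",
    "http://schema.org/Tuesday",
    {"@id": "http://schema.org/Wednesday"},
]

for flag in (False, True):
    print(flag, [interpret(v, flag) for v in values])
```

Without coercion the first two values are literals and only the third is an entity; with coercion all three are identifiers, but "Monday" is then a relative reference rather than a schema.org URL. Neither reading treats the three values uniformly.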
I think the problem of seeing URIs as literal values is a general one in schema.org data, so I think we should handle it via a generic approach, not a specific one for this use-case. @danbri and @pmika will likely have data on this, but off the top of my head, I would guess there is quite some markup that uses URIs as strings, like so In my opinion, this type of bug should be handled by a tolerant parser and not at the level of the vocabulary. BTW, I am not proposing to change the range of the property to Enumeration OR Text, I am just suggesting a mechanism for defining a canonical string for the enumerated values. You are right that adding Text to the range of dayOfWeek and updating the JSON-LD context accordingly would indeed cause problems with JSON-LD data that uses URIs without { "@type": "@id" } . |
Sorry for taking so long before commenting again :-/
Big 👍
Again, big 👍 Is it fair to say that there is some consensus around
If that's the case, #923 is already implementing that. |
A strong 👍 for this:
|
I've just been re-reading all this, trying to figure out what consensus we have achieved so far - if any! Here's a sketch of a checklist -
So - where are we today? My preference for this release would be to make the obvious and seemingly widely agreed changes needed to address 1.), 2.) and 3.) and defer 4.) for a site-wide treatment of the issue. I'm not seeing any clear consensus on what exactly to do around strings-vs-URLs, but we do all seem to agree that there is scope for improvements that could make life easier for publishers around enumerated value URLs/strings, and that could clarify what is expected of consumers. It would be very unfortunate to do one thing here on dayOfWeek and something else for True, False and all the other enumerated values we have. I have opened a general issue #1094 - please take a look and let me know if I've missed anything. Does this seem a reasonable way forward? I'd like to capture the consensus we have so far in a release without losing the larger (i.e. larger than dayOfWeek) issue... |
Here are 3 examples. I suggest that the first is what we know the public would like to write, the second is what they probably ought to be writing if our JSON-LD context defaults to @id, and the last is the new canonical URL-based form that we seem to be agreeing towards. If this is correct I suggest that the use of @value and nesting { makes the 2nd example at least as complex as the 3rd, possibly more so. EXAMPLE-OH-1:
EXAMPLE-OH-2:
EXAMPLE-OH-3:
|
Status as far as I can see it: I have just now merged the related #923 from @betehess so we have Monday-Friday now (plus PublicHolidays). Also the JSON-LD context has @id for dayOfWeek, as discussed. Progress! However (since this came up) I do not think we have yet achieved anything like consensus that would provide proper official schema.org justification for the current Google practice of accepting simple strings like "Monday" in place of full URLs. The text in datamodel.html offers some kind of partial justification but I think we can do better. I believe agreement on this is still worth pursuing (via #1094 to handle other vocab areas too). If we fail I'll pass that decision back into Google and get the local-businesses document over there updated appropriately. |
Monday-Friday and PublicHolidays implemented. PTAL. |
I know it's an old topic, but I want to check whether this option would be acceptable if I need translation. |
For some reason, it's still relying on http://purl.org/goodrelations/v1#.
Would the following snippet be something we want to support?