This repository has been archived by the owner on Nov 11, 2019. It is now read-only.

Microdata's property ordering semantics are unclear (and perhaps unused) - can we simplify? #32

Closed
danbri opened this issue May 15, 2017 · 4 comments · Fixed by #74

@danbri
Contributor

danbri commented May 15, 2017

From https://w3c.github.io/microdata/#the-microdata-model

5.1 The microdata model

The microdata model consists of groups of name-value pairs known as items.

Each group is known as an item. Each item can have item types, a global identifier (if the vocabulary specified by the item types support global identifiers for items), and a list of name-value pairs. Each name in the name-value pair is known as a property, and each property has one or more values. Each value is either a string or itself a group of name-value pairs (an item). The names are unordered relative to each other, but if a particular name has multiple values, they do have a relative order.

Q: What does "they do have a relative order" vs "are unordered" actually mean? Did anyone implement against this distinction?

A test case to explore could be based on something like:

<div itemscope itemtype="http://schema.org/Book">
   <meta itemprop="bookFormat" content="EBook/DAISY3"/>
   <meta itemprop="accessibilityFeature" content="largePrint/CSSEnabled"/>
   <meta itemprop="accessibilityFeature" content="highContrast/CSSEnabled"/> 
   <span itemprop="author">
    <div itemscope itemtype="http://schema.org/Person">
      <span itemprop="name">Alice Aardvark</span>
    </div>
   </span>
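   <span itemprop="author">
    <div itemscope itemtype="http://schema.org/Person">
      <span itemprop="name">Zac Zebedee</span>
    </div>
   </span>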
</div>

The spec seems to say that the relative order of accessibilityFeature vs author on this Book is unimportant, whereas the values of accessibilityFeature are ordered relative to each other, and the ordering of the two authors listed is likewise considered significant in some sense. For example, perhaps a later accessibilityFeature declaration overrides an earlier one; or perhaps a first-listed author is implicitly said to be a more significant contributor. Microdata delegates such details to vocabularies such as Schema.org. Schema.org says that it does not attach meaning at this level. Does anyone else?

So - I would like to explore clarifications in this area. Neither Schema.org nor the earlier datavocabulary.org vocabulary assigns semantics to this kind of property ordering. At Google we extract schema.org and datavocabulary Microdata into re-order-able triples / graphs; our parser currently assumes other uses of Microdata follow this pattern. I suspect @gkellogg and other parser writers may have implemented structures that represent the property ordering, but I do not know of anyone making use of such facilities.

I suggest that "but if a particular name has multiple values, they do have a relative order" may lack implementations beyond parsers, i.e. in vocabularies and the publisher/consumer ecosystem. Is "parsers can handle this distinction" enough of an argument to preserve this aspect of Microdata, or can the spec be simplified in the light of experience here?

We might consider clarifying that the entire Microdata structure can be viewed as fully ordered, like any HTML, when considered in the context of its life within a larger HTML document. This can be very important for use cases such as editors. However, we might choose to say that order is not significant / meaningful when considering Microdata as a carrier of factual claims.

One way to state this idea would be to try to agree that any circumstances that are captured by the above test case ought to also be equally accurately described by the following test case (in which I have reordered everything):

<div itemscope itemtype="http://schema.org/Book">
   <span itemprop="author">
    <div itemscope itemtype="http://schema.org/Person">
      <span itemprop="name">Zac Zebedee</span>
    </div>
   </span>
   <span itemprop="author">
    <div itemscope itemtype="http://schema.org/Person">
      <span itemprop="name">Alice Aardvark</span>
    </div>
   </span>
   <meta itemprop="accessibilityFeature" content="highContrast/CSSEnabled"/>
   <meta itemprop="accessibilityFeature" content="largePrint/CSSEnabled"/>
   <meta itemprop="bookFormat" content="EBook/DAISY3"/>
</div>

These distinctions are a bit easier to state for languages that explicitly extract into atomic triples, but I think we can find a way.
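
To make the distinction concrete, here is a rough Python sketch (not taken from any real parser) that compares the two test cases under both readings. The (name, value) pairs are written out by hand rather than parsed from the HTML, each author is simplified to a name string instead of a nested Person item, and the second author of the first test case is assumed to be Zac Zebedee, as in the reordered version. The function names are made up for illustration.

```python
from collections import Counter

# Hand-written (name, value) pairs for the Book item, in document order.
# Assumption: authors are flattened to their names, and the second author
# of the first test case is Zac Zebedee.
original = [
    ("bookFormat", "EBook/DAISY3"),
    ("accessibilityFeature", "largePrint/CSSEnabled"),
    ("accessibilityFeature", "highContrast/CSSEnabled"),
    ("author", "Alice Aardvark"),
    ("author", "Zac Zebedee"),
]
reordered = [
    ("author", "Zac Zebedee"),
    ("author", "Alice Aardvark"),
    ("accessibilityFeature", "highContrast/CSSEnabled"),
    ("accessibilityFeature", "largePrint/CSSEnabled"),
    ("bookFormat", "EBook/DAISY3"),
]


def per_name_ordered(pairs):
    """Current spec text: names are unordered, values of one name are ordered."""
    grouped = {}
    for name, value in pairs:
        grouped.setdefault(name, []).append(value)
    # Dict equality ignores key order; list equality does not.
    return grouped


def fully_unordered(pairs):
    """Proposed simplification: a bag of (name, value) claims, like triples."""
    return Counter(pairs)


# Under the current text the two test cases describe different items...
print(per_name_ordered(original) == per_name_ordered(reordered))  # False
# ...while under the unordered reading they are the same set of claims.
print(fully_unordered(original) == fully_unordered(reordered))    # True
```

A consumer that only ever builds the second structure, as a triple-based pipeline would, cannot observe the per-name value order at all.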

Does anyone know of a use of Microdata which depends upon "but if a particular name has multiple values, they do have a relative order."?

/cc @tmarshbing @nicolastorzec @chaals, @betehess (and @pmika for old time's sake) for Bing, Yahoo, Yandex, Apple perspective on this.

@gkellogg
Member

We pushed ideas for ordering multiple property values to the vocabulary registry, and did not do ordering for schema.org.

@gkellogg
Member

IMO, the only reasonable use for Microdata is schema.org, which currently does not order value output at all. The simplest thing is to say that multiple property values are unordered.

@chaals
Collaborator

chaals commented Jul 4, 2017

unordered works for me, but @danbri are you saying there is a real use case and requirement for specified ordering?

@danbri
Contributor Author

danbri commented Jul 12, 2017

I was thinking to say something like "While it is harmless for e.g. authoring applications to make use of the enclosing document's natural ordering, publishers are cautioned that the output of microdata parsing does not preserve order unless dedicated structures (such as schema.org/ItemList or rdf:Collection) are explicitly used."

However, "the order of properties is not significant and authors must not rely on it being preserved" works for me. Any serialization syntax by definition has an order, so maybe it needn't be noted that using that order in editors is reasonable.
