Validator.w3.org/nu allowing "content" attribute #1210

Open
AndySky21 opened this issue Aug 15, 2021 · 2 comments

Comments

@AndySky21

AndySky21 commented Aug 15, 2021

URL being validated or code to reproduce error:

<div itemscope itemtype="https://schema.org/Product">
  <span itemprop="name">Kenmore White 17" Microwave</span>
  ...
  <div itemprop="offers" itemscope itemtype="https://schema.org/Offer">

    <!--price is 1000, a number, with locale-specific thousands separator and decimal mark -->
    <!-- the $ character is marked up with the machine-readable code "USD" -->
    <span itemprop="priceCurrency" content="USD">$</span>
    <span itemprop="price" content="1000.00">1,000.00</span>
  </div>
  ....
</div>

Validation results:

None relevant
(error messages due to incomplete code snippet: no lang attribute, no <!DOCTYPE html> declaration, no <title> in document head)

Expected results:

Error: Attribute content not allowed on element span at this point.

From line 7, column 5; to line 7, column 49

" -->↩    <span itemprop="priceCurrency" content="USD">$</spa

Attributes for element span:
Global attributes
=====
Error: Attribute content not allowed on element span at this point.

From line 8, column 5; to line 8, column 45

span>↩    <span itemprop="price" content="1000.00">1,000.

Attributes for element span:
Global attributes

(as shown on validator.nu result for the same snippet)

Further reasoning

A global content attribute does not exist in HTML. AFAIK it is only allowed on the <meta> element.
The content attribute is defined in the RDFa spec (and as such it can be specified on any element). To be allowed in HTML, though, it should only be present along with a consistent RDFa syntax in the same subtree (namely, a typeof attribute on an ancestor element and a property attribute on the same element); otherwise it is just an incorrect attribute in no namespace.
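
The rule proposed above can be sketched as a small standalone check. This is only an illustration of the reasoning, assuming Python's standard html.parser module; the names ContentAttributeCheck and flag_isolated_content are hypothetical, and this is not how the Nu checker or any RDFa processor is implemented.

```python
from html.parser import HTMLParser

# Void elements never get an end tag, so they are not tracked as open elements.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class ContentAttributeCheck(HTMLParser):
    """Hypothetical check: flag `content` used outside the RDFa context
    described in this issue (property on the same element, typeof on an ancestor)."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # (tag, has_typeof) for each currently open element
        self.flagged = []        # (line, tag) for each element that breaks the rule

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        if "content" in names and tag != "meta":
            typeof_ancestor = any(has_typeof for _, has_typeof in self.open_elements)
            if not ("property" in names and typeof_ancestor):
                self.flagged.append((self.getpos()[0], tag))
        if tag not in VOID:
            self.open_elements.append((tag, "typeof" in names))

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        # Close the nearest matching open element and anything nested inside it.
        for i in range(len(self.open_elements) - 1, -1, -1):
            if self.open_elements[i][0] == tag:
                del self.open_elements[i:]
                break


def flag_isolated_content(html):
    checker = ContentAttributeCheck()
    checker.feed(html)
    checker.close()
    return checker.flagged


# Microdata markup from the report: both spans should be flagged.
MICRODATA = """<div itemprop="offers" itemscope itemtype="https://schema.org/Offer">
  <span itemprop="priceCurrency" content="USD">$</span>
  <span itemprop="price" content="1000.00">1,000.00</span>
</div>"""

# Assumed RDFa equivalent (not part of the report): nothing should be flagged.
RDFA = """<div vocab="https://schema.org/" typeof="Offer">
  <span property="price" content="1000.00">1,000.00</span>
</div>"""

print(flag_isolated_content(MICRODATA))
print(flag_isolated_content(RDFA))
```

The sketch deliberately follows the wording of this report, not the RDFa Core processing rules, so it ignores vocab, prefix, and the other RDFa attributes.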

Full disclosure

The code snippet above is an excerpt from the example shown on the Schema.org page for the Offer vocabulary, with Microdata markup selected. In such a case, use of the <data> element would have been expected, but Schema.org does not seem to know that syntax (a bug has already been filed for that case; no action expected).

@sideshowbarker
Contributor

sideshowbarker commented Aug 15, 2021

In order to be allowed in HTML, though, it should be only present along with a consistent RDFa syntax in the same subtree (namely, a typeof attribute on an ancestor element and a property attribute on the same element, otherwise it is only an incorrect attribute in no namespace.

That’s not what the HTML+RDFa spec says.

At https://www.w3.org/TR/html-rdfa/#extensions-to-the-html5-syntax, the HTML+RDFa spec simply says this:

For the avoidance of doubt, the following RDFa attributes are allowed on all elements in the HTML5 content model: @vocab, @typeof, @property, @resource, @prefix, @content, @about, @rel, @rev, @datatype, and @inlist.

So the HTML+RDFa spec doesn’t at all say that the content attribute “should be only present along with a consistent RDFa syntax in the same subtree”. In fact it pretty much states the opposite, emphatically.

I guess the HTML+RDFa spec rightly should say something like “the content attribute should be only present along with a consistent RDFa syntax in the same subtree”. But it doesn’t.

In general, there’s a long list of things the HTML+RDFa spec should say, if it had been specified properly. In implementing RDFa support in the checker, I already had to make a number of guesses about what the spec seemed to be trying to say but didn’t actually say. But I’m now long past doing that. I’ve already invested way more time in adding RDFa support in the checker than RDFa as a technology actually merits.

So I’m completely unenthusiastic about spending any further time making any change to the checker to paper over yet another deficiency in the HTML+RDFa and RDFa specs.

@AndySky21
Author

AndySky21 commented Aug 15, 2021

But content has absolutely no meaning outside RDFa usage. Given that, RDFa 1.1 validation in Validator.w3.org/nu is completely unreliable, because one can simply put some vocab, typeof, property, resource, prefix, content, about, rel, rev, datatype, and inlist here and there for the lulz. The validator checks that attribute values conform to the expected type (e.g. URI), and that's all.

However, I have to correct your quote:
At https://www.w3.org/TR/html-rdfa/#extensions-to-the-html5-syntax, the HTML+RDFa spec correctly says this:

If HTML+RDFa document conformance is desired, all RDFa attributes and valid values (including CURIEs), as listed in RDFa Core 1.1, Section 2.1: The RDFa Attributes, must be allowed and validate as conforming when used in an HTML4, HTML5 or XHTML5 document. For the avoidance of doubt, the following RDFa attributes are allowed on all elements in the HTML5 content model: vocab, typeof, property, resource, prefix, content, about, rel, rev, datatype, and inlist.

Text formatting is mine. Notice the causality relationship.

At https://www.w3.org/TR/html-rdfa/#document-conformance, the spec says this:

The following conformance criteria apply to any HTML document including RDFa markup:

  • All document conformance requirements stated as mandatory in the HTML5 specification must be met.
  • The appropriate Extensions to the HTML5 Syntax, as described in this document, must be considered valid and conforming....
  • All HTML5 elements and attributes should be used in a way that conforms to [html5]. All RDFa attributes should be used in a way that is conforms to [rdfa-core] and this document.

This means that misusing RDFa Core 1.1 attributes should be grounds for non-conformance according to HTML+RDFa 1.1.
Now, an isolated content attribute is not immediately recognized as an error by an RDFa 1.1 validator. However, no Structured Data can be extracted from such a source. It is merely "conforming" because no data means no errors, and because a short-circuit between specs dictates that an isolated content attribute is valid because:

  1. RDFa attributes are valid global HTML5 attributes
  2. thus making the document a valid HTML5 document and therefore a valid HTML5 + RDFa 1.1 document
  3. thus making it acceptable to an RDFa validator (in the same way it would be valid if it contained Microdata attributes, or attributes in no namespace on an <embed> element: a valid document with no meaning, a Structured Data and Rich Snippet fnord).
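
The "no data means no errors" point can be illustrated with a toy extractor. This is a minimal sketch, assuming that only elements carrying both property and content yield a value; it is a stand-in for illustration, not the RDFa Core processing algorithm, and the names PropertyContentPairs and extract_pairs are hypothetical.

```python
from html.parser import HTMLParser


class PropertyContentPairs(HTMLParser):
    """Toy stand-in for an RDFa processor (NOT the RDFa Core algorithm):
    collects (property, content) only where both attributes are present."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        found = dict(attrs)
        if "property" in found and "content" in found:
            self.pairs.append((found["property"], found["content"]))


def extract_pairs(html):
    parser = PropertyContentPairs()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Isolated content attribute, as in the Microdata snippet from the report.
ISOLATED = '<span itemprop="price" content="1000.00">1,000.00</span>'
# Assumed RDFa counterpart, with property on the same element.
WITH_PROPERTY = '<span property="price" content="1000.00">1,000.00</span>'

print(extract_pairs(ISOLATED))
print(extract_pairs(WITH_PROPERTY))
```

Under that assumption the isolated attribute produces nothing to extract and nothing to complain about, which is the situation described in the list above.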

Logic dictates that there was no intention of using RDFa in such a context in the first place, which breaks the cause/effect chain made explicit in the passage your excerpt was quoted from:
a. "The following conformance criteria apply to any HTML document including RDFa markup: ...All RDFa attributes should be used in a way that is conforms to [rdfa-core]".
b. If HTML+RDFa document conformance is desired, all RDFa attributes and valid values ... must be allowed and validated.

I imagine that the SHOULD predicate mentioned twice above makes room for not using these features at all ("there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course", as per RFC 2119), but no mention is made of "particular circumstances to ignore" correct use of RDFa attributes.


As a final note, at this point I guess those attributes are reported as errors in Validator.nu because RDFa explicitly mentions HTML5 and not the HTML Living Standard.
