What does nested according to the specification mean in SC 4.1.1 #978

mraccess77 · 2019-11-29T01:14:50Z

Refer to the following thread. Does the SC's phrase "nested according to the specification" mean nested according to the syntax of opening and closing tags, or nested according to the specification's rules about which tags can't appear within which other tags?

https://lists.w3.org/Archives/Public/w3c-wai-ig/2019OctDec/0113.html

JAWS-test · 2019-11-30T09:48:06Z

I have read the WCAG documents on this subject and could not find a clear answer to this question. I therefore propose:

to discuss this question, decide on it, and document the result for WCAG 2.x in the Understanding document
to abolish SC 4.1.1 in WCAG 3.x (see Deprecating SC 4.1.1 #770)

I suspect that the WCAG parsing only bookmarklet is often used for testing. It interprets SC 4.1.1 as covering not only correct nesting of tags, but also whether child elements are permitted by the HTML specification.

Unfortunately I can't find any information about whether the DOM or the source code should be checked:

Some of the SC 4.1.1 problems do not occur in the DOM because they are automatically fixed by the browser.
The source code is often not relevant these days, because it only contains JavaScript instructions for creating the DOM and hardly any HTML.

The question is also to what extent AT uses DOM or source code or whether they use the Accessibility APIs of the operating systems.

alastc · 2019-12-06T15:41:33Z

I've always interpreted that as meeting the spec of the language used. I.e. if you're writing HTML5, according to that spec. (I'd like to avoid tangents about multiple HTML specs please!)

Therefore the nesting should be according to the rules of that spec, so I agree with the parsing bookmarklet's approach.

DOM vs source is tricky, as you have to use the DOM to work out what the source code (including scripts) actually is, and then work out if that matches the spec. Another good reason to deprecate 4.1.1, as if you are going by DOM then you should look at impact not spec.

My understanding is that AT generally uses the accessibility API of the system, but there are some odd cases which can use direct access (e.g. Dragon possibly?).

dd8 · 2020-02-05T09:56:13Z

If there are still ATs using direct access to the source, then they're very unlikely to apply JavaScript or CSS changes to the source. This may impact assumptions about JavaScript/CSS elsewhere in WCAG.

The reason I think direct access ATs are unlikely to apply JavaScript/CSS is that it's a lot of effort - at least an order of magnitude harder than pulling the information from the browser DOM or accessibility API. To apply JavaScript you need:

an HTML parser to find <script> elements,
networking to download external script src files
something to manage a fake DOM
a JavaScript parser
a JavaScript runtime engine
a garbage collector

Even if you pull in an existing implementation like Chrome's V8 JavaScript engine you still have a lot of integration work to do on the first 3 items above.

To apply CSS you need:

an HTML parser to find <style> and <link> elements,
networking to download external link rel=stylesheet files
something to manage a fake DOM
a CSS parser
a CSS engine that applies the cascade to the fake DOM

To apply JS/CSS you basically have to build most of a browser except the rendering portion.

detlevhfischer · 2020-05-08T13:40:06Z

Even after applying the "Parsing only" or TPG's Validate Page bookmarklet to the W3C Nu validator results, there still seem to be a number of things flagged as errors that probably have little or no impact on accessibility, such as

Inappropriate attributes such as a name attribute on div or title attribute on svg
div used within strong or other inline elements

Then there are other cases where I am less confident that they are harmless, such as a div as child of ul.
Apart from custom attributes that have been discussed here #1078 and seem to be OK, I'd be curious what folks see as exemptions that do not violate the letter of 4.1.1?

patrickhlauke · 2021-05-10T17:55:07Z

x-ref #770

cstrobbe · 2022-06-16T13:46:49Z

This issue was triggered by a mailing list contribution by me but I hadn't seen it until it was referenced in a BIK-BITV issue (in German) today. My insistence on nesting based on syntax instead of nesting based on content models is based on the concept of well-formedness that informed discussions about the formulation of the success criterion in the years 2005-2008. Below are a few pointers to those discussions.

WCAG WG meeting minutes, 23 June 2005: resolution to remove all SC under Guideline 4.1 and to replace them with an editorial note that they require discussion and comments. Quote from the discussion leading up to that resolution: "Acknowledgement that well-formedness doesn't apply to SGML and a proposal at http://lists.w3.org/Archives/Public/w3c-wai-gl/2005AprJun/0841.html" (I had come to the WG with a background in using and teaching XML.)
WCAG WG meeting minutes, 10 November 2005: after a discussion on SGML content models, the working group accepts the following wording for SC 4.1.1: "Delivery units can be parsed unambiguously." ("Delivery units" would eventually be replaced with "Web pages".) The following definition of parsing was accepted along with it: "Parsing transforms markup or other code into a data structure, usually a tree, which is suitable for later processing and which captures the implied hierarchy of the input. Parsing unambiguously means that there is only one data structure that can result." Parsing into a correct tree requires correct syntax, not correct content models. XML well-formedness would have achieved essentially the same result, but only for XML-based formats. Well-formedness is based on syntax and can be checked by non-validating parsers, i.e. without reference to content models (DTD, XML Schema etc.).
WCAG WG meeting minutes, 17 November 2005: resolutions related to a draft of How to meet SC 4.1.1. Titles (placeholders) for proposed techniques: "Ensuring that unique ids are specified AND that opening and closing tags of all elements can be parsed unambiguously" (for HTML-based content) and "Ensuring that the delivery unit is well-formed AND that unique ids are specified" (for XML-based content). Again, nothing about correct nesting according to a spec's content models.
Understanding Success Criterion 4.1.1: Parsing (for WCAG 2.0) also discusses how well-formedness informed the wording of the success criterion: "Note: The concept of "well formed" is close to what is required here. However, exact parsing requirements vary amongst markup languages, and most non XML-based languages do not explicitly define requirements for well formedness. Therefore, it was necessary to be more explicit in the success criterion in order to be generally applicable to markup languages. Because the term "well formed" is only defined in XML, and (because end tags are sometimes optional) valid HTML does not require well formed code, the term is not used in this success criterion."

In other words, a correct understanding of the SC requires understanding the distinction between XML's concepts of well-formedness and validity. The parsing SC is based on the concept of well-formedness. Unfortunately, in non-XML-based languages, there are no tools to check syntax independently from validity (i.e. content models). This is why techniques for SC 4.1.1 rely on validation.
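The distinction can be illustrated with a non-validating XML parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module as the parser; the helper name is_well_formed and the two sample strings are made up for illustration. The parser rejects tags closed in the wrong order, but accepts a div inside a ul, because it knows nothing about HTML content models.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a non-validating XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Tags closed in the wrong order: a syntax (well-formedness) error.
misnested = "<p>1<b>2<i>3</b>4</i>5</p>"

# Syntactically fine, but div is not allowed as a child of ul
# in HTML's content model (a validity question, not a syntax one).
bad_content_model = "<ul><div>item</div></ul>"

print(is_well_formed(misnested))          # False
print(is_well_formed(bad_content_model))  # True
```

This only works for markup that is meant to be XML. As noted above, there is no equivalent syntax-only check for HTML, where end tags are sometimes optional.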

Note: This is a shortened version of Notes on the History of Success Criterion 4.1.1, which I wrote up on a personal website.

dd8 · 2022-08-06T07:18:58Z

One consideration here is the HTML 5 parser adoption agency algorithm:
https://html.spec.whatwg.org/multipage/parsing.html#adoption-agency-algorithm

This algorithm runs in two situations:

a) when tags are mis-nested and the document is not well-formed in the XML sense
https://html.spec.whatwg.org/multipage/parsing.html#misnested-tags:-b-i-/b-/i
b) when elements are well-formed in the XML sense, but are used where they're not allowed:
https://html.spec.whatwg.org/multipage/parsing.html#unexpected-markup-in-tables

For example, the img cannot appear as a direct child of table:

<table>
  <img src="test.png">
  <tr>
    <td>Cell</td>
  </tr>
</table>

So the parser moves the img outside the table (the HTML spec calls this foster parenting) and produces the following DOM:

<img src="test.png">
<table>
  <tr>
    <td>Cell</td>
  </tr>
</table>

The parsing algorithm also discards some elements that are well-formed in the XML sense, but used with a forbidden ancestor:
https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody

For example, this markup is well-formed in the XML sense:

<form>
  <input name="one"/>
  <form>
    <input name="two"/>
  </form>
</form>

but produces this DOM if parsed as HTML (1) because form cannot be nested inside form:

<form>
   <input name="one">  
   <input name="two">
</form>

(1) Documents are parsed as HTML if they're served with MIME type text/html. Documents are parsed as XML when served with MIME type application/xhtml+xml. The HTML is not transformed this way if the document is parsed as XML. In this case the document is loaded directly into the DOM by an XML parser and none of the HTML parsing algorithm is used. This is very much an edge case since fewer than 0.05% of pages are served as application/xhtml+xml
https://commoncrawl.github.io/cc-crawl-statistics/plots/mimetypes
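The XML case in footnote (1) can be sketched with a generic XML parser. This assumes Python's xml.etree.ElementTree as a stand-in for a browser's XML parser and omits the XHTML namespace for brevity, so it only illustrates the principle: the nested form survives parsing unchanged.

```python
import xml.etree.ElementTree as ET

# The nested-form markup from the example above.
markup = """<form>
  <input name="one"/>
  <form>
    <input name="two"/>
  </form>
</form>"""

root = ET.fromstring(markup)

# iter() walks the root and all descendants, so both form elements are counted.
form_count = sum(1 for _ in root.iter("form"))

# The inner form is still a direct child of the outer form.
inner_form = root.find("form")
inner_input_names = [el.get("name") for el in inner_form.findall("input")]

print(form_count)         # 2
print(inner_input_names)  # ['two']
```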

mraccess77 · 2023-03-23T13:57:14Z

Do I understand correctly that syntax issues we are discussing that would technically fail WCAG 2.0/2.1 4.1.1 would be the misnested ones with examples with nesting such that tags are closed and opened in the wrong order such as the example linked above <p>1<b>2<i>3</b>4</i>5</p> ?
I understand that we are placing a note in WCAG 2.0 and 2.1 understanding documents saying the SC is automatically met - but that is not a normative note.

If one was to use the nu validator from W3C - would they be looking for errors listed as "violates nesting rules."? I want to make sure that there is clear guidance on which nesting items can be ignored by the validator as related to the content model and which ones are syntactical in a way that anyone can differentiate.

cstrobbe · 2023-03-23T15:49:41Z

The HTML Validator reports many syntax issues that don't violate SC 4.1.1 (i.e. in the originally intended meaning), and since there are various types of syntax issues, these are described differently by the validator. I filter out the irrelevant ones using a bookmarklet based on Steve Faulkner's WCAG Parsing Bookmarklet.

Without a bookmarklet, you really need to understand both SC 4.1.1 and the validator's errors and warnings very well in order to know what violates the SC and what doesn't. The following are examples of failures:

End tag div seen, but there were open elements.
Stray end tag section.
End tag em violates nesting rules.
Duplicate attribute class.
Duplicate attribute id.
Duplicate ID search-1.
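A filter along these lines can be sketched as follows. This is not the bookmarklet's code. The patterns are derived only from the six example messages above, the name is_parsing_failure is hypothetical, and the content-model message in the sample list is illustrative. A real filter would need to cover many more validator messages.

```python
import re

# Patterns based only on the example failure messages listed above.
PARSING_PATTERNS = [
    re.compile(r"^End tag .+ seen, but there were open elements\.?$"),
    re.compile(r"^Stray end tag .+$"),
    re.compile(r"^End tag .+ violates nesting rules\.?$"),
    re.compile(r"^Duplicate attribute .+$"),
    re.compile(r"^Duplicate ID .+$"),
]


def is_parsing_failure(message: str) -> bool:
    """Return True if the message matches one of the syntax-level patterns."""
    return any(pattern.match(message) for pattern in PARSING_PATTERNS)


# Sample input: five syntax-level messages and one illustrative
# content-model message that the filter should drop.
messages = [
    "End tag div seen, but there were open elements.",
    "Stray end tag section.",
    "End tag em violates nesting rules.",
    "Duplicate attribute class.",
    "Duplicate ID search-1.",
    "Element div not allowed as child of element ul in this context.",
]

relevant = [m for m in messages if is_parsing_failure(m)]
print(len(relevant))  # 5
```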

dd8 · 2023-03-23T17:17:23Z

If it's helpful I can go through all the error states in the VNU parser used by the HTML Validator and produce a list of these, with corresponding validator messages. This list won't include content model errors, because those aren't produced by the parser. I had a quick look at the code and can see around 108 parser error states.

Once that's done someone can go through them and decide which ones map to 4.1.1

PS I'm quite familiar with the internals of the VNU parser (because I did a port of it from Java to C++) and have over 25 years professional experience of writing HTML parsers.

mraccess77 · 2023-03-23T18:54:54Z

Hi @dd8 I wouldn't want to ask you to do that given browser support and the direction of the note - but what you and others have already provided is helpful to understand the scope of what may be out there for the limited situations where it is important.

dd8 · 2023-03-23T19:50:11Z

Fair enough - if anyone needs to know which messages are reported by the parser you can find them in:

https://github.com/validator/htmlparser/blob/master/src/nu/validator/htmlparser/impl/ErrorReportingTokenizer.java
https://github.com/validator/htmlparser/blob/master/src/nu/validator/htmlparser/impl/TreeBuilder.java

The error reporting functions all have names with an err prefix like errNoSpaceBetweenAttributes, errDuplicateAttribute or errSlashNotFollowedByGt and contain the message strings reported by the validator.

Edit: there are a small number of content model errors reported by the parser (e.g. malformed tables and nested headings) because these are fixed by the parser. See #978 (comment) for details

alastc added 4.1.1 Parsing WCAG 2.0 labels Dec 6, 2019

JAWS-test mentioned this issue Jan 18, 2022

4.1.1 Validity vs. well-formedness #2186

Closed

Andreas-Englisch mentioned this issue Jun 16, 2022

Bewertung von 9.4.1.1 BIK-BITV/BIK-Web-Test#269

Open

cstrobbe mentioned this issue Jun 22, 2022

Proposal to Rephrase Success Criterion 4.1.1 #2525

Open

cstrobbe mentioned this issue Aug 28, 2022

New SC (4.1.4?): Native Child-Element Roles #2649

Open

mitchellevan mentioned this issue May 15, 2023

Content model nesting failures should not be 4.1.1 failures stevefaulkner/wcagparsing#11

Open
