Suggest adding a warning about outline algorithm #83

Open
stevefaulkner opened this Issue Sep 1, 2015 · 108 comments

Comments

@stevefaulkner
Contributor

stevefaulkner commented Sep 1, 2015

Currently the HTML standard does not provide any advice regarding the outline algorithm not being implemented. This has led some developers to believe that the outline algorithm has an effect in browsers and assistive technology, which it does not. This can lead to developers using markup patterns that don't convey document structure. Suggest adding a warning; for example, this is the warning in the W3C HTML spec:

There are currently no known implementations of the outline algorithm in graphical browsers or assistive technology user agents, although the algorithm is implemented in other software such as conformance checkers. Therefore the outline algorithm cannot be relied upon to convey document structure to users. Authors are advised to use heading rank (h1-h6) to convey document structure.

@domenic

domenic Sep 1, 2015

Member

I kind of feel we should either leave as-is or just remove the outline algorithm altogether...

@gsnedders

gsnedders Sep 1, 2015

An outline algorithm probably makes sense to keep around, given outlining is used in some sense in most screenreaders, as far as I'm aware. That said, it may well make sense to drop the current one (and the semantics that go along with it...).

@Hixie

Hixie Sep 1, 2015

Member

The algorithm just describes the semantics of the elements. If the tools aren't supporting the semantics, they're buggy and should be fixed. Changing the semantics would be a pretty drastic change to the spec, especially as people have been using these elements for years.

We can't just remove the outline algorithm, either. We need something to define the semantics of these elements. Even if we were to say that tools should ignore the semantics and just use the h1-h6 elements in the naive flat way (ignoring tree structure, as if we were back in the 90s), you'd still need the algorithm to be able to define the authoring conformance criteria (so that conforming documents only used h1-h6 in a manner consistent with the semantics). That's pretty silly though. The right solution is just to fix the tools.

@domenic

domenic Sep 1, 2015

Member

@Hixie, when you say "tools," do you include user agents and their accessibility bindings?

At some point we cannot claim that user agents are broken. They are instead rejecting our change request. https://www.w3.org/Bugs/Public/show_bug.cgi?id=25003 contains comments from both Firefox and Chrome accessibility developers explicitly rejecting the idea of implementing the outline algorithm in their accessibility bindings. It also outlines the history of JAWS removing their support. Although their reasoning may be wrong, it doesn't seem fruitful at this point to challenge them.

I think it would be better for the semantics in the spec to match the semantics already exposed through accessibility bindings in implementations. People have been using these elements for years, but either (a) they have been using them in a way supported by user agents, which contradicts the spec; or (b) they have been using them in the way specified, which means their content is broken in current user agents (for users who count on those accessibility bindings). Neither of these seems good.

@domenic

domenic Sep 1, 2015

Member

My last message wasn't entirely clear. I agree with @gsnedders and phrasing this as "removing the outline algorithm" was incorrect. Rather, we should update the outline algorithm to reflect implementations. In some ways this is a "revert" of the "change request" that proposed a sectioning-based outline algorithm; it differs only in degree from a4313d3, which was a revert of the change request 78f1994 to simplify selector case-sensitivity matching rules.

@Hixie

Hixie Sep 1, 2015

Member

If we want to require that authors use h1-h6 instead of being able to do the XHTML-style <h1>-everywhere, then we need two outline algorithms: one that describes how user agents are to act, and one that describes the restrictions that authors have to follow in order for their h1-h6 headers to not contradict the sectioning semantics. (And maybe a third one, that describes how an authoring tool could convert from the saner one-heading-element style to the legacy h1-h6 style for UAs.)

But IMHO that's a poor place to be in. I don't really see why accessibility tools couldn't expose the real semantics here. It doesn't require a complex algorithm (you only need "previous", "next", and "up" to be able to navigate the tree, and walking around the tree that way is pretty straight-forward as far as I can tell). Accessibility tools are notoriously slow about catching up to implementing new features, this algorithm is not that old by their time scales. (I mean, they still haven't implemented stuff from the 90s correctly, even though there's obvious usability gains to be had by doing so.)
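To make the "previous", "next", and "up" idea concrete, here is a minimal sketch of walking an outline tree using only those three local operations. `OutlineNode` and the helper functions are hypothetical names for illustration; they come from neither the spec nor any accessibility API, and the sketch says nothing about how a real tool would build the tree.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class OutlineNode:
    """Hypothetical outline entry: one heading plus its subsections."""
    title: str
    parent: Optional["OutlineNode"] = field(default=None, repr=False)
    children: list = field(default_factory=list)

    def add(self, title: str) -> "OutlineNode":
        """Append a subsection and return it."""
        child = OutlineNode(title, parent=self)
        self.children.append(child)
        return child


def up(node: OutlineNode) -> Optional[OutlineNode]:
    """Move to the enclosing section, or None at the root."""
    return node.parent


def _sibling(node: OutlineNode, offset: int) -> Optional[OutlineNode]:
    """Return the sibling `offset` positions away, or None if there is none."""
    if node.parent is None:
        return None
    siblings = node.parent.children
    index = next(i for i, s in enumerate(siblings) if s is node) + offset
    return siblings[index] if 0 <= index < len(siblings) else None


def next_sibling(node: OutlineNode) -> Optional[OutlineNode]:
    return _sibling(node, +1)


def previous_sibling(node: OutlineNode) -> Optional[OutlineNode]:
    return _sibling(node, -1)
```

Each operation only touches the node and its parent, which is the point being argued: moving around the outline does not require recomputing the whole tree.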

@domenic

domenic Sep 1, 2015

Member

Can you explain why you need two outline algorithms? Authors would use the sectioning elements the same way they are exposed in accessibility technologies: just like divs.

@gsnedders

gsnedders Sep 1, 2015

That ignores the fact that both Firefox and Chrome have explicitly refused to support it, and that algorithm is hella old by their standards (it's what, seven years old now?). I think the battle's lost at this point, sadly. I don't have strong opinions on what we should do, but the spec as it stands now is fiction and will remain fiction. Your argument that they should implement it because they're buggy per spec is trying to oblige behaviour by spec, and we know that's a fallacy when everyone is refusing to implement it.

@Hixie

Hixie Sep 1, 2015

Member

@domenic Consider the following:

<h1>A</h1>
<section>
 <p>aaa
 <h1>B</h1>
 <p>bbb
</section>
<p>bbb

What are the sections in that document? If the <section> element doesn't align with that answer, then what are the semantics of <section>?

@gsnedders I don't think the battle's been fought. Any time I've seen people say they don't want to do it (e.g. in the bug above) the reasons they've given don't actually fit the facts (e.g. I've heard complaints that it would be prohibitively expensive, but that's only if you recomputed the entire tree, which as far as I can tell is unnecessary). But in any case in my comment above I gave two paths: one that I think is the right path, and another path for the case where we give up on making accessibility tools give good results. We could go down the second path, certainly. It's not just removing text from the spec, though, as I described above.
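The disagreement over the example above can be shown with a small sketch that reports two levels for each heading: the flat rank taken from the tag name (what the thread says accessibility tools expose) and a level derived from sectioning depth. The sketch is a deliberate simplification and not the spec's outline algorithm: it assumes the nested level is simply the number of open sectioning elements plus the heading's rank, and `heading_levels` is a hypothetical helper name.

```python
from html.parser import HTMLParser

SECTIONING = {"article", "aside", "nav", "section"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class _HeadingLevels(HTMLParser):
    """Records (text, flat_level, nested_level) for every heading."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # number of currently open sectioning elements
        self._rank = None    # rank of the heading being read, if any
        self._text = []
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif tag in HEADINGS:
            self._rank = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._rank is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth = max(0, self.depth - 1)
        elif tag in HEADINGS and self._rank is not None:
            text = "".join(self._text).strip()
            # Assumption: nested level = sectioning depth + heading rank.
            self.headings.append((text, self._rank, self.depth + self._rank))
            self._rank = None


def heading_levels(html: str):
    parser = _HeadingLevels()
    parser.feed(html)
    parser.close()
    return parser.headings


doc = "<h1>A</h1><section><p>aaa<h1>B</h1><p>bbb</section><p>bbb"
for text, flat, nested in heading_levels(doc):
    print(f"{text}: flat level {flat}, nested level {nested}")
```

For the example document, both headings have flat level 1, while the depth-based reading puts "B" one level below "A".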

@domenic

domenic Sep 1, 2015

Member

My understanding is that, as implemented, section has no semantics, just like div. I am not sure that implementations have a concept of "sections of a document" as much as they have an accessibility tree plus an outline tree which consists of links into nodes in the accessibility tree. But I am not sure on that; presumably @stevefaulkner has done the research there.

@Hixie

Hixie Sep 1, 2015

Member

I'm not sure what it would mean for <section> to have "implemented" semantics. Semantics by their very nature are about the meaning of the elements, which is something for humans. It's how you get maintainable documents that different people who have never met can approach and understand.

@annevk

annevk Sep 2, 2015

Member

Yeah, I guess either you keep the current outline algorithm, or you obsolete <section>/<h1> & <hgroup>?

@domenic

domenic Sep 2, 2015

Member

Is the idea to keep the current outline algorithm but the output of the algorithm is only something that exists in authors' minds? If so I'd be fine moving it to some section with a preface like "If you want to assemble a mental outline, that does not match that displayed by screen readers, follow the following algorithm: ... NOTE: authors are advised not to author documents that produce outlines catering to this algorithm, but instead author documents catering to accessibility tools, which follow the algorithm in $cross-link-here, according to the priority of constituencies (users over authors)"

But that seems kind of pointless.

@annevk

annevk Sep 2, 2015

Member

It all depends on whether user agents will implement these elements. If they don't, we should scrap them and the outline algorithm can be simplified to what is supported. If they do, or we expect them to within the next five years or so, it might be worth waiting a little longer.

@domenic

domenic Sep 2, 2015

Member

Hmm, to be clear, what does "implement these elements" mean? They implement them as HTMLElement instead of HTMLUnknownElement, but they do not implement the accessibility mapping implied by the current outline algorithm, and as discussed up-thread at least a couple have publicly stated they are not planning to do so. (Are those the only two relevant requirements on implementations, or am I missing some?)

@annevk

annevk Sep 2, 2015

Member

There are some styling and parser requirements too. And there are some speculative CSS features that would build upon the outline algorithm. We would have to check. But styling and outline would be the most important aspects.

@domenic

domenic Sep 2, 2015

Member

Ah right, thanks for pointing those out.

Also, when you say "obsolete <section> etc.", we could give them "the <main> treatment" instead of "the <dir> treatment". I'm not sure there's any practical difference besides what section they go in, but it's worth pointing out.

@annevk

annevk Sep 2, 2015

Member

Perhaps it would end up like main since it still has some default ARIA semantics that might be useful. Not sure. I haven't studied this in detail, but I do agree with @Hixie that the fix isn't as simple as adding a note to the outline algorithm or dropping it altogether.

@stevefaulkner

stevefaulkner Sep 2, 2015

Contributor

@domenic section is mapped to a region role in browsers, but it is not exposed in the aural UI unless it has an accessible name.

There are 2 document scope navigation methods implemented across AT:

  • landmark navigation of header/footer/nav/main/form/section(only if accessible name present) elements.
  • heading navigation via h1-h6 elements.

Implementation of accessibility layer semantics for HTML element landmarks and h1-h6 is complete in Chrome, Firefox, and Safari.

Data on the utilisation of each can be found in webaim screen reader surveys
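As a rough illustration of the two navigation methods listed above, the following sketch collects the landmark elements and the h1-h6 headings from a document. It is a simplification under stated assumptions: the landmark list is taken directly from the comment, a section counts as named only if it carries a non-empty `aria-label` or `aria-labelledby` attribute (real accessible-name computation is more involved), and `navigation_targets` is a hypothetical helper name.

```python
from html.parser import HTMLParser

# Landmark elements as listed in the comment above.
LANDMARKS = {"header", "footer", "nav", "main", "form", "section"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class _NavigationTargets(HTMLParser):
    def __init__(self):
        super().__init__()
        self.landmarks = []   # tag names, in document order
        self.headings = []    # (level, text) pairs, in document order
        self._level = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in LANDMARKS:
            # Assumption: these two attributes stand in for "has an accessible name".
            named = bool(attributes.get("aria-label") or attributes.get("aria-labelledby"))
            if tag != "section" or named:
                self.landmarks.append(tag)
        elif tag in HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADINGS and self._level is not None:
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def navigation_targets(html: str):
    """Return (landmarks, headings) for the given markup."""
    parser = _NavigationTargets()
    parser.feed(html)
    parser.close()
    return parser.landmarks, parser.headings
```

Note that the sectioning elements never change the heading levels in this model: a heading's level is its h1-h6 rank and nothing else.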

@stevefaulkner

stevefaulkner Sep 2, 2015

Contributor

The reason for the warning is so authors are not misled into thinking that use of sectioning elements actually does anything for users who consume heading semantics to make sense of and navigate document content.

@JohnnyWalkerDesign

JohnnyWalkerDesign Sep 2, 2015

Adding a similar warning makes the most sense, IMO, at least until the Outline Algorithm is more widely adopted. Many authors most likely think that (correctly) placing an <h1> element both inside and outside an <article> will be understood by those using assistive technology.

@alastc

alastc Sep 2, 2015

I do quite a bit of accessibility training and regularly come across developers who think they should just use H1s (they don't always know about sectioning either).
I agree that it would be more elegant to use sectioning, but unless someone is going to campaign for UAs to implement it, then the spec should align with the reality.

@JohnnyWalkerDesign

JohnnyWalkerDesign Sep 5, 2015

I would love to help campaign for UAs to use it. It would make everyone's lives (developers and users) easier if they did.

@nhoizey

nhoizey Sep 14, 2015

As a web developer, I would also be really happy if the outline worked as intended with sectioning elements, at last.

Working with only h1 is much easier when you want to include (server-side or with Ajax) the same HTML fragment in several places of your pages, with appropriate hierarchy.

@domenic

domenic Dec 27, 2015

Member

Some Twitter discussion reminded me about this neglected issue. I wanted to summarize the action items here:

  • We are not interested in simply adding a warning like the W3C fork does. "Warning: ignore the following. Here is a bunch of normative text about outlines..." We have higher standards for our specs than that kind of self-contradictory patchwork. Instead, we should actually update the spec to reflect reality.
  • The outline algorithm should be rewritten or replaced to reflect @stevefaulkner's description of implemented AT mechanisms in #83 (comment). Namely, it should primarily inform about landmark navigation and heading navigation. (Or maybe only the latter of these?)
    • Maybe these are already specced somewhere we should be referring to?
    • It should reflect implemented heading level semantics as announced by screen-readers, not the ones derived from nesting level.
    • We should do some serious hands-on testing with a11y tools (or we could continue to lean on a11y experts to do so for us, but that is not very good) to figure out how to best phrase these things. For example it's not entirely clear to me whether headings are presented as a nested list, or as a flat list with heading level numbers. We should encourage authors to have a mental model that matches what's implemented.
  • With these in mind, we should carefully change the authoring guidance and semantics for related elements (article, section, nav, aside, h1-h6, hgroup, header, footer). In general I think we can keep the "spirit" of existing semantics, e.g. nothing needs to change about what a section "is". But the advice about how to use it relative to headings, or how to use markup patterns that give it an accessible name, and how that impacts the document outline, will need to be carefully reviewed.
    • One exception may be hgroup. But maybe we should leave it as-is for the first pass.

Potential future work:

  • Consider the fate of hgroup further. Does it impact a11y processing anywhere? Should it be repurposed, or obsoleted, or...? Does it have semantics independent of its a11y processing, like non-accessible-name'd sections could be said to?
  • Propose a <h> element that actually gets treated like people want <h1> to be treated. (If we can get the a11y teams of various browsers to implement, then we can merge it into the spec. But let's not add something until we have experimental implementations and commitments.)

All that said, this isn't that high on my priority list, or my employer's. If someone does have the time to devote to this, I'd be happy to help review patches.

@stevefaulkner

stevefaulkner Dec 27, 2015

Contributor

"Warning: ignore the following. Here is a bunch of normative text about
outlines..."

While there is normative requirement for UAs to implement the outline algorithm, many web developers have been led to believe it is implemented, and the whatwg spec continues to perpetuate the myth. However you choose to modify the spec to bring it closer to reality will be an improvement.

@sideshowbarker

sideshowbarker Dec 28, 2015

Member

I agree with many of the comments from @Hixie and @annevk in this thread, especially the comment from @Hixie that “We can't just remove the outline algorithm” without needing to make other changes as a consequence, and the question “Yeah, I guess either you keep the current outline algorithm, or you obsolete <section>/<h1> & <hgroup>?”.

But all that said, I have over the years learned a huge amount of things from @stevefaulkner around the problems that some things in the spec cause for AT users, and I think others should read his comments here very carefully. We’re here to solve existing real problems for real users—not to hypothetically solve problems for some of them if we could somehow just get browser implementors to see things our way and implement what we’ve specced, or get AT vendors to fix their horribly broken/buggy tools.

In that spirit I agree very strongly with the implicit goals in the latest comment that @domenic posted and with his concrete suggestions there about how to get progress here. See the related IRC discussion.

Some ways I could help with this might be:

  • Experimentally implementing a Show Outline feature in the Nu HTML Checker that shows a “Here’s what the outline of your document looks like to AT users in practice” view—in parallel to or even in place of the current Show Outline view the checker provides, which shows what the outline looks like according to the outline algorithm in the HTML spec.
  • Adding further experimental warnings to help authors avoid using H1-H6 headings in any way bad for AT users; basically that amounts to uses of H1-H6 in ways different from how we told authors they should use them before we added section and article and allowed nested H1s to be used within them, and allowed uses of H2-H6 that break the existing simple hierarchical use of them.

As far as the second item above, I have already implemented experimental support in the checker for reporting (mis)use of H1 as anything other than a top-level head, and that’s been deployed in the checker for quite a long time now, and I think it’s been helping. But I’d like to help more if I can. I want to make the HTML checker be a tool that helps developers avoid making uninformed authoring choices that are going to cause problems in practice for real users.

@bkardell

bkardell Mar 1, 2018

This is not true. The proposed change actually makes it a conformance error to skip a heading level.

The "why" of not skipping heading levels that I have always heard and explained has to do with the AT, not the markup. There, even if you don't in markup - this creates skipped heading levels in AT, right? Again, I want to be super clear that I am not against this change. I pondered hard how to reply because I know the 'concerns' sound kind of negative, but they're mostly just about asking for a lucid explanation from a11y folk who support this on this seeming dissonance and messaging this well.

@jakearchibald

jakearchibald Mar 1, 2018

Collaborator

simply a thing that my QA/a11y depts might identify as 'bad' yesterday, suddenly isn't. From my understanding, at least. I'd like to have an explanation to hand them, that's it.

Did you see my stringArray.join('') example? Doesn't this answer your question?

Sure, but isn't the whole current proposal actually based on sectioning plus h1s? I'm showing what I thought is your actual use case.

I was taking from your "as it stands" example. You were giving an example of something that's good practice today, but it doesn't seem to be good practice today.

I don't think I'm helping here, so I'll step back.

@bkardell

bkardell Mar 1, 2018

@annevk has just explained to me a misunderstanding I was operating under with regard to @stevefaulkner's comment

Advice we provide to developers is "do not skip heading levels"
<h1> → <h3> is bad
<h1> → <h2> → <h3> is good
If the algorithm produces skipped levels because headingless sectioning elements are taken into account in the calculation of a heading's level, it codifies a practice we tell developers to avoid.

While the algorithm can, in theory, create skipped levels, they wouldn't be any more 'acceptable' than they were before, and the ways that would create them would now be flaggable as conformance errors. Sorry for any noise/confusion on that item, that's enough for me.

@stevefaulkner

stevefaulkner Mar 1, 2018

Contributor

Note: the W3C Nu markup checker has warnings (and has had for some time) for sections missing headings, use of multiple h1s, etc.
For example:
https://validator.w3.org/nu/?showsource=yes&showoutline=yes&doc=https%3A%2F%2Fs.codepen.io%2Fstevef%2Fdebug%2FNyJKJz

@annevk

annevk Mar 1, 2018

Member

Given the requirement in the proposed change that you cannot skip heading levels:

Each heading following another heading lead in document headings must have a heading level that is less, equal, or 1 greater than lead's heading level.

I don't think we need to require that sectioning content has a heading, unless there's a concern with that on its own, but then that's probably best discussed in a new issue.

Not using multiple headings with heading level 1 is something we could consider adding, though currently the specification says this is fine and even includes an example that does that. I think that therefore that's also best discussed separately.
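
For illustration only, the no-skipping requirement quoted above can be sketched in a few lines of Python. This is a minimal sketch, not text from the proposal: the function name `find_skipped_levels` is hypothetical, and it assumes "following" means the immediately preceding heading in document order.

```python
def find_skipped_levels(levels):
    """Return the indexes of headings that skip a level.

    `levels` is the list of heading levels in document order.
    Assumption: each heading is compared with the heading right before it.
    A heading may be less than, equal to, or exactly 1 greater than that one.
    """
    return [
        i
        for i in range(1, len(levels))
        if levels[i] > levels[i - 1] + 1
    ]


# h1, h2, h3, back to h1, then h2: nothing is skipped.
print(find_skipped_levels([1, 2, 3, 1, 2]))
# h1 followed directly by h3: the second heading (index 1) is flagged.
print(find_skipped_levels([1, 3]))
```

Going back up by any number of levels is allowed under this reading; only a jump of two or more levels deeper is reported.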

@alastc

alastc Mar 1, 2018

Not sure I'm following entirely, but there is an exception I'd like to check.

A fairly common pattern is to have one or more headings above/before the H1.

E.g.

  • H2: Site menu
  • H1: Main heading
  • H2: Section 1

etc. Such as http://www.bbc.co.uk/news/live/uk-43202018

This is a desirable pattern from an accessibility point of view. Is that still valid with this new approach?

@annevk

annevk Mar 1, 2018

Member

@alastc currently

The first heading within document headings must have a heading level of 1.

forbids that, but it seems reasonable to change that to require at least one heading to have a heading level of 1 and not require a particular position. Thanks for raising that.

@bkardell

bkardell Mar 20, 2018

@jakearchibald mentioned:

I'd rather we bumped the level of all headings according to the number of ancestor sectioning elements than introduce something extra like heading-policy.

Sorry to add more noise on this, but for clarity: Is this on the table or off the table? My reading is probably incorrect, but I understood that there were reasons to not want to do that? The reason I am asking is that @jakearchibald's basic use case is pretty much the reality of most CMSes I know of - different authored content is going to keep on using 'flat' h1...h6 and I think that ideally we'd like them to "make sense" like this?

@rehierl

rehierl Apr 11, 2018

@annevk
Allow me to recap the main points of your proposal:

  1. The heading level of h1 elements depends on its placement: 1 + the number of ancestor sectioning content elements.
  2. The heading level of an h2-h6 element is constant (fixed to the number value).
  3. The lower a heading's heading level is, the more important the heading is.
  4. "Each heading following another heading lead in document headings must have a heading level that is less, equal, or 1 greater than lead's heading level."

I am having difficulty understanding (4). Could you please rephrase it so that its meaning is clearer?

<body>
  <h1> A </h1>            toc-1
  <section>               =====
    <h1> B </h1>          1. A
    <section>             1.1. B
      <h1> C </h1>        1.1.1. C
      <h2> D </h2>        1.2. D
    </section>
  </section>
</body>

If I understand your approach correctly, then the heading level of heading C is 3. And because the heading level of the h2 element is constant, its heading level is still 2. So, is "toc-1" an accurate representation of the intended table-of-contents listing (because the h2 element is now more important than the h1 element), or is it supposed to be something else (e.g. undefined because non-conformant)?
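
To make points (1) and (2) concrete, here is a rough Python sketch that computes heading levels for the fragment above. It is only an illustration of the recap, not an implementation from the proposal: the class and function names are hypothetical, it uses the standard library's `html.parser`, and it assumes "sectioning content" means `article`, `aside`, `nav` and `section`.

```python
from html.parser import HTMLParser

SECTIONING = {"article", "aside", "nav", "section"}  # assumed sectioning content
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingLevels(HTMLParser):
    """Collects (text, level) for every heading, in document order."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # number of open sectioning content ancestors
        self.level = None   # level of the heading currently being read
        self.text = []
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif tag in HEADINGS:
            n = int(tag[1])
            # Point 1: an h1 depends on nesting. Point 2: h2-h6 are fixed.
            self.level = 1 + self.depth if n == 1 else n
            self.text = []

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth = max(0, self.depth - 1)
        elif tag in HEADINGS and self.level is not None:
            self.headings.append(("".join(self.text).strip(), self.level))
            self.level = None

    def handle_data(self, data):
        if self.level is not None:
            self.text.append(data)


def heading_levels(markup):
    parser = HeadingLevels()
    parser.feed(markup)
    parser.close()
    return parser.headings


EXAMPLE = """
<body>
  <h1> A </h1>
  <section>
    <h1> B </h1>
    <section>
      <h1> C </h1>
      <h2> D </h2>
    </section>
  </section>
</body>
"""

print(heading_levels(EXAMPLE))
```

Under these assumptions the sketch reports A at level 1, B at 2, C at 3 and D at 2, which is the situation the question is about: the h2 ends up one level above the h1 that precedes it.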

@annevk

annevk Apr 11, 2018

Member

@bkardell my impression is that there's too much content out there that uses h2-h6 without regard for their sectioning content element ancestors so adjusting them would do too much damage to existing content.

@rehierl 4 means that you cannot skip heading levels when the heading level increases relative to a previous heading. And toc-1 is indeed accurate (and conforming).

@LJWatson

LJWatson Apr 11, 2018

@annevk
“my impression is that there's too much content out there that uses h2-h6 without regard for their sectioning content element ancestors so adjusting them would do too much damage to existing content.”

It would, but a question worth asking is what impact that damage would actually have. To some extent this idea is based on the assumption that most/many pages have a useful heading hierarchy, and that introducing these changes would therefore break it.

My own experience is that heading hierarchies are almost always broken as it is. So perhaps we should be asking whether the damage this algorithm might cause is better or worse than the existing status quo?

My (entirely unscientific) hunch is that it would not make things worse. If that assertion can be backed up with evidence, then the question is whether it is worth exchanging one kind of broken for another, as a means of getting everyone to the point where things get easier for authors and better for users?

@annevk

annevk Apr 11, 2018

Member

@LJWatson the other reason for avoiding adjusting h2-h6 is that we have styling in place for h1 (based on nesting depth) and we cannot add equivalent styling for h2-h6 (and we also cannot change it for h1). It also seems like an easier pitch to say that folks should just use h1 and sectioning content elements.

@alastc

alastc Apr 11, 2018

It's worth examining, from the summary I'd expect these sites to be fine:

  • H1s only with good section nesting for levels.
  • An H1 at the top (not in nested sections) and then H2-H6 properly.

Sites that would not provide a good experience would be:

  • Any site using a mix of (multiple) H1s and H2-H6s will be broken as per @rehierl's example above.
  • Any site using H1s with poor section structure.

I'm trying to think of any in-between structures, for example:

<body>
  <banner>
     <h2>Nav heading</h2>
  </banner>
  <main>  
     <h1>Page heading</h1>
    <article>
      <h2> C </h2> 
      <h3> D </h3>
    </article>
  </main>
</body>

I think that would be fine, but if the structure was main > section > h1, that would be an H2? That seems like an easy mistake to make.

It feels like you really have to go one way or the other, mixing multiple H1s with H2-6s is where it would go wrong. I'm guessing someone already came up with the idea of detecting H2-6 and switching to classic heading-levels?

@annevk

annevk Apr 11, 2018

Member

@alastc switching is too expensive (and could also lead to very weird experiences on slow loading pages).

@bkardell

bkardell Apr 11, 2018

I'm guessing someone already came up with the idea of detecting H2-6 and switching to classic heading-levels?

Yes, but because this is difficult and expensive and simultaneously also a very real thing that will happen with pretty much any existing tool or CMS (because visual editors, markdown, etc think of text in a more traditional 'flat' way - that's where our original ideas about them come from in fact) I offered that a kind of indicator that let you know 'this element contains that kind of stuff' would be a way to potentially bridge the gap in a way that seems achievable both from an implementation standpoint and an 'authors could actually accomplish this' standpoint. Effectively, the current proposal says that h1's level becomes the section depth - this would allow that all the levels inside an element with such an indicator become section depth + (tag level -1). It seems pretty easy to make as a custom element or polyfill (in fact, I'm playing with a version of it in a project right now) so I'm not trying to push that it has to be a part of this, but it does seem like without it it will still be very hard for most authors to use common tools to create good headings.
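
As a rough illustration of the arithmetic described above (a sketch only, with hypothetical names; the thread does not define how the indicator would be spelled), the two ways of computing a level can be put side by side in Python. Here `section_depth` is the level an h1 would get at that position, and `flat_content` stands for "this heading sits inside an element carrying the indicator".

```python
def heading_level(tag_level, section_depth, flat_content=False):
    """Compute a heading's level.

    tag_level:     the N in hN (1-6).
    section_depth: the level an h1 would get here under the current proposal.
    flat_content:  hypothetical flag for "inside an element with the indicator".
    """
    if flat_content:
        # With the indicator: every hN is offset by the section depth.
        return section_depth + (tag_level - 1)
    # Current proposal: only h1 follows the section depth; h2-h6 stay fixed.
    return section_depth if tag_level == 1 else tag_level


# 'Flat' CMS content (h1, h2, h3) placed where an h1 would be level 3:
print([heading_level(n, 3, flat_content=True) for n in (1, 2, 3)])
# The same content without the indicator:
print([heading_level(n, 3) for n in (1, 2, 3)])
```

With the indicator the three headings come out as levels 3, 4 and 5; without it they come out as 3, 2 and 3, which is the mixed-up hierarchy discussed earlier in the thread. The sketch does not cap levels at 6, since nothing here says what should happen beyond that.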

@alastc

alastc Apr 11, 2018

Fair enough.

In general then safe authoring advice for this proposal would be:

  • Use an H1 for the main page heading, make sure it is not in a nested section element.
  • Use H2-6 as you do now. Sorry: as you should now.

Then, when there is reasonable UA support you could switch to H1s only and use sectioning for levels.

Have I understood that correctly?

@annevk

annevk Apr 11, 2018

Member

Yes (or use the polyfill if requiring script is acceptable for your site).

@rehierl

rehierl Apr 23, 2018

Even though I am only a member of the general public, I need to ask you to do the following:
If only for a moment, try to take a look at the bigger picture.

tldr - The proposed heading-level algorithm may reflect "reality", but it won't solve the core problem: Figure out a way to teach computers how to read the content an author associates with a heading (aka. sections). The core concept of the current algorithm (i.e. sections) isn't what made it fail. The attempt to make it reflect reality is what broke the algorithm we have! So use the proposed design to complement (i.e. not replace) it. And, in the long run, fix the existing design, because that is what will drag us out of the mess we are in. - /tldr

With regards to ...

(2015-12-27, @domenic) - Quotes some twitter discussion: ... We have higher standards ... (W3C's) kind of self-contradictory patchwork ... update the spec to reflect reality.

(2016-10-23, @fititnt) - A request to: decide the best path, explain it, and be consistent.

(2018-03-01, @bkardell) - A reminder that: it's been stressed over the years "it's quite important you get this right" and "here's the criteria for what it means to be 'right'".

(2018-01-26, @othermaciej) - The outline (as currently defined) is a list of (nested) sections (i.e. a "forest") ... an algorithm that produces a forest of sections is neither necessary nor helpful ... the new outline algorithm (should) operate on headings, not sections.

The reason for "a list of sections"

Take a closer look at WHATWG's outline algorithm in order to figure out what the reason for the "a list of sections" definition really is. (Note that I fully agree here: That part (i.e. forest) does not make sense and should be changed). However, if you scroll down to "When entering a heading content element", then you might notice this:

  1. The only case in which the "current section" can end up having no heading is if that section was declared by an element of sectioning content or sectioning root.

  2. If the "current section" has no heading, and if the first element of heading content is entered inside of such a section, then that heading element is reused as the section's heading.

  3. The first "Otherwise" block is what I'd call a "performance shortcut" which causes the loop beneath it to always create a subsection (see "append it to candidate section"). So the first "Otherwise" block is what is relevant with regards to the above definition, not the loop beneath it.

fragment 4-1      toc 4-1   fragment 4-2       toc 4-2
============      =======   ============       =======
<body>            1. A      <body>             1. Untitled
  <h1> A </h1>    2. B        <section>        1.1. A
  <h1> B </h1>                  <h1> A </h1>   2. B
</body>                       </section>
                              <h1> B </h1>
                            </body>

4-1) If the current heading being entered has a rank that is greater or equal to the rank of the current section's heading, then a sibling section is created (i.e. a forest).

4-2) If the current section has an implied heading, then a sibling section is created (i.e. a forest). Note that this case will only be triggered under certain circumstances. However, if you look close enough at toc 4-2, you could spot an inconsistency error with regards to the "the first element of heading content" definition. Other than that, I'll ignore fragment 4-2 because of it being "bad-practice".
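
Case 4-1 can be sketched in Python as follows. This is a deliberately simplified illustration, not the spec's outline algorithm: it looks only at heading ranks, ignores sectioning content and sectioning roots entirely (so it does not cover 4-2), and the function name `build_outline` is hypothetical.

```python
def build_outline(headings):
    """Turn a flat list of (level, text) into nested sections by rank alone.

    level 1 corresponds to h1 (highest rank), level 6 to h6.
    Returns a list of top-level sections, i.e. a forest.
    """
    roots = []
    stack = []  # currently open sections, outermost first: (level, node)
    for level, text in headings:
        node = {"heading": text, "children": []}
        # Close every open section whose heading has an equal or lower rank.
        while stack and stack[-1][0] >= level:
            stack.pop()
        if stack:
            stack[-1][1]["children"].append(node)  # becomes a subsection
        else:
            roots.append(node)  # becomes a sibling at the top: the forest
        stack.append((level, node))
    return roots


# Fragment 4-1: two h1 elements directly in body.
print(build_outline([(1, "A"), (1, "B")]))
```

For fragment 4-1 the result has two top-level entries, A and B, with no single root above them.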

Now, please focus on 4-1: (1) The reason why the current algorithm ends up with "a list of sections" is due to a heading/rank-based perspective. But, instead of actually taking a look at what went wrong, you (2) suggest to let the "new" outline algorithm operate on headings.

I don't know about you, but I find that rather ironic because a heading-based perspective is largely responsible for our situation. And that is why I have to disagree: "It" is about sections, not headings!

With regards to fixing what we already have ...

I can only cite what @fititnt and @bkardell wrote (see above): Get it right!

Use Graph Theory (i.e. not statistics) to prove on a formal level that, whatever you come up with next, is consistent. One way or another, I am convinced that Graph Theory will win. After all, ... Earth isn't the center of the universe, it never was! ... (Note the critical switch in perspective!)

rehierl commented Apr 23, 2018

Even though I am only a member of the general public, I need to ask you to do the following:
If only for a moment, try to take a look at the bigger picture.

tldr - The proposed heading-level algorithm may reflect "reality", but it won't solve the core problem: Figure out a way to teach computers how to read the content an author associates with a heading (aka. sections). The core concept of the current algorithm (i.e. sections) isn't what made it fail. The attempt to make it reflect reality is what broke the algorithm we have! So use the proposed design to complement (i.e. not replace) it. And, in the long run, fix the existing design, because that is what will drag us out of the mess we are in. - /tldr

With regards to ...

(2015-12-27, @domenic) - Quotes some twitter discussion: ... We have higher standards ... (W3C's) kind of self-contradictory patchwork ... update the spec to reflect reality.

(2016-10-23, @fititnt) - A requests to: decide the best path, explain it, and be consistent.

(2018-03-01, @bkardell) - A reminder that: it's been stressed over the years "it's quite important you get this right" and "here's the criteria for what it means to be 'right'".

(2018-01-26, @othermaciej) - The outline (as currently defined) is a list of (nested) sections (i.e. a "forest") ... an algorithm that produces a forest of sections is neither necessary nor helpful ... the new outline algorithm (should) operate on headings, not sections.

The reason for "a list of sections"

Take a closer look at WHATWG's outline algorithm in order to figure out what the reason for the "a list of sections" definition really is. (Note that I fully agree here: That part (i.e. forest) does not make sense and should be changed). However, if you'd scroll down to "When entering a heading content element", then you might notice this:

  1. The only case in which the "current section" can end up with having no heading is, if that section was declared by an element of sectioning content or sectioning root.

  2. If the "current section" has no heading, and if the first element of heading content is entered inside of such a section, then that heading element is reused as the section's heading.

  3. The first "Otherwise" block is what I'd call a "performance shortcut" which causes the loop beneath it to always create a subsection (see "append it to candidate section"). So the first "Otherwise" block is what is relevant with regards to the above definition, not the loop beneath it.

fragment 4-1      toc 4-1   fragment 4-2       toc 4-2
============      =======   ============       =======
<body>            1. A      <body>             1. Untitled
  <h1> A </h1>    2. B        <section>        1.1. A
  <h1> B </h1>                  <h1> A </h1>   2. B
</body>                       </section>
                              <h1> B </h1>
                            </body>

4-1) If the heading being entered has a rank that is greater than or equal to the rank of the current section's heading, then a sibling section is created (i.e. a forest).

4-2) If the current section has an implied heading, then a sibling section is created (i.e. a forest). Note that this case will only be triggered under certain circumstances. However, if you look closely enough at toc 4-2, you can spot an inconsistency with regard to the "the first element of heading content" definition. Other than that, I'll ignore fragment 4-2 because it is "bad practice".
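
The rank comparison in 4-1 can be sketched in Python as follows. This is a simplified illustration, not the spec's algorithm: it assumes a flat run of headings inside a single sectioning root (no <section> elements, so case 4-2 is not modelled). The names Section, outline_from_headings and as_tuples are made up for this sketch. Levels are 1 for <h1> through 6 for <h6>, so a lower number means a higher rank.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Section:
    heading: str
    level: int  # 1 for <h1> ... 6 for <h6>; a lower number is a higher rank
    children: List["Section"] = field(default_factory=list)
    parent: Optional["Section"] = field(default=None, repr=False, compare=False)


def outline_from_headings(headings: List[Tuple[int, str]]) -> List[Section]:
    """Return the top-level sections (the "forest") for (level, text) headings."""
    outline: List[Section] = []
    current: Optional[Section] = None
    for level, text in headings:
        if not outline or level <= outline[-1].level:
            # Rank is equal to or higher than the heading of the last
            # top-level section: create a sibling at the top level.
            current = Section(text, level)
            outline.append(current)
            continue
        # Otherwise walk up from the current section until a section with a
        # strictly higher-ranked heading is found, then nest beneath it.
        candidate = current
        while level <= candidate.level:
            candidate = candidate.parent
        current = Section(text, level, parent=candidate)
        candidate.children.append(current)
    return outline


def as_tuples(sections: List[Section]) -> list:
    """Flatten the section tree into nested (heading, children) tuples."""
    return [(s.heading, as_tuples(s.children)) for s in sections]


# Fragment 4-1: two <h1> elements directly inside <body>.
print(as_tuples(outline_from_headings([(1, "A"), (1, "B")])))
# → [('A', []), ('B', [])]
```

Two equal-rank headings end up as two top-level sections, which is the "list of sections" shape discussed above.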

Now, please focus on 4-1: (1) The reason why the current algorithm ends up with "a list of sections" is a heading/rank-based perspective. But, instead of actually taking a look at what went wrong, you (2) suggest letting the "new" outline algorithm operate on headings.

I don't know about you, but I find that rather ironic, because a heading-based perspective is largely responsible for our situation. And that is why I have to disagree: "It" is about sections, not headings!

With regards to fixing what we already have ...

I can only cite what @fititnt and @bkardell wrote (see above): Get it right!

Use Graph Theory (i.e. not statistics) to prove on a formal level that whatever you come up with next is consistent. One way or another, I am convinced that Graph Theory will win. After all, ... Earth isn't the center of the universe, it never was! ... (Note the critical switch in perspective!)

rehierl commented Apr 23, 2018

@bkardell - I am in the process of spending some time with some "it" ("some time", what an understatement that is). Here is what I think:

You are right, the outline is a higher level version of a document's node tree. And if you understand "sectioning nodes" as a class of nodes which, more or less, tell an algorithm where exactly a section begins and where it ends (i.e. heading elements, sectioning content elements, sectioning root elements), then that tree of sections (aka. outline) is defined by those sectioning nodes.

However, you could even take it one step further: Some of those sectioning nodes (i.e. the sectioning root elements) can even be understood to not just define sections within a document's node tree, but also to define sections within the document's outline. After all, a tree of sections is at its core just another node tree. So what you actually have is this: a document, an outline, and an outline of an outline.

prlbr commented Apr 23, 2018

Does anybody still think that it was a good idea to reuse <h1> instead of a new element <h> for the explicit <section> model? All of this would be easier and more intuitive if authors could choose between the traditional heading-focused <h1> to <h6> model with implicit sections and an explicit <section>/<h> model. It may not be too late to recognize a mistake and fix it at the root.

rehierl commented Apr 23, 2018

The <h> element ... I have seen the discussions, but I didn't have enough time to really think about it in detail. So here is my current and limited take on that element in order to avoid drifting off in a "war of beliefs":

The reason for "a forest of sections" is the reuse of heading elements. But instead of limiting the "reuse" to what it was supposed to do (i.e. only define a heading), people thought that a heading element should still have a rank under these circumstances. What proof is there that the section of a <section> element even needs an inner or an outer rank?

The problem with the <h> element is that it "looks" too similar to the <hX> elements. My guess is that people will confuse it with an element that declares a section (i.e. a sectioning node) and thus will try to use it just like all the other <hX> elements. Yes, introduce it to teach the core concept of a rank-less sectioning node. No, don't introduce it as an element that is not a sectioning node.

If used as a sectioning node, then what rank should it have? Should it act as a <h0> element? Or should it act more like a <h7> element instead?

The <h> element is just an option, not the solution itself.

And here is the bummer: In the long run, I'd much prefer the reuse of another element, the title element: If every element of sectioning content or sectioning root has only one top-level section and, as such, represents a "flat thing", then what is the heading of the <body> element? The document's title ... The current definition of the <title> element (IMHO) seems to better fit the intended purpose than the <h> element.

Collaborator

jakearchibald commented Apr 23, 2018

@prlbr

Does anybody still think that it was a good idea to reuse <h1> instead of a new element <h> for the explicit <section> model?

Yes. It has a better backwards compatibility story. If <h> becomes a standard, there'll be a period of time where it's used, but it's unsupported in user agents. Unless it's polyfilled, this element is no better than a <span> to these users. Given that most screen reader users use headings to navigate pages, wrong-level headings are better than no headings.

prlbr commented Apr 23, 2018

@jakearchibald

<h role='heading'> would work for the transitional period. The aria-level attribute even allows authors to define a specific heading level. Authors could, but wouldn’t need to, set these attributes manually – they could be injected by a polyfill, as you say.
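
As a rough illustration of what such a polyfill would have to compute, here is a small Python sketch. Everything in it is an assumption made for illustration: the <h> element is hypothetical, the rule "aria-level = number of enclosing sectioning content elements + 1" is only one possible choice, and a real polyfill would run in the browser and set the attributes on the elements rather than return a list. The names HLevelAnnotator and annotate are invented for this sketch.

```python
from html.parser import HTMLParser

# Sectioning content elements, used here to derive a nesting depth.
SECTIONING = {"section", "article", "aside", "nav"}


class HLevelAnnotator(HTMLParser):
    """Collect (text, aria_level) for each hypothetical <h> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # number of currently open sectioning elements
        self.in_h = False   # True while inside an <h> element
        self.buffer = []    # text chunks of the current <h>
        self.found = []     # results: (heading text, computed level)

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif tag == "h":
            self.in_h = True
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth -= 1
        elif tag == "h" and self.in_h:
            self.in_h = False
            text = "".join(self.buffer).strip()
            # Assumed rule: one level per enclosing sectioning element.
            self.found.append((text, self.depth + 1))

    def handle_data(self, data):
        if self.in_h:
            self.buffer.append(data)


def annotate(html: str):
    """Return the (text, aria-level) pairs a polyfill might inject."""
    parser = HLevelAnnotator()
    parser.feed(html)
    parser.close()
    return parser.found


doc = "<body><h>Top</h><section><h>Sub</h></section></body>"
for text, level in annotate(doc):
    print(f'<h role="heading" aria-level="{level}">{text}</h>')
```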

Collaborator

jakearchibald commented Apr 24, 2018

@prlbr right, so it comes down to <h> where developers need to remember to assign the correct role and apply the correct default stylings, or <h1> where they don't need to do any of that.

prlbr commented Apr 24, 2018

@rehierl I have no preference for the name of a heading element for the explicit, nestable <section> model. I would be fine with replacing <h> by anything that has no other conflicting meaning or behaviour (which both <h1> and <title> have).

@jakearchibald Indeed, those would be the disadvantages of <h> for a transitional period.* In my opinion these temporary disadvantages are less of a problem than the continued confusion that is caused by reusing <h1>. Examples of unintuitive behaviour have been given in this thread.

(*That period would be over already if the decision had gone for <h> in the first place, instead of reusing the ubiquitous <h1> for something different from what it had meant for years; the default styling issue wasn’t a disadvantage then, as <h1> had an incorrect default styling for non-top-level use too.)

rehierl commented Apr 24, 2018

@prlbr The reason I mentioned the <title> element was due to the overlap in semantics, although the spec clearly states "no more than one". But, without any formal design that can be proven to work, it seems too soon to discuss such things. (Not that I don't appreciate your feedback).

rehierl commented Apr 25, 2018

@annevk I totally forgot to apologize for giving your proposal such a blow. Even though there are some essential problems with it, I'd even like to thank you for giving it a go. That is because your proposal allowed me to connect a few more dots. Now, if I could only figure out how to explain that in detail ...

othermaciej commented Apr 25, 2018

Introducing a new <h> element seems outside the scope of this issue, which is to update the outline algorithm for existing elements to something more implementable in browsers.

rehierl commented Apr 26, 2018

The main misconception seems to be that sectioning nodes (headings, section elements, etc.) can be treated like any other element: As being isolated and independent from one another. That however is not the case:

  1. With each of those elements, an author associates content. That is, whether the spec has a definition for "it" or not, sections do exist as associated content/nodes.
  2. If sectioning node "A" has a section that contains sectioning node "B", then section "B" is related to section "A". That is, those elements are not independent, they never were!
  3. By now it should not be difficult to accept that mathematics is involved: What a section hierarchy is depends on the definition of subset and/or subsequence.

Sure, under certain conditions you will be consistent. The problem, however, is that you won't be able to guarantee consistency. That is, under different conditions, you will still be in conflict. And that will keep us from having a Semantic Web.

These kinds of issues will keep nagging you over and over again, simply because mathematics is the only language computers understand. And because mathematics is what it is, it is "reality" that needs to change. The only thing you can do to finally solve our problem is to figure out how to properly deal with it.

(But, you already made it pretty clear that there still is not enough interest for a proper solution.)
