Suggest adding a warning about outline algorithm #83

Open
stevefaulkner opened this issue Sep 1, 2015 · 110 comments · May be fixed by #3499

Comments

@stevefaulkner
Contributor

@stevefaulkner stevefaulkner commented Sep 1, 2015

Currently the HTML standard does not provide any advice regarding the outline algorithm not being implemented. This has led some developers to believe that the outline algorithm has an effect in browsers and assistive technology, which it does not. This can lead to developers using markup patterns that don't convey document structure. Suggest adding a warning; for example, this is the warning in the W3C HTML spec:

There are currently no known implementations of the outline algorithm in graphical browsers or assistive technology user agents, although the algorithm is implemented in other software such as conformance checkers. Therefore the outline algorithm cannot be relied upon to convey document structure to users. Authors are advised to use heading rank (h1-h6) to convey document structure.

@domenic
Member

@domenic domenic commented Sep 1, 2015

I kind of feel we should either leave as-is or just remove the outline algorithm altogether...

@gsnedders

@gsnedders gsnedders commented Sep 1, 2015

An outline algorithm probably makes sense to keep around, given outlining is used in some sense in most screenreaders, as far as I'm aware. That said, may well make sense to drop the current one (and the semantics that go along with it...).

@Hixie
Member

@Hixie Hixie commented Sep 1, 2015

The algorithm just describes the semantics of the elements. If the tools aren't supporting the semantics, they're buggy and should be fixed. Changing the semantics would be a pretty drastic change to the spec, especially as people have been using these elements for years.

We can't just remove the outline algorithm, either. We need something to define the semantics of these elements. Even if we were to say that tools should ignore the semantics and just use the h1-h6 elements in the naive flat way (ignoring tree structure, as if we were back in the 90s), you'd still need the algorithm to be able to define the authoring conformance criteria (so that conforming documents only used h1-h6 in a manner consistent with the semantics). That's pretty silly though. The right solution is just to fix the tools.

@domenic
Member

@domenic domenic commented Sep 1, 2015

@Hixie, when you say "tools," do you include user agents and their accessibility bindings?

At some point we cannot claim that user agents are broken. They are instead rejecting our change request. https://www.w3.org/Bugs/Public/show_bug.cgi?id=25003 contains comments from both Firefox and Chrome accessibility developers explicitly rejecting the idea of implementing the outline algorithm in their accessibility bindings. It also outlines the history of JAWS removing their support. Although their reasoning may be wrong, it doesn't seem fruitful at this point to challenge them.

I think it would be better for the semantics in the spec to match the semantics already exposed through accessibility bindings in implementations. People have been using these elements for years, but either (a) they have been using them in a way supported by user agents, which contradicts the spec; or (b) they have been using them in the way specified, which means their content is broken in current user agents (for users which count on those accessibility bindings). Neither of these seem good.

@domenic
Member

@domenic domenic commented Sep 1, 2015

My last message wasn't entirely clear. I agree with @gsnedders and phrasing this as "removing the outline algorithm" was incorrect. Rather, we should update the outline algorithm to reflect implementations. In some ways this is a "revert" of the "change request" that proposed a sectioning-based outline algorithm; it differs only in degree from a4313d3, which was a revert of the change request 78f1994 to simplify selector case-sensitivity matching rules.

@Hixie
Member

@Hixie Hixie commented Sep 1, 2015

If we want to require that authors use h1-h6 instead of being able to do the XHTML-style <h1>-everywhere, then we need two outline algorithms: one that describes how user agents are to act, and one that describes the restrictions that authors have to follow in order for their h1-h6 headers to not contradict the sectioning semantics. (And maybe a third one, that describes how an authoring tool could convert from the saner one-heading-element style to the legacy h1-h6 style for UAs.)

But IMHO that's a poor place to be in. I don't really see why accessibility tools couldn't expose the real semantics here. It doesn't require a complex algorithm (you only need "previous", "next", and "up" to be able to navigate the tree, and walking around the tree that way is pretty straightforward as far as I can tell). Accessibility tools are notoriously slow about catching up to implementing new features; this algorithm is not that old by their time scales. (I mean, they still haven't implemented stuff from the 90s correctly, even though there are obvious usability gains to be had by doing so.)

@domenic
Member

@domenic domenic commented Sep 1, 2015

Can you explain why you need two outline algorithms? Authors would use the sectioning elements the same way they are exposed in accessibility technologies: just like divs.

@gsnedders

@gsnedders gsnedders commented Sep 1, 2015

That ignores the fact that both Firefox and Chrome have explicitly refused to support it, and that the algorithm is hella old by their standards (it's what, seven years old now?). I think the battle's lost at this point, sadly. I don't have strong opinions on what we should do, but the spec as it stands now is fiction and will remain fiction. Your argument that they should implement it because they're buggy per spec is trying to oblige behaviour by spec, and we know that's a fallacy when everyone is refusing to implement it.

@Hixie
Member

@Hixie Hixie commented Sep 1, 2015

@domenic Consider the following:

<h1>A</h1>
<section>
 <p>aaa
 <h1>B</h1>
 <p>bbb
</section>
<p>bbb

What are the sections in that document? If the <section> element doesn't align with that answer, then what are the semantics of <section>?

@gsnedders I don't think the battle's been fought. Any time I've seen people say they don't want to do it (e.g. in the bug above) the reasons they've given don't actually fit the facts (e.g. I've heard complaints that it would be prohibitively expensive, but that's only if you recomputed the entire tree, which as far as I can tell is unnecessary). But in any case in my comment above I gave two paths: one that I think is the right path, and another path for the case where we give up on making accessibility tools give good results. We could go down the second path, certainly. It's not just removing text from the spec, though, as I described above.

@domenic
Member

@domenic domenic commented Sep 1, 2015

My understanding is that as implemented section has no semantics, just like div. I am not sure that implementations have a concept of "sections of a document" as much as they have an accessibility tree plus an outline tree which consists of links into nodes in the accessibility tree. But I am not sure on that; presumably @stevefaulkner has done the research there.

@Hixie
Member

@Hixie Hixie commented Sep 1, 2015

I'm not sure what it would mean for <section> to have "implemented" semantics. Semantics by their very nature are about the meaning of the elements, which is something for humans. It's how you get maintainable documents that different people who have never met can approach and understand.

@annevk
Member

@annevk annevk commented Sep 2, 2015

Yeah, I guess either you keep the current outline algorithm, or you obsolete <section>/<h1> & <hgroup>?

@domenic
Member

@domenic domenic commented Sep 2, 2015

Is the idea to keep the current outline algorithm but the output of the algorithm is only something that exists in authors' minds? If so I'd be fine moving it to some section with a preface like "If you want to assemble a mental outline, that does not match that displayed by screen readers, follow the following algorithm: ... NOTE: authors are advised not to author documents that produce outlines catering to this algorithm, but instead author documents catering to accessibility tools, which follow the algorithm in $cross-link-here, according to the priority of constituencies (users over authors)"

But that seems kind of pointless.

@annevk
Member

@annevk annevk commented Sep 2, 2015

It all depends on whether user agents will implement these elements. If they don't, we should scrap them and the outline algorithm can be simplified to what is supported. If they do, or we expect them to within the next five years or so, it might be worth waiting a little longer.

@domenic
Member

@domenic domenic commented Sep 2, 2015

Hmm, to be clear, what does "implement these elements" mean? They implement them as HTMLElement instead of HTMLUnknownElement, but they do not implement the accessibility mapping implied by the current outline algorithm, and as discussed up-thread at least a couple have publicly stated they are not planning to do so. (Are those the only two relevant requirements on implementations, or am I missing some?)

@annevk
Member

@annevk annevk commented Sep 2, 2015

There are some styling and parser requirements too. And there are some speculative CSS features that would build upon the outline algorithm. We would have to check. But styling and outline would be the most important aspects.

@domenic
Member

@domenic domenic commented Sep 2, 2015

Ah right, thanks for pointing those out.

Also, when you say "obsolete <section> etc.", we could give them "the <main> treatment" instead of "the <dir> treatment". I'm not sure there's any practical difference besides what section they go in, but it's worth pointing out.

@annevk
Member

@annevk annevk commented Sep 2, 2015

Perhaps it would end up like main since it still has some default ARIA semantics that might be useful. Not sure. I haven't studied this in detail, but I do agree with @Hixie that the fix isn't as simple as adding a note to the outline algorithm or dropping it altogether.

@stevefaulkner
Contributor Author

@stevefaulkner stevefaulkner commented Sep 2, 2015

@domenic section is mapped to a region role in browsers, but it is not exposed in the aural UI unless it has an accessible name.

There are 2 document scope navigation methods implemented across AT:

  • landmark navigation of header/footer/nav/main/form/section (only if accessible name present) elements.
  • heading navigation via h1-h6 elements.

Implementation of accessibility layer semantics for HTML element landmarks and h1-h6 is complete in Chrome, Firefox, and Safari.

Data on the utilisation of each can be found in the WebAIM screen reader surveys.
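
As a rough illustration of the two navigation methods described above, the following Python sketch builds a flat heading list from h1-h6 rank alone and a landmark list in which section only appears when it has an accessible name. The names `NavigationIndex` and `index_navigation` are hypothetical, and treating `aria-label`/`aria-labelledby` as the only sources of an accessible name is a simplifying assumption; this is not how any particular browser or screen reader is implemented.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
# Landmark candidates taken from the list in the comment above.
LANDMARKS = {"header", "footer", "nav", "main", "form", "section"}


class NavigationIndex(HTMLParser):
    """Collects a flat heading list and a landmark list from markup."""

    def __init__(self):
        super().__init__()
        self.headings = []    # (level, text); level comes from the tag name only
        self.landmarks = []   # tag names of exposed landmarks
        self._current = None  # heading being read, as [level, text]

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._current = [int(tag[1]), ""]
        elif tag in LANDMARKS:
            names = dict(attrs)
            # Assumption: only aria-label / aria-labelledby provide a name.
            named = bool(names.get("aria-label") or names.get("aria-labelledby"))
            if tag != "section" or named:
                self.landmarks.append(tag)

    def handle_data(self, data):
        if self._current is not None:
            self._current[1] += data

    def handle_endtag(self, tag):
        if tag in HEADINGS and self._current is not None:
            level, text = self._current
            self.headings.append((level, text.strip()))
            self._current = None


def index_navigation(markup):
    """Returns (headings, landmarks) for a markup string."""
    parser = NavigationIndex()
    parser.feed(markup)
    parser.close()
    return parser.headings, parser.landmarks
```

In this sketch an h1 nested inside an unnamed section is still listed at level 1, and the section itself does not show up as a landmark, which is the gap between the specified outline and what users actually get.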

@stevefaulkner
Contributor Author

@stevefaulkner stevefaulkner commented Sep 2, 2015

The reason for the warning is so authors are not misled into thinking that use of sectioning elements actually does anything for users who consume heading semantics to make sense of and navigate document content.

@JohnnyWalkerDesign

@JohnnyWalkerDesign JohnnyWalkerDesign commented Sep 2, 2015

Adding a similar warning makes most sense, IMO, at least until the outline algorithm is more widely adopted. Many authors most likely think that (correctly) placing an <h1> element both inside and outside an <article> will be understood by those using assistive technology.

@alastc

@alastc alastc commented Sep 2, 2015

I do quite a bit of accessibility training and regularly come across developers who think they should just use H1s (they don't always know about sectioning either).
I agree that it would be more elegant to use sectioning, but unless someone is going to campaign for UAs to implement it, then the spec should align with the reality.

@JohnnyWalkerDesign

@JohnnyWalkerDesign JohnnyWalkerDesign commented Sep 5, 2015

I would love to help campaign for UAs to use it. It would make everyone's lives (developers and users) easier if they did.

@nhoizey

@nhoizey nhoizey commented Sep 14, 2015

As a web developer, I also would be really happy if the outline worked as intended with sectioning elements, at last.

Working with only h1 is much easier when you want to include (server-side or with Ajax) the same HTML fragment in several places of your pages, with appropriate hierarchy.

@domenic
Member

@domenic domenic commented Dec 27, 2015

Some Twitter discussion reminded me about this neglected issue. I wanted to summarize the action items here:

  • We are not interested in simply adding a warning like the W3C fork does. "Warning: ignore the following. Here is a bunch of normative text about outlines..." We have higher standards for our specs than that kind of self-contradictory patchwork. Instead, we should actually update the spec to reflect reality.
  • The outline algorithm should be rewritten or replaced to reflect @stevefaulkner's description of implemented AT mechanisms in #83 (comment). Namely, it should primarily inform about landmark navigation and heading navigation. (Or maybe only the latter of these?)
    • Maybe these are already specced somewhere we should be referring to?
    • It should reflect implemented heading level semantics as announced by screen-readers, not the ones derived from nesting level.
    • We should do some serious hands-on testing with a11y tools (or we could continue to lean on a11y experts to do so for us, but that is not very good) to figure out how to best phrase these things. For example it's not entirely clear to me whether headings are presented as a nested list, or as a flat list with heading level numbers. We should encourage authors to have a mental model that matches what's implemented.
  • With these in mind, we should carefully change the authoring guidance and semantics for related elements (article, section, nav, aside, h1-h6, hgroup, header, footer). In general I think we can keep the "spirit" of existing semantics, e.g. nothing needs to change about what a section "is". But the advice about how to use it relative to headings, or how to use markup patterns that give it an accessible name, and how that impacts the document outline, will need to be carefully reviewed.
    • One exception may be hgroup. But maybe we should leave it as-is for the first pass.

Potential future work:

  • Consider the fate of hgroup further. Does it impact a11y processing anywhere? Should it be repurposed, or obsoleted, or...? Does it have semantics independent of its a11y processing, like non-accessible-name'd sections could be said to?
  • Propose a <h> element that actually gets treated like people want <h1> to be treated. (If we can get the a11y teams of various browsers to implement, then we can merge it into the spec. But let's not add something until we have experimental implementations and commitments.)

All that said, this isn't that high on my priority list, or my employer's. If someone does have the time to devote to this, I'd be happy to help review patches.

@stevefaulkner
Contributor Author

@stevefaulkner stevefaulkner commented Dec 27, 2015

"Warning: ignore the following. Here is a bunch of normative text about
outlines..."

While there is no normative requirement for UAs to implement the outline algorithm, many web developers have been led to believe it is implemented, and the WHATWG spec continues to perpetuate the myth. However you choose to modify the spec to bring it closer to reality will be an improvement.

@sideshowbarker
Member

@sideshowbarker sideshowbarker commented Dec 28, 2015

I agree with many of the comments from @Hixie and @annevk in this thread, especially the comment from @Hixie that “We can't just remove the outline algorithm” without needing to make other changes as a consequence, and the question “Yeah, I guess either you keep the current outline algorithm, or you obsolete <section>/<h1> & <hgroup>?”.

But all that said, I have over the years learned a huge amount of things from @stevefaulkner around the problems that some things in the spec cause for AT users, and I think others should read his comments here very carefully. We’re here to solve existing real problems for real users—not to hypothetically solve problems for some of them if we could somehow just get browser implementors to see things our way and implement what we’ve specced, or get AT vendors to fix their horribly broken/buggy tools.

In that spirit I agree very strongly with the implicit goals in the latest comment that @domenic posted and with his concrete suggestions there about how to get progress here. See the related IRC discussion.

Some ways I could help with this might be:

  • Experimentally implementing a Show Outline feature in the Nu HTML Checker that shows a “Here’s what the outline of your document looks like to AT users in practice” view—in parallel to or even in place of the current Show Outline view the checker provides, which shows what the outline looks like according to the outline algorithm in the HTML spec.
  • Adding further experimental warnings to help authors avoid using H1-H6 headings in any way bad for AT users; basically that amounts to uses of H1-H6 in ways different from how we told authors they should use them before we added section and article and allowed nested H1s to be used within them, and allowed uses of H2-H6 that break the existing simple hierarchical use of them.

As far as the second item above, I have already implemented experimental support in the checker for reporting (mis)use of H1 as anything other than a top-level head, and that’s been deployed in the checker for quite a long time now, and I think it’s been helping. But I’d like to help more if I can. I want to make the HTML checker be a tool that helps developers avoid making uninformed authoring choices that are going to cause problems in practice for real users.

@rehierl

@rehierl rehierl commented Apr 11, 2018

@annevk
Allow me to recap the main points of your proposal:

  1. The heading level of h1 elements depends on its placement: 1 + the number of ancestor sectioning content elements.
  2. The heading level of an h2-h6 element is constant (fixed to the number value).
  3. The lower a heading's heading level is, the more important the heading is.
  4. "Each heading following another heading lead in document headings must have a heading level that is less, equal, or 1 greater than lead's heading level."

I am having difficulty understanding (4). Could you please rephrase it so that its meaning is clearer?

<body>
  <h1> A </h1>            toc-1
  <section>               =====
    <h1> B </h1>          1. A
    <section>             1.1. B
      <h1> C </h1>        1.1.1. C
      <h2> D </h2>        1.2. D
    </section>
  </section>
</body>

If I understand your approach correctly, then the heading level of heading C is 3. And because the heading level of the h2 element is constant, its heading level is still 2. So, is "toc-1" an accurate representation of the intended table-of-contents listing (because the h2 element, is now more important than the h1 element), or is it supposed to be something else (e.g. undefined because non-conformant)?

@annevk
Member

@annevk annevk commented Apr 11, 2018

@bkardell my impression is that there's too much content out there that uses h2-h6 without regard for their sectioning content element ancestors so adjusting them would do too much damage to existing content.

@rehierl 4 means that you cannot skip heading levels when the heading level increases relative to a previous heading. And toc-1 is indeed accurate (and conforming).
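
The four points recapped above can be sketched in a few lines of Python. This is only an illustration of the proposal as summarized in this thread, not spec text: the names `HeadingLevels`, `heading_levels`, and `skipped_levels` are hypothetical, and sectioning content is assumed to mean article, aside, nav, and section.

```python
from html.parser import HTMLParser

SECTIONING = {"article", "aside", "nav", "section"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingLevels(HTMLParser):
    """Computes heading levels per the proposal recapped above (a sketch)."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # number of open sectioning content ancestors
        self.levels = []      # (text, level) in document order
        self._pending = None  # [level, text] of the heading being read

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif tag in HEADINGS:
            rank = int(tag[1])
            # Point 1: h1 depends on nesting. Point 2: h2-h6 are fixed.
            level = 1 + self.depth if rank == 1 else rank
            self._pending = [level, ""]

    def handle_data(self, data):
        if self._pending is not None:
            self._pending[1] += data

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth = max(0, self.depth - 1)
        elif tag in HEADINGS and self._pending is not None:
            level, text = self._pending
            self.levels.append((text.strip(), level))
            self._pending = None


def heading_levels(markup):
    """Returns [(heading text, heading level), ...] in document order."""
    parser = HeadingLevels()
    parser.feed(markup)
    parser.close()
    return parser.levels


def skipped_levels(levels):
    """Point 4: returns the headings whose level jumps by more than 1."""
    problems = []
    for (_, lead), (text, level) in zip(levels, levels[1:]):
        if level > lead + 1:
            problems.append(text)
    return problems
```

For the fragment in the question above, this gives levels 1, 2, 3, 2 for A, B, C, D and reports no skipped levels, which agrees with the answer that toc-1 is accurate and conforming.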

@LJWatson

@LJWatson LJWatson commented Apr 11, 2018

@annevk
"my impression is that there's too much content out there that uses h2-h6 without regard for their sectioning content element ancestors so adjusting them would do too much damage to existing content."

It would, but a question worth asking is what impact that damage would actually have. To some extent this idea is based on the assumption that most/many pages have a useful heading hierarchy, and that introducing these changes would therefore break it.

My own experience is that heading hierarchies are almost always broken as it is. So perhaps we should be asking whether the damage this algorithm might cause is better or worse than the existing status quo?

My (entirely unscientific) hunch is that it would not make things worse. If that assertion can be backed up with evidence, then the question is whether it is worth exchanging one kind of broken for another, as a means of getting everyone to the point where things get easier for authors and better for users?

@annevk
Member

@annevk annevk commented Apr 11, 2018

@LJWatson the other reason for avoiding adjusting h2-h6 is that we have styling in place for h1 (based on nesting depth) and we cannot add equivalent styling for h2-h6 (and we also cannot change it for h1). It also seems like an easier pitch to say that folks should just use h1 and sectioning content elements.

@alastc

@alastc alastc commented Apr 11, 2018

It's worth examining, from the summary I'd expect these sites to be fine:

  • H1s only with good section nesting for levels.
  • An H1 at the top (not in nested sections) and then H2-H6 properly.

Sites that would not provide a good experience would be:

  • Any site using a mix of (multiple) H1s and H2-H6s will be broken as per @rehierl's example above.
  • Any site using H1s with poor section structure.

I'm trying to think of any inbetween structures, for example:

<body>
  <banner>
    <h2>Nav heading</h2>
  </banner>
  <main>
    <h1>Page heading</h1>
    <article>
      <h2> C </h2>
      <h3> D </h3>
    </article>
  </main>
</body>

I think that would be fine, but if the structure was main > section > h1, that would be an H2? That seems like an easy mistake to make.

It feels like you really have to go one way or the other, mixing multiple H1s with H2-6s is where it would go wrong. I'm guessing someone already came up with the idea of detecting H2-6 and switching to classic heading-levels?

@annevk
Member

@annevk annevk commented Apr 11, 2018

@alastc switching is too expensive (and could also lead to very weird experiences on slow loading pages).

@bkardell

@bkardell bkardell commented Apr 11, 2018

I'm guessing someone already came up with the idea of detecting H2-6 and switching to classic heading-levels?

Yes, but this is difficult and expensive, and it is also a very real thing that will happen with pretty much any existing tool or CMS (visual editors, Markdown, etc. think of text in a more traditional 'flat' way; that's where our original ideas about headings come from, in fact). So I offered that a kind of indicator that lets you know 'this element contains that kind of stuff' would be a way to bridge the gap that seems achievable both from an implementation standpoint and from an 'authors could actually accomplish this' standpoint. Effectively, the current proposal says that h1's level becomes the section depth; the indicator would make all the levels inside an element carrying it become section depth + (tag level - 1). It seems pretty easy to build as a custom element or polyfill (in fact, I'm playing with a version of it in a project right now), so I'm not trying to push that it has to be a part of this, but without it, it will still be very hard for most authors to use common tools to create good headings.
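
The arithmetic in that comment can be written out as a small Python function. This is a sketch under stated assumptions: `effective_level` and its `flat_indicator` parameter are hypothetical names, the indicator itself is only an idea floated in this thread, and `section_depth` is taken to be the level an h1 would receive at that position (1 at the top level).

```python
def effective_level(tag_level, section_depth, flat_indicator=False):
    """Returns the exposed heading level for an hN element.

    tag_level: N in hN (1-6).
    section_depth: the level an h1 would get at this position (1 at top
        level), following the wording "h1's level becomes the section depth".
    flat_indicator: True when an ancestor carries the hypothetical indicator
        that marks traditional 'flat' h1-h6 content.
    """
    if not 1 <= tag_level <= 6:
        raise ValueError("tag_level must be between 1 and 6")
    if section_depth < 1:
        raise ValueError("section_depth must be at least 1")
    if flat_indicator:
        # Every level is shifted by the surrounding section depth.
        return section_depth + (tag_level - 1)
    # Current proposal: only h1 follows nesting; h2-h6 stay fixed.
    return section_depth if tag_level == 1 else tag_level
```

For example, a Markdown-generated fragment using h1 to h3 that is dropped into a position where the section depth is 3 would be exposed at levels 3, 4, and 5 with the indicator, but at levels 3, 2, and 3 without it.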

@alastc

@alastc alastc commented Apr 11, 2018

Fair enough.

In general then safe authoring advice for this proposal would be:

  • Use an H1 for the main page heading, make sure it is not in a nested section element.
  • Use H2-6 as you do now. Sorry: as you should now.

Then, when there is reasonable UA support you could switch to H1s only and use sectioning for levels.

Have I understood that correctly?

@annevk
Member

@annevk annevk commented Apr 11, 2018

Yes (or use the polyfill if requiring script is acceptable for your site).

@rehierl

@rehierl rehierl commented Apr 23, 2018

Even though I am only a member of the general public, I need to ask you to do the following:
If only for a moment, try to take a look at the bigger picture.

tldr - The proposed heading-level algorithm may reflect "reality", but it won't solve the core problem: Figure out a way to teach computers how to read the content an author associates with a heading (aka. sections). The core concept of the current algorithm (i.e. sections) isn't what made it fail. The attempt to make it reflect reality is what broke the algorithm we have! So use the proposed design to complement (i.e. not replace) it. And, in the long run, fix the existing design, because that is what will drag us out of the mess we are in. - /tldr

With regards to ...

(2015-12-27, @domenic) - Quotes some twitter discussion: ... We have higher standards ... (W3C's) kind of self-contradictory patchwork ... update the spec to reflect reality.

(2016-10-23, @fititnt) - A request to: decide the best path, explain it, and be consistent.

(2018-03-01, @bkardell) - A reminder that: it's been stressed over the years "it's quite important you get this right" and "here's the criteria for what it means to be 'right'".

(2018-01-26, @othermaciej) - The outline (as currently defined) is a list of (nested) sections (i.e. a "forest") ... an algorithm that produces a forest of sections is neither necessary nor helpful ... the new outline algorithm (should) operate on headings, not sections.

The reason for "a list of sections"

Take a closer look at WHATWG's outline algorithm in order to figure out what the reason for the "a list of sections" definition really is. (Note that I fully agree here: That part (i.e. forest) does not make sense and should be changed). However, if you'd scroll down to "When entering a heading content element", then you might notice this:

  1. The only case in which the "current section" can end up with having no heading is, if that section was declared by an element of sectioning content or sectioning root.

  2. If the "current section" has no heading, and if the first element of heading content is entered inside of such a section, then that heading element is reused as the section's heading.

  3. The first "Otherwise" block is what I'd call a "performance shortcut" which causes the loop beneath it to always create a subsection (see "append it to candidate section"). So the first "Otherwise" block is what is relevant with regards to the above definition, not the loop beneath it.

fragment 4-1      toc 4-1   fragment 4-2       toc 4-2
============      =======   ============       =======
<body>            1. A      <body>             1. Untitled
  <h1> A </h1>    2. B        <section>        1.1. A
  <h1> B </h1>                  <h1> A </h1>   2. B
</body>                       </section>
                              <h1> B </h1>
                            </body>

4-1) If the current heading being entered has a rank that is greater than or equal to the rank of the current section's heading, then a sibling section is created (i.e. a forest).

4-2) If the current section has an implied heading, then a sibling section is created (i.e. a forest). Note that this case will only be triggered under certain circumstances. However, if you look close enough at toc 4-2, you could spot an inconsistency error with regards to the "the first element of heading content" definition. Other than that, I'll ignore fragment 4-2 because of it being "bad-practice".
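
To make case 4-1 concrete, here is a deliberately simplified Python sketch of the rank comparison. It only handles a flat run of heading elements and ignores sectioning content and sectioning roots entirely, so it is not the outline algorithm from the spec; `build_outline` is a hypothetical name.

```python
def build_outline(headings):
    """Builds a forest of sections from (level, text) pairs in document order.

    level is N in hN, so a smaller number means a higher rank.
    Each section is a dict with 'heading' and 'children'.
    """
    roots = []  # top-level sections; more than one makes it a forest
    stack = []  # open sections as (level, section), innermost last
    for level, text in headings:
        section = {"heading": text, "children": []}
        # Close every open section whose heading rank is not higher.
        while stack and stack[-1][0] >= level:
            stack.pop()
        if stack:
            stack[-1][1]["children"].append(section)
        else:
            roots.append(section)
        stack.append((level, section))
    return roots
```

Feeding it the two h1 headings of fragment 4-1 produces two top-level sections, A and B, which is the "list of sections" shape discussed here.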

Now, please focus on 4-1: (1) The reason why the current algorithm ends up with "a list of sections" is a heading/rank-based perspective. But, instead of actually taking a look at what went wrong, you (2) suggest letting the "new" outline algorithm operate on headings.

I don't know about you, but I find that rather ironical because a heading-based perspective is largely responsible for our situation. And that is why I have to disagree: "It" is about sections, not headings!

With regards to fixing what we already have ...

I can only cite what @fititnt and @bkardell wrote (see above): Get it right!

Use Graph Theory (i.e. not statistics) to prove on a formal level that, whatever you come up with next, is consistent. One way or another, I am convinced that Graph Theory will win. After all, ... Earth isn't the center of the universe, it never was! ... (Note the critical switch in perspective!)

@rehierl

@rehierl rehierl commented Apr 23, 2018

@bkardell - I am in the process of spending some time with some "it" ("some time", what an understatement that is). Here is what I think:

You are right, the outline is a higher level version of a document's node tree. And if you understand "sectioning nodes" as a class of nodes which, more or less, tell an algorithm where exactly a section begins and where it ends (i.e. heading elements, sectioning content elements, sectioning root elements), then that tree of sections (aka. outline) is defined by those sectioning nodes.

However, you could even take it one step further: Some of those sectioning nodes (i.e. the sectioning root elements) can even be understood to not just define sections within a document's node tree, but also to define sections within the document's outline. After all, a tree of sections is in its core just another node tree. So what you actually have is this: a document, an outline, and an outline of an outline.

@prlbr

@prlbr prlbr commented Apr 23, 2018

Does anybody still think that it was a good idea to reuse <h1> instead of a new element <h> for the explicit <section> model? All of this would be easier and more intuitive if authors could choose between the traditional heading-focused <h1>-<h6> model with implicit sections and an explicit <section>/<h> model. It may not be too late to recognize a mistake and fix it at the root.

@rehierl

@rehierl rehierl commented Apr 23, 2018

The <h> element ... I have seen the discussions, but I didn't have enough time to really think about it in detail. So here is my current and limited take on that element in order to avoid drifting off in a "war of beliefs":

The reason for "a forest of sections" is the reuse of heading elements. But instead of limiting the "reuse" to what it was supposed to do (i.e. only define a heading), people thought that a heading element should still have a rank under these kind of circumstances. What proof is there that the section of a <section> element even needs an inner or an outer rank?

The problem with the <h> element is that it "looks" too similar to the <hX> elements. My guess is, that people will confuse it with an element that declares a section (i.e. a sectioning node) and thus would try to use it just like all the other <hX> elements. Yes, introduce it to teach the core concept of a rank-less sectioning node. No, don't introduce it as an element that is not a sectioning node.

If used as a sectioning node, then what rank should it have? Should it act as a <h0> element? Or should it act more like a <h7> element instead?

The <h> element is just an option, not the solution itself.

And here is the bummer: In the long run, I'd much prefer the reuse of another element, the title element: If every element of sectioning content or sectioning root has only one top-level section and, as such, represents a "flat thing", then what is the heading of the <body> element? The document's title ... The current definition of the <title> element (IMHO) seems to better fit the intended purpose than the <h> element.

Collaborator

@jakearchibald jakearchibald commented Apr 23, 2018

@prlbr

Does anybody still think that it was a good idea to reuse <h1> instead of a new element <h> for the explicit <section> model?

Yes. It has a better backwards compatibility story. If <h> becomes a standard, there'll be a period of time where it's used, but it's unsupported in user agents. Unless it's polyfilled, this element is no better than a <span> to these users. Given that most screen reader users use headings to navigate pages, wrong-level headings are better than no headings.

@prlbr prlbr commented Apr 23, 2018

@jakearchibald

<h role='heading'> would work for the transitional period. The aria-level attribute even allows defining a specific heading level. Authors could set these attributes manually, but they wouldn't need to: the attributes could be injected by a polyfill, as you say.
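
As a rough illustration of what such a polyfill would have to compute, here is a minimal Python sketch that rewrites markup offline. It is not a browser polyfill. The level rule (one plus the number of open sectioning content ancestors) and the function name `inject_heading_attributes` are assumptions made for this example; nothing in this thread fixes how the level of an <h> element would be derived. The sketch also drops any attributes already present on <h>, as well as comments and doctypes.

```python
from html.parser import HTMLParser

# Assumption: these elements count as sectioning content for the sketch.
SECTIONING = {"article", "aside", "nav", "section"}


class HAttributeInjector(HTMLParser):
    """Re-emit markup, adding role/aria-level to hypothetical <h> elements."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.depth = 0  # number of currently open sectioning content elements
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        if tag == "h":
            # Assumed rule: level = 1 + number of sectioning ancestors.
            level = self.depth + 1
            self.out.append(f'<h role="heading" aria-level="{level}">')
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags are copied through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def inject_heading_attributes(markup):
    parser = HAttributeInjector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

Under these assumptions, an <h> directly in <body> gets aria-level="1" and an <h> inside one <section> gets aria-level="2".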

Collaborator

@jakearchibald jakearchibald commented Apr 24, 2018

@prlbr right, so it comes down to <h> where developers need to remember to assign the correct role and apply the correct default stylings, or <h1> where they don't need to do any of that.

@prlbr prlbr commented Apr 24, 2018

@rehierl I have no prejudice for the name of a heading element for the explicit, nestable <section> model. I would be fine with replacing <h> by anything that has no other conflicting meaning or behaviour (which both <h1> and <title> have).

@jakearchibald Indeed, those would be the disadvantages of <h> for a transitional period.* In my opinion these temporary disadvantages are less of a problem than the continued confusion that is caused by reusing <h1>. Examples of unintuitive behaviour have been given in this thread.

(*That period would already be over if the decision had gone for <h> in the first place, instead of reusing the ubiquitous <h1> for something different from what it had meant for years; the default styling issue wasn't a disadvantage then, as <h1> had an incorrect default styling for non-top-level use too.)

@rehierl rehierl commented Apr 24, 2018

@prlbr The reason I mentioned the <title> element was due to the overlap in semantics, although the spec clearly states "no more than one". But, without any formal design that can be proven to work, it seems too soon to discuss such things. (Not that I don't appreciate your feedback).

@rehierl rehierl commented Apr 25, 2018

@annevk I totally forgot to apologize for giving your proposal such a blow. Even though there are some essential problems with it, I'd like to even thank you for giving it a go. That is, because your proposal allowed me to connect a few more dots. Now, if I could only figure out how to explain that in detail ...

Collaborator

@othermaciej othermaciej commented Apr 25, 2018

Introducing a new <h> element seems outside the scope of this issue, which is to update the outline algorithm for existing elements to something more implementable in browsers.

@rehierl rehierl commented Apr 26, 2018

The main misconception seems to be that sectioning nodes (headings, section elements, etc.) can be treated like any other element: As being isolated and independent from one another. That however is not the case:

  1. With each of those elements, an author associates content. That is, whether the spec has a definition for "it" or not, sections do exist as associated content/nodes.
  2. If sectioning node "A" has a section that contains sectioning node "B", then section "B" is related to section "A". That is, those elements are not independent, they never were!
  3. By now it should not be difficult to accept that mathematics is involved: What a section hierarchy is depends on the definition of subset and/or subsequence.

Sure, under certain conditions you will be consistent. The problem with that is however: You won't be able to guarantee consistency. That is, under different conditions, you will still be in conflict. And that will keep us from having a Semantic Web.

These kinds of issues will keep nagging you over and over again, simply because mathematics is the only language computers understand. And because mathematics is what it is, it is "reality" that needs to change. The only thing you can do to finally solve our problem is to figure out how to properly deal with it.

(But, you already made it pretty clear that there still is not enough interest for a proper solution.)

annevk added a commit that referenced this issue Mar 14, 2019
This makes a number of fairly big changes:

* Introduces a heading and heading level concept.
* Replaces the outline algorithm with a document headings concept.
* Requires document headings to not skip heading levels and start
  with heading level 1.
* Introduces a :heading pseudo-class selector.
* Introduces a :heading(level) functional pseudo-class selector.
* Does away with the section concept (except insofar it's needed to
  influence the heading level of h1/hgroup).
* Does away with sectioning roots.

Tests: ...

Fixes #83.
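
The requirement in this commit message that document headings "not skip heading levels and start with heading level 1" can be read as a simple check over the heading levels in document order. The Python sketch below is one possible interpretation, not the wording of the commit: it assumes that "skipping" means going deeper by more than one level at a time, and that moving back up by any number of levels is allowed. The function name `check_document_headings` is made up for this example.

```python
def check_document_headings(levels):
    """Return a list of problems for heading levels given in document order."""
    problems = []
    previous = 0
    for position, level in enumerate(levels, start=1):
        if position == 1 and level != 1:
            # Assumed rule: the first document heading has level 1.
            problems.append(f"heading {position} has level {level}, expected level 1")
        elif level > previous + 1:
            # Assumed rule: a heading may be at most one level deeper than the last.
            problems.append(
                f"heading {position} skips from level {previous} to level {level}"
            )
        previous = level
    return problems
```

With this reading, the sequence 1, 2, 3, 2 passes, while 1, 3 and 2, 3 are each reported once.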

@Comandeer Comandeer commented Sep 14, 2019

I've read the whole discussion around this issue and #3499 and I'm aware that many of the points I mention were already discussed. Yet I want to add my own perspective, which maybe someone will find useful.

TL;DR

  • keep the treatment of headings the way it is really implemented at the moment;
  • reconsider adding a built-in custom h element with an associated new heading level algorithm;
  • drop hgroup;
  • add section about marking up subheadings.

Issues

After reading the proposed fix in #3499, I noticed some potential issues.

Breakage of existing conforming usage of sectioning elements and headings

Such an example was mentioned here before, but I'd like to return to it once more:

<body>
	<main>
		<article>
			<h1>Title of the page</h1>
		</article>
	</main>
</body>

According to the model defined in #3499, h1 in the above code will become a level 2 heading, so it will no longer be the heading of the whole page. This means the change will effectively make currently conforming markup incorrect, as it will produce an undesirable effect.

I do not agree with the argument that authors shouldn't use the above markup because it is redundant. Even if we consider it redundant, it is still valid HTML. However, I do not agree that it's redundant, and the new proposal seems to agree with me on that. If it were redundant, the result of calculating the heading level in such a structure would be the same as for the following one:

<body>
	<main>
		<h1>Title of the page</h1>
	</main>
</body>

Yet it is not the same, as the second one produces a level 1 heading. So there is a slight difference in semantics. As the spec states, main represents the dominant contents of the document and article a self-contained composition that can be independently distributable or reusable. Joining these two elements means that the dominant contents of the page are additionally self-contained and distributable.
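
To make the difference between the two snippets concrete, here is a small Python sketch of the level calculation as I understand it from #3499. The rule it encodes is a simplification assumed for this example: an h1 gets level 1 plus the number of sectioning content ancestors (article, aside, nav, section), while h2 to h6 keep the number in their name. The class and function names are made up, and hgroup is ignored.

```python
from html.parser import HTMLParser

# Assumption: these elements count as sectioning content for the sketch.
SECTIONING = {"article", "aside", "nav", "section"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingLevels(HTMLParser):
    """Collect (text, level) pairs under a simplified reading of #3499."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # number of open sectioning content elements
        self.current = None  # level of the heading being read, if any
        self.text = []
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif tag == "h1":
            # Assumed rule: only h1 is shifted by its sectioning ancestors.
            self.current = 1 + self.depth
        elif tag in HEADINGS:
            # Assumed rule: h2-h6 keep the number in their name.
            self.current = int(tag[1])

    def handle_data(self, data):
        if self.current is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth -= 1
        elif tag in HEADINGS and self.current is not None:
            self.headings.append(("".join(self.text).strip(), self.current))
            self.current = None
            self.text = []


def heading_levels(markup):
    parser = HeadingLevels()
    parser.feed(markup)
    parser.close()
    return parser.headings
```

Fed the first snippet (h1 inside main and article), this sketch reports level 2 for "Title of the page"; fed the second snippet (h1 directly inside main), it reports level 1.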

What's more, such article is often followed by another section, e.g. comments in a blogpost:

<body>
	<main>
		<article>
			<h1>Title of the page</h1>
		</article>
		<section>
			<h2>Comments</h2>
		</section>
	</main>
</body>

In such case there are two independent sections inside main and the title of the article is also the title of the page. However, according to the new model, the article's heading can't be the heading of the whole page at the same time.

I wonder how popular the above pattern is. I often find it on sites that use WordPress (including mine), as the default themes use it. I'm afraid that changing h1 to be treated in a special way will break a lot of sites.

Compatibility

The new heading level algorithm makes sense only if there is enough support for it. However, there will be a transition period between putting the algorithm into the specification and implementing it in browsers. During that period websites will need polyfills so as not to appear broken in unsupported environments.

Gaining support can take years, as statistics show that IE is still very popular among AT users. Browsers are only one side of the problem; it is easy to imagine that users also run outdated versions of screen readers. In such cases polyfills will be needed for years or even longer. We should also consider the possibility that some browsers won't implement the algorithm, which has already happened with several HTML features, e.g. the dialog element.

Yet there is an even bigger compatibility issue. As I mentioned in the previous point, many sites that now have correct heading levels will become incorrect when the new algorithm is introduced. That would require action from webmasters to update their sites to the new rules. However, many sites that are no longer actively maintained still contain much valuable content. They will become less accessible to users due to the change in the specification. The situation is very similar to Smooshgate, where a normative change would have broken the web (although we can argue whether the case here would really break it or "just" make it less accessible).

Breakage of expectations

For years, the number in an hx element has meant the heading's level. This is true even in the current version of the specification, which defines the outline algorithm:

These elements have a rank given by the number in their name.

The new proposal breaks this expectation and ties heading's level directly with the new algorithm.

There are tons of tutorials that explain heading outlines using their rank, e.g. MDN's one. Changing it would make most of these materials obsolete in the best scenario and harmful to accessibility in the worst one. Web developers building their sites using advice from such tutorials would create websites that are suboptimal for some groups of users.

What's even more confusing is the fact that the special treatment is reserved for only one of the headings: the top-level one. As a result, there may be a lot of incorrect websites built on the assumption that all headings participate in the new algorithm.

How I perceive headings

Before I describe my vision of ideal headings algorithm, I'd like to quickly describe how I personally use headings.

Some time ago I developed (discovered?) something I call Headings First Principle (HFP). The rule is simple: divide the page into sections using headings, e.g.

<body>
	<h1>Title of the article</h1>
	<h2>Subsection 1</h2>
	<h2>Subsection 2</h2>
		<h3>Subsubsection 2.1</h3>
	<h2>Subsection 3</h2>
</body>

After dividing page in such way, I add sectioning elements to make the division explicit:

<body>
	<main>
		<article>
			<h1>Title of the article</h1>

			<section>
				<h2>Subsection 1</h2>
			</section>

			<section>
				<h2>Subsection 2</h2>
				
				<section>
					<h3>Subsubsection 2.1</h3>
				</section>
			</section>

			<section>
				<h2>Subsection 3</h2>
			</section>
		</article>
	</main>
</body>

This way I'm sure that all my headings and sections point to the same portions of the site.
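
The first HFP step, deriving implicit sections from heading ranks alone, can be sketched in Python as follows. This is only an illustration of the principle: the function name `outline_from_ranks` and the dictionary shape are invented for this example, and the nesting rule (a heading opens a subsection of the nearest preceding heading with a lower rank) is the traditional rank-based reading, not the spec's outline algorithm.

```python
def outline_from_ranks(headings):
    """Nest (rank, title) pairs into implicit sections, based on rank alone."""
    root = {"rank": 0, "title": None, "children": []}
    stack = [root]  # chain of currently open sections, outermost first
    for rank, title in headings:
        # Close every open section whose rank is not lower than the new heading's.
        while stack[-1]["rank"] >= rank:
            stack.pop()
        node = {"rank": rank, "title": title, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


# The flat heading list from the first HFP example above.
example = [
    (1, "Title of the article"),
    (2, "Subsection 1"),
    (2, "Subsection 2"),
    (3, "Subsubsection 2.1"),
    (2, "Subsection 3"),
]
outline = outline_from_ranks(example)
```

The resulting tree has one top-level section with three subsections, the second of which contains "Subsubsection 2.1". That is the same shape the explicit <article>/<section> markup expresses.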

It is clear that in the above model sectioning content is somewhat redundant and headings alone can be used to structure the site. However, similar arguments were raised against the main element. In my opinion sectioning content makes the division of the page more explicit or, to say it in HTML terms, more semantic.

All of my proposals are based on the above understanding of headings and sections.

Keep the current status quo

In my opinion the definition of headings should be kept as is, with the association between the number in a heading's name and its heading level. This solution de facto fixes all of the issues mentioned above. The only downside of this approach is that it won't fix the pages that rely on the outline algorithm. However, I'm not sure whether usage of this algorithm is widespread enough to outweigh the breakage of the WordPress sites mentioned earlier.

There is also an issue with the default styling of headings inside sections, due to the rendering guidelines, but I do not find it critical. First of all, the whole rendering section is not, strictly speaking, normative:

User agents are not required to present HTML documents in any particular way. However, this section provides a set of suggestions for rendering HTML documents that, if followed, are likely to lead to a user experience that closely resembles the experience intended by the documents' authors. So as to avoid confusion regarding the normativity of this section, "must" has not been used.

Secondly, thanks to the cascade order, this styling can be overridden by any author heading styles, making it basically a non-issue (as only unstyled pages will be affected by the UA's rules).

Reconsider adding h element

In many cases web developers know exactly which heading level is appropriate for a given part of the content, and in such cases the h1-h6 elements are enough. However, there are also cases in which we do not know the heading level, e.g. external widgets rendered inside Shadow DOM or content generated by users in a CMS/WYSIWYG editor.

In CKEditor 5 we decided to default to h2 as the top-level heading. This solution is based on the assumption that CKEditor 5 won't be used to edit the whole page, but only the content of the article (discussion). Yet such a solution is far from ideal. Introducing an h element would give us the certainty that all headings have correct heading levels (assuming that the editor outputs sections).

That's why I think that the best solution to this issue would be adding an h element to the specification. I do not feel that such a proposal is out of scope of this issue, as the new element would be directly associated with the new algorithm and would be the only, and explicit, way to use it. It would thus act as an opt-in mechanism for the new feature, guarding the Web against the breakage connected with changing the meaning of currently conforming markup.

I also think that the recent changes proposed by Chrome – #4696 and #4697 – will make the introduction of the h element much easier. It can be made a built-in custom element, importable via the API proposed for Layered APIs/JS standard library and therefore a good candidate for e.g. an origin trial in Chrome. That would give the feedback needed to examine further whether it really fixes the issue and whether it can be safely moved to the HTML specification. Such an importable element would also be much easier to polyfill.

If the h element is introduced, then the whole heading level algorithm will be connected only with it, making the proposed changes "local" instead of the current "global" approach. Making them local reduces the possibility of breaking anything to nearly zero.

Drop hgroup

I totally agree with #3499 (comment). Without the outline algorithm, hgroup does not make much sense. What's more, it seems to bring even more confusion about how the new (and old) algorithm works. Dropping it seems the most reasonable way to handle it.

Add section about subheadings

Currently there is no sensible way to mark up subheadings according to the WHATWG standard. The official advice is to use hgroup:

The element is used to group a set of h1–h6 elements when the heading has multiple levels, such as subheadings, alternative titles, or taglines.

However, hgroup was never implemented in an interoperable way and it's basically a styled div in most (if not all) implementations. It is also, as was mentioned before, in danger of being removed from the specification. Therefore there is a need to define a new way of marking up subheadings. I propose adopting the appropriate section from HTML 5.2. This way the practice that is already common on the web will be codified by the current HTML specification.

And I think that would be all for my short thoughts on this issue!

Member

@zcorpan zcorpan commented Sep 16, 2019

At the TPAC HTML/ARIA joint meeting, it was suggested that we should discuss this issue with the APA WG.

annevk added a commit that referenced this issue Oct 11, 2019
This makes a number of fairly big changes:

* Introduces a heading and heading level concept.
* Replaces the outline algorithm with a document headings concept.
* Requires document headings to not skip heading levels and start
  with heading level 1.
* Introduces a :heading pseudo-class selector.
* Introduces a :heading(level) functional pseudo-class selector.
* Does away with the section concept (except insofar it's needed to
  influence the heading level of h1/hgroup).
* Does away with sectioning roots.

Tests: ...

Fixes #83.