
Add heading-focused outlines and :heading #3499

Closed
annevk wants to merge 5 commits

Conversation

annevk
Member

@annevk annevk commented Feb 23, 2018

This makes a number of fairly big changes:

  • Introduces a heading and heading level concept.
  • Replaces the outline algorithm with a document headings concept.
  • Requires document headings to not skip heading levels and start
    with heading level 1.
  • Introduces a :heading pseudo-class selector.
  • Introduces a :heading(level) functional pseudo-class selector.
  • Does away with the section concept (except insofar it's needed to
    influence the heading level of h1/hgroup).
  • Does away with sectioning roots.
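The "start with heading level 1 and do not skip levels" requirement can be sketched as a check over the document's heading levels in document order. This is only an illustration: the function name `check_heading_levels` is hypothetical, and it assumes "skipping" means going more than one level deeper than the preceding heading, which this thread does not spell out.

```python
def check_heading_levels(levels):
    """Return a list of problems for heading levels given in document order.

    Assumption (not taken from the PR text): a heading may be at most one
    level deeper than the heading before it, and the first must be level 1.
    """
    problems = []
    previous = 0  # nothing seen yet
    for index, level in enumerate(levels):
        if index == 0 and level != 1:
            problems.append(f"heading 0 has level {level}, expected 1")
        elif level > previous + 1:
            problems.append(
                f"heading {index} skips from level {previous} to {level}"
            )
        previous = level
    return problems


print(check_heading_levels([1, 2, 3, 2, 3]))  # no problems
print(check_heading_levels([1, 3]))           # one skipped level
```

Returning to a shallower level (3 back to 2) is allowed in this sketch; only jumps to a deeper level are flagged.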

Tests: ...

Fixes #83.



@annevk added the labels normative change, addition/proposal (New features or enhancements), accessibility (Affects accessibility), and needs tests (Moving the issue forward requires someone to write tests) on Feb 23, 2018
@js-choi

js-choi commented Feb 23, 2018

It’s wonderful that this is being, at last, concretely fleshed out.

I’m also interested in heading-level ranges in the pseudo-class, such as :heading(>=3). Should I raise this in a new issue, or would commenting in this pull request or in #83 be okay?

@annevk
Member Author

annevk commented Feb 23, 2018

@js-choi I'd prefer new issues for enhancements on top of #83 / this PR, as they can be added later. It's always good for the initial take to be as simple as possible.

@cookiecrook

@stevefaulkner @alice for additional review.

@sideshowbarker
Member

As for the algorithm at https://whatpr.org/html/3499/sections.html#heading-level that assigns levels to headings, I think that for some normal cases of hgroup usage it breaks author expectations by unintuitively assigning a level that’s different from what authors intend/assume.

For example, consider the following document:

<h1>Screenplays for Kubrick movies from the 1960s</h1>
  <p>The following are some details on movies Stanley Kubrick directed in the 1960s.
  <h2>Spartacus</h2>
    <p>1960; screenplay by Dalton Trumbo, based on a novel by Howard Fast.
  <h2>Lolita</h2>
    <p>1962; screenplay by Vladimir Nabokov, based on his own novel.
  <hgroup>
    <h2>Dr. Strangelove</h2>
    <h3>or: How I Learned to Stop Worrying and Love the Bomb</h3>
  </hgroup>
    <p>1964; screenplay by Kubrick, loosely based on a novel by Peter George.
  <h2>2001: A Space Odyssey</h2>
    <p>1968; screenplay by Kubrick and Arthur C. Clarke, based on a Clarke short story.

Given the above, the algorithm at https://whatpr.org/html/3499/sections.html#heading-level assigns a heading level of 1 to the h1 heading, and a heading level of 2 to every one of the h2 headings except the heading Dr. Strangelove (or: How I Learned to Stop Worrying and Love the Bomb), which it instead assigns a heading level of 1 (because the author chose to put an hgroup element around the h2 title and h3 subtitle for that movie).

It doesn’t seem at all intuitive for that title to be assigned a heading level of 1 (effectively making it the same level as the title of the entire document) instead of being assigned a heading level of 2 (keeping it at the same level as the headings with the other movie titles).

I don’t think authors/developers would expect that title to be a level 1 heading; I think instead they’d expect it to be a level 2 heading like the other titles in the list. And in fact that’s the level the old/existing outline algorithm in the spec would assign it.

@sideshowbarker
Member

As for hgroup and what the heading-level algorithm should do with it instead of what the current patch in this PR branch does: I think the algorithm should just ignore hgroup. That is, h2-h6 headings should always just get assigned their corresponding 2-6 heading level, with no regard for whether they have an hgroup parent or ancestor sectioning content elements.
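To make the difference concrete, here is a rough Python sketch that computes heading levels for the Kubrick example from the previous comment under both behaviours. It is not the PR's spec text: it assumes the patch gives an hgroup (and an h1) a level of one plus the number of sectioning content ancestors, and gives h2-h6 their own number. The names `HeadingLevels`, `heading_levels`, and `hgroup_affects_level` are made up for illustration.

```python
from html.parser import HTMLParser

SECTIONING = {"article", "aside", "nav", "section"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingLevels(HTMLParser):
    """Collect (tag, level) pairs in document order."""

    def __init__(self, hgroup_affects_level):
        super().__init__()
        self.hgroup_affects_level = hgroup_affects_level
        self.depth = 0  # number of open sectioning content ancestors
        self.in_hgroup = False
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif tag == "hgroup" and self.hgroup_affects_level:
            # Assumed PR behaviour: the hgroup is one heading, levelled like an h1.
            self.in_hgroup = True
            self.levels.append((tag, self.depth + 1))
        elif tag in HEADINGS and not self.in_hgroup:
            level = self.depth + 1 if tag == "h1" else int(tag[1])
            self.levels.append((tag, level))

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth -= 1
        elif tag == "hgroup":
            self.in_hgroup = False


def heading_levels(markup, hgroup_affects_level):
    parser = HeadingLevels(hgroup_affects_level)
    parser.feed(markup)
    parser.close()
    return parser.levels


KUBRICK = """
<h1>Screenplays for Kubrick movies from the 1960s</h1>
<h2>Spartacus</h2>
<h2>Lolita</h2>
<hgroup>
  <h2>Dr. Strangelove</h2>
  <h3>or: How I Learned to Stop Worrying and Love the Bomb</h3>
</hgroup>
<h2>2001: A Space Odyssey</h2>
"""

# Behaviour described for the current patch: the hgroup lands on level 1.
print(heading_levels(KUBRICK, hgroup_affects_level=True))
# Behaviour proposed here: hgroup is ignored, so the title stays on level 2.
print(heading_levels(KUBRICK, hgroup_affects_level=False))
```

With hgroup ignored, the subtitle shows up as its own level 3 heading, which is the price of treating hgroup as a no-op.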

The original/current purpose of hgroup is pretty closely bound to the existing/old outline algorithm, in that hgroup was created as a kind of necessary consequence of the fact the outline algorithm introduced the idea that headings create (conceptual) subsections (regardless of whether the headings happen to be marked up with section elements or other sectioning-content elements).

So in that model, it was necessary to ensure that a (conceptual) subsection would not be created by a heading an author intended as a subheading. Thus in the outline algorithm, that’s the effect hgroup has: It just prevents a new subsection from being created in the outline.

But if we dispense with the conceptual model of headings creating subsections (as the patch in this PR branch does), then we no longer need a way to prevent a heading from creating a subsection. So we no longer need hgroup to have any effect. It can (should) basically just become a no-op.

And if we don’t have hgroup actually doing anything, then logically the next question to consider is whether we should keep hgroup around at all. Personally, I think we shouldn’t; I think instead we should deprecate/obsolete it along with dropping the outline algorithm it was designed for. Use-counter data from the HTML checker shows that only around 0.2% of documents are using hgroup anyway.

And if we drop hgroup then the next question to consider is what authors should use to mark up subheadings. I think the answer to that is, they should use whatever they’re already using, because I think it’s clear that among the 99.8% of documents that aren’t using hgroup, there is some significant percentage that have subheadings but whose authors have chosen not to use hgroup to mark them up.

In other words, instead of using hgroup, it seems that authors are largely either not putting any extra markup around heading+subheading groups at all or else they’re just putting a div or p around them.

So I think the vast majority of authors who are using subheadings in their documents aren’t going to care whether we drop hgroup, because they’re not using it anyway.

But all that said, I don’t feel strongly that we must drop hgroup. I guess keeping it around for continued use by the small number of authors who are using it would not do a lot of harm.

However, I do feel strongly that if we keep hgroup it must have no effect on the assignment of heading levels, for the reasons I give in #3499 (comment)

@sideshowbarker
Member

To reinforce the points I made in #3499 (comment) about hgroup not having any purpose outside the context of the old/existing outline algorithm, I want to note the following specific change this patch makes; it takes the following (non-normative) text (emphasis added):

The point of using hgroup in these examples is to prevent the h2 element (which acts as a secondary title) from creating a separate section of its own in any outline

…and changes it to this:

The point of using hgroup in these examples is to prevent the h2 element (which acts as a secondary title) from creating a separate heading of its own.

On the face of it the phrase prevent the h2 element from creating a separate heading doesn’t make sense (in contrast to the phrase prevent the h2 element from creating a separate section of its own in any outline, which does make sense).

What I mean is, the h2 element does in fact create a separate displayed heading of its own in any visual rendering of the document (unless we end up deciding to have hgroup affect how UAs do the default visual rendering of headings, which I hope we won’t…).

So it’s unclear what that explanation prevent the h2 element from creating a separate heading of its own means. I guess one solution to that would be to just drop that explanation. But if we do that, then we’d need to come up with some other explanation for the point of using hgroup in the examples. However, as I pointed out in #3499 (comment), I don’t think there is any good point to using hgroup that we could explain in a way similar to the way we explained it in the context of the (old) outline algorithm.

So I think that argues for dropping the hgroup examples, and I guess for dropping hgroup, since it doesn’t make much sense to have an element in the language whose purpose we can’t explain well and for which we can’t come up with good examples (that is, examples that wouldn’t make just as much sense with the hgroup replaced by a div or whatever).

@prlbr

prlbr commented Apr 3, 2018

As a note to @sideshowbarker's comment:

<hgroup> would have been interesting to me if it had not forced me to use <hx> as a sub-heading.

I understand that using <hx> as a sub-heading was a pattern that had been found in the wild, but as far as I can remember it has been a pattern that accessibility experts have always criticized, similar to using <table> for purely presentational reasons instead of for tabular data. Standardizing this as the correct way to do it has been an unfortunate choice.

In my opinion what we have now is a badly designed <hgroup> element alongside an overloaded <header> element which serves two different purposes with different semantics, depending on whether it is “scoped to the <body>” or not.

“Scoped to the <body>”, the <header> element has an implied role=banner, meaning that it represents content that is site-oriented rather than page-specific. But when it’s not scoped to the <body>, it represents a group of introductory content for the nearest main/section/article/etc. it resides in, so it is specific to where it is “scoped to”.

What I would have liked to have: An element for site-oriented content, say <header>, and an element that groups the heading of a page or section with other introductory stuff, say <hgroup>.

@annevk
Member Author

annevk commented Apr 5, 2018

@sideshowbarker @prlbr I think we need some kind of answer for subheadings though.

@prlbr

prlbr commented Apr 11, 2018

I think we need some kind of answer for subheadings though.

The W3C chose to add a section on subheadings etc. under common idioms without dedicated elements:
https://www.w3.org/TR/html52/common-idioms-without-dedicated-elements.html#subheadings-subtitles-alternative-titles-and-taglines

A new element <hsub> was a favorite of some people in the past:
https://www.w3.org/html/wg/wiki/ChangeProposals/hSub

zcorpan added a commit to web-platform-tests/wpt that referenced this pull request Jun 8, 2018
Part of whatwg/html#3499.

This does not yet test :heading().

@annevk
Member Author

annevk commented Oct 18, 2019

Going through previous sibling might not be that bad (though it is certainly worse than counting ancestors), but I suspect the need is even more complicated and would need to cover cases such as

<article>
  <h1>jsdom</h1>
  <h2>Basic usage</h2>
  <div><h2>Customizing jsdom</h2></div>
  <h3>Simple options</h3>
  ...
</article>

as well, at which point it becomes unworkable. (Also known as the unstated algorithm in the standard today, roughly.)

@domenic
Member

domenic commented Oct 18, 2019

I'm not familiar enough with the details of what's happening to tell why inserting a <div> would make things difficult. I was just envisioning that wrappers like article would cause the heading level of all their children to increase by the most-recently-seen heading level, or something similar.

Edit: hmm, defining "most-recently-seen heading level" in a timeless way makes the difficulties more apparent. E.g. in cases like

<h1>GitHub</h1>
<h2>jsdom/jsdom</h2>

<div>
  <article>
    <h1>jsdom</h1>
    <h2>Basic usage</h2>
    <h2>Customizing jsdom</h2>
    <h3>Simple options</h3>
    ...
  </article>
</div>

(which I believe is more realistic with regard to GitHub's actual markup) do not provide any easy way of finding <h2>jsdom/jsdom</h2> from <article>. It's certainly not "previous sibling heading element". So I can see how if our goal is simplicity + support for h1-only pages, we might have to drop support for wrapping user-generated content.

@Dan503

Dan503 commented Oct 18, 2019

If I recall correctly, the ask was to be able to wrap user-generated content in an <article> or similar, and thus adjust all its heading levels.

I'm a little bit concerned about whether this idea breaks the meaning of the <article> element.

Article is meant to be a self contained piece of content that can make sense on its own when taken out of context.

Does this idea go against the fundamental meaning of what the <article> element is meant to represent?

I do like the idea in general though so maybe <section> would be better for this functionality. Or maybe a new element if necessary.

@muan
Member

muan commented Oct 21, 2019

Thanks @domenic for mentioning our issue. I've been quietly following the thread and can see how this is all complicated, so haven't been able to add anything.

I don't know if introducing a new element will solve this since what algorithmically couldn't be done for <article> would be applied to this new element too, wouldn't it?

I might be way off… could introducing a new attribute, to make "most-recently-seen heading level" explicit, work?

<h1>GitHub</h1>
<h2>jsdom/jsdom</h2>
<div>
  <article headinglevelstart="3">
    <h1>jsdom</h1>
    <h2>Basic usage</h2>
    <h2>Customizing jsdom</h2>
    <h3>Simple options</h3>
    ...
  </article>
</div>

Since the alternative is generating markup that has the explicit <h3>/<h4>/<h5>, the developers would need to have the starting level information either way. An attribute would save us from altering user generated content.
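Read this way, the attribute is simple to evaluate because it only needs ancestor information. A minimal Python sketch of that reading, assuming `headinglevelstart` is honoured on sectioning content elements only and that an hN inside such an element gets level `headinglevelstart + N - 1` (the class and function names are hypothetical, and the attribute itself is only a proposal):

```python
from html.parser import HTMLParser

SECTIONING = {"article", "aside", "nav", "section"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class StartLevels(HTMLParser):
    """Collect (tag, level) pairs, shifting levels by headinglevelstart."""

    def __init__(self):
        super().__init__()
        self.bases = [1]  # level an h1 gets in the current scope
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            start = dict(attrs).get("headinglevelstart")
            # Without the attribute, inherit the enclosing scope's base.
            self.bases.append(int(start) if start else self.bases[-1])
        elif tag in HEADINGS:
            self.levels.append((tag, self.bases[-1] + int(tag[1]) - 1))

    def handle_endtag(self, tag):
        if tag in SECTIONING and len(self.bases) > 1:
            self.bases.pop()


def levels_with_start(markup):
    parser = StartLevels()
    parser.feed(markup)
    parser.close()
    return parser.levels


EXAMPLE = """
<h1>GitHub</h1>
<h2>jsdom/jsdom</h2>
<div>
  <article headinglevelstart="3">
    <h1>jsdom</h1>
    <h2>Basic usage</h2>
    <h2>Customizing jsdom</h2>
    <h3>Simple options</h3>
  </article>
</div>
"""

print(levels_with_start(EXAMPLE))
```

The sketch does not validate the attribute value or cap levels, since neither is discussed here.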

@Dan503

Dan503 commented Oct 22, 2019

I like your idea, @muan. It is reminiscent of how <ol> works with its start attribute (<ul> has no start attribute). That would also make it easy to teach, since parallels can be drawn with how <ol> works.

This also might help solve the problem of nested sections labelled with aria-label increasing the heading level past what we want.

We don't have to move aria-label into a visually hidden heading if we can just override the base heading level that will be output 😁

This is how I imagine it working:

Without the attribute:

<body>
  <h1>Read as h1 :)</h1> 

  <p>content</p>

  <aside>

    <section>
      <h1>This is read as a h3 :(</h1>
      <p>content</p>
    </section>

    <section>
      <h1>Also read as a h3 :(</h1>
      <p>content</p>
    </section>

  </aside>
</body>

With the attribute:

<body>
  <h1>Read as h1 :)</h1> 

  <p>content</p>

  <aside headinglevelstart="1">

    <section>
      <h1>This is read as a h2 :)</h1>
      <p>content</p>
    </section>

    <section>
      <h1>Also read as a h2 :)</h1>
      <p>content</p>
    </section>

  </aside>
</body>

@annevk
Member Author

annevk commented Oct 22, 2019

I filed #5033 on that suggestion. Seems like a reasonable follow-up to me once we have heading level infrastructure in place, and as it only requires going through ancestors it shouldn't pose much of an issue implementation-wise.

@Dan503

Dan503 commented Oct 22, 2019

This headinglevelstart attribute could also be a good way of telling browsers to use this heading levels algorithm.

<body>
  <!-- heading levels ignore sectioning elements -->
</body>
<body headinglevelstart="1">
  <!-- heading levels are affected by sectioning elements -->
</body>

Just implementing this algorithm directly into browsers across all websites across the whole world would break backwards compatibility.


@Dan503

Dan503 commented Oct 22, 2019

This also might help solve the problem of nested sections labelled with aria-label increasing the heading level past what we want.

I've had a better idea about this. It can be explicitly written in the spec that if a sectioning element does not have a heading element associated with it, then it does not affect heading levels. That would probably be much easier and also much less likely to break backwards compatibility.

<body>
  <h1>Read as h1</h1> 

  <p>content</p>

  <!-- no heading element association so heading levels are not incremented -->
  <aside>

    <section>
      <h1>This is read as a h2 :)</h1>
      <p>content</p>
    </section>

    <section>
      <h1>Also read as a h2 :)</h1>
      <p>content</p>
    </section>

  </aside>
</body>

I still like the headinglevelstart attribute. This addition to the spec would mainly mean that we don't have to use the attribute as often and it would help prevent backwards compatibility issues.
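A rough Python sketch of that rule, for illustration only. It assumes "associated with" means the sectioning element has a heading as a direct child, which is my reading and not something the proposal pins down, and it assumes the markup is well-formed enough to parse as XML. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

SECTIONING = {"article", "aside", "nav", "section"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def heading_levels(element, depth=0):
    """Yield (text, level) pairs in document order.

    Only sectioning elements that have a heading as a direct child
    (an assumption) add to the depth used for h1 levels.
    """
    for child in element:
        if child.tag in SECTIONING:
            has_heading = any(grandchild.tag in HEADINGS for grandchild in child)
            yield from heading_levels(child, depth + (1 if has_heading else 0))
        elif child.tag in HEADINGS:
            level = depth + 1 if child.tag == "h1" else int(child.tag[1])
            yield (child.text, level)
        else:
            yield from heading_levels(child, depth)


doc = ET.fromstring("""
<body>
  <h1>Page title</h1>
  <aside>
    <section><h1>First</h1></section>
    <section><h1>Second</h1></section>
  </aside>
</body>
""")

print(list(heading_levels(doc)))
```

Because the aside has no heading of its own, only each section contributes a level, so both nested h1 elements come out at level 2.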

@MarcoZehe

I don't think that's true. We've implemented the proposed algorithm in Firefox Nightly and there's been some changes observed that could be considered breakage, but overall it seems reasonable thus far.

I disagree with this assessment. Lots of WordPress themes I've visited wrap a post in an article inside a main, and the comments are outside the article, but inside the main. The h1 that contains the blog post title is in all these cases remapped to a level 2, which is wrong, since it is then put in line with the h2s inside the post that denote subsections. And those pages then no longer have an h1 altogether.

Further, one of the major news sites for IT topics in Germany, Heise, uses articles as well, putting their h1 out of service.

The Mozilla instance of Bugzilla's view-bug page breaks; it no longer has an h1 either.

And I've seen others, which I forgot to take notes about, but to me, this seems pretty significant. The breakage inflicted on WordPress blogs alone is pretty substantial.

@Comandeer

Comandeer commented Oct 22, 2019

I agree with @MarcoZehe. I raised this before in #83 (comment): main headings in default WordPress themes, e.g. Twenty Fifteen, are inside main > article on subpages dedicated to a single blog post. They are used as both page and blog post main headings:

<body>
  […]
  <main>
    <article>
      <h1>Blog post heading</h1>
    </article>
  </main>
  […]
</body>

I downloaded the newest Firefox Nightly and tested it on several WP sites (including mine), and Firefox treats such a heading as a level 2 one, so the main heading of the page is removed in the process.

However, this pattern is not common only in the context of WordPress and its default themes; it is also present in many tutorials about the basics of HTML and accessibility (including mine ;)), at least in Poland.

I see two possible solutions for this issue:

  • mechanism for opting-in for the new algorithm. The proposed attribute ([headinglevelstart]) seems sensible.
  • always treat the first h1 on a page as level 1 heading or treat the first :is(article, section) h1 as level 1 heading if there is no h1 element directly in body. However, it would make the whole algorithm much more convoluted.

@Dan503

Dan503 commented Oct 22, 2019

I see two possible solutions for this issue:

  • mechanism for opting-in for the new algorithm. The proposed attribute ([headinglevelstart]) seems sensible.
  • always treat the first h1 on a page as level 1 heading or treat the first :is(article, section) h1 as level 1 heading if there is no h1 element directly in body. However, it would make the whole algorithm much more convoluted.

I would much rather have an opt-in mechanism. It is far less likely to result in bugs that are impossible for the developer to fix outside of forcing it with an aria-level attribute on the heading element.

The second option you gave doesn't really work if there is a left sidebar on the page with a heading in it. The sidebar heading would be encountered first so it would become the <h1> heading for the page. The main page content heading would remain at <h2> level.

@annevk
Member Author

annevk commented Oct 22, 2019

@MarcoZehe yeah, I clearly wrote that too quickly. If reporting some h1s as level 2 is considered too much breakage we should make the standard reflect the status quo of h1-h6, even if it makes the default styling for some elements rather weird. Not sure what it means for hgroup, I guess it'll remain a container of sorts.

@domenic
Member

domenic commented Jan 23, 2020

@annevk it appears from https://groups.google.com/d/msg/mozilla.dev.platform/SdnMKYwWxzU/U-v_b8c2BwAJ that this was not able to be implemented. Should we abandon this approach, and instead update the outline algorithm to just assemble an outline based on h1-hN, and update all the examples in the spec accordingly?

I guess we may still want to build it on top of this PR, since you've done a lot of the work to remove concepts like sectioning roots, etc.

@annevk
Member Author

annevk commented Jan 24, 2020

Yeah, though you'll run into "what to do with hgroup" pretty quickly.

@sideshowbarker
Member

Yeah, though you'll run into "what to do with hgroup" pretty quickly.

Yeah, so I think it’s time we dropped (obsoleted) hgroup: #6462

I believe dropping hgroup will un-block the patch in this PR, which will (to recap) in turn allow us to finally get rid of the unimplemented HTML outline algorithm (and so, stop misleading web developers and causing confusion for them on a large scale by having something in the spec that doesn’t actually match reality as implemented in browser engines…).

@sideshowbarker
Member

sideshowbarker commented Mar 20, 2021

To be clear about another thing: We could go ahead and just drop the current outline algorithm altogether. We don’t need to block dropping of the current outline algorithm on getting resolution for this PR first.

We could drop the current outline algorithm first, and then continue with trying to get this PR resolved.

But even if/when we were to reach resolution on this PR, it would still be a relatively long time before we could merge it, because we’d also need implementations (or at least explicitly-stated implementation commitments), and we’d need tests.

However, in contrast, we could go ahead and just drop the current outline algorithm immediately, basically; we don’t need to wait on implementors to make any changes (since it’s never been implemented, removing it would not require implementations to change), and no tests are needed.

And sorry if somewhere in this PR discussion I already previously said the above; it’s hard to remember, with a PR that’s been open for more than 3 years, created to resolve an issue that was reported more than 5 years ago.

And in fact I see/recall now that “just remove the outline algorithm altogether” was already proposed more than 5 years ago.

stevefaulkner added a commit to stevefaulkner/html-1 that referenced this pull request Apr 18, 2022
used modified text of @annevk PR headings and sections section whatwg#3499
stevefaulkner added a commit to stevefaulkner/html-1 that referenced this pull request Apr 18, 2022
general clean up and closer alignment with @annevk PR whatwg#3499
@domenic domenic closed this in 6682bde Jul 1, 2022
@annevk annevk deleted the annevk/heading-level branch August 29, 2022 16:15
Development

Successfully merging this pull request may close these issues.

Suggest adding a warning about outline algorithm