BUG: standalone HTML and Restructured Text wrong heading #3119

Closed
norok2 opened this issue Sep 14, 2016 · 10 comments

@norok2

norok2 commented Sep 14, 2016

When converting reStructuredText to HTML with:

pandoc --standalone --section-divs --read rst --write html5

the title and the first heading get the same heading level, and the levels of all following headings are shifted accordingly.
The --section-divs flag does not influence this behavior.

For example, with the following input (the leading . in the underline of heading3 is there only to prevent GitHub from rendering it, and should be removed for testing):

=====
Title
=====

heading1
========

heading2
--------

heading3
.~~~~~~~~

the output is:

<head>
  <meta charset="utf-8">
  <meta name="generator" content="pandoc">
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
  <title>Title</title>
  <style type="text/css">code{white-space: pre;}</style>
  <!--[if lt IE 9]>
    <script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
  <![endif]-->
</head>
<body>
<header>
<h1 class="title">Title</h1>
<h1 class="subtitle">heading1</h1>
</header>
<section id="heading2" class="level1">
<h1>heading2</h1>
<section id="heading3" class="level2">
<h2>heading3</h2>
</section>
</section>
</body>
</html>

When --standalone is not used, the expected output is obtained, i.e.:

<section id="title" class="level1">
<h1>Title</h1>
<section id="heading1" class="level2">
<h2>heading1</h2>
<section id="heading2" class="level3">
<h3>heading2</h3>
<section id="heading3" class="level4">
<h4>heading3</h4>
</section>
</section>
</section>
</section>

EDIT: minor fixes

@norok2
Author

norok2 commented Sep 14, 2016

I also tried a different input (same flags; again, mind the leading . in the underlines of heading3 and heading4):

=====
Title
=====

--------
subtitle
--------

heading1
========

heading2
--------

heading3
.~~~~~~~~

heading4
.````````

The non --standalone output:

<section id="title" class="level1">
<h1>Title</h1>
<section id="subtitle" class="level2">
<h2>subtitle</h2>
<section id="heading1" class="level3">
<h3>heading1</h3>
<section id="heading2" class="level4">
<h4>heading2</h4>
<section id="heading3" class="level5">
<h5>heading3</h5>
<section id="heading4" class="level6">
<h6>heading4</h6>
</section>
</section>
</section>
</section>
</section>
</section>

The --standalone output:

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta name="generator" content="pandoc">
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
  <title>Title</title>
  <style type="text/css">code{white-space: pre;}</style>
  <!--[if lt IE 9]>
    <script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
  <![endif]-->
</head>
<body>
<header>
<h1 class="title">Title</h1>
<h1 class="subtitle">subtitle</h1>
</header>
<section id="heading1" class="level1">
<h1>heading1</h1>
<section id="heading2" class="level2">
<h2>heading2</h2>
<section id="heading3" class="level3">
<h3>heading3</h3>
<section id="heading4" class="level4">
<h4>heading4</h4>
</section>
</section>
</section>
</section>
</body>
</html>
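
To compare the two outputs above without reading the markup by eye, the heading levels can be pulled out of each one. The following is a minimal sketch using only Python's standard `html.parser`; the class and function names (`HeadingOutline`, `heading_outline`) are made up for this illustration and are not part of pandoc or docutils.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for every h1..h6 element, in document order."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open heading element.
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def heading_outline(html):
    """Hypothetical helper: return the list of (level, text) headings in `html`."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline
```

Run on the --standalone output, this reports Title, subtitle, and heading1 all at level 1, whereas the non-standalone output yields levels 1 through 6.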

@jgm
Owner

jgm commented Sep 15, 2016

The output with --standalone is correct as far as I can see. Top-level section headers are converted to title and subtitle, and the other sections are promoted, as specified in the docutils documentation. Note that we agree with rst2html.py except on minor details (e.g. it uses an h2 for the subtitle). It gives:

<h1 class="title">Title</h1>
<h2 class="subtitle" id="heading1">heading1</h2>

<div class="section" id="heading2">
<h1>heading2</h1>
<div class="section" id="heading3">
<h2>heading3</h2>
</div>
</div>
</div>
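
The promotion rule described in this comment can be sketched in a few lines of Python. This is a simplified model written for illustration, not the docutils or pandoc implementation: it assumes a document is just an ordered list of `(level, text)` headings, ignores body text between them, and the function name `promote` is hypothetical.

```python
def promote(headings):
    """Model title/subtitle promotion on a list of (level, text) headings.

    Assumption: a heading is promoted only if it is the sole top-level
    heading and it opens the document; the remaining headings then move
    up one level. Real docutils also looks at the content between headings.
    """
    meta = {}
    rest = list(headings)
    for key in ("title", "subtitle"):
        top = [h for h in rest if h[0] == 1]
        # Stop unless exactly one level-1 heading exists and it comes first.
        if len(top) != 1 or rest[0][0] != 1:
            break
        meta[key] = rest[0][1]
        # Drop the promoted heading and shift everything else up a level.
        rest = [(level - 1, text) for level, text in rest[1:]]
    return meta, rest


meta, rest = promote([(1, "Title"), (2, "heading1"), (3, "heading2"), (4, "heading3")])
print(meta)  # {'title': 'Title', 'subtitle': 'heading1'}
print(rest)  # [(1, 'heading2'), (2, 'heading3')]
```

Under these assumptions the result matches the first --standalone output in this thread: heading1 becomes the subtitle, heading2 ends up at level 1, and heading3 at level 2.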

@tarleb
Collaborator

tarleb commented Sep 15, 2016

W3C recommends not to use heading elements for subtitles:

h1–h6 elements must not be used to markup subheadings, subtitles, alternative titles and taglines unless intended to be the heading for a new section or subsection.

They also offer a list of common idioms to work around this. Wrapping title and subtitle in spans and putting them in the same heading might be a viable solution.

<h1><span class="title">Title</span><span class="subtitle">heading1</span></h1>

@jgm
Owner

jgm commented Sep 15, 2016

@tarleb that solution is going to produce very ugly output without special CSS.

@tarleb
Collaborator

tarleb commented Sep 15, 2016

That's true. Wrapping things in a <header> element and demoting the subtitle to a paragraph would probably yield better results, but might violate users' expectation to see their top-level header wrapped in an h1. Adding a <br> between spans would help readability, but feels weird. I don't know. Would adding a new CSS style to the default template be an option?

@jgm
Owner

jgm commented Sep 15, 2016

+++ Albert Krewinkel [Sep 15 16 01:34 ]:

That's true. Wrapping things in header would probably yield better
results, but might violate user expectation to see their top-level
header wrapped in a h1. Adding a <br>
between spans would help
readability, but feels weird. I don't know. Would adding a new style to
the default template be an option?

I've tried to avoid doing that, but it's an option.

It's hard to see what the real downside to using an h1 is
(since it can be styled using the class). Maybe someone
could explain that.

@tarleb
Collaborator

tarleb commented Sep 15, 2016

Right, sorry.

There used to be an old rule that multiple <h1> elements confuse search engines and would lead to lower search rankings. I believe this is no longer true, but it's still deeply ingrained in many web developers' minds.

The other reason is that the W3C designed an outline algorithm which is intended to allow well-defined TOC generation. AFAIK no widely used software actually supports that feature, so violating the recommendation should not cause many problems in practice. I like standards conformance though; going against it rubs me the wrong way. It's more of an aesthetic than a practical issue.

@jgm
Owner

jgm commented Sep 15, 2016 via email

@tarleb
Collaborator

tarleb commented Sep 15, 2016

Using HTML5 and --section-divs solves most of the W3C outlining algorithm issues, so in general the current behavior should be fine. The only possible issue is with the subtitle (I misunderstood that at first). After further thought, my personal preference would be to wrap the subtitle into a specially styled <p> element instead of an <h1>.

<header>
<h1 class="title">Title</h1>
<p class="subtitle">Subtitle</p>
</header>

It's among the common idioms listed by the W3C and should look okay even when it is unstyled.
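
To check whether generated HTML still marks up a subtitle as a heading, something like the following could be used. It is a minimal sketch based on Python's standard `html.parser`; it assumes the subtitle carries `class="subtitle"`, as in the outputs quoted in this thread, and the names `SubtitleHeadingFinder` and `subtitle_headings` are invented for this example.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class SubtitleHeadingFinder(HTMLParser):
    """Record every heading element whose class list contains 'subtitle'."""

    def __init__(self):
        super().__init__()
        self.offenders = []

    def handle_starttag(self, tag, attrs):
        # A class attribute may be missing or valueless, hence the `or ""`.
        classes = (dict(attrs).get("class") or "").split()
        if tag in HEADING_TAGS and "subtitle" in classes:
            self.offenders.append(tag)


def subtitle_headings(html):
    """Hypothetical helper: return the tags of headings used as subtitles."""
    finder = SubtitleHeadingFinder()
    finder.feed(html)
    finder.close()
    return finder.offenders
```

For the current --standalone output this returns a single `h1`; for the proposed `<p class="subtitle">` markup it returns an empty list.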

@jgm
Owner

jgm commented Sep 15, 2016

I don't feel strongly about it. I think I'd be fine with
changing subtitle to go in a p tag. (We should also
consider author, date.)

I think this only concerns the HTML template.
