How to add a page break (via Sphinx rst)? #186

Closed
KubaO opened this issue Oct 14, 2020 · 14 comments

KubaO commented Oct 14, 2020

A seemingly simple problem: how do I insert page breaks at arbitrary points in an rst document that is passed to rinoh with Sphinx as the front end? My search-fu comes up dry, since Sphinx is apparently a document processor that doesn't understand the need for a common way to express page breaks (I find that hilarious and saddening at once: this is pretty much trivial in any other text "processing" system, even ancient ones)...

Owner

brechtm commented Oct 15, 2020

There are two ways to insert page breaks:

  1. Using a custom style sheet, you can force page breaks before arbitrary sections by setting the page_break style attribute. Since v0.4.3.dev1, page_break can be set on any flowable, not just sections. To insert a page break at an arbitrary point, add a class to a directive by setting the :class: attribute, or using the (rst-)class directive. The page break will be inserted before the corresponding element. Assuming v0.4.3.dev1:

    your reStructuredText file:

    .. image:: images/screenshot.png
       :class: page-break
    
    A regular paragraph.
    
    .. rst-class:: page-break
    
    This paragraph will trigger a page break.

    your custom style sheet:

    [page-break-paragraph : Paragraph(has_class="page-break")]
    base = default
    page_break = any
    
    [page-break-image : Image(has_class="page-break")]
    base = image
    page_break = any

    Note that the newly defined styles will also determine the styling of the page-breaking element. To style them like other elements in the document, you need to set their base style to the default style. Refer to the style log to figure out which styles these are.

  2. This is undocumented, but rinohtype supports rst2pdf's PageBreak. Note that only PageBreak (without argument) is supported (so no EvenPageBreak or OddPageBreak). To use it, insert the following into the reStructuredText where you want the break:

    .. raw:: pdf
    
       PageBreak

You should avoid inserting page breaks at arbitrary locations to improve page layout, e.g. to avoid widows or orphans. As soon as your document contents change, you'll have to reposition the page breaks. Widow and orphan handling should eventually be handled automatically by rinohtype.

A better use of page breaks is to apply them systematically, for example before every new section. In this case, the first method is the recommended way of doing this, but I understand that this is not trivial to figure out given the current documentation. A tutorial on document styling should be able to help with that (#168).

Author

KubaO commented Oct 15, 2020

That's exactly what I was missing. Thank you!

The problem with breaks on sections is that the section level alone is not enough context in my document: there is a chapter where sections at a certain level should start on a new page, but only in that chapter and not elsewhere. I imagine that this is not an uncommon problem, especially in larger technical reference manuals that rinohtype would be otherwise a good match for.

Owner

brechtm commented Oct 16, 2020

> there is a chapter where sections at a certain level should start on a new page, but only in that chapter and not elsewhere

In this case you can set an ID on the chapter and create a new style with a selector that matches all subsections of that chapter.

[page-break-sections : Section(id='chapter-x') / Section]
page_break = any

@brechtm brechtm closed this as completed Oct 16, 2020
Author

KubaO commented Oct 16, 2020

While the raw pdf PageBreak directive works stand-alone, it doesn't work as a substitution, i.e.:

.. |newpage| raw:: pdf

   PageBreak

Is this a restructured text "spec bug", or a Sphinx bug, or something that rinohtype needs to add special handling for, or is there a workaround?

The alternative form using explicit replacements doesn't work either:

.. role:: raw-pdf(raw)
   :format: pdf

.. |newpage| replace:: :raw-pdf:`PageBreak`

Summary:

line1

.. raw:: pdf

   PageBreak

line2 - on new page

:raw-pdf:`PageBreak`

line3 - same page

|newpage|

line4 - same page

There's something fundamentally broken here, although I imagine it's Sphinx's fault, as such things are notoriously underdocumented and there seem to be no test cases for them.

Owner

brechtm commented Oct 17, 2020

> While the raw pdf PageBreak directive works stand-alone, it doesn't work as a substitution.

As far as I know, substitutions can only be used for inline content. And the only supported substitution directives are replace, image, unicode and date. So I don't think reStructuredText supports what you want to achieve.

Note that inserting a page break through a raw role is not supported, and rst2pdf does not support it either. It simply wouldn't make much sense to insert a page break in the middle of a paragraph. Sure, you could use the |newpage| substitution in an otherwise empty paragraph, but that feels more like a hack.

You could create a custom directive similar to rst2pdf's page directive. That would just output a raw directive with PageBreak.
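
A minimal sketch of such a directive, assuming plain docutils. The directive name `page-break` and the class name `PageBreakDirective` are made up for this example. The node it returns is the same kind of raw node that `.. raw:: pdf` produces, which is what rinohtype is described above as picking up.

```python
from docutils import nodes
from docutils.parsers.rst import Directive, directives


class PageBreakDirective(Directive):
    """Hypothetical shorthand for ``.. raw:: pdf`` followed by ``PageBreak``."""

    has_content = False  # the directive takes no arguments and no body

    def run(self):
        # Build the node directly instead of re-parsing a raw directive.
        return [nodes.raw('', 'PageBreak', format='pdf')]


# Register with docutils; the name 'page-break' is an assumption of this sketch.
directives.register_directive('page-break', PageBreakDirective)
```

With this registered, `.. page-break::` on its own line would stand in for the three-line raw directive. In a Sphinx project the class could instead be registered from an extension's `setup()` function with `app.add_directive('page-break', PageBreakDirective)`.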

> There's something fundamentally broken here, although I imagine it's Sphinx's fault, as such things are notoriously underdocumented and there seem to be no test cases for them.

I don't think it's broken. It's just not an included feature. But it could be useful to have a directive-equivalent of substitutions. Something along these lines:

.. |page-break| directive::

   .. raw:: pdf

      PageBreak

Use it like this:

.. |page-break|::

That would require changes to the docutils core, however. The following could perhaps be implemented as a third-party directive:

.. define-alias:: page-break

   .. raw:: pdf

      PageBreak

Use it like this:

.. alias:: page-break
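
A rough sketch of how the `define-alias` / `alias` pair could be written as third-party docutils directives. Nothing here exists in docutils, Sphinx or rinohtype: the class names and the module-level `_ALIASES` store are assumptions made for illustration. A real Sphinx extension would more likely keep the definitions on the build environment, so that they do not leak between documents.

```python
from docutils import nodes
from docutils.parsers.rst import Directive, directives
from docutils.statemachine import StringList

# Hypothetical storage: alias name -> list of source lines.
# Module-level for brevity only; it is shared by every document parsed.
_ALIASES = {}


class DefineAlias(Directive):
    """``.. define-alias:: NAME`` stores its body without rendering it."""

    required_arguments = 1
    has_content = True

    def run(self):
        _ALIASES[self.arguments[0]] = list(self.content)
        return []


class Alias(Directive):
    """``.. alias:: NAME`` parses the stored body at the point of use."""

    required_arguments = 1

    def run(self):
        name = self.arguments[0]
        if name not in _ALIASES:
            raise self.error('Unknown alias "%s".' % name)
        container = nodes.Element()
        lines = StringList(_ALIASES[name], source='alias:' + name)
        # Parse the stored lines as block-level reStructuredText.
        self.state.nested_parse(lines, self.content_offset, container)
        return container.children


directives.register_directive('define-alias', DefineAlias)
directives.register_directive('alias', Alias)
```

Because the body is parsed as block-level content, the alias can expand to a `raw` directive, which a substitution cannot do.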

I do have to agree that reStructuredText is often hard to grasp and sometimes very confusing though! And the docutils homepage that's straight from the 90s makes finding answers unnecessarily painful. But I guess it's the best that's available, with perhaps the exception of asciidoc (which I haven't used).

Author

KubaO commented Oct 18, 2020

The thing is: all those "unsupported" features should be producing errors or at least warnings. Of course that's a problem with Sphinx and/or docutils, but I find it amusing and saddening at once how unfinished those are. The hoops necessary to make this stuff work for the simplest things seem like something from the mainframe era :( Things like page breaks and macros really should be a zero-friction user experience in anything that's not a toy project... I'm sorry that you have to indirectly deal with the sad state of Sphinx.

Owner

brechtm commented Oct 19, 2020

> The thing is: all those "unsupported" features should be producing errors or at least warnings.

I didn't try this before, but now I see that .. |newpage| raw:: pdf is indeed not producing any errors. It seems it is (almost) equivalent to a custom raw-pdf role, like you suggested:

.. |newpage| raw:: pdf

   PageBreak

.. role:: raw-pdf(raw)
   :format: pdf

|newpage| is almost equivalent to :raw-pdf:`PageBreak`

produces this document tree (rst2pseudoxml.py output):

<document source="test.rst">
    <substitution_definition names="newpage">
        <raw format="pdf" xml:space="preserve">
            PageBreak
    <paragraph>
        <raw format="pdf" xml:space="preserve">
            PageBreak
         is almost equivalent to
        <raw classes="raw-pdf" format="pdf" xml:space="preserve">
            PageBreak

So it seems there is support for a raw substitution directive, but it isn't documented. You can open an issue with docutils to address this, if you like. Of course, this will not solve your original problem since these produce inline content.
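
For anyone who wants to inspect such a tree without the command-line tool, the following sketch does roughly the same from Python. It assumes only that docutils is installed; the helper name `doctree_as_pseudoxml` is made up here.

```python
from docutils.core import publish_doctree

SOURCE = """\
.. |newpage| raw:: pdf

   PageBreak

.. role:: raw-pdf(raw)
   :format: pdf

|newpage| is almost equivalent to :raw-pdf:`PageBreak`
"""


def doctree_as_pseudoxml(source):
    """Parse reStructuredText and return the document tree as pseudo-XML text."""
    return publish_doctree(source).pformat()


print(doctree_as_pseudoxml(SOURCE))
```

Both raw nodes end up inside a paragraph, which shows that neither form produces block-level content.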

> The hoops necessary to make this stuff work for the simplest things seem like something from the mainframe era :(

I have briefly fantasized about writing a new reStructuredText parser in the past, but I believe I am unlikely to produce anything that performs better than docutils in any realistic timeframe. I'm afraid there is just a lot of complexity to parsing a structured text syntax, simply because it is not as clearly defined as, say, XML.

I think the best way to deal with this problem is to help improve docutils and work on improved documentation available through an 'official' domain like restructuredtext.org or restructuredtext.net. With respect to the latter, perhaps you are interested in helping set up something like that? I'm sure we could easily find a handful of people that would be interested in working on that, for example the folks over at ReadTheDocs.


hamzamohdzubair commented Jan 21, 2021

I use the following

.. raw:: pdf

   PageBreak

after every section in every chapter.

Is there a way to achieve the same result by adding something to the style sheet, so I don't have to add that code after every section in every chapter?

Unwittingly (adapting from one of the answers above), I tried something like this:

[page-break-sections: Section]
page_break = any

not knowing that this actually adds a page break before every section, not after.

Owner

brechtm commented Jan 21, 2021

@hamzamohdzubair The page_break style property indeed controls page breaks before each flowable, because that seemed the most intuitive to me (i.e. always start chapters on a new page). If you want to add a page break after a given flowable, you'll have to set the page_break property on the next flowable.

I'm interested to hear about your specific use case that seems to map better to a page-break-after functionality.


hamzamohdzubair commented Jan 21, 2021

@brechtm Actually, I just want every chapter to start on a new page, as well as every section. How should I do that? The only thing I was wondering is whether it will also create a page break for the first section. Then the first page of a new chapter would be empty.

Owner

brechtm commented Feb 4, 2021

I see. To prevent a page break before the first section, a selector would need to be able to differentiate between the first and subsequent sections. That could be handled by adding a new property 'position' to the Section class (or its superclass Flowable, to make it more generally usable) that returns the position of the section within its parent Section (a chapter is also a Section). The selector could then check for position != 0.

But what if you have, for example, an introductory paragraph before the first section of a chapter? Do you still want that section to appear on the same page? The above solution will not be able to differentiate between the two options.

I had a go at implementing the position property in the wip/section-position branch. Would you be interested in continuing this and providing a pull request?

Owner

brechtm commented Feb 4, 2021

I just realized that a more pragmatic solution is to set a custom class (e.g. no-break) on the first section of each chapter in your reStructuredText sources and make your selector only match other sections. That will require the implementation of a not_has_class property, which is trivial to implement.

While this isn't automatic, it does leave you the option of removing the class in cases where you do want a page break for the first section, such as when you have an introductory paragraph.

I do agree that a page_break_after style property would be useful here. But I'm still hesitant to add a second page-breaking property...

Owner

brechtm commented Feb 4, 2021

Yet another option is to make flowables, just like inline elements, accept before and after style properties. These would allow inserting arbitrary flowables, including page breaks.

Owner

brechtm commented Feb 4, 2021

@hamzamohdzubair Oops. I just noticed that feature was already implemented by yours truly some time ago! 🤣
rinohtype will only perform the page break if there is already a non-heading element on the page†. So just set the page_break property to any for your sections and see what happens.

† Actually, this is a side-effect of the keep_with_next attribute set on headings (see mark_page_nonempty()); a page break is only performed when the page is no longer considered "empty".
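
The rule can be illustrated with a toy model. This is not rinohtype code: the function `paginate` and the tuple layout are invented for this sketch, and the only behaviour it encodes is the one stated above, namely that a requested break is skipped while the page holds nothing but headings.

```python
def paginate(flowables):
    """Toy model of the rule above (not rinohtype's implementation).

    `flowables` is a list of (name, is_heading, page_break) tuples.
    Returns a list of pages, each a list of flowable names.
    """
    pages = [[]]
    page_nonempty = False  # headings alone leave the page "empty"
    for name, is_heading, page_break in flowables:
        if page_break and page_nonempty:
            pages.append([])  # start a new page
            page_nonempty = False
        pages[-1].append(name)
        if not is_heading:
            page_nonempty = True
    return pages


chapter = [
    ("Chapter 1", True, True),
    ("Section 1.1", True, True),
    ("text", False, False),
    ("Section 1.2", True, True),
    ("text", False, False),
]
print(paginate(chapter))
# [['Chapter 1', 'Section 1.1', 'text'], ['Section 1.2', 'text']]
```

In this model the first section stays on the chapter's opening page, while a chapter that starts with an introductory paragraph would push its first section to a new page.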

@brechtm brechtm mentioned this issue Jan 19, 2023
1 task