Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better fix: don't ignore page breaks on empty nodes #234

Merged
merged 1 commit into from
Sep 12, 2018

Conversation

poire-z
Copy link
Contributor

@poire-z poire-z commented Sep 12, 2018

Previous fix (in #233) was wrong, and was adding too many wrong page breaks (noticed by unit tests in koreader/koreader#4217).
This reverts some bits from it, and fix that hopefully the right way.

Note that this fix (to solve koreader/koreader#4207) may still add some new page breaks (and blank pages) that might be bothering (to me at least :) or feel uneeded. But in my tests, they look like they obey HTML and CSS.
For example if one HTML file in an epub (so, making a DocFragment, which generates a page-break by itself) contains:

<body>
  <div></div>
  <h3>some title</h3>

Previously, the empty div that would have carried the page-break (from DocFragment) was ignored, and the DocFragment page-break was lost, but the one implied by the H3 (we have in epub.css a page-break before H1 H2 and H3) was rendered.
Now, both will be rendered and generate a blank page.

This happened before with some EPUBs for some other reason, and we have a Style Tweak to avoid that (Avoid blank page on chapter start). So, I guess those like me who don't like blank page will have to use that tweak more often.

The diff from before the previous fix looks like this:

@@ -1979,9 +2026,9 @@ int renderBlockElement( LVRendPageContext & context, ldomNode * enode, int x, in
                     lvRect r;
                     enode->getAbsRect(r);
                     if (margin_top>0)
-                        context.AddLine(r.top - margin_top, r.top, pagebreakhelper(enode,width));
+                        context.AddLine(r.top-margin_top, r.top, pagebreakhelper(enode,width));
                     if (padding_top>0)
-                        context.AddLine(r.top,r.top+padding_top,pagebreakhelper(enode,width)|padding_top_split_flag);
+                        context.AddLine(r.top, r.top+padding_top, pagebreakhelper(enode,width)|padding_top_split_flag);

                     // List item marker rendering when css_d_list_item_block
                     int list_marker_padding = 0; // set to non-zero when list-style-position = outside
@@ -2046,12 +2093,26 @@ int renderBlockElement( LVRendPageContext & context, ldomNode * enode, int x, in
                     if ( y < st_y )
                         y = st_y;
                     fmt.setHeight( y + padding_bottom ); //+ margin_top + margin_bottom ); //???
+
+                    if (margin_top==0 && padding_top==0) {
+                        // If no margin or padding that would have carried the page break above, and
+                        // if this page break was not consumed (it is reset to css_pb_auto when used)
+                        // by any child node and is still there, add an empty line to carry it
+                        int pb_flag = pagebreakhelper(enode,width);
+                        if (pb_flag)
+                            context.AddLine(r.top, r.top, pb_flag);
+                    }
+
                     lvRect rect;
                     enode->getAbsRect(rect);
+                    int pb_flag = CssPageBreak2Flags(getPageBreakAfter(enode))<<RN_SPLIT_AFTER;
                     if(padding_bottom>0)
-                        context.AddLine(y+rect.top,y+rect.top+padding_bottom,(margin_bottom>0?RN_SPLIT_AFTER_AUTO:CssPageBreak2Flags(getPageBreakAfter(enode))<<RN_SPLIT_AFTER)|padding_bottom_split_flag);
+                        context.AddLine(y+rect.top, y+rect.top+padding_bottom, (margin_bottom>0?RN_SPLIT_AFTER_AUTO:pb_flag)|padding_bottom_split_flag);
                     if(margin_bottom>0)
-                        context.AddLine(y+rect.top+padding_bottom,y+rect.top+padding_bottom+margin_bottom,CssPageBreak2Flags(getPageBreakAfter(enode))<<RN_SPLIT_AFTER);
+                        context.AddLine(y+rect.top+padding_bottom, y+rect.top+padding_bottom+margin_bottom, pb_flag);
+                    if (margin_bottom==0 && padding_bottom==0 && pb_flag)
+                        // If no margin or padding to carry pb_flag, add an empty line
+                        context.AddLine(y+rect.top, y+rect.top, pb_flag);
                     if ( isFootNoteBody )
                         context.leaveFootNote();
                     return y + margin_top + margin_bottom + padding_bottom; // return block height

Previous fix (2 commits ago) was wrong, and was adding too many
wrong page breaks.
This reverts some bits from it, and fix that hopefully the right way.
@poire-z
Copy link
Contributor Author

poire-z commented Apr 7, 2019

@Frenzie : requesting your specs reading and interpretation wit.

What situation do you think item 3 applies to?

13.3.3 Allowed page breaks

In the normal flow, page breaks can occur at the following places:

1 In the vertical margin between block-level boxes. When an unforced page break occurs here, the used values of the relevant 'margin-top' and 'margin-bottom' properties are set to '0'. When a forced page break occurs here, the used value of the relevant 'margin-bottom' property is set to '0'; the relevant 'margin-top' used value may either be set to '0' or retained.
2 Between line boxes inside a block container box.
3 Between the content edge of a block container box and the outer edges of its child content (margin edges of block-level children or line box edges for inline-level children) if there is a (non-zero) gap between them.

From https://www.w3.org/TR/CSS2/page.html#allowed-page-breaks (not clearer in https://www.w3.org/TR/css-break-3/ or https://www.w3.org/TR/CSS22/page.html#page-breaks)

What could be that gap ? (I get the content edge is inside the padding - so what can be between the outer padding and the inner margin ?)

@Frenzie
Copy link
Member

Frenzie commented Apr 7, 2019

At a glance I'm not sure. I don't really have time to think about it right now. Perhaps https://www.princexml.com/doc-prince/#page-breaks has something useful to say?

@houqp
Copy link
Member

houqp commented Apr 8, 2019

yeah, doesn't make sense to me either, based on https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Box_Model/Introduction_to_the_CSS_box_model, content edge should the be same as first child's margin edge.

@poire-z
Copy link
Contributor Author

poire-z commented Apr 8, 2019

page breaks can occur at the following places [...] 3 Between the content edge of a block container box and the outer edges of its child content (margin edges of block-level children or line box edges for inline-level children) if there is a (non-zero) gap between them.

Or maybe, it's a twisted way of saying the following :)

A page break can not occur between a parent (margin|border|padding) and it's first child margin:

  • never at the top (between the container and its first child) , as there is never a gap there (*)
  • usually never at the bottom (between the container and its last child) - except if there is a gap there, that can happen if the parent container has a CSS height: specified that is larger than the sum of heights of its children, which would give that non-zero gap at the bottom.

(*) : or can it? can we vertically align a child in its container (when the later has a specified height)? so there would then be a gap at top (vertical-align: bottom works with tables only Ithink)

@Frenzie
Copy link
Member

Frenzie commented Apr 8, 2019

or can it?

Easily: position:relative; top:20px. So it all makes sense then. It's just not something that will occur with default styles out of the box, and the CSS spec is written generically because methods other than the position property might also effect a gap, even if only in the future.

@poire-z
Copy link
Contributor Author

poire-z commented May 5, 2019

OK, another question, although not related to page break, but still to margins.

There's something I don't get that is shown on the sample from http://test.weasyprint.org/suite-css21/chapter8/section3/test35/
It's not related to margin collapsing (which I'm working on), just a matter of simple margins and background-color.
If you get the test html at http://test.weasyprint.org/test-data/suite-css21/margin-collapse-031.htm, change the body color to blue:

  <style type="text/css">
  body
  {
  background-color: blue;
  margin: 8px;
  }

  div#parent-block-container
  {
  background-color: red;
  margin: 0px;
  width: 100%;
  }

  div#child-block
  {
  background-color: lime;
  margin: 100px 0px;
  }
  </style>

 </head>

 <body>

  <div id="parent-block-container">
   <div id="child-block">
    There should be no red in this page
   </div>
  </div>

 </body>

You get a single lime line over a full blue background:
image

I don't understand why there is not red (as the test say there shouldn't be any red).
Even if you set the parent-block-container margin to be greater than the child-block, you don't get it, so it's not a matter of margin collapsing.
I'm thinking:

  • ok, the lime is applied only to the line itself, not over the 100px margin of child-block - that, I agree.
  • parent-block-container is the container of child-block, so, even if it has margin:0px, it still contains it, and so should draw it's red background-color and it's childrent should be laid over that.
  • so, why is the BODY blue background-color taking that over?

No matter what margin value you set, you don't get any red. Only when adding some padding to parent-block-container, you get the red when you (I) expect it.

My brain may be too formatted by crengine box rendering model, and that's why I don't get it - so, is this behaviour expected and clear to you? If yes, can you explain what's happening and why there is no red background?

It looks to me that this thing I noted should have no impact regarding this situation:

// Note about box model/sizing in crengine:
// https://quirksmode.org/css/user-interface/boxsizing.html says:
// - In the W3C box model, the width of an element gives the width of
// the content of the box, excluding padding and border.
// - In the traditional box model, the width of an element gives the
// width between the borders of the box, including padding and border.
// By default, all browsers use the W3C box model, with the exception
// of IE in "Quirks Mode" (IE5.5 Mode), which uses the traditional one.
//
// These models are toggable with CSS in current browsers:
// - the first one is used when "box-sizing: content-box" (default in browsers)
// - the second one is used when "box-sizing: border-box".
//
// crengine uses the traditional one (box-sizing: border-box).

@Frenzie
Copy link
Member

Frenzie commented May 5, 2019

It's not related to margin collapsing (which I'm working on), just a matter of simple margins and background-color.

But it is? There's no border or padding, so the top margins of both elements are collapsed. It's the second scenario described on this page: https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Box_Model/Mastering_margin_collapsing

@poire-z
Copy link
Contributor Author

poire-z commented May 5, 2019

It has 0 margin, so there's not much margins to collapse :)
Set all margins to 0, and only/any one of them (body, parent-block-container or child-block) to non-zero: you still never get anything red.
That page talks about margins collapsing. Not elements collapsing/disappearing, and their bgcolor not applied.
Cause it looks indeed like all three margins (whatever their values) makes only one, and the outer bgcolor wins and applies on that. Which I haven't seen written in any specs.

But, may be it's the last (vague) sentence of the paragraph you point to that implies this behaviour?:

The collapsed margin ends up outside the parent

meaning all inner margins are set to zero, so all their content boxes are the same, so no room to apply any intermediate background-color on?

@Frenzie
Copy link
Member

Frenzie commented May 5, 2019

I don't think the second sentence is vague. :-) It's a logical consequence of parent and first/last-child margins collapsing.

Cause it looks indeed like all three margins (whatever their values) makes only one, and the outer bgcolor wins and applies on that. Which I haven't seen written in any specs.

It's written right here: ;-)

  • both belong to vertically-adjacent box edges, i.e. form one of the following pairs:
    • top margin of a box and top margin of its first in-flow child
    • bottom margin of box and top margin of its next in-flow following sibling

@poire-z
Copy link
Contributor Author

poire-z commented May 5, 2019

I don't think the second sentence is vague. :-)

I mean it looks like an insignificant small and ending sentence (and I didn't focus much on it, given all the other infos), but it has some huge consequence on the algorithm you take to have a go at that...
They do take care of saying "this implies that" for many things, which helps understanding how that should work and what things you have to not forget. But they (w3 and mozilla) didn't on this last one :|

So, my naive way of going at it (because margins are transparent, and used only for positionning): parent has margin-top: 10px: shift children y down by10px - child has margin-top: 13px, no content/padding/border in between: shift sub children y down by 3px instead of 13 - is not the right way, mostly just because of that background color...
I would have to keep a state until I meet some content/padding/border, and set all the involved y shifts to 0, and go back up to the top most parent involved to shift its y by 13px... Much less simple :)

Anyway, OK, I get the why of that behaviour.

@poire-z
Copy link
Contributor Author

poire-z commented May 10, 2019

Another question, where the relevant specs are not cleared about their combinations:
https://www.w3.org/TR/CSS2/page.html#allowed-page-breaks
https://www.w3.org/TR/CSS21/box.html#collapsing-margins

What should happen when collapsing margins meet forced page-breaks?
Example:

            some text in previous nested block elements
        margin-bottom: 5px
    margin-bottom: 30px
margin-bottom: 5px
margin-top: 5px
    margin-top: 10px , with page-break-before: always
        margin-top: 5px
            some text in these nested block elements

If there were no page break, if I understand margin collapsing right, this should end up with a 30px margin between the 2 text parts. (and what should we end up when there is a need for an unforced page split at this point? the specs say the used values of both margin are 0, so, no margin top, really?)

With this forced page-break, I don't know what we should get:

  • a single 30px margin, at end of previous page (possibly just ignored) - and none on the next page
  • no margin at end of previous page, and a 30px margin at top of next page
  • 30px on previous page, but still 10px at top of next page?

Closest specs I found, not even talking about collapsing, just adjoining, not saying which happen first, are:
https://www.w3.org/TR/css-break-4/#propdef-margin-break and https://www.w3.org/TR/css-break-3/#break-margins

And what should happen with:

            some text in previous nested block elements
        margin-bottom: 5px
    margin-bottom: 30px
margin-bottom: 5px
margin-top: 5px
    margin-top: 23px , with page-break-before: always
        margin-top: 5px
            margin-top: 17px , with another page-break-before: always
                margin-top: 5px
                    some text in these nested block elements

3 pages, or just 2, the 2 page-breaks collapsing (as the margin they are on)? And with which values of resulting top margin?

@Frenzie
Copy link
Member

Frenzie commented May 10, 2019

It's not really something I've had to deal with before. Without having read the specs, as a rule of thumb I'd say that when in doubt, create a test case and see what Prince does. It's likely to be right, or at the very least to do something sensible.

@poire-z
Copy link
Contributor Author

poire-z commented May 11, 2019

Thanks for the hint about Prince.
For reference, here's what Prince does:


<div class=orange style="margin-bottom: 1em">
    <div class=yellow style="margin-bottom: 16em">
        <div class=green style="margin-bottom: 1em">
            some text in previous nested block elements
        </div>
    </div>
</div>
<div class=red style="margin-top: 1em">
    <div class=blue style="margin-top: 7em; page-break-before: always">
        <div class=green style="margin-top: 1em">
            <div class=orange style="margin-top: 12em; page-break-before: always">
                <div class=yellow style="margin-top: 1em">
                    some text in these nested block elements
                </div>
            </div>
        </div>
    </div>
</div>

or

<div class=orange style="margin-bottom: 1em">
    <div class=yellow style="margin-bottom: 16em">
        <div class=green style="margin-bottom: 1em">
            some text in previous nested block elements
        </div>
    </div>
</div>
<div class=red style="margin-top: 1em">
    <div class=blue style="margin-top: 12em; page-break-before: always">
        <div class=green style="margin-top: 1em">
            <div class=orange style="margin-top: 7em; page-break-before: always">
                <div class=yellow style="margin-top: 1em">
                    some text in these nested block elements
                </div>
            </div>
        </div>
    </div>
</div>

Only one page break, top margin on next page of 12em.


<div class=yellow style="margin-bottom: 16em">
    [...] many lines so that last one is drawn at end of page
    <div class=orange>some other text</div>
    <div class=green>some text</div>
    <div class=orange>some other text</div>
</div>
<div class=red style="margin-top: 1em">
    <div class=blue style="margin-top: 12em;">
        <div class=green style="margin-top: 1em">
            <div class=orange style="margin-top: 7em; page-break-before: always">
                <div class=yellow style="margin-top: 1em">
                    some text in these nested block elements
                </div>
            </div>
        </div>
    </div>
</div>

margin bottom dropped on previous page, margin top of 12em on next page


<div class=yellow style="margin-bottom: 16em">
    [...] many lines so that last one is drawn at end of page
    <div class=orange>some other text</div>
    <div class=green>some text</div>
    <div class=orange>some other text</div>
</div>
<div class=red style="margin-top: 1em">
    <div class=blue style="margin-top: 12em;">
        <div class=green style="margin-top: 1em">
            <div class=orange style="margin-top: 7em;">
                <div class=yellow style="margin-top: 1em">
                    some text in these nested block elements
                </div>
            </div>
        </div>
    </div>
</div>

So, same as previous one with all page-break-before removed, and an unforced page split happening between the 2 main blocks:
margin bottom dropped on previous page, zero margin top on next page

@Frenzie
Copy link
Member

Frenzie commented May 11, 2019

Thanks for the hint about Prince.

It's not only an excellent renderer in general, but for paged media it implements all the fancy paged media stuff that web-focused renderers may not because it's not where there focus lies. :-) (Although it seems they do a lot more than they used to.)

@poire-z
Copy link
Contributor Author

poire-z commented Jul 13, 2019

Most of the things questioned above have been implemented (correctly I hope) in #299.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants