Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don’t discard the whole content of blocks in blocks when their bottom padding/border overflows the page #1476

Closed
drboone opened this issue Oct 20, 2021 · 16 comments
Labels
bug Existing features not working as expected

Comments

@drboone
Copy link

drboone commented Oct 20, 2021

For a sample size of 1, v53.3 seems to not split paragraphs in orphan/widow handling? I have a two column layout where it moved a 10-line paragraph to the top of the second column, apparently since there was only a 9-line open space at the bottom of the first column. I don't really want to share the as-yet-unpublished doc here, but could make it available to developers privately.

@drboone
Copy link
Author

drboone commented Oct 20, 2021

Actually, in looking further at this, v52.5 seems to do much the same thing. I just hadn't thought about it possibly being widow/orphan related.

@liZe
Copy link
Member

liZe commented Oct 26, 2021

Hello,

The default value of widows and orphans is 2, meaning that the 10-line paragraph should be split in your case. Could you please share an example?

@drboone
Copy link
Author

drboone commented Oct 26, 2021

I emailed a link to a zip file to your kozea email address.

@liZe
Copy link
Member

liZe commented Oct 27, 2021

I emailed a link to a zip file to your kozea email address.

I don’t have this address anymore, could you please sent it to the one given in my profile instead?

@drboone
Copy link
Author

drboone commented Oct 27, 2021

Done.

@liZe
Copy link
Member

liZe commented Nov 7, 2021

I don’t know really why, but the problem is already fixed in the current master branch.

review_56748.pdf

Could you please check that it also works for you?

@drboone
Copy link
Author

drboone commented Nov 9, 2021

I did this:

pip3 uninstall weasyprint
pip3 install -U html5lib tinycss2 fonttools cssselect2 Pillow Pyphen cffi pydyf pycparser webencodings brotli zopfli
pip3 install git+https://github.com/Kozea/WeasyPrint@ed20d94c49c986dacc9b71de9fa8322e666dbc7b

and then re-generated the pdf from the example kit I sent. In several places, it has no longer started a new column in a place that doesn't seem right, so that's an improvement. There is still one, at the bottom of page 15, column one, that I don't understand.

@drboone
Copy link
Author

drboone commented Nov 9, 2021

I notice your version is different from the one I just generated.
review_56748.pdf

@liZe
Copy link
Member

liZe commented Nov 9, 2021

There is still one, at the bottom of page 15, column one, that I don't understand.

It’s caused by a limitation of our current (and naive) implementation of columns layout: we only break direct children of the columns, not their grandchildren. The situation will probably be improved in version 54 or 55, as we’re currently improving the different layouts (that’s why this bug is already solved 😄).

I notice your version is different from the one I just generated.

Yes, the text layout is a bit different because of small differences in text metrics, probably because of different versions or configurations of Fontconfig or Harfbuzz. The different renderings don’t look right or wrong, there’s nothing to be worried about in my opinion.

@drboone
Copy link
Author

drboone commented Nov 9, 2021

It’s caused by a limitation of our current (and naive) implementation of columns layout: we only break direct children of the columns, not their grandchildren. The situation will probably be improved in version 54 or 55, as we’re currently improving the different layouts (that’s why this bug is already solved smile).

For my edification, could you explain what objects are children and grandchildren of the column in this instance?
Also, do you have any sense of the timeframe for v54? Weeks? Months? Years?

Yes, the text layout is a bit different because of small differences in text metrics, probably because of different versions or configurations of Fontconfig or Harfbuzz. The different renderings don’t look right or wrong, there’s nothing to be worried about in my opinion.

Well, yes and no. I'm not worried about small differences like font metrics, and over all the document looks fine either way. But there are paragraphs that weren't split on pages 12 and 14 that appear to be examples of the same widow/orphan problem I originally reported.

@liZe
Copy link
Member

liZe commented Nov 10, 2021

For my edification, could you explain what objects are children and grandchildren of the column in this instance?

The children of your columns are often div tags. When a div contains text, it can be split, but the blocks it includes (like the p tags for your footnotes) can’t be split.

Also, do you have any sense of the timeframe for v54?

Here are the open issues for version 54. The biggest issue, footnotes support, will be ready for the end of November. The version will thus probably be released in December.

Weeks? Months? Years?

Don’t be afraid. Our changelog can give you a pretty good hint: you won’t have to wait for years before getting a new version.

Well, yes and no. I'm not worried about small differences like font metrics, and over all the document looks fine either way. But there are paragraphs that weren't split on pages 12 and 14 that appear to be examples of the same widow/orphan problem I originally reported.

There were actually 2 different problems in your document:

  • In the main content of your documents, some column breaks were wrong. That was a bug that is now fixed.
  • In the footnotes, some column breaks are still wrong. This is caused by a limitation of our current column layout, and will be fixed later, in version 54 or in version 55.

@drboone
Copy link
Author

drboone commented Nov 10, 2021

liZe, thank you!

@drboone
Copy link
Author

drboone commented Nov 11, 2021

It appears I may be able to clean up the HTML I'm passing to weasyprint, to remove those basically extraneous <p> tags, which may help in the interim.

@liZe liZe changed the title Widow / Orphan default handling Don’t discard the whole content of blocks in blocks when their bottom padding/border overflows the page Nov 14, 2021
@liZe
Copy link
Member

liZe commented Nov 14, 2021

Here is a small example showing what the problem is:

<style>
  @page { margin: 0; size: 3.5em }
  body { line-height: 1; orphans: 1; widows: 1; margin: 0 }
  p { margin: 0 }
  div { padding-bottom: 1em }
</style>
<body>
  abc def
  <div>
    <p>
      hij klm
      nop qrs
    </p>
  </div>
  tuv wxy
  zab
</body>

Here, the bottom padding of the div overflows. We should keep the first line of the p, but instead we discard the whole p block.

(Fun fact: this bug is caused by code that has been written in 2012.)

@liZe liZe added this to the 54.0 milestone Nov 14, 2021
@liZe liZe added the bug Existing features not working as expected label Nov 14, 2021
liZe added a commit that referenced this issue Nov 14, 2021
liZe added a commit that referenced this issue Nov 14, 2021
@liZe
Copy link
Member

liZe commented Nov 14, 2021

046f27b fixes the problem caused by the bottom padding of the div: when this padding overflows, the p is now re-rendered with a smaller space, and the end of the p is rendered on the next page (where the padding of the div will have enough space). The commit also fixes the problem given above.

But there’s one more problem, because your sample is a corner case of this corner case.

When the last child of a block has a bottom margin, and when the block has a border padding/border, we have to include adjoining margins that will not collapse with the next children (because of the padding/border). Currently, when this case occurs, and when the bottom of the page in where the child’s bottom margin should be, we discard the whole last child, instead of re-rendering it. Once more, the solution would be to render it again with less space, so that the beginning of the last child can be rendered on the current page, and the end of the last child (and its bottom margin, and its parent’s bottom padding/border) can be drawn on the next page.

Before fixing this problem, we have to check the rules given in the specification about margins overflowing pages.

@liZe liZe modified the milestones: 54.0, 55.0 Dec 12, 2021
@liZe
Copy link
Member

liZe commented Sep 24, 2022

This bug is fixed with the recent improvements of multicolumn layout.

@liZe liZe closed this as completed Sep 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Existing features not working as expected
Projects
None yet
Development

No branches or pull requests

2 participants