Infinite loop and memory leak with NBSP after a long word #614
Thank you for this bug report. I've tried to reproduce the bug, but everything works fine for me, even with older versions of WeasyPrint. Could you please:
|
Here's the PDF result from http://wooledge.org/~greg/weasy614.html |
Following the "git clone" instructions with a python3 virtualenv, the problem still occurs. I get this warning: "UserWarning: There are known rendering problems with cairo < 1.15.4. WeasyPrint may work with older versions, but please read the note about the needed cairo version on the "Install" page of the documentation before reporting bugs." Debian 9 has libcairo 1.14.8 |
I have the same default font as you (DejaVu Serif) but I can't reproduce this bug, and in this situation it's hard to help you…
Your issue is probably not related to Cairo. Do you have the problem when launching |
|
I've tried WeasyPrint versions between 0.42 and the current master branch, I've tried Cairo 1.14.x (just in case), I've tried the same font as you, and I've tried changing the number of hyphens, but I still can't reproduce your bug. The only solution I can think of now is to find an old version of WeasyPrint that works for you and to use |
May be related to #585. |
WeasyPrint version 0.40 ("git checkout v0.40") works. Starting from there, git bisect leads me to:
|
Can reproduce the infinite loop on Windows 7 (64-bit) with Pango 1.40.11, using this reduced HTML:
<html><body>
<br>--------------------------------------------------------------------------------------------------------------------
<br>
</body></html>
The When I force the otherwise never-ending
Tried to debug, but I must confess: the way WeasyPrint lays out the pages is a mystery to me.
Anyway: It looks like Pango in Definitely |
The bug is in Pango and has been fixed in 1.40.13 (see bug 788115 and bug 785978). It's becoming harder and harder to keep recent versions of WeasyPrint working with older versions of Cairo, Pango, GhostScript, etc. Changes introduced in 0.41 and 0.42 rely on a lot of PDF- and Unicode-related features, and running a bleeding-edge system doesn't help me find corner cases like this one. I'm seriously thinking about raising the required library versions on the install page (I already did with Cairo), and adding warnings at runtime and in the documentation if the required library versions are not installed. WeasyPrint 0.40 seems to be a good candidate for users with older versions. If I had time, I would even backport some fixes from the 0.42.x branch to 0.40 and make the next 0.x versions from this hybrid version (basically with Python 2 support, no pdfrw, no #528). |
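The runtime warning idea could look roughly like the sketch below. It is only an illustration, not WeasyPrint code: `MINIMUM_VERSIONS`, `parse_version` and `check_versions` are hypothetical names, the minimums are simply the two versions mentioned in this thread (Cairo 1.15.4 and Pango 1.40.13), and the installed version strings are assumed to be obtained elsewhere.

```python
import warnings

# Hypothetical minimums, taken only from this thread: the Cairo version named
# in the existing warning and the Pango release that contains the fix.
MINIMUM_VERSIONS = {"cairo": (1, 15, 4), "pango": (1, 40, 13)}


def parse_version(text):
    """Turn a dotted version string such as '1.40.11' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def check_versions(installed):
    """Warn about every library older than its minimum and return their names.

    `installed` maps a library name to its version string. How those strings
    are obtained is deliberately left out of this sketch.
    """
    outdated = []
    for name, minimum in MINIMUM_VERSIONS.items():
        version = installed.get(name)
        if version is not None and parse_version(version) < minimum:
            needed = ".".join(str(part) for part in minimum)
            warnings.warn(
                f"{name} {version} is older than {needed}; "
                "known bugs may occur.")
            outdated.append(name)
    return outdated
```

With the versions reported in this thread, `check_versions({"cairo": "1.14.8", "pango": "1.40.11"})` would warn about both libraries.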
Not to mention how hard it is for Windows users like me to keep their GTK up-to-date. Looking at the Pango bugs and the Pango source, I'm not sure whether even with an up-to-date Pango there won't be similar issues -- in this case: discrepancies between
More debugging and experimenting revealed the problem and led to an attempt to solve it.
Problem
After asking Pango whether a text is breakable --
The last letter in those strings is attributed as
This inconsistency is the cause of #585 and #614.
Idea to work around the Pango bug
If
diff --git a/weasyprint/layout/inlines.py b/weasyprint/layout/inlines.py
index 0270de50..57d0d126 100644
--- a/weasyprint/layout/inlines.py
+++ b/weasyprint/layout/inlines.py
@@ -809,8 +809,12 @@ def split_inline_box(context, box, position_x, max_x, skip_stack,
if (child.is_in_normal_flow() and
can_break_inside(child)):
# This waiting child is in flow and can be broken,
- # let's break it!
- break_found = True
+ # let's TRY TO break it!
+
+ # cant be shure about that!
+ # Thats what PangoLogAttr told us, have to wait for
+ # PangoLayout
+ # break_found = True
# We break the waiting child at its last possible
# breaking point.
@@ -826,6 +830,13 @@ def split_inline_box(context, box, position_x, max_x, skip_stack,
absolute_boxes, fixed_boxes,
line_placeholders, waiting_floats,
line_children))
+ # prevent #585 and #614
+ # TODO: Expert's review required!
+ break_found = not child_resume_at is None
+ if child_resume_at is None:
+ # PangoLayout decided NOT to break the child
+ child_resume_at = (0, None)
+
children = children + waiting_children_copy
if child_new_child is None:
# May be None where we have an empty TextBox.
As you can see, there are definitely issues with the |
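The core of the patch, as I read it, is to stop trusting the "this text can be broken" hint and to look at what the layout actually returned. Here is a toy model of that idea. It calls neither Pango nor WeasyPrint: `claims_breakable`, `layout_split` and `split_child` are hypothetical stand-ins, and the break rules (hint on hyphens, real breaks only at spaces) are invented purely to make the two answers disagree.

```python
def claims_breakable(text):
    """Stand-in for the attribute query: hints a break wherever a hyphen is."""
    return "-" in text


def layout_split(text, max_chars):
    """Stand-in for the layout: break only at a space that fits the width.

    Returns the index to resume at, or None when no break was made.
    """
    if len(text) <= max_chars:
        return None
    cut = text.rfind(" ", 0, max_chars + 1)
    return cut + 1 if cut != -1 else None


def split_child(text, max_chars):
    """Return (break_found, resume_at), trusting the layout over the hint."""
    resume_at = None
    if claims_breakable(text):
        resume_at = layout_split(text, max_chars)
    # The hint alone is not enough: a break only counts when the layout
    # produced a position to resume at. Setting break_found from the hint
    # would make the caller retry the same text forever.
    break_found = resume_at is not None
    return break_found, resume_at
```

For a long run of hyphens the hint says "breakable" but the layout never breaks, so `split_child("-" * 40, 10)` reports no break instead of looping.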
According to what's in the Pango bugs, your workaround seems to be a perfect solution!
I think you hit the only |
Version 43
----------

Released on 2018-11-09.

Bug fixes:

* `#726 <https://github.com/Kozea/WeasyPrint/issues/726>`_: Make empty strings clear previous values of named strings
* `#729 <https://github.com/Kozea/WeasyPrint/issues/729>`_: Include tools in packaging

This version also includes the changes from unstable rc1 and rc2 versions listed below.

Version 43rc2
-------------

Released on 2018-11-02.

**This version is experimental, don't use it in production. If you find bugs, please report them!**

Bug fixes:

* `#706 <https://github.com/Kozea/WeasyPrint/issues/706>`_: Fix text-indent at the beginning of a page
* `#687 <https://github.com/Kozea/WeasyPrint/issues/687>`_: Allow query strings in file:// URIs
* `#720 <https://github.com/Kozea/WeasyPrint/issues/720>`_: Optimize minimum size calculation of long inline elements
* `#717 <https://github.com/Kozea/WeasyPrint/issues/717>`_: Display <details> tags as blocks
* `#691 <https://github.com/Kozea/WeasyPrint/issues/691>`_: Don't recalculate max content widths when distributing extra space for tables
* `#722 <https://github.com/Kozea/WeasyPrint/issues/722>`_: Fix bookmarks and strings set on images
* `#723 <https://github.com/Kozea/WeasyPrint/issues/723>`_: Warn users when string() is not used in page margin

Version 43rc1
-------------

Released on 2018-10-15.

**This version is experimental, don't use it in production. If you find bugs, please report them!**

Dependencies:

* Python 3.4+ is now needed, Python 2.x is not supported anymore
* Cairo 1.15.4+ is now needed, but 1.10+ should work with missing features (such as links, outlines and metadata)
* Pdfrw is not needed anymore

New features:

* `Beautiful website <https://weasyprint.org>`_
* `#579 <https://github.com/Kozea/WeasyPrint/issues/579>`_: Initial support of flexbox
* `#592 <https://github.com/Kozea/WeasyPrint/pull/592>`_: Support @font-face on Windows
* `#306 <https://github.com/Kozea/WeasyPrint/issues/306>`_: Add a timeout parameter to the URL fetcher functions
* `#594 <https://github.com/Kozea/WeasyPrint/pull/594>`_: Split tests using modern pytest features
* `#599 <https://github.com/Kozea/WeasyPrint/pull/599>`_: Make tests pass on Windows
* `#604 <https://github.com/Kozea/WeasyPrint/pull/604>`_: Handle target counters and target texts
* `#631 <https://github.com/Kozea/WeasyPrint/pull/631>`_: Enable counter-increment and counter-reset in page context
* `#622 <https://github.com/Kozea/WeasyPrint/issues/622>`_: Allow pathlib.Path objects for HTML, CSS and Attachment classes
* `#674 <https://github.com/Kozea/WeasyPrint/issues/674>`_: Add extensive installation instructions for Windows

Bug fixes:

* `#558 <https://github.com/Kozea/WeasyPrint/issues/558>`_: Fix attachments
* `#565 <https://github.com/Kozea/WeasyPrint/issues/565>`_, `#596 <https://github.com/Kozea/WeasyPrint/issues/596>`_, `#539 <https://github.com/Kozea/WeasyPrint/issues/539>`_: Fix many PDF rendering, printing and compatibility problems
* `#614 <https://github.com/Kozea/WeasyPrint/issues/614>`_: Avoid crashes and endless loops caused by a Pango bug
* `#662 <https://github.com/Kozea/WeasyPrint/pull/662>`_: Fix warnings and errors when generating documentation
* `#666 <https://github.com/Kozea/WeasyPrint/issues/666>`_, `#685 <https://github.com/Kozea/WeasyPrint/issues/685>`_: Fix many table layout rendering problems
* `#680 <https://github.com/Kozea/WeasyPrint/pull/680>`_: Don't crash when there's no font available
* `#662 <https://github.com/Kozea/WeasyPrint/pull/662>`_: Fix support of some align values in tables
The following example causes weasyprint 0.42 (on Debian 9 amd64) to go into an infinite (or practically infinite) loop and consume all possible memory. I've worked around it in the application that generates the HTML, partially, but this is really quite nasty and my band-aid won't cover all possible user inputs. I've also had to add resource limits on the script that runs weasyprint.
This is a line.This is another line.
----------------------------------------------------------------------------------------------------------------------------------------
No more lines.
Ugh, the bug tracking system is rendering the HTML. I don't want it to render the HTML. Here, here's the HTML in a text file that you can download: http://wooledge.org/~greg/weasybug614
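For anyone else who needs the same kind of safety net, resource limits on the script that runs weasyprint can be sketched as below. This is an assumption about one possible setup, not the exact script used here: `run_limited` is a hypothetical helper, it is Unix-only because it relies on the `resource` module, and the limit values are up to the caller.

```python
import resource
import subprocess


def run_limited(command, memory_bytes, cpu_seconds, timeout):
    """Run `command` with memory, CPU-time and wall-clock limits (Unix only)."""

    def apply_limits():
        # Runs in the child process just before the command is executed.
        limits = (
            (resource.RLIMIT_AS, memory_bytes),   # address-space cap
            (resource.RLIMIT_CPU, cpu_seconds),   # CPU-time cap
        )
        for kind, value in limits:
            _, hard = resource.getrlimit(kind)
            # Only lower the soft limit, and never ask for more than the
            # existing hard limit allows.
            if hard == resource.RLIM_INFINITY:
                soft = value
            else:
                soft = min(value, hard)
            resource.setrlimit(kind, (soft, hard))

    # subprocess.run raises TimeoutExpired if the wall-clock timeout is hit.
    return subprocess.run(
        command, preexec_fn=apply_limits, timeout=timeout,
        capture_output=True)
```

A call such as `run_limited(["weasyprint", "input.html", "output.pdf"], 1 << 30, 60, 120)` (file names made up) would then cap the conversion at about 1 GiB of address space, 60 seconds of CPU time and 120 seconds of wall-clock time.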