WeasyPrint 0.42 gets stuck #560

Closed
dhimmel opened this Issue Jan 16, 2018 · 21 comments

@dhimmel
Contributor

dhimmel commented Jan 16, 2018

We are calling WeasyPrint via pandoc. With 0.42 and a certain input, the command never completes but does not throw an error, causing builds to time out.

Downgrading from weasyprint 0.42 to 0.41 solves the issue: greenelab/scihub-manuscript@5cb1245 produced a passing build.

The issue doesn't happen for all inputs (pandoc manuscripts). See for example, this passing build with WeasyPrint 0.42.

The command that fails is:

pandoc \
  --from=markdown \
  --to=html5 \
  --pdf-engine=weasyprint \
  --pdf-engine-opt=--presentational-hints \
  --filter=pandoc-fignos \
  --filter=pandoc-eqnos \
  --filter=pandoc-tablenos \
  --bibliography=$BIBLIOGRAPHY_PATH \
  --csl=$CSL_PATH \
  --metadata link-citations=true \
  --webtex=https://latex.codecogs.com/svg.latex? \
  --css=webpage/github-pandoc.css \
  --output=output/manuscript.pdf \
  $INPUT_PATH

Any ideas on what the problem could be or how to better diagnose the issue?
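
One way to keep a hung conversion from stalling CI for hours is to run it under a hard time limit. The sketch below is not from the thread: it relies on the `timeout` argument of Python's standard `subprocess.run`, and `run_with_timeout` is a hypothetical helper name. Only the direct child is killed on timeout, so if the wrapped command is pandoc, the weasyprint process it spawns may outlive it; wrapping weasyprint directly is probably more reliable.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it ran longer than timeout_s seconds."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills and reaps the direct child before re-raising.
        return None
    return completed.returncode


# Stand-in commands so the sketch runs anywhere; replace them with the real
# weasyprint invocation (for example ["weasyprint", "index.html", "out.pdf"]).
quick = run_with_timeout([sys.executable, "-c", "pass"], timeout_s=30)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=1)
print("quick:", quick, "slow:", slow)
```

A `None` result can then be turned into an explicit build failure instead of waiting for the CI service's own inactivity timeout.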

@liZe

Member

liZe commented Jan 16, 2018

It may be a duplicate of #557. Could you please try to reproduce with master?

dhimmel added a commit to greenelab/scihub-manuscript that referenced this issue Jan 16, 2018

@dhimmel

Contributor

dhimmel commented Jan 16, 2018

Will comment again when the build for greenelab/scihub-manuscript@cfe2a05 is complete. So far not looking good (running for over 80 minutes), although it's not clear yet exactly where it's getting stuck. Interesting that it's not timing out after inactivity like before (that could be a Travis issue, perhaps the root cause of this backlog).

Update: build has been "Running for 5 hrs 46 min 25 sec". I think we broke Travis hehe.

@dhimmel

Contributor

dhimmel commented Jan 19, 2018

The build finally timed out after running for 8 hours. It appears the build got stuck while running WeasyPrint. So I don't think the issue has been fixed as of ea9ffc9.

@liZe

Member

liZe commented Jan 20, 2018

Is it possible to get an HTML+CSS file generated by pandoc that makes the bug happen?

@dhimmel

Contributor

dhimmel commented Jan 20, 2018

We also have pandoc export an HTML page, which has a few additional JavaScript elements. This HTML page also triggers the issue, so we can switch to it for debugging (removing the need to deal with pandoc).

Currently I get the hangup locally when running the following with Python 3.6.4 and WeasyPrint 0.42:

weasyprint https://greenelab.github.io/scihub-manuscript/ weasyprint.pdf

Note that the content at https://greenelab.github.io/scihub-manuscript/ will change, so that URL may no longer trigger the issue in the future. If so, the versioned source for that webpage is preserved here.

@liZe

Member

liZe commented Jan 21, 2018

> I think we broke Travis hehe.

😄

> Currently I get the hangup locally when running the following with Python 3.6.4 and WeasyPrint 0.42
> weasyprint https://greenelab.github.io/scihub-manuscript/ weasyprint.pdf

I can't reproduce, even with the preserved version, probably because we don't have the same default font. Could you please tell me what launching fc-match sans-serif in a terminal gives on your system?

@liZe liZe added this to the 43 milestone Jan 21, 2018

@dhimmel

Contributor

dhimmel commented Jan 21, 2018

I'm on Ubuntu 17.10:

$ fc-match sans-serif
DejaVuSans.ttf: "DejaVu Sans" "Book"
@liZe

Member

liZe commented Jan 22, 2018

> $ fc-match sans-serif
> DejaVuSans.ttf: "DejaVu Sans" "Book"

I've got the same default font, so I don't know why it works for me. I've tried with both 0.42 and master, and the PDF is correctly generated. I don't know how I could find what's going on…
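
One possible way to see where a stuck run is spending its time, without waiting for someone to interrupt it by hand, is Python's standard `faulthandler` module. The sketch below is an assumption-laden illustration, not something used in this thread: `call_with_watchdog` and `render` are hypothetical names, and `render` assumes WeasyPrint's Python entry point `HTML(...).write_pdf(...)` with placeholder file names.

```python
import faulthandler


def call_with_watchdog(func, timeout_s, log_path, exit_on_timeout=True):
    """Call func(); if it runs longer than timeout_s, dump all thread tracebacks to log_path."""
    with open(log_path, "w") as log:
        # Arms a watchdog thread. On expiry it writes the tracebacks to the file
        # and, when exit_on_timeout is true, terminates the process.
        faulthandler.dump_traceback_later(timeout_s, file=log, exit=exit_on_timeout)
        try:
            return func()
        finally:
            # Disarm before the log file is closed.
            faulthandler.cancel_dump_traceback_later()


def render():
    # Assumes WeasyPrint is installed; the paths are placeholders.
    from weasyprint import HTML
    HTML("index.html").write_pdf("weasyprint.pdf")


# Usage (not executed here):
# call_with_watchdog(render, timeout_s=300, log_path="hang-traceback.log")
```

If the render hangs, the log should show the innermost frames that were active at the timeout, which can be attached to a bug report.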

@dhimmel

Contributor

dhimmel commented Jan 22, 2018

I created a docker container, hoping that it would exhibit the error:

docker run \
  --name=weasyprint-560 \
  --interactive --tty \
  --entrypoint=bash \
  python:3.6

Then inside the container's bash shell, I ran:

pip install weasyprint==0.42
cd home
git clone --single-branch --branch gh-pages https://github.com/greenelab/scihub-manuscript.git
cd scihub-manuscript
git checkout 0f1a35706985507ab12ad7a8c3c97d99d6e4aaa0
weasyprint index.html weasyprint.pdf

This weasyprint command completed... i.e. no bug. This image is based on Debian 8. After lunch I will see if the Ubuntu image gets the error.

@dhimmel

Contributor

dhimmel commented Jan 22, 2018

I switched to using conda to manage the environment in the Docker container, so we can potentially better replicate our error:

docker run \
  --name=weasyprint-560 \
  --interactive --tty \
  --entrypoint=bash \
  continuumio/miniconda3:4.3.27

Then I ran:

apt-get install --yes gcc
cd /home
wget https://github.com/greenelab/manubot-rootstock/raw/59af0a2bdc23bbf48fae0acdcb8183888f12880e/build/environment.yml
conda env create --file=environment.yml
source activate manubot
git clone --single-branch --branch gh-pages https://github.com/greenelab/scihub-manuscript.git
cd scihub-manuscript
git checkout 0f1a35706985507ab12ad7a8c3c97d99d6e4aaa0
weasyprint index.html weasyprint.pdf

Unfortunately, weasyprint gives the following error:

OSError: dlopen() failed to load a library: cairo / cairo-2

This seems to indicate that cairo is not installed (as per Kozea/CairoSVG#84), although it seems that conda should be installing it. Anyway, I'll update if I make any more progress.
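
A quick check of whether the shared libraries are visible to Python at all can help separate "not installed" from "installed but not on the search path". The sketch below uses the standard `ctypes.util.find_library`. The library names are assumptions based on WeasyPrint's documented dependencies (cairo, Pango, GObject), `find_missing_libs` is a hypothetical helper, and cffi's own `dlopen` lookup may not behave identically, so treat the result as a hint only.

```python
from ctypes.util import find_library

# Assumed names; adjust them to whatever the dlopen() error message reports.
REQUIRED_LIBS = ("cairo", "pango-1.0", "pangocairo-1.0", "gobject-2.0")


def find_missing_libs(names=REQUIRED_LIBS):
    """Return the library names that the system's search cannot resolve."""
    return [name for name in names if find_library(name) is None]


missing = find_missing_libs()
print("missing libraries:", missing or "none")
```

Inside a conda environment, a library listed as missing here may still exist under the environment's `lib` directory, which would point to a search-path problem rather than a missing package.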

@mengyyy

mengyyy commented Jan 25, 2018

After I updated weasyprint from 0.41 to 0.42, I found it could not generate a PDF or PNG, even after using 90% CPU for 30 minutes.
Python 3.5.2
Ubuntu 16.04

@liZe

Member

liZe commented Jan 28, 2018

> After I updated weasyprint from 0.41 to 0.42, I found it could not generate a PDF or PNG, even after using 90% CPU for 30 minutes.

@mengyyy Did you try the current master branch? You may have hit #557.

@charno6

charno6 commented Feb 3, 2018

Hi,

I am running into some issues with WeasyPrint getting stuck when creating PDFs from certain HTML files. I have now managed to (kind of) narrow down when this error occurs. My coding skills are insufficient to go into the WeasyPrint code to find out why this is happening; I hope this is helpful anyway. The files are attached.

wp-table-demonstration.zip

Basically, it's quite strange: On my system, the file "triangulate-error.html" will cause WeasyPrint to go into a kind of infinite loop when printing a page that contains a table; while "triangulate-noerror.html" will create a PDF within a matter of seconds. The only difference between them is in one additional "strong" tag in one of the cells. No error is logged by WeasyPrint. Interestingly, one other way to make the issue disappear is to change the font in the CSS to "Times New Roman". (I have also tried Georgia, Palatino, and my preferred font Iowan Old Style.)

My system is macOS Sierra 10.12.6 (16G1212), Python 3.6.4, WeasyPrint version 0.42.1.
I hope someone is able to reproduce this problem. If there is any way I can help, let me know.

EDIT: I have looked in depth at another article that was causing the problem. I now suspect that it has something to do with using formatting tags such as "strong" and "em" within brackets, both round and square.
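
To test a suspicion like this systematically, small candidate documents can be generated for each tag and bracket combination and rendered one at a time. The sketch below only builds the HTML strings. The helper names, the table layout, and the default font are illustrative assumptions, and nothing here guarantees that any of these cases actually reproduces the hang, which also seemed to depend on the font.

```python
import itertools

TAGS = ("strong", "em")
BRACKETS = (("(", ")"), ("[", "]"))


def make_case(tag, opening, closing, font_family="serif"):
    """Return a small HTML document with one table cell holding bracketed, tagged text."""
    cell = f"text {opening}<{tag}>formatted</{tag}>{closing} more text"
    return (
        "<!DOCTYPE html><html><head>"
        f"<style>body {{ font-family: {font_family}; }}</style>"
        "</head><body><table><tr>"
        f"<td>{cell}</td>"
        "</tr></table></body></html>"
    )


def all_cases(font_family="serif"):
    """Map a descriptive file name to each tag/bracket combination."""
    return {
        f"case-{tag}-{i}.html": make_case(tag, o, c, font_family)
        for tag, (i, (o, c)) in itertools.product(TAGS, enumerate(BRACKETS))
    }


cases = all_cases()
print(sorted(cases))
```

Each generated string can be written to disk and passed to weasyprint under a time limit, so the combinations that hang can be told apart from those that finish.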

@liZe

Member

liZe commented Feb 3, 2018

> I hope someone is able to reproduce this problem.

I can reproduce, thanks a lot for the example.

@liZe liZe closed this in 28d2e88 Feb 3, 2018

@dhimmel

Contributor

dhimmel commented Feb 4, 2018

I updated the test PR (greenelab/scihub-manuscript#39) to use the latest WeasyPrint commit. This changed the build from timing out to failing. See the new error here:

  File "/home/travis/miniconda/envs/manubot/lib/python3.6/site-packages/weasyprint/layout/inlines.py", line 201, in skip_first_whitespace
    result = skip_first_whitespace(box.children[index], next_skip_stack)
IndexError: list index out of range
Error producing PDF.

Haven't looked at this in detail but wanted to give a heads up.

liZe added a commit that referenced this issue Feb 4, 2018

@liZe liZe reopened this Feb 4, 2018

@liZe

Member

liZe commented Feb 4, 2018

> This changed the build from timing out to failing.

I'm really sorry 😞. Your original URL raises the error, I'll fix that and add another non-regression test.

liZe added a commit that referenced this issue Feb 4, 2018

@liZe

Member

liZe commented Feb 4, 2018

I've checked the tricky part of the new line-breaking algorithm that causes these bugs (see #301 and #528). I've added some comments to help us in the future and corrected a couple of problems. I've also added a test to make sure this case won't happen again.

I really appreciate the time you take to report the issues and provide examples. If you find other crashes, please report them as well, I'll do my best to fix them as soon as possible!

@liZe liZe closed this Feb 4, 2018

liZe added a commit that referenced this issue Feb 4, 2018

liZe added a commit that referenced this issue Feb 4, 2018

@dhimmel

Contributor

dhimmel commented Feb 4, 2018

> I really appreciate the time you take to report the issues and provide examples.

No worries! Thanks for the fixes. I updated greenelab/scihub-manuscript#39 to use WeasyPrint 79e2b42 and the build succeeded. So I think we're finally good!

@charno6

charno6 commented Feb 4, 2018

I concur, thank you for the (rapid) fixes! I've just run WeasyPrint 0.42.2 against my list of articles that were previously causing problems and they all passed without a hitch. Thank you so much!

dhimmel added a commit to dhimmel/manubot-rootstock that referenced this issue Feb 6, 2018

Update environment on 2018-02-06
Should upstream fix issues:

Extra brackets around citations in figure captions
jgm/pandoc#4272

WeasyPrint 0.42 gets stuck
Kozea/WeasyPrint#560

Updated pandoc-xnos with better semantic versioning
tomduck/pandoc-fignos#46

dhimmel added a commit to greenelab/manubot-rootstock that referenced this issue Feb 6, 2018

Update environment on 2018-02-06 (#108)
Fix upstream issues:

Extra brackets around citations in figure captions
jgm/pandoc#4272

WeasyPrint 0.42 gets stuck
Kozea/WeasyPrint#560

Updated pandoc-xnos with better semantic versioning
tomduck/pandoc-fignos#46

dhimmel added a commit to greenelab/manubot-rootstock that referenced this issue Feb 6, 2018

Update environment on 2018-02-06 (#108)
This build is based on
310dd07.

This commit was created by the following Travis CI build and job:
https://travis-ci.org/greenelab/manubot-rootstock/builds/338192586
https://travis-ci.org/greenelab/manubot-rootstock/jobs/338192587

[ci skip]

The full commit message that triggered this build is copied below:

Update environment on 2018-02-06 (#108)

Fix upstream issues:

Extra brackets around citations in figure captions
jgm/pandoc#4272

WeasyPrint 0.42 gets stuck
Kozea/WeasyPrint#560

Updated pandoc-xnos with better semantic versioning
tomduck/pandoc-fignos#46

@samdmarshall

samdmarshall commented Mar 8, 2018

Hi, I am running into the problem described above when using weasyprint 0.42.2. This is the HTML page that is causing the problem for me (https://pewpewthespells.com/blog/sparse_sdks.html). I am also generating the page via pandoc with the following command:

pandoc \
  --from markdown+grid_tables \
  --to html5 \
  --include-in-header "header.html" \
  --highlight-style pygments \
  --email-obfuscation references \
  sparse_sdks.md \
  --output "sparse_sdks.html"

When weasyprint hangs and I kill it with a local interrupt, this is the traceback I get:

Traceback (most recent call last):
  File "/usr/local/bin/weasyprint", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/__main__.py", line 177, in main
    getattr(html, 'write_' + format_)(output, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/__init__.py", line 182, in write_pdf
    font_config=font_config).write_pdf(
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/__init__.py", line 143, in render
    font_config)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/document.py", line 326, in _render
    [Page(p, enable_hinting) for p in page_boxes],
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/document.py", line 326, in <listcomp>
    [Page(p, enable_hinting) for p in page_boxes],
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/__init__.py", line 55, in layout_document
    context, root_box, html, cascaded_styles, computed_styles))
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/pages.py", line 601, in make_all_pages
    context, root_box, page_type, resume_at, page_number)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/pages.py", line 520, in make_page
    positioned_boxes, positioned_boxes, adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 83, in block_level_layout
    adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 111, in block_box_layout
    page_is_empty, absolute_boxes, fixed_boxes, adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 638, in block_container_layout
    adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 83, in block_level_layout
    adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 111, in block_box_layout
    page_is_empty, absolute_boxes, fixed_boxes, adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 638, in block_container_layout
    adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 83, in block_level_layout
    adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 111, in block_box_layout
    page_is_empty, absolute_boxes, fixed_boxes, adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 510, in block_container_layout
    for line, resume_at in lines_iterator:
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 51, in iter_line_boxes
    device_size, absolute_boxes, fixed_boxes, first_letter_style)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 108, in get_next_linebox
    waiting_floats, line_children=[])
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 727, in split_inline_box
    line_placeholders, waiting_floats, line_children)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 600, in split_inline_level
    waiting_floats, line_children)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 727, in split_inline_box
    line_placeholders, waiting_floats, line_children)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 577, in split_inline_level
    context, box, max_x - position_x, skip)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 915, in split_text_box
    text, box.style, context, available_width, box.justification_spacing)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/text.py", line 959, in split_first_line
    text, style, context, max_width, justification_spacing)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/text.py", line 849, in create_layout
    layout = Layout(context, style.font_size, style)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/text.py", line 637, in __init__
    'cairo_t *', cairo_dummy_context._pointer)),
KeyboardInterrupt
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 47, in apport_excepthook
    try:
KeyboardInterrupt

Original exception was:
Traceback (most recent call last):
  File "/usr/local/bin/weasyprint", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/__main__.py", line 177, in main
    getattr(html, 'write_' + format_)(output, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/__init__.py", line 182, in write_pdf
    font_config=font_config).write_pdf(
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/__init__.py", line 143, in render
    font_config)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/document.py", line 326, in _render
    [Page(p, enable_hinting) for p in page_boxes],
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/document.py", line 326, in <listcomp>
    [Page(p, enable_hinting) for p in page_boxes],
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/__init__.py", line 55, in layout_document
    context, root_box, html, cascaded_styles, computed_styles))
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/pages.py", line 601, in make_all_pages
    context, root_box, page_type, resume_at, page_number)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/pages.py", line 520, in make_page
    positioned_boxes, positioned_boxes, adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 83, in block_level_layout
    adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 111, in block_box_layout
    page_is_empty, absolute_boxes, fixed_boxes, adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 638, in block_container_layout
    adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 83, in block_level_layout
    adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 111, in block_box_layout
    page_is_empty, absolute_boxes, fixed_boxes, adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 638, in block_container_layout
    adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 83, in block_level_layout
    adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 111, in block_box_layout
    page_is_empty, absolute_boxes, fixed_boxes, adjoining_margins)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/blocks.py", line 510, in block_container_layout
    for line, resume_at in lines_iterator:
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 51, in iter_line_boxes
    device_size, absolute_boxes, fixed_boxes, first_letter_style)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 108, in get_next_linebox
    waiting_floats, line_children=[])
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 727, in split_inline_box
    line_placeholders, waiting_floats, line_children)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 600, in split_inline_level
    waiting_floats, line_children)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 727, in split_inline_box
    line_placeholders, waiting_floats, line_children)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 577, in split_inline_level
    context, box, max_x - position_x, skip)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/layout/inlines.py", line 915, in split_text_box
    text, box.style, context, available_width, box.justification_spacing)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/text.py", line 959, in split_first_line
    text, style, context, max_width, justification_spacing)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/text.py", line 849, in create_layout
    layout = Layout(context, style.font_size, style)
  File "/usr/local/lib/python3.5/dist-packages/weasyprint/text.py", line 637, in __init__
    'cairo_t *', cairo_dummy_context._pointer)),
KeyboardInterrupt

If you need any other information, please let me know; I would like to get this resolved as quickly as possible.

@liZe

Member

liZe commented Mar 8, 2018

> Hi, I am running into the problem described above when using weasyprint 0.42.2.

Thank you for this report.

This issue has been closed with a commit fixing the original bug, so your problem is different even if it leads to the same consequences. Could you please open a separate issue?

> I would like to get this resolved as quickly as possible.

So do I!
