Infinite loop on nested spans #1013

Closed
tomaszhlawiczka opened this issue Dec 21, 2019 · 5 comments

@tomaszhlawiczka tomaszhlawiczka commented Dec 21, 2019

Hi!

I have an HTML test case that causes what looks like an infinite loop, or at least a layout pass that takes far too long.

Steps to reproduce:

virtualenv -p python3.7 weasyprint_env
source weasyprint_env/bin/activate
pip install git+https://github.com/Kozea/WeasyPrint#egg=WeasyPrint
python3.7 html2pdf_case1.py
# now it takes 100% of the CPU, so... ctrl+c

The html2pdf_case1.py file:

import io

from weasyprint import HTML

content = """<!doctype html>
<html>
<body>
  <span>
    <span>
      <span>
        <span>
          <span>
            <span>
              <span>
                <span>
                  <span>
                    <span>
                      <span>
                        <span>
                          <span>
                            <span>
                              <span>
                                <span>
                                  <span>
                                    <span>
                                      <span>
                                        <span>
                                          <span>
                                            <span>
                                              <span>
                                                <span>
                                                  TEXT
                                                </span>
                                              </span>
                                            </span>
                                          </span>
                                        </span>
                                      </span>
                                    </span>
                                  </span>
                                </span>
                              </span>
                            </span>
                          </span>
                        </span>
                      </span>
                    </span>
                  </span>
                </span>
              </span>
            </span>
          </span>
        </span>
      </span>
    </span>
  </span>
</body>
</html>
"""

HTML(string=content).write_pdf(io.BytesIO())

Interrupting the process (with ctrl+c) gives the following traceback:

Traceback (most recent call last):
  File "html2pdf_case1.py", line 61, in <module>
    HTML(string=content).write_pdf(io.BytesIO())
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/__init__.py", line 211, in write_pdf
    font_config=font_config).write_pdf(
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/__init__.py", line 168, in render
    font_config)
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/document.py", line 393, in _render
    [Page(page_box, enable_hinting) for page_box in page_boxes],
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/document.py", line 393, in <listcomp>
    [Page(page_box, enable_hinting) for page_box in page_boxes],
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/layout/__init__.py", line 116, in layout_document
    initialize_page_maker(context, root_box)
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/layout/__init__.py", line 54, in initialize_page_maker
    next_page = {'break': 'any', 'page': root_box.page_values()[0]}
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  [Previous line repeated 1 more time]
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 376, in page_values
    end_value = end_box.page_values()[1] or end_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 376, in page_values
    end_value = end_box.page_values()[1] or end_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 376, in page_values
    end_value = end_box.page_values()[1] or end_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 376, in page_values
    end_value = end_box.page_values()[1] or end_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 376, in page_values
    end_value = end_box.page_values()[1] or end_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 376, in page_values
    end_value = end_box.page_values()[1] or end_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  [Previous line repeated 1 more time]
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 376, in page_values
    end_value = end_box.page_values()[1] or end_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 376, in page_values
    end_value = end_box.page_values()[1] or end_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 375, in page_values
    start_value = start_box.page_values()[0] or start_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 376, in page_values
    end_value = end_box.page_values()[1] or end_value
  File "weasyprint_env/lib/python3.7/site-packages/weasyprint/formatting_structure/boxes.py", line 373, in page_values
    if self.children:
KeyboardInterrupt
@liZe liZe added the crash label Dec 21, 2019

@liZe liZe commented Dec 21, 2019

Thanks @tomaszhlawiczka for reporting this issue.

(I’m somewhere between 🤣 and 😭)

@tomaszhlawiczka tomaszhlawiczka commented Dec 21, 2019

It looks like each additional nested <span> greatly increases the time needed to get a result. With a limited number of descendants it works... so is it an algorithm issue?
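
To check how the run time grows with the nesting level, the test document can be generated for any depth. This is only a sketch: `nested_spans_html` is a hypothetical helper, not part of WeasyPrint, and it drops the indentation and line breaks of the original test case on the assumption that they do not affect the behaviour.

```python
def nested_spans_html(depth, text="TEXT"):
    """Return an HTML document with `depth` nested <span> elements around `text`.

    Hypothetical helper for this report; the whitespace between the spans of
    the original test case is omitted (assumed not to matter).
    """
    opening = "<span>" * depth
    closing = "</span>" * depth
    return (
        "<!doctype html>\n<html>\n<body>\n"
        f"{opening}{text}{closing}\n"
        "</body>\n</html>\n"
    )


# The result can be rendered the same way as in the report above, e.g.
# HTML(string=nested_spans_html(10)).write_pdf(io.BytesIO()),
# and timed for increasing depths to see where rendering stops being practical.
small_case = nested_spans_html(3)
```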

@liZe liZe commented Dec 21, 2019

The problem is in page_values that gets called way too many times. Adding a cache fixes the problem, but there’s probably a clean way to fix the algorithm.
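
For readers who want to see what "adding a cache" can look like, here is a minimal sketch. It is not the code from the fixing commit: `CachedBox`, `chain` and the `_page_values` attribute are made-up names, and the body of `page_values` only keeps the two recursive lines visible in the traceback above. Everything else about the real function is left out.

```python
class CachedBox:
    """Simplified stand-in for a box (hypothetical, not WeasyPrint's class)."""

    computations = 0  # how many times the values were actually computed

    def __init__(self, children=()):
        self.children = list(children)
        self._page_values = None  # cached (start_value, end_value)

    def page_values(self):
        # Compute once per box, then reuse the stored result.
        if self._page_values is None:
            CachedBox.computations += 1
            start_value = end_value = None
            if self.children:
                start_box, end_box = self.children[0], self.children[-1]
                start_value = start_box.page_values()[0] or start_value
                end_value = end_box.page_values()[1] or end_value
            self._page_values = (start_value, end_value)
        return self._page_values


def chain(depth):
    """Build a leaf box wrapped in `depth` single-child parents."""
    box = CachedBox()
    for _ in range(depth):
        box = CachedBox([box])
    return box


root = chain(24)
root.page_values()
print(CachedBox.computations)  # 25: one computation per box in the chain
```

A per-instance cache like this is only valid as long as the box tree does not change after the first call; whether that holds in the real layout code is an assumption of this sketch.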

@liZe liZe closed this in 3f74dd9 Dec 21, 2019
@liZe liZe added this to the 51 milestone Dec 21, 2019

@liZe liZe commented Dec 21, 2019

OK, when a parent box has only one child, the function is called twice for this child, and 4 times for the grandchild, and… You get it?
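
The doubling can be reproduced with a toy model. This is a sketch under assumptions: `Box`, `nested_chain` and `count_calls` are hypothetical names, and `page_values` here contains only the two recursive lines shown in the traceback (lines 375 and 376 of boxes.py), not the real function. With a single child, the first child and the last child are the same box, so it is visited twice.

```python
class Box:
    """Toy box that counts calls to page_values (not WeasyPrint's class)."""

    calls = 0  # total number of page_values calls across all boxes

    def __init__(self, children=()):
        self.children = list(children)

    def page_values(self):
        Box.calls += 1
        start_value = end_value = None
        if self.children:
            # With one child, start_box and end_box are the same object,
            # so that child's page_values runs twice.
            start_box, end_box = self.children[0], self.children[-1]
            start_value = start_box.page_values()[0] or start_value
            end_value = end_box.page_values()[1] or end_value
        return start_value, end_value


def nested_chain(depth):
    """Build a leaf box wrapped in `depth` single-child parents."""
    box = Box()
    for _ in range(depth):
        box = Box([box])
    return box


def count_calls(depth):
    """Return how many page_values calls one call on the root triggers."""
    Box.calls = 0
    nested_chain(depth).page_values()
    return Box.calls


for depth in (1, 2, 3, 10):
    print(depth, count_calls(depth))
# 1 3
# 2 7
# 3 15
# 10 2047
```

In this model the total is 2^(depth + 1) - 1 calls, so a chain 24 levels deep would need about 33.5 million calls for a single lookup on the root, which matches the "looks like an infinite loop" symptom.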

@tomaszhlawiczka tomaszhlawiczka commented Dec 22, 2019

Sure. Now it works perfectly - good job! Thanks a lot!
