Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a --timeout option to the CLI API #1909

Closed
quique opened this issue Jul 13, 2023 · 12 comments · Fixed by #1971
Closed

Add a --timeout option to the CLI API #1909

quique opened this issue Jul 13, 2023 · 12 comments · Fixed by #1971
Labels
feature New feature that should be supported good first issue Issues that can be quite easily solved by Python developers with a good CSS background
Milestone

Comments

@quique
Copy link

quique commented Jul 13, 2023

I'm thinking of something like
weaseyprint --timeout 30 https://weasyprint.org /tmp/weasyprint-website.pdf

The provided value would just have to be passed to the default_url_fetcher(url, timeout=10, ssl_context=None) method.

@liZe liZe added feature New feature that should be supported good first issue Issues that can be quite easily solved by Python developers with a good CSS background labels Jul 13, 2023
@liZe
Copy link
Member

liZe commented Jul 13, 2023

That’s a good idea. It should be possible to pass the default URL fetcher with timeout set to given value in __main__.py. Anyone interested to add this feature?

@liZe liZe changed the title Please add a --timeout option to the CLI API Add a --timeout option to the CLI API Jul 13, 2023
@MalanB
Copy link

MalanB commented Jul 13, 2023

Hey, interested in picking this issue

@liZe
Copy link
Member

liZe commented Jul 13, 2023

Hey, interested in picking this issue

Cool! Don’t hesitate to ask if you want some hints!

@staticvoidzach
Copy link

Wondering if there should be multiple types of timeout options? When looking at weasyprint --timeout 30 in.com out.pdf my first though is the timeout applies to the entire process, so the command will exit prematurely if the fetch + resource gather + render process takes more than 30 seconds.

Although, I'd also like to have the ability to control the timeouts of individual stages, e.g. --timeout-for-initial-page-fetch and --timeout-for-each-individual-asset-fetch-and-load and --timeout-for-pdf-render, perhaps with shorter names however...

I'd also consider what the error behaviour would be? Should that be configurable? Potentially someone doesn't want their entire pdf generation to die if 1 out of 10 images takes longer than 10 seconds to fetch. Seems like an undesirable situation, yes, I wouldn't my company logo to be missing on 1 random report out of 100, however there's always a situation for these things.

Perhaps if timeout control is introduced for url fetching, potentially control for retries also needs to be considered?

Not sure about the best way to implement this though, as adding 8 new obscure arguments may be overkill and clutter the help menu. On the other hand, all of my favourite CLI tools completely fill my screen when I run --help, so possibly not the worst idea?

@liZe
Copy link
Member

liZe commented Aug 25, 2023

I think that WeasyPrint can have a timeout option that will be used for each request, because it’s both useful (at least to avoid waiting forever when there’s a network problem) and easy to implement (just a parameter to give to the standard library.)

An option to control the overall time is useful to, but it’s much harder to implement. Moreover, other tools already exist to handle this, like timeout.

For more complex cases, we won’t be able to handle everything users will want using simple CLI options. That’s why users can use their own URL fetchers and do whatever they want in Python.

So, my advice is to do what’s in the original comment: a timeout option that is passed to the default URL fetcher.

@azhar316
Copy link
Contributor

azhar316 commented Sep 10, 2023

Hey @MalanB, are you still working on this? If not I would love to have a go at it.

@MalanB
Copy link

MalanB commented Sep 13, 2023

@azhar316 please go ahead, day job is hectic and I don't seem to find time till now :(

@azhar316
Copy link
Contributor

Thanks! @MalanB

Hey @liZe, I have some questions it would be really helpful if you can answer them.

  1. I created the setup in my local and ran the tests but 2 of them are failing(output below) before I have even made any modifications to the codebase. Is this any system specific issue?
assert_same_renderings = <function assert_same_renderings.<locals>.<lambda> at 0x10ccf4040>

    @assert_no_logs
    def test_image_exif(assert_same_renderings):
>       assert_same_renderings(
            '''
                <style>@page { size: 10px }</style>
                <img style="display: block" src="not-optimized.jpg">
            ''',
            '''
                <style>@page { size: 10px }</style>
                <img style="display: block" src="not-optimized-exif.jpg">
            ''',
            tolerance=25,
        )

    @assert_no_logs
    def test_font_stretch():
        page, = render_pages('''
          <style>
            p { float: left; font-family: %s }
          </style>
          <p>Hello, world!</p>
          <p style="font-stretch: condensed">Hello, world!</p>
        ''' % SANS_FONTS)
        html, = page.children
        body, = html.children
        p_1, p_2 = body.children
        normal = p_1.width
        condensed = p_2.width
>       assert condensed < normal
E       assert 101.546875 < 101.546875

FAILED tests/draw/test_image.py::test_image_exif - AssertionError: Pixel (8, 8) in image_exif_1: expected rgba(210, 210, 210), got rgba(75, 75, 75)
FAILED tests/layout/test_inline.py::test_font_stretch - assert 101.546875 < 101.546875
  1. I am planning to accomplish this functionality by adding the below piece of code:
url_fetcher = default_url_fetcher
if args.timeout:
    url_fetcher = (lambda url, timeout=args.timeout, ssl_context=None: 
        default_url_fetcher(url, timeout, ssl_context))

del args.timeout

is this good or should I explore a different method?

  1. I assume that I should add tests for this in the 'test_command_line_render' function of 'test_api.py' file but how should I go about making sure that the request takes more than x seconds.

@liZe
Copy link
Member

liZe commented Sep 18, 2023

Hi @azhar316,

  1. I created the setup in my local and ran the tests but 2 of them are failing(output below) before I have even made any modifications to the codebase. Is this any system specific issue?

For the second test, you need to install the DejaVu fonts, including condensed ones. I’ve just updated the documentation about that!

But for the first one, I’ve never seen someone complain (tests pass for sure on my computer, on Linux distributions packaging platforms and on CI). That’s strange. That’s be interesting to find the reason for that, but you can safely ignore the problem if you don’t want to fall down the rabbit hole!

  1. I am planning to accomplish this functionality by adding the below piece of code:
    is this good or should I explore a different method?

This idea should work!

You can:

  • remove the del statement (that was only to deprecate --optimize-size and be sure that it’s not used anywhere else),
  • use functools.partial instead of a lambda function, that’ll be cleaner and future-proof 🤞🏽.
  1. I assume that I should add tests for this in the 'test_command_line_render' function of 'test_api.py' file but how should I go about making sure that the request takes more than x seconds.

That would be complex to test the timeout, but we can at least test that the option is accepted and gives the right document as a first step.

@azhar316
Copy link
Contributor

But for the first one, I’ve never seen someone complain (tests pass for sure on my computer, on Linux distributions packaging platforms and on CI). That’s strange. That’s be interesting to find the reason for that, but you can safely ignore the problem if you don’t want to fall down the rabbit hole!

Will do it as and when I find some time (past one week has been quite hectic work wise!).

remove the del statement (that was only to deprecate --optimize-size and be sure that it’s not used anywhere else),
use functools.partial instead of a lambda function, that’ll be cleaner and future-proof 🤞🏽.

Done.

That would be complex to test the timeout, but we can at least test that the option is accepted and gives the right document as a first step.

Have added the below lines of code to the 'test_command_line_render' function of 'test_api.py' file:

_run('combined.html out23.pdf --timeout 30')   
assert tmpdir.join('out23.pdf').read_binary() == pdf_bytes

Let me know if this is acceptable. I will create a PR!

@liZe
Copy link
Member

liZe commented Sep 25, 2023

@azhar316 That would be perfect!

@azhar316
Copy link
Contributor

@liZe Done! PR #1971

@grewn0uille grewn0uille added this to the 60.0 milestone Sep 25, 2023
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Oct 15, 2023
Version 60.1
------------

Released on 2023-09-29.

Bug fixes:

* `#1973 <https://github.com/Kozea/WeasyPrint/issues/1973>`_:
  Fix crash caused by wrong UTF-8 indices


Version 60.0
------------

Released on 2023-09-25.

New features:

* `#1903 <https://github.com/Kozea/WeasyPrint/issues/1903>`_:
  Print form fields
* `#1922 <https://github.com/Kozea/WeasyPrint/pull/1922>`_:
  Add support for textLength and lengthAdjust in SVG text elements
* `#1965 <https://github.com/Kozea/WeasyPrint/issues/1965>`_:
  Handle <wbr> tag
* `#1970 <https://github.com/Kozea/WeasyPrint/pull/1970>`_:
  Handle y offset of glyphs
* `#1909 <https://github.com/Kozea/WeasyPrint/issues/1909>`_:
  Add a --timeout option

Bug fixes:

* `#1887 <https://github.com/Kozea/WeasyPrint/pull/1887>`_:
  Fix footnote-call displayed incorrectly for some fonts
* `#1890 <https://github.com/Kozea/WeasyPrint/pull/1890>`_:
  Fix page-margin boxes layout algorithm
* `#1908 <https://github.com/Kozea/WeasyPrint/pull/1908>`_:
  Fix IndexError when rendering PDF version 1.4
* `#1906 <https://github.com/Kozea/WeasyPrint/issues/1906>`_:
  Apply text transformations to first-letter pseudo elements
* `#1915 <https://github.com/Kozea/WeasyPrint/pull/1915>`_:
  Avoid footnote appearing before its call
* `#1934 <https://github.com/Kozea/WeasyPrint/pull/1934>`_:
  Fix balance before "column-span: all"
* `#1935 <https://github.com/Kozea/WeasyPrint/issues/1935>`_:
  Only draw required glyph with OpenType-SVG fonts
* `#1595 <https://github.com/Kozea/WeasyPrint/issues/1595>`_:
  Don’t draw clipPath when defined after reference
* `#1895 <https://github.com/Kozea/WeasyPrint/pull/1895>`_:
  Don’t ignore min-width when computing cell size
* `#1899 <https://github.com/Kozea/WeasyPrint/pull/1899>`_:
  Fix named pages inheritance
* `#1936 <https://github.com/Kozea/WeasyPrint/pull/1936>`_:
  Avoid page breaks caused by children of overflow hidden boxes
* `#1943 <https://github.com/Kozea/WeasyPrint/issues/1943>`_:
  Use bleed area for page’s painting area
* `#1946 <https://github.com/Kozea/WeasyPrint/issues/1946>`_:
  Use margin box of children to define available width for leaders
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature that should be supported good first issue Issues that can be quite easily solved by Python developers with a good CSS background
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants