FPDFRecorder frequent usage makes document generation slow #1092

dmail00 · 2024-01-14T20:39:15Z

It is not immediately apparent to me why the rewind on FPDFRecorder copies the singleton FPDF state, was it imagined that this would be rewound multiple times?

From looking at the commit history of this file, it was introduced around version 2.3; the initial commit [1] does not contain this deepcopy yet strangely the next commit to this file is modifying an already present deepcopy [2].

The reason for asking this, is that I am generating 30 or so documents at a time with a similar format, each with five heading and this is slow, it is copying the singleton state twice for each header which seems like madness. The speed can be drastically improved by removing the second deepcopy or at least using a boolean flag to copy or not; however it can be even better using some logic that at least fits my needs. This is:

cache the current font
set the heading font as the current
get the width of the text for the heading
if this greater than the width minus the margins* then hand off to start_section
else get the height of the font line* and add the height of four lines using the previous font**
** if this new y would exceed the page break trigger create a page
** render the marked sequence
reset the old font

* margins includes any possible heading borders.
** Anything less than four lines, to me at least, does not make sense in putting a header on this page

[1] ea834ec
[2] 9fb2c3b

The text was updated successfully, but these errors were encountered:

Lucas-C · 2024-01-15T08:51:59Z

Hi @dmail00

It is not immediately apparent to me why the rewind on FPDFRecorder copies the singleton FPDF state, was it imagined that this would be rewound multiple times?

Yes indeed, there is even a unit test for that: https://github.com/py-pdf/fpdf2/blob/2.7.7/test/test_recorder.py#L30

The reason for asking this, is that I am generating 30 or so documents at a time with a similar format, each with five heading and this is slow, it is copying the singleton state twice for each header which seems like madness. The speed can be drastically improved by removing the second deepcopy or at least using a boolean flag to copy or not; however it can be even better using some logic that at least fits my needs.

Great if you found some solution that fits your needs 🙂

I think it'd be great to improve fpdf2 performances for as many use cases as possible, as long as we do not lose functionalities.

It is not immediately clear to me what is your use case there: could you provide some minimal script that reproduces the performance issue you mentioned, please? This would help us to understand through which methods calls the FPDFRecorder() is intensively used in your situation (e.g. is it reached through FPDF.offset_rendering() or FPDF.unbreakable()? I'd guess it's the former one from FPDF.start_section()), and also provide a "baseline" use case that we could work on, to improve its speed.

Also, if you already found a solution for your problem, what is your intent in opening this issue?
Would you like to contribute on improving things in fpdf2? 🙂

Lucas-C · 2024-01-15T09:00:25Z

If performances are better, I think that instead of calling .offset_rendering() in .start_section() we could copy the approach used in Table._get_row_layout_info(), that is to call .multi_cell() a first time with output=MethodReturnValue.PAGE_BREAK inside a pdf._disable_writing() context-block.

dmail00 · 2024-01-15T18:22:48Z

If performances are better, I think that instead of calling .offset_rendering() in .start_section() we could copy the approach used in Table._get_row_layout_info(), that is to call .multi_cell() a first time with output=MethodReturnValue.PAGE_BREAK inside a pdf._disable_writing() context-block.

This works really well, thanks and is just as fast as what I was doing. I did change it to return the height output=MethodReturnValue.HEIGHT as I still think having spaces for lines after the heading makes sense.

It is not immediately clear to me what is your use case there: could you provide some minimal script that reproduces the performance issue you mentioned, please?

It would have to be a contrived example, as what is currently written is pulled in from an external source and contains personal information.

Also, if you already found a solution for your problem, what is your intent in opening this issue?

It was solely to make you aware of the problem and provide a potential solution.

Lucas-C · 2024-02-28T10:38:28Z

I opened PR #1127 to improve this

dmail00 added the question label Jan 14, 2024

Lucas-C added performance pending-answer labels Jan 15, 2024

Lucas-C changed the title ~~FPDFRecorder~~ FPDFRecorder frequent usage makes document generation slow Jan 15, 2024

Lucas-C removed the pending-answer label Feb 28, 2024

Lucas-C added a commit that referenced this issue Feb 28, 2024

Perf: improving speed of FPDF.start_section() - fix #1092

911caf0

Lucas-C added a commit that referenced this issue Feb 28, 2024

Perf: improving speed of FPDF.start_section() - fix #1092

ec5e141

Lucas-C mentioned this issue Feb 28, 2024

Perf: improving speed of FPDF.start_section() - fix #1092 #1127

Merged

5 tasks

Lucas-C self-assigned this Feb 28, 2024

Lucas-C added a commit that referenced this issue Feb 28, 2024

Perf: improving speed of FPDF.start_section() - fix #1092

4029b4c

Lucas-C closed this as completed in 5c56598 Feb 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FPDFRecorder frequent usage makes document generation slow #1092

FPDFRecorder frequent usage makes document generation slow #1092

dmail00 commented Jan 14, 2024

Lucas-C commented Jan 15, 2024 •

edited

Loading

Lucas-C commented Jan 15, 2024

dmail00 commented Jan 15, 2024

Lucas-C commented Feb 28, 2024

FPDFRecorder frequent usage makes document generation slow #1092

FPDFRecorder frequent usage makes document generation slow #1092

Comments

dmail00 commented Jan 14, 2024

Lucas-C commented Jan 15, 2024 • edited Loading

Lucas-C commented Jan 15, 2024

dmail00 commented Jan 15, 2024

Lucas-C commented Feb 28, 2024

Lucas-C commented Jan 15, 2024 •

edited

Loading