Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ResetPages() does not clear buffers #102

Closed
martinlindenberg opened this issue Nov 22, 2022 · 5 comments
Closed

ResetPages() does not clear buffers #102

martinlindenberg opened this issue Nov 22, 2022 · 5 comments

Comments

@martinlindenberg
Copy link

Hey

i'm using go-wkhtmltopdf in a lambda function to generate pdfs.
It works fine until the lambda container is re-used for another pdf generation.

To avoid issues i was calling pdfg.ResetPages() which looks on the first sight like the proper solution.
After some time i found out that the generated file size increased and that some pdf readers start showing the data of the first execution and some printers print out multiple pages.

To fix this i'm now additionally calling pdfg.Buffer().Reset() together with pdfg.ResetPages()

Question now:
To me it looks a bit misleading that ResetPages() does not reset all page data - should the Buffer().Reset() be internally included in that ResetPages() function ?

@SebastiaanKlippert
Copy link
Owner

Hi,

That is a very good point. ResetPages only takes care of the input in the current implementation, but it could clear the output as well. The comment does indicate that is how you should use it, so you are correct.
I will update it later today, add a few tests and publish a new version.

@SebastiaanKlippert
Copy link
Owner

I think I will just reset it every time it writes to the buffer on Create() or CreateContext() then it is always safe for reuse.

Also, I am a bit surpised with this behaviour on Lambda. We use it with Docker Lambda's as well but our code creates a new generator using wkhtmltopdf.NewPDFGenerator() for every PDF anyway so the buffer is not reused. Performance wise this is OK, the lookup of wkhtmltopdf on the system is the heaviest part and that is cached by the package internally in a global variable. So you could just create a new instance each time as a workaround.
Are using global variables to store the state between executions?

@martinlindenberg
Copy link
Author

martinlindenberg commented Nov 22, 2022

It is running on AWS lambda for me.

I think the reason why it behaves like this, is that i am only initiating the PDF Generator once per lambda container.
The main() function contains this line (and some additional configs...)
pdfg, err := wkhtmltopdf.NewPDFGenerator()
It is only called once per container initialization and keeps that object in memory.

The handler function uses the following function to render the pdf

func (h handler) renderPdf() ([]byte) {
  h.pdfgen.ResetPages()
  h.pdfgen.Buffer().Reset()                 // added

  h.pdfgen.AddPage(page)
  h.pdfgen.Create()
  return h.pdfgen.Bytes()
}

So it is not stored in global variables or anything - it just lives in the initiated containers memory.

@SebastiaanKlippert
Copy link
Owner

Hi,

Thanks for explaining. In this case the PDF Generator object is indeed reused. Everything you declare in main is kept in memory as Lambda.Start() never returns, it just calls your handler (renderPdf()) for each invocation. Similar to http.ListenAndServe

To make the behaviour of this package more predictable the buffer is now cleared before each run where the output will be written to the buffer.
https://github.com/SebastiaanKlippert/go-wkhtmltopdf/releases/tag/v1.8.0

@martinlindenberg
Copy link
Author

thanks for the fast reaction 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants