[IMP] base/report, web/report: remove pdf merge with single call to w… #17101

smetl · 2017-05-19T11:09:20Z

…khtmltopdf

The old behavior was to call wkhtmltopdf once per report and then combine the results
using the merge_pdf method. This was very slow because each report needs
5 temporary files, one subprocess and then a merge.

The new behavior is to perform a single call to wkhtmltopdf with all the HTML put
together, which avoids the merge and greatly improves reporting performance.
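
As a rough illustration of the difference, the sketch below only builds the command lines for both strategies. It assumes that wkhtmltopdf accepts several input pages before the output file, which its command-line synopsis allows. The helper names and file paths are hypothetical and are not taken from this PR.

```python
def commands_one_call_per_report(html_paths, out_dir):
    """Old strategy: one wkhtmltopdf command per report, merged afterwards."""
    return [
        ["wkhtmltopdf", "--quiet", path, "%s/report_%d.pdf" % (out_dir, i)]
        for i, path in enumerate(html_paths)
    ]


def command_single_call(html_paths, out_path):
    """New strategy: every HTML body goes into one wkhtmltopdf command."""
    return ["wkhtmltopdf", "--quiet"] + list(html_paths) + [out_path]


# Hypothetical temporary files, one per record to print.
html_paths = ["/tmp/body_1.html", "/tmp/body_2.html", "/tmp/body_3.html"]
old_commands = commands_one_call_per_report(html_paths, "/tmp")
new_command = command_single_call(html_paths, "/tmp/all_reports.pdf")
print(len(old_commands))  # number of subprocesses in the old strategy
print(new_command)
```

With the single command there is one subprocess and no merge step, whatever the number of records.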

see task: https://www.odoo.com/web#id=33527&view_type=form&model=project.task&action=333&active_id=248&menu_id=4720

--
I confirm I have signed the CLA and read the PR guidelines at www.odoo.com/submit-pr

lmignon · 2017-05-19T11:26:10Z

@smetl What will happen with the page numbering? Will it restart for each document? IMO it would be better to have an option on the report to process all the generated documents at once or individually. Moreover, will the generated invoice be saved as an attachment linked to the related account.invoice instance if you generate a lot of invoices at once?

smetl · 2017-05-19T11:36:22Z

@lmignon The page numbering is not restarted for each document.
Unfortunately, we are not able to generate the attachments for each record individually with this commit.

smetl · 2017-05-22T08:41:14Z

@lmignon The page numbering is now restarted at each document.

lmignon · 2017-05-22T08:56:53Z

@smetl This refactoring is not compatible with the requirements of our customers, since we lose important functionality when reports are generated in batch mode:

For each record, we must store the generated document as an attachment.
For each record, we must honor the attachment_use parameter.

Even if there are cases where this functionality is not required, there are others where it is. If you can't find a way to keep it, we must at least have a way to choose between the 2 modes: call wkhtmltopdf for all records at once, or one by one. The one-by-one mode will be used when we must keep the current behavior (attachment_use and attachment stored by record).

nhomar · 2017-05-22T09:08:09Z

@lmignon The attachment_use feature seems to have been dropped intentionally. I can't find a real explanation or rationale yet, but I think Odoo is now betting on a dedicated method that records printed reports as attachments per use case (rather than as an API, as before). That's my assumption, and @antonylesuisse can explain the rationale better. In any case, it is not lost with this change; it was lost some time ago on master.

One supposition I have is that when you send a report by email it is attached, so in the past the same PDF was saved twice (once when sent and once when printed [1]), and the solution was to remove the auto-generation part in favor of forcing the "send by email" feature (which is done in code).

I think we need a more elegant solution, but this is what Odoo has brought so far. :-(

[1] #13838

nhomar · 2017-05-22T09:13:39Z

@smetl

I am worried about this:

Unfortunately, we are not able to generate the attachments for each record individually with this commit.

Every record is an individual unit; joining them all will bring more problems than solutions.

Sales orders, purchase orders, invoices, account moves... all of them are completely separate documents.

Unless you do a powerful analysis of the content, you will also face problems with in-code page breaks and with the JavaScript that lets reports show advanced information which depends on the document itself.

I have a question:

In what case do you need a huge number of records printed at once?

If the reason is performance, I think this is the wrong approach.

And please, I know the decision is not yours, but we all use the auto-attachment feature a lot, and I still do not understand the removal of such a nice feature.

nhomar · 2017-05-22T09:23:27Z

@lmignon

I correct myself: I confused a WIP PR with master. The removal of attachment_use has not been merged into master yet.

I tested with tender reports and it works properly as usual.

smetl · 2017-05-24T11:39:47Z

@lmignon @nhomar My third commit uses a trick to split the resulting PDF so that it can be stored as an attachment for each record individually. We thus keep all the previous functionality while improving performance.

lmignon · 2017-06-02T11:44:02Z

odoo/addons/base/ir/ir_actions_report.py

+                    [(r.id, r) for r in self.env[self.model].browse([res_id for res_id in res_ids if res_id])])
+                if len(res_ids) == 1 and res_ids[0] in record_map:
+                    # Only one record, so postprocess directly and append the whole pdf.
+                    self.postprocess_pdf_report(record_map[res_ids[0]], pdf_content)


@smetl Once again you remove a functionality... Any modification of the content by the postprocess method will no longer be part of the result...

@lmignon After discussion with @antonylesuisse, this functionality will no longer be supported for performance reasons. This commit avoids generating a temporary file for each sub-report.

@smetl But if pdf_content is a StringIO and that StringIO is added to the writer after the call to the subprocess, it is possible to support this functionality. Am I wrong?
cc @antonylesuisse

@lmignon The problem is that we call wkhtmltopdf only once, so the file contains all the reports, not just the report for a single record.

@smetl Yes, it's nice to have wkhtmltopdf called only once. Reading your code, I have the feeling it's possible to adapt it so that the final result is built from a StringIO that can be passed to the postprocess method, without additional performance cost.

    # Append content of pdf if exists
    if pdf_content:
        content_streams = []
        # Call the postprocess method for each record.
        if res_ids and self.attachment_use and self.attachment:
            # Build a record_map mapping id -> record
            record_map = dict(
                [(r.id, r) for r in self.env[self.model].browse([res_id for res_id in res_ids if res_id])])
            if len(res_ids) == 1 and res_ids[0] in record_map:
                # Only one record, so postprocess directly and append the whole pdf.
                stream = StringIO(pdf_content)
                self.postprocess_pdf_report(record_map[res_ids[0]], stream)
                content_streams.append(stream)
            else:
                # In case of multiple docs, we need to split the pdf according to the records.
                # To do so, we split the pdf based on outlines computed by wkhtmltopdf.
                # An outline is a <h?> html tag found on the document. To retrieve this table,
                # we look on the pdf structure using pypdf to compute the outlines_pages that is
                # an array like [0, 3, 5] that means a new document starts at page 0, 3 and 5.
                outlines_pages = sorted(
                    [outline.getObject()[0] for outline in reader.trailer['/Root']['/Dests'].values()])
                assert len(outlines_pages) == len(res_ids)
                for i, num in enumerate(outlines_pages):
                    if not res_ids[i]:
                        continue
                    to = outlines_pages[i + 1] if i + 1 < len(outlines_pages) else reader.numPages
                    writer = PdfFileWriter()
                    for j in range(num, to):
                        writer.addPage(reader.getPage(j))
                    attachment_content = StringIO()
                    writer.write(attachment_content)
                    self.postprocess_pdf_report(record_map[res_ids[i]], attachment_content)
                    content_streams.append(attachment_content)
        else:
            stream = StringIO(pdf_content)
            content_streams.append(stream)
        # add content to the result
        for stream in content_streams:
            reader = PdfFileReader(stream)
            writer.appendPagesFromReader(reader)
        close_streams(content_streams)
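
The splitting step in the snippet above boils down to turning the outline start pages into page ranges. A minimal sketch of just that arithmetic, independent of any PDF library, might look like the following. The function name and the total of 7 pages are assumptions made for illustration.

```python
def split_page_ranges(outline_pages, num_pages):
    """Return one (start, stop) page range per document, stop exclusive."""
    starts = sorted(outline_pages)
    # Each document ends where the next one starts; the last ends at num_pages.
    stops = starts[1:] + [num_pages]
    return list(zip(starts, stops))


# New documents start at pages 0, 3 and 5 of an assumed 7-page PDF.
ranges = split_page_ranges([0, 3, 5], 7)
print(ranges)  # [(0, 3), (3, 5), (5, 7)]
```

Each range can then be copied page by page into its own writer, as the inner loop of the snippet does.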

@lmignon In my last commit, I pass the stringio instead of the attachment_content.

@smetl Thank you. And what do you think about my proposal in the code above?
The idea is to put each stream passed to the postprocess method into an array and, at the end, iterate over the array to append the pages to the writer. With this change, if a stream is modified by the postprocess method, the modification is part of the final result.
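
The ordering proposed here can be sketched without any PDF library. In the sketch below, plain byte concatenation stands in for appending pages to the writer, and `postprocess` is a hypothetical hook that stamps the stream so its effect is visible. Neither is the actual Odoo code.

```python
import io


def postprocess(record_id, stream):
    """Hypothetical hook: append a marker to show that it changed the stream."""
    stream.seek(0, io.SEEK_END)
    stream.write(b"<stamped %d>" % record_id)


def build_result(chunks_by_record):
    """Post-process every stream first, then assemble the final result."""
    streams = []
    for record_id, chunk in chunks_by_record:
        stream = io.BytesIO(chunk)
        postprocess(record_id, stream)
        streams.append(stream)
    # Assemble only after all hooks have run, so their changes are included.
    result = b"".join(stream.getvalue() for stream in streams)
    for stream in streams:
        stream.close()
    return result


print(build_result([(1, b"A"), (2, b"B")]))
```

Because the result is built from the same stream objects that were handed to the hook, anything the hook writes ends up in the output.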

@lmignon I understand what you mean. I will adapt my code to take into account your concern.

@lmignon It should work now with my last commit.

lmignon · 2017-06-06T08:16:50Z

@smetl 🎉 Thank you for the changes... LGTM

rco-odoo

Overall the code looks okay to me.

However, your naming convention is inconsistent. You should stick to the usual Odoo convention:

"record" is for a browse record,
"records" is for a recordset,
"record_id" is for a record ID (integer),
"record_ids" is for a collection of record IDs.

Thanks,
Raphael
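
Applied to a few lines of code, the naming convention above could look like the sketch below. `Record` and `browse` are hypothetical stand-ins so that the example runs without Odoo; only the variable names matter.

```python
class Record(object):
    """Hypothetical stand-in for a single browse record."""

    def __init__(self, id):
        self.id = id


def browse(record_ids):
    """Hypothetical stand-in for Model.browse(): IDs in, records out."""
    return [Record(record_id) for record_id in record_ids]


record_ids = [7, 8, 9]        # a collection of record IDs
records = browse(record_ids)  # a recordset (a plain list in this sketch)
for record in records:        # a single browse record
    record_id = record.id     # an integer record ID
    print(record_id)
```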

rco-odoo · 2017-06-12T14:27:03Z

odoo/addons/base/ir/ir_actions_report.py

-        :param attachment_name: The name of the attachment.
+        :param record_id: The record that will own the attachment.
+        :param pdf_content: The optional name content of the file to avoid reading both times.
+        :return The newly generated attachment if no AccessError, else None.


typo, colon missing in :return:

rco-odoo · 2017-06-12T14:29:07Z

odoo/addons/base/ir/ir_actions_report.py

@@ -180,73 +146,50 @@ def unlink_action(self):
    # Main report methods
    #--------------------------------------------------------------------------
    @api.multi
-    def retrieve_attachment(self, record_id, attachment_name=None):
+    def retrieve_attachment(self, record_id):


The name record_id is confusing: it suggests that it should be a record ID. I recommend using record instead.

rco-odoo · 2017-06-12T14:29:46Z

odoo/addons/base/ir/ir_actions_report.py

-    @api.model
-    def postprocess_pdf_report(self, res_id, pdfreport_path, attachment_name):
+    @api.multi
+    def postprocess_pdf_report(self, record_id, buffer):


Same as above, record is less confusing IMHO...

rco-odoo · 2017-06-12T14:31:23Z

odoo/addons/base/ir/ir_actions_report.py

        try:
-            self.env['ir.attachment'].create(attachment)
+            attachment_id = self.env['ir.attachment'].create(attachment)


better naming convention maybe:

attachment → attachment_vals

attachment_id → attachment

rco-odoo · 2017-06-12T14:36:10Z

odoo/addons/base/ir/ir_actions_report.py

+            pdf_content_stream = StringIO(pdf_content)
+            # Build a record_map mapping id -> record
+            record_map = dict(
+                [(r.id, r) for r in self.env[self.model].browse([res_id for res_id in res_ids if res_id])])


Use a dict comprehension and maybe filter to filter out falsy ids:

record_map = {r.id: r for r in self.env[self.model].browse(filter(None, res_ids))}
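
To check that the two forms build the same mapping, here is a small sketch that runs outside Odoo. `Record` and `browse` are hypothetical stand-ins for the model API. Note that on Python 3 `filter` returns an iterator, and whether the real `browse` accepts one directly is an assumption; wrapping it in `list()` is the conservative choice.

```python
from collections import namedtuple

# Hypothetical stand-ins for a browse record and for Model.browse().
Record = namedtuple("Record", "id")


def browse(ids):
    return [Record(i) for i in ids]


res_ids = [4, False, 9, None]

# Original form: dict() over a list of (id, record) pairs.
old_map = dict([(r.id, r) for r in browse([res_id for res_id in res_ids if res_id])])
# Suggested form: dict comprehension, with filter(None, ...) dropping falsy IDs.
new_map = {r.id: r for r in browse(filter(None, res_ids))}

print(sorted(new_map))  # [4, 9]
```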

…khtmltopdf

The old behavior was to call wkhtmltopdf once per report and then combine the results using the merge_pdf method. This was very slow because each report needs 5 temporary files, one subprocess and then a merge. The new behavior is to perform a single call to wkhtmltopdf with all the footer/header HTML put together, which avoids the call to merge_pdf and greatly improves reporting performance.

ged-odoo · 2017-07-20T12:53:44Z

merged in master

smetl · 2017-07-20T12:59:26Z

@ged-odoo Thanks!

smetl added the RD research & development, internal work label May 19, 2017

smetl requested a review from sle-odoo May 19, 2017 11:09

smetl force-pushed the master-rem-pdf-merge-las branch 3 times, most recently from 85b93cb to fdeacdb Compare May 22, 2017 08:30

smetl force-pushed the master-rem-pdf-merge-las branch from 5253d7d to 9c4b989 Compare May 23, 2017 10:35

smetl force-pushed the master-rem-pdf-merge-las branch from 2375f60 to 2671398 Compare May 24, 2017 12:05

smetl force-pushed the master-rem-pdf-merge-las branch 2 times, most recently from 8e57f1b to f68f2c1 Compare June 2, 2017 10:54

lmignon reviewed Jun 2, 2017

smetl force-pushed the master-rem-pdf-merge-las branch from c63181f to 8149b4b Compare June 6, 2017 07:57

smetl requested a review from rco-odoo June 8, 2017 09:21

smetl force-pushed the master-rem-pdf-merge-las branch from 8149b4b to 549fa0d Compare June 8, 2017 09:24

rco-odoo requested changes Jun 12, 2017

smetl force-pushed the master-rem-pdf-merge-las branch from 1ceabad to 1696e9a Compare June 16, 2017 08:58

smetl force-pushed the master-rem-pdf-merge-las branch from 1696e9a to b9abf2b Compare July 19, 2017 13:50

smetl force-pushed the master-rem-pdf-merge-las branch from b9abf2b to cc9df4f Compare July 20, 2017 07:13

ged-odoo closed this Jul 20, 2017

ged-odoo deleted the master-rem-pdf-merge-las branch July 20, 2017 12:53
