Profile big tables #70

Closed
SimonSapin opened this Issue Apr 4, 2013 · 10 comments

SimonSapin commented Apr 4, 2013

mop___: finally found time for the performance test :S having one large table which spreads across 100 pages consumes LOADS of memory :S filesize is still 67kb :S
mop___: 6gb memory and counting :D

http://pastebin.com/HLVrTmxt

si13b Jun 24, 2013

We have a similar issue when printing large documents (hundreds of pages), Weasy consumes several GB of memory. We don't have a single large table spanning multiple pages, but pretty much every page has tables on it.

I'm guessing that GC is not taking place? Or perhaps it needs to take place where it currently isn't?

SimonSapin Jun 24, 2013

Member

WeasyPrint creates multiple objects for every element of the document, and keeps it all in memory until the end. So very high memory usage for big documents is kind of "expected". It’s not the GC’s fault if nothing is freed.

Things that we could do include:

  • See if some documents (maybe with big or many tables?) cause pathologically high memory: higher than other documents of similar "size". The pastebin link above is broken. @sbracegirdle, it would help if you can send example documents where WeasyPrint seems to behave badly.
  • Try and not keep around intermediate data that is not needed anymore.
  • Profile overall memory usage and try to reduce the hot spots, with tricks like a "copy on write" StyleDict.
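
For readers unfamiliar with the "copy on write" idea mentioned in the last bullet, here is a minimal sketch. `CowStyle` and its methods are hypothetical names made up for illustration; this is not WeasyPrint's actual StyleDict, only one possible way such a class could behave. Copies share the same underlying dict until one of them is written to.

```python
class CowStyle:
    """Hypothetical copy-on-write style mapping (not WeasyPrint's StyleDict)."""

    __slots__ = ("_data", "_shared")

    def __init__(self, data=None):
        self._data = dict(data or {})
        self._shared = False  # True once the dict may be used by another copy

    def copy(self):
        """Return a cheap copy that shares storage until someone writes."""
        clone = CowStyle.__new__(CowStyle)
        clone._data = self._data
        clone._shared = self._shared = True
        return clone

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        if self._shared:
            # First write after a copy: take a private dict before changing it.
            self._data = dict(self._data)
            self._shared = False
        self._data[key] = value

    def shares_storage_with(self, other):
        """Illustration helper: do both objects still use the same dict?"""
        return self._data is other._data


if __name__ == "__main__":
    base = CowStyle({"color": "black", "margin_top": 0})
    clone = base.copy()
    print(clone.shares_storage_with(base))  # shared until a write happens
    clone["margin_top"] = 10
    print(base["margin_top"], clone["margin_top"])
```

The trade-off in this sketch is that a style that was once shared makes one defensive copy on its next write even if the other copies are gone; that keeps the bookkeeping to a single flag.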

si13b Jul 8, 2013

I have uploaded a sample document to http://filebin.ca/nGTQ12Aivfp. This is a fairly common document layout approach that we use.

I printed the above file in Weasy 0.19:

  • For the first ~7 minutes it was running fine, only using 6% (of 16G) mem and 100% of one CPU, which is fairly normal for such a large document.
  • After around the 8th minute, it went absolutely crazy and suddenly spiked up to 50%+ mem (7651M+ resident set size), before I manually killed the process.

This seems to be a fairly common occurrence, with the same problem occurring on many of our servers on large documents.

xrmx May 4, 2014

I gave profiling a first try with memory_profiler; here's the data:
http://pastebin.com/jHNbgkdv

This is how I generated the data:

./bin/python -m memory_profiler ./bin/weasyprint smalltable.html foo.pdf

Values marked as small come from the script below; the ones marked as huge come from generating 10 times as many table rows. If I generate 100 times as many, WeasyPrint gets killed by the OOM killer on my 3GB machine.

row = "<tr><td>dsfdsfdfdsf</td><td>sdfdsfsdfdf</td><td>dgfsgfds</td><td>dfdsfsd</td><td>sdfdsdsfdsf</td><td>dsfsddsfsddsfdfdsfd</td></tr>"
print("<html><head></head><body><table>%s</table></body></html>" % (row * 100))

A couple of interesting findings:

  • get_image_from_uri consumes memory even if there are no images
  • interesting memory allocation iterating elements on get_all_computed_styles.
  • make_all_pages looks like the biggest contributor to memory usage
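
The numbers above came from memory_profiler. As a rough alternative, here is a sketch that uses the standard library's tracemalloc instead (a different tool, swapped in only because it needs no install). `build_table_html` and `peak_memory` are hypothetical helper names; the table generator is modeled on the script in this comment. Note the assumption that tracemalloc only sees allocations made through Python's allocator, so memory used inside C libraries would not show up.

```python
import tracemalloc


def build_table_html(rows):
    """Build a test document like the one above, with `rows` table rows."""
    row = (
        "<tr><td>dsfdsfdfdsf</td><td>sdfdsfsdfdf</td><td>dgfsgfds</td>"
        "<td>dfdsfsd</td><td>sdfdsdsfdsf</td><td>dsfsddsfsddsfdfdsfd</td></tr>"
    )
    return "<html><head></head><body><table>%s</table></body></html>" % (row * rows)


def peak_memory(func, *args):
    """Run func(*args) and return (result, peak traced bytes while it ran)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    # Here the measured callable is only the HTML generator, as a stand-in.
    for rows in (100, 1000):
        html, peak = peak_memory(build_table_html, rows)
        print("%d rows: %d bytes of HTML, peak %d bytes" % (rows, len(html), peak))
```

To measure an actual render, the callable passed to `peak_memory` would be whatever function drives WeasyPrint in your setup.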

liZe Jul 17, 2017

Member

I can't reproduce this issue anymore. I've tried to render a 10000-line table, it took 108 seconds on my computer with about 3GB of RAM used (between 10MB and 15MB per page). I've just used 10000 instead of 100 in the script above. The output is a 239-page, 164kB PDF.

Of course, we could do better. Firefox takes about 2s and 100MB to render the same page.

liZe closed this Jul 17, 2017

liZe reopened this Jul 17, 2017

liZe Jul 17, 2017

Member

The example provided by @si13b still takes a lot of memory. Stats on my i7-6500U @ 2.5GHz:

  • About 40 minutes.
  • 3000 pages exactly (!).
  • 8.2MB generated PDF.
  • 11.2GB of RAM, less than 4MB per page.

I'll try to use a generator instead of a list when we render the pages. 4MB per page looks like a bad but not awful score to me.

Firefox takes 10s and 600MB of RAM to render the page. Opening and closing the web inspector makes it crash.
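
The "generator instead of a list" idea mentioned above can be sketched as follows. Everything here is a hypothetical stand-in (`layout_pages`, `draw`, and the dict-based "pages" are not WeasyPrint code); the point is only the difference between holding every page at once and handing pages over one at a time. It assumes nothing downstream needs to look back at earlier pages, which may not hold for a real layout engine.

```python
def layout_pages(page_count):
    """Hypothetical stand-in for page layout: yields one page at a time."""
    for number in range(1, page_count + 1):
        # Pretend each page carries a large box tree.
        yield {"number": number, "boxes": ["box"] * 1000}


def draw(page):
    """Hypothetical stand-in for drawing: returns a small summary string."""
    return "page %d: %d boxes" % (page["number"], len(page["boxes"]))


def render_eagerly(page_count):
    """List-based version: every laid-out page is alive at the same time."""
    pages = list(layout_pages(page_count))
    return [draw(page) for page in pages]


def render_lazily(page_count):
    """Generator-based version: a page can be freed once it has been drawn."""
    for page in layout_pages(page_count):
        yield draw(page)


if __name__ == "__main__":
    for line in render_lazily(3):
        print(line)
```

Both versions produce the same output; only the number of pages kept alive at any moment differs.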

liZe added a commit that referenced this issue Jul 22, 2017

Don't copy styles when copying boxes, improve memory management
Style is not copied anymore when boxes are duplicated. Style dicts are not
modified anymore during the layout, as it was before for some properties:

- margins, borders and paddings when the box was split between two
  pages (useless as these computed values are stored directly in the box),
- top borders were changed in tables (useless for the same reason),
- bookmark labels and string sets are now stored in the box.

This commit can introduce very subtle bugs that are hard to debug. In the
future, we should try to freeze the style dicts before the layout.

Related to #70.

liZe Jul 22, 2017

Member

Good news:
[Image: perf]

  • The red line is Python 3.5 before 344cb08.
  • The blue line is Python 3.6 before 344cb08.
  • The black line is Python 3.6 after 344cb08.

Python 3.6 is a huge improvement thanks to compact dicts.

344cb08 prevents style dicts from being copied each time a box is duplicated. I still have to check that it doesn't break anything in the W3C test suite; it may have introduced subtle bugs, but I'm pretty confident.

Oh, and I think that the "40 minutes" from the last comment was not accurate 😉.

xrmx Jul 22, 2017

@liZe amazing, thanks!

liZe Jul 22, 2017

Member

The results are very good for other documents too. I've tested the examples from #384: both are significantly faster and use less memory thanks to Python 3.6 and this commit.

There's room for more improvement though (before I close this issue). Inline boxes need much more memory than block boxes, and I don't really know why. That's why you get memory problems with tables: there are lots of text lines in tables (at least one per cell).

liZe changed the title from "Profile big tables." to "Profile big tables" Aug 3, 2017

liZe Aug 3, 2017

Member

I've done my best to both add optimizations and clean up the code, and I'm really happy with the result. I've run some benchmarks with Python 3.6 on Linux, for 4 different versions:

  • 0.39,
  • 344cb08 (first step with important optimizations),
  • 033f473 (style dicts frozen for "security" purposes, but less optimized), and
  • 84bdee1 (style dicts that can be modified, but modification only happens at "secure" moments).

I had also tested 0.31 with Python 3.5 in #384.

I'm closing this issue as the easy part has been done. If anyone is interested in even better performance, you only have to:

  • freeze StyleDicts, as it will avoid terrible future (current?) bugs, help us keep the code simple (see the TODO in the StyleDict class), and make them hashable; and
  • deduplicate styles given by cssselect2 for elements that share the same style: each box will only carry a reference instead of a whole style copy, which will probably save a little bit of time and ~75% of the memory for large documents like the example given here (small CSS, huge HTML).

This work can be done here for sure, but also in CSSSelect2 (if possible, it would be better).

Good luck!

(Spoiler alert: named pages coming soon may hurt this speed a little bit.)

Large document

It's the large document given as an example in this issue, with 3000 pages of paragraphs and tables.

0.39:

  • More than 30 minutes (?)
  • 8944MB

344cb08:

  • 450s
  • 6419MB

033f473:

  • 430s
  • 7695MB

84bdee1:

  • 385s (wow)
  • 6216MB (30% less than 0.39)

Alice's Adventures in Wonderland

https://www.gutenberg.org/files/11/11-h/11-h.htm

Long document with repetitive justified text.

0.31:

  • 12.5s
  • 162MB

0.39:

  • 7.2s
  • 143MB

344cb08:

  • 6.8s
  • 109MB

033f473:

  • 6.3s
  • 111MB

84bdee1:

  • 6.2s (14% less than 0.39)
  • 105MB (27% less than 0.39)

HTML5 Specification

https://www.w3.org/TR/html5/

Long document with a lot of lists and underlined links.

0.31:

  • 6.0s
  • 146MB

0.39:

  • 4.2s
  • 137MB

344cb08:

  • 4.1s
  • 106MB

033f473:

  • 4.0s
  • 117MB

84bdee1:

  • 3.8s (10% less than 0.39)
  • 105MB (23% less than 0.39)

Online Wikipedia

https://en.wikipedia.org/w/index.php?title=HTML5&printable=yes

Printable version of a Wikipedia page, not downloaded before, long left-aligned paragraphs with floats.

0.31:

  • 7.9s
  • 134MB

0.39:

  • 7.2s
  • 130MB

344cb08:

  • 6.1s
  • 104MB

033f473:

  • 6.1s
  • 108MB

84bdee1:

  • 6.0s (17% less than 0.39)
  • 101MB (22% less than 0.39)

liZe closed this Aug 3, 2017

liZe added this to the v0.40 milestone Aug 4, 2017

liZe added a commit that referenced this issue Aug 17, 2017

Copy styles for split boxes
This fixes blocks split between pages and inlines boxes split by block boxes.

Increases memory usage, related to #70.