More performance optimizations (small ones this time) #381

Merged
merged 12 commits into from Aug 17, 2012

Conversation

Projects
None yet
3 participants
Contributor

alexdowad commented Aug 7, 2012

These changes boost the speed of my table generation benchmark by about 6%. Most uses of Prawn, including those which don't include tables, should see a small performance improvement.

I have given Prawn::Graphics#method_missing a similar treatment to what Active Record uses for its shortcut methods. Please have a look.

Contributor

alexdowad commented Aug 7, 2012

Just added another "nickel and dime" optimization which gives ~2% speed boost to table benchmark.

Contributor

alexdowad commented Aug 8, 2012

I discovered that although the optimizations I have done so far have made a big difference, PDF generation in my app is still shockingly slow, much slower than I expected from the benchmark results. I investigated and found that the app is using the following optional arguments to Prawn::Document#table:

:header => true, :row_colors => ['FFFFFF', 'F0F0FF'], :cell_style => {:inline_format => true, :padding => 3, :size => 9}

:inline_format especially kills performance. I added some tests to the table benchmark to measure performance with these optional arguments, profiled, and found some optimizations which bring the performance with :inline_format close to that without.

If you can give any suggestions to improve code style, please do so.

This should be "prawnpdf", not "pdfprawn". Reviewing the rest of your changes when I get to them -- thanks!

Owner

alexdowad replied Aug 10, 2012

Youch, sorry. What is the best thing to do in a situation like this? Cancel the PR and issue a new on a new, "clean" branch?

For pull #381, I'm mostly hung up on this commit, because there are a few things I'd like to address with this code all at once.

  1. The API would be more robust if we explicitly whitelisted the methods that were allowed to be called as shortcuts. If we can define them explicitly in advance rather than using method_missing, the code may be a bit more robust and easier to follow. This would also address #382 (though I think the security implications of that are a bit overblown, the general guidance to use whitelisting would probably be a net benefit).
  2. The use of class_eval(string) always bothers me a bit. If there's a way to do this with class_eval(block) or define_method, I'd prefer that.
  3. The code you're replacing here had a slightly different behavior: previously, it would pass through any block to the original method; for example,
pdf.stroke_rectangle(...) { foobar }

would be equivalent to:

pdf.rectangle(...) { foobar }
pdf.stroke

while your code would not call the block. I have not been able to find any place this is actually used in practice, but I'd either like to preserve the existing behavior or prove that it's never or very seldom used.

Thanks for your great performance work!

-be

Contributor

alexdowad commented Aug 12, 2012

Thanks for reviewing my code. As far as the issue of class_eval("...") vs. class_eval { ... }, please note that by its nature, this code inherently treats the names of methods as strings -- it even uses regular expressions to extract a portion of the method name and treats that as the name of another method to call. To achieve the same effect with class_eval { ... } would by necessity require using send, which would add overhead. With class_eval("..."), we can use regular method dispatch, and make the code every bit as fast as if each "shortcut method" was individually hard-coded.

I will amend the code to make it respect passed blocks. If we want to "whitelist", it would be best to define all the needed "shortcut methods" eagerly, probably something like:

# Define shortcut methods for common combinations of graphics primitives
# Should probably use RDoc directive comments so that all the generated methods are included in RDoc
%w{foo bar}.product(%w{fill stroke stroke_and_fill}).each do |method1, method2|
  class_eval "def #{method1}_and_#{method2}; ...; end"
end

This does not provide any performance advantage over method_missing but is more straightforward. If you like this idea, please let me know.

Contributor

alexdowad commented Aug 12, 2012

BTW, styled_width_of_M is partly in jest. It does make a significant improvement to performance when using :inline_format, but I'm sure there is a better name. styled_width_of_single_character would be clearer, but is very long. On the other hand, I have seen a couple deliberately silly, quirky things in the Prawn source, so maybe styled_width_of_M is not out of place.

Owner

alexdowad commented on 45569b2 Aug 12, 2012

Note that text_rendering_mode is called many times in the "inner loop" of table layout/rendering, so eliminating the recursive call actually makes a measurable (though small) difference in performance. The "parser regex" is not dynamic at all, so there is no reason to redefine and recompile it for each table cell rendered.

Member

bradediger commented Aug 12, 2012

OK, I understand the class_eval(string) issue and I'm OK with your solution. I would prefer the approach where all allowed methods are defined in advance; it's a bit obfuscatory, but in my mind a bit better than method_missing. It's a value judgment, I guess.

As for the "width of M" thing, that's my (admittedly hacky) code. As I recall, its sole purpose is working around some pathological edge cases of text sizing. I think the issue was that the previous interaction between the table sizing code and text wrapping would happily create a column of width too small to render even a single character, at which point an infinite loop ensued as the wrapping code created new lines, each too narrow to render any text. Using a minimum width theoretically large enough to ensure that we can at least write one character per line fixed this problem.

(I know that using the width of "M" is a terrible, Anglocentric choice, and I'd welcome any fixes that actually use the notion of "the widest glyph in this font" or, even better, "the widest character in this possibly styled string".)

Member

bradediger commented Aug 12, 2012

Oh, and if you're asking about naming, I prefer styled_width_of_single_character.

Contributor

alexdowad commented Aug 13, 2012

Brad, I have found that none of the "shape" methods in Prawn::Graphics (such as circle, rectangle, etc.) actually use a passed block. Do you think it's possible one of them might make use of blocks in the future?

Member

bradediger commented Aug 13, 2012

I can't imagine that they would; it's probably fine to remove the block
then.

On Sun, Aug 12, 2012 at 8:18 PM, Alex Dowad notifications@github.comwrote:

Brad, I have found that none of the "shape" methods in Prawn::Graphics(such as
circle, rectangle, etc.) actually use a passed block. Do you think it's
possible one of them might make use of blocks in the future?


Reply to this email directly or view it on GitHubhttps://github.com/prawnpdf/prawn/pull/381#issuecomment-7681797.

Contributor

alexdowad commented Aug 13, 2012

I'm about to push a patch for the Graphics method, but am trying to figure out how to document the generated methods in the RDoc. http://stackoverflow.com/questions/11927534/how-to-add-rdoc-documentation-for-a-method-defined-using-class-eval

Contributor

alexdowad commented Aug 13, 2012

OK, shortcut methods for Prawn::Graphics are defined explicitly now (which should solve the related issue). I also added RDoc for the ones which I think people will actually use. (I omitted RDoc for some generated shortcut methods like fill_line -- why would anyone want to fill a line?)

Although I am doubtful about the utility of methods like stroke_line_to (and didn't include RDoc for it), all the possible shortcuts for all the "shape" methods are defined, to avoid breaking client code. stroke_circle_at, etc. can be removed when the deprecated circle_at is.

@bradediger bradediger merged commit 4113335 into prawnpdf:master Aug 17, 2012

Member

bradediger commented Aug 17, 2012

Merged. Thanks @alexdowad for your tireless performance work!

Member

Bluejade commented Aug 17, 2012

Thank you Alex and Brad!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment