
speed mako up a lot #76

Closed
sqlalchemy-bot opened this issue Feb 26, 2008 · 11 comments

Comments

@sqlalchemy-bot

Migrated issue, originally created by Anonymous

There are too many context.write() calls in the compiled module. Each one resolves to context._buffer_stack[-1].data.append or context._buffer_stack[-1].write, which causes a lot of overhead.

Making a shortcut for it reduces the overhead and speeds Mako up a lot.


Attachments: speedup_2.patch | speedup.patch
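
(For illustration, the shortcut described above amounts to roughly the following sketch; the function and variable names here are assumptions and are not taken from the attached patches.)

    # Today the code generator emits one context.write() call per output
    # chunk; each call re-resolves the current buffer on the buffer stack.
    def render_body(context):
        context.write(u'<p>')
        context.write(unicode(context.get('name')))
        context.write(u'</p>')

    # The proposed shortcut (illustrative names): bind the current buffer's
    # write/append once to a local, so each chunk is a plain local call.
    def render_body_with_shortcut(context):
        _write = context._buffer_stack[-1].write
        _write(u'<p>')
        _write(unicode(context.get('name')))
        _write(u'</p>')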

@sqlalchemy-bot

Michael Bayer (@zzzeek) wrote:

the patchfile has the same issue as the patch in #78:

File to patch: lib/mako/codegen.py
patching file lib/mako/codegen.py
patch: **** malformed patch at line 43: @@ -455,9 +470,9 @@

Not sure why that is, since the #77 patch works.

Anyway, I'm concerned about nailing "buffer_stack[-1]" into the local namespace, since it's not yet apparent to me whether that value changes at all (it's also kind of ugly).

So see if you can get me a patchfile that works, at least, and I can play with it.

@sqlalchemy-bot

Anonymous wrote:

new patch file, please review again.

@sqlalchemy-bot

Michael Bayer (@zzzeek) wrote:

Thanks. As I expected, because you're nailing to a specific buffer in the context, several unit tests fail:

z-eeks-Computer:mako classic$ python test/alltests.py 
/Users/classic/dev/mako/test/babelplugin.py:36: RuntimeWarning: babel not installed: skipping babelplugin test
  RuntimeWarning, 1)
........................................................................................................................................................F......F......F........................F....................
======================================================================
FAIL: test_capture_ccall (filters.BufferTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/classic/dev/mako/test/filters.py", line 264, in test_capture_ccall
    assert flatten_result(t.render()) == "this is foo. body: ccall body"
AssertionError

======================================================================
FAIL: test_def (filters.FilterTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/classic/dev/mako/test/filters.py", line 37, in test_def
    assert flatten_result(t.render(x="this is x", myfilter=lambda t: "MYFILTER->%s<-MYFILTER" % t)) == "MYFILTER-> this is foo <-MYFILTER"
AssertionError

======================================================================
FAIL: testnonexpression (filters.FilterTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/classic/dev/mako/test/filters.py", line 171, in testnonexpression
    assert flatten_result(t.render()) == "this is b this is a"
AssertionError

======================================================================
FAIL: test_basic (call.SelfCacheTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/classic/dev/mako/test/call.py", line 424, in test_basic
    "this is foo"
AssertionError

----------------------------------------------------------------------
Ran 212 tests in 3.340s

FAILED (failures=4)

@sqlalchemy-bot

Anonymous wrote:

New patch was added; all the tests pass except test_basic (call.SelfCacheTest), which is too magical and should not be recommended for use.

@sqlalchemy-bot

Michael Bayer (@zzzeek) wrote:

I would greatly prefer that this patch not dig into the context like that. While we aim to be fast, when it gets to the point that the internals are being disassembled within compiled templates, it's getting out of hand. There's also a try/except block in it - exception catches are very expensive. And any template that defines a variable or function named "write" will fail. It seems like a lot of complexity for a microscopic speed improvement.

From my point of view, mako is already running on http://reddit.com and we're plenty fast - bigger speed improvements would probably come from native extensions if anything. There are some speed issues involving the "import='*'" flag in <%namespace> but core Mako performance is excellent (the removal of the unicode() method is more compelling).

How much does this patch actually speed things up ? Do you have any time-oriented profiling data ?

@sqlalchemy-bot

Anonymous wrote:

I took a real page to profile: http://www.douban.com/

before applying this patch:

3436 function calls (3414 primitive calls) in 0.031 CPU seconds

   Ordered by: internal time, call count
   List reduced from 200 to 40 due to restriction <40>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1020    0.002    0.000    0.002    0.000 runtime.py:44(write)
       23    0.002    0.000    0.003    0.000 subject.py:700(get_subject)
        3    0.002    0.001    0.002    0.001 subject.py:708(get_subjects)
      352    0.002    0.000    0.002    0.000 cgi.py:1051(escape)
        3    0.001    0.000    0.007    0.002 utils.html.py:354(render_subjects_list)
       50    0.001    0.000    0.001    0.000 user.py:934(get_user)
       10    0.001    0.000    0.002    0.000 subject.py:200(alt_titles)
      352    0.001    0.000    0.003    0.000 filters.py:21(html_escape)
        5    0.001    0.000    0.005    0.001 utils.html.py:1122(render__review_short)
       29    0.001    0.000    0.001    0.000 subject.py:214(title_by_language)
        3    0.001    0.000    0.002    0.001 chart.py:15(get_popular_subjects)
        1    0.000    0.000    0.001    0.001 miniblog.py:48(get_miniblogs)
        1    0.000    0.000    0.000    0.000 review.py:276(get_reviews)
      252    0.000    0.000    0.000    0.000 subject.py:98(get_attribute)
       29    0.000    0.000    0.000    0.000 template.py:139(__init__)
       16    0.000    0.000    0.000    0.000 lookup.py:120(__check)
       30    0.000    0.000    0.000    0.000 utf_8.py:15(decode)
        8    0.000    0.000    0.000    0.000 site_util.ptl:1787(format_mtext)

after applying this patch:

2416 function calls (2394 primitive calls) in 0.026 CPU seconds

   Ordered by: internal time, call count
   List reduced from 199 to 40 due to restriction <40>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       23    0.002    0.000    0.003    0.000 subject.py:700(get_subject)
      352    0.002    0.000    0.002    0.000 cgi.py:1051(escape)
        3    0.002    0.001    0.002    0.001 subject.py:708(get_subjects)
       50    0.001    0.000    0.001    0.000 user.py:934(get_user)
       10    0.001    0.000    0.002    0.000 subject.py:200(alt_titles)
      352    0.001    0.000    0.003    0.000 filters.py:20(html_escape)
        3    0.001    0.000    0.005    0.002 utils.html.py:361(render_subjects_list)
       29    0.001    0.000    0.001    0.000 subject.py:214(title_by_language)
        5    0.001    0.000    0.004    0.001 utils.html.py:1151(render__review_short)
        3    0.001    0.000    0.002    0.001 chart.py:15(get_popular_subjects)
        1    0.000    0.000    0.000    0.000 miniblog.py:48(get_miniblogs)
        1    0.000    0.000    0.000    0.000 review.py:276(get_reviews)
      252    0.000    0.000    0.000    0.000 subject.py:98(get_attribute)
       16    0.000    0.000    0.000    0.000 lookup.py:121(__check)
        8    0.000    0.000    0.000    0.000 site_util.ptl:1787(format_mtext)
       29    0.000    0.000    0.000    0.000 template.py:138(__init__)
       30    0.000    0.000    0.000    0.000 utf_8.py:15(decode)
        1    0.000    0.000    0.000    0.000 cursors.py:270(_do_query)
        4    0.000    0.000    0.000    0.000 event.py:505(get_event)
    34/32    0.000    0.000    0.001    0.000 runtime.py:201(__getattr__)

As we can see, it speeds up whole-page rendering time by about 15% (31ms to 26ms). Excluding the data-collection time (outside Mako, about 15ms), it is about a 30% speedup (16ms to 11ms).

If the page is more complex, the speedup is larger.

If we can be sure that FastEncodingBuffer or cStringIO is being used, the "try ... except ..." is not necessary.

btw, it seems the "html_escape" filter also needs some optimization.

@sqlalchemy-bot

Anonymous wrote:

It is clear that Mako is pretty fast, and faster than other Python template engines such as Genshi, but it is slower than PTL (Python Template Language), used in Quixote.

PTL is not a complete template language, but it is super fast. We recently decided to replace PTL with Mako, so we want to make Mako as fast as PTL.

@sqlalchemy-bot

Michael Bayer (@zzzeek) wrote:

OK, can you try the attached patch please? Same idea, but we leave the Context in charge of defining what the "writer" function is, so abstraction is maintained. writer can also become an instance variable on Context instead of a method, to reduce method calls further. Also, the writer is bound to a non-colliding name in the generated code.
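
(A rough sketch of what that looks like; the method and variable names below are assumptions, not copied from the attached patch.)

    # Context stays in charge of what "writer" means; the compiled template
    # binds it once, under a name unlikely to collide with template variables.
    class Context(object):
        def __init__(self, buffer, **data):
            self._buffer_stack = [buffer]
            self._data = data

        def writer(self):
            # could also be set as an instance attribute in __init__
            # to save a method call per render
            return self._buffer_stack[-1].write

    # generated template code, roughly:
    def render_body(context):
        __M_writer = context.writer()
        __M_writer(u'hello, ')
        __M_writer(unicode(context.get('name')))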

@sqlalchemy-bot

Michael Bayer (@zzzeek) wrote:

OK, I got around to it; it works pretty well and speeds up basic.py by 12%, ebbce74. Thanks for the patch!

@sqlalchemy-bot

Changes by Michael Bayer (@zzzeek):

  • changed status to closed

@sqlalchemy-bot

Michael Bayer (@zzzeek) wrote:

deb58e9 changes FastEncodingBuffer so that FEB.write() is bound directly to self.data.append(), so we're now equivalent to your original patch.
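
(Simplified illustration of that change; this is not the actual source from deb58e9.)

    class FastEncodingBuffer(object):
        """Buffer that collects unicode chunks and encodes once at the end."""
        def __init__(self, encoding=None, errors='strict'):
            self.data = []
            self.encoding = encoding
            self.errors = errors
            # write() is the list's own append method, so there is no extra
            # Python-level call per output chunk
            self.write = self.data.append

        def getvalue(self):
            if self.encoding:
                return u''.join(self.data).encode(self.encoding, self.errors)
            return u''.join(self.data)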
