Performance is Sluggish #1493

Closed
mlissner opened this issue Oct 9, 2014 · 25 comments

Comments

@mlissner
Contributor

mlissner commented Oct 9, 2014

Since I started using Pelican, I've been a bit annoyed by how long it takes to generate my content. Right now Pelican takes about 15s to generate my site (225 posts), which is just too much for fast iteration while I'm writing.

I dug out a profiler this evening and did a little test:

python -m cProfile -s time `which pelican` content  -s pelicanconf.py

This gave me some output that indicated that the slugify function was called a whopping 300,504 times on my puny blog!

Done: Processed 228 article(s), 1 draft(s) and 5 page(s) in 21.45 seconds.
         22876023 function calls (22138959 primitive calls) in 21.782 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    43614    3.182    0.000    3.182    0.000 {method 'write' of '_io.BufferedWriter' objects}
   300504    2.145    0.000   10.506    0.000 utils.py:227(slugify)
1399152/1390065    1.771    0.000    1.903    0.000 {method 'sub' of '_sre.SRE_Pattern' objects}
   300504    1.358    0.000    1.880    0.000 __init__.py:22(unidecode)
1618724/1618686    0.889    0.000    0.903    0.000 {built-in method hasattr}
   303722    0.761    0.000    1.312    0.000 __init__.py:126(unescape)
   303722    0.688    0.000    2.934    0.000 __init__.py:148(striptags)

I'm not an expert on profiling, but if I'm reading this right, the slowest single operation is writing to disk (and I have an SSD!), and the second slowest is the slugify function, which accounts for over ten of the 21 seconds of cumulative time.

I started picking at slugify in hopes of making it faster. One idea from an IRC chatroom was to memoize the function, but when I tried that using @functools.lru_cache(maxsize=2048), it threw:

RuntimeError: maximum recursion depth exceeded

And I found the memoize decorator in utils.py, but it threw an error too.
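
For reference, the attempt looked roughly like this (just a sketch, not a working patch; it assumes slugify is a pure function that only ever receives hashable string arguments):

import functools

from pelican.utils import slugify

# Cache slugify results keyed on the input arguments so repeated slugs
# aren't recomputed. Only safe if the arguments are hashable and the
# function has no side effects.
slugify = functools.lru_cache(maxsize=2048)(slugify)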

I guess this is a bug report to say:

  1. Slugify is rather, um, sluggish.
  2. What's with all the recursion, and can we fix it? I'm not sure I understand it properly, but it looks like it will become a problem as blogs using Pelican get bigger.
  3. Why does my minor blog call slugify 300k times? I do NOT have that many links. That'd be over 1000 per page, practically.
  4. Has any effort gone into this and does anybody have ideas for fixing it?

I'm going to move on to other things until I hear back here, but if we can get a conversation going, I'll be happy to try to make this faster.

@justinmayer
Copy link
Member

@smartass101: Any thoughts on this topic?

@mlissner: Sidestepping the question of performance and focusing squarely on the "[this] is just too much for fast iteration while I'm writing" portion of your issue description, you can use the --write-selected flag for faster iterative changes.

@mlissner
Copy link
Contributor Author

mlissner commented Oct 9, 2014

Thanks, @justinmayer. I've played with --write-selected, but I can't for the life of me make it work and there doesn't appear to be any example in the docs.

The obvious way for it to work would be:

pelican content -s pelicanconf.py --debug --write-selected content/snow-mountain-distance-map.md

But that doesn't create any posts. I've tried about 20 other semi-logical approaches too. None create posts. Can't figure it out.

@justinmayer
Copy link
Member

Fairly certain the docs mention that the target is the path to the generated file in question — not the source document.
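
For example, something like this (the output path is illustrative and depends on your *_SAVE_AS settings):

pelican content -s pelicanconf.py --write-selected output/snow-mountain-distance-map.html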

almet added a commit that referenced this issue Oct 9, 2014
@almet
Member

almet commented Oct 9, 2014

I started hacking on something; I believe the maximum recursion error is due to how the comparison works on URLWrapper classes. If we use the name attribute when there is one, then that should work better.

However, I'm not sure why the tests are failing now…

almet added a commit that referenced this issue Oct 9, 2014
@smartass101
Contributor

One reason I think slugify is called so often is the highly dynamic usage of Content.url_format and URLWrapper._normalize_key. This could be made less dynamic, but that may break plugins or prevent certain use cases; the extreme reconfigurability offered to plugins requires everything to be as dynamic as possible. So while memoizing might improve the function's performance, it is unlikely to decrease the number of calls.

One approach I use to cut down on the calls is to set the override attributes via a plugin, so that the URLs aren't recomputed dynamically every time. But that only works because I know it won't break my particular setup:

article.override_url = article.url
article.override_save_as = article.save_as
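
A fuller sketch of such a plugin might look like this (illustrative only; it uses the standard article_generator_finalized signal, and it will break setups that expect URLs to be recomputed later):

from pelican import signals

def freeze_urls(article_generator):
    # Pin each article's URL and save path so the dynamic url_format
    # machinery (and therefore slugify) isn't re-evaluated on every access.
    for article in article_generator.articles:
        article.override_url = article.url
        article.override_save_as = article.save_as

def register():
    signals.article_generator_finalized.connect(freeze_urls)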

@mlissner
Contributor Author

mlissner commented Oct 9, 2014

@justinmayer I played with --write-selected some more and finally got it to work. I've submitted a PR with more docs for it. Thanks.

@ametaireau, @smartass101, it looks like you two are making progress on the performance issue. I'll stay out of the way so we don't duplicate effort. Let me know if I can help.

One other performance thought I had is that I'm losing a lot of speed to copying static files. Have we looked at creating hard links instead?

@mlissner
Contributor Author

mlissner commented Oct 9, 2014

I realize I've been going off on tangents. I moved the hardlinks discussion over here:

#1496

Sorry about that -- a bad tendency of mine.

@justinmayer
Member

@mlissner: I believe this has been addressed. If you feel otherwise, please comment here, and I'll re-open.

@mlissner
Contributor Author

Great news. I probably won't be profiling this again, but I'll keep my eye out for improvements next time I'm working on a post.

Is there a commit I can check out or a PR where this was completed?

@justinmayer
Member

Was referring to caching in general, as well as Alexis's commit noted above.

@mlissner
Contributor Author

I could be mistaken, but I don't think the commit above has been merged: https://github.com/getpelican/pelican/compare/fix-1493

@ingwinlu
Contributor

@mlissner no, we don't memoize it. @avaris produced an improvement that only generates slugs when needed instead of every time, which is even better: #1701

I am interested in another cProfile run if you still have your setup in place, or from anybody else with a big enough site.

@mlissner
Contributor Author

Here's a comparison of 3.4.0 and 3.6.0, which looks like it just landed on pip. This creates a lot of output, so I've only included stuff that takes more than half a second. For reference, this is on an i7 laptop with 12GB of RAM and an SSD.

First, 3.4.0:

python -m cProfile -s time `which pelican` content  -s pelicanconf.py
Done: Processed 231 article(s), 2 draft(s) and 5 page(s) in 49.16 seconds.
         23899281 function calls (23139920 primitive calls) in 49.722 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   311538    6.162    0.000   30.052    0.000 utils.py:227(slugify)
1463227/1453360    4.710    0.000    4.942    0.000 {method 'sub' of '_sre.SRE_Pattern' objects}
   311538    3.906    0.000    5.288    0.000 __init__.py:22(unidecode)
1680254/1680216    2.541    0.000    2.553    0.000 {built-in method hasattr}
   314821    2.234    0.000    3.805    0.000 __init__.py:126(unescape)
   314821    1.867    0.000    8.346    0.000 __init__.py:148(striptags)
   832921    1.780    0.000    2.072    0.000 re.py:277(_compile)
     1225    1.687    0.001    1.687    0.001 {built-in method mkdir}
   663549    1.211    0.000    1.716    0.000 {built-in method __new__ of type object at 0x9d1c40}
627979/627758    1.173    0.000    2.319    0.000 <frozen importlib._bootstrap>:2241(_handle_fromlist)
   632826    1.105    0.000    4.322    0.000 __init__.py:70(__new__)
   766495    1.087    0.000    5.172    0.000 re.py:172(sub)
   293982    0.943    0.000   29.487    0.000 urlwrappers.py:43(_normalize_key)
  2918216    0.819    0.000    0.819    0.000 {method 'append' of 'list' objects}
953789/953787    0.744    0.000    0.888    0.000 {built-in method isinstance}
     8034    0.737    0.000   31.941    0.004 article.html:39(block_content)
   287672    0.728    0.000   29.815    0.000 urlwrappers.py:47(__eq__)
    43913    0.636    0.000    0.636    0.000 {method 'write' of '_io.BufferedWriter' objects}
   312954    0.503    0.000    0.506    0.000 {method 'encode' of 'str' objects}

Then, 3.6.0:

python -m cProfile -s time `which pelican` content  -s pelicanconf.py
         23731522 function calls (22971989 primitive calls) in 32.439 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   295672    3.673    0.000   17.576    0.000 utils.py:246(slugify)
1408634/1399174    3.125    0.000    3.418    0.000 {method 'sub' of '_sre.SRE_Pattern' objects}
   295672    2.140    0.000    2.989    0.000 __init__.py:22(unidecode)
1604494/1604456    1.513    0.000    1.528    0.000 {built-in method hasattr}
   298955    1.330    0.000    2.240    0.000 __init__.py:126(unescape)
   814549    1.166    0.000    1.676    0.000 re.py:277(_compile)
   298955    1.102    0.000    4.895    0.000 __init__.py:148(striptags)
   268323    0.798    0.000    0.798    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
   739586    0.727    0.000    3.242    0.000 re.py:172(sub)
599613/599391    0.700    0.000    1.533    0.000 <frozen importlib._bootstrap>:2241(_handle_fromlist)
   623933    0.680    0.000    0.978    0.000 {built-in method __new__ of type object at 0x9d1c40}
   601094    0.664    0.000    2.499    0.000 __init__.py:70(__new__)
   293130    0.631    0.000   18.128    0.000 urlwrappers.py:58(_normalize_key)
  2779828    0.527    0.000    0.527    0.000 {method 'append' of 'list' objects}
972457/972454    0.510    0.000    0.632    0.000 {built-in method isinstance}
   286784    0.507    0.000   18.630    0.000 urlwrappers.py:62(__eq__)

Looks like pretty good improvements. We're still running slugify over 1,000 times per page, which seems insane, but it's noticeably faster overall.

@ingwinlu
Contributor

Can you add a run comparing 3.6 with no cache against 3.6 with generator caching as well? Thanks for the data!

EDIT: Is that with any plugins?

@mlissner
Contributor Author

Sure. How do I turn on generator caching?


@ingwinlu
Contributor

CACHE_CONTENT = True
CONTENT_CACHING_LAYER = 'generator'
LOAD_CONTENT_CACHE = True

Keep in mind that the first run won't see any effect (since there is no cache yet); the second run should load the appropriate cached content.

@avaris
Member

avaris commented Jun 16, 2015

#1701 doesn't provide any performance improvement by itself, because the slugs for URLWrappers are pretty much always produced. ArticlesGenerator will produce them while generating collections for categories, authors and tags. The point of #1701 was to delay the slugify call so that a plugin (coincidentally, I'm working on one) can inject a custom slug, in which case slugify won't be called at all. slugify is rather expensive, as you observed.
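
Roughly, the delayed-slug idea looks like this (an illustrative sketch, not the actual change):

from pelican.utils import slugify

class LazySlugExample:
    # Illustrative only: compute the slug on first access instead of
    # eagerly in __init__, so a plugin that assigns a slug up front
    # prevents slugify from ever running for that object.
    def __init__(self, name):
        self.name = name
        self._slug = None

    @property
    def slug(self):
        if self._slug is None:
            self._slug = slugify(self.name)
        return self._slug

    @slug.setter
    def slug(self, value):
        self._slug = value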

Regarding the number of slugify calls, that depends on a lot of things. Pelican creates a new object for each category, author or tag in every article, even if an identical one was already created for some other article. So they all get to call slugify.

Also, these are all connected:

286784    0.507    0.000   18.630    0.000 urlwrappers.py:62(__eq__)
293130    0.631    0.000   18.128    0.000 urlwrappers.py:58(_normalize_key)
295672    3.673    0.000   17.576    0.000 utils.py:246(slugify)

The numbers are excessive for sure, but I doubt this is purely from Pelican's internals. If the theme is doing comparisons against URLWrapper objects, that would explain it.
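
Conceptually, the chain behind those three rows looks something like this (a rough sketch, not Pelican's exact source):

from pelican.utils import slugify

class URLWrapperSketch:
    # Every equality check normalizes its operands through slugify
    # instead of reusing an already-computed slug, so a theme that
    # compares tags in a loop multiplies the slugify calls.
    def __init__(self, name):
        self.name = name

    def _normalize_key(self, key):
        return slugify(key)

    def __eq__(self, other):
        other_name = other.name if isinstance(other, URLWrapperSketch) else str(other)
        return self._normalize_key(self.name) == self._normalize_key(other_name)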

@mlissner is your site source public? If so, I could examine it and maybe see if there is room to improve.

@mlissner
Contributor Author

Here's another run after I set the variables mentioned by @ingwinlu and ran it twice:

python -m cProfile -s time `which pelican` content  -s pelicanconf.py
WARNING: WITH_FUTURE_DATES conflicts with CONTENT_CACHING_LAYER set to 'generator', use 'reader' layer instead
Done: Processed 0 articles, 0 drafts, 0 pages and 0 hidden pages in 2.61 seconds.
         1356755 function calls (1321676 primitive calls) in 3.201 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    39027    0.635    0.000    0.635    0.000 {method 'write' of '_io.BufferedWriter' objects}
    39728    0.232    0.000    0.232    0.000 {method 'read' of '_io.BufferedReader' objects}
        2    0.180    0.090    0.180    0.090 {method 'compress' of 'zlib.Compress' objects}
       34    0.117    0.003    0.117    0.003 {built-in method compile}
    12552    0.078    0.000    0.078    0.000 {built-in method stat}
    11365    0.077    0.000    0.119    0.000 posixpath.py:318(normpath)
   125575    0.056    0.000    0.057    0.000 {built-in method isinstance}
      621    0.054    0.000    0.919    0.001 shutil.py:64(copyfileobj)
     1857    0.043    0.000    0.043    0.000 {built-in method open}
      227    0.042    0.000    0.042    0.000 {built-in method loads}
    20689    0.040    0.000    0.089    0.000 nodes.py:164(iter_child_nodes)

Not sure what the warning means, but HOLY COW, that's a massive improvement.

@avaris, yes, my site is public. You can find it here: https://github.com/mlissner/michaeljaylissner.com/. Any thoughts you have would be more than welcome.

@avaris
Member

avaris commented Jun 17, 2015

@mlissner not sure what happened there but, yes, doing nothing is usually fast :).

Done: Processed 0 articles, 0 drafts, 0 pages and 0 hidden pages in 2.61 seconds.

BTW, thanks for the link. I'll see what I can do.

@mlissner
Contributor Author

@avaris, curious. I literally ran it twice in a row. Seems like the caching worked?

@avaris
Member

avaris commented Jun 18, 2015

@mlissner nope, that's not how caching is supposed to work. The cache eliminates re-reading unmodified input; Pelican should still process articles and pages to generate output.

@mlissner
Contributor Author

Oh. Well, that makes sense, and it's friggin' excellent. I hope it's enabled by default.

I look forward to what you learn from looking over the site.

@justinmayer
Member

In 3.5, caching was enabled by default. But some users would be surprised when they made changes to their settings file and didn't see those changes reflected... due to the caching.

So we changed the default in 3.6 to caching disabled. This jibes better with the "principle of least astonishment", and folks with large sites can easily enable it by adding a few lines to their settings file.

@avaris
Member

avaris commented Jun 20, 2015

@mlissner I looked at your site. It was actually the theme that was causing the absurd number of slugify calls: this double loop, especially the if aTag == tag comparison. It was also due to the less-than-ideal implementation of __eq__ for URLWrapper objects; the PR above by @ingwinlu will help a great deal with that.

But, if you want a quick fix before that's merged and released, you can change that if to if aTag.slug == tag.slug for a decent performance boost.

@mlissner
Contributor Author

Thanks, @avaris, for looking into this! I filed a bug against pelican-elegant and tried to get in touch with the maintainer to get some movement going there. Hopefully we'll get this bottleneck fixed!
