Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2.8.0 performance regression compared to 2.7.3 #485

Closed
jstasiak opened this issue Sep 1, 2015 · 6 comments
Closed

2.8.0 performance regression compared to 2.7.3 #485

jstasiak opened this issue Sep 1, 2015 · 6 comments

Comments

@jstasiak
Copy link

jstasiak commented Sep 1, 2015

It seems like with Jinja2 2.8.0 there's extra IO being performed even with templates being cached and auto reloading disabled.

Test case (we use template includes a lot and this is a relatively good test):

# main.py
from jinja2 import Environment, FileSystemLoader

if __name__ == '__main__':
    environment = Environment(
        auto_reload=False,
        loader=FileSystemLoader(['.']),
    )
    template = environment.get_template('template.txt')

    for i in xrange(1000):
        template.render()
# template.txt
{% for i in range(100) %}
    {% include 'part.txt' %}
{% endfor %}
# part.txt
this is part

With Jinja2 2.7.3:

./jinja2-27/bin/python -m cProfile -s tottime ./main.py | head -n 20
         2536172 function calls (2535412 primitive calls) in 2.276 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   101000    0.436    0.000    1.033    0.000 runtime.py:53(new_context)
   101000    0.380    0.000    0.546    0.000 runtime.py:112(__init__)
   301000    0.309    0.000    2.148    0.000 template.txt:5(root)
   100001    0.178    0.000    0.665    0.000 environment.py:770(get_template)
   100003    0.143    0.000    0.205    0.000 utils.py:380(__getitem__)
   100001    0.097    0.000    0.394    0.000 environment.py:756(_load_template)
   100003    0.084    0.000    0.290    0.000 utils.py:336(get)
   101003    0.082    0.000    0.094    0.000 nodes.py:81(__init__)
   101000    0.077    0.000    1.110    0.000 environment.py:995(new_context)
   201010    0.070    0.000    0.103    0.000 _compat.py:57(<lambda>)
1003/1002    0.065    0.000    2.213    0.002 {method 'join' of 'unicode' objects}
   102365    0.055    0.000    0.055    0.000 {isinstance}
   100017    0.042    0.000    0.042    0.000 {locals}
   100006    0.040    0.000    0.040    0.000 {method 'acquire' of 'thread.lock' objects}
   201010    0.033    0.000    0.033    0.000 {method 'iteritems' of 'dict' objects}

With Jinja2 2.8.0:

./jinja2-28/bin/python -m cProfile -s tottime ./main.py | head -n 20
         4143255 function calls (4142307 primitive calls) in 6.944 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   100003    0.866    0.000    0.866    0.000 {open}
   100003    0.646    0.000    0.646    0.000 {method 'read' of 'file' objects}
   100003    0.538    0.000    3.989    0.000 loaders.py:167(get_source)
   101000    0.528    0.000    1.246    0.000 runtime.py:55(new_context)
   101000    0.453    0.000    0.647    0.000 runtime.py:115(__init__)
   100003    0.418    0.000    0.418    0.000 {posix.stat}
   301000    0.367    0.000    6.793    0.000 template.txt:5(root)
   100003    0.345    0.000    0.345    0.000 {method 'close' of 'file' objects}
   100001    0.302    0.000    4.688    0.000 environment.py:768(_load_template)
   100001    0.208    0.000    5.014    0.000 environment.py:791(get_template)
   100003    0.207    0.000    0.207    0.000 {_codecs.utf_8_decode}
   100003    0.197    0.000    0.301    0.000 loaders.py:22(split_template_path)
   100003    0.187    0.000    0.279    0.000 utils.py:391(__getitem__)
   100004    0.156    0.000    0.431    0.000 {method 'decode' of 'str' objects}
   100003    0.149    0.000    0.240    0.000 posixpath.py:61(join)

This is CPython 2.7.10.

This regression is likely to have been introduced by:

@urbaniak
Copy link

Got the same issue, are we planning to do something with that?

@ThiefMaster
Copy link
Member

ping @mitsuhiko

@squfrans
Copy link

squfrans commented Mar 2, 2016

i can see a lot of extra IO after adding line 175 to loader.py:

167     def get_source(self, environment, template):
168         pieces = split_template_path(template)
169         for searchpath in self.searchpath:
170             filename = path.join(searchpath, *pieces)
171             f = open_if_exists(filename)
172             if f is None:
173                 continue
174             try:
175                 print("FileSystemLoader: reading file %s" % filename)      # <----
176                 contents = f.read().decode(self.encoding)
177             finally:
178                 f.close()

@iceboy233
Copy link

Encountered the same problem. The FileSystemLoader is reading disk even if the template is in cache due to some logic error in Environment._load_template() introduced in 6671b9. It is deadly in a purely async web server. Downgrade to 2.7.3 for now.

@ifduyue
Copy link

ifduyue commented Mar 4, 2016

I think the problem is at line 670 in 6671b97, where a template file is opened only to get its abspath.

@iceboy233
Copy link

I think this need to be fixed in 2.8.1...

pgjones added a commit to pgjones/jinja that referenced this issue May 8, 2016
In 6671b97 the load_template method
was altered to use a cache key other than the template name. The key
chosen was the abs path as returned from the loader get_source
method. Unless there is no path in which case the name is
used. Unfortunately this introduced a performance regression, pallets#485, as
the get_source method (in the FileStoreLoader) loads the template
(causing IO).

The purpose of pallets#332 was to allow the loader to change whilst ensuring
the correct template was loaded, i.e. to fix this case

    env.loader = loader1
    env.get_template('index.html') # return loader1/index.html
    env.loader = loader2
    env.get_template('index.html') # also return loader1/index.html because of cache

This commit changes the cache key to be a combination of a hash of the
loader and the template name. Therefore fixing the above case without
calling the get_source method and thereby avoiding the IO load.
pgjones added a commit to pgjones/jinja that referenced this issue May 9, 2016
In 6671b97 the load_template method
was altered to use a cache key other than the template name. The key
chosen was the abs path as returned from the loader get_source
method. Unless there is no path in which case the name is
used. Unfortunately this introduced a performance regression, pallets#485, as
the get_source method (in the FileStoreLoader) loads the template
(causing IO).

The purpose of pallets#332 was to allow the loader to change whilst ensuring
the correct template was loaded, i.e. to fix this case

    env.loader = loader1
    env.get_template('index.html') # return loader1/index.html
    env.loader = loader2
    env.get_template('index.html') # also return loader1/index.html because of cache

This commit changes the cache key to be a combination of a hash of the
loader and the template name. Therefore fixing the above case without
calling the get_source method and thereby avoiding the IO load.

A test has been added to ensure the above case works as expected, this
required a minor refactor of the caching tests.
pgjones added a commit to pgjones/jinja that referenced this issue May 10, 2016
In 6671b97 the load_template method
was altered to use a cache key other than the template name. The key
chosen was the abs path as returned from the loader get_source
method. Unless there is no path in which case the name is
used. Unfortunately this introduced a performance regression, pallets#485, as
the get_source method (in the FileStoreLoader) loads the template
(causing IO).

The purpose of pallets#332 was to allow the loader to change whilst ensuring
the correct template was loaded, i.e. to fix this case

    env.loader = loader1
    env.get_template('index.html') # return loader1/index.html
    env.loader = loader2
    env.get_template('index.html') # also return loader1/index.html because of cache

This commit changes the cache key to be a tuple of the loader and the
template name. Therefore fixing the above case without calling the
get_source method and thereby avoiding the IO load.

A test has been added to ensure the above case works as expected, this
required a minor refactor of the caching tests.
pgjones added a commit to pgjones/jinja that referenced this issue May 19, 2016
In 6671b97 the load_template method
was altered to use a cache key other than the template name. The key
chosen was the abs path as returned from the loader get_source
method. Unless there is no path in which case the name is
used. Unfortunately this introduced a performance regression, pallets#485, as
the get_source method (in the FileStoreLoader) loads the template
(causing IO).

The purpose of pallets#332 was to allow the loader to change whilst ensuring
the correct template was loaded, i.e. to fix this case

    env.loader = loader1
    env.get_template('index.html') # return loader1/index.html
    env.loader = loader2
    env.get_template('index.html') # also return loader1/index.html because of cache

This commit changes the cache key to be a tuple of the loader and the
template name. Therefore fixing the above case without calling the
get_source method and thereby avoiding the IO load.

A test has been added to ensure the above case works as expected, this
required a minor refactor of the caching tests.
pgjones added a commit to pgjones/jinja that referenced this issue May 19, 2016
In 6671b97 the load_template method
was altered to use a cache key other than the template name. The key
chosen was the abs path as returned from the loader get_source
method. Unless there is no path in which case the name is
used. Unfortunately this introduced a performance regression, pallets#485, as
the get_source method (in the FileStoreLoader) loads the template
(causing IO).

The purpose of pallets#332 was to allow the loader to change whilst ensuring
the correct template was loaded, i.e. to fix this case

    env.loader = loader1
    env.get_template('index.html') # return loader1/index.html
    env.loader = loader2
    env.get_template('index.html') # also return loader1/index.html because of cache

This commit changes the cache key to be a tuple of the id(loader) and
the template name. Therefore fixing the above case without calling the
get_source method and thereby avoiding the IO load.

A test has been added to ensure the above case works as expected, this
required a minor refactor of the caching tests.
@davidism davidism closed this as completed Jul 6, 2016
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 13, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants