New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Template cache, {% extends %} and globals - design issue? #295
Comments
Yes. This is currently the case. This is related to the general problems with scoping in Jinja2 that need to be tackled. Unfortunately that is a very big change and I am not sure how to fix this without spending a lot of time on it. |
Here are a couple observations for anyone who runs into this in the future, as I have just this evening. If you do @miracle2k's steps in the opposite order, that could be equally problematic: If you first instantiate Of course, I'm pretty sure this breaks as well. Is it documented somewhere that you shouldn't call globals_1 = {"x": 42}
globals_2 = {"y": 101}
jinja2_env = jinja2.Environment(loader=jinja2.FileSystemLoader("."))
instance_1 = jinja2_env.get_template("foo.html", globals=globals_1)
instance_2 = jinja2_env.get_template("foo.html", globals=globals_2)
# This fails because instance_1.globals is instance_2.globals.
assert "y" in instance_2.globals I think you can get around these problems with template globals in exchange for a ~10x performance hit by using a byte code cache and disabling class MemoryBytecodeCache(jinja2.BytecodeCache):
def __init__(self):
self._cache = {}
def load_bytecode(self, bucket):
try:
bucket.code = self._cache[bucket.key]
except KeyError:
return
def dump_bytecode(self, bucket):
self._cache[bucket.key] = bucket.code
jinja2_env = jinja2.Environment(
loader=jinja2.FileSystemLoader("."),
cache_size=0,
bytecode_cache=MemoryBytecodeCache(),
)
instance_1 = jinja2_env.get_template("foo.html", globals=globals_1)
instance_2 = jinja2_env.get_template("foo.html", globals=globals_2)
# This now tests true, like I originally expected.
assert "y" in instance_2.globals Here's a hasty benchmark showing the cost of disabling the template cache (macOS, 2020 MacBook Pro, Python 3.8):
Code that produced that: def run_silly_benchmark(*timer_args, repeat=5, **timer_kwargs):
timer = timeit.Timer(*timer_args, **timer_kwargs)
num_times, time_taken = timer.autorange()
times = [time_taken]
for _ in range(1, repeat):
times.append(timer.timeit(num_times))
print(
f"Time per call for {num_times:6d} loops: ",
" ".join(
f"{(time * 10**6) / num_times:7.2f} μs" for time in sorted(times)
),
)
loader = jinja2.FileSystemLoader(".")
for env in (
jinja2.Environment(loader=loader),
jinja2.Environment(loader=loader, cache_size=0),
jinja2.Environment(loader=loader, bytecode_cache=MemoryBytecodeCache()),
jinja2.Environment(
loader=loader, cache_size=0, bytecode_cache=MemoryBytecodeCache()
),
):
print(
"Testing %s template cache, %s byte code cache"
% (
"without" if env.cache is None else "with",
"without" if env.bytecode_cache is None else "with",
)
)
run_silly_benchmark("env.get_template('foo.html')", globals={"env": env})
|
I'm not sure if you saw, but there's a PR linked above your comment that addresses this. |
I did notice that, thank you. I noticed the PR is a couple months old on an issue that's six years old, and I can't even concretely say that it will be merged, so since I didn't know how much longer this issue might be outstanding I thought my comments might help someone else passing this way. And please don't anyone on the development team take this as a criticism—I absolutely understand how valuable developer time is, and I am very grateful for all the work that's gone into this project. |
It will be merged, along with all the other PRs we've gotten from the MLH program over the last 3 months. There have been so many I haven't been able to get to them all yet. |
This issue is now closed due to #1244, but for what it's worth – the behavior after this PR is still very surprising to me as a user. The new behavior is to merge the new globals in with any globals previously passed but to retain existing globals from the cached template. This means that data can still leak between calls to get_template (even possibly resulting in leaking data between users, depending on how jinja is used – an incredibly severe bug). It also means the semantic behavior depends on if caching is enabled, rather than caching being solely a performance optimization. I hope this behavior can be reconsidered before final release. |
The fact that different globals can be passed to
I'm inclined to go with option 1. I don't believe it's particularly common to load the same template with different globals. Flask, for example, doesn't expose the fact that templates can have distinct globals at all, and systems such as Ansible don't expose this either. |
If I understand correctly, the primary motivation for the cache is to avoid the cost of re-parsing the template HTML multiple times. My hope would be that Jinja could cache that parsing but return a new Template object, which should require only a single small allocation. If Flask and Ansible don't expose the ability to pass different globals in, then they wouldn't be affected by a minor performance regression as long as the caching of templates without globals was not affected (I agree that that seems to be the more common usage pattern). I would argue that even with a documented warning, this is a very large footgun (even if someone is not too likely to run into this, it can be an incredibly bad bug if it does crop up) and so would prefer a solution 2 or 3 – but a warning in the documentation would be still a significant improvement over the status quo. |
The cache stores the loaded template object, keyed on the loader and name. Jinja parses Jinja syntax, not HTML. The parsing is cached by the bytecode cache, not the template cache. The template cache is only to avoid loading the same object more than once, it's not a guarantee that the context the object is rendered with, such as globals, will be unchanged. I'll add docs, but both of you have identified how to avoid the cache issue: by disabling the cache if you want to change template globals every time in a multi call / thread safe way. Given that the docs already discourage this, I'm classifying this as undefined behavior that you shouldn't be using. I don't agree it's an "incredibly severe bug", there's nothing technically wrong with the current behavior, it's just not the behavior you want. |
Sorry, sloppy writing on my part. In case it's not clear, I'm thinking of code something like the following: env = jinja2.Environment()
@app.route('/sendpasswordreset')
def send_password_reset():
resetUrl = generate_password_reset_url(g.logged_in_user_id)
template = env.get_template('password_reset_email.html', globals={'RESET_URL': resetUrl})
html = template.module.password_reset_email()
send_email(g.logged_in_user_id, html) Whereas the very similar-looking template = env.get_template('password_reset_email.html')
html = template.module.password_reset_email(RESET_URL=resetUrl) is correct (I believe). If the globals leak between requests, then you can send one person's password reset link to another person. This is an incredibly severe bug (and I indeed had a serious production issue resulting from this behavior). Obviously not all instances of this bug will be equally severe. |
Globals as they were created are not for per-render instantiations. This used to be a much stronger point made originally when they were the only thing passed between imports. I think Flask might have negatively contributed to people having different expectations in globals because it registers thread locals there (request etc.). Globals inherently have to be cached. It's not a small performance impact, but has some rather fundamental implications about how Jinja2 operates. I do think that this should be more explicitly called out in the docs though if this causes confusion. |
The correct way to pass data into a template is to use |
In case you're misunderstanding me: My confusion isn't with expecting global objects (like my My mental model is that get_template return a new template each time it is called with different arguments, not that it reuses a single template and mutates it. (I generally don't expect functions named "get" to have any effects.) So I wasn't thinking about this as "changing the globals for an existing template", instead as getting a new template with the right globals. Perhaps this is the wrong mental model but I think it is one that is easy to have. This behavior to me is just as unexpected as if multiple instantiations of Environment shared globals, or as if the variables passed to the Certainly I realize now that globals were not intended for this. But I don't think it is at all obvious from the API itself that there is such a meaningful difference in the behavior between passing variables as |
The op's issue was that the template got loaded incidentally as part of On the other hand, your issue is that it wasn't clear that globals are not for render-specific data. I understand how you could arrive at that, but the only thing I can do about it is update the docs and hope that people can't misinterpret them again. #1244 at least allows updating the template globals like the op wanted without clobbering the environment globals. Yes, they're still globals and not local to each render of the template, but that's by definition. If we disallow updating template globals, we're back at the op's issue and your issue. We could raise an error so you wouldn't have been able to do what you tried, but that doesn't change the op's issue. If we try to cache based on the globals as well, that allows what you were trying to do, but we want to discourage that not enable it. |
My proposal given my understanding of your constraints would be to either (a) include the globals in the cache key (even just by dict object identity) – even if that isn't something you want to encourage – or (b) not cache invocations to get_template that pass any globals. That said, I haven't necessarily meant to suggest any particular solution so far – my intent right now is more to make sure I'm properly communicating why I think this is a pitfall large enough to make it worth seriously considering making a code change for. |
Say I have this template with an extends call:
foo.html:
I load it:
This causes two calls to
Environment._load_template
: First, forfoo.html
, the forbase.html
.Environment._load_template
caches both templates.The crucial point is that for the second call, the "globals" parameter to
Environment._load_template
is missingmy_globals
.If I now attempt to load the base template directly:
The cache will be used, and my globals will not be available.
Using 2.7.2
The text was updated successfully, but these errors were encountered: