Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize URL building #1281

Merged
merged 1 commit into from May 23, 2018

Conversation

Projects
None yet
6 participants
@edk0
Copy link
Member

commented Apr 20, 2018

<CounterPillow> if Python was as good as D, url_for would just be a compile time template.

I appreciate how silly this is going to look, but it does make URL building much faster.

$ git checkout -q upstream/master && python3 test.py
0.5681149999982154
$ git checkout -q optimize-building && python3 test.py
0.0826580000029935

(test.py)

I have some ideas for related tests, but don't want to put any more time into this unless it might actually go somewhere. It passes the existing tests on py2 and 3, at least on a clear Thursday night with a waxing crescent moon.

Worth noting up front: this approach is mutually exclusive with anything like #724

@davidism

This comment has been minimized.

Copy link
Member

commented Apr 20, 2018

Wow, thanks for looking into this. Building URLs definitely needs to be faster, and this is an interesting approach.

However, I need to be able to maintain this, which means I need to understand what's going on. Would you give at least a basic explanation of what this does?

@ThiefMaster

This comment has been minimized.

Copy link
Member

commented Apr 20, 2018

Heh, exactly what I thought - this looks like black magic, which is something I usually like, but from a quick look it wasn't obvious to me what you are doing there. Something with assembling Python bytecode yourself?

@edk0

This comment has been minimized.

Copy link
Member Author

commented Apr 20, 2018

Would you give at least a basic explanation of what this does?

I'll give it a go!

@ThiefMaster is right—I'm emitting python bytecode for a function (well, actually two) per Rule that builds its return value in an optimal way, or at least the best way I could see. Apart from the obvious advantages of pre-compiling something, this has the neat (and, from what I've seen, probably quite significant) side effect of destructuring the values dict, which is a pain otherwise.

The bytecode for a_rule from above, /a/<int:x>/<y>, looks like this (though there's also a shorter version used when append_unknown is false). I've omitted some extraneous detail, like memory addresses, from the pastes a bit for brevity:

  1           0 LOAD_CONST               0 ('')
              2 LOAD_CONST               1 ('/a/')
              4 LOAD_CONST               2 (<bound method NumberConverter.to_url>)
              6 LOAD_FAST                0 (x)
              8 CALL_FUNCTION            1
             10 LOAD_CONST               3 ('/')
             12 LOAD_CONST               4 (<bound method BaseConverter.to_url>)
             14 LOAD_FAST                1 (y)
             16 CALL_FUNCTION            1
             18 LOAD_FAST                2 (.keyword_arguments)
             20 JUMP_IF_TRUE_OR_POP     28
             22 LOAD_CONST               0 ('')
             24 DUP_TOP
             26 JUMP_FORWARD            10 (to 38)
        >>   28 LOAD_CONST               5 (functools.partial(url_encode, [lots of stuff]))
             30 ROT_TWO
             32 CALL_FUNCTION            1
             34 LOAD_CONST               6 ('?')
             36 ROT_TWO
        >>   38 BUILD_STRING             6
             40 BUILD_TUPLE              2
             42 RETURN_VALUE

which is an awful lot like the code you'd get by compiling this:

def builder(x, y, **kwargs):
    return ('',
            ''.join(('/a/',
                     self._converters['x'].to_url(x),
                     '/',
                     self._converters['y'].to_url(y),
                     '?' if kwargs else '',
                     url_encode(kwargs, [lots of stuff]) if kwargs else '')))

but with one less jump, and with all the information that's known ahead-of-time loaded into the function as literals. I chose this route rather than, say, generating a Python AST (let's not even talk about generating literal source) and compiling that because it makes certain things easier: for example, Python 3.6 has formatted string literals, and the bytecode compiler can use their BUILD_STRING opcode just as easily as the ''.join technique.

As for altogether saner options like generating a list of things to do and executing them with regular Python, I considered it, then realised that list is basically what Python bytecode already does. With that said, I'm sure a middle ground is possible, and I recognize that sometimes fun has to give way to maintainability :)

@lepture

This comment has been minimized.

Copy link
Member

commented Apr 21, 2018

This is really fantastic. But I don't agree with " I recognize that sometimes fun has to give way to maintainability". Werkzeug is widely used by many people, I think maintainability is very important.

@edk0

This comment has been minimized.

Copy link
Member Author

commented Apr 21, 2018

That's what I meant—maintainability is more important than fun

@agrif

This comment has been minimized.

Copy link

commented Apr 21, 2018

Just two cents from Random Internet Passerby: Werkzeug is widely used by many people, who use it specifically so that Werkzeug can solve the heck out of common WSGI problems. It can afford some complexity to solve those problems well. I don't think a bytecode generator is going too far, if there is a practical benefit.

@edk0 edk0 force-pushed the edk0:optimize-building branch 3 times, most recently from 44ce9b1 to 261c7c1 Apr 21, 2018

@CounterPillow

This comment has been minimized.

Copy link

commented Apr 26, 2018

Hi,

I'm the person who originally nerdsniped edk into writing this, and I have some interest in it being merged into upstream.

I'm a contributor to nyaa, a flask application codebase which has a fairly large deployment that currently ranks in the low 700s of Alexa's top sites. As the code is written to be deployable even if on a shoestring budget, any optimisations in hot paths are very much welcome.

url_for is such a hot path. Predictably, the majority of the server time is taken up by database queries, but when profiling a page, url_for is usually near the top as well. Pregenerating url_for calls is of course always a possibility, but making url_for faster is still the easier solution considering what it is.

@davidism

This comment has been minimized.

Copy link
Member

commented May 22, 2018

@edk0 Why can't #724 be used with this?

@edk0

This comment has been minimized.

Copy link
Member Author

commented May 22, 2018

First off: I was assuming that trying to destructure a MultiDict with ** would simply fail, but testing it now I see that a roundtrip through MultiDict and ** wraps everything in []:

>>> def x(**kw): print(kw)
... 
>>> x(**MultiDict({'a': 3}))
{'a': [3]}

which isn't nearly as fatal for my approach.

Anyway, the upshot is that MultiDict is not a problem as long as Rule.build is internal/undocumented, and MapAdapter.build (lossily) converts MultiDict down to dict. If that stops happening, my compiled builder functions will see lists of things where plain things were expected, and either fail to convert:

>>> a_rule.build({'x': 1, 'y': 2})
('', '/a/1/2')
>>> a_rule.build(MultiDict({'x': 1, 'y': 2}))
>>>

or yield silly URLs like /a/%5B1%5D, depending on the failure mode for the particular converter.

So, contrary to what I said in the OP, this wouldn't rule out #724, but it would take a little extra work to deal with it.

@davidism

This comment has been minimized.

Copy link
Member

commented May 22, 2018

Just to make sure I understand: MultiDict, when passed a dict, will convert single values (as opposed to lists) to lists of length 1. As long as the reverse happens during build, everything will work?

>>> a = MultiDict({'x': 1, 'y': [2, 3]})
>>> a
MultiDict([('a', 1), ('b', 2), ('b', 3)])

# build should do this
>>> to_dict(a)
{'a': 1, 'b': [2, 3]}

That's how the PR works now.

@edk0

This comment has been minimized.

Copy link
Member Author

commented May 22, 2018

As long as the reverse happens during build, everything will work?

Yes.

@davidism

This comment has been minimized.

Copy link
Member

commented May 22, 2018

I'm going to clean up that other PR, merge both locally and run the tests, then merge both here if everything works.

@davidism davidism added this to the 0.15 milestone May 22, 2018

@@ -741,6 +745,9 @@ def _build_regex(rule):
if not self.is_leaf:
self._trace.append((False, '/'))

self._build = self.compile_builder(False)
self._build_u = self.compile_builder(True)

This comment has been minimized.

Copy link
@davidism

davidism May 23, 2018

Member

Let's use a more descriptive name like _build_unknown.

return quote


fast_url_quote = _make_url_encoder()

This comment has been minimized.

Copy link
@davidism

davidism May 23, 2018

Member

Just checking, did you benchmark this to confirm it's actually faster? If it is faster and does the same thing as url_quote, why not replace url_quote completely?

This comment has been minimized.

Copy link
@edk0

edk0 May 23, 2018

Author Member

I did. It's faster, but achieves fastness by pre-baking all the parameters except the string to quote, so it's not a suitable replacement in general.

(edit: this much faster, with the test script from the OP)

fn = types.FunctionType(co, {}, None, self.argdefs)
return fn

def compile_builder(self, append_unknown=True):

This comment has been minimized.

Copy link
@davidism

davidism May 23, 2018

Member

This should probably be _compile_builder, it's not part of the public API for a rule.

This comment has been minimized.

Copy link
@edk0

edk0 May 23, 2018

Author Member

👍


return domain_part, url
return self._build(**values)
except TypeError:

This comment has been minimized.

Copy link
@davidism

davidism May 23, 2018

Member

I'm assuming this replaces the except ValidationError from the previous code, but why has it changed to TypeError? BuilderCompiler doesn't appear to hanlde ValidationError either.

This comment has been minimized.

Copy link
@edk0

edk0 May 23, 2018

Author Member

Functions raise TypeError when they're called with improper arguments, so I caught it—with hindsight, I think I'd missed that Rule.build is internal and doesn't need to handle such cases, so it could go. As for ValidationError, converters never raise it when going from Python to URL anyway.

This comment has been minimized.

Copy link
@davidism

davidism May 23, 2018

Member

The NumberConverter raises a ValidationError in both to_python and to_url. The fact that it was being handled suggests it was part of the API even if few of the built-in converters used it.

This comment has been minimized.

Copy link
@edk0

edk0 May 23, 2018

Author Member

I'm not seeing where, but I'm sure putting it back can't hurt. Will do.

This comment has been minimized.

Copy link
@davidism

davidism May 23, 2018

Member

Well, you're right, I was mistaken about NumberConverter.to_python, but I think we should still handle the case.

This comment has been minimized.

Copy link
@edk0

edk0 May 23, 2018

Author Member

Sure, done 7d10b58

@edk0 edk0 force-pushed the edk0:optimize-building branch 3 times, most recently from c935441 to 7d10b58 May 23, 2018

@davidism

This comment has been minimized.

Copy link
Member

commented May 23, 2018

OK, I think this is in a good place at this point. Would you rebase this to squash all the intermediate work into one commit against current master?

git fetch origin  # or whatever you called this remote
git rebase -i origin/master
git push -f

(You can try it out on a branch first if you want to make sure it will rebase cleanly.)

@edk0 edk0 force-pushed the edk0:optimize-building branch 3 times, most recently from f035a7f to f4c8d12 May 23, 2018

@edk0

This comment has been minimized.

Copy link
Member Author

commented May 23, 2018

@davidism alright, rebased :)

@davidism davidism merged commit 6cb8227 into pallets:master May 23, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

davidism added a commit that referenced this pull request May 23, 2018

@davidism

This comment has been minimized.

Copy link
Member

commented May 24, 2018

Just curious, since someone reminded me I forgot to ask this during review: is there a reason not to just eval a formatted string to produce the functions? Are we gaining something by taking on the complexity of bytecode generation?

@edk0

This comment has been minimized.

Copy link
Member Author

commented May 24, 2018

I guess the reasons boil down to two general ones: generating source code that does what we want is deceptively hard, and generated bytecode is more optimized than what Python makes. It's difficult to get Python to not be dynamic :)

Specifically:

  • There is no literal syntax for self._converters['x'].to_url (or ''.join, or url_encode). You can get close with an argument with a default, but its name will be unavailable as a real argument name.
  • The bytecode generator can (ab)use the BUILD_STRING opcode to avoid a function call. I'm not sure if it's possible to get this any other way (you could generate an f-string, but then you'd get all the extra stuff f-strings have).
  • We know things Python doesn't: for example, when Python compiles my example code above it loads kwargs three times and tests it twice, though once each is sufficient.
@davidism

This comment has been minimized.

Copy link
Member

commented Mar 19, 2019

Just released Werkzeug 0.15 with this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.