Use FASTCALL in dict.update() #73498

vstinner · 2017-01-18T17:07:43Z

BPO	29312
Nosy	@rhettinger, @vstinner, @larryhastings, @methane, @jdemeyer
PRs	bpo-29312: use METH_FASTCALL for dict.update #14589
Files	dict_update_fastcall.patch dict_update_fastcall-2.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2021-09-21.22:17:04.428>
created_at = <Date 2017-01-18.17:07:42.501>
labels = ['interpreter-core', 'type-feature', '3.9']
title = 'Use FASTCALL in dict.update()'
updated_at = <Date 2021-09-21.22:17:04.427>
user = 'https://github.com/vstinner'

bugs.python.org fields:

activity = <Date 2021-09-21.22:17:04.427>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2021-09-21.22:17:04.428>
closer = 'vstinner'
components = ['Interpreter Core']
creation = <Date 2017-01-18.17:07:42.501>
creator = 'vstinner'
dependencies = []
files = ['46330', '46334']
hgrepos = []
issue_num = 29312
keywords = ['patch']
message_count = 22.0
messages = ['285744', '285745', '285747', '285766', '285768', '285770', '285774', '285778', '347262', '347271', '347274', '347275', '347276', '347277', '347278', '347279', '347280', '347286', '347506', '347515', '347539', '402391']
nosy_count = 6.0
nosy_names = ['rhettinger', 'vstinner', 'larry', 'methane', 'python-dev', 'jdemeyer']
pr_nums = ['14589']
priority = 'normal'
resolution = 'rejected'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue29312'
versions = ['Python 3.9']

vstinner · 2017-01-18T17:07:42Z

Follow-up of the issue bpo-29311 "Argument Clinic: convert dict methods".

The dict.update() method hs a special prototype:

   def update(arg=None **kw): ...

I don't think that Argument Clinic supports it right now, so I propose to first optimize dict_update() to use METH_FASTCALL.

Attached patch is a first step: convert kwnames tuple to a dict.

A second step would be to avoid the temporary dict and update the dict using args + kwnames directly.

vstinner · 2017-01-18T17:10:15Z

My patch doesn't use _PyStack_AsDict() since this function currently fails with an assertion error if a key is not exactly a string (PyUnicode_CheckExact). dict_update_common() already checks if all keys are string.

vstinner · 2017-01-18T17:14:14Z

See also issue bpo-20291: "Argument Clinic should understand *args and **kwargs parameters".

methane · 2017-01-19T01:38:56Z

Using FASTCALL for methods accepting **kwargs can't skip creating dict in most cases. Because they accepts dict.

Even worth, when calling it like d.update(**config) (yes, it's abuse of
**, but it can be happen in some C methods), FASTCALL may unpack the passed
dict, and pack it again.

So, when considering METH_FASTCALL, supporting **kwargs is lowest priority.
(Off course, supporting it by AC with METH_KEYWORDS is nice to have)

rhettinger · 2017-01-19T06:48:40Z

I like the other AC changes to dict in 29311, but this one seems like it shouldn't be done. There is too much twisting around existing code to force it to use AC and the benefit will be almost nothing. dict.update() is mainly used with a list of tuples argument or with another mapping. The O(1) time spent on the method call is inconsequential compared to the O(n) step of looping over all the inputs and putting them in the dict. Accordingly, I think this method should be skipped, leaving the current clear, stable, fast-enough code in-place.

vstinner · 2017-01-19T10:03:48Z

Oops, I forgot a DECREF: fixed in the patch version 2.

--

Oh wait, I misunderstood how dict.update() is called. In fact, they are two bytecodes to call a function with keyword arguments.

(1) Using **kwargs:

>>> def f():
...  d.update(**d2)
... 
>>> dis.dis(f)
  2           0 LOAD_GLOBAL              0 (d)
              2 LOAD_ATTR                1 (update)
              4 BUILD_TUPLE              0
              6 LOAD_GLOBAL              2 (d2)
              8 CALL_FUNCTION_EX         1
             10 POP_TOP
             12 LOAD_CONST               0 (None)
             14 RETURN_VALUE

(2) Using a list of key=value:

>>> def g():
...  d.update(x=1, y=2)
... 
>>> dis.dis(g)
  2           0 LOAD_GLOBAL              0 (d)
              2 LOAD_ATTR                1 (update)
              4 LOAD_CONST               1 (1)
              6 LOAD_CONST               2 (2)
              8 LOAD_CONST               3 (('x', 'y'))
             10 CALL_FUNCTION_KW         2
             12 POP_TOP
             14 LOAD_CONST               0 (None)
             16 RETURN_VALUE

The problem is that the dict.update() method has a single implementation, the C dict_update() function.

For (2), there is a speedup, but it's minor:

$ ./python -m perf timeit -s 'd={"x": 1, "y": 2}' 'd.update(x=1, y=2)' -p10 --compare-to=../default-ref/python 
Median +- std dev: [ref] 185 ns +- 62 ns -> [patched] 177 ns +- 2 ns: 1.05x faster (-5%)

For (1), I expected that **kwargs would be unpacked *before* calling dict.update(), but kwargs is passed unchanged to dict.update() directly! With my patch, CALL_FUNCTION_EX calls PyCFunction_Call() which uses _PyStack_UnpackDict() to create kwnames and then dict_update() rebuilds a new temporary dictionary. It's completely inefficient! As Raymond expected, it's much slower:

haypo@smithers$ ./python -m perf timeit -s 'd={"x": 1, "y": 2}; d2=dict(d)' 'd.update(**d2)' -p10 --compare-to=../default-ref/python
Median +- std dev: [ref] 114 ns +- 1 ns -> [patched] 232 ns +- 21 ns: 2.04x slower (+104%)

I expect that (1) dict.update(**kwargs) is more common than (2) dict.update(x=1, y=2). Moreover, the speedup for (2) is low (5%), so I prefer to reject this issue.

--

Naoki: "So, when considering METH_FASTCALL, supporting **kwargs is lowest priority. (Off course, supporting it by AC with METH_KEYWORDS is nice to have)"

Hum, dict.update() is the first function that I found that really wants a Python dict at the end.

For dict1.update(**dict2), METH_VARARGS|METH_KEYWORDS is already optimal.

So I don't think that it's worth it to micro-optimize the way to pass positional arguments. The common case is to call dict1.update(dict2) which requires to build a temporary tuple of 1 item. PyTuple_New() uses a free list for such small tuple, so it should be fast enough.

I found a few functions which pass through keyword arguments, but they are "proxy". I'm converting all METH_VARARGS|METH_KEYWORDS to METH_FASTCALL, so most functions will expects a kwnames tuple at the end of the call for keyword arguments. In this case, using METH_FASTCALL for the proxy is optimum for func(x=1, y=2) (CALL_FUNCTION_KW), but slower for func(**kwargs) (CALL_FUNCTION_EX).

With METH_FASTCALL, CALL_FUNCTION_EX requires to unpack the dictionary if I understood correctly.

vstinner · 2017-01-19T11:34:30Z

When analyzing how FASTCALL handles "func(**kwargs)" calls for Python functions, I identified a missed optimization. I created the issue bpo-29318: "Optimize _PyFunction_FastCallDict() for **kwargs".

python-dev · 2017-01-19T11:45:26Z

New changeset e371686229e7 by Victor Stinner in branch 'default':
Add a note explaining why dict_update() doesn't use METH_FASTCALL
https://hg.python.org/cpython/rev/e371686229e7

jdemeyer · 2019-07-04T09:55:51Z

For the benefit of PR 37207, I would like to re-open this discussion. It may have been rejected for the wrong reasons. Victor's patch was quite inefficient, but that's to be expected: msg285744 mentions a two-step process, but during the discussion the second steps seems to have been forgotten.

methane · 2019-07-04T11:07:56Z

How can we avoid unpacking dict in case of d1.update(**d2)?

jdemeyer · 2019-07-04T11:35:53Z

How can we avoid unpacking dict in case of d1.update(**d2)?

We cannot. However, how common is that call? One could argue that we should optimize for the more common case of d1.update(d2).

jdemeyer · 2019-07-04T11:37:51Z

Above, I meant bpo-37207 or PR 13930.

jdemeyer · 2019-07-04T11:48:03Z

How can we avoid unpacking dict in case of d1.update(**d2)?

The unpacking is only a problem if you insist on using PyDict_Merge(). It would be perfectly possible to implement dict merging from a tuple+vector instead of from a dict. In that case, there shouldn't be a performance penalty.

methane · 2019-07-04T11:55:14Z

The unpacking is only a problem if you insist on using PyDict_Merge(). It would be perfectly possible to implement dict merging from a tuple+vector instead of from a dict. In that case, there shouldn't be a performance penalty.

Really?

class K:
    def __eq__(self, other):
        return True
    def __hash__(self):
        time.sleep(10)
        return 42

d1 = {"foo": 1, "bar": 2, "baz": 3, K(): 4}
d2 = dict(**d1)

I think dict(**d1) doesn't call K.hash() in this example, because hash value is cached in d1.

methane · 2019-07-04T11:56:28Z

d2 = dict(**d1)
+ d2 = {"fizz": "buzz"}
+ d2.update(**d1)

jdemeyer · 2019-07-04T12:22:14Z

You are correct that PyDict_Merge() does not need to recompute the hashes of the keys. However, your example doesn't work because you need string keys for **kwargs. The "str" class caches its hash, so you would need a dict with a "str" subclass as keys to hit that problem.

I think that calling d.update(**kw) with kw having str-subclass keys should be very rare. I'm not sure that we should care about that.

methane · 2019-07-04T12:34:42Z

OK, d1.update(**d2) is not useful in practice. Practical usages of dict.update() are:

d.update(d2)
d.update([(k1,k2),...])
d.update(k1=v1, k2=v2, ...)
d.update(**d2, **d3, **d4) # little abuse, but possible.

In all of them, kwdict is not used at all or can't avoid unpacking the kwdict.

vstinner · 2019-07-04T16:13:14Z

Changing dict.update() calling convention may save a few nanoseconds on d1.update(d2) call, but it will make d1.update(**d2) way slower with a complexity of O(n): d2 must be converted to 2 lists (kwnames and args) and then a new dict should be created.

I don't see the point of micro-optimizing d1.update(d2), if d1.update(**d2) would become way slower.

jdemeyer · 2019-07-08T19:08:05Z

d2 must be converted to 2 lists (kwnames and args) and then a new dict should be created.

The last part is not necessarily true. You could do the update directly, without having that intermediate dict.

methane · 2019-07-09T02:46:32Z

Changing dict.update() calling convention may save a few nanoseconds on d1.update(d2) call, but it will make d1.update(**d2) way slower with a complexity of O(n): d2 must be converted to 2 lists (kwnames and args) and then a new dict should be created.

But who/why use d1.update(**d2)?
In case of dict(), dict(d1, **d2) was idiom to merge two dicts.
But I don't know any practical usage of d1.update(**d2). d1.update(d2) should be preferred.

jdemeyer · 2019-07-09T10:19:34Z

but it will make d1.update(**d2) slower with a complexity of O(n): d2 must be converted to 2 lists

This part is still true and it causes a slow-down of about 23% for dict.update(**d), see benchmarks at #14589 (comment)

vstinner · 2021-09-21T22:17:04Z

It seems like using FASTCALL would make the code slower, not faster. I close this old issue.

vstinner added 3.7 (EOL) end of life type-feature A feature request or enhancement labels Jan 18, 2017

vstinner added the topic-argument-clinic label Jan 18, 2017

vstinner closed this as completed Jan 19, 2017

methane added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.9 only security fixes and removed topic-argument-clinic 3.7 (EOL) end of life labels Jul 4, 2019

methane reopened this Jul 4, 2019

vstinner closed this as completed Sep 21, 2021

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use FASTCALL in dict.update() #73498

Use FASTCALL in dict.update() #73498

vstinner commented Jan 18, 2017

vstinner commented Jan 18, 2017

vstinner commented Jan 18, 2017

vstinner commented Jan 18, 2017

methane commented Jan 19, 2017

rhettinger commented Jan 19, 2017

vstinner commented Jan 19, 2017

vstinner commented Jan 19, 2017

python-dev mannequin commented Jan 19, 2017

jdemeyer commented Jul 4, 2019

methane commented Jul 4, 2019

jdemeyer commented Jul 4, 2019

jdemeyer commented Jul 4, 2019

jdemeyer commented Jul 4, 2019

methane commented Jul 4, 2019

methane commented Jul 4, 2019

jdemeyer commented Jul 4, 2019

methane commented Jul 4, 2019

vstinner commented Jul 4, 2019

jdemeyer commented Jul 8, 2019

methane commented Jul 9, 2019

jdemeyer commented Jul 9, 2019

vstinner commented Sep 21, 2021

Use FASTCALL in dict.update() #73498

Use FASTCALL in dict.update() #73498

Comments

vstinner commented Jan 18, 2017

vstinner commented Jan 18, 2017

vstinner commented Jan 18, 2017

vstinner commented Jan 18, 2017

methane commented Jan 19, 2017

rhettinger commented Jan 19, 2017

vstinner commented Jan 19, 2017

vstinner commented Jan 19, 2017

python-dev mannequin commented Jan 19, 2017

jdemeyer commented Jul 4, 2019

methane commented Jul 4, 2019

jdemeyer commented Jul 4, 2019

jdemeyer commented Jul 4, 2019

jdemeyer commented Jul 4, 2019

methane commented Jul 4, 2019

methane commented Jul 4, 2019

jdemeyer commented Jul 4, 2019

methane commented Jul 4, 2019

vstinner commented Jul 4, 2019

jdemeyer commented Jul 8, 2019

methane commented Jul 9, 2019

jdemeyer commented Jul 9, 2019

vstinner commented Sep 21, 2021