Add RETURN_NONE bytecode instruction #72986

vstinner · 2016-11-25T10:55:16Z

BPO	28800
Nosy	@tim-one, @brettcannon, @rhettinger, @vstinner, @serhiy-storchaka, @1st1, @matrixise
Files	return_none.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2016-11-28.17:07:17.751>
created_at = <Date 2016-11-25.10:55:15.815>
labels = ['interpreter-core', 'type-feature', '3.7']
title = 'Add RETURN_NONE bytecode instruction'
updated_at = <Date 2016-11-28.17:07:17.750>
user = 'https://github.com/vstinner'

bugs.python.org fields:

activity = <Date 2016-11-28.17:07:17.750>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2016-11-28.17:07:17.751>
closer = 'vstinner'
components = ['Interpreter Core']
creation = <Date 2016-11-25.10:55:15.815>
creator = 'vstinner'
dependencies = []
files = ['45640']
hgrepos = []
issue_num = 28800
keywords = ['patch']
message_count = 11.0
messages = ['281696', '281699', '281702', '281719', '281720', '281821', '281831', '281835', '281837', '281891', '281893']
nosy_count = 7.0
nosy_names = ['tim.peters', 'brett.cannon', 'rhettinger', 'vstinner', 'serhiy.storchaka', 'yselivanov', 'matrixise']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = 'patch review'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue28800'
versions = ['Python 3.7']

vstinner · 2016-11-25T10:55:15Z

Attached patch adds a new bytecode instruction: RETURN_NONE. It is similar to "LOAD_CONST; RETURN_VALUE" but it avoids the need of adding None to constants of the code object. For function with an empty body, it reduces the stack size from 1 to 0.

Example on reference Python 3.7:

>>> def func(): return
... 
>>> import dis; dis.dis(func)
  1           0 LOAD_CONST               0 (None)
              2 RETURN_VALUE
>>> func.__code__.co_stacksize
1

Example on patched Python 3.7:

>>> def func(): return
... 
>>> import dis; dis.dis(func)
  1           0 RETURN_CONST
>>> func.__code__.co_stacksize
0

If the function has a docstring, RETURN_CONST avoids adding None to code constants:

>>> def func():
...  "docstring"
...  return
... 
>>> func.__code__.co_consts
('docstring',)

I will now run benchmarks on the patch.

vstinner · 2016-11-25T11:05:41Z

I'm proposing this patch because I noticed that reducing the number of opcodes makes Python faster in my old registervm project: a fork of CPython which uses a register-based bytecode:
http://faster-cpython.readthedocs.io/registervm.html

I also plan to propose a CALL_PROCEDURE method to replace CALL_FUNCTION+POP_TOP. Only for the the simple CALL_FUNCTION, not for complex CALL_FUNCTION_KW nor CALL_FUNCTION_EX.

serhiy-storchaka · 2016-11-25T11:27:24Z

Do you want to add RETURN_NONE or RETURN_CONST? Or both?

Adding new special opcodes can decrease the size of the code and increase performance of some cases. But it adds maintenance burden, increases the complexity of the compiler and peephole optimizer, and increases the size of ceval loop. The latter can have negative effect on the performance. I think we should add new specialized opcodes only if they adds measurable gain to global performance or large speed up of important particular cases.

It would help if you gather the statistics of RETURN_* opcodes. How many RETURN_VALUE, RETURN_CONST and RETURN_NONE instructions are compiled and executed during running Python tests? Compare it with total number of compiled and executed instructions.

vstinner · 2016-11-25T13:45:10Z

Results of performance 0.5.0 on speed-python:

haypo@speed-python$ python3 -m perf compare_to -G --min-speed=3 2016-11-23_19-34-default-3d660ed2a60e.json patch.json
Slower (3):

spectral_norm: 265 ms +- 2 ms -> 277 ms +- 7 ms: 1.04x slower
xml_etree_iterparse: 217 ms +- 2 ms -> 226 ms +- 4 ms: 1.04x slower
nqueens: 261 ms +- 2 ms -> 269 ms +- 3 ms: 1.03x slower

Faster (3):

scimark_sor: 519 ms +- 10 ms -> 496 ms +- 7 ms: 1.05x faster
mako: 43.0 ms +- 0.8 ms -> 41.5 ms +- 0.2 ms: 1.04x faster
call_method: 16.2 ms +- 0.2 ms -> 15.7 ms +- 0.3 ms: 1.03x faster

Benchmark hidden because not significant (58): 2to3, call_method_slots, call_method_unknown, (...)

Hum, boring result. This change alone doesn't change any significant speedup, even some slowndon. Maybe it's just a bad idea. Maybe it should be combined with other new bytecode instructions. Maybe only a full new instruction set using registers show significant speedup. I don't know :-(

vstinner · 2016-11-25T14:08:21Z

Do you want to add RETURN_NONE or RETURN_CONST? Or both?

Only RETURN_NONE because on some corner cases it allow to avoid completely the None constant from code.co_consts and reduce the stack size.

I'm not sure that I want to start taking the same road of WPython which added a *lot* of specialized instructions (combining two or more existing instructions). As you said, it has a cost on the maintenance, and might have a negative impact if _PyEval_EvalFrameDefault() becomes too big.

serhiy-storchaka · 2016-11-27T11:15:22Z

The pair LOAD_CONST/RETURN_VALUE is on 19th place of the top of opcode pairs (see msg269391 in bpo-27255). Not all of these constants are None. And since the time of LOAD_CONST is much smaller then the time of RETURN_VALUE (the latter includes destroying a frame and should be in a pair with CALL_FUNCTION), I think the performance effect of RETURN_NONE is much less than 1%.

rhettinger · 2016-11-27T18:17:20Z

When we added LIST_APPEND (the first of the custom opcodes intended for optimization), it was done only because it was frequently used in inner loops where it provided real benefits to users.

In contrast this opcode seems like a waste. Historically for opcodes, we've valued orthogonality and parsimony. The opcode set was intentionally kept simple and minimal. The recent opcode additions have disregarded these values.

brettcannon · 2016-11-27T21:30:45Z

My vote is to close this issue since the performance isn't panning out.

tim-one · 2016-11-27T21:43:07Z

I also don't see a good reason to keep this open now - adds complication for no quantifiable payoff.

1st1 · 2016-11-28T17:02:21Z

-1; we have too many opcodes already. Let's not complicate the code if there's no performance improvement.

vstinner · 2016-11-28T17:07:18Z

Sorry, I didn't want to bother you. I should run the benchmark *before* opening an issue next time :-) I agree that the speedup is negligible, so it's not worth it. The main purpose of the patch was an optimization. I close the issue. Thanks ;-)

vstinner added the 3.7 (EOL) end of life label Nov 25, 2016

vstinner changed the title ~~Add RETURN_NONE~~ Add RETURN_NONE bytecode instruction Nov 25, 2016

serhiy-storchaka added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Nov 25, 2016

vstinner closed this as completed Nov 28, 2016

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add RETURN_NONE bytecode instruction #72986

Add RETURN_NONE bytecode instruction #72986

vstinner commented Nov 25, 2016

vstinner commented Nov 25, 2016

vstinner commented Nov 25, 2016

serhiy-storchaka commented Nov 25, 2016

vstinner commented Nov 25, 2016

vstinner commented Nov 25, 2016

serhiy-storchaka commented Nov 27, 2016

rhettinger commented Nov 27, 2016

brettcannon commented Nov 27, 2016

tim-one commented Nov 27, 2016

1st1 commented Nov 28, 2016

vstinner commented Nov 28, 2016

Add RETURN_NONE bytecode instruction #72986

Add RETURN_NONE bytecode instruction #72986

Comments

vstinner commented Nov 25, 2016

vstinner commented Nov 25, 2016

vstinner commented Nov 25, 2016

serhiy-storchaka commented Nov 25, 2016

vstinner commented Nov 25, 2016

vstinner commented Nov 25, 2016

serhiy-storchaka commented Nov 27, 2016

rhettinger commented Nov 27, 2016

brettcannon commented Nov 27, 2016

tim-one commented Nov 27, 2016

1st1 commented Nov 28, 2016

vstinner commented Nov 28, 2016