-
-
Notifications
You must be signed in to change notification settings - Fork 29.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reversible dict #77643
Comments
Now that dicts are tracking insertion order, they can be made reversible via the built-in reversed, just like OrderedDict. |
Hi, I'm taking a look this issue, it look like a new type |
Right, a blend of the code from dictiterobject (https://github.com/python/cpython/blob/master/Objects/dictobject.c#L3309) and listreviterobject (https://github.com/python/cpython/blob/master/Objects/listobject.c#L3128). |
If implement __reduce__ in dict, it should be implemented in dict views too. |
Hi Serhiy Storchaka, I will update the PR to implement this functionality in the views too |
I updated the pull request, now reversed work on the dict and dict views: ➜ cpython git:(master) ./python.exe
Python 3.8.0a0 (heads/master-dirty:128576b88c, May 23 2018, 16:33:46)
[Clang 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> d = dict(a=1, b=2)
>>> list(reversed(d))
['b', 'a']
>>> list(reversed(d.keys()))
['b', 'a']
>>> list(reversed(d.values()))
[2, 1]
>>> list(reversed(d.items()))
[('b', 2), ('a', 1)] reversed on dict and dict views is not symmetric and applying twice will raise TypeError which is also the behavior on list. I'm not sure why this behavior has been chosen but it's coherent. |
I'm not sure it's worth enough for adding more builtin classes. Adding builtin class means Python interpreter core makes more fat, slow to start, and hard to maintain. |
This change does add built-in types but I think it's a reasonable expectation as a python user to be able to do reversed(dict(a=1, b=20) since the order is know defined in the specifications. It seems inconsistent to have an order on dict, views and not have reversed work on them. It does increase fat in the interpreter, but the change is really simple and does not really increase its complexity or make it really harder to maintain. |
I agree about "reasonable expectation". But I'm interested in is it really useful in real world?
"Have an order" doesn't mean "reversible". For example, single linked list is ordered, but not reversible. While CPython implementation can provide efficient __reverse__, adding __reverse__ means **all** Python implementation is expected to provide it. "Preserve insertion order" is very useful for many people. So it's guaranteed. |
I do agree it's certainly used than the conservation of order but it's not useless either. For example, it could help to get the latest section defined in a YAML or INI file once parsed.
Indeed they would have to use a double-linked-list here.
While this is true, the same argument could be said about the dict views. Many many people don't know about them but they are still an interesting feature that has its place in the standard library. It definitely won't be the most used feature in Python nor a killer feature but it seemed important enough to be included in OrderedDict (https://github.com/python/cpython/blob/master/Lib/collections/__init__.py#L63) since 3.5 and a C implementation of OrderedDict has been added in the same release so it seems to have mattered at the time. Having this feature in the built-in dicts could actually help to simplify the implementation of the collections module in this case. |
For rare cases, OrderedDict can be used.
Doubly linked list is memory inefficient than singly linked list.
It's different problem and out of scope of this discussion.
OrderedDict is more about "preserving insertion order". It provides O(1)
Would you elaborate more? |
I concur with Inada. It is a "nice to have" feature, but it has too high cost. PR 6827 adds around 300 lines of new code in dictobject.c. Too large for rarely used feature. And dictobject.c is critically important, adding new code here can make the compiler generating less optimal code for other parts. That is besides increasing the maintenance cost. I may have a case for iterating dict in the reversed order (see bpo-33331), but adding three new types requires adding too much boilerplate code. In the current form PR 6827 can't be merged, and I don't see a way how to make the code much shorter without impact on the current code. :-( |
Since there seems to be a consensus about this change being too much, should we go back to the first proposal to implement dict.__reversed__ only and not reversed for the views, this would greatly reduce the bload or dump the PR as a whole ? |
This would make the feature incomplete and will add a pressure of implementing support for dict views. And even one new type adds much bloat. |
Since API of builtin type is part of language (not only CPython), I want agreement on python-dev before adding it. |
It looks like there's general agreement on python-dev that this is appropriate for v3.8 (not v3.7). Guido van Rossum and Ramsey D'silva gave a +1. Raymond Hettinger noted some use cases. INADA Naoki raised a point about waiting for other implementations, which Guido agreed with, but after some research came around to OK'ing this for v3.8. Responding to Serhiy Storchaka's worry about adding new code possibly upsetting the compiler optimizations: The implementation seems straightforward and unlikely to confuse the compiler. Is there evidence that similar code has previously made CPython slower? If the compiler's black magic is too difficult to understand, it may be equally plausible that this code causes an improvement to the optimization of unrelated code. |
Would you try python_startup and python_startup_nosite benchmark with:
|
That seems entirely unnecessary. If adding __reversed__ has any effect on the rest of the build, that is pure random noise that can be ignored. It is normal to get small improvements or detriments that vary from compiler to compiler, os to os, etc. Our normal practice is to ignore those and not optimize for random or hard-to-control compiler idiosyncrasies which appear and disappear as we add methods or as compilers get updated. |
Although initialization cost of each one type is small, time for _Py_Ready() is not negligible. And ABC.register() too. Import time for _collections_abc is not negligible too. I agree that cost of adding three builtin types and register them to ABC will be negligible or small enough. |
I confirmed the cost is negligible. python_startup_no_site Mean +- std dev: [master] 7.31 ms +- 0.39 ms -> [reverse] 7.41 ms +- 0.44 ms: 1.01x slower (+1%) "register" is "reverse" + following patch: diff --git a/Lib/_collections_abc.py b/Lib/_collections_abc.py
index dbe30dff1f..28a7e2586c 100644
--- a/Lib/_collections_abc.py
+++ b/Lib/_collections_abc.py
@@ -280,6 +280,9 @@ Iterator.register(bytearray_iterator)
Iterator.register(dict_keyiterator)
Iterator.register(dict_valueiterator)
Iterator.register(dict_itemiterator)
+Iterator.register(type(iter(reversed({}.keys()))))
+Iterator.register(type(iter(reversed({}.values()))))
+Iterator.register(type(iter(reversed({}.items()))))
Iterator.register(list_iterator)
Iterator.register(list_reverseiterator)
Iterator.register(range_iterator)
@@ -306,6 +309,12 @@ class Reversible(Iterable):
return NotImplemented +Reversible.register(dict) __slots__ = () |
Hi INADA thanks for the benchmark, I did both of them too and got the same results (though I had to apply python/pyperformance#41 to get the performance module working). Should I apply your patch in PR 6827? |
My patch was quick and dirty.
|
Thanks INADA, I made the necessary changes to _collections_abc. Is there anything that I should change? |
Would it be possible to re-use the __reverse__() support that exists for OrderedDict? Doing so could help avoid adding a bunch of duplicated code. |
Hi, I took a look at the code of OrderedDict it's using the double linked-list to iterate through the items using _odictnode_PREV and _odictnode_NEXT. Since ordereddict needs to support move_to_end that will change the iterating order while dict does not, is it possible to share the code for __reversed__? As I understand it, the current code for OrderedDict and dict are not sharing code for the implementation of __iter__ and I'm not sure how it would be possible to do so. |
Hi, is everything good with attached PR or should I refactor it further? |
Hi Serhiy and Inada, is there a reason not to move forward with this patch? |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: