Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve type on List.copy() and slicing #149

Open
Tracked by #156
MestreLion opened this issue Oct 27, 2021 · 5 comments
Open
Tracked by #156

Preserve type on List.copy() and slicing #149

MestreLion opened this issue Oct 27, 2021 · 5 comments

Comments

@MestreLion
Copy link
Contributor

I just noticed List.copy() returns a plain list, as it doesn't override the copy method, and was about to create a PR implementing it. Then I realized this is a deeper issue, so I need your opinion before moving on:

  • list.copy() == list[:], so copy if was changed to preserve type, should slicing be changed too? I.e., should List.__getitem__ use the a similar logic as __setitem__, and do something like this:if isinstance(index, slice): return self.__class__(super().__getitem__(index)).

  • It would be surprising if copy() preserve type but slicing didn't. In one hand, some might expect (and rely on) copy and slicing to return a plain list. OTOH, it's awkward to write b = List[xxx](a.copy()) to preserve type and easy to do b = list(a.copy()) to guarantee a plain list.

  • In a general way, should the mutables Compound and List preserve type in slicing and subsets?

I could do some research to see what numpy.ndarray and other similar list subclasses do, and also if there's any principle suggested in collections.abc.MutableSequence.

Meanwhile, what's your opinion on this? Interested in a PR to implement copy() and related methods in Compound and List?

@MestreLion
Copy link
Contributor Author

Some research indicates slicing (and by consequence .copy()) should preserve type:

  • Data Model, 3.2 - The standard type hierarchy, Sequences: "Sequences also support slicing: a[i:j] selects all items with index k such that i <= k < j. When used as an expression, a slice is a sequence of the same type. This implies that the index set is renumbered so that it starts at 0." (my emphasis)
  • Tuple slicing does preserve type: (1, 2, 3, 4)[1:3] == (2, 3), a strong argument for not returning plain list
  • Standard types - Mutable Sequence types: "s.copy() | creates a shallow copy of s (same as s[:])" - not only it officially ties slicing and copy, it also strongly suggests (both) should preserve type
  • collections.UserList.copy() does preserve type, but slicing surprisingly doesn't ! Maybe a bug?
  • numpy.ndarray, even if supports slicing, does not support append() or extend() or any of the regular methods of the Mutable Sequence protocol, so it should not be considered.

Given that, now I have a strong opinion that at least List|Compound.copy() should preserve type, and so should slicing. You OK for a PR implementing this?

@vberlier vberlier mentioned this issue Nov 2, 2021
9 tasks
@vberlier
Copy link
Owner

vberlier commented Nov 3, 2021

Yeah this can't be easily handled right now because lists and compounds inherit directly from the builtins instead of MutableSequence and MutableMapping, so you would need to manually override each method. Inheriting from the builtins has advantages but it seems that the pythonic way to do it is to go through the ABCs and wrap an underlying list or dict. However it could also be less efficient, and I'm not sure if ABCs play well with mypyc if we go down that route, so this needs further investigation.

@MestreLion
Copy link
Contributor Author

I was planning on overriding each method anyway, no problem with that. You're already overriding __getitem__, I would just adjust it and add copy and whatever additional methods are needed.

Yeah, ABCs used to be the norm, but there are legitimate cases for subclassing the builtins direcly. It used to be a chore and have a million caveats, but not anymore, specially in Python 3 (and even more with Protocols, etc). The builtins do inherit from the ABCs, so you can already use isinstance(Compound, MutableMapping). So I don't think it's worth converting tags to plain ABCs. You already had the trouble to deal with the builtins caveats (and this issue would improve that)

@vberlier
Copy link
Owner

vberlier commented Nov 4, 2021

Hmmm, now I'm also wondering about other tags like numeric and strings. Should string slicing or methods like .lower() return String instances? Should mathematical operations also return wrappers? Currently things like Int(123) + Int(1) return a builtin int. I think it's worth considering if we end up implementing that for lists and compounds.

@MestreLion
Copy link
Contributor Author

Hum, very interesting point!

Yeah, I believe they should preserve type too. Both str and bytes preserve type on .lower|upper() (and I assume related methods too), and it would make API much smoother to use on numericals

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants