Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize the 'open' plugin for 'pathlib.Path.open' #7643

Merged

Conversation

@Michael0x2a
Copy link
Collaborator

Michael0x2a commented Oct 6, 2019

This pull request adds a plugin to infer a more precise return type for the pathlib.Path.open(...) method.

This method is actually nearly identical to the builtin open(...) method, with the only difference being that pathlib.Path.open(...) method doesn't have the file parameter. So, I refactored the logic in both plugins into a shared helper method.

This pull request adds a plugin to infer a more precise
return type for the `pathlib.Path.open(...)` method.

This method is actually nearly identical to the builtin
`open(...)` method, with the only difference being that
`pathlib.Path.open(...)` method doesn't have the `file`
parameter. So, I refactored the logic in both plugins
into a shared helper method.
@Michael0x2a

This comment has been minimized.

Copy link
Collaborator Author

Michael0x2a commented Oct 6, 2019

Probably the better long-term fix here is to just modify the signatures of both functions in typeshed to use overloads + Literal types and delete these plugins entirely.

But that sounds like a lot of work tbh.

@gvanrossum

This comment has been minimized.

Copy link
Member

gvanrossum commented Oct 6, 2019

But that sounds like a lot of work tbh.

Why?

@Michael0x2a

This comment has been minimized.

Copy link
Collaborator Author

Michael0x2a commented Oct 6, 2019

But that sounds like a lot of work tbh.

Why?

I guess it's just a somewhat intrusive change, which would require more due diligence. Basically, we'd need to create unions of every possible combination of modes, make sure we properly handle differences between Python 2 and 3 (e.g. the U and x modes), and run this all against our internal codebases to make sure we didn't miss some edge case/didn't regress somewhere.

The number of combinations is also somewhat large (I think ~76 of them in total?), so we'd also maybe want to make sure this doesn't cause a performance regression when handling union subtype checks and such. I think the impact would be negligible since open(...) is probably called relatively infrequently in most code? But not 100% sure.

None of this is really difficult, just mildly time-consuming and tedious, so I went with this quick and simple 5 minute PR.

But if you prefer we avoid investing any more time into the open(...) plugin, I can close this PR and try working on the other approach sometime later.

@gvanrossum

This comment has been minimized.

Copy link
Member

gvanrossum commented Oct 6, 2019

Yeah, the sheer number of combinations is unfortunate.

I just feel that if we can solve something with a feature of the type system that is better than solving it with a plugin -- a plugin only solves it for mypy, and the plugin API is still poorly documented and evolves more quickly than the type system (and it has no backwards compatibility guarantees).

Another thing is that there are likely quite a few other functions that wrap open() that should technically also be covered by such a plugin (e.g. lzma.open(), _pyio.open()). But they don't necessarily have the exact same API, and refactoring the plugin each time we discover a new pattern also doesn't sound very attractive. I recall this pattern (wrapping open()) appearing in 3rd party code too.

If it's too tedious to write the stubs due to some limitation of Literal[], maybe we could fix that in a follow-up of PEP 586? (It's particularly grating since open() is literally the first example in PEP 586. :-) But I don't really see what's so terrible about it. We could define two types _TEXT_MODE and _BINARY_MODE, e.g.

_TEXT_MODE =  Literal["r", "w", "a", "x", "r+", "w+", "a+", "x+"]
_BINARY_MODE =  Literal["rb", "wb", "ab", "xb", "r+b", "w+b", "a+b", "x+b"]

and use those in the stubs defining wrappers? (Add whatever other strings are actually supported.)

@JelleZijlstra

This comment has been minimized.

Copy link
Collaborator

JelleZijlstra commented Oct 7, 2019

For what it's worth, so far we use Literal overloads for one file in typeshed: https://github.com/python/typeshed/blob/master/stdlib/3/tempfile.pyi.

@msullivan

This comment has been minimized.

Copy link
Collaborator

msullivan commented Oct 8, 2019

Copy link
Collaborator

msullivan left a comment

I don't have any problem with merging this now, though I agree that it would be better to do it in typeshed.

(Though not if it requires 76 overloads?? I hope that is just the number of elements in the Literals. I needed like 8 overloads for subprocess and found it extremely frustrating.)

@gvanrossum

This comment has been minimized.

Copy link
Member

gvanrossum commented Oct 9, 2019

Let's merge this now and do the typeshed update later.

@gvanrossum gvanrossum merged commit 83748a4 into python:master Oct 9, 2019
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.