Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH PyProxy getattr/etc traps & added getitem/etc for PyMappings #1175

Merged
merged 41 commits into from Feb 23, 2021

Conversation

hoodmane
Copy link
Member

@hoodmane hoodmane commented Jan 28, 2021

In javascript x.a and x["a"] do the same thing. For the equivalent of Python getitem, javascript typically uses x.get("a"). This PR fixes the error handling for the traps for x.a (which uses getattr) and adds new functions like x.get("a") (which uses getitem).

This has the effect of changing the way we need to look stuff up on pyodide.globals: instead of pyodide.globals.a, we now need to use pyodide.globals.get("a").

Edit: I added an extra Proxy around globals so that you can still look up methods using ordinary indexing, to preserve backwards compatibility.

@hoodmane hoodmane force-pushed the pyproxy-mapping-methods branch 3 times, most recently from ced03e7 to 59b2889 Compare January 28, 2021 06:27
@phorward
Copy link
Contributor

phorward commented Feb 2, 2021

Hello @hoodmane,

This has the effect of changing the way we need to look stuff up on pyodide.globals: instead of pyodide.globals.a, we now need to use pyodide.globals.get("a").

this change is not much, but what if there's already existing JavaScript code using this notation in other projects using Pyodide?

First, I think an entry in the CHANGELOG should not be missing here.

Next, maybe, it may help to first declare the old notation as deprecated but still preserve the old behavior, and print a warning to the JS-console. Later on, it can be removed entirely, to make Pyodide-Objects finally more Pythonic in the JavaScript world, instead of this hard cut.

What is your (and the others) opinion on this?

@hoodmane
Copy link
Member Author

hoodmane commented Feb 2, 2021

First, I think an entry in the CHANGELOG should not be missing here.

Absolutely. (My plan on documentation is to make a separate PR that just updates the type conversions documentation to reflect many of the #900 changes.)

Next, maybe, it may help to first declare the old notation as deprecated but still preserve the old behavior, and print a warning to the JS-console. Later on, it can be removed entirely, to make Pyodide-Objects finally more Pythonic in the JavaScript world, instead of this hard cut.

So far we haven't been bothering with deprecation warnings at all. I am open to the idea of doing deprecation warnings if we agree that it's a good idea, though I think we'd need to discuss how to design the deprecation system. We'll presumably want eventually as pyodide gets more stable so it's probably not wasted effort to set up a system. Perhaps I should open another "Deprecation?? Should we be doing it??" issue. @rth @dalcde Thoughts?

For the globals object, if people have been accessing it through pyimport then nothing needs to change. I guess what I can do in this case is wrap the globals object in an extra Javascript Proxy and implement get/set/delete/has traps that use the .get, .set, .delete, and .has methods (and perhaps print a deprecation warning?).

@rth
Copy link
Member

rth commented Feb 4, 2021

Perhaps I should open another "Deprecation?? Should we be doing it??"

Yeah, let's discuss it in a separate issue.

@hoodmane
Copy link
Member Author

@phorward @rth Would appreciate review on this whenever anyone gets to it.

@hoodmane
Copy link
Member Author

@phorward @rth @dalcde I would appreciate review on this one if any of you get a chance.

Copy link
Member

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hoodmane ! It's indeed important to be able to differentiate getattr from getitem in PyProxy.

If I have a Python class that does define the get method (fairly common), how would I call it via a PyProxy with this PR?


// Wrap "globals" in a special Proxy that allows `pyodide.globals.x` access.
// TODO: Should we have this?
Module.globals = Module.wrapNamespace(Module.globals);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally with a warning (maybe with console.warn), saying we will remove it in 0.18.0.

@hoodmane
Copy link
Member Author

If I have a Python class that does define the get method (fairly common), how would I call it via a PyProxy with this PR?

An excellent question which I don't have a good answer for. Note that at least there is only a name collision if the class implements both __getitem__ and get, but already this is a problem for dictionaries. We're going to have other similar issues like this like with length for __len__, has for __contains___ and a few others, but this might be the only really bad collision. Maybe we should open an issue for this?

@hoodmane
Copy link
Member Author

Oh well at least in the dictionary case, it's not a big deal at all: the only difference between dict.__getitem__ and dict.get() is that dict.get can provide a default. But Javascript standard is to return undefined on lookup of a missing item. So if we make dictproxy.get go to the actual get method (not dict.__getitem__) then there's no loss of functionality.

@hoodmane
Copy link
Member Author

So how about this: suppose proxy proxies obj. proxy.get does:

  1. If hasattr(obj, "get"), return `getattr(obj, "get").
  2. If hasattr(obj, "__getitem__") return obj.__getitem__
  3. Return undefined.

If the object has both get and __getitem__ then to access __getitem__ explicitly, use obj.__getitem__(x)

@hoodmane
Copy link
Member Author

@rth Do you know of other standard objects that define both get and __getitem__?

@rth
Copy link
Member

rth commented Feb 15, 2021

If the object has both get and __getitem__ then to access __getitem__ explicitly, use obj.__getitem__(x)

It makes sense, though the issue is when reading code that does obj.get('x') it won't be easy to know if it does indexing or calling the method get, which is not ideal. Let me think about it.

Do you know of other standard objects that define both get and __getitem__?

For instance, collections.Counter, queue.Queue. Otherwise pandas.DataFrame.get

@hoodmane
Copy link
Member Author

hoodmane commented Feb 15, 2021

Well collections.Counter and pandas.DataFrame are in the same boat as dict: get and __getitem__ do basically the same thing, especially since Javascript returns undefined instead of throwing an IndexError, which is ordinarily the difference between get and __getitem__. With get you can also specify a default, but if we preferentially use get then this option is preserved. Queue doesn't implement __getitem__ so queue.get is unambiguous.

@rth
Copy link
Member

rth commented Feb 16, 2021

OK, sounds good. +1 for you proposal in #1175 (comment)

@hoodmane
Copy link
Member Author

@rth I think the changes required to implement the "python methods override javascript methods" suggestion are too complicated for this PR, I'm going to open it as a separate PR on top of this one.

@hoodmane hoodmane mentioned this pull request Feb 18, 2021
Copy link
Member

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM. Could we just add a test on a class with that subclasses dict, and with a custom get method, and make sure that it doesn't call the __getitem__ method?

class CustomDict(dict):
    def get(self, key):
        return 1

x = CustomDict(a = 2)
assert x.get(x) == 1  # from JS

@hoodmane
Copy link
Member Author

hoodmane commented Feb 23, 2021

Sounds good. By the way, note that Javascript Symbols are often used to add names to a proxy that are guaranteed not to clash with the proxied names. If you define a symbol pyodide.Get then we could unambiguously do obj[pyodide.Get]("name"). For instance, this is the approach used here. Unfortunately it's pretty verbose.

One sad thing about the current behavior of these proxies is that they leak Python objects like crazy. Like if you say:

let namespace = pyodide.globals.dict();
namespace.get("a");
namespace.destroy();

then you've actually leaked namespace. I suppose I should open more issues about the memory management horrors. Hopefully FinalizationRegistry will save us.

@hoodmane
Copy link
Member Author

Oh @rth the test you suggested isn't going to pass on this branch, but it will pass on #1264.

@rth
Copy link
Member

rth commented Feb 23, 2021

OK, sounds good. Thanks!

@rth rth changed the title Improve PyProxy getattr/etc traps & added methods for getitem etc for PyMappings ENH PyProxy getattr/etc traps & added getitem/etc for PyMappings Feb 23, 2021
@rth rth merged commit 4d213a6 into pyodide:master Feb 23, 2021
@hoodmane hoodmane deleted the pyproxy-mapping-methods branch February 23, 2021 19:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants