Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Changes to type conversions #900

Closed
hoodmane opened this issue Dec 19, 2020 · 9 comments
Closed

Proposal: Changes to type conversions #900

hoodmane opened this issue Dec 19, 2020 · 9 comments

Comments

@hoodmane
Copy link
Member

hoodmane commented Dec 19, 2020

Currently there are a couple of issues (#780, #892) with type conversions. I have an idea for how to fix this but it would be a pretty significant change so we should discuss first.

The source of both #780 and #892 is that rather than proxying Python dicts and lists into javascript, python2js copies them into js objects and Arrays. On the other hand, js2python proxies most mutable types. This means that converting javascript ==> python ==> javascript gives back the original object more often but converting python ==> javascript ==> python doesn't give the original type for lists and dicts.

I think the current problematic copy conversions are:

  • dict ==> javascript
  • list ==> javascript
  • typedarray ==> python

Also, we convert python bytes object to a Uint8ClampedArray which when imported back to python turns into a memoryview. We should add a flag to the Uint8ClampedArray to indicate that it came from a python bytes object and convert it to bytes on return to python rather than a memoryview.

I propose that we change the apis so that all copies are explicit. This is easier to do for dict and list than for typedarray. I think for dict and list we can just remove lines 255-258 in python2.js.c and add deep_copy_into_native_js and shallow_copy_into_native_js methods to PyProxy.

I think for typedarray we should import it as WrappedTypedArray and provide methods like copy_into_memoryview, copy_from_memoryview and clone_as_memoryview. This way direct modification of the imported object actually modifies the Javascript TypedArray, it's clear from the api that the memoryview is a copy and so modifying it won't change the original TypedArray and there is an api for writing the result of a computation back into the original TypedArray.

Because tuples are immutable, we could consider adding back a shallow copy to convert tuples to javascript arrays. We could add a marker to the returned array to indicate that it came from a python tuple. This would mean that the round trip python ==> js ==> python would return a different tuple with identical contents, but because tuples are immutable, tuples with identical contents are exchangeable.

The docs say:

The most notable limitation is that while Python has distinct ways of accessing attributes and items (x.foo and x[foo]), Javascript conflates these two concepts.

I think here the correct model is the Javascript object Map. So:

  • dict["x"] <==> map.get("x")
  • dict["x"] = 2 <==> map.set("x", 2)
  • "x" in dict <==> map.has("x")
  • del dict["x"] <==> map.delete("x")

We should mimic this behavior with PyProxy. This also mitigates the annoying issue in javascript that a[b] is equivalent to a[b.toString()].

@rth
Copy link
Member

rth commented Dec 20, 2020

Thanks for doing this analysis @hoodmane !

I propose that we change the apis so that all copies are explicit.

So if I understand correctly you are proposing to return PyProxy for dict, list as well?

and add deep_copy_into_native_js and shallow_copy_into_native_js methods to PyProxy.

That sounds good but maybe with some less verbose names. BTW I think we are currently missing documentation for available PyProxy methods in https://pyodide.readthedocs.io/en/latest/api_reference.html#javascript-api

I agree that the inability to do a round trip conversion is fairly problematic, and overall the proposal sounds reasonable. Though I have not worked on type conversions a lot, so if possible, it might be good to try to find the initial rationale for doing the type conversions that was at time in git history. Also maybe @phorward would have some comments.

@phorward
Copy link
Contributor

Hello @hoodmane, thanks for your considerations on this important issue.

I think that trying it out and check if any problems come up is this worth for. At first view, I don't see any problems that may occur, and it is more straightforward to hold any Python-object going beyond primary types like long, float or even string in the Python world using PyProxy. Relating PyProxy, I still have #693 in mind, but this is another problem and exists between the JS / Python co-existence.

Would you like to make a pull request on this, so I can do further tests with existing projects on it?

@hoodmane
Copy link
Member Author

with some less verbose names

Of course =)

Would you like to make a pull request on this

Yes I will go ahead and implement this, thanks for your input.

@hoodmane
Copy link
Member Author

Update: I now think that the best posible thing to do for bytes objects and tuples (so immutable objects in Python with no immutable equivalent in Javascript) is to automatically copy them into javascript and allow the conversion back to python to proxy the javascript object. The round trip conversion through Javascript will then return an equivalent mutable object, leaving py2js and js2py not quite inverse to each other. This ensures that py2js and js2py never do hidden copies of mutable objects and never convert mutable objects to immutable ones.

@rth
Copy link
Member

rth commented Feb 12, 2021

It sounds like most of this was done, or has existing PRs?

@hoodmane
Copy link
Member Author

It's pretty close. The memory buffer stuff still needs work, but the other stuff and more has been done / has existing PRs. Perhaps I should close this as resolved, there are other issues specifically about memory buffers.

@rth
Copy link
Member

rth commented Apr 18, 2021

Closing as resolved (as far as I can tell). We can always open more specific issues.

@rth rth closed this as completed Apr 18, 2021
@hoodmane
Copy link
Member Author

I think as of #1322 everything above is implemented.

@rth
Copy link
Member

rth commented Apr 18, 2021

Great work @hoodmane !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants