numpy TypeError and obtain/deliver usability #350

Open
gst opened this issue Oct 7, 2019 · 5 comments
Labels
Feature Request for developer-valued functionality

Comments

@gst
gst commented Oct 7, 2019

Environment
  • rpyc latest (4.1.2)
  • python 3.6.7
  • operating system: Ubuntu 18.04
Minimal example

Server:

import rpyc
import numpy as np

from rpyc.utils.server import OneShotServer

class HelloService(rpyc.Service):
    def get(self):
        return np.random.rand(3, 3)

if __name__ == "__main__":
    rpyc.lib.setup_logger()
    server = OneShotServer(HelloService, port=12345,
                           protocol_config={'allow_all_attrs': True, 'allow_pickle': True})
    server.start()

Client:

import rpyc
import numpy as np

if __name__ == "__main__":
    c = rpyc.connect("localhost", 12345)
    arr = c.root.get()
    arr.astype(np.uint32)  # generates TypeError

traceback obtained:

Traceback (most recent call last):
  File "/home/gregory.starck/projects/pyro/rpyc_cli.py", line 7, in <module>
    arr.astype(np.uint32)  # generates TypeError
  File "/home/gregory.starck/.virtualenvs/autocooler/lib/python3.6/site-packages/rpyc/core/netref.py", line 247, in __call__
    return syncreq(_self, consts.HANDLE_CALL, args, kwargs)
  File "/home/gregory.starck/.virtualenvs/autocooler/lib/python3.6/site-packages/rpyc/core/netref.py", line 76, in syncreq
    return conn.sync_request(handler, proxy, *args)
  File "/home/gregory.starck/.virtualenvs/autocooler/lib/python3.6/site-packages/rpyc/core/protocol.py", line 464, in sync_request
    return self.async_request(handler, *args, timeout=timeout).value
  File "/home/gregory.starck/.virtualenvs/autocooler/lib/python3.6/site-packages/rpyc/core/async_.py", line 102, in value
    raise self._obj
TypeError: data type not understood

========= Remote Traceback (1) =========
Traceback (most recent call last):
  File "/home/gregory.starck/.virtualenvs/autocooler/lib/python3.6/site-packages/rpyc/core/protocol.py", line 323, in _dispatch_request
    res = self._HANDLERS[handler](self, *args)
  File "/home/gregory.starck/.virtualenvs/autocooler/lib/python3.6/site-packages/rpyc/core/protocol.py", line 585, in _handle_call
    return obj(*args, **dict(kwargs))
TypeError: data type not understood


Process finished with exit code 1

I was expecting the statement to succeed.

Is there anything I'm doing wrong?
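
A plausible reading of this traceback, offered as an assumption rather than a confirmed diagnosis: the client's local `np.uint32` class reaches the server as a proxy, and code that inspects an object's real type is not fooled by a proxy even when `isinstance` is. The sketch below uses only the standard library; `Proxy` and `strict_convert` are hypothetical names and are not RPyC's netref implementation or numpy's actual check.

```python
class Proxy:
    """Hypothetical stand-in for a remote reference; NOT RPyC's netref class."""

    def __init__(self, target):
        self._target = target

    @property
    def __class__(self):
        # Report the wrapped object's class, as transparent proxies usually do.
        return type(self._target)

    def __getattr__(self, name):
        # Forward every other attribute lookup to the wrapped object.
        return getattr(self._target, name)


def strict_convert(values, dtype):
    """Mimic a check that looks at the real type instead of __class__."""
    if type(dtype) is not type:
        raise TypeError("data type not understood")
    return [dtype(v) for v in values]


proxied_int = Proxy(int)
looks_like_type = isinstance(proxied_int, type)  # isinstance honours __class__
really_a_type = type(proxied_int) is type        # the real type is Proxy

try:
    strict_convert(["1", "2"], proxied_int)
    outcome = "converted"
except TypeError as exc:
    outcome = str(exc)
```

Here `isinstance` accepts the proxy as a type while the stricter check rejects it, which is the kind of gap a C-implemented library can expose.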

@comrumino comrumino self-assigned this Oct 7, 2019
@comrumino
Collaborator

comrumino commented Oct 7, 2019

There is potential for supporting this use case in the future: a comment at line 313 of rpyc.core.protocol notes that, in the future, it could see if a sys.module cache/lookup hits first. For now, you could use a netref to interact with objects on the remote server.
server.py

import rpyc
import numpy as np
from rpyc.utils.server import OneShotServer


class HelloService(rpyc.Service):
    def get(self):
        return np.random.rand(3, 3)

    def remote_np(self):
        return np

if __name__ == "__main__":
    rpyc.lib.setup_logger()
    server = OneShotServer(HelloService, port=8122, protocol_config={'allow_public_attrs': True, 'allow_pickle': True})
    server.start()

client.py

import rpyc
import numpy as np

if __name__ == "__main__":
    c = rpyc.connect("localhost", 8122)
    remote_arr = c.root.get()
    remote_np = c.root.remote_np()
    remote_arr.astype(remote_np.uint32)

Notable changes from the original example:
  1. changed allow_all_attrs to allow_public_attrs
  2. created a netref to numpy on the remote with remote_np = c.root.remote_np()

The remote should probably be run in a hypervisor or Docker container and restarted from a clean state at a regular interval when pickling is enabled or attribute access settings are loosened. If the time were taken to write a class factory around numpy, your service could be locked down, since it would no longer require allow_public_attrs. Moreover, naive pickling is going to create a lot of network traffic, which could be avoided by only using netref objects client side. Pickling objects is usually a lazy choice.
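
On the module cache lookup idea from the start of this comment, a minimal sketch of what such a lookup might look like follows. It assumes the receiving side is given a proxied class's module name and qualified name; `resolve_local_type` is a hypothetical helper, not an existing RPyC function, and it deliberately consults only modules that are already loaded instead of importing on request.

```python
import decimal  # imported so that "decimal" is present in sys.modules
import sys


def resolve_local_type(module_name, qualname):
    """Return the local object named by (module_name, qualname), or None.

    Hypothetical helper: only already-imported modules are consulted, so a
    peer cannot trigger new imports just by naming a module.
    """
    module = sys.modules.get(module_name)
    if module is None:
        return None
    obj = module
    for part in qualname.split("."):
        obj = getattr(obj, part, None)
        if obj is None:
            return None
    return obj


local_cls = resolve_local_type("decimal", "Decimal")
missing = resolve_local_type("module_that_is_not_loaded", "Anything")
```

When the lookup hits, the receiver could substitute its own class for the proxy; when it returns None, it would fall back to the current by-reference behaviour.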

@comrumino comrumino added the Triage Investigation by a maintainer has started label Oct 7, 2019
@gst
Author

gst commented Oct 8, 2019

Hi,

Unfortunately, we are trying to use rpyc transparently on the client side, where my code isn't allowed to know about rpyc, so I cannot import or get a remote ref to the numpy module as you do; I have to import from the local (client) numpy module.

Does your "in progress" status mean that it would be doable/fixable? I don't quite get the end of your reply.

Thanks anyway.

@comrumino
Collaborator

comrumino commented Oct 11, 2019

In progress does mean that I believe it is fixable, but it is not trivial. There are also workarounds in the interim, but these depend on the requirements/restrictions. The end of my reply stemmed from the belief that, since pickled objects have security risks and portability issues, it's probably worthwhile to avoid pickling, but that is just my opinion. In many use cases numpy.save doesn't require pickle.

There are limitations to complete "transparency". For example, the client always has to establish a connection to the service, and a built-in type implemented in C may use one of PyObject's pointers directly, thereby breaking promises of transparency. Another possible implementation of the client is:

import rpyc
import numpy as np

if __name__ == "__main__":
    c = rpyc.connect("localhost", 8122)
    remote_arr = c.root.get()
    arr = rpyc.classic.obtain(remote_arr)
    arr.astype(np.uint32)

Of course, the best approach depends on the end goal and technical requirements. Does this client example work until a proper enhancement is complete?

Another related improvement would be adding a decorator to remove the requirement of using rpyc.utils.classic.obtain or rpyc.utils.classic.deliver.
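
As a rough illustration of the decorator idea, the sketch below marks a method as "return by value" and serializes its result. `by_value` and the `returns_by_value` marker are hypothetical names, not part of RPyC, and the service is a plain class so the example runs without RPyC installed. In a real integration the framework, not the caller, would have to perform the final deserialization step for the client code to stay unaware of it.

```python
import functools
import pickle


def by_value(func):
    """Hypothetical decorator: return a pickled copy instead of a reference."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return pickle.dumps(func(*args, **kwargs))

    wrapper.returns_by_value = True  # marker a framework could inspect
    return wrapper


class HelloService:
    # Plain class standing in for an rpyc.Service subclass.
    @by_value
    def get(self):
        return [[0.1, 0.2], [0.3, 0.4]]


payload = HelloService().get()   # bytes, which can be sent as a plain value
matrix = pickle.loads(payload)   # the receiver rebuilds an independent copy
```

The pickle caveats raised earlier in the thread apply here as well; the serializer is interchangeable.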

@gst
Author

gst commented Oct 11, 2019

Does this client example work until a proper enhancement is complete?

No, unfortunately we cannot use obtain on the client side once we've got that ref to the remote array, as I said. Actually, if it were possible, we would like numpy arrays to always be transferred raw (I really mean serialized), and transparently, to the other side. I don't even know if that's something that would be configurable at some level of rpyc?

Thanks again.

@comrumino comrumino added the Feature Request for developer-valued functionality label Nov 14, 2019
@comrumino
Collaborator

comrumino commented Nov 14, 2019

Sorry about the slow turnaround. As suggested by #310 and here, the usability of RPyC would benefit from improving how return by-netref and return by-value are handled.

Approaches/considerations

  1. add decorators to control return by-value vs by-netref
  2. add a new key to the protocol configuration to change the default from return by-netref to return by-value
  3. add support for embedded custom de-serializers
  4. keep it DRY: obtain/deliver are used for both classic mode and service mode
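
Approach 3 could take the shape of a per-type codec registry. The sketch below is one possible design under stated assumptions, not RPyC's API: `register_codec`, `dump`, and `load` are hypothetical names, and the standard library's `array.array` stands in for a numpy array so the example has no third-party dependencies.

```python
import array

_ENCODERS = {}  # type -> (tag, encode function)
_DECODERS = {}  # tag -> decode function


def register_codec(cls, tag, encode, decode):
    """Register how instances of cls are sent by value (hypothetical API)."""
    _ENCODERS[cls] = (tag, encode)
    _DECODERS[tag] = decode


def dump(obj):
    """Return (tag, payload) for registered types, else None (by reference)."""
    entry = _ENCODERS.get(type(obj))
    if entry is None:
        return None
    tag, encode = entry
    return tag, encode(obj)


def load(tag, payload):
    """Rebuild a local copy from a tagged payload."""
    return _DECODERS[tag](payload)


# The payload holds only a typecode string and raw bytes, so no pickle is used.
register_codec(
    array.array,
    "array",
    lambda a: (a.typecode, a.tobytes()),
    lambda p: array.array(p[0], p[1]),
)

original = array.array("d", [1.0, 2.5, 4.0])
tag, payload = dump(original)
restored = load(tag, payload)
```

Types without a registered codec yield None from `dump`, which a framework could treat as the signal to keep the existing by-netref behaviour.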

@comrumino comrumino added Diagnosed and removed Triage Investigation by a maintainer has started labels Nov 14, 2019
@comrumino comrumino changed the title from "numpy array + astype : TypeError: data type not understood" to "numpy TypeError and obtain/deliver usability" Dec 17, 2019