I'm trying to use RPyC with TensorFlow to distribute machine learning steps across a cluster. Right now, I'm struggling to save a trained model. When I call model.save inside a node, it throws the error below:
Traceback (most recent call last):
File "test_MLFV.py", line 82, in <module>
single(sys.argv[1])
File "test_MLFV.py", line 58, in single
x = send_chain(c,p)
File "test_MLFV.py", line 48, in send_chain
ret = con.root.exec_chain(c,p)
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/netref.py", line 253, in __call__
return syncreq(_self, consts.HANDLE_CALL, args, kwargs)
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/netref.py", line 76, in syncreq
return conn.sync_request(handler, proxy, *args)
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/protocol.py", line 469, in sync_request
return self.async_request(handler, *args, timeout=timeout).value
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/async_.py", line 102, in value
raise self._obj
AttributeError: 'list' object has no attribute '__name__'
========= Remote Traceback (3) =========
Traceback (most recent call last):
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/protocol.py", line 320, in _dispatch_request
res = self._HANDLERS[handler](self, *args)
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/protocol.py", line 593, in _handle_call
return obj(*args, **dict(kwargs))
File "MLFV_Module.py", line 20, in exposed_exec_chain
x = parse_chain(c, p, db)
File "/home/matheus/Desktop/mlfv-3.0/server/MLFV_Parsing.py", line 9, in parse_chain
return parse_seq(c,p,db)
File "/home/matheus/Desktop/mlfv-3.0/server/MLFV_Parsing.py", line 20, in parse_seq
parse_seq(i,p,db)
File "/home/matheus/Desktop/mlfv-3.0/server/MLFV_Parsing.py", line 25, in parse_seq
return exec_chain_function(c, p, ret, obj, pp, db)
File "/home/matheus/Desktop/mlfv-3.0/server/MLFV_Manager.py", line 24, in exec_chain_function
r = send_function(con, cc) # send the function to be executed there
File "/home/matheus/Desktop/mlfv-3.0/server/MLFV_Manager.py", line 10, in send_function
run = rpyc.utils.classic.teleport_function(con, obj.run)(obj)
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/netref.py", line 253, in __call__
return syncreq(_self, consts.HANDLE_CALL, args, kwargs)
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/netref.py", line 76, in syncreq
return conn.sync_request(handler, proxy, *args)
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/protocol.py", line 469, in sync_request
return self.async_request(handler, *args, timeout=timeout).value
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/async_.py", line 102, in value
raise self._obj
AttributeError: 'list' object has no attribute '__name__'
========= Remote Traceback (2) =========
Traceback (most recent call last):
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/protocol.py", line 320, in _dispatch_request
res = self._HANDLERS[handler](self, *args)
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/protocol.py", line 593, in _handle_call
return obj(*args, **dict(kwargs))
File "/home/matheus/Desktop/mlfv-3.0/server/training.py", line 51, in run
model.save('trained_model.h5')
File "/home/matheus/.local/lib/python2.7/site-packages/tensorflow_core/python/keras/engine/network.py", line 1008, in save
signatures, options)
File "/home/matheus/.local/lib/python2.7/site-packages/tensorflow_core/python/keras/saving/save.py", line 112, in save_model
model, filepath, overwrite, include_optimizer)
File "/home/matheus/.local/lib/python2.7/site-packages/tensorflow_core/python/keras/saving/hdf5_format.py", line 103, in save_model_to_hdf5
v, default=serialization.get_json_type).encode('utf8')
File "/usr/lib/python2.7/json/__init__.py", line 251, in dumps
sort_keys=sort_keys, **kw).encode(obj)
File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
return _iterencode(o, 0)
File "/home/matheus/.local/lib/python2.7/site-packages/tensorflow_core/python/util/serialization.py", line 54, in get_json_type
return obj.__name__
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/netref.py", line 166, in __getattr__
return syncreq(self, consts.HANDLE_GETATTR, name)
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/netref.py", line 76, in syncreq
return conn.sync_request(handler, proxy, *args)
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/protocol.py", line 469, in sync_request
return self.async_request(handler, *args, timeout=timeout).value
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/async_.py", line 102, in value
raise self._obj
AttributeError: 'list' object has no attribute '__name__'
========= Remote Traceback (1) =========
Traceback (most recent call last):
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/protocol.py", line 320, in _dispatch_request
res = self._HANDLERS[handler](self, *args)
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/protocol.py", line 609, in _handle_getattr
return self._access_attr(obj, name, (), "_rpyc_getattr", "allow_getattr", getattr)
File "/home/matheus/.local/lib/python2.7/site-packages/rpyc/core/protocol.py", line 537, in _access_attr
return accessor(obj, name, *args)
AttributeError: 'list' object has no attribute '__name__'
I also tried to pickle the trained model and got this error: TypeError: can't pickle SwigPyObject objects
To start the server, go to the server directory and run: python2.7 MLFV_Module.py
Then initialize the clients with the init_client.py script. Go to the client directory and run: python2.7 init_client.py localhost 15089 "numpy,pandas,tensorflow" "os,sys,timeit,numpy,pandas,tensorflow" 256000000 2 100
To run a function chain, go back to the server directory and run: python2.7 test_MLFV.py
I'm not sure whether this is a problem with rpyc or with tensorflow, because the traceback goes through both packages.
The strangest thing is that I built a small boilerplate to test it, and it worked fine. The boilerplate has the following code:
Server
Client
from __future__ import print_function
import rpyc
import sys

if __name__ == "__main__":
    func = sys.argv[1]
    c = rpyc.connect("localhost", 12345)
    exec('print(c.root.{}())'.format(func))
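As a side note on the boilerplate client, the exec call can usually be avoided by looking the method up by name with getattr. The sketch below uses a hypothetical FakeRoot class in place of c.root so that it runs without a server; with a real connection, the object passed in would be rpyc.connect(...).root.

```python
from __future__ import print_function


class FakeRoot(object):
    """Hypothetical stand-in for c.root, used only so the sketch runs offline."""

    def train(self):
        return "trained"


def call_by_name(root, func_name):
    # Look the method up by name and call it with no arguments.
    # A missing name raises AttributeError instead of running arbitrary code.
    return getattr(root, func_name)()


result = call_by_name(FakeRoot(), "train")
print(result)
```

This keeps the command-line argument from being evaluated as code, which is the main reason to prefer it over exec here.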
So what is it about my code that makes it fail with AttributeError in my application? Is it related to rpyc, or is it something I did wrong in my implementation?
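One possible reading of the innermost traceback is that the JSON encoder inside model.save met an rpyc netref instead of a plain list: the encoder did not recognize it, fell back to get_json_type, and the obj.__name__ lookup was forwarded to the list on the other node. This is a guess from the traceback, not a confirmed diagnosis. The sketch below imitates that situation with the standard library only. ListProxy, get_json_type (a simplified stand-in for the TensorFlow helper), and to_local are all hypothetical names introduced for illustration.

```python
import json


class ListProxy(object):
    """Hypothetical stand-in for an rpyc netref that wraps a list on another node."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Like a netref, forward unknown attribute lookups to the wrapped object.
        return getattr(self._target, name)

    def __iter__(self):
        return iter(self._target)


def get_json_type(obj):
    # Simplified stand-in for the fallback seen in the traceback,
    # which ends in "return obj.__name__".
    return obj.__name__


def to_local(value):
    # Hypothetical helper: copy a proxied sequence into a plain local list.
    return list(value) if isinstance(value, ListProxy) else value


# A config value that looks like a list but is really a proxy.
config = {"units": ListProxy([64, 32])}

try:
    json.dumps(config, default=get_json_type)
    error = None
except AttributeError as exc:
    error = str(exc)

# After converting proxies to local values, serialization goes through.
fixed = json.dumps({k: to_local(v) for k, v in config.items()},
                   default=get_json_type)
print(error)
print(fixed)
```

If this is what happens in the application, converting arguments to local values on the node before building the model may help. rpyc.utils.classic.obtain is meant for copying a remote object by value, though whether it fits this setup is an assumption to verify.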
Minimal example
The code example can be found here:
https://github.com/mlfv-ufsm/mlfv-3.0/tree/feature/tensorflow-rpyc