
dump_session not working if getpass module is imported #319

Closed
danilociaffi opened this issue Jun 10, 2019 · 9 comments
@danilociaffi

I have a Jupyter notebook whose state I want to save using dill.dump_session; however, the function throws an error if the notebook imports the getpass module.

I understand the point of not dumping the password, but how can I save the rest of the session?

Here's a toy example of my code:

import dill
from getpass import getpass

dill.dump_session('../session_dump/sess_test.pkl')

and here is the error I get:

<ipython-input-8-5758b730705e> in <module>
----> 1 dill.dump_session('../session_dump/sess_test.pkl')

~\Anaconda3\lib\site-packages\dill\_dill.py in dump_session(filename, main, byref)
    391         pickler._recurse = False # disable pickling recursion for globals
    392         pickler._session = True  # is best indicator of when pickling a session
--> 393         pickler.dump(main)
    394     finally:
    395         if f is not filename:  # If newly opened file

~\Anaconda3\lib\pickle.py in dump(self, obj)
    435         if self.proto >= 4:
    436             self.framer.start_framing()
--> 437         self.save(obj)
    438         self.write(STOP)
    439         self.framer.end_framing()

~\Anaconda3\lib\pickle.py in save(self, obj, save_persistent_id)
    502         f = self.dispatch.get(t)
    503         if f is not None:
--> 504             f(self, obj) # Call unbound method with explicit self
    505             return
    506 

~\Anaconda3\lib\site-packages\dill\_dill.py in save_module(pickler, obj)
   1267                 + ["__builtins__", "__loader__"]]
   1268             pickler.save_reduce(_import_module, (obj.__name__,), obj=obj,
-> 1269                                 state=_main_dict)
   1270             log.info("# M1")
   1271         else:

~\Anaconda3\lib\pickle.py in save_reduce(self, func, args, state, listitems, dictitems, obj)
    660 
    661         if state is not None:
--> 662             save(state)
    663             write(BUILD)
    664 

~\Anaconda3\lib\pickle.py in save(self, obj, save_persistent_id)
    502         f = self.dispatch.get(t)
    503         if f is not None:
--> 504             f(self, obj) # Call unbound method with explicit self
    505             return
    506 

~\Anaconda3\lib\site-packages\dill\_dill.py in save_module_dict(pickler, obj)
    900             # we only care about session the first pass thru
    901             pickler._session = False
--> 902         StockPickler.save_dict(pickler, obj)
    903         log.info("# D2")
    904     return

~\Anaconda3\lib\pickle.py in save_dict(self, obj)
    854 
    855         self.memoize(obj)
--> 856         self._batch_setitems(obj.items())
    857 
    858     dispatch[dict] = save_dict

~\Anaconda3\lib\pickle.py in _batch_setitems(self, items)
    880                 for k, v in tmp:
    881                     save(k)
--> 882                     save(v)
    883                 write(SETITEMS)
    884             elif n:

~\Anaconda3\lib\pickle.py in save(self, obj, save_persistent_id)
    502         f = self.dispatch.get(t)
    503         if f is not None:
--> 504             f(self, obj) # Call unbound method with explicit self
    505             return
    506 

~\Anaconda3\lib\site-packages\dill\_dill.py in save_instancemethod0(pickler, obj)
   1076     log.info("Me: %s" % obj) #XXX: obj.__dict__ handled elsewhere?
   1077     if PY3:
-> 1078         pickler.save_reduce(MethodType, (obj.__func__, obj.__self__), obj=obj)
   1079     else:
   1080         pickler.save_reduce(MethodType, (obj.im_func, obj.im_self,

~\Anaconda3\lib\pickle.py in save_reduce(self, func, args, state, listitems, dictitems, obj)
    636         else:
    637             save(func)
--> 638             save(args)
    639             write(REDUCE)
    640 

~\Anaconda3\lib\pickle.py in save(self, obj, save_persistent_id)
    502         f = self.dispatch.get(t)
    503         if f is not None:
--> 504             f(self, obj) # Call unbound method with explicit self
    505             return
    506 

~\Anaconda3\lib\pickle.py in save_tuple(self, obj)
    769         if n <= 3 and self.proto >= 2:
    770             for element in obj:
--> 771                 save(element)
    772             # Subtle.  Same as in the big comment below.
    773             if id(obj) in memo:

~\Anaconda3\lib\pickle.py in save(self, obj, save_persistent_id)
    547 
    548         # Save the reduce() output and finally memoize the object
--> 549         self.save_reduce(obj=obj, *rv)
    550 
    551     def persistent_id(self, obj):

~\Anaconda3\lib\pickle.py in save_reduce(self, func, args, state, listitems, dictitems, obj)
    660 
    661         if state is not None:
--> 662             save(state)
    663             write(BUILD)
    664 

~\Anaconda3\lib\pickle.py in save(self, obj, save_persistent_id)
    502         f = self.dispatch.get(t)
    503         if f is not None:
--> 504             f(self, obj) # Call unbound method with explicit self
    505             return
    506 

~\Anaconda3\lib\site-packages\dill\_dill.py in save_module_dict(pickler, obj)
    900             # we only care about session the first pass thru
    901             pickler._session = False
--> 902         StockPickler.save_dict(pickler, obj)
    903         log.info("# D2")
    904     return

~\Anaconda3\lib\pickle.py in save_dict(self, obj)
    854 
    855         self.memoize(obj)
--> 856         self._batch_setitems(obj.items())
    857 
    858     dispatch[dict] = save_dict

~\Anaconda3\lib\pickle.py in _batch_setitems(self, items)
    880                 for k, v in tmp:
    881                     save(k)
--> 882                     save(v)
    883                 write(SETITEMS)
    884             elif n:

~\Anaconda3\lib\pickle.py in save(self, obj, save_persistent_id)
    502         f = self.dispatch.get(t)
    503         if f is not None:
--> 504             f(self, obj) # Call unbound method with explicit self
    505             return
    506 

~\Anaconda3\lib\site-packages\dill\_dill.py in save_module_dict(pickler, obj)
    900             # we only care about session the first pass thru
    901             pickler._session = False
--> 902         StockPickler.save_dict(pickler, obj)
    903         log.info("# D2")
    904     return

~\Anaconda3\lib\pickle.py in save_dict(self, obj)
    854 
    855         self.memoize(obj)
--> 856         self._batch_setitems(obj.items())
    857 
    858     dispatch[dict] = save_dict

~\Anaconda3\lib\pickle.py in _batch_setitems(self, items)
    880                 for k, v in tmp:
    881                     save(k)
--> 882                     save(v)
    883                 write(SETITEMS)
    884             elif n:

~\Anaconda3\lib\pickle.py in save(self, obj, save_persistent_id)
    547 
    548         # Save the reduce() output and finally memoize the object
--> 549         self.save_reduce(obj=obj, *rv)
    550 
    551     def persistent_id(self, obj):

~\Anaconda3\lib\pickle.py in save_reduce(self, func, args, state, listitems, dictitems, obj)
    660 
    661         if state is not None:
--> 662             save(state)
    663             write(BUILD)
    664 

~\Anaconda3\lib\pickle.py in save(self, obj, save_persistent_id)
    502         f = self.dispatch.get(t)
    503         if f is not None:
--> 504             f(self, obj) # Call unbound method with explicit self
    505             return
    506 

~\Anaconda3\lib\site-packages\dill\_dill.py in save_module_dict(pickler, obj)
    900             # we only care about session the first pass thru
    901             pickler._session = False
--> 902         StockPickler.save_dict(pickler, obj)
    903         log.info("# D2")
    904     return

~\Anaconda3\lib\pickle.py in save_dict(self, obj)
    854 
    855         self.memoize(obj)
--> 856         self._batch_setitems(obj.items())
    857 
    858     dispatch[dict] = save_dict

~\Anaconda3\lib\pickle.py in _batch_setitems(self, items)
    885                 k, v = tmp[0]
    886                 save(k)
--> 887                 save(v)
    888                 write(SETITEM)
    889             # else tmp is empty, and we're done

~\Anaconda3\lib\pickle.py in save(self, obj, save_persistent_id)
    547 
    548         # Save the reduce() output and finally memoize the object
--> 549         self.save_reduce(obj=obj, *rv)
    550 
    551     def persistent_id(self, obj):

~\Anaconda3\lib\pickle.py in save_reduce(self, func, args, state, listitems, dictitems, obj)
    660 
    661         if state is not None:
--> 662             save(state)
    663             write(BUILD)
    664 

~\Anaconda3\lib\pickle.py in save(self, obj, save_persistent_id)
    502         f = self.dispatch.get(t)
    503         if f is not None:
--> 504             f(self, obj) # Call unbound method with explicit self
    505             return
    506 

~\Anaconda3\lib\site-packages\dill\_dill.py in save_module_dict(pickler, obj)
    900             # we only care about session the first pass thru
    901             pickler._session = False
--> 902         StockPickler.save_dict(pickler, obj)
    903         log.info("# D2")
    904     return

~\Anaconda3\lib\pickle.py in save_dict(self, obj)
    854 
    855         self.memoize(obj)
--> 856         self._batch_setitems(obj.items())
    857 
    858     dispatch[dict] = save_dict

~\Anaconda3\lib\pickle.py in _batch_setitems(self, items)
    880                 for k, v in tmp:
    881                     save(k)
--> 882                     save(v)
    883                 write(SETITEMS)
    884             elif n:

~\Anaconda3\lib\pickle.py in save(self, obj, save_persistent_id)
    522             reduce = getattr(obj, "__reduce_ex__", None)
    523             if reduce is not None:
--> 524                 rv = reduce(self.proto)
    525             else:
    526                 reduce = getattr(obj, "__reduce__", None)

~\Anaconda3\lib\site-packages\zmq\backend\cython\socket.cp37-win_amd64.pyd in zmq.backend.cython.socket.Socket.__reduce_cython__()

TypeError: no default __reduce__ due to non-trivial __cinit__
@joar

joar commented Jun 13, 2019

The exception is raised when trying to pickle a zmq socket, probably because getpass references sys.stdout, which references a zmq socket when running in a Jupyter notebook.

See https://gist.github.com/joar/e03a63a1329cb70c91c4c811b7365cb2 for a minimal example.
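The underlying failure mode can be reproduced with the standard library alone, even outside Jupyter: stream objects such as sys.stdout carry OS-level state and refuse to pickle. A minimal sketch (not the zmq case itself, where the failure surfaces as the TypeError in the traceback above):

```python
import pickle
import sys

# Stream objects wrap file descriptors and cannot be pickled; in a Jupyter
# kernel, sys.stdout additionally holds a reference to a zmq socket.
try:
    pickle.dumps(sys.stdout)
except TypeError as exc:
    print("pickle failed:", exc)
```

Anything reachable from the session's globals that references such a stream will sink the whole dump.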

@joar

joar commented Jun 13, 2019

uqfoundation/pathos#132 seems to be a duplicate of (or at least similar to) this issue.

@joar

joar commented Jun 13, 2019

To be able to dump your session, you probably need to remove all references to getpass and sys.stdout by liberally using the del keyword, e.g. del getpass, etc.

@danilociaffi
Author

Yeah, my workaround was exactly that: adding del getpass before dill.dump_session, and it worked fine.
Since you usually dump your session at the very end, it is not a big deal to delete the module just before dumping. I can see it becoming very tedious if you need to dump in the middle and have to delete and re-import packages multiple times (or if a conflicting import lives inside a third-party module), but deleting names is not the most intuitive way to use dill either.
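The workaround described above amounts to dropping the offending name from the notebook's globals before dumping. A minimal sketch (sess_test.pkl is a placeholder path):

```python
import dill
from getpass import getpass

# ... notebook work, possibly prompting for a password via getpass() ...

# Drop the module-level name before dumping: in a Jupyter kernel, getpass
# reaches sys.stdout, which in turn references an unpicklable zmq socket.
del getpass
dill.dump_session('sess_test.pkl')
```

After a restart, dill.load_session('sess_test.pkl') restores the saved globals, and getpass can simply be re-imported.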

@joar

joar commented Jun 14, 2019

@danilociaffi It may not be good form to post this here, but since the dill package does not seem to have much bandwidth on the Python 3 compatibility front, I'll post it for your benefit, and for other frustrated web searchers who end up here:
apache-beam is considering cloudpickle (https://github.com/cloudpipe/cloudpickle) due to dill's Python 3 compatibility issues; you could see if that works for you.

As a side note, I still haven't wrapped my head around what the correct way to serialize a socket would even be.

@mmckerns
Member

@joar: I'm not sure what you mean by Python 3 compatibility issues. dill serializes a broader range of objects than anything else... so I think the issue is more that there are some objects that dill pickles and some that it doesn't... and similarly for cloudpickle... and you just have to choose a serializer based on which objects are better supported for your application. Admittedly, some of the new objects/patterns introduced in Python 3.7 have not had much time committed to them in dill, that's true. However, with Python 3.8 beta 1 just out, they are currently being looked at, and will hopefully be dealt with before the Python 3.8.0 final release.

FYI dill does have a cloudpickle mode, if you use one of the alternate dill.settings. For newer objects, that might not work, however... but it's in the cards to pull in the few newer common objects that cloudpickle supports and dill doesn't yet. They are all fairly straightforward, it's just a time issue (as it always is).

So, if you have important objects that are not supported, then please post an issue, and it will be dealt with (I realize you already posted the RLock issue #321 , so thanks). dill does try to support users wherever it can, but if we don't see the issues, we don't know what they are -- basically, there is no way to test all objects and coding combinations, so we just go through the issues as they come in. We welcome contributions/PRs to the code here.

Anyway, @joar I agree with most of your comments, and that this is a similar/duplicate to uqfoundation/pathos#132. Also @danilociaffi: note that there is an old enhancement request that would be helpful to you, and it's basically your workaround. See issue #66.

@danilociaffi: If you feel your question has been addressed, then please close the issue.

@joar

joar commented Jun 14, 2019

Thank you for your clarifying response @mmckerns

The plural "compatibility issues" was mentioned in a comment on my dill-related apache-beam JIRA ticket; I've only found a single issue (#321), albeit a nasty, challenging one.

I understand it's impossible to stay on top of everything anyone is trying to pickle in any possible environment. I did experiment with cloudpickle and found that it's as you describe regarding coverage:

import cloudpickle
import sys
cloudpickle.dumps(sys.modules["__main__"])

gives

Traceback (most recent call last):
# [...]
PicklingError: args from save_reduce() must be a tuple

@mmckerns
Member

mmckerns commented Jun 15, 2019

@danilociaffi: I don't believe a cython object can be serialized easily -- unless it includes a __reduce__ method or similar. The object that's failing for your case ultimately is a cython object. The most fruitful path may be to ping the zmq developers about seeing if the object can be made to serialize by adding serialization-helper methods.
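For objects you control, the serialization-helper approach mentioned above boils down to defining __reduce__, so the pickler records a recipe for rebuilding the object instead of trying to capture its raw (unpicklable) state. A minimal sketch with a hypothetical Handle class standing in for a socket-like wrapper:

```python
import pickle

class Handle:
    """Hypothetical stand-in for a wrapper around an unpicklable resource."""
    def __init__(self, addr):
        self.addr = addr
        self._conn = lambda: None  # pretend live resource; not picklable itself

    def __reduce__(self):
        # Pickle only the recipe: call Handle(addr) again at load time.
        # The live resource is re-created on unpickling, not transferred.
        return (Handle, (self.addr,))

restored = pickle.loads(pickle.dumps(Handle("tcp://localhost:5555")))
print(restored.addr)
```

This is roughly what a serialization-friendly extension type would expose; whether zmq sockets can meaningfully support it is for the zmq developers to decide.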

@danilociaffi
Author

I understand, my workaround seems to solve my problems anyway.
Thank you both!
