dill doctests fail when run as python -m doctest #49

Closed
distobj opened this issue Jun 23, 2014 · 13 comments

@distobj

distobj commented Jun 23, 2014

I'm encountering problems running doctests via python -m doctest and nosetests --with-doctest. This seems related to #18 (and also in PySpark, FWIW), but differs in that doctest.testmod() executed inside the module doesn't trigger the problem. Sample code:

import dill
import doctest

def test_dill():
    """
    >>> out = dill.dumps(lambda x: x)
    """

    out = dill.dumps(lambda x: x)

doctest.testmod()

So when executed via python -m doctest under Python 2.7.3, I get a long recursive stack trace of save_module_dict and save_module calls, concluding with:

      ...
      File "/home/mbaker/venvs/bibframe/local/lib/python2.7/site-packages/dill/dill.py", line 773, in save_module
        state=_main_dict)
      File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
        save(state)
      File "/usr/lib/python2.7/pickle.py", line 286, in save
        f(self, obj) # Call unbound method with explicit self
      File "/home/mbaker/venvs/bibframe/local/lib/python2.7/site-packages/dill/dill.py", line 504, in save_module_dict
        StockPickler.save_dict(pickler, obj)
      File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
        self._batch_setitems(obj.iteritems())
      File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
        save(v)
      File "/usr/lib/python2.7/pickle.py", line 286, in save
        f(self, obj) # Call unbound method with explicit self
      File "/home/mbaker/venvs/bibframe/local/lib/python2.7/site-packages/dill/dill.py", line 816, in save_type
        StockPickler.save_global(pickler, obj)
      File "/usr/lib/python2.7/pickle.py", line 748, in save_global
        (obj, module, name))
    PicklingError: Can't pickle <class 'unittest.util.Mismatch'>: it's not found as unittest.util.Mismatch
@mmckerns
Member

I tried your test code in python2.7.7 with the latest dill (from github), and I didn't encounter the error.

dude@hilbert>$ python -m doctest test_doctest.py 
dude@hilbert>$ python test_doctest.py
dude@hilbert>$ nosetests-2.7 --with-doctest test_doctest.py
.
----------------------------------------------------------------------
Ran 1 test in 0.002s

OK

Which version of dill are you using? There have been a recent change or two around the line at the top of your stack trace (the dill line you clipped, around line 773).

The last line of your traceback is a common error. If there's some "intelligent" renaming of instances, or it's one of the cases where Python provides the wrong path for an object… dill can also fail to serialize the object.

@matsjoyce
Contributor

Same: latest dill:

$ python2 --version
Python 2.7.6
$ python2 -m doctest a.py
$ cat a.py
import dill
import doctest

def test_dill():
    """
    >>> out = dill.dumps(lambda x: x)
    """

    out = dill.dumps(lambda x: x)

doctest.testmod()
$ python2
Python 2.7.6 (default, Feb 26 2014, 12:07:17) 
[GCC 4.8.2 20140206 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.__version__
'0.2b2.dev'
>>> 
$

@distobj
Author

distobj commented Jun 24, 2014

Well, that's wacky. I just installed a new Mint 17 VirtualBox instance and had the same problem with it. Python version 2.7.6, dill version '0.2.1'.

FWIW, the non-doctest "out =" assignment was overly-indented in what I pasted above, but that's immaterial.

@matsjoyce
Contributor

Could you put

import dill
import doctest

dill.dill._trace(1)

def test_dill():
    """
    >>> out = dill.dumps(lambda x: x)
    """

out = dill.dumps(lambda x: x)

doctest.testmod()

in your file and run it, then paste the output?

For reference, I get:

F1: <function <lambda> at 0x287f488>
F2: <function _create_function at 0x281b848>
Co: <code object <lambda> at 0x204ddb0, file "a.py", line 11>
F2: <function _unmarshal at 0x281b6e0>
D4: <dict object at 0x2127730>
D2: <dict object at 0x2856200>
F1: <function <lambda> at 0x2895a28>
F2: <function _create_function at 0x281b848>
Co: <code object <lambda> at 0x286ad30, file "<doctest a.test_dill[0]>", line 1>
F2: <function _unmarshal at 0x281b6e0>
D2: <dict object at 0x29795b0>
M2: <module 'dill' from 'dill/__init__.pyc'>
F2: <function _import_module at 0x281c140>
D4: <dict object at 0x1de77c0>
M2: <module 'doctest' from '/usr/lib/python2.7/doctest.pyc'>
F2: <function test_dill at 0x287f410>
D2: <dict object at 0x29f3980>

@distobj
Author

distobj commented Jun 24, 2014

$ python a.py 
F1: <function <lambda> at 0x7f12e1b58230>
F2: <function _create_function at 0x7f12e2359140>
Co: <code object <lambda> at 0x7f12e3b59a30, file "a.py", line 11>
F2: <function _unmarshal at 0x7f12e2350f50>
D1: <dict object at 0x7f12e3b89168>
D2: <dict object at 0x7f12e21a9398>
F1: <function <lambda> at 0x7f12e1b58230>
F2: <function _create_function at 0x7f12e2359140>
Co: <code object <lambda> at 0x7f12e22b78b0, file "<doctest __main__.test_dill[0]>", line 1>
F2: <function _unmarshal at 0x7f12e2350f50>
D1: <dict object at 0x7f12e21abb40>
D2: <dict object at 0x7f12e1b71d70>

@matsjoyce
Contributor

It didn't crash?

@distobj
Author

distobj commented Jun 24, 2014

Ah, didn't realize you were asking for the output of python -m doctest. Here it is:

$ python -m doctest a.py 
F1: <function <lambda> at 0x7f64dc9e3e60>
F2: <function _create_function at 0x7f64dca2fe60>
Co: <code object <lambda> at 0x7f64dd6887b0, file "a.py", line 11> 
F2: <function _unmarshal at 0x7f64dca2fcf8>
D4: <dict object at 0x7f64dd69ad70>
D2: <dict object at 0x7f64dca3a280>
F1: <function <lambda> at 0x7f64dca12500>
F2: <function _create_function at 0x7f64dca2fe60>
Co: <code object <lambda> at 0x7f64dca546b0, file "<doctest a.test_dill[0]>", line 1>
F2: <function _unmarshal at 0x7f64dca2fcf8>
D2: <dict object at 0x7f64dc9ae398>
M2: <module 'dill' from '/home/mark/.virtualenvs/libhub/local/lib/python2.7/site-packages/dill/__init__.pyc'>
F2: <function _import_module at 0x7f64dca316e0>
D4: <dict object at 0x7f64deaf6168>
M1: <module 'doctest' from '/usr/lib/python2.7/doctest.pyc'>
D2: <dict object at 0x7f64dc9bfb40>
M1: <module '__future__' from '/usr/lib/python2.7/__future__.pyc'>
D2: <dict object at 0x7f64dc9b3280>
C2: __future__._Feature
D2: <dict object at 0x7f64de9a86e0>
D2: <dict object at 0x7f64de9a7c58>
D2: <dict object at 0x7f64de9a8d70>
D2: <dict object at 0x7f64de9a8910>
D2: <dict object at 0x7f64de9a87f8>
D2: <dict object at 0x7f64de9a8c58>
D2: <dict object at 0x7f64de9a85c8>
F2: <function _module_relative_path at 0x7f64dc9e0140>
M1: <module 'unittest' from '/usr/lib/python2.7/unittest/__init__.pyc'>
D2: <dict object at 0x7f64dc9b2c58>
F2: <function removeResult at 0x7f64dd6fef50>
M1: <module 'unittest.runner' from '/usr/lib/python2.7/unittest/runner.pyc'>
D2: <dict object at 0x7f64dc9b2d70>
T4: <class 'unittest.runner.TextTestResult'>
T4: <class 'unittest.runner._WritelnDecorator'>
M2: <module 'sys' (built-in)>
M1: <module 'unittest.result' from '/usr/lib/python2.7/unittest/result.pyc'>
D2: <dict object at 0x7f64dc9b15c8>
M1: <module 'unittest.util' from '/usr/lib/python2.7/unittest/util.pyc'>
D2: <dict object at 0x7f64dc9ae4b0>
F2: <function namedtuple at 0x7f64dd6b5938>
T4: <class 'collections.OrderedDict'>
F2: <function _count_diff_hashable at 0x7f64de952410>
F2: <function _count_diff_all_purpose at 0x7f64dd6e2f50>
F2: <function unorderable_list_difference at 0x7f64dd6e2ed8>
F2: <function strclass at 0x7f64dd6e2de8>
T4: <class 'unittest.util.Mismatch'>

@matsjoyce
Contributor

What's the value of sys.prefix? I think it's got to do with the module saving code, which decides whether to pickle the whole module or a ref to it. Should be fixed in #43?
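
A rough sketch of how a prefix-based check could go wrong inside a virtualenv. This is an assumption about the general shape of such a heuristic, not dill's actual code: `looks_like_standard_module` is a hypothetical helper, and the virtualenv prefix is inferred from the dill path in the trace above.

```python
import sys


def looks_like_standard_module(module_file, prefix):
    """Hypothetical heuristic: treat a module as 'standard' (pickle it as a
    reference) when its file lives under the given interpreter prefix."""
    return module_file.startswith(prefix)


# Path of the doctest module as reported in the trace above.
doctest_file = "/usr/lib/python2.7/doctest.pyc"

# Outside a virtualenv the prefix is typically the system one.
print(looks_like_standard_module(doctest_file, "/usr"))

# Inside a virtualenv the prefix points at the environment (assumed path),
# so a stdlib module no longer matches and would be pickled in full.
print(looks_like_standard_module(doctest_file, "/home/mark/.virtualenvs/libhub"))

# The value on your own interpreter, for comparison.
print(sys.prefix)
```

If the check misses, the serializer would walk the whole module dict, which would explain the M1 lines for doctest and unittest in the failing trace.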

@distobj
Author

distobj commented Jun 24, 2014

Yes, sys.prefix was set to my virtualenv home. I've just tried it outside a virtualenv, and all works as expected.

I tried the memorise branch of your fork, and it does indeed work inside virtualenvs too. Thanks! I'll close this then, on the assumption that the memorise branch will make it to dill master.

@distobj distobj closed this as completed Jun 24, 2014
@matsjoyce
Contributor

OK, hopefully it'll be soon, as it may solve quite a few issues like this.

@mmckerns
Member

@distobj: the memorize branch should make it in eventually; there are some speed and other issues that I'm concerned about, but they'll get worked out. Actually, I wanted to point out that dill has a few different variants on serialization that can be leveraged in cases where the vanilla dill.dumps has difficulty. For example, in some cases when I do hierarchical parallel computing (say multiprocessing + MPI + cluster nodes), I often serialize to source with dill.source.importable or dill.temp.dump_source. There are other tweaks and variants as well. Please post an issue (or ping me) for anything that's biting you… especially if it's something that works in cloudpickle and not in dill. I'd be happy to see that gap close to zero. Also, a number of projects that use dill abstract the serializer, so they can try to swap in cloudpickle or whatever if needed in a pinch for a particular case.
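
As a sketch of what abstracting the serializer might look like: the helper names `load_serializers` and `dumps_with_fallback` are hypothetical and not taken from any particular project. The only assumption is that each candidate module exposes a pickle-style `dumps`, as pickle, dill, and cloudpickle do.

```python
import importlib
import pickle


def load_serializers(names=("dill", "cloudpickle", "pickle")):
    """Import whichever of the named serializer modules are installed."""
    found = []
    for name in names:
        try:
            found.append(importlib.import_module(name))
        except ImportError:
            continue  # optional dependency is not installed
    return found


def dumps_with_fallback(obj, serializers):
    """Try each serializer's dumps() in order; return (module name, bytes)."""
    failures = []
    for module in serializers:
        try:
            return module.__name__, module.dumps(obj)
        except Exception as exc:  # serializers raise a variety of error types
            failures.append("%s: %r" % (module.__name__, exc))
    raise pickle.PicklingError("no serializer succeeded: " + "; ".join(failures))


# Uses the first installed serializer that can handle the object.
name, payload = dumps_with_fallback({"answer": 42}, load_serializers())
print(name, pickle.loads(payload))
```

The order of the names sets the preference, so a project could put dill first and only reach for another serializer when dill raises on a particular object.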

@matsjoyce
Contributor

I'll give the speed problem another go. Shall I make the memorise branch into a PR, to make comparison easier?

@mmckerns
Member

@matsjoyce: I expect it will become a PR and get merged at some point. If you want to make it a PR now, then go ahead.
