Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decoding KNeighboursClassifier model from Sklearn fails #192

Closed
mohitbadwal opened this issue Feb 9, 2018 · 9 comments
Closed

Decoding KNeighboursClassifier model from Sklearn fails #192

mohitbadwal opened this issue Feb 9, 2018 · 9 comments

Comments

@mohitbadwal
Copy link

I upgraded to version 0.96 after following this issue #147 and KNeibhoursClassifier model still doesn't decode. Here's what I tried-

    import json
    import jsonpickle
    import jsonpickle.ext.numpy as jsonpickle_numpy
    rfc = KNeighborsClassifier()
    rfc.fit(X,Y)


    jsonpickle_numpy.register_handlers()
    p = jsonpickle.pickler.Pickler()
    u = jsonpickle.unpickler.Unpickler()
    j = jsonpickle.json.dumps(p.flatten(rfc))
    json.dump(j, open(pickle_path, 'w'))
    j = json.load(open(pickle_path, 'r'))
    rfc = u.restore(jsonpickle.json.loads(j))

and the error-
TypeError: 'dict' object is not callable

What should be done ?

@davvid
Copy link
Member

davvid commented Feb 10, 2018

Thanks for the report, much appreciated.

What is X and Y in your example above? I'm only a numpy dabbler, so this example looks incomplete to me, but the folks who wrote most of the numpy extension will hopefully see this issue and help out too.

Mentioning @EelcoHoogendoorn and @marscher since they're much more familiar with the numpy extension, and probably a lot smarter than me in all things numpy too ;-)

Instead of pasting the problem code into this issue, can you please add your testcase to the tests/numpy_test.py file? That'll make this process a million times easier, and it'll make it very clear what we need to fix this issue. The easiest way to run the test suite is to run make test in a virtualenv that has requirements-dev.txt installed.

With the given example above, the calling sequence might have a buglet. I'll try to walk through it here below.

This first part looks fairly boilerplate. We could use the top-level jsonpickle.loads/dumps functions but this is using the lower-level API with picklers and unpicklers instead:

jsonpickle_numpy.register_handlers()
p = jsonpickle.pickler.Pickler()
u = jsonpickle.unpickler.Unpickler()

At this point we have a pickler and unpickler. Next we create a json string j:

j = jsonpickle.json.dumps(p.flatten(rfc))

The following line looks like a buglet:

json.dump(j, open(pickle_path, 'w'))

The buglet above is we're using the json module to dump a string that's already json-encoded. I think this string should be written to disk directly, either with a raw filehandle, or by using json.dump on the result of p.flatten() rather than the result of json.dumps(p.flatten(...)). This line might explain part of the issue, but technically it actually should be okay because...

j = json.load(open(pickle_path, 'r'))

The line above is now undoing the double json encoding above. Technically that should work, but it is kinda confusing. Did you truly intend to double-json encode the result from jsonpickle?

That said, this should work because the double-encoded json string gets restored by the call to json.load(...) which is then handed back to jsonpickle.json.loads(j), but just wanted to mention the nested json encoding that's going on.

I suspect that we could simplify that part of the test case down to something simpler like this below:

json_str = jsonpickle.dumps(rfc)
rfc_clone = jsonpickle.loads(json_str)

I only mention this because it doesn't look like the raw dict is used in a way that needs explicit picklers and unpicklers, but perhaps they're used by other parts of the code that's not relevant to this issue.

Let me know about getting a complete test case (regarding the unknown X and Y variables above) and hopefully we can make the numpy support even more complete.

The best part about adding the test case to numpy_test.py is that this use case will be forever in our test suite. That's a pretty strong guarantee that the use case will keep working in the future.

@marscher
Copy link
Contributor

hi, which line is actually failing (please show us the stack trace). This error message is popping up, if there is an unpickle-able type involved. By not-pickleable I mean, there is no get/setstate, reduce etc. protocol plus it is not a primitive type. Since you have registered the numpy handlers, this should actually work.

When I used jsonpickle, I have written my own handler, so I can not really tell how good it is (eg. does it support exotic dtypes, like structured arrays etc.?).

Are you familiar with pdb? You can wrap your script in an try: .... except block and within except call: pdb.post_mortem() to gain insight which object is causing this TypeError.
If you need more info on this: https://docs.python.org/2/library/pdb.html#pdb.post_mortem

@mohitbadwal
Copy link
Author

Here's the stacktrace

Traceback (most recent call last):
  File "D:/backup/PycharmProjects/test/ML/ferrero_work.py", line 115, in <module>
    rfc = u.restore(jsonpickle.json.loads(j))
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 138, in restore
    value = self._restore(obj)
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 192, in _restore
    return restore(obj)
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 304, in _restore_object
    return self._restore_object_instance(obj, cls)
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 369, in _restore_object_instance
    return self._restore_object_instance_variables(obj, instance)
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 431, in _restore_object_instance_variables
    instance = self._restore_state(obj, instance)
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 436, in _restore_state
    state = self._restore(obj[tags.STATE])
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 192, in _restore
    return restore(obj)
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 491, in _restore_dict
    data[k] = self._restore(v)
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 192, in _restore
    return restore(obj)
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 213, in _restore_reduce
    reduce_tuple = f, args, state, listitems, dictitems = map(self._restore, reduce_val)
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 192, in _restore
    return restore(obj)
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 476, in _restore_tuple
    return tuple([self._restore(v) for v in obj[tags.TUPLE]])
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 476, in <listcomp>
    return tuple([self._restore(v) for v in obj[tags.TUPLE]])
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 192, in _restore
    return restore(obj)
  File "C:\Users\mohit.badwal.NOTEBOOK546.000\Anaconda3\lib\site-packages\jsonpickle\unpickler.py", line 222, in _restore_reduce
    stage1 = f(*args)
TypeError: 'dict' object is not callable

@oldeucryptoboi
Copy link

oldeucryptoboi commented Apr 23, 2018

I just hit that bug. I was able to narrow it down where it crashed. 100% reproducible.

class Foo:
    class Bar(Enum):
        WTF = 1

jsonpickle.decode(jsonpickle.encode(Foo.Bar.WTF))

  File "/.virtualenvs/pywiktionary/lib/python3.6/site-packages/jsonpickle/__init__.py", line 155, in decode
    return unpickler.decode(string, backend=backend, keys=keys, classes=classes)
  File "/.virtualenvs/pywiktionary/lib/python3.6/site-packages/jsonpickle/unpickler.py", line 26, in decode
    return context.restore(backend.decode(string), reset=reset, classes=classes)
  File "/.virtualenvs/pywiktionary/lib/python3.6/site-packages/jsonpickle/unpickler.py", line 138, in restore
    value = self._restore(obj)
  File "/.virtualenvs/pywiktionary/lib/python3.6/site-packages/jsonpickle/unpickler.py", line 192, in _restore
    return restore(obj)
  File "/.virtualenvs/pywiktionary/lib/python3.6/site-packages/jsonpickle/unpickler.py", line 304, in _restore_object
    return self._restore_object_instance(obj, cls)
  File "/.virtualenvs/pywiktionary/lib/python3.6/site-packages/jsonpickle/unpickler.py", line 369, in _restore_object_instance
    return self._restore_object_instance_variables(obj, instance)
  File "/.virtualenvs/pywiktionary/lib/python3.6/site-packages/jsonpickle/unpickler.py", line 419, in _restore_object_instance_variables
    instance = self._restore_from_dict(obj, instance)
  File "/.virtualenvs/pywiktionary/lib/python3.6/site-packages/jsonpickle/unpickler.py", line 387, in _restore_from_dict
    value = self._restore(v)
  File "/.virtualenvs/pywiktionary/lib/python3.6/site-packages/jsonpickle/unpickler.py", line 192, in _restore
    return restore(obj)
  File "/.virtualenvs/pywiktionary/lib/python3.6/site-packages/jsonpickle/unpickler.py", line 222, in _restore_reduce
    stage1 = f(*args)
TypeError: 'dict' object is not callable

@dhruvsaraiya
Copy link

dhruvsaraiya commented May 21, 2018

I am facing same issue while doing something like this

json_kd = jsonpickle.encode(kdtree)
decoded = jsonpickle.decode(json_kd, classes=cKDTree)
and getting error as
TypeError: 'dict' object is not callable

in decoding

@davvid please help

@edjubuh
Copy link

edjubuh commented Jun 10, 2018

I can reproduce TypeError: 'dict' object is not callable with the following code:

import jsonpickle
from enum import Enum

class WrapperClass(object):
    class myenum(Enum):
        Hello = 1

mydict = {
    WrapperClass.myenum.Hello: 'test'
}

print(mydict)
print(jsonpickle.dumps(mydict, keys=True))
print(jsonpickle.loads(jsonpickle.dumps(mydict, keys=True), keys=True))

I don't know if this is the same issue @mohitbadwal is noticing with numpy.

@kapilkd13
Copy link

kapilkd13 commented Jun 13, 2018

Hi, I am also facing same issue, interestingly if I switch to jsonpickle=0.7.2 everything works fine.
error log

loaders.py:151: in load
    unpickled = json_decode(json_string)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/__init__.py:155: in decode
    return unpickler.decode(string, backend=backend, keys=keys, classes=classes)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:26: in decode
    return context.restore(backend.decode(string), reset=reset, classes=classes)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:138: in restore
    value = self._restore(obj)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:192: in _restore
    return restore(obj)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:304: in _restore_object
    return self._restore_object_instance(obj, cls)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:369: in _restore_object_instance
    return self._restore_object_instance_variables(obj, instance)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:419: in _restore_object_instance_variables
    instance = self._restore_from_dict(obj, instance)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:387: in _restore_from_dict
    value = self._restore(v)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:192: in _restore
    return restore(obj)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:466: in _restore_list
    children = [self._restore(v) for v in obj]
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:466: in <listcomp>
    children = [self._restore(v) for v in obj]
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:192: in _restore
    return restore(obj)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:304: in _restore_object
    return self._restore_object_instance(obj, cls)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:369: in _restore_object_instance
    return self._restore_object_instance_variables(obj, instance)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:419: in _restore_object_instance_variables
    instance = self._restore_from_dict(obj, instance)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:387: in _restore_from_dict
    value = self._restore(v)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:192: in _restore
    return restore(obj)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:304: in _restore_object
    return self._restore_object_instance(obj, cls)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:369: in _restore_object_instance
    return self._restore_object_instance_variables(obj, instance)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:419: in _restore_object_instance_variables
    instance = self._restore_from_dict(obj, instance)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:387: in _restore_from_dict
    value = self._restore(v)
../../../anaconda3/envs/test-libneuro-conda/lib/python3.6/site-packages/jsonpickle/unpickler.py:192: in _restore
    return restore(obj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <jsonpickle.unpickler.Unpickler object at 0x7f65e924f390>
obj = {'py/reduce': [{'py/object': '__builtin__.builtin_function_or_method'}, {'py/tuple': [{'py/type': 'numpy.ndarray'}, {'...AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=\n'}]}, None, None]}

    def _restore_reduce(self, obj):
        """
            Supports restoring with all elements of __reduce__ as per pep 307.
            Assumes that iterator items (the last two) are represented as lists
            as per pickler implementation.
            """
        proxy = _Proxy()
        self._mkref(proxy)
        reduce_val = obj[tags.REDUCE]
        reduce_tuple = f, args, state, listitems, dictitems = map(self._restore, reduce_val)
    
        if f == tags.NEWOBJ or getattr(f, '__name__', '') == '__newobj__':
            # mandated special case
            cls = args[0]
            if not isinstance(cls, type):
                cls = self._restore(cls)
            stage1 = cls.__new__(cls, *args[1:])
        else:
>           stage1 = f(*args)
E           TypeError: 'dict' object is not callable

@mstimberg
Copy link
Contributor

There seem to be at least two separate issues in this thread (and both might be different from the OP's issue...).
For the examples provided by @ldesegur and @edjubuh, the problem is that jsonpickle does not automatically handle nested class definitions, the class has to be given explicitly as an argument to classes when decoding. For the example that @kapilkd13 provided, the problem is that jsonpickle does not correctly handle builtin functions.

Here's another test case that shows the root of the first problem:

import jsonpickle

class Outer(object):
    class Inner(object):
        pass

print(jsonpickle.encode(Outer.Inner))

The output is {"py/type": "__main__.Inner"}>, showing that jsonpickle did not correctly encode the class name. This is because the importable_name function in jsonpickle.util assembles the name from the module name (__main__) and the class name (Inner), which does not include the name of the outer class. This is easily fixable for Python >=3.3 which provides an attribute __qualname__ to get the fully qualified name, but unfortunately needs complex codes that inspects the code file (see e.g. https://github.com/wbolster/qualname). For Python >=3.3, this is the necessary change:

diff --git a/jsonpickle/util.py b/jsonpickle/util.py
index 8566450..dff7dac 100644
--- a/jsonpickle/util.py
+++ b/jsonpickle/util.py
@@ -507,7 +507,7 @@ def importable_name(cls):
     True
 
     """
-    name = cls.__name__
+    name = cls.__qualname__
     module = translate_module_name(cls.__module__)
     return '%s.%s' % (module, name)

print(jsonpickle.encode(Outer.Inner)) then correctly prints {"py/type": "__main__.Outer.Inner"}.

However, this does not fix the decoding, because the decoding code assumes that only the last part of the "importable name" is a class name, and the rest is a module; in this case, it would therefore try to import the Inner class from the __main__.Outer module which obviously fails.

It might be cleaner to store the module and class information separately in a tuple, but the following change makes it simply try all options:

diff --git a/jsonpickle/unpickler.py b/jsonpickle/unpickler.py
index 283d54b..06864e9 100644
--- a/jsonpickle/unpickler.py
+++ b/jsonpickle/unpickler.py
@@ -606,13 +606,21 @@ def loadclass(module_and_name, classes=None):
         except KeyError:
             pass
     # Otherwise, load classes from globally-accessible imports
-    try:
-        module, name = module_and_name.rsplit('.', 1)
-        module = util.untranslate_module_name(module)
-        __import__(module)
-        return getattr(sys.modules[module], name)
-    except (AttributeError, ImportError, ValueError):
-        return None
+    names = module_and_name.split('.')
+    # First assume that everything up to the last dot is the module name,
+    # then try other splits to handle classes that are defined within
+    # classes
+    for up_to in range(len(names)-1, 0, -1):
+        module = util.untranslate_module_name('.'.join(names[:up_to]))
+        try:
+            __import__(module)
+            obj = sys.modules[module]
+            for class_name in names[up_to:]:
+                obj = getattr(obj, class_name)
+            return obj
+        except (AttributeError, ImportError, ValueError) as ex:
+            continue
+    return None

With these changes, my example provided above and the examples provided by @ldesegur and @edjubuh work fine.

For the second issue, the problem is relatively obvious, the output of this code:

import jsonpickle

print(jsonpickle.encode(dir))

is {"py/object": "__builtin__.builtin_function_or_method"} which is of course impossible to convert back to the dir function.

Here are the changes that seem to fix this issue:

diff --git a/jsonpickle/util.py b/jsonpickle/util.py
index dff7dac..ed6bd8d 100644
--- a/jsonpickle/util.py
+++ b/jsonpickle/util.py
@@ -110,7 +110,8 @@ def is_object(obj):
     False
     """
     return (isinstance(obj, object) and
-            not isinstance(obj, (type, types.FunctionType)))
+            not isinstance(obj, (type, types.FunctionType,
+                                 types.BuiltinFunctionType)))
 
 
 def is_primitive(obj):
@@ -277,7 +278,7 @@ def is_module_function(obj):
     """
 
     return (hasattr(obj, '__class__') and
-            isinstance(obj, types.FunctionType) and
+            isinstance(obj, (types.FunctionType, types.BuiltinFunctionType)) and
             hasattr(obj, '__module__') and
             hasattr(obj, '__name__') and
             obj.__name__ != '<lambda>')

The above example now prints {"py/function": "__builtin__.dir"} which can be converted back to dir with jsonpickle.decode.

I'd be happy to provide these changes as a pull request (adding/adapting the tests), but I'm unsure what to do about the fact that __qualname__ only exists in Python versions starting at 3.3. Also, is it a backward-compatibility issue if the output for nested classes is now different (e.g. test_can_serialize_inner_classes fails with my changes)?

@davvid
Copy link
Member

davvid commented Jun 15, 2018

This sounds great, yeah I'd totally merge these changes.

Regarding the __qualname__, my suggestion would be to check hasattr(cls, '__qualname__')) before using it, and when it's not available fallback to __name__ as before. That way we're backwards compatible, and folks using newer stuff can benefit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants