Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests/pandas_test.py::test_timedelta_index_roundtrip fails when pandas are built with Cython-3 #460

Closed
mgorny opened this issue Aug 14, 2023 · 7 comments

Comments

@mgorny
Copy link
Contributor

mgorny commented Aug 14, 2023

When pandas (tested with 2.0.3) are built against Cython 3.0.0, I'm getting the following test failure:

___________________________________________________ test_timedelta_index_roundtrip ____________________________________________________

    def test_timedelta_index_roundtrip():
        idx = pd.timedelta_range(start='1 day', periods=4, closed='right')
>       decoded_idx = roundtrip(idx)

tests/pandas_test.py:124: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/pandas_test.py:32: in roundtrip
    return jsonpickle.decode(jsonpickle.encode(obj))
jsonpickle/unpickler.py:88: in decode
    return context.restore(data, reset=reset, classes=classes)
jsonpickle/unpickler.py:362: in restore
    value = self._restore(obj)
jsonpickle/unpickler.py:344: in _restore
    return restore(obj)
jsonpickle/unpickler.py:769: in _restore_object
    instance = handler(self).restore(obj)
jsonpickle/ext/pandas.py:172: in restore
    idx = self.index_constructor(decode(buf), dtype=dtype, **name_bundle)
jsonpickle/unpickler.py:88: in decode
    return context.restore(data, reset=reset, classes=classes)
jsonpickle/unpickler.py:362: in restore
    value = self._restore(obj)
jsonpickle/unpickler.py:344: in _restore
    return restore(obj)
jsonpickle/unpickler.py:427: in _restore_list
    children = [self._restore(v) for v in obj]
jsonpickle/unpickler.py:427: in <listcomp>
    children = [self._restore(v) for v in obj]
jsonpickle/unpickler.py:344: in _restore
    return restore(obj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <jsonpickle.unpickler.Unpickler object at 0x7f1f1a5b2bf0>
obj = {'py/reduce': [{'py/object': '_cython_3_0_0.cython_function_or_method'}, {'py/tuple': [172800000000000, 10]}]}

    def _restore_reduce(self, obj):
        """
        Supports restoring with all elements of __reduce__ as per pep 307.
        Assumes that iterator items (the last two) are represented as lists
        as per pickler implementation.
        """
        proxy = _Proxy()
        self._mkref(proxy)
        reduce_val = list(map(self._restore, obj[tags.REDUCE]))
        if len(reduce_val) < 5:
            reduce_val.extend([None] * (5 - len(reduce_val)))
        f, args, state, listitems, dictitems = reduce_val
    
        if f == tags.NEWOBJ or getattr(f, '__name__', '') == '__newobj__':
            # mandated special case
            cls = args[0]
            if not isinstance(cls, type):
                cls = self._restore(cls)
            stage1 = cls.__new__(cls, *args[1:])
        else:
>           stage1 = f(*args)
E           TypeError: 'dict' object is not callable

jsonpickle/unpickler.py:472: TypeError

This is affecting Linux distributions that are building their packages against Cython 3.0.0. To reproduce in a venv:

pip install Cython versioneer
pip install --force --no-build-isolation --no-binary pandas pandas
python -m pytest -vv tests/pandas_test.py::test_timedelta_index_roundtrip
@Theelx
Copy link
Contributor

Theelx commented Aug 14, 2023

Got it. I'll try and fix it tonight but I can't guarantee it'll be done. Is there a deadline by which any Linux distros need a working version? Also, is pinning Cython to <3 a feasible temporary solution?

@mgorny
Copy link
Contributor Author

mgorny commented Aug 15, 2023

Is there a deadline by which any Linux distros need a working version?

Not really but numpy released an RC yesterday that requires Cython>=3.0, so it would be nice to fix it sooner than later. That said, I don't think this functionality is really critical right now, so there's no explicit hurry.

Also, is pinning Cython to <3 a feasible temporary solution?

We'd actually have to pin pandas on our end. Please don't add pins to jsonpickle because that won't affect pandas at all.

@Theelx
Copy link
Contributor

Theelx commented Sep 1, 2023

I've been investigating this, and it seems like it could be related to this commit (though I'm not 100% sure). The dict in the error you got is {'py/object': '_cython_3_0_2.cython_function_or_method'}, and it's created here. I'm not entirely sure how to fix it, but I'm going to start by trying to detect cython functions when encoding.

@bnavigator
Copy link

We'd actually have to pin pandas on our end. Please don't add pins to jsonpickle because that won't affect pandas at all.

Pandas upstream actually still pins to Cython <3 in its released/tagged versions, but has moved on to Cython 3 in the development branch.

@QuLogic
Copy link

QuLogic commented Feb 16, 2024

Pandas is Cython 3 in 2.2.0, so this test fails if that version is installed (which is what we are attempting to upgrade to in Fedora ATM.)

davvid added a commit to davvid/jsonpickle that referenced this issue Feb 16, 2024
Remove the Cython "py/ref" avoidance by recording ref entries when
unpickling "py/function" entities. Update the unpickler to do
record ref entries for "py/function" so that we can remove the
hacky self._objs handling for Cython functions in the pickler.

Related-to: jsonpickle#460
@davvid
Copy link
Member

davvid commented Feb 16, 2024

I just got the tests passing locally. @Theelx I suspect you're busy but if you have a moment to review it'd be much appreciated. @QuLogic if you could test #477 on your side that'd be much appreciated.

davvid added a commit to davvid/jsonpickle that referenced this issue Feb 16, 2024
Cython v3 produces functions that pass the util.is_object() tests.
This complicated the unpickler because it needed to start tracking
refs for functions, which it never needed to do before.

Expand the meaning of is_module_function() so that it includes cython
functions and use this check earlier in the pickling process.
This makes the pickler handle Cython functions before they ever make
it into the "if util.is_object(obj):" code path.

These changes restore the pickler's original semantics by ensuring
that functions are excluded from the referencing machinery completely.

Related-to: jsonpickle#460
@davvid davvid closed this as completed in 78b7ce7 Feb 20, 2024
davvid added a commit that referenced this issue Feb 20, 2024
* davvid/cython-functions:
  CHANGES: Pandas and Cython 3.0 compatibility
  pickler/unpickler: pickle functions and Cython functions earlier
  pickler/unpickler: record ref entries for functions
  pickler: flatten Cython functions using _flatten_function()

Closes: #460
@davvid
Copy link
Member

davvid commented Feb 20, 2024

Thanks for testing! I went ahead and deployed jsonpickle v3.0.3 to pypi with this fix: https://pypi.org/project/jsonpickle/3.0.3/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants