[Serialization] Fix reference counting of numpy arrays created in custom serializer #17703

Closed
asfimport opened this issue Oct 20, 2017 · 1 comment

Comments

@asfimport

The problem happens with the following code:

import numpy as np
import pyarrow

class Bar(object):
    pass

def bar_custom_serializer(obj):
    # The array is created here, so the return value is its only reference.
    x = np.zeros(4)
    return x

def bar_custom_deserializer(serialized_obj):
    return serialized_obj

pyarrow._default_serialization_context.register_type(
    Bar, "Bar", pickle=False,
    custom_serializer=bar_custom_serializer,
    custom_deserializer=bar_custom_deserializer)

# Crashes the interpreter during garbage collection.
pyarrow.serialize(Bar())

After execution of pyarrow.serialize, the interpreter crashes in the garbage collection routine.

This happens when the custom serializer returns a numpy array that nothing else references. It has not been a problem in the current code because, so far, no custom serializer has created a new numpy array.

I think the problem is that the numpy array's reference count drops to zero between the end of SerializeSequences in python_to_arrow.cc and the call to NdarrayToTensor. I'll push a fix later today that increments and decrements the reference count at the appropriate places.
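To illustrate the suspected lifetime issue without touching the C++ code, here is a minimal pure-Python sketch. It is not the actual fix and does not use pyarrow; `FakeArray`, `custom_serializer`, `borrow_without_holding`, and `borrow_while_holding` are hypothetical names, and the weak reference only stands in for a borrowed pointer on the C++ side. It relies on CPython's immediate reference counting.

```python
import weakref


class FakeArray:
    """Hypothetical stand-in for a numpy array created in a serializer."""


def custom_serializer(obj):
    # The new object's only reference is the return value.
    return FakeArray()


def borrow_without_holding():
    # Only a weak reference is kept, which mimics a borrowed pointer:
    # the temporary result is released as soon as weakref.ref() returns.
    ref = weakref.ref(custom_serializer(None))
    # On CPython the object has already been destroyed here.
    return ref() is None


def borrow_while_holding():
    # Holding a strong reference plays the role of Py_INCREF.
    owner = custom_serializer(None)
    ref = weakref.ref(owner)
    alive_while_held = ref() is not None
    # Dropping the strong reference plays the role of Py_DECREF.
    del owner
    gone_after_release = ref() is None
    return alive_while_held, gone_after_release


if __name__ == "__main__":
    print(borrow_without_holding())
    print(borrow_while_holding())
```

The second function shows the shape of the proposed remedy: keep an owning reference for as long as the object is still needed, and release it only afterwards.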

Reporter: Philipp Moritz / @pcmoritz
Assignee: Philipp Moritz / @pcmoritz

Note: This issue was originally created as ARROW-1695. Please see the migration documentation for further details.

@asfimport

Philipp Moritz / @pcmoritz:
Issue resolved by pull request 1220
#1220