Labels: Component: Python, Status: stale-warning (issues and PRs flagged as stale which are due to be closed if no indication otherwise), Type: enhancement
Description
Describe the enhancement requested
Currently, if you try to do something like:
import pyarrow as pa
a = pa.array([None, None, None]).cast(pa.dictionary(pa.int8(), pa.null()))
b = pa.array([None, None, None]).cast(pa.dictionary(pa.int8(), pa.null()))
pa.chunked_array([a, b]).unify_dictionaries()
PyArrow will raise an exception like:
File pyarrow/table.pxi:1206, in pyarrow.lib.ChunkedArray.unify_dictionaries()
File pyarrow/error.pxi:154, in pyarrow.lib.pyarrow_internal_check_status()
File pyarrow/error.pxi:91, in pyarrow.lib.check_status()
ArrowNotImplementedError: Unification of null dictionaries is not implemented
Could this be implemented? Calling .combine_chunks() on the same chunked array works fine, so the logic for merging null dictionaries presumably already exists somewhere:
a = pa.array([None, None, None]).cast(pa.dictionary(pa.int8(), pa.null()))
b = pa.array([None, None, None]).cast(pa.dictionary(pa.int8(), pa.null()))
pa.chunked_array([a, b]).combine_chunks()
I admit this might be a very niche feature (I'm not sure why I have Feather files with these dictionary-encoded nulls, but I do, and it'd be nice to be able to handle them).
Versions
I am testing this on Python 3.10 and pyarrow 15.
Component(s)
Python