[C++] Cast between fixed size list type and variable size list #20086

Closed
asfimport opened this issue Jan 25, 2022 · 7 comments · Fixed by #37292

Comments

@asfimport

Casting from fixed size list to variable size list could be possible, I think, but currently doesn't work:

>>> fixed_size = pa.array([[1, 2], [3, 4]], type=pa.list_(pa.int64(), 2))
>>> fixed_size.cast(pa.list_(pa.int64()))
...
ArrowNotImplementedError: Unsupported cast from fixed_size_list<item: int64>[2] to list using function cast_list

And in principle, a cast the other way around could also be possible if it is checked that each list has the correct length.
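To make the two directions concrete, here is a minimal pure-Python sketch of the layout logic involved. It is not Arrow's implementation: the helper names `fixed_to_variable` and `variable_to_fixed` are hypothetical, and the sketch assumes there are no null entries. A fixed size list stores only flat child values plus a list size, while a variable size list also carries an offsets sequence, so one direction only has to generate offsets and the other has to check every list length.

```python
from typing import List, Sequence, Tuple


def fixed_to_variable(values: Sequence[int], list_size: int) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: derive list offsets for flat fixed-size-list values.

    Assumes no nulls. Every list has exactly `list_size` items, so the
    offsets are simply multiples of `list_size`.
    """
    if list_size <= 0 or len(values) % list_size != 0:
        raise ValueError("values length must be a multiple of a positive list_size")
    n_lists = len(values) // list_size
    offsets = [i * list_size for i in range(n_lists + 1)]
    return offsets, list(values)


def variable_to_fixed(offsets: Sequence[int], values: Sequence[int], list_size: int) -> List[int]:
    """Hypothetical helper: check every list length, then return the flat values.

    Assumes no nulls and a non-empty offsets sequence (n lists -> n + 1 offsets).
    """
    if not offsets:
        raise ValueError("offsets must contain at least one entry")
    for i in range(len(offsets) - 1):
        length = offsets[i + 1] - offsets[i]
        if length != list_size:
            raise ValueError(f"list at index {i} has length {length}, expected {list_size}")
    # Slice in case the list array does not start at the beginning of the values.
    return list(values[offsets[0]:offsets[-1]])


# [[1, 2], [3, 4]] as a fixed size list of size 2
offsets, flat = fixed_to_variable([1, 2, 3, 4], list_size=2)
round_trip = variable_to_fixed(offsets, flat, list_size=2)
```

The reverse direction is the only one that can fail on valid input, which is why the length check sits there.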

Reporter: Joris Van den Bossche / @jorisvandenbossche

PRs and other links:

Note: This issue was originally created as ARROW-15455. Please see the migration documentation for further details.

@asfimport

Eduardo Ponce / @edponce:
Also need to consider the scalar cases:

>>> fixed_size = pa.scalar([1, 2], type=pa.list_(pa.int64(), 2))
>>> fixed_size.cast(pa.list_(pa.int64()))
...
pyarrow.lib.ArrowNotImplementedError: casting scalars of type fixed_size_list<item: int64>[2] to type list<item: int64>

@asfimport

Eduardo Ponce / @edponce:
Interesting to note that casting FixedSizeList to FixedSizeList works with Array but not with Scalar. The same behavior is observed for ListType.

# Works with Array
>>> fixed_size = pa.array([[1, 2]], type=pa.list_(pa.int64(), 2))
>>> fixed_size.cast(pa.list_(pa.int64(), 2))
<pyarrow.lib.FixedSizeListArray object at 0x1059e9f00>
[[1, 2]]

# Does not work with Scalar
>>> fixed_size = pa.scalar([1, 2], type=pa.list_(pa.int64(), 2))
>>> fixed_size.cast(pa.list_(pa.int64(), 2))
pyarrow.lib.ArrowNotImplementedError: casting scalars of type fixed_size_list<item: int64>[2] to type fixed_size_list<item: int64>[2]

@asfimport

Jabari Booker / @JabariBooker:
It seems that you can't cast any lists with Scalar:

# For Lists
>>> arrow_list = pa.scalar([1, 2])
>>> arrow_list.cast(pa.list_(pa.int64()))
pyarrow.lib.ArrowNotImplementedError: casting scalars of type list<item: int64> to type list<item: int64>

# For LargeLists
>>> arrow_list = pa.scalar([1, 2], pa.large_list(pa.int64()))
>>> arrow_list.cast(pa.large_list(pa.int64()))
pyarrow.lib.ArrowNotImplementedError: casting scalars of type large_list<item: int64> to type large_list<item: int64>

@asfimport

Todd Farmer / @toddfarmer:
This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

@asfimport

Jabari Booker / @JabariBooker:
Because the compute component of Arrow is not part of core, we cannot call Cast() from Scalar::CastTo(). Therefore, we would need to reimplement casting to get Scalar::CastTo() to work between list-type scalars. This only impacts scalar.cast() in PyArrow. I was wondering whether this effort is worth it, considering you can simply use pc.cast() anyway.

@asfimport

David Li / @lidavidm:
It could be #ifdef'd. It would always work in Python since PyArrow requires ARROW_COMPUTE.

@asfimport

Apache Arrow JIRA Bot:
This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per project policy. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

@wjones127 wjones127 self-assigned this Aug 22, 2023
wjones127 added a commit that referenced this issue Aug 23, 2023
### Rationale for this change

These list types should be able to cast to each other. In our code base, we find users might naively make a list array (for example, with `pa.Table.from_pylist()`) and want it automatically cast to a tensor array (which is based on a fixed size list).

### What changes are included in this PR?

Adds casts from FSL -> (Large)List and (Large)List -> FSL.

### Are these changes tested?

The kernels are tested.

### Are there any user-facing changes?

Just a new kernel.
* Closes: #20086

Authored-by: Will Jones <willjones127@gmail.com>
Signed-off-by: Will Jones <willjones127@gmail.com>
@wjones127 wjones127 added this to the 14.0.0 milestone Aug 23, 2023
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
…apache#37292)
