[C++] Allow automatic String -> LargeString promotions when concatenating tables #23539

Open
asfimport opened this issue Nov 22, 2019 · 2 comments

asfimport commented Nov 22, 2019

Inspired by GitHub issue #5874.

Reporter: Wes McKinney / @wesm

Related issues:

Note: This issue was originally created as ARROW-7245. Please see the migration documentation for further details.

Wes McKinney / @wesm:
Perhaps Concatenate can be reimplemented as a vector kernel so that type promotions are handled by the kernel execution machinery.

Clark Zinzow:
+1 to having Concatenate go through the type promotion logic in the compute layer. I'm running into a similar issue when concatenating tables whose columns have different numeric types that can certainly be promoted to a common numeric type. I'm currently working around this in application code: before concatenating the tables, I manually promote each column to the common dtype for that column across all tables (mimicking Arrow's internal type promotion logic).

Here is a pyarrow MWE:

In [1]: t1 = pa.table({"a": pa.array([1, 2], type=pa.int16())})

In [2]: t2 = pa.table({"a": pa.array([3, 4], type=pa.int64())})

In [3]: pa.concat_tables([t1, t2])
---------------------------------------------------------------------------
ArrowInvalid                              Traceback (most recent call last)
<ipython-input-91-40afec1155a5> in <module>
----> 1 pa.concat_tables([t1, t2])

~/.local/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.concat_tables()

~/.local/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.pyarrow_internal_check_status()

~/.local/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowInvalid: Schema at index 1 was different:
a: int16
vs
a: int64
