[C++] Regression with compute utf8_*trim functions on macOS. #29176

Closed
asfimport opened this issue Aug 2, 2021 · 9 comments

Comments

@asfimport

import pyarrow as pa
import pyarrow.compute as pc

arr = pa.array(["ab", "ac"])
r = pc.utf8_ltrim(arr, characters="a")
assert r.to_pylist() == ["b", "c"], r
r = pc.utf8_rtrim(arr, characters="b")
assert r.to_pylist() == ["a", "ac"], r

Seems to go awry after the first match.

Environment: macOS 11.5
Reporter: A. Coady / @coady
Assignee: Antoine Pitrou / @pitrou

Note: This issue was originally created as ARROW-13522. Please see the migration documentation for further details.

@asfimport

Antoine Pitrou / @pitrou:
I seem to get the correct results here:

>>> import pyarrow as pa, pyarrow.compute as pc
>>> arr = pa.array(["ab", "ac"])
>>> pc.utf8_ltrim(arr, characters="a").to_pylist()
['b', 'c']
>>> pc.utf8_rtrim(arr, characters="b").to_pylist()
['a', 'ac']

Are you expecting something else?

@asfimport

A. Coady / @coady:
It didn't reproduce on Ubuntu; it seems to be macOS-only. I've also updated the example.

  • macOS 11.5.1
  • pyarrow-5.0.0-cp39-cp39-macosx_10_13_x86_64.whl

@asfimport

Antoine Pitrou / @pitrou:
OK, it does seem to be completely broken on macOS :-(

In [1]: import pyarrow as pa

In [2]: import pyarrow.compute as pc

In [3]: arr = pa.array(["ab", "ac"])

In [5]: pc.utf8_ltrim(arr, characters="a")
Out[5]: 
<pyarrow.lib.StringArray object at 0x7fc9d08b3b40>
[
  "",
  ""
]

In [10]: pc.utf8_rtrim(arr, characters="b")
Out[10]: 
<pyarrow.lib.StringArray object at 0x7fc9d06bbb40>
[
  "a",
  "a"
]
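For comparison, here is a minimal sketch of what the two calls are expected to return, using Python's built-in `str.lstrip` / `str.rstrip` as a stand-in reference. The helper names `expected_ltrim` and `expected_rtrim` are hypothetical and not part of pyarrow; the sketch assumes that `characters` is treated as a set of characters to strip, which matches the assertions in the original report. The pyarrow comparison only runs if pyarrow is installed.

```python
def expected_ltrim(values, characters):
    """Hypothetical reference: strip leading characters found in `characters`."""
    return [None if v is None else v.lstrip(characters) for v in values]


def expected_rtrim(values, characters):
    """Hypothetical reference: strip trailing characters found in `characters`."""
    return [None if v is None else v.rstrip(characters) for v in values]


values = ["ab", "ac"]
ltrim_expected = expected_ltrim(values, "a")  # ['b', 'c']
rtrim_expected = expected_rtrim(values, "b")  # ['a', 'ac']

try:
    import pyarrow as pa
    import pyarrow.compute as pc
except ImportError:
    # pyarrow is optional here; without it only the reference values are computed.
    pa = pc = None

if pc is not None:
    arr = pa.array(values)
    ltrim_actual = pc.utf8_ltrim(arr, characters="a").to_pylist()
    rtrim_actual = pc.utf8_rtrim(arr, characters="b").to_pylist()
    # On an affected build these comparisons report a mismatch.
    print("utf8_ltrim matches reference:", ltrim_actual == ltrim_expected)
    print("utf8_rtrim matches reference:", rtrim_actual == rtrim_expected)
```

Against the macOS output shown above, both comparisons would come out false: `utf8_ltrim` returned `["", ""]` instead of `["b", "c"]`, and `utf8_rtrim` returned `["a", "a"]` instead of `["a", "ac"]`.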

@asfimport

Joris Van den Bossche / @jorisvandenbossche:
Is it a regression compared to 4.0?

@asfimport

Antoine Pitrou / @pitrou:
Don't know, but it would be nice to fix anyway.

@asfimport

A. Coady / @coady:
Yes, it worked in 4.0.

@asfimport

Antoine Pitrou / @pitrou:
Issue resolved by pull request 10853
#10853

@asfimport

Antoine Pitrou / @pitrou:
This should be fixed now. I encourage you to check using the nightly builds, for example tomorrow:

https://arrow.apache.org/docs/python/install.html#installing-nightly-packages

@asfimport

Antoine Pitrou / @pitrou:
I've checked that the issue is fixed on an Intel Mac using the latest nightly wheels.

@asfimport added this to the 6.0.0 milestone Jan 11, 2023