Alessandro Molina / @amol-: @jorisvandenbossche @kszucs I was able to reproduce the segfault using the provided test. I confirmed that the test reproduces the issue by reverting the changes in #11465, which triggered the segfault.
I also added an additional check, which I verified prevents the segfault by replacing it with a proper Invalid("Invalid mask type") error, to catch future regressions.
I couldn't find a way to trigger that error with the current codebase on master, so it is not covered by a test. That's because the current codebase converts anything that is not a numpy.array into one, so there is no way to end up in that situation normally. Ideally this is the kind of issue I would simulate by monkeypatching, but since everything runs within Cython I can't monkeypatch get_values. In any case, the check is there and should prevent us from reintroducing the same issue in the future.
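To illustrate the idea of the defensive check, here is a minimal pure-Python sketch. It is an analogy only, not the actual Arrow Cython code: `normalize_mask`, `default_get_values`, and `InvalidMaskError` are hypothetical names, and a plain list stands in for a numpy array. The converter is passed in as a parameter, which is one way to exercise the guard when monkeypatching is not possible.

```python
class InvalidMaskError(TypeError):
    """Hypothetical stand-in for the Invalid("Invalid mask type") error."""


def default_get_values(obj):
    # Hypothetical stand-in for the conversion step: any iterable becomes
    # a plain list, so the guard below is normally never reached.
    return list(obj)


def normalize_mask(mask, get_values=default_get_values):
    """Return the mask as a list of bools, or raise instead of crashing."""
    if mask is None:
        return None
    values = get_values(mask)
    # Defensive check: never hand an unexpected type to lower-level code.
    if not isinstance(values, list) or not all(
        isinstance(v, bool) for v in values
    ):
        raise InvalidMaskError("Invalid mask type")
    return values


# Normal path: the conversion succeeds and the guard passes.
print(normalize_mask((False, True)))

# Guard path: inject a converter that misbehaves, which is roughly what
# monkeypatching would simulate if the real code were not compiled.
try:
    normalize_mask([False], get_values=lambda obj: "not a mask")
except InvalidMaskError as exc:
    print("caught:", exc)
```

In this sketch the guard is testable only because the converter is injectable; in compiled code without such a hook, the check may remain uncovered, as described above.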
Alessandro Molina / @amol-:
PS: If you are wondering about the pandas.Series values, they are taken exactly from the pyspark test that triggered the segfault for me:
test_createDataFrame_column_name_encoding (pyspark.sql.tests.test_arrow.ArrowTests) ...
ARRAY 0    1
Name: a, dtype: int64 TYPE <class 'pandas.core.series.Series'>
---
MASK 0    False
Name: a, dtype: bool TYPE <class 'pandas.core.series.Series'>
Had test failures in pyspark.sql.tests.test_arrow ArrowTests with /Users/amol/ARROW/venv/bin/python3; see logs.
Cover the changes in #11465
cc @amol-
Reporter: Krisztian Szucs / @kszucs
Assignee: Alessandro Molina / @amol-
PRs and other links:
Note: This issue was originally created as ARROW-14388. Please see the migration documentation for further details.