[Python] Regression: segfault when reading hive table with v0.14 #16812
Comments
Neal Richardson / @nealrichardson:
H. Vetinari: I tried a couple of times before filing the report: all (~5) invocations on 0.14 crashed, and all invocations on 0.13 worked. The machine itself has plenty of memory, so I don't think that's the cause. I'm not sure I'll be able to pare this down to a minimal reproducing Parquet file, but I'll try.
Wes McKinney / @wesm:
H. Vetinari:
which, I believe, is due to the fact that gdb has not yet been built for Python 3.7. (Although, just as I was preparing this message, I triggered a rerender there, which prompted some further action and the first passing 3.7 build; it is not yet merged because 2.7 is failing.) In the meantime I tried downgrading my whole environment to 3.6, where the program also crashes or hangs on v0.14. However, I haven't yet been able to get any gdb output. I might need to read more of the GDB manual...
Wes McKinney / @wesm:
H. Vetinari: but I always get: Not sure if that's a mistake on my side or something in the setup/interplay of conda-gdb.
Wes McKinney / @wesm:
I'm working with pyarrow on a Cloudera cluster (CDH 6.1.1), with pyarrow installed in a conda env.
The data I'm reading is a Hive(-registered) table written as Parquet. With v0.13, reading this partitioned table does not cause any issues.
The code that worked before and now crashes with v0.14 is simply:
Since it completely crashes my notebook (or, in the REPL, the process ends with "Killed"), I cannot report much more, but this is a pretty severe usability restriction. So far the solution is to enforce
pyarrow<0.14
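For anyone applying the same pin, a minimal sketch of a runtime guard that checks whether the installed version satisfies `pyarrow<0.14`. The helper name `satisfies_pin` is hypothetical and not part of pyarrow; it only encodes the pin above and says nothing about which later releases may or may not be affected.

```python
import re


def satisfies_pin(version, upper=(0, 14)):
    """Return True if `version` is below the (major, minor) bound `upper`.

    Hypothetical helper for illustration; it mirrors the `pyarrow<0.14` pin.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) < upper


if __name__ == "__main__":
    try:
        import pyarrow
    except ImportError:
        print("pyarrow is not installed")
    else:
        ok = satisfies_pin(pyarrow.__version__)
        print(f"pyarrow {pyarrow.__version__}: pin satisfied = {ok}")
```

The pin itself is still best enforced at install time (for example in the conda environment spec); a check like this only helps fail early instead of crashing the notebook.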
Reporter: H. Vetinari
Note: This issue was originally created as ARROW-5965. Please see the migration documentation for further details.