[Python] Segfault when reading parquet files if pytorch is imported before pyarrow #2637

Closed
ostrokach opened this issue Sep 26, 2018 · 7 comments

ostrokach commented Sep 26, 2018

pyarrow (version 0.10.0) appears to crash sporadically with a segmentation fault when reading parquet files if it is used in a program where torch is imported first.

A self-contained example is available here: https://gitlab.com/ostrokach/pyarrow_pytorch_segfault.

Basically, running

python -X faulthandler -c "import torch; import pyarrow.parquet as pq; _ = pq.ParquetFile('example.parquet').read_row_group(0)"

sooner or later results in a segfault:

Fatal Python error: Segmentation fault

Current thread 0x00007f52959bb740 (most recent call first):
  File "/home/kimlab1/strokach/anaconda/lib/python3.6/site-packages/pyarrow/parquet.py", line 125 in read_row_group
  File "<string>", line 1 in <module>
./test_fail.sh: line 5: 42612 Segmentation fault      (core dumped) python -X faulthandler -c "import torch; import pyarrow.parquet as pq; _ = pq.ParquetFile('example.parquet').read_row_group(0)"

The number of iterations before a segfault varies, but it usually happens within the first several calls.
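Because the crash is sporadic, a small driver that reruns the command in a fresh interpreter until one dies from a signal may help reproduce it. This is only a sketch and not the `test_fail.sh` script from the linked repository; the helper name `run_until_crash` and the default of 50 attempts are assumptions made for illustration, and the signal check relies on POSIX behaviour.

```python
import signal
import subprocess
import sys

# The failing import order from the report above.
SNIPPET = (
    "import torch; import pyarrow.parquet as pq; "
    "_ = pq.ParquetFile('example.parquet').read_row_group(0)"
)


def run_until_crash(code, max_attempts=50):
    """Run `code` in fresh interpreters (hypothetical helper).

    Returns the 1-based attempt on which the child was killed by a signal,
    or None if no attempt crashed within `max_attempts`.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run([sys.executable, "-X", "faulthandler", "-c", code])
        # On POSIX, a negative return code means the child died from that signal.
        if result.returncode < 0:
            name = signal.Signals(-result.returncode).name
            print(f"attempt {attempt}: killed by {name}")
            return attempt
    return None


# Usage (needs torch, pyarrow and example.parquet in the working directory):
# run_until_crash(SNIPPET)
```

An ordinary Python exception in the child (for example a missing module) gives a positive exit code, so it is not counted as a crash.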

Running

python -X faulthandler -c "import pyarrow.parquet as pq; import torch; _ = pq.ParquetFile('example.parquet').read_row_group(0)"

works without a problem.

Contributor

fsaintjacques commented Sep 27, 2018

>>> bt
#0  0x00007f801d44cb90 in ?? ()
#1  0x00007f7fdb3e1e73 in std::_Sp_counted_ptr_inplace<parquet::DataPage, std::allocator<parquet::DataPage>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() () from /home/fsaintjacques/.local/lib/python2.7/site-packages/pyarrow/libparquet.so.1
#2  0x00007f7fdb393b29 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() () from /home/fsaintjacques/.local/lib/python2.7/site-packages/pyarrow/libparquet.so.1
#3  0x00007f7fdb3ba28f in parquet::internal::TypedRecordReader<parquet::DataType<(parquet::Type::type)2> >::ReadNewPage() () from /home/fsaintjacques/.local/lib/python2.7/site-packages/pyarrow/libparquet.so.1
#4  0x00007f7fdb3bac20 in parquet::internal::TypedRecordReader<parquet::DataType<(parquet::Type::type)2> >::ReadRecords(long) () from /home/fsaintjacques/.local/lib/python2.7/site-packages/pyarrow/libparquet.so.1
#5  0x00007f7fdb391676 in parquet::arrow::PrimitiveImpl::NextBatch(long, std::shared_ptr<arrow::Array>*) () from /home/fsaintjacques/.local/lib/python2.7/site-packages/pyarrow/libparquet.so.1
#6  0x00007f7fdb38cfae in parquet::arrow::ColumnReader::NextBatch(long, std::shared_ptr<arrow::Array>*) () from /home/fsaintjacques/.local/lib/python2.7/site-packages/pyarrow/libparquet.so.1
#7  0x00007f7fdb38dbcb in parquet::arrow::FileReader::Impl::ReadColumnChunk(int, int, std::shared_ptr<arrow::Array>*) () from /home/fsaintjacques/.local/lib/python2.7/site-packages/pyarrow/libparquet.so.1
#8  0x00007f7fdb38e10f in parquet::arrow::FileReader::Impl::ReadRowGroup(int, std::vector<int, std::allocator<int> > const&, std::shared_ptr<arrow::Table>*)::{lambda(int)#1}::operator()(int) const () from /home/fsaintjacques/.local/lib/python2.7/site-packages/pyarrow/libparquet.so.1
#9  0x00007f7fdb38ef20 in parquet::arrow::FileReader::Impl::ReadRowGroup(int, std::vector<int, std::allocator<int> > const&, std::shared_ptr<arrow::Table>*) () from /home/fsaintjacques/.local/lib/python2.7/site-packages/pyarrow/libparquet.so.1
#10 0x00007f7fdb38f2f4 in parquet::arrow::FileReader::Impl::ReadRowGroup(int, std::shared_ptr<arrow::Table>*) () from /home/fsaintjacques/.local/lib/python2.7/site-packages/pyarrow/libparquet.so.1
#11 0x00007f7fdb38f402 in parquet::arrow::FileReader::ReadRowGroup(int, std::shared_ptr<arrow::Table>*) () from /home/fsaintjacques/.local/lib/python2.7/site-packages/pyarrow/libparquet.so.1
#12 0x00007f7fd9fd8f12 in __pyx_pw_7pyarrow_8_parquet_13ParquetReader_7read_row_group(_object*, _object*, _object*) () from /home/fsaintjacques/.local/lib/python2.7/site-packages/pyarrow/_parquet.so
#13 0x00005593b92f4d57 in PyEval_EvalFrameEx ()
#14 0x00005593b92eb8ca in PyEval_EvalCodeEx ()
#15 0x00005593b92f324e in PyEval_EvalFrameEx ()
#16 0x00005593b92eb8ca in PyEval_EvalCodeEx ()
#17 0x00005593b92eb1e9 in PyEval_EvalCode ()
#18 0x00005593b931bbdf in ?? ()
#19 0x00005593b9348586 in PyRun_StringFlags ()
#20 0x00005593b9348b4c in PyRun_SimpleStringFlags ()
#21 0x00005593b92c547c in Py_Main ()
#22 0x00007f8036884b97 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#23 0x00005593b92c4e0a in _start ()

Member

xhochy commented Sep 27, 2018

The backtrace looks like the same problem we see with the tensorflow wheels. Both use a newer libstdc++ version, and when pyarrow is imported last, the virtual shared pointer destructor is picked up from the torch libstdc++.
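One rough way to see which libstdc++ copies end up mapped into the process is to read `/proc/self/maps` after both imports. This is a Linux-only sketch; the helper name `loaded_libraries` is an assumption, and a statically linked libstdc++ would not show up as a separate file.

```python
def loaded_libraries(substring, maps_path="/proc/self/maps"):
    """List mapped file paths containing `substring` (hypothetical helper, Linux only)."""
    paths = set()
    with open(maps_path) as maps:
        for line in maps:
            # Fields: address range, perms, offset, device, inode, pathname.
            fields = line.split(None, 5)
            if len(fields) == 6 and substring in fields[5]:
                paths.add(fields[5].strip())
    return sorted(paths)


# Usage, after the imports under investigation:
# import torch; import pyarrow.parquet
# print(loaded_libraries("libstdc++"))
```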

Someone should look at the symbols exported by the Torch wheel; it might help if they also employed the same symbol hiding that we do in pyarrow.
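A sketch of how the exported symbols could be inspected, assuming binutils `nm` is installed. The helper names are hypothetical, and which shared objects inside the wheel to examine is left to the reader.

```python
import subprocess


def parse_nm_output(text):
    """Turn `nm` output lines into (type_letter, demangled_name) pairs."""
    symbols = []
    for line in text.splitlines():
        # Defined symbols look like: "<address> <type> <name>"; the name may contain spaces.
        parts = line.split(None, 2)
        if len(parts) == 3:
            symbols.append((parts[1], parts[2]))
    return symbols


def defined_dynamic_symbols(library_path):
    """Run `nm -D -C --defined-only` on a shared library and parse the result."""
    output = subprocess.run(
        ["nm", "-D", "-C", "--defined-only", library_path],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_nm_output(output)


def std_symbols(symbols):
    """Keep only symbols whose demangled name mentions the std:: namespace."""
    return [(kind, name) for kind, name in symbols if "std::" in name]


# Usage (the path is a placeholder, not a real file name):
# for kind, name in std_symbols(defined_dynamic_symbols("/path/to/some_library.so")):
#     print(kind, name)
```

Exported `std::` symbols in a wheel's libraries, such as `std::_Sp_counted_base<...>::_M_release()`, would be candidates for the kind of clash described above.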

Member

xhochy commented Sep 27, 2018

@ostrokach Can you open a JIRA with your description? That would make it easier to track long-term.

Member

wesm commented Sep 27, 2018

It's odd, since in theory newer libstdc++ symbols should be ABI-compatible.

Member

xhochy commented Sep 27, 2018

Some bits did break in older versions, which could also affect our manylinux1 wheels: https://gcc.gnu.org/wiki/Cxx11AbiCompatibility

Author

ostrokach commented Sep 27, 2018

Member

wesm commented Sep 27, 2018

thanks

wesm closed this Sep 27, 2018
