Cannot import datasets - ValueError: pyarrow.lib.IpcWriteOptions size changed, may indicate binary incompatibility #5923
Comments
Based on rapidsai/cudf#10187, this probably means your installed pyarrow binaries are incompatible. Can you please execute the following commands in the terminal and paste the output here?
Here is the output of the first command:
and the second:
Thanks!
After installing pytesseract 0.3.10, I got the above error. FYI.
`RuntimeError: Failed to import transformers.trainer because of the following error (look up to see its traceback):`
I got the same error. pyarrow 12.0.0, released May 2023 (https://pypi.org/project/pyarrow/), is not compatible. Do we need to update dependencies?
Please note that our CI properly passes all tests with:
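Since the pinned versions mentioned in this comment were lost in the page scrape, a quick way to compare the installed pyarrow against the 11.0.0 pin that worked for commenters later in this thread is to read package metadata without importing pyarrow itself (the import is exactly what crashes). This is an illustrative sketch, not part of the datasets project; the `KNOWN_GOOD` pin and helper names are assumptions:

```python
# Diagnostic sketch: compare installed pyarrow against the 11.0.0 pin from
# this thread, using package metadata instead of `import pyarrow` (which is
# what raises the IpcWriteOptions ValueError).
from importlib.metadata import version, PackageNotFoundError

KNOWN_GOOD = (11, 0, 0)  # assumption: the pin reported to work in this thread

def parse(v: str) -> tuple:
    """Keep only leading numeric components, e.g. "12.0.0" -> (12, 0, 0)."""
    parts = []
    for p in v.split("."):
        if p.isdigit():
            parts.append(int(p))
        else:
            break
    return tuple(parts)

def check() -> str:
    try:
        installed = version("pyarrow")
    except PackageNotFoundError:
        return "pyarrow is not installed"
    if parse(installed) > KNOWN_GOOD:
        return f"pyarrow {installed} is newer than the known-good 11.0.0"
    return f"pyarrow {installed} should be fine"

if __name__ == "__main__":
    print(check())
```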
For conda with Python 3.8.16 this solved my problem, thanks!
Thanks for replying. I am not sure about those environments, but it seems like pyarrow-12.0.0 does not work for conda with Python 3.8.16.
Got the same error with:
This solved the issue for me as well.
Solved it for me also.
`arrow-cpp 11.0.0 py310h7516544_0` (`/root/miniconda3/lib/python3.10/site-packages/pyarrow/__init__.py`)
Got the same problem with `arrow-cpp 11.0.0 py310h1fc3239_0` (`miniforge3/envs/mlp/lib/python3.10/site-packages/pyarrow/__init__.py`). Reverting back to pyarrow 11 solved the problem.
Solved with:
I got a different one; solved with this env:
This works for me as well.
I guess it also depends on the Python version. I got Python 3.11.5 and pyarrow==12.0.0.
Hi, if this helps anyone: `pip install pyarrow==11.0.0` did not work for me (I'm using Colab), but this worked:
```
ValueError                                Traceback (most recent call last)
<timed exec> in <module>

/usr/local/lib/python3.10/dist-packages/datasets/__init__.py in <module>
     20 __version__ = "2.14.5"
     21
---> 22 from .arrow_dataset import Dataset
     23 from .arrow_reader import ReadInstruction
     24 from .builder import ArrowBasedBuilder, BeamBasedBuilder, BuilderConfig, DatasetBuilder, GeneratorBasedBuilder

4 frames
/usr/local/lib/python3.10/dist-packages/pyarrow/_parquet.pyx in init pyarrow._parquet()

ValueError: pyarrow.lib.IpcWriteOptions size changed, may indicate binary incompatibility. Expected 88 from C header, got 72 from PyObject
```
(To address the above problem, update the library; see huggingface/datasets#5923 for reference.)
Thanks! I met the same problem and your suggestion solved it.
(I was doing a quiet install, so I didn't notice it initially.)
For Colab: `pip install pyarrow==11.0.0`
The above methods didn't help me, so I installed an older version:
@rasith1998 @PennlaineChu You can avoid this issue by restarting the session after the installation. Also, we've contacted the Google Colab folks to update the default PyArrow installation, so the issue should soon be "officially" resolved on their side.
This has been done! Google Colab now pre-installs PyArrow 14.0.2, which makes this issue unlikely to happen, so I'm closing it. |
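The restart advice above can be sketched as a small check: after upgrading or downgrading pyarrow with pip, the old extension modules stay loaded in the running session until it restarts, so a fresh wheel on disk does not help the current process. This is an illustrative helper under that assumption; the function name and expected version are not from Colab or datasets:

```python
# Sketch: detect whether a stale pyarrow is already loaded in this session.
# If pyarrow was imported before the reinstall, only a runtime restart will
# unload the old binary modules.
import sys

def needs_restart(expected: str = "11.0.0") -> bool:
    """True if a pyarrow other than `expected` is already loaded in this session."""
    mod = sys.modules.get("pyarrow")
    if mod is None:
        # Nothing loaded yet: a fresh `import pyarrow` will pick up the new wheel.
        return False
    return getattr(mod, "__version__", "") != expected

if __name__ == "__main__":
    print("restart needed:", needs_restart())
```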
Describe the bug
When trying to import datasets, I get a pyarrow ValueError:
```
Traceback (most recent call last):
  File "/Users/edward/test/test.py", line 1, in <module>
    import datasets
  File "/Users/edward/opt/anaconda3/envs/cs235/lib/python3.9/site-packages/datasets/__init__.py", line 43, in <module>
    from .arrow_dataset import Dataset
  File "/Users/edward/opt/anaconda3/envs/cs235/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 65, in <module>
    from .arrow_reader import ArrowReader
  File "/Users/edward/opt/anaconda3/envs/cs235/lib/python3.9/site-packages/datasets/arrow_reader.py", line 28, in <module>
    import pyarrow.parquet as pq
  File "/Users/edward/opt/anaconda3/envs/cs235/lib/python3.9/site-packages/pyarrow/parquet/__init__.py", line 20, in <module>
    from .core import *
  File "/Users/edward/opt/anaconda3/envs/cs235/lib/python3.9/site-packages/pyarrow/parquet/core.py", line 45, in <module>
    from pyarrow.fs import (LocalFileSystem, FileSystem, FileType,
  File "/Users/edward/opt/anaconda3/envs/cs235/lib/python3.9/site-packages/pyarrow/fs.py", line 49, in <module>
    from pyarrow._gcsfs import GcsFileSystem  # noqa
  File "pyarrow/_gcsfs.pyx", line 1, in init pyarrow._gcsfs
ValueError: pyarrow.lib.IpcWriteOptions size changed, may indicate binary incompatibility. Expected 88 from C header, got 72 from PyObject
```
Steps to reproduce the bug
```python
import datasets
```
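A minimal sketch for reproducing this safely: run the failing import in a subprocess, so the ValueError can be captured and reported without taking down the current interpreter. The `try_import` helper below is illustrative and not part of the datasets library:

```python
# Sketch: attempt the import in a fresh interpreter and capture the failure.
import subprocess
import sys

def try_import(module: str) -> tuple:
    """Import `module` in a fresh interpreter; return (exit_code, last stderr line)."""
    proc = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        capture_output=True,
        text=True,
    )
    last = proc.stderr.strip().splitlines()[-1] if proc.stderr.strip() else ""
    return proc.returncode, last

if __name__ == "__main__":
    code, err = try_import("datasets")
    print("import succeeded" if code == 0 else f"import failed: {err}")
```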
Expected behavior
Successful import
Environment info
Conda environment, macOS
Python 3.9.12
datasets 2.12.0