[Python][C++] posix_madvise error on Debian in pyarrow 1.0.0 #25642

Closed
asfimport opened this issue Jul 27, 2020 · 19 comments

@asfimport

The following writes and reads back from a Parquet file in both pyarrow 0.17.0 and 1.0.0 on Ubuntu 18.04:
 

>>> import pyarrow.parquet
>>> a = pyarrow.array([[1.1, 2.2, 3.3], [], [4.4, 5.5]])
>>> t = pyarrow.Table.from_batches([pyarrow.RecordBatch.from_arrays([a], ["stuff"])])
>>> pyarrow.parquet.write_table(t, "stuff.parquet")
>>> t2 = pyarrow.parquet.read_table("stuff.parquet") 

 
However, the same thing raises the following exception on Debian 9 (stretch) in pyarrow 1.0.0 but not in pyarrow 0.17.0:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpivarski/miniconda3/lib/python3.7/site-packages/pyarrow/parquet.py", line 1564, in read_table
    filters=filters,
  File "/home/jpivarski/miniconda3/lib/python3.7/site-packages/pyarrow/parquet.py", line 1433, in __init__
    partitioning=partitioning)
  File "/home/jpivarski/miniconda3/lib/python3.7/site-packages/pyarrow/dataset.py", line 667, in dataset
    return _filesystem_dataset(source, **kwargs)
  File "/home/jpivarski/miniconda3/lib/python3.7/site-packages/pyarrow/dataset.py", line 434, in _filesystem_dataset
    return factory.finish(schema)
  File "pyarrow/_dataset.pyx", line 1451, in pyarrow._dataset.DatasetFactory.finish
  File "pyarrow/error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
OSError: posix_madvise failed. Detail: [errno 0] Success

It's a little odd that the error says that it failed with "detail: success". That suggests to me that an "if" predicate is backward (missing "not"), which might only be triggered on some OS/distributions.

Environment: Installed with Miniconda (for Debian; used pip for the Ubuntu test)
Reporter: Jim Pivarski / @jpivarski
Assignee: Antoine Pitrou / @pitrou


Note: This issue was originally created as ARROW-9577. Please see the migration documentation for further details.

@asfimport

Wes McKinney / @wesm:
@jorisvandenbossche @bkietz why is the datasets API being used to read a single file? That seems wrong to me

@asfimport

Joris Van den Bossche / @jorisvandenbossche:
@wesm I did this to enable filtering at the row-group level for single files as well. We could also use the datasets API only when needed (that is, when the filters keyword is specified), and otherwise use the ParquetFile single-file reader.

@asfimport

Wes McKinney / @wesm:
I see, that makes more sense then.

@asfimport

Antoine Pitrou / @pitrou:
@jpivarski Are you able to reproduce when building from source?

@asfimport

Jim Pivarski / @jpivarski:
The computer in question is a Chromebook; I don't know if I can build from source. I'll see if I can create an equivalent test on AWS. It probably needs to be Debian (not Ubuntu), but do you have any guesses about what is likely to trigger this "posix_madvise"? (Maybe I'm wrong in thinking it's the Linux distribution; if it's hardware, I'd be wasting my time trying to reproduce it on a VM.)

@asfimport

Antoine Pitrou / @pitrou:
I couldn't reproduce in a Debian 9 Docker container (when built from source). @jpivarski Can you upload the stuff.parquet file somewhere?

@asfimport

Antoine Pitrou / @pitrou:

> do you have any guesses about what is likely to trigger this "posix_madvise"?

I have no idea, which is why I am trying to reproduce to get some insight.

@asfimport

Antoine Pitrou / @pitrou:
I just know the error message is wrong, so I can fix that and we may gain a bit more insight.
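
One possible reading of the "[errno 0] Success" detail, offered as an assumption rather than something confirmed in this thread: POSIX specifies that posix_madvise reports failure through its return value and not through errno, so a message formatted from errno could show 0. A minimal Python sketch using ctypes makes the two values visible side by side. The helper name `posix_madvise_result` and the deliberately invalid advice value 9999 are illustrative only.

```python
import ctypes
import mmap
import os


def posix_madvise_result(advice):
    """Call posix_madvise on one anonymous page.

    Returns (return_value, errno_after_call). Hypothetical helper, for
    illustration only; it assumes a libc that exports posix_madvise.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.posix_madvise.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
    libc.posix_madvise.restype = ctypes.c_int

    page = mmap.mmap(-1, mmap.PAGESIZE)  # anonymous, writable mapping
    try:
        # Expose the mapping to ctypes so its address can be passed to libc.
        buf = (ctypes.c_char * mmap.PAGESIZE).from_buffer(page)
        ctypes.set_errno(0)
        rc = libc.posix_madvise(ctypes.addressof(buf), mmap.PAGESIZE, advice)
        err = ctypes.get_errno()
        del buf  # release the exported buffer so the mapping can be closed
    finally:
        page.close()
    return rc, err


if __name__ == "__main__":
    # 9999 is not a valid advice value, so the call is expected to fail.
    rc, err = posix_madvise_result(9999)
    print("return value:", rc, "(" + os.strerror(rc) + ")")
    print("errno after the call:", err)
```

If the return value is nonzero while errno stays at 0, an error message built from errno would read as "Success" even though the call failed.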

@asfimport

Antoine Pitrou / @pitrou:
Which kind of CPU does the Chromebook have, by the way?

@asfimport

Antoine Pitrou / @pitrou:
By the way, perhaps by running under strace, you can find out what the arguments to madvise were and what the actual error return is?

@asfimport

Jim Pivarski / @jpivarski:
I've reproduced the error, but again using the 1.0.0 version from pip. This Chromebook is an Intel, not ARM, and it has normal i386 Debian packages running in its Crostini VM. The only limitations that I'm aware of are memory, disk, and CPU size, and this posix_madvise error. (It's also what I'm stuck with while my real computer is being repaired, so I've been trying to do only Python tasks these few weeks. I probably wouldn't have found this error otherwise, though, and it is likely to affect somebody, given how widely Arrow is used.)

I've attached the stuff.parquet file that the Chromebook made (in case it's the writing step that's affected?) as well as an strace covering the span from just before I pressed Enter on read_table to just after the exception.

stuff.parquet

strace-parquet-read.log

I don't know how to read an strace, but I see the posix_madvise in there after a lot of "No such file or directory" when trying to open files like _dataset.pyx and error.pxi. I've also attached the locations of all my .pxi files: /home/jpivarski/miniconda3/lib/python3.8/site-packages/pyarrow/error.pxi is definitely there...

location-of-pxi-files.log

 

@asfimport

Antoine Pitrou / @pitrou:
Thank you very much. madvise fails with EBADF, meaning the given memory address is legit but does not correspond to a file mapping. Reading the pointer value, it seems it was allocated using brk rather than mmap. However, on my system (x86-64 Ubuntu 18.04), both kinds of allocations work with madvise.

Can I ask you to compile and run this small C file on your Chromebook?
https://gist.github.com/pitrou/fa03cbc44bd93cefee727d7000942a64

(also, if it fails, can you run strace on it?)
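
For readers without a C compiler at hand, a rough Python counterpart to the mmap half of that test might look like the sketch below. It is not a reproduction of the gist, whose contents are not shown in this thread. It assumes Python 3.8 or later on a platform where the `mmap` module exposes `madvise` and `MADV_WILLNEED`, and the function name `probe_madvise_willneed` is hypothetical.

```python
import errno
import mmap


def probe_madvise_willneed(size=mmap.PAGESIZE):
    """Try madvise(MADV_WILLNEED) on an anonymous mapping.

    Returns None when the kernel accepts the hint, otherwise the symbolic
    errno name (for example "EBADF") as a string.
    """
    with mmap.mmap(-1, size) as region:
        try:
            region.madvise(mmap.MADV_WILLNEED)
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


if __name__ == "__main__":
    outcome = probe_madvise_willneed()
    if outcome is None:
        print("madvise of mmap-allocated data succeeded")
    else:
        print("madvise of mmap-allocated data failed:", outcome)
```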

@asfimport

Jim Pivarski / @jpivarski:
Thank you: making a tiny C program helps a lot! It fails with

madvise of mmap-allocated data failed: Bad file descriptor
madvise of sbrk-allocated data failed: Bad file descriptor

and the output of strace -ttT ./madvise-test is

10:44:33.699121 execve("./madvise-test", ["./madvise-test"], [/* 95 vars */]) = 0 <0.000980>
10:44:33.701266 brk(NULL) = 0x58756f244000 <0.000170>
10:44:33.701935 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.000208>
10:44:33.702494 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) <0.000283>
10:44:33.703142 open("/home/jpivarski/miniconda3/lib/tls/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.001724>
10:44:33.705203 stat("/home/jpivarski/miniconda3/lib/tls/x86_64", 0x7ffc8a9e0370) = -1 ENOENT (No such file or directory) <0.011708>
10:44:33.717319 open("/home/jpivarski/miniconda3/lib/tls/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.000266>
10:44:33.718843 stat("/home/jpivarski/miniconda3/lib/tls", 0x7ffc8a9e0370) = -1 ENOENT (No such file or directory) <0.000139>
10:44:33.719183 open("/home/jpivarski/miniconda3/lib/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.000128>
10:44:33.719509 stat("/home/jpivarski/miniconda3/lib/x86_64", 0x7ffc8a9e0370) = -1 ENOENT (No such file or directory) <0.000123>
10:44:33.719860 open("/home/jpivarski/miniconda3/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.000125>
10:44:33.720179 stat("/home/jpivarski/miniconda3/lib", {st_mode=S_IFDIR|0755, st_size=40452, ...}) = 0 <0.000142>
10:44:33.720553 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.000147>
10:44:33.720947 fstat(3, {st_mode=S_IFREG|0644, st_size=50626, ...}) = 0 <0.000145>
10:44:33.721288 mmap(NULL, 50626, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7bac294a0000 <0.000130>
10:44:33.721610 close(3) = 0 <0.000138>
10:44:33.721961 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.000129>
10:44:33.723986 open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 <0.000116>
10:44:33.724265 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\4\2\0\0\0\0\0"..., 832) = 832 <0.000091>
10:44:33.724505 fstat(3, {st_mode=S_IFREG|0755, st_size=1689360, ...}) = 0 <0.000089>
10:44:33.724758 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7bac2949e000 <0.000114>
10:44:33.725060 mmap(NULL, 3795296, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7bac28eeb000 <0.000106>
10:44:33.725313 mprotect(0x7bac29080000, 2097152, PROT_NONE) = 0 <0.000111>
10:44:33.725583 mmap(0x7bac29280000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x195000) = 0x7bac29280000 <0.000114>
10:44:33.725877 mmap(0x7bac29286000, 14688, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7bac29286000 <0.000099>
10:44:33.726149 close(3) = 0 <0.000095>
10:44:33.726417 arch_prctl(ARCH_SET_FS, 0x7bac2949f440) = 0 <0.000102>
10:44:33.726793 mprotect(0x7bac29280000, 16384, PROT_READ) = 0 <0.000112>
10:44:33.727051 mprotect(0x58756d933000, 4096, PROT_READ) = 0 <0.000104>
10:44:33.727304 mprotect(0x7bac294ad000, 4096, PROT_READ) = 0 <0.000107>
10:44:33.727565 munmap(0x7bac294a0000, 50626) = 0 <0.000121>
10:44:33.727941 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7bac294ac000 <0.000098>
10:44:33.728184 madvise(0x7bac294ac000, 4096, MADV_WILLNEED) = -1 EBADF (Bad file descriptor) <0.000091>
10:44:33.728432 dup(2) = 3 <0.000097>
10:44:33.728679 fcntl(3, F_GETFL) = 0x8001 (flags O_WRONLY|O_LARGEFILE) <0.000110>
10:44:33.728931 close(3) = 0 <0.000073>
10:44:33.729151 write(2, "madvise of mmap-allocated data f"..., 59madvise of mmap-allocated data failed: Bad file descriptor
) = 59 <0.000089>
10:44:33.729386 brk(NULL) = 0x58756f244000 <0.000079>
10:44:33.729593 brk(0x58756f245000) = 0x58756f245000 <0.000078>
10:44:33.729822 madvise(0x58756f244000, 4096, MADV_WILLNEED) = -1 EBADF (Bad file descriptor) <0.000078>
10:44:33.730048 write(2, "madvise of sbrk-allocated data f"..., 59madvise of sbrk-allocated data failed: Bad file descriptor
) = 59 <0.000136>
10:44:33.730354 exit_group(0) = ?
10:44:33.731306 +++ exited with 0 +++

It seems to be searching for libc.so.6; in case it's relevant, that's located here:

% /sbin/ldconfig -p | fgrep libc.so.6
libc.so.6 (libc6,x86-64, OS ABI: Linux 2.6.32) => /lib/x86_64-linux-gnu/libc.so.6

 

@asfimport

Antoine Pitrou / @pitrou:
Thank you. What is the Linux kernel version on the Chromebook?

@asfimport

Antoine Pitrou / @pitrou:
For the record, I've found two possible reasons for this error:

  • The Linux kernel version is older than 3.9.0
  • Swap was disabled when compiling the kernel (you can check this by looking for "CONFIG_SWAP" in either /boot/config-<kernel version>, if it exists, or /proc/config.gz).
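
The two checks above can be sketched in Python. This is a rough helper under stated assumptions: the function names (`parse_kernel_version`, `swap_setting`, `read_kernel_config`) are hypothetical, and the parsing only recognizes the two usual forms of a kernel config line, `CONFIG_SWAP=y` and `# CONFIG_SWAP is not set`.

```python
import gzip
import platform
import re


def parse_kernel_version(release):
    """Turn a release string such as '5.4.40-04224-g891a6cce2d44' into (5, 4, 40)."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part or 0) for part in match.groups())


def swap_setting(config_text):
    """Return True/False for CONFIG_SWAP, or None if the option is not mentioned."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == "# CONFIG_SWAP is not set":
            return False
        if line.startswith("CONFIG_SWAP="):
            return line.split("=", 1)[1] == "y"
    return None


def read_kernel_config(release=None):
    """Read the kernel config from /boot/config-<release> or /proc/config.gz.

    Returns the config text, or None when neither file can be read.
    """
    release = release or platform.release()
    candidates = [("/boot/config-" + release, open), ("/proc/config.gz", gzip.open)]
    for path, opener in candidates:
        try:
            with opener(path, "rt") as handle:
                return handle.read()
        except OSError:
            continue
    return None


if __name__ == "__main__":
    release = platform.release()
    print("kernel older than 3.9.0:", parse_kernel_version(release) < (3, 9, 0))
    config = read_kernel_config(release)
    print("CONFIG_SWAP enabled:", None if config is None else swap_setting(config))
```

On a machine where neither config file is readable, `read_kernel_config` returns None and the swap question stays unanswered.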

@asfimport

Jim Pivarski / @jpivarski:
I think it's the CONFIG_SWAP. This is what I find:

% cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
VERSION_CODENAME=stretch
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

% uname -a
Linux penguin 5.4.40-04224-g891a6cce2d44 #1 SMP PREEMPT Tue Jun 23 20:21:29 PDT 2020 x86_64 GNU/Linux

% fgrep -A5 -B5 CONFIG_SWAP config
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="localhost"
# CONFIG_SWAP is not set
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y

% free
              total        used        free      shared  buff/cache   available
Mem:        6798788      246552     5176772        8808     1375464     6552236
Swap:             0           0           0

(Maybe because the disk is so small?) Is it within scope for Parquet-reading to support operating systems compiled without swap? I don't know how unusual this situation is. I don't plan to use this computer for big data, but it did force me to turn off some tests in my testing suite.

@asfimport

Antoine Pitrou / @pitrou:
Yes, this can be trivially worked around. I just wanted to know exactly why this happens :) Thank you for the quick reports!
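
A workaround along these lines might treat the hint as best-effort. The sketch below is an assumption about the general shape of such a fix, not a description of what the Arrow pull request actually changed. `advise_willneed_best_effort` and its `tolerated` parameter are hypothetical names. Since madvise is only advice, skipping it when the kernel rejects it with EBADF should not change what gets read.

```python
import errno
import mmap


def advise_willneed_best_effort(region, tolerated=(errno.EBADF,)):
    """Hint that `region` (an mmap object) will be read soon.

    Returns True if the kernel accepted the hint and False if it failed with
    one of the tolerated error codes. Any other OSError is re-raised.
    """
    try:
        region.madvise(mmap.MADV_WILLNEED)
    except OSError as exc:
        if exc.errno in tolerated:
            return False  # advice only: carry on without the prefetch hint
        raise
    return True


if __name__ == "__main__":
    with mmap.mmap(-1, mmap.PAGESIZE) as region:
        accepted = advise_willneed_best_effort(region)
        print("hint accepted" if accepted else "hint ignored by this kernel")
```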

@asfimport

Jim Pivarski / @jpivarski:
Sure! And thanks for the fix!

@asfimport

Antoine Pitrou / @pitrou:
Issue resolved by pull request 7904
#7904

@asfimport asfimport added this to the 1.0.1 milestone Jan 11, 2023