[R] Segfault when collecting parquet dataset query results #41813

Closed
mrd0ll4r opened this issue May 24, 2024 · 15 comments
Labels: Component: R, Critical Fix, Type: bug

@mrd0ll4r

mrd0ll4r commented May 24, 2024

Hello!
I've been using arrow with R for a while now to great success.
Recently, I've re-opened an old project (managed with renv, so I'm pretty confident all the package versions were the same).
It is possible I upgraded the OS and/or OS packages in the meantime.
Now, some of my queries on a gzip-compressed dataset of parquet files lead to a segfault:

 *** caught segfault ***
address 0x7f54ce520898, cause 'memory not mapped'

Traceback:
 1: Table__from_ExecPlanReader(self)
 2: x$read_table()
 3: as_arrow_table.RecordBatchReader(reader)
 4: as_arrow_table(reader)
 5: as_arrow_table.arrow_dplyr_query(x)
 6: as_arrow_table(x)
 7: doTryCatch(return(expr), name, parentenv, handler)
 8: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 9: tryCatchList(expr, classes, parentenv, handlers)
10: tryCatch(as_arrow_table(x), error = function(e, call = caller_env(n = 4)) {    augment_io_error_msg(e, call, schema = schema())})
11: compute.arrow_dplyr_query(x)
12: collect.arrow_dplyr_query(.)
13: collect(.)
14: d_redacted %>% group_by(year, month, cid) %>% summarize(n = n()) %>%     collect()

I have a core dump from that session, but it's 46GB.
The machine has 256GB RAM and another 256GB swap, so I'm confident running out of memory is not the problem.

I'm not a professional in analyzing these things, but this is what I got:

Core was generated by `/usr/lib/R/bin/exec/R'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f612d4ea3b0 in arrow::compute::KeyCompare::CompareBinaryColumnToRow_avx2(bool, unsigned int, unsigned int, unsigned short const*, unsigned int const*, arrow::compute::LightContext*, arrow::compute::KeyColumnArray const&, arrow::compute::RowTableImpl const&, unsigned char*) () from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
[Current thread is 1 (Thread 0x7f6093fff640 (LWP 2273813))]
(gdb) bt
#0  0x00007f612d4ea3b0 in arrow::compute::KeyCompare::CompareBinaryColumnToRow_avx2(bool, unsigned int, unsigned int, unsigned short const*, unsigned int const*, arrow::compute::LightContext*, arrow::compute::KeyColumnArray const&, arrow::compute::RowTableImpl const&, unsigned char*) () from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#1  0x00007f612d4d7093 in void arrow::compute::KeyCompare::CompareBinaryColumnToRow<true>(unsigned int, unsigned int, unsigned short const*, unsigned int const*, arrow::compute::LightContext*, arrow::compute::KeyColumnArray const&, arrow::compute::RowTableImpl const&, unsigned char*) () from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#2  0x00007f612d4d6278 in arrow::compute::KeyCompare::CompareColumnsToRows(unsigned int, unsigned short const*, unsigned int const*, arrow::compute::LightContext*, unsigned int*, unsigned short*, std::vector<arrow::compute::KeyColumnArray, std::allocator<arrow::compute::KeyColumnArray> > const&, arrow::compute::RowTableImpl const&, bool, unsigned char*) ()
   from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#3  0x00007f612d4d896e in ?? () from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#4  0x00007f612d3a98e6 in ?? () from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#5  0x00007f612d3ab154 in arrow::compute::SwissTable::find(int, unsigned int const*, unsigned char*, unsigned char const*, unsigned int*, arrow::util::TempVectorStack*, std::function<void (int, unsigned short const*, unsigned int const*, unsigned int*, unsigned short*, void*)> const&, void*) const ()
   from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#6  0x00007f612d4df2d0 in ?? () from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#7  0x00007f612d4dfb73 in ?? () from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#8  0x00007f612cf8da83 in arrow::acero::aggregate::GroupByNode::Merge() () from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#9  0x00007f612cf8f8a3 in arrow::acero::aggregate::GroupByNode::OutputResult(bool) ()
   from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#10 0x00007f612cf941f6 in arrow::acero::aggregate::GroupByNode::InputReceived(arrow::acero::ExecNode*, arrow::compute::ExecBatch) ()
   from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#11 0x00007f612cef3f1b in arrow::acero::MapNode::InputReceived(arrow::acero::ExecNode*, arrow::compute::ExecBatch) ()
   from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#12 0x00007f612cf25dd2 in ?? () from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#13 0x00007f612cf05a7e in arrow::internal::FnOnce<void ()>::FnImpl<std::_Bind<arrow::detail::ContinueFuture (arrow::Future<arrow::internal::Empty>, std::function<arrow::Status ()>)> >::invoke() ()
   from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#14 0x00007f612d290a9d in ?? () from /home/leo/.cache/R/renv/cache/v5/R-4.3/x86_64-pc-linux-gnu/arrow/15.0.1/85c24dd7844977e4a680ba28f576125c/arrow/libs/arrow.so
#15 0x00007f6136f87253 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#16 0x00007f61396a9ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#17 0x00007f613973b850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

I've tried:

  • Updating all the dependencies. I'm now at arrow 15.0.1 from RSPM; the crash above is from this version.
  • Re-writing the dataset. The raw data is a bunch of CSV files, which I read -> mutate -> write to parquet
  • Checking if simple queries (dataset %>% summarize(n=n())) work, which they do

Specifically, this query works:

d_redacted %>% group_by(year, month) %>% summarize(n=n()) %>% collect()

and this doesn't:

d_redacted %>% group_by(year, month, cid) %>% summarize(n=n()) %>% collect()

The dataset looks like this:

> d_redacted
FileSystemDataset with 1342 Parquet files
peer: string
address: string
asn: string
geolocation: string
cid: string
entry_type: string
date: date32[day]
monitor: string
year: int32
month: int32

It's 3GB on disk, gzip compressed.

Unfortunately, I cannot share the dataset publicly as it contains sensitive information.

Overall, pretty lost now.
The system is running Ubuntu 22.04, kernel:

5.15.0-101-generic #111-Ubuntu SMP Tue Mar 5 20:16:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Hope that helps somehow...

Component(s)

R

@amoeba changed the title from "Segfault when collecting parquet dataset query results" to "[R] Segfault when collecting parquet dataset query results" on May 25, 2024
@mrd0ll4r
Author

mrd0ll4r commented May 27, 2024

Update:
I did some more research on the timeline. I'm fairly confident of what changed between the run that worked (in February) and now:

  • Updated the hypervisor (Proxmox) from some ancient version to the most recent one
  • Updated the OS from Ubuntu... unclear, 18? 20? to 22.04
  • Updated R from (unknown) to 4.3

In particular, when I first tried re-running the script on Friday, I used the existing renv.lock file. Because of the R version change it probably downloaded the packages again, not sure.

Update 2: I upgraded arrow to 16.1.0 (RSPM). The bug persists.
I'm trying to reduce the dataset to still have a reproducer but be able to share it.

Update 3: I cannot reproduce if I specify Sys.setenv(ARROW_USER_SIMD_LEVEL = "avx") before loading Arrow

@jonkeane
Member

> I cannot reproduce if I specify Sys.setenv(ARROW_USER_SIMD_LEVEL = "avx") before loading Arrow

This is interesting, what kind of computer (and ideally, specific model of CPU) are you running on?

@mrd0ll4r
Author

mrd0ll4r commented May 31, 2024

Oh, sorry, I thought I specified that somewhere!

Here's the first core of cat /proc/cpuinfo:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
stepping        : 4
microcode       : 0x2007108
cpu MHz         : 2992.966
cache size      : 16384 KB
physical id     : 0
siblings        : 32
core id         : 0
cpu cores       : 32
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat umip pku ospke md_clear flush_l1d arch_capabilities
vmx flags       : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid shadow_vmcs pml tsc_scaling
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa mmio_stale_data retbleed gds
bogomips        : 5985.93
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

The VM has 32 cores total.

I'm still working on removing PII from the dataset so I can share it. It'll still be a few GB, any place I could put it once it's done?

@amoeba
Member

amoeba commented Jun 6, 2024

Hey @mrd0ll4r, you could try uploading to my Dropbox if it's under 3GB and from there I can make it available to others: https://www.dropbox.com/request/bGhCIALbSd8izywgWSHi. Otherwise Google Drive might work fine.

@zanmato1984, do you have any pointers here? I know you have some experience in some of these pieces.

@amoeba added the Critical Fix label on Jun 6, 2024
@mrd0ll4r
Author

mrd0ll4r commented Jun 6, 2024

I've created a reproducer and uploaded it to dropbox, thanks!
The archive contains the entire RStudio project, including the renv/ subdirectory and whatnot.

When you open the project, renv will complain about broken symlinks into the cache (which is local to my user).
You can probably repair that with renv::repair() or renv::restore() or something like that.

Comment/uncomment the environment-setting line to trigger the bug.

Good luck and thanks for looking into this! Let me know if you need any additional info or resources.

Edit: I uploaded the core dump and apport report as well. The latter also contains the core dump, plus some additional info, but uses whatever format apport produces...

@zanmato1984
Collaborator

> @zanmato1984, do you have any pointers here? I know you have some experience in some of these pieces.

I did fix a bug (#39577) related to arrow::compute::KeyCompare::CompareBinaryColumnToRow, but that fix was not in the AVX2-specialized code, which is where the reported stack trace crashes.

Besides, quoting @mrd0ll4r:

> Update 2: I upgraded arrow to 16.1.0 (RSPM). The bug persists.

That fix landed before 16.0.0, so this could be a new issue.

@amoeba How can I access the archive in your dropbox? I can give it a try to reproduce.

@amoeba
Member

amoeba commented Jun 9, 2024

Hi @zanmato1984, I made the files public at https://www.dropbox.com/scl/fo/5ao8vij4kogb16a3i1l63/AJ_fT08eJrtY-ouuh-nKOoc?rlkey=7ug6htsro09agqyuw6afoesvt&st=nar7b740&dl=0. Let me know if that doesn't work.

@zanmato1984
Collaborator

I'm now trying to reproduce this in C++, which is probably more efficient given my limited experience with R. The R reproduction script and the data look very helpful. I'll get back when I find something.

> Hi @zanmato1984, I made the files public at dropbox.com/scl/fo/5ao8vij4kogb16a3i1l63/AJ_fT08eJrtY-ouuh-nKOoc?rlkey=7ug6htsro09agqyuw6afoesvt&st=nar7b740&dl=0. Let me know if that doesn't work.

@amoeba
Member

amoeba commented Jun 13, 2024

Thanks @zanmato1984. I was able to reproduce the issue last night (which is good), so I'm going to try again with a debug build, hopefully later today.

@zanmato1984
Collaborator

Cool @amoeba! Let me know if you need any help with your debugging. Meanwhile, I'll keep debugging on my side as well.

@amoeba
Member

amoeba commented Jun 14, 2024

I did get more info from a debug build, so this is something to go off of:

* thread #18, name = 'R', stop reason = signal SIGSEGV: invalid address (fault address: 0x7ff551565330)
    frame #0: 0x00007fffe8bc1e11 libarrow.so.1700`unsigned long arrow::compute::Compare8_avx2<4>(left_base="\xe6\a", right_base="\xe4\a", irow_left_first=784, offset_right=([0] = 82491328, [1] = 5718977208514924784, [2] = 5718727617226473472, [3] = 1723092364720968600), bit_offset=0) at compare_internal_avx2.cc:333:19
   330 	      ARROW_DCHECK(false);
   331 	  }
   332
-> 333 	  __m256i right = _mm256_i32gather_epi32((const int*)right_base, offset_right, 1);
   334 	  if (column_width != sizeof(uint32_t)) {
   335 	    constexpr uint32_t mask = column_width == 0 || column_width == 1 ? 0xff : 0xffff;
   336 	    right = _mm256_and_si256(right, _mm256_set1_epi32(mask));
(lldb) p right_base
(const uint8_t *) $0 = 0x00007ff5d1400240 "\xe4\a"
(lldb) p offset_right
(__m256i) $1 = (82491328, 5718977208514924784, 5718727617226473472, 1723092364720968600)
(lldb) fr v
(const uint8_t *) left_base = 0x00007ffedab41140 "\xe6\a"
(const uint8_t *) right_base = 0x00007ff5d1400240 "\xe4\a"
(uint32_t) irow_left_first = 784
(__m256i) offset_right = (82491328, 5718977208514924784, 5718727617226473472, 1723092364720968600)
(int) bit_offset = 0
(__m256i) left = (8684423874534, 8684423874534, 8684423874534, 8684423874534)
(__m256i) right = (8675833939942, 8684423874534, 8675833939940, 8684423874532)
(__m256i) cmp = (4294967295, -1, 0, -4294967296)
(uint32_t) result_lo = 4294902015
(uint32_t) result_hi = 4278190080

Edit: To add fr v output

@zanmato1984
Collaborator

zanmato1984 commented Jun 14, 2024

I reproduced the bug in C++ and found much the same thing. (For anyone interested, check out my repro branch https://github.com/zanmato1984/arrow/tree/fix-41813-repro. Note that you'll still need the data from the repro archive.)

The bug seems to be that the group ids in the swiss table used by GrouperFastImpl are somehow wrong, causing the right rows obtained from it to be nonsense. Still digging.

@zanmato1984
Collaborator

zanmato1984 commented Jun 14, 2024

The bug is that in this line:

__m256i right = _mm256_i32gather_epi32((const int*)right_base, offset_right, 1);

If a slot of offset_right contains a value >= 0x80000000, that is, a row-table offset of 2GB or more, the intrinsic interprets it as a negative 32-bit integer when adding it to right_base, so the gather reads from an invalid address.

Proof:
My reproduction, similar to @amoeba's, shows:

fault address: 0x4a7f85638
right_base: 0x0000000527e1e800
offset_right: (400023873834003288, 400025248223538328, 400025523922057392, -9217058400476779112)

Decoding each 64-bit slot of offset_right into its two 32-bit lanes gives:

(0x58D2B58 0x58D2B98 0x58D2C98 0x58D2CD8 0x3676B8B0 0x58D2D18 0x58D2D98 0x80166E38)
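
As an illustration of this decoding step (a sketch for readers, not code from Arrow): the debugger prints the `__m256i` as four signed 64-bit slots, and each slot holds two 32-bit offsets, low half first on a little-endian x86 machine. `SplitLanes` is a hypothetical helper name.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: split one signed 64-bit debugger slot into its two
// unsigned 32-bit lanes, low half first (little-endian lane order).
constexpr std::array<uint32_t, 2> SplitLanes(int64_t slot) {
  const uint64_t bits = static_cast<uint64_t>(slot);
  return {static_cast<uint32_t>(bits & 0xFFFFFFFFu),
          static_cast<uint32_t>(bits >> 32)};
}

// Compile-time checks against the first and last slots of the dump above.
static_assert(SplitLanes(400023873834003288LL)[0] == 0x58D2B58u, "lane 0");
static_assert(SplitLanes(400023873834003288LL)[1] == 0x58D2B98u, "lane 1");
static_assert(SplitLanes(-9217058400476779112LL)[0] == 0x58D2D98u, "lane 6");
static_assert(SplitLanes(-9217058400476779112LL)[1] == 0x80166E38u, "lane 7");
```

Only the slot printed as a negative 64-bit number can have its upper lane at or above 0x80000000, which is why the last lane stands out.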

Note that the last offset is larger than 0x80000000 (which is legitimate, since the non-AVX2 code path handles it correctly), and its signed interpretation is -2146013640. And right_base (0x0000000527e1e800) + (-2146013640) = 0x4a7f85638, which is exactly the faulting address. I didn't work through @amoeba's case, but I believe the same math applies.
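
The arithmetic can be checked with a small scalar model. This is a sketch under the assumption that the gather sign-extends each 32-bit index before adding it to the base (scale 1); it does not call the intrinsic itself, and the function names are hypothetical. The unsigned-to-signed cast is implementation-defined before C++20 but wraps as expected on two's-complement compilers such as GCC and Clang.

```cpp
#include <cstdint>

// Address as a signed-index gather computes it: the 32-bit index is
// sign-extended to 64 bits, then added to the base.
constexpr uint64_t GatherAddressSigned(uint64_t base, uint32_t offset) {
  const int64_t index = static_cast<int32_t>(offset);  // sign-extends
  return base + static_cast<uint64_t>(index);
}

// Address the row table actually means: the offset is unsigned.
constexpr uint64_t IntendedAddress(uint64_t base, uint32_t offset) {
  return base + offset;
}

constexpr uint64_t kRightBase = 0x0000000527e1e800ULL;  // from the repro
constexpr uint32_t kLastOffset = 0x80166E38u;           // last decoded lane

static_assert(static_cast<int32_t>(kLastOffset) == -2146013640, "signed view");
static_assert(GatherAddressSigned(kRightBase, kLastOffset) == 0x4a7f85638ULL,
              "matches the fault address");
static_assert(IntendedAddress(kRightBase, kLastOffset) == 0x5a7f85638ULL,
              "where the row really lives");
```

The two addresses differ by exactly 0x100000000 (4GB), the distance between the signed and unsigned readings of the same 32 bits.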

I'm working on a fix.

@amoeba
Member

amoeba commented Jun 14, 2024

Awesome work @zanmato1984, thank you for figuring this out.

pitrou pushed a commit that referenced this issue Jun 25, 2024
…umnsToRows` (#42188)

### Rationale for this change

AVX2 intrinsics `_mm256_i32gather_epi32`/`_mm256_i32gather_epi64` are used in the `CompareColumnsToRows` API, and they treat `vindex` as a signed integer. In our row table implementation, we use `uint32_t` to represent the offset within the row table. When an offset is `0x80000000` (`2GB`) or larger, the aforementioned intrinsics treat it as a negative offset and gather data from an unintended address. For more details, please see #41813 (comment).

Since AVX2 has no unsigned-32bit-offset or 64bit-offset counterparts of those intrinsics, this issue can be mitigated simply by translating the base address and the offset:
```
new_base = base + 0x80000000;
new_offset = offset - 0x80000000;
```
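
A minimal scalar sketch of why this translation works, modeling the signed-index gather address in plain integer arithmetic rather than with the actual intrinsics. The function names are hypothetical, the sample base address is the `right_base` from the issue discussion, and this is not the code from the PR.

```cpp
#include <cstdint>

// Models a gather with signed 32-bit indices and scale 1: the index is
// sign-extended to 64 bits before being added to the base.
constexpr uint64_t GatherAddressSigned(uint64_t base, uint32_t index_bits) {
  const int64_t index = static_cast<int32_t>(index_bits);
  return base + static_cast<uint64_t>(index);
}

// The translation above: move the base up by 0x80000000 and the offset down
// by the same amount (modulo 2^32). The signed reading of the new offset then
// lands on base + offset for every unsigned 32-bit offset.
constexpr uint64_t GatherAddressRebased(uint64_t base, uint32_t offset) {
  const uint64_t new_base = base + 0x80000000ULL;
  const uint32_t new_offset = offset - 0x80000000u;  // wraps modulo 2^32
  return GatherAddressSigned(new_base, new_offset);
}

constexpr uint64_t kBase = 0x0000000527e1e800ULL;  // right_base from the issue

static_assert(GatherAddressRebased(kBase, 0u) == kBase, "smallest offset");
static_assert(GatherAddressRebased(kBase, 0x80166E38u) == kBase + 0x80166E38ULL,
              "the offset that crashed");
static_assert(GatherAddressRebased(kBase, 0xFFFFFFFFu) == kBase + 0xFFFFFFFFULL,
              "largest offset");
```

Offsets below 0x80000000 become negative indices relative to the shifted base, and offsets at or above it become non-negative ones, so both halves of the unsigned range resolve to the intended address.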

### What changes are included in this PR?

A fix and a unit test (UT) that reproduces the issue.

### Are these changes tested?

UT included.

### Are there any user-facing changes?

None.

* GitHub Issue: #41813

Authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
@pitrou added this to the 17.0.0 milestone on Jun 25, 2024
@pitrou
Member

pitrou commented Jun 25, 2024

Issue resolved by pull request #42188.
