PARQUET-1581: [C++] Fix undefined behavior in encoding.cc #4336
emkornfield wants to merge 3 commits into apache:master from
|
Do you know how this UB could occur? |
|
Hmm, this came up after cleaning up unaligned access UBSan errors, but I think I had a small bug in that cleanup and reverted it. I'll see if I can reproduce without it, but will close this for now until I can (I also need to open a thread about UBSan issues on the mailing list). |
|
I haven't followed the code paths fully, but in DictEncoding/0.CheckDecodeArrowUsingDenseBuilder I believe the last entry exercises this code path. |
|
If the above doesn't sound right I can dig further. |
|
OK. I'm interested in the UBSan report to see what UB is actually occurring. Could we set up a docker-compose build for this? |
|
Sorry, I should have clarified: this was a null reference error. I've updated the PR with a more specific fix. |
|
I'll open follow-up JIRAs, per the discussion on the ML, to get UBSan integrated with the build.
Codecov Report
@@ Coverage Diff @@
## master #4336 +/- ##
==========================================
+ Coverage 88.26% 89.23% +0.96%
==========================================
Files 777 632 -145
Lines 97953 86551 -11402
Branches 1251 0 -1251
==========================================
- Hits 86461 77232 -9229
+ Misses 11256 9319 -1937
+ Partials 236 0 -236
|
|
I opened up https://issues.apache.org/jira/browse/ARROW-5365 to track adding in UBSan and ASAN to CI (I thought we were already doing ASAN though?) |
No we are not. We had Valgrind runs but we disabled them because of regressions when refactoring the CMake build system AFAICT. |
|
Note: null pointers are generally a PITA with C++ undefined behaviour rules. You often need to special-case the null case, even when it seems things should be alright as the size is zero. @bkietz |
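To make the note above concrete, here is a minimal sketch of the kind of special-casing it describes. It is not the code from this PR: `SafeCopy` and `DataOrNull` are hypothetical helper names used only for illustration. The assumptions are that the buffer may be empty, in which case its pointer can be null, and that `std::memcpy` requires valid pointers even when the size is zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not from the PR): copy n bytes, tolerating null
// pointers when n == 0. Passing a null pointer to std::memcpy is undefined
// behaviour even for a zero-length copy, so the empty case returns early.
inline void SafeCopy(void* dst, const void* src, std::size_t n) {
  if (n == 0) return;  // special-case: never hand a null pointer to memcpy
  assert(dst != nullptr && src != nullptr);
  std::memcpy(dst, src, n);
}

// Hypothetical helper: get a pointer to the first element without writing
// &v[0], which indexes a nonexistent element when the vector is empty.
inline const std::uint8_t* DataOrNull(const std::vector<std::uint8_t>& v) {
  return v.empty() ? nullptr : v.data();
}
```

With helpers like these, a call such as `SafeCopy(dst, DataOrNull(values), values.size())` stays well defined when `values` is empty, which is the sort of case UBSan flags as a null pointer or null reference error.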
No description provided.