Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VL] Fix shuffle writer invalid write #4240

Merged
merged 1 commit into from
Jan 2, 2024

Conversation

marin-ma
Copy link
Contributor

@marin-ma marin-ma commented Jan 2, 2024

Fix Invalid write reported by valgrind

/usr/bin/valgrind gluten/cpp/build/velox/tests/velox_shuffle_writer_test --gtest_filter=VeloxShuffleWriterMemoryTest.resizeBinaryBufferTriggerSpill

[ RUN      ] VeloxShuffleWriterMemoryTest.resizeBinaryBufferTriggerSpill
I1225 19:09:26.151989 714619 VeloxShuffleWriter.cc:1227] Evicted all cached payloads. 0 bytes released
==714619== Invalid write of size 4
==714619==    at 0xF9D664D: gluten::VeloxShuffleWriter::splitBinaryType(unsigned int, facebook::velox::FlatVector<facebook::velox::StringView> const&, std::vector<gluten::VeloxShuffleWriter::BinaryBuf, std::allocator<gluten::VeloxShuffleWriter::BinaryBuf> >&) (VeloxShuffleWriter.cc:746)
==714619==    by 0xF9D6A78: gluten::VeloxShuffleWriter::splitBinaryArray(facebook::velox::RowVector const&) (VeloxShuffleWriter.cc:787)
==714619==    by 0xF9D4D51: gluten::VeloxShuffleWriter::splitRowVector(facebook::velox::RowVector const&) (VeloxShuffleWriter.cc:474)
==714619==    by 0xF9D4ADF: gluten::VeloxShuffleWriter::doSplit(facebook::velox::RowVector const&, long) (VeloxShuffleWriter.cc:460)
==714619==    by 0xF9D3B64: gluten::VeloxShuffleWriter::split(std::shared_ptr<gluten::ColumnarBatch>, long) (VeloxShuffleWriter.cc:361)
==714619==    by 0x2E0E7F: gluten::VeloxShuffleWriterTestBase::splitRowVector(gluten::VeloxShuffleWriter&, std::shared_ptr<facebook::velox::RowVector>) (VeloxShuffleWriterTestBase.h:183)
==714619==    by 0x2C9609: gluten::VeloxShuffleWriterMemoryTest_resizeBinaryBufferTriggerSpill_Test::TestBody()::{lambda()#1}::operator()() const (VeloxShuffleWriterTest.cc:674)
==714619==    by 0x2CC4C7: void std::__invoke_impl<void, gluten::VeloxShuffleWriterMemoryTest_resizeBinaryBufferTriggerSpill_Test::TestBody()::{lambda()#1}&>(std::__invoke_other, gluten::VeloxShuffleWriterMemoryTest_resizeBinaryBufferTriggerSpill_Test::TestBody()::{lambda()#1}&) (invoke.h:61)
==714619==    by 0x2CBDFC: std::enable_if<is_invocable_r_v<void, gluten::VeloxShuffleWriterMemoryTest_resizeBinaryBufferTriggerSpill_Test::TestBody()::{lambda()#1}&>, void>::type std::__invoke_r<void, gluten::VeloxShuffleWriterMemoryTest_resizeBinaryBufferTriggerSpill_Test::TestBody()::{lambda()#1}&>(gluten::VeloxShuffleWriterMemoryTest_resizeBinaryBufferTriggerSpill_Test::TestBody()::{lambda()#1}&) (invoke.h:111)
==714619==    by 0x2CB151: std::_Function_handler<void (), gluten::VeloxShuffleWriterMemoryTest_resizeBinaryBufferTriggerSpill_Test::TestBody()::{lambda()#1}>::_M_invoke(std::_Any_data const&) (std_function.h:290)
==714619==    by 0xFAC7D05: std::function<void ()>::operator()() const (std_function.h:590)
==714619==    by 0xFAC7384: gluten::SelfEvictedMemoryPool::checkEvict(long, std::function<void ()>) (MemoryPoolUtils.cc:71)
==714619==  Address 0x27ec8444 is 10,244 bytes inside a block of size 16,384 free'd
==714619==    at 0x484DCD3: realloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==714619==    by 0x20EC25F0: gluten::StdMemoryAllocator::reallocate(void*, long, long, void**) (MemoryAllocator.cc:126)
==714619==    by 0x20EC26A6: gluten::StdMemoryAllocator::reallocateAligned(void*, unsigned long, long, long, void**) (MemoryAllocator.cc:139)
==714619==    by 0x20EC38A8: gluten::ArrowMemoryPool::Reallocate(long, long, long, unsigned char**) (ArrowMemoryPool.cc:37)
==714619==    by 0xFAC7646: gluten::SelfEvictedMemoryPool::Reallocate(long, long, long, unsigned char**) (MemoryPoolUtils.cc:102)
==714619==    by 0x20EF2998: gluten::ShuffleMemoryPool::Reallocate(long, long, long, unsigned char**) (ShuffleMemoryPool.cc:37)
==714619==    by 0x226BD81F: arrow::PoolBuffer::Resize(long, bool) (memory_pool.cc:886)
==714619==    by 0x2CFDD3: arrow::ResizableBuffer::Resize(long) (buffer.h:493)
==714619==    by 0xF9DB86F: gluten::VeloxShuffleWriter::resizePartitionBuffer(unsigned int, long, bool) (VeloxShuffleWriter.cc:1273)
==714619==    by 0xF9DC03F: gluten::VeloxShuffleWriter::shrinkPartitionBuffer(unsigned int) (VeloxShuffleWriter.cc:1340)
==714619==    by 0xF9DC6F1: gluten::VeloxShuffleWriter::shrinkPartitionBuffersMinSize(long) (VeloxShuffleWriter.cc:1423)
==714619==    by 0xF9DAE6F: gluten::VeloxShuffleWriter::reclaimFixedSize(long, long*) (VeloxShuffleWriter.cc:1206)
==714619==  Block was alloc'd at
==714619==    at 0x484E120: memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==714619==    by 0x20EC2594: gluten::StdMemoryAllocator::allocateAligned(unsigned long, long, void**) (MemoryAllocator.cc:120)
==714619==    by 0x20EC3700: gluten::ArrowMemoryPool::Allocate(long, long, unsigned char**) (ArrowMemoryPool.cc:30)
==714619==    by 0xFAC7523: gluten::SelfEvictedMemoryPool::Allocate(long, long, unsigned char**) (MemoryPoolUtils.cc:95)
==714619==    by 0x20EF285D: gluten::ShuffleMemoryPool::Allocate(long, long, unsigned char**) (ShuffleMemoryPool.cc:25)
==714619==    by 0x226BD675: arrow::PoolBuffer::Reserve(long) (memory_pool.cc:867)
==714619==    by 0x226BD8C3: arrow::PoolBuffer::Resize(long, bool) (memory_pool.cc:891)
==714619==    by 0x226BA102: arrow::Result<std::unique_ptr<arrow::ResizableBuffer, std::default_delete<arrow::ResizableBuffer> > > arrow::(anonymous namespace)::ResizePoolBuffer<std::unique_ptr<arrow::ResizableBuffer, std::default_delete<arrow::ResizableBuffer> >, std::unique_ptr<arrow::PoolBuffer, std::default_delete<arrow::PoolBuffer> > >(std::unique_ptr<arrow::PoolBuffer, std::default_delete<arrow::PoolBuffer> >&&, long) (memory_pool.cc:931)
==714619==    by 0x226B9B8D: arrow::AllocateResizableBuffer(long, long, arrow::MemoryPool*) (memory_pool.cc:958)
==714619==    by 0x226B9B19: arrow::AllocateResizableBuffer(long, arrow::MemoryPool*) (memory_pool.cc:951)
==714619==    by 0xF9D88AF: gluten::VeloxShuffleWriter::allocatePartitionBuffer(unsigned int, unsigned int) (VeloxShuffleWriter.cc:1009)
==714619==    by 0xF9DD1A6: gluten::VeloxShuffleWriter::preAllocPartitionBuffers(unsigned int) (VeloxShuffleWriter.cc:1491)

Copy link

github-actions bot commented Jan 2, 2024

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

@marin-ma
Copy link
Contributor Author

marin-ma commented Jan 2, 2024

@FelixYBW Please help to review. Thanks!

@marin-ma marin-ma merged commit fdd65ab into apache:main Jan 2, 2024
14 of 16 checks passed
@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_4240_time.csv log/native_master_01_01_2024_fcb31fc6b_time.csv difference percentage
q1 33.38 34.09 0.712 102.13%
q2 24.98 24.97 -0.009 99.96%
q3 38.11 38.31 0.197 100.52%
q4 39.82 38.90 -0.916 97.70%
q5 71.79 70.30 -1.493 97.92%
q6 6.63 7.11 0.479 107.23%
q7 88.17 86.69 -1.475 98.33%
q8 87.48 86.97 -0.510 99.42%
q9 127.30 125.90 -1.400 98.90%
q10 44.32 46.03 1.707 103.85%
q11 19.91 20.46 0.547 102.75%
q12 26.81 28.23 1.421 105.30%
q13 46.87 45.95 -0.917 98.04%
q14 17.21 17.54 0.332 101.93%
q15 27.19 28.69 1.508 105.55%
q16 15.12 16.88 1.752 111.59%
q17 104.48 104.27 -0.206 99.80%
q18 150.79 151.07 0.280 100.19%
q19 14.81 12.72 -2.095 85.85%
q20 27.06 27.40 0.344 101.27%
q21 225.93 226.10 0.170 100.08%
q22 14.12 14.00 -0.115 99.18%
total 1252.27 1252.58 0.315 100.03%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants