Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix](BufferedReader) fix BufferedReader::_read_once call memcpy function using null pointor _buffer #27775

Conversation

ryanzryu
Copy link
Contributor

Proposed changes

I got a core stack like this:
#0 _mm_loadu_si128(long long __vector(2) const*) (P=0x576ff) at /var/local/ldb-toolchain/lib/gcc/x86_64-linux-gnu/11/include/emmintrin.h:703
#1 inline_memcpy (size=77472, src
=, dst
=0x7f76a5252d00) at /data/doris-1.x/be/src/glibc-compatibility/memcpy/memcpy_x86_64.cpp:187
#2 memcpy (dst=0x7f76a5252d00, src=, size=size@entry=77472) at /data/doris-1.x/be/src/glibc-compatibility/memcpy/memcpy_x86_64.cpp:219
#3 0x0000557c566d3abd in memcpy (__len=77472, __src=, __dest=) at /var/local/ldb-toolchain/usr/include/x86_64-linux-gnu/bits/string_fortified.h:34
#4 doris::BufferedReader::_read_once (this=0x7f7647e98180, position=, nbytes=, bytes_read=0x7f7a19088178, out=)
at /data/doris-1.x/be/src/io/buffered_reader.cpp:137
#5 0x0000557c566d3c42 in doris::BufferedReader::readat (this=0x7f7647e98180, position=358146, nbytes=77472, bytes_read=0x7f7a19088178, out=0x7f76a5252d00)
at /data/doris-1.x/be/src/io/buffered_reader.cpp:94
#6 0x0000557c56817a27 in doris::ArrowFile::ReadAt (this=0x7f765b4ec4b0, position=, nbytes=77472, out=0x7f76a5252d00)
at /data/doris-1.x/be/src/exec/arrow/arrow_reader.cpp:228
#7 0x0000557c5d07a6d0 in ?? ()
#8 0x0000557c5dd4e724 in orc::SeekableFileInputStream::Next(void const**, int*) ()
#9 0x0000557c5dd4fd0c in orc::DecompressionStream::readBuffer(bool) ()
#10 0x0000557c5dd4fa4f in orc::DecompressionStream::readHeader() ()
#11 0x0000557c5dd4fe30 in orc::DecompressionStream::Next(void const**, int*) ()
#12 0x0000557c5dd5aed1 in orc::StringDirectColumnReader::next(orc::ColumnVectorBatch&, unsigned long, char*) ()
#13 0x0000557c5dd5a207 in orc::StructColumnReader::next(orc::ColumnVectorBatch&, unsigned long, char*) ()
#14 0x0000557c5dd4679b in orc::RowReaderImpl::next(orc::ColumnVectorBatch&) ()
#15 0x0000557c5d07b243 in ?? ()
#16 0x0000557c5ce7403f in arrow::RecordBatchReader::ReadAll(std::vector<std::shared_ptrarrow::RecordBatch, std::allocator<std::shared_ptrarrow::RecordBatch > >*) ()
#17 0x0000557c57bed85d in doris::ORCReaderWrap::read_batches (this=0x7f7a0f5a0400, batches=..., current_group=)
at /var/local/ldb-toolchain/include/c++/11/bits/shared_ptr_base.h:1290
#18 0x0000557c5681907f in doris::ArrowReaderWrap::prefetch_batch (this=0x7f7a0f5a0400) at /data/doris-1.x/be/src/exec/arrow/arrow_reader.cpp:183
#19 0x0000557c5e7933e0 in execute_native_thread_routine ()
#20 0x00007f7aa0215ea5 in start_thread () from /lib64/libpthread.so.0
#21 0x00007f7aa05289fd in clone () from /lib64/libc.so.6

I gdb the core file and found be core because of BufferedReader::_read_once call memcpy using null pointor _buffer.

image
image

I have read the relevant code, and I couldn't find any place that sets the _buffer to a null pointer except for BufferedReader::close(). The close() function is only called when BufferedReader is destructed. I didn't find any place where BufferedReader continues to be used after it is destructed. Therefore, I will temporarily add a null pointer protection before memcpy.

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@ryanzryu
Copy link
Contributor Author

I found master and 2.0 branch don't have BufferedReader any more,so I commit to 1.2 branch Individualy

Copy link
Member

@xy720 xy720 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like a thread has accessed a previously freed wild pointer. Can you test it with ASAN

@ryanzryu
Copy link
Contributor Author

ryanzryu commented Nov 29, 2023 via email

Copy link
Contributor

@lide-reed lide-reed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ryanzryu
Copy link
Contributor Author

ryanzryu commented Dec 20, 2023 via email

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 20, 2023
Copy link
Contributor

PR approved by at least one committer and no changes requested.

Copy link
Contributor

PR approved by anyone and no changes requested.

@morningman morningman merged commit d9d9511 into apache:branch-1.2-lts Mar 15, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. dev/1.2.9-merged reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants