[Bug] Spill GC crashes due to missing pointer dereference in LocalFileSystem::list_impl #60904

@xuchenhao

Description

Search before asking

  • I have searched the issues and found no similar ones.

Version

master

What's Wrong?

CI run: http://43.132.222.7:8111/buildConfiguration/Doris_DorisRegression_P0Regression/896382?expandBuildDeploymentsSection=false&hideTestsFromDependencies=false&hideProblemsFromDependencies=false&expandPull+Request+Details=true&expandBuildProblemsSection=true&expandBuildTestsSection=true&expandBuildChangesSection=true

During P0 regression testing, the BE process crashed with AddressSanitizer reporting a CHECK failure. The stack trace shows the crash originating in LocalFileSystem::list_impl (local_file_system.cpp:248), called from the spill GC thread.

AddressSanitizer: CHECK failed: sanitizer_posix_libcdep.cpp:319 "((14)) == ((write_errno))" (0xe, 0x20) (tid=41007)
    #0 0x55d2766388e1 in __asan::CheckUnwind() (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297bd8e1)
    #1 0x55d276653182 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297d8182)
    #2 0x55d2766556cf in __sanitizer::IsAccessibleMemoryRange(unsigned long, unsigned long) (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297da6cf)
    #3 0x55d27667237a in __ubsan::checkDynamicType(void*, void*, unsigned long) (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297f737a)
    #4 0x55d276671712 in HandleDynamicTypeCacheMiss(__ubsan::DynamicTypeCacheMissData*, unsigned long, unsigned long, __ubsan::ReportOptions) (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297f6712)
    #5 0x55d2766716e3 in __ubsan_handle_dynamic_type_cache_miss (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297f66e3)
    #6 0x55d276bcc7a7 in doris::io::LocalFileSystem::list_impl(std::filesystem::__cxx11::path const&, bool, std::vector<doris::io::FileInfo, std::allocator<doris::io::FileInfo> >*, bool*) /root/doris/be/build_ASAN/../src/io/fs/local_file_system.cpp:248:32
    #7 0x55d2768c62b0 in doris::io::FileSystem::list(std::filesystem::__cxx11::path const&, bool, std::vector<doris::io::FileInfo, std::allocator<doris::io::FileInfo> >*, bool*) /root/doris/be/build_ASAN/../src/io/fs/file_system.cpp:84:5
    #8 0x55d2933cd95e in doris::vectorized::SpillStreamManager::gc(int) /root/doris/be/build_ASAN/../src/vec/spill/spill_stream_manager.cpp:233:49
    #9 0x55d2933ccce1 in doris::vectorized::SpillStreamManager::_spill_gc_thread_callback() /root/doris/be/build_ASAN/../src/vec/spill/spill_stream_manager.cpp:110:9
    #10 0x55d27c171956 in std::function<void ()>::operator()() const /usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/std_function.h:593:9
    #11 0x55d27c171956 in doris::Thread::supervise_thread(void*) /root/doris/be/build_ASAN/../src/util/thread.cpp:460:5
    #12 0x55d276628d26 in asan_thread_start(void*) (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297add26)
    #13 0x7f460580a608 in start_thread /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477:8
    #14 0x7f460571d132 in __clone /build/glibc-SzIz7B/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The bug is in the following code segment:

Status LocalFileSystem::list_impl(const Path& dir, bool only_file, std::vector<FileInfo>* files,
                                  bool* exists) {
    RETURN_IF_ERROR(exists_impl(dir, exists));
    if (!exists) {  // BUG: tests whether the pointer is null, not the bool it points to, so a valid pointer never takes this early return.
        return Status::OK();
    }
    // ... rest of the function
}

What You Expected?

The condition should dereference the pointer to check the actual boolean value.

Status LocalFileSystem::list_impl(const Path& dir, bool only_file, std::vector<FileInfo>* files,
                                  bool* exists) {
    RETURN_IF_ERROR(exists_impl(dir, exists));
    if (!*(exists)) {  // CORRECT: Check the value pointed to by 'exists'.
        return Status::OK();
    }
    // ... rest of the function
}
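
To illustrate the difference between the two conditions, here is a minimal, self-contained sketch. It does not use the Doris code base: `probe_exists`, `list_buggy`, `list_fixed`, and `Outcome` are hypothetical stand-ins, and the "directory" is simulated with a string comparison. It only shows that the pointer check falls through for a missing directory, while the dereferenced check returns early.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for exists_impl(): reports the result through the
// out-parameter. Only the path "/present" is treated as existing.
void probe_exists(const std::string& dir, bool* exists) {
    *exists = (dir == "/present");
}

enum class Outcome { ReturnedEarly, ProceededToList };

// Mirrors the buggy guard: `!exists` tests the pointer itself.
Outcome list_buggy(const std::string& dir, bool* exists) {
    probe_exists(dir, exists);
    if (!exists) {  // false for any non-null pointer
        return Outcome::ReturnedEarly;
    }
    return Outcome::ProceededToList;  // reached even when *exists == false
}

// Mirrors the proposed fix: `!*exists` tests the value that was written.
Outcome list_fixed(const std::string& dir, bool* exists) {
    probe_exists(dir, exists);
    if (!*exists) {
        return Outcome::ReturnedEarly;
    }
    return Outcome::ProceededToList;
}
```

With a missing path, `list_buggy` continues to the listing step although `*exists` is false, whereas `list_fixed` returns early. Both behave the same for a path that exists.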

How to Reproduce?

No response

Anything Else?

No response

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!
