ASAN and LSAN occasionally get stuck in __sanitizer::internal_read(int, void*, unsigned long) #1477

Open
shabiel opened this issue Dec 22, 2021 · 4 comments

Comments

shabiel commented Dec 22, 2021

Full Stack

This does not happen every time, only very occasionally, when sockets and TLS are used in our code base. I hope somebody can tell me what additional information would help diagnose the issue. It seems that somewhere Clang/LLVM is trying to read from an fd that it can't read from.

clang version 10.0.0-4ubuntu1
(gdb) bt
#0  0x00000000004b1ef4 in __sanitizer::internal_read(int, void*, unsigned long) ()
#1  0x00000000004b3ec0 in __sanitizer::ReadFromFile(int, void*, unsigned long, unsigned long*, int*) ()
#2  0x00000000004bfc9a in __sanitizer::SymbolizerProcess::ReadFromSymbolizer(char*, unsigned long) ()
#3  0x00000000004bfa29 in __sanitizer::SymbolizerProcess::SendCommand(char const*) ()
#4  0x00000000004bf45c in __sanitizer::LLVMSymbolizer::SymbolizePC(unsigned long, __sanitizer::SymbolizedStack*) ()
#5  0x00000000004be9f9 in __sanitizer::Symbolizer::SymbolizePC(unsigned long) ()
#6  0x00000000004bd3ed in __sanitizer::StackTrace::Print() const ()
#7  0x000000000042ca35 in __asan::StackAddressDescription::Print() const ()
#8  0x000000000042f6a6 in __asan::ErrorGeneric::Print() ()
#9  0x000000000049fe29 in __asan::ScopedInErrorReport::~ScopedInErrorReport() ()
#10 0x00000000004a1a5e in __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) ()
#11 0x0000000000436518 in strlen ()
#12 0x000000000051707e in find_function (function_name=<optimized out>, function_hash=<optimized out>)
    at /home/sam/work/gitlab/YDBOcto/src/find_function.c:55
#13 0x0000000000518828 in function_call_data_type_check (fc=<optimized out>, type=<optimized out>, parse_context=<optimized out>,
    table=<optimized out>) at /home/sam/work/gitlab/YDBOcto/src/function_call_data_type_check.c:121
#14 0x000000000058ca36 in populate_data_type (v=<optimized out>, type=<optimized out>, parse_context=<optimized out>)
    at /home/sam/work/gitlab/YDBOcto/src/populate_data_type.c:365
#15 0x0000000000590b38 in populate_data_type (v=<optimized out>, type=<optimized out>, parse_context=<optimized out>)
    at /home/sam/work/gitlab/YDBOcto/src/populate_data_type.c:513
#16 0x000000000058c6c4 in populate_data_type_column_list (v=<optimized out>, type=0x7ffe9cff9400, do_loop=0, callback=0x0,
    parse_context=0x7ffe9cffd9c0) at /home/sam/work/gitlab/YDBOcto/src/populate_data_type.c:91
#17 populate_data_type (v=<optimized out>, type=<optimized out>, parse_context=<optimized out>)
    at /home/sam/work/gitlab/YDBOcto/src/populate_data_type.c:634
#18 0x000000000058c422 in populate_data_type_column_list_alias (v=<optimized out>, type=0x7ffe9cff9430, do_loop=1, parse_context=0x7ffe9cffd9c0)
    at /home/sam/work/gitlab/YDBOcto/src/populate_data_type.c:52
#19 populate_data_type (v=<optimized out>, type=<optimized out>, parse_context=<optimized out>)
    at /home/sam/work/gitlab/YDBOcto/src/populate_data_type.c:313
#20 0x000000000058fc79 in populate_data_type (v=<optimized out>, type=<optimized out>, parse_context=<optimized out>)
    at /home/sam/work/gitlab/YDBOcto/src/populate_data_type.c:724
#21 0x0000000000585300 in validate_query_expression (query_expression=<optimized out>, parse_context=0x3fff, cmd_type=<optimized out>)
    at /home/sam/work/gitlab/YDBOcto/src/parser/validate_query_expression.c:44
#22 0x00000000005c5867 in yyparse (scanner=<optimized out>, out=<optimized out>, plan_id=<optimized out>, parse_context=<optimized out>)
    at /home/sam/work/gitlab/YDBOcto/build/parser.y:271
#23 0x0000000000579f49 in parse_line (parse_context=<optimized out>) at /home/sam/work/gitlab/YDBOcto/src/parse_line.c:42
#24 0x000000000059d5b1 in run_query (callback=<optimized out>, parms=<optimized out>, msg_type=<optimized out>, parse_context=<optimized out>) at /home/sam/work/gitlab/YDBOcto/src/run_query.c:175
#25 0x00000000004e1ba1 in handle_query (query=<optimized out>, session=<optimized out>) at /home/sam/work/gitlab/YDBOcto/src/rocto/handle_query.c:55
#26 0x00000000004efe51 in rocto_main_loop (session=<optimized out>) at /home/sam/work/gitlab/YDBOcto/src/rocto/rocto_main_loop.c:98
#27 0x00000000004cf4cc in main (argc=<optimized out>, argv=<optimized out>) at /home/sam/work/gitlab/YDBOcto/src/rocto.c:600

and

(gdb) bt
#0  0x00000000004b1ef4 in __sanitizer::internal_read(int, void*, unsigned long) ()
#1  0x00000000004b3ec0 in __sanitizer::ReadFromFile(int, void*, unsigned long, unsigned long*, int*) ()
#2  0x00000000004bfc9a in __sanitizer::SymbolizerProcess::ReadFromSymbolizer(char*, unsigned long) ()
#3  0x00000000004bfa29 in __sanitizer::SymbolizerProcess::SendCommand(char const*) ()
#4  0x00000000004bf45c in __sanitizer::LLVMSymbolizer::SymbolizePC(unsigned long, __sanitizer::SymbolizedStack*) ()
#5  0x00000000004be9f9 in __sanitizer::Symbolizer::SymbolizePC(unsigned long) ()
#6  0x00000000004bd3ed in __sanitizer::StackTrace::Print() const ()
#7  0x00000000004c39ea in __lsan::LeakReport::PrintReportForLeak(unsigned long) ()
#8  0x00000000004c37e5 in __lsan::LeakReport::ReportTopLeaks(unsigned long) ()
#9  0x00000000004c2f6f in __lsan::CheckForLeaks() ()
#10 0x00000000004c3125 in __lsan::DoRecoverableLeakCheckVoid() ()
#11 0x00007f23d80cda27 in __run_exit_handlers (status=0, listp=0x7f23d826f718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true,
    run_dtors=run_dtors@entry=true) at exit.c:108
#12 0x00007f23d80cdbe0 in __GI_exit (status=<optimized out>) at exit.c:139
#13 0x00007f23d80ab0ba in __libc_start_main (main=0x4cb740 <main>, argc=4, argv=0x7ffdd9a21608, init=<optimized out>, fini=<optimized out>,
    rtld_fini=<optimized out>, stack_end=0x7ffdd9a215f8) at ../csu/libc-start.c:342
#14 0x00000000004238ce in _start ()

shabiel commented Dec 23, 2021

This seems to have been fixed in Clang 13 (or earlier). Therefore, closing.

@shabiel shabiel closed this as completed Dec 23, 2021

shabiel commented Dec 23, 2021

I take that back. The problem is still present in Clang 13. Here's the stack:

0x00005599166c668b in __sanitizer::internal_read(int, void*, unsigned long) ()
(gdb) bt
#0  0x00005599166c668b in __sanitizer::internal_read(int, void*, unsigned long) ()
#1  0x00005599166c8360 in __sanitizer::ReadFromFile(int, void*, unsigned long, unsigned long*, int*) ()
#2  0x00005599166d6a1a in __sanitizer::SymbolizerProcess::ReadFromSymbolizer(char*, unsigned long) [clone .part.0] ()
#3  0x00005599166d7a1a in __sanitizer::SymbolizerProcess::SendCommand(char const*) ()
#4  0x00005599166d82f8 in __sanitizer::Symbolizer::SymbolizePC(unsigned long) ()
#5  0x00005599166d4a7e in __sanitizer::StackTrace::PrintTo(__sanitizer::InternalScopedString*) const ()
#6  0x00005599166d50da in __sanitizer::StackTrace::Print() const ()
#7  0x00005599166de5f8 in __lsan::LeakReport::PrintReportForLeak(unsigned long) ()
#8  0x00005599166de9a1 in __lsan::LeakReport::ReportTopLeaks(unsigned long) ()
#9  0x00005599166dedff in __lsan::PrintResults(__lsan::LeakReport&) ()
#10 0x00005599166df0ef in __lsan::CheckForLeaks() ()
#11 0x00005599166df206 in __lsan::DoRecoverableLeakCheckVoid() ()
#12 0x00007f9a1793da8e in __cxa_finalize () from /usr/lib/libc.so.6
#13 0x00005599166057d8 in __do_global_dtors_aux ()
#14 0x00007ffed057d740 in ?? ()
#15 0x00007f9a185cb1a4 in _dl_fini () from /lib64/ld-linux-x86-64.so.2

neo1973 commented Sep 8, 2022

Same problem when using TSAN with clang 14.0.6 on Arch Linux:

(gdb) bt
#0  0x0000555556c70d2f in __sanitizer::internal_read(int, void*, unsigned long) ()
#1  0x0000555556c729b4 in __sanitizer::ReadFromFile(int, void*, unsigned long, unsigned long*, int*) ()
#2  0x0000555556c87a64 in __sanitizer::SymbolizerProcess::ReadFromSymbolizer(char*, unsigned long) ()
#3  0x0000555556c89089 in __sanitizer::SymbolizerProcess::SendCommand(char const*) ()
#4  0x0000555556c89cb8 in __sanitizer::Symbolizer::SymbolizeData(unsigned long, __sanitizer::DataInfo*) ()
#5  0x0000555556d23732 in __tsan::SymbolizeData(unsigned long) ()
#6  0x0000555556d1cc16 in __tsan::ScopedReportBase::AddLocation(unsigned long, unsigned long) ()
#7  0x0000555556d1fbb5 in __tsan::ReportRace(__tsan::ThreadState*, __tsan::RawShadow*, __tsan::Shadow, __tsan::Shadow, unsigned long) ()
#8  0x0000555556ca8cb0 in memcpy ()
#9  0x00007fffe6fd5366 in memcpy () at /usr/include/bits/string_fortified.h:29
#10 si_pm4_emit () at ../mesa-22.1.7/src/gallium/drivers/radeonsi/si_pm4.c:148
#11 0x00007fffe7039b0b in si_begin_new_gfx_cs () at ../mesa-22.1.7/src/gallium/drivers/radeonsi/si_gfx_cs.c:422
#12 0x00007fffe703a794 in si_flush_gfx_cs () at ../mesa-22.1.7/src/gallium/drivers/radeonsi/si_gfx_cs.c:169
#13 0x00007fffe7041aa8 in si_flush_all_queues () at ../mesa-22.1.7/src/gallium/drivers/radeonsi/si_fence.c:501
#14 si_flush_from_st () at ../mesa-22.1.7/src/gallium/drivers/radeonsi/si_fence.c:549
#15 0x00007fffe697691f in st_flush () at ../mesa-22.1.7/src/mesa/state_tracker/st_cb_flush.c:60
#16 st_context_flush () at ../mesa-22.1.7/src/mesa/state_tracker/st_manager.c:808
#17 0x00007fffe6876f9e in dri_flush () at ../mesa-22.1.7/src/gallium/frontends/dri/dri_drawable.c:522
#18 0x00007fffefd7ee91 in dri2_wl_swap_buffers_with_damage () at ../mesa-22.1.7/src/egl/drivers/dri2/platform_wayland.c:1575
#19 0x00007fffefd6ddb8 in dri2_swap_buffers () at ../mesa-22.1.7/src/egl/drivers/dri2/egl_dri2.c:2014
#20 0x00007fffefd64939 in eglSwapBuffers () at ../mesa-22.1.7/src/egl/main/eglapi.c:1351
#21 0x0000555557ba4075 in CEGLContextUtils::TrySwapBuffers (this=0x7b7400004da8) at /..xbmc/utils/EGLUtils.cpp:597
#22 0x000055555957495f in KODI::WINDOWING::WAYLAND::CWinSystemWaylandEGLContext::PresentFrame (this=0x7b7400004600, rendered=true) at /..xbmc/windowing/wayland/WinSystemWaylandEGLContext.cpp:129
#23 0x00005555595768cb in KODI::WINDOWING::WAYLAND::CWinSystemWaylandEGLContextGL::PresentRenderImpl (this=0x7b7400004600, rendered=true) at /..xbmc/windowing/wayland/WinSystemWaylandEGLContextGL.cpp:119
#24 0x0000555559576932 in non-virtual thunk to KODI::WINDOWING::WAYLAND::CWinSystemWaylandEGLContextGL::PresentRenderImpl(bool) () at /..xbmc/windowing/wayland/WinSystemWaylandEGLContextGL.cpp:120
#25 0x00005555571c2746 in CRenderSystemGL::PresentRender (this=0x7b7400004de8, rendered=true, videoLayer=false) at /..xbmc/rendering/gl/RenderSystemGL.cpp:295
#26 0x000055555793c66f in CGraphicContext::Flip (this=0x7b6000007400, rendered=true, videoLayer=false) at /..xbmc/windowing/GraphicContext.cpp:983
#27 0x00005555584b539f in CApplication::Render (this=0x7b7000000800) at /..xbmc/Application.cpp:909
#28 0x00005555584bd183 in CApplication::Run (this=0x7b7000000800) at /..xbmc/Application.cpp:1860
#29 0x0000555557c49a2e in XBMC_Run (renderGUI=true, params=warning: RTTI symbol not found for class 'std::_Sp_counted_ptr_inplace<CAppParams, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>'
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr_inplace<CAppParams, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>'
std::shared_ptr<CAppParams> (use count 3, weak count 0) = {...}) at /..xbmc/platform/xbmc.cpp:64
#30 0x0000555556d29430 in main (argc=1, argv=0x7fffffffe418) at /..xbmc/platform/posix/main.cpp:69

zu1k commented Jun 14, 2023

Check your open-files limit with ulimit -n. If this number is very large, it is the likely culprit.
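To check the same limit from inside a process rather than from the shell, a minimal sketch using POSIX `getrlimit` could look like this. The helper name `soft_fd_limit` is made up for illustration and is not part of compiler-rt.

```c
#include <limits.h>
#include <sys/resource.h>

/* Hypothetical helper (not from compiler-rt): returns the soft limit on
 * open file descriptors, i.e. the value `ulimit -n` reports in the shell.
 * Returns -1 if the limit cannot be read, and LLONG_MAX if the limit is
 * reported as unlimited. */
long long soft_fd_limit(void) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return LLONG_MAX;
    return (long long)rl.rlim_cur;
}
```

Printing the result of `soft_fd_limit()` at startup of the affected program shows the value the sanitizer runtime will see, which may differ from the limit in your interactive shell (for example under a service manager or in a container).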

Launching llvm-symbolizer via fork followed by execve requires closing the file descriptors inherited from the parent. compiler-rt takes a crude approach to this, so when the maximum file descriptor limit is very large, the child process gets stuck closing descriptors and pins the CPU. The parent, meanwhile, blocks in internal_read waiting for the symbolizer's output.

https://github.com/llvm/llvm-project/blob/f9d0bf06319203a8cbb47d89c2f39d2c782f3887/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp#L465
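The pattern described above can be sketched as follows. This is an illustration of the general "close every possible descriptor before exec" technique, not a copy of the compiler-rt source. The function names `close_fds_above_stdio` and `run_close_loop_in_child` are hypothetical, and the upper bound is passed in as a parameter so the cost is easy to vary.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: attempts to close every descriptor number in
 * (2, max_fd], leaving stdin/stdout/stderr open. Returns the number of
 * close() calls issued. Most calls fail with EBADF because the descriptor
 * was never open, but each one is still a system call. */
long close_fds_above_stdio(long max_fd) {
    long calls = 0;
    for (long fd = max_fd; fd > 2; fd--) {
        close((int)fd);
        calls++;
    }
    return calls;
}

/* Forks a child that runs the close loop and then exits.
 * Returns 0 if the child exited cleanly, -1 otherwise. */
int run_close_loop_in_child(long max_fd) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        close_fds_above_stdio(max_fd);
        _exit(0); /* a real launcher would execve() the symbolizer here */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling `run_close_loop_in_child(sysconf(_SC_OPEN_MAX))` makes the run time scale with the limit: with a limit of 1024 the child finishes almost instantly, while a limit in the hundreds of millions means hundreds of millions of system calls before the exec would happen. If that is what you are seeing, lowering the limit in the launching shell (for example `ulimit -n 1024`) before starting the instrumented program should shrink the loop accordingly.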
