# Process hang with high CPU usage

Our dockerized .NET application hangs and we don't know why. All we know is that the process is using a lot of CPU time (CPU usage is close to 100%, i.e. it keeps one CPU core fully busy).

## Steps to reproduce

1. Run the test app:

```bash
docker run -it 6opuc/lldb-netcore-use-cases InfiniteLoop
```

2. Check that our app is using a lot of CPU time (%CPU is around 100.0):

```bash
top -c -p $(pgrep -d',' -f dotnet)
```
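
If you prefer a non-interactive check (for example, from a script), `top` can also be run in batch mode. A minimal sketch, assuming the usual procps `top` is available on the host:

```bash
# Print a single snapshot of CPU usage for all dotnet processes and exit:
# -b = batch mode (plain-text output), -n 1 = one iteration only
top -b -n 1 -p $(pgrep -d',' -f dotnet)
```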

## Steps to analyze

1. Get the id of the container running our application (`dotnet Runner.dll ...`):

```bash
docker ps
```

2. Run a debugging container with the `createdump` utility, attached to the application container's network and PID namespaces:

```bash
docker run --rm -it \
	--cap-add sys_admin \
	--cap-add sys_ptrace \
	--net=container:f827a6b79300 \
	--pid=container:f827a6b79300 \
	-v /tmp:/tmp \
	6opuc/lldb-netcore \
	/bin/bash
```

Here `f827a6b79300` is the id of the container running our application.
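
To avoid copy-pasting the container id by hand, it can also be looked up by image name. This is a convenience sketch, assuming the application container was started from the `6opuc/lldb-netcore-use-cases` image:

```bash
# Look up the application container id by the image it was started from
APP_CID=$(docker ps -q --filter "ancestor=6opuc/lldb-netcore-use-cases")

docker run --rm -it \
	--cap-add sys_admin \
	--cap-add sys_ptrace \
	--net=container:$APP_CID \
	--pid=container:$APP_CID \
	-v /tmp:/tmp \
	6opuc/lldb-netcore \
	/bin/bash
```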

3. Find the PID of the dotnet process we need to analyze (`dotnet Runner.dll ...`):

```bash
ps aux
```

In this example the PID is `1`.
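
The same PID can usually be found without scanning the full `ps aux` output. A minimal sketch, assuming the process command line contains `Runner.dll` as in this example:

```bash
# Print the PIDs of processes whose full command line matches the pattern
pgrep -f 'dotnet Runner.dll'
```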

4. Find the id of the thread with the highest CPU utilization (%CPU is around 100.0); with `-H`, `top` shows one line per thread and the PID column contains thread ids:

```bash
top -p 1 -H
```

Here `1` is the PID of our process. In this example the thread id is `16`.
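
A non-interactive alternative is to list threads with `ps`. Note that `ps` reports %CPU averaged over each thread's lifetime rather than over a sampling interval, so the numbers differ from `top`; a sketch, assuming the procps `ps` found on most Linux distributions:

```bash
# -L shows one line per thread (LWP = thread id); sort by CPU usage, busiest first
ps -L -p 1 -o lwp,pcpu,comm --sort=-pcpu | head
```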

5. Create a coredump of the dotnet process and exit from the container:

```bash
createdump -u -f /tmp/coredump 1
exit
```

Here `1` is the PID of the dotnet process.
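
If the `dotnet-dump` global tool happens to be available in the debugging container (an assumption; the steps above do not depend on it), it can be used as an alternative way to collect an equivalent full dump:

```bash
# Alternative, assuming dotnet-dump is installed: full dump of process 1
dotnet-dump collect -p 1 -o /tmp/coredump --type Full
```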

6. Open the coredump with the debugger:

```bash
docker run --rm -it -v /tmp/coredump:/tmp/coredump 6opuc/lldb-netcore
```
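
If the image drops you into a plain shell instead of starting the debugger directly (an assumption about the image's entrypoint), the dump can be opened explicitly with lldb:

```bash
# Open the core dump against the dotnet host binary
lldb --core /tmp/coredump dotnet
```
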
7. Print the list of all threads using the `thread list` command:

```
* thread #1: tid = 1, 0x00007feb80ca117f libpthread.so.0`__pthread_cond_wait + 191, name = 'dotnet', stop reason = signal SIGABRT
  thread #2: tid = 7, 0x00007feb8013c469 libc.so.6`syscall + 25, stop reason = signal SIGABRT
  thread #3: tid = 8, 0x00007feb8013c469 libc.so.6`syscall + 25, stop reason = signal SIGABRT
  thread #4: tid = 11, 0x00007feb801378bd libc.so.6`__poll + 45, stop reason = signal SIGABRT
  thread #5: tid = 12, 0x00007feb80ca485d libpthread.so.0`__GI_open64 + 45, stop reason = signal SIGABRT
  thread #6: tid = 13, 0x00007feb80ca117f libpthread.so.0`__pthread_cond_wait + 191, stop reason = signal SIGABRT
  thread #7: tid = 14, 0x00007feb80ca1528 libpthread.so.0`__pthread_cond_timedwait + 296, stop reason = signal SIGABRT
  thread #8: tid = 15, 0x00007feb80ca1528 libpthread.so.0`__pthread_cond_timedwait + 296, stop reason = signal SIGABRT
  thread #9: tid = 16, 0x00007feb06373dcb, stop reason = signal SIGABRT
  thread #10: tid = 17, 0x00007feb80ca1528 libpthread.so.0`__pthread_cond_timedwait + 296, stop reason = signal SIGABRT
```

Find our thread by its `tid`: it should match the thread id we found with `top -p 1 -H` (16 in this example). The corresponding lldb thread number is therefore 9 (`thread #9`).

8. Switch to that thread:

```
thread select 9
```

Here `9` is the thread number (`thread #9`) from the `thread list` output.
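
Before looking at managed frames, it can be useful to glance at the native stack of the selected thread. This step is optional and uses only a built-in lldb command:

```
# Native stack of the currently selected thread
thread backtrace
```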

9. Print the managed stack trace of the thread using the `clrstack` command:

```
OS Thread Id: 0x10 (9)
        Child SP               IP Call Site
00007FEAE657F900 00007FEB06373DCB Runner.InfiniteLoop+<>c.<Run>b__0_0()
00007FEAE657F920 00007FEB066A50B0 System.Threading.Thread.ThreadMain_ThreadStart()
00007FEAE657F930 00007FEB05D947FD System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
00007FEAE657FC90 00007feb7f53b17f [GCFrame: 00007feae657fc90]
00007FEAE657FD50 00007feb7f53b17f [DebuggerU2MCatchHandlerFrame: 00007feae657fd50]
```

Note that the `OS Thread Id` is printed in hexadecimal: 0x10 = 16, which matches the thread id we found with `top`.

We can see that the thread is busy somewhere inside `Runner.InfiniteLoop.Run()`.

10. Look at the source code of the method `Runner.InfiniteLoop.Run`: https://github.com/6opuc/lldb-netcore-use-cases/blob/master/src/Runner/InfiniteLoop.cs

```csharp
// The worker thread spins forever without yielding, keeping one CPU core fully busy
var thread = new Thread(() =>
{
    while (true)
    {
    }
});
thread.Start();
thread.Join();
```

We can see that the high CPU usage is caused by the infinite busy loop inside the thread's worker method.