Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arthas 执行报错 断点调试arthas 代码- StackOverflowError #1610

Closed
WangJi92 opened this issue Dec 8, 2020 · 3 comments
Closed

arthas 执行报错 断点调试arthas 代码- StackOverflowError #1610

WangJi92 opened this issue Dec 8, 2020 · 3 comments
Labels
bug Something isn't working wontfix This will not be worked on

Comments

@WangJi92
Copy link
Contributor

WangJi92 commented Dec 8, 2020

环境信息

  • arthas-boot.jar 或者 as.sh 的版本:
  • Arthas 版本: 3.45
  • 操作系统版本: mac
  • 目标进程的JVM版本:java8

1.1 发现问题

使用arthas 执行一个 trace java.lang.Thread

 trace java.lang.Thread getName -n 7 '1==1' --skipJDKMethod false

响应信息

[arthas@3085]$ trace java.lang.Thread getName -n 7 '1==1' --skipJDKMethod false
trace failed, condition is: 1==1, null, visit /Users/wangji/logs/arthas/arthas.log for more details.

错误日志

20-12-08 22:53:28 [arthas-command-execute] INFO  c.t.arthas.core.advisor.Enhancer -enhance matched classes: [class java.util.TimerThread, class org.apache.tomcat.util.threads.TaskThread, class sun.security.ssl.SSLSocketImpl$NotifyHandshakeThread, class java.lang.ref.Finalizer$FinalizerThread, class org.springframework.boot.web.embedded.tomcat.TomcatWebServer$1, class java.util.logging.LogManager$Cleaner, class org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller, class java.lang.Thread, class org.springframework.context.support.AbstractApplicationContext$1, class com.taobao.arthas.agent334.AgentBootstrap$1, class java.lang.ref.Reference$ReferenceHandler]
2020-12-08 22:53:28 [arthas-command-execute] INFO  c.t.arthas.core.advisor.Enhancer -Success to batch transform classes: [class java.util.TimerThread, class org.apache.tomcat.util.threads.TaskThread, class sun.security.ssl.SSLSocketImpl$NotifyHandshakeThread, class java.lang.ref.Finalizer$FinalizerThread, class org.springframework.boot.web.embedded.tomcat.TomcatWebServer$1, class java.util.logging.LogManager$Cleaner, class org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller, class java.lang.Thread, class org.springframework.context.support.AbstractApplicationContext$1, class com.taobao.arthas.agent334.AgentBootstrap$1, class java.lang.ref.Reference$ReferenceHandler]
2020-12-08 22:53:28 [arthas-command-execute] ERROR c.t.a.c.c.monitor200.EnhancerCommand -error happens when enhancing class: Cannot write to standard output when terminated
java.lang.IllegalStateException: Cannot write to standard output when terminated
	at com.taobao.arthas.core.shell.system.impl.ProcessImpl$CommandProcessImpl.appendResult(ProcessImpl.java:618)
	at com.taobao.arthas.core.command.monitor200.EnhancerCommand.enhance(EnhancerCommand.java:178)
	at com.taobao.arthas.core.command.monitor200.EnhancerCommand.process(EnhancerCommand.java:96)
	at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl.process(AnnotatedCommandImpl.java:82)
	at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl.access$100(AnnotatedCommandImpl.java:18)
	at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl$ProcessHandler.handle(AnnotatedCommandImpl.java:111)
	at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl$ProcessHandler.handle(AnnotatedCommandImpl.java:108)
	at com.taobao.arthas.core.shell.system.impl.ProcessImpl$CommandProcessTask.run(ProcessImpl.java:385)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2020-12-08 22:53:28 [arthas-command-execute] ERROR c.t.a.c.s.system.impl.ProcessImpl -Error during processing the command:
java.lang.IllegalStateException: Cannot write to standard output when terminated
	at com.taobao.arthas.core.shell.system.impl.ProcessImpl$CommandProcessImpl.appendResult(ProcessImpl.java:618)
	at com.taobao.arthas.core.command.monitor200.EnhancerCommand.enhance(EnhancerCommand.java:184)
	at com.taobao.arthas.core.command.monitor200.EnhancerCommand.process(EnhancerCommand.java:96)
	at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl.process(AnnotatedCommandImpl.java:82)
	at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl.access$100(AnnotatedCommandImpl.java:18)
	at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl$ProcessHandler.handle(AnnotatedCommandImpl.java:111)
	at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl$ProcessHandler.handle(AnnotatedCommandImpl.java:108)
	at com.taobao.arthas.core.shell.system.impl.ProcessImpl$CommandProcessTask.run(ProcessImpl.java:385)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

1.2 我调试一下arthas的代码?

1.2.1 启动demo

java -agentlib:jdwp=transport=dt_socket,address=5005,server=y,suspend=n -jar /arthas-plugin-demo/target/demo-0.0.1-SNAPSHOT.jar

1.2.2 启动arthas

https://arthas.aliyun.com/doc/install-detail.html
as.sh 直接启动

curl -sk https://arthas.aliyun.com/arthas-boot.jar -o ~/.arthas-boot.jar  && echo "alias as.sh='java -jar ~/.arthas-boot.jar --repo-mirror aliyun --use-http 2>&1'" >> ~/.bashrc && source ~/.bashrc && echo "source ~/.bashrc" >> ~/.bash_profile && source ~/.bash_profile

1.2.3 下载arthas 最新代码

git clone git@github.com:alibaba/arthas.git

1.2.4 arthas 工程直接一个remote debug

image

1.2.5 通过错误代码debug

com.taobao.arthas.core.shell.system.impl.ProcessImpl.CommandProcessImpl#appendResult
image

小技巧打印堆栈
image

找到入口
com.taobao.arthas.core.shell.handlers.term.RequestHandler#accept

image

com.taobao.arthas.core.shell.system.impl.JobImpl#run(boolean)

com.taobao.arthas.core.shell.system.impl.ProcessImpl#run(boolean)

com.taobao.arthas.core.shell.system.impl.ProcessImpl.CommandProcessTask#run

com.taobao.arthas.core.shell.handlers.Handler#handle

com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl#process

com.taobao.arthas.core.command.monitor200.EnhancerCommand#process

com.taobao.arthas.core.command.monitor200.EnhancerCommand#enhance

1.3 跟踪问题

com.taobao.arthas.core.command.monitor200.EnhancerCommand#enhance
image

代码增强是ok的
image

增强的反馈
image

不知道什么时候 变为 终结状态的。

查看到异常、但是日志好像没有看到

com.taobao.arthas.core.command.monitor200.AbstractTraceAdviceListener#finishing
image

image

Method threw 'java.lang.StackOverflowError' exception. Cannot evaluate java.lang.Thread.toString()

日志里面没有搜索到 StackOverflowError
image

@hengyunabc
Copy link
Collaborator

上面的说明少了一步,要先执行 options unsafe true, 这样子可以稳定重现。

@WangJi92
Copy link
Contributor Author

上面的说明少了一步,要先执行 options unsafe true, 这样子可以稳定重现。

是滴,少写了一步

@favoorr
Copy link
Contributor

favoorr commented Dec 18, 2020

根因分析及修复方法

这里出错的根因是 方法递归调用,最终造成 StackOverflowError

下面这行代码导致了递归调用,把线程名这里修改下,就可以执行成功了,解决问题的方法是这个,但是从功能上看, trace 类 java.lang.Thread 的 getName() 算是一个比较少见的场景,wiki 上说明下,不是非常有必要修复

修改前

threadNode.setThreadName(currentThread.getName());

修改后

threadNode.setThreadName("aa");

递归调用路径如图:

循环调用

复现 debug 建议

arthas 建议断点位置

执行完下面这行代码就会出现了

inst.retransformClasses(classArray);

测试 Demo 代码片段

public class Demo {
    static class Counter {
        private static AtomicInteger count = new AtomicInteger(0);
        public static void increment() {
            count.incrementAndGet();
        }
        public static int value() {
            return count.get();
        }
    }

    public static void main(String[] args) {
        while (true) {
            Thread testThread1 = new Thread(new Runnable() {
                @Override
                public void run() {
                    System.out.println(Thread.currentThread().getName() +
                            "counter: " + Counter.value());
                }
            });
            Counter.increment();
            testThread1.setName("favoorr-test-thread" + Counter.value());
            testThread1.start();
            System.out.println(testThread1.getName());
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            testThread1 = null;
        }
    }
}

Bug 修复后正确执行示例

[arthas@13723]$ trace java.lang.Thread getName -n 7 '1==1' --skipJDKMethod false
`---ts=2020-12-18 16:52:22;thread_name=aa;id=44;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
    `---[0.362835ms] java.lang.Thread:getName()

Press Q or Ctrl+C to abort.
Affect(class count: 6 , method count: 1) cost in 125 ms, listenerId: 6
`---ts=2020-12-18 16:52:23;thread_name=aa;id=60;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2`---ts=2020-12-18 16:52:23;thread_name=aa;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
    `---[0.047906ms] java.lang.Thread:getName()


    `---[0.065373ms] java.lang.Thread:getName()

`---ts=2020-12-18 16:52:25;thread_name=aa;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
    `---[0.050861ms] java.lang.Thread:getName()

`---ts=2020-12-18 16:52:25;thread_name=aa;id=61;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
    `---[0.044406ms] java.lang.Thread:getName()

`---ts=2020-12-18 16:52:27;thread_name=aa;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
    `---[0.049202ms] java.lang.Thread:getName()

`---ts=2020-12-18 16:52:27;thread_name=aa;id=62;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
    `---[0.058552ms] java.lang.Thread:getName()
Command execution times exceed limit: 7, so command will exit. You can set it with -n option.

@hengyunabc hengyunabc added bug Something isn't working wontfix This will not be worked on labels Dec 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

3 participants