Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arthas里 Trace 命令怎样工作的/ Trace命令的实现原理 #597

Closed
hengyunabc opened this issue Mar 22, 2019 · 7 comments
Closed

Comments

@hengyunabc
Copy link
Collaborator

hengyunabc commented Mar 22, 2019

3.3.0 版本后,增加动态trace功能,可以动态深入下一层: https://alibaba.github.io/arthas/trace.html


Trace只对匹配到的method内的 子method 做统计

常见的一个疑问是 trace命令为什么有时候输出有时候是只有一级的,有时候是多级的?

首先trace命令的原理是:对匹配到的method内的 子method 做统计。

比如这个简单的Demo类:

import java.util.concurrent.TimeUnit;

public class Demo {
    public static void main(String[] args) throws InterruptedException {
        Demo demo = new Demo();
        while (true) {
            TimeUnit.SECONDS.sleep(1);
            demo.hello();
        }
    }
    public void hello() {
        System.out.println(this.getClass().getName() + "hello");
    }
}

使用arthas执行 trace Demo hello:

结果是:

$ trace Demo hello
Press Q or Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 108 ms.
`---ts=2019-03-22 18:49:44;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@4e25154f
    `---[1.477191ms] Demo:hello()
        +---[0.035988ms] java.lang.Object:getClass()
        +---[0.01844ms] java.lang.Class:getName()
        +---[0.018744ms] java.lang.String:valueOf()
        +---[0.023555ms] java.lang.StringBuilder:<init>()
        +---[0.022698ms] java.lang.StringBuilder:append()
        +---[0.014707ms] java.lang.StringBuilder:toString()
        `---[0.091016ms] java.io.PrintStream:println()

我们看下Demo类的字节码:

  public void hello();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=4, locals=1, args_size=1
         0: getstatic     #37                 // Field java/lang/System.out:Ljava/io/PrintStream;
         3: new           #43                 // class java/lang/StringBuilder
         6: dup
         7: aload_0
         8: invokevirtual #45                 // Method java/lang/Object.getClass:()Ljava/lang/Class;
        11: invokevirtual #49                 // Method java/lang/Class.getName:()Ljava/lang/String;
        14: invokestatic  #55                 // Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
        17: invokespecial #61                 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
        20: ldc           #64                 // String hello
        22: invokevirtual #65                 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        25: invokevirtual #69                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        28: invokevirtual #72                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        31: return

可以看到每一个invokevirtual都对应一个 trace结果里的entry。

所以,trace实际上是在每一个invokevirtual 前后插入代码,然后统计调用的时间。

trace本身只能拿到当前method的字节码,所以它只能trace当前method里的 invokevirtual,再深层的invokevirtual,它并不能知道。

@hengyunabc
Copy link
Collaborator Author

为什么trace结果有时会有多层?

在下面的Demo里,代码很简单:

  • ClassD调用ClassCClassC调用ClassBClassB调用ClassA
  • ClassDClassA实现了Hello接口
import java.util.concurrent.TimeUnit;

public class Demo {
    public static void main(String[] args) throws InterruptedException {
        ClassD d = new ClassD();
        while (true) {
            TimeUnit.SECONDS.sleep(1);
            d.hello();
        }
    }

    interface Hello {
        public void hello();
    }

    static class ClassA implements Hello {
        @Override
        public void hello() {
            System.out.println(this.getClass().getName() + "hello");
        }
    }

    static class ClassB {
        public void test() {
            System.out.println(this.getClass().getName() + "hello");
            new ClassA().hello();
        }
    }

    static class ClassC {
        public void test() {
            System.out.println(this.getClass().getName() + "hello");
            new ClassB().test();
        }
    }

    static class ClassD implements Hello {
        public void hello() {
            System.out.println(this.getClass().getName() + "hello");
            new ClassC().test();
        }
    }
}

下面用arthas执行 trace Demo$ClassD hello,可以看到结果树是只有一层的,符合预期:

$ trace Demo$ClassD hello
Press Q or Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 96 ms.
`---ts=2019-03-22 19:04:19;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@4e25154f
    `---[3.911624ms] Demo$ClassD:hello()
        +---[0.032104ms] java.lang.Object:getClass()
        +---[0.016964ms] java.lang.Class:getName()
        +---[0.016445ms] java.lang.String:valueOf()
        +---[0.015006ms] java.lang.StringBuilder:<init>()
        +---[0.01918ms] java.lang.StringBuilder:append()
        +---[0.01548ms] java.lang.StringBuilder:toString()
        +---[1.409546ms] java.io.PrintStream:println()
        +---[0.01654ms] Demo$ClassC:<init>()
        `---[0.688504ms] Demo$ClassC:test()

下面再用arthas来trace Hello接口:trace Demo$Hello hello

$ trace Demo$Hello hello
Press Q or Ctrl+C to abort.
Affect(class-cnt:2 , method-cnt:2) cost in 29 ms.
`---ts=2019-03-22 19:04:40;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@4e25154f
    `---[1.019782ms] Demo$ClassD:hello()
        +---[0.008038ms] java.lang.Object:getClass()
        +---[0.008512ms] java.lang.Class:getName()
        +---[0.00866ms] java.lang.String:valueOf()
        +---[0.02074ms] java.lang.StringBuilder:<init>()
        +---[0.01381ms] java.lang.StringBuilder:append()
        +---[0.006114ms] java.lang.StringBuilder:toString()
        +---[0.110468ms] java.io.PrintStream:println()
        +---[0.009324ms] Demo$ClassC:<init>()
        `---[0.59506ms] Demo$ClassC:test()
            `---[0.416302ms] Demo$ClassA:hello()
                +---[0.00866ms] java.lang.Object:getClass()
                +---[0.01141ms] java.lang.Class:getName()
                +---[0.012544ms] java.lang.String:valueOf()
                +---[0.013024ms] java.lang.StringBuilder:<init>()
                +---[0.065776ms] java.lang.StringBuilder:append()
                +---[0.016712ms] java.lang.StringBuilder:toString()
                `---[0.060616ms] java.io.PrintStream:println()

可以看到:

  • 结果树有两层,一层是Demo$ClassD:hello(),另一层是Demo$ClassA:hello()

为什么ClassC/ClassB没有出现在结果树里?为什么树不是四层的?

因为trace匹配的是Hello接口,只有ClassDClassA实现了Hello接口。

总结

  • trace匹配到了多个class,多个method
  • 多个method在同一个线程里被调用,一个method就会是trace结果里的一层
  • 因为trace在ThreadLocal里记录了树的结点信息,所以在一个线程里调用多个匹配到的method,就会形成一个树

@hengyunabc
Copy link
Collaborator Author

hengyunabc commented Mar 22, 2019

Trace 还有两个常见问题:

以上面的demo为例,如果想把整个4层的调用树列出来,那么可以执行

trace -E 'Demo\$Hello|Demo\$ClassB|Demo\$ClassC' 'hello|test'

那么会把完整的调用树打印出来:

$ trace -E 'Demo\$Hello|Demo\$ClassB|Demo\$ClassC' 'hello|test'
Press Q or Ctrl+C to abort.
Affect(class-cnt:4 , method-cnt:4) cost in 99 ms.
`---ts=2019-03-22 19:17:53;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@4e25154f
    `---[3.07526ms] Demo$ClassD:hello()
        +---[0.034132ms] java.lang.Object:getClass()
        +---[0.015133ms] java.lang.Class:getName()
        +---[0.012802ms] java.lang.String:valueOf()
        +---[0.010077ms] java.lang.StringBuilder:<init>()
        +---[0.010013ms] java.lang.StringBuilder:append()
        +---[0.010582ms] java.lang.StringBuilder:toString()
        +---[0.17528ms] java.io.PrintStream:println()
        +---[0.022878ms] Demo$ClassC:<init>()
        `---[2.404445ms] Demo$ClassC:test()
            `---[2.158ms] Demo$ClassC:test()
                +---[0.01101ms] java.lang.Object:getClass()
                +---[0.016783ms] java.lang.Class:getName()
                +---[0.016797ms] java.lang.String:valueOf()
                +---[0.020633ms] java.lang.StringBuilder:<init>()
                +---[0.017495ms] java.lang.StringBuilder:append()
                +---[0.010945ms] java.lang.StringBuilder:toString()
                +---[0.043742ms] java.io.PrintStream:println()
                +---[0.019857ms] Demo$ClassB:<init>()
                `---[1.741185ms] Demo$ClassB:test()
                    `---[1.581065ms] Demo$ClassB:test()
                        +---[0.173202ms] java.lang.Object:getClass()
                        +---[0.019643ms] java.lang.Class:getName()
                        +---[0.019705ms] java.lang.String:valueOf()
                        +---[0.013422ms] java.lang.StringBuilder:<init>()
                        +---[0.021875ms] java.lang.StringBuilder:append()
                        +---[0.096128ms] java.lang.StringBuilder:toString()
                        +---[0.050018ms] java.io.PrintStream:println()
                        +---[0.02202ms] Demo$ClassA:<init>()
                        `---[0.798537ms] Demo$ClassA:hello()
                            `---[0.641463ms] Demo$ClassA:hello()
                                +---[0.010265ms] java.lang.Object:getClass()
                                +---[0.02994ms] java.lang.Class:getName()
                                +---[0.015822ms] java.lang.String:valueOf()
                                +---[0.034097ms] java.lang.StringBuilder:<init>()
                                +---[0.015282ms] java.lang.StringBuilder:append()
                                +---[0.010235ms] java.lang.StringBuilder:toString()
                                `---[0.038225ms] java.io.PrintStream:println()

@hengyunabc hengyunabc changed the title Trace 命令怎样工作的 Arthas里 Trace 命令怎样工作的 Mar 22, 2019
@hengyunabc hengyunabc mentioned this issue May 10, 2020
1 task
@hengyunabc hengyunabc changed the title Arthas里 Trace 命令怎样工作的 Arthas里 Trace 命令怎样工作的/ Trace命令的实现原理 May 21, 2020
@YancyPeng
Copy link

trace的方法是有入参的,所以trace是如何构造参数的?

@hengyunabc
Copy link
Collaborator Author

trace的方法是有入参的,所以trace是如何构造参数的?

trace不支持获取入参。

@wayss000
Copy link

trace的方法是有入参的,所以trace是如何构造参数的?

trace不支持获取入参。

我trace的方法也是有入参的,不太明白具体入参怎么写?

@hengyunabc
Copy link
Collaborator Author

hengyunabc commented Jun 9, 2021

min/max/total/count 意义解析

public class Test {
    
    public static void main(String[] args) {
        for(;;) {
            aaa();
        }
    }
    
    public static void aaa() {
        for(int i = 1; i < 10; ++i) {
            bbb(i);
            System.err.println(i);
        }
        
    }
    
    public static void bbb(int i ) {
        try {
            Thread.sleep(i * 100);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

}

启动上面的demo,然后执行 trace Test aaa ,可以看到下面的输出结果:

image

[arthas@25252]$ trace Test aaa
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 87 ms, listenerId: 1
`---ts=2021-06-09 15:51:33;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@73d16e93
    `---[4527.809232ms] Test:aaa()
        `---[min=105.151799ms,max=900.906267ms,total=4525.33612ms,count=9] Test:bbb() #12

那么这一行红色的是什么意思呢?

  • count=9,表示在 aaa 里调用了bbb函数9次
  • min=105.151799ms表示在9次调用里,时间最少的是105毫秒
  • max=900.906267ms 表示在9次调用里,时间最多的是900毫秒
  • 4525.33612ms 表示9次调用,总耗时是4525毫秒

对应代码: https://github.com/alibaba/arthas/blob/arthas-all-3.5.1/core/src/main/java/com/taobao/arthas/core/command/model/MethodNode.java#L33

@hupapaliuli
Copy link

请问下,trace命令对性能的压力大不大,目前碰到线程池超时的线上问题,很难复现,想要通过异步任务 + trace指令来排查和记录问题,但是担心这种方式会有大的性能瓶颈/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants