Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dfsTraverse 性能提升 #1

Closed
henix opened this issue Mar 2, 2015 · 5 comments
Closed

dfsTraverse 性能提升 #1

henix opened this issue Mar 2, 2015 · 5 comments

Comments

@henix
Copy link
Owner

henix commented Mar 2, 2015

线上用 VisualVM sampler 发现 Iterator 的某个匿名中的 Iterator 的 hasNext 占用 CPU 较高。故考虑优化一下 dfsTraverse 中的 Iterator 使用。

采用 ScalaMeter

@henix henix closed this as completed in 2b2163b Mar 2, 2015
@henix
Copy link
Owner Author

henix commented Mar 10, 2015

可考虑只测量 dfsTraverse 的性能,而不是 buildIdCache,因为 buildIdCache 可能哈希就占了大头。

@henix henix reopened this Mar 10, 2015
@henix
Copy link
Owner Author

henix commented Mar 10, 2015

使用 jmh 重新进行了性能测试,比较下面两种 dfsTraverse 谁更高效:

def dfsTraverse(e: Element): Iterator[Element] = Iterator.single(e) ++ getChilds(e).flatMap(dfsTraverse)

def dfsTraverse(e: Element): Iterator[Element] = if (e.childNodeSize() == 0) Iterator.single(e) else Iterator.single(e) ++ getChilds(e).flatMap(dfsTraverse)

第一感觉似乎第二种更好,而且对 gc 也更友好。但测量结果表明并不是这样。

第一种实现的 3 次结果:

[info] Benchmark                                           Mode  Cnt     Score   Error   Units
[info] SelectorsBenchmark.dfsTraverse                     thrpt   10  1033.550 ± 4.036   ops/s
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled  thrpt   10     7.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total     thrpt   10    19.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled   thrpt   10     7.000 ±   NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total      thrpt   10    73.000 ±   NaN      ms
[info] SelectorsBenchmark.dfsTraverse                      avgt   10     0.963 ± 0.016   ms/op
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled   avgt   10    11.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total      avgt   10    24.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled    avgt   10    25.000 ±   NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total       avgt   10    68.000 ±   NaN      ms

[info] Benchmark                                           Mode  Cnt     Score    Error   Units
[info] SelectorsBenchmark.dfsTraverse                     thrpt   10  1060.601 ± 40.026   ops/s
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled  thrpt   10    11.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total     thrpt   10    22.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled   thrpt   10    29.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total      thrpt   10   109.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse                      avgt   10     0.952 ±  0.009   ms/op
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled   avgt   10     8.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total      avgt   10    20.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled    avgt   10    25.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total       avgt   10    76.000 ±    NaN      ms

[info] Benchmark                                           Mode  Cnt     Score    Error   Units
[info] SelectorsBenchmark.dfsTraverse                     thrpt   10  1063.436 ± 75.428   ops/s
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled  thrpt   10    18.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total     thrpt   10    35.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled   thrpt   10    48.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total      thrpt   10   111.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse                      avgt   10     0.928 ±  0.022   ms/op
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled   avgt   10    12.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total      avgt   10    26.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled    avgt   10    20.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total       avgt   10    69.000 ±    NaN      ms

第二种实现的 4 次结果:

[info] Benchmark                                           Mode  Cnt    Score    Error   Units
[info] SelectorsBenchmark.dfsTraverse                     thrpt   10  991.781 ± 66.554   ops/s
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled  thrpt   10   22.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total     thrpt   10   33.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled   thrpt   10   51.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total      thrpt   10  144.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse                      avgt   10    0.980 ±  0.036   ms/op
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled   avgt   10    8.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total      avgt   10   20.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled    avgt   10    8.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total       avgt   10   46.000 ±    NaN      ms

[info] Benchmark                                           Mode  Cnt     Score    Error   Units
[info] SelectorsBenchmark.dfsTraverse                     thrpt   10  1082.235 ± 15.005   ops/s
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled  thrpt   10    14.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total     thrpt   10    29.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled   thrpt   10    28.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total      thrpt   10    87.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse                      avgt   10     0.958 ±  0.011   ms/op
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled   avgt   10     8.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total      avgt   10    20.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled    avgt   10    15.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total       avgt   10    67.000 ±    NaN      ms

[info] Benchmark                                           Mode  Cnt     Score   Error   Units
[info] SelectorsBenchmark.dfsTraverse                     thrpt   10  1075.301 ± 9.973   ops/s
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled  thrpt   10    10.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total     thrpt   10    22.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled   thrpt   10    23.000 ±   NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total      thrpt   10    72.000 ±   NaN      ms
[info] SelectorsBenchmark.dfsTraverse                      avgt   10     0.966 ± 0.013   ms/op
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled   avgt   10     9.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total      avgt   10    21.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled    avgt   10     8.000 ±   NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total       avgt   10    51.000 ±   NaN      ms

[info] Benchmark                                           Mode  Cnt     Score    Error   Units
[info] SelectorsBenchmark.dfsTraverse                     thrpt   10  1025.276 ± 14.869   ops/s
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled  thrpt   10     8.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total     thrpt   10    20.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled   thrpt   10    36.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total      thrpt   10    91.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse                      avgt   10     0.964 ±  0.010   ms/op
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled   avgt   10     8.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total      avgt   10    20.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled    avgt   10    28.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total       avgt   10    85.000 ±    NaN      ms

运行时间方面第一种似乎更优,而 gc 方面则看不出差别。

@henix
Copy link
Owner Author

henix commented Mar 10, 2015

上面是 -i 10 的结果,如果用 20 次的话: jmh/run -i 20 -wi 10 -f1 -t1 -prof gc

第一种:

[info] Benchmark                                           Mode  Cnt     Score    Error   Units
[info] SelectorsBenchmark.dfsTraverse                     thrpt   20  1066.971 ± 21.352   ops/s
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled  thrpt   20    36.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total     thrpt   20    50.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled   thrpt   20    84.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total      thrpt   20   144.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse                      avgt   20     0.957 ±  0.007   ms/op
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled   avgt   20    35.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total      avgt   20    48.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled    avgt   20   102.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total       avgt   20   161.000 ±    NaN      ms

[info] Benchmark                                           Mode  Cnt     Score   Error   Units
[info] SelectorsBenchmark.dfsTraverse                     thrpt   20  1031.682 ± 8.696   ops/s
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled  thrpt   20    33.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total     thrpt   20    46.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled   thrpt   20    73.000 ±   NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total      thrpt   20   115.000 ±   NaN      ms
[info] SelectorsBenchmark.dfsTraverse                      avgt   20     0.937 ± 0.007   ms/op
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled   avgt   20    37.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total      avgt   20    50.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled    avgt   20   106.000 ±   NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total       avgt   20   161.000 ±   NaN      ms

第二种:

[info] SelectorsBenchmark.dfsTraverse                     thrpt   20  1061.645 ± 13.963   ops/s
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled  thrpt   20    22.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total     thrpt   20    35.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled   thrpt   20    54.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total      thrpt   20   114.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse                      avgt   20     0.968 ±  0.018   ms/op
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled   avgt   20    23.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total      avgt   20    35.000 ±    NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled    avgt   20    74.000 ±    NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total       avgt   20   115.000 ±    NaN      ms

[info] Benchmark                                           Mode  Cnt     Score   Error   Units
[info] SelectorsBenchmark.dfsTraverse                     thrpt   20  1033.959 ± 6.182   ops/s
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled  thrpt   20    23.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total     thrpt   20    35.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled   thrpt   20    25.000 ±   NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total      thrpt   20    64.000 ±   NaN      ms
[info] SelectorsBenchmark.dfsTraverse                      avgt   20     0.965 ± 0.018   ms/op
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled   avgt   20    24.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total      avgt   20    36.000 ±   NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled    avgt   20    33.000 ±   NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total       avgt   20    75.000 ±   NaN      ms

似乎终于可以看出结果了:第一种速度更快,第二种 gc 次数更少。

@henix henix closed this as completed Mar 10, 2015
@henix
Copy link
Owner Author

henix commented Mar 18, 2015

又想到一招:将递归改为非递归。非递归的 DFS 参考 https://en.wikipedia.org/wiki/Tree_traversal

@henix henix reopened this Mar 18, 2015
@henix
Copy link
Owner Author

henix commented Mar 18, 2015

jmh 运行结果,dfsTraverse 6x 提升:

[info] Benchmark                                           Mode  Cnt     Score     Error   Units
[info] SelectorsBenchmark.dfsTraverse                     thrpt   20  6640.880 ± 108.802   ops/s
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled  thrpt   20    95.000 ±     NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total     thrpt   20   134.000 ±     NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled   thrpt   20   159.000 ±     NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total      thrpt   20   264.000 ±     NaN      ms
[info] SelectorsBenchmark.bfsTraverse                      avgt   20     0.159 ±   0.003   ms/op
[info] SelectorsBenchmark.bfsTraverse:@gc.count.profiled   avgt   20    66.000 ±     NaN  counts
[info] SelectorsBenchmark.bfsTraverse:@gc.count.total      avgt   20    95.000 ±     NaN  counts
[info] SelectorsBenchmark.bfsTraverse:@gc.time.profiled    avgt   20   140.000 ±     NaN      ms
[info] SelectorsBenchmark.bfsTraverse:@gc.time.total       avgt   20   238.000 ±     NaN      ms
[info] SelectorsBenchmark.dfsTraverse                      avgt   20     0.151 ±   0.003   ms/op
[info] SelectorsBenchmark.dfsTraverse:@gc.count.profiled   avgt   20    74.000 ±     NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.count.total      avgt   20   111.000 ±     NaN  counts
[info] SelectorsBenchmark.dfsTraverse:@gc.time.profiled    avgt   20   177.000 ±     NaN      ms
[info] SelectorsBenchmark.dfsTraverse:@gc.time.total       avgt   20   263.000 ±     NaN      ms

@henix henix closed this as completed in 1e5a810 Mar 18, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant