perf: optimize Groovy-based LAL DSL with static compilation by kezhenxu94 · Pull Request #7263 · apache/skywalking

kezhenxu94 · 2021-07-08T04:56:21Z

Groovy naturally supports many dynamic features that we don't benefit for now but cost performance loss, in this patch we compile our Groovy-based DSL scripts statically to optimize performance.

Improve the performance of LAL

Add a benchmark for the improvement, refer to the existing ones

class MyBenchmark {

    ModuleManager manager;
    DSL dsl;
    LogAnalyzerModuleConfig config;
    FilterSpec filter;


    @Setup
    public void setup() throws ModuleStartException {
        manager = new ModuleManager();
        config = new LogAnalyzerModuleConfig();
        filter = new FilterSpec(manager, config);
        dsl = DSL.of(manager, new LogAnalyzerModuleConfig(), "filter {\n" +
                                                             "  json {\n" +
                                                             "  }\n" +
                                                             "  // only collect abnormal logs (http status code >= 300, or commonProperties?.responseFlags is not empty)\n" +
                                                             "  if (parsed?.response?.responseCode as Integer < 400 && !parsed?.commonProperties?.responseFlags) {\n" +
                                                             "    abort {}\n" +
                                                             "  }\n" +
                                                             "  extractor {\n" +
                                                             "    if (parsed?.response?.responseCode) {\n" +
                                                             "      tag 'status.code': parsed?.response?.responseCode as int\n" +
                                                             "    }\n" +
                                                             "    tag 'response.flag': parsed?.commonProperties?.responseFlags\n" +
                                                             "  }\n" +
                                                             "  sink {\n" +
                                                             "    sampler {\n" +
                                                             "      if (parsed?.commonProperties?.responseFlags) {\n" +
                                                             "        // use service:errorCode as sampler id so that each service:errorCode has its own sampler,\n" +
                                                             "        // e.g. checkoutservice:[upstreamConnectionFailure], checkoutservice:[upstreamRetryLimitExceeded]\n" +
                                                             "        rateLimit(\"${log.service}:${parsed?.commonProperties?.responseFlags}\") {\n" +
                                                             "          qps 100\n" +
                                                             "        }\n" +
                                                             "      } else {\n" +
                                                             "        // use service:responseCode as sampler id so that each service:responseCode has its own sampler,\n" +
                                                             "        // e.g. checkoutservice:500, checkoutservice:404.\n" +
                                                             "        rateLimit(\"${log.service}:${parsed?.response?.responseCode}\") {\n" +
                                                             "          qps 100\n" +
                                                             "        }\n" +
                                                             "      }\n" +
                                                             "    }\n" +
                                                             "  }\n" +
                                                             "}"
        );
        dsl.bind(new Binding().log(LogData.newBuilder().build()));
    }

    @Benchmark
    public void testDSL(Blackhole bh) throws ModuleStartException {
        dsl.evaluate();
    }

    public static void main(String[] args) throws RunnerException {
        LogManager.shutdown();
        Options opt = new OptionsBuilder()
                .warmupForks(3)
                .forks(5)
                .include(MyBenchmark.class.getSimpleName())
                .addProfiler(GCProfiler.class)
                .build();

        new Runner(opt).run();
    }
}

The benchmark result.

Before

Benchmark                                              Mode  Cnt       Score      Error   Units
MyBenchmark.testDSL                                   thrpt   25  246670.893 ± 6418.101   ops/s
MyBenchmark.testDSL:·gc.alloc.rate                    thrpt   25     602.013 ±   15.646  MB/sec
MyBenchmark.testDSL:·gc.alloc.rate.norm               thrpt   25    2688.000 ±    0.001    B/op
MyBenchmark.testDSL:·gc.churn.PS_Eden_Space           thrpt   25     602.247 ±   15.530  MB/sec
MyBenchmark.testDSL:·gc.churn.PS_Eden_Space.norm      thrpt   25    2689.112 ±   11.894    B/op
MyBenchmark.testDSL:·gc.churn.PS_Survivor_Space       thrpt   25       0.058 ±    0.008  MB/sec
MyBenchmark.testDSL:·gc.churn.PS_Survivor_Space.norm  thrpt   25       0.257 ±    0.035    B/op
MyBenchmark.testDSL:·gc.count                         thrpt   25    1498.000             counts
MyBenchmark.testDSL:·gc.time                          thrpt   25    1636.000                 ms

After

Benchmark                                              Mode  Cnt       Score      Error   Units
MyBenchmark.testDSL                                   thrpt   25  385547.228 ± 9829.731   ops/s
MyBenchmark.testDSL:·gc.alloc.rate                    thrpt   25     627.257 ±   16.030  MB/sec
MyBenchmark.testDSL:·gc.alloc.rate.norm               thrpt   25    1792.000 ±    0.001    B/op
MyBenchmark.testDSL:·gc.churn.PS_Eden_Space           thrpt   25     628.196 ±   18.549  MB/sec
MyBenchmark.testDSL:·gc.churn.PS_Eden_Space.norm      thrpt   25    1794.442 ±   15.527    B/op
MyBenchmark.testDSL:·gc.churn.PS_Survivor_Space       thrpt   25       0.066 ±    0.009  MB/sec
MyBenchmark.testDSL:·gc.churn.PS_Survivor_Space.norm  thrpt   25       0.190 ±    0.027    B/op
MyBenchmark.testDSL:·gc.count                         thrpt   25    1513.000             counts
MyBenchmark.testDSL:·gc.time                          thrpt   25    1798.000                 ms

Links/URLs to the theory proof or discussion articles/blogs. <links/URLs here>
If this pull request closes/resolves/fixes an existing issue, replace the issue number. Closes #. Not close, related to Improve LAL performance #7180
Update the CHANGES log.

codecov · 2021-07-08T05:15:12Z

Codecov Report

Merging #7263 (b2952fa) into master (58dce44) will decrease coverage by 2.79%.
The diff coverage is 65.75%.

❗ Current head b2952fa differs from pull request most recent head f9879d7. Consider uploading reports for the commit f9879d7 to get more accurate results

@@             Coverage Diff              @@
##             master    #7263      +/-   ##
============================================
- Coverage     53.81%   51.01%   -2.80%     
+ Complexity     4349     4050     -299     
============================================
  Files          1899     1904       +5     
  Lines         40950    41139     +189     
  Branches       4595     4583      -12     
============================================
- Hits          22038    20988    -1050     
- Misses        17822    19022    +1200     
- Partials       1090     1129      +39

Impacted Files	Coverage Δ
...ng/oap/log/analyzer/dsl/spec/sink/SamplerSpec.java	`64.70% <ø> (+41.17%)`	⬆️
...lking/oap/log/analyzer/dsl/spec/sink/SinkSpec.java	`62.50% <ø> (+43.75%)`	⬆️
...oap/log/analyzer/dsl/spec/LALDelegatingScript.java	`23.52% <23.52%> (ø)`
...p/log/analyzer/dsl/spec/parser/TextParserSpec.java	`66.66% <50.00%> (+2.38%)`	⬆️
.../oap/log/analyzer/dsl/LALPrecompiledExtension.java	`71.42% <71.42%> (ø)`
...log/analyzer/dsl/spec/extractor/ExtractorSpec.java	`50.00% <76.92%> (+25.78%)`	⬆️
...rg/apache/skywalking/oap/log/analyzer/dsl/DSL.java	`100.00% <100.00%> (ø)`
...g/oap/log/analyzer/dsl/spec/filter/FilterSpec.java	`73.43% <100.00%> (+28.82%)`	⬆️
...g/analyzer/dsl/spec/parser/AbstractParserSpec.java	`100.00% <100.00%> (ø)`
...p/log/analyzer/dsl/spec/parser/JsonParserSpec.java	`100.00% <100.00%> (+20.00%)`	⬆️
... and 137 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 58dce44...f9879d7. Read the comment docs.

oap-server/analyzer/log-analyzer/pom.xml

wu-sheng · 2021-07-08T06:58:07Z

Please fix CI. And serialization has more performance impact for groovy, right?

wu-sheng

Is this static compilation available for runtime script update or new script as input?
I am thinking to share the same core for #7168 requirement

wu-sheng · 2021-07-09T04:03:14Z

Comparing 385k to 246k, we have a 50% performance increment, is this running in one Mac Intel core? Could you attach the hardware standard here?

kezhenxu94 · 2021-07-09T04:14:09Z

Is this static compilation available for runtime script update or new script as input?
I am thinking to share the same core for #7168 requirement

Yes. Whenever a new LAL DSL script is created, we can compile it to Java-style static byte codes once, after that its execution performance is close to Java.

Comparing 385k to 246k, we have a 50% performance increment, is this running in one Mac Intel core? Could you attach the hardware standard here?

Yes, they were benchmarked in a single thread on my Mac.

MacBook Pro (16-inch, 2019)
Processor 2.4 GHz 8-Core Intel Core i9
Memory 32 GB 2667 MHz DDR4

# JMH version: 1.32
# VM version: JDK 1.8.0_292, OpenJDK 64-Bit Server VM, 25.292-b10
# VM invoker: /Users/kezhenxu94/.asdf/installs/java/adoptopenjdk-8.0.292+10/jre/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: org.apache.skywalking.MyBenchmark.testDSL

wu-sheng · 2021-07-09T04:19:01Z

Is this static compilation available for runtime script update or new script as input?

I am thinking to share the same core for #7168 requirement

Yes. Whenever a new LAL DSL script is created, we can compile it to Java-style static byte codes once, after that its execution performance is close to Java.

Then my next question would be, if we use this as debug tool? Whether could cause metaspace oom? Or we need to set up previous dymanic engine for debug tool.

kezhenxu94 · 2021-07-09T05:00:07Z

Is this static compilation available for runtime script update or new script as input?

I am thinking to share the same core for #7168 requirement

Yes. Whenever a new LAL DSL script is created, we can compile it to Java-style static byte codes once, after that its execution performance is close to Java.

Then my next question would be, if we use this as debug tool? Whether could cause metaspace oom? Or we need to set up previous dymanic engine for debug tool.

In terms of dynamic update LAL / MAL and debugging tool for LAL, we can discuss in their corresponding issues because metaspace OOM may also exist before this PR, it's due to Groovy Shell's compilation and references to some class loaders, etc., when we want to dynamically update LAL / MAL and debug them in UI, we should consider this then.

Groovy naturally supports many dynamic features that we don't benefit for now but cost performance loss, in this patch we compile our Groovy-based DSL scripts statically to optimize performance.

wu-sheng · 2021-07-09T09:36:33Z

Is this static compilation available for runtime script update or new script as input?

I am thinking to share the same core for #7168 requirement

Yes. Whenever a new LAL DSL script is created, we can compile it to Java-style static byte codes once, after that its execution performance is close to Java.

Then my next question would be, if we use this as debug tool? Whether could cause metaspace oom? Or we need to set up previous dymanic engine for debug tool.

In terms of dynamic update LAL / MAL and debugging tool for LAL, we can discuss in their corresponding issues because metaspace OOM may also exist before this PR, it's due to Groovy Shell's compilation and references to some class loaders, etc., when we want to dynamically update LAL / MAL and debug them in UI, we should consider this then.

FYI @dmsolr This change should have a very clear impact for your update feature. Please recheck at your sides, especially whether update cause memory leak(because of new classes generated).

wu-sheng

LGTM. Once all tests passed, I am good.

kezhenxu94 added backend OAP backend related. enhancement Enhancement on performance or codes labels Jul 8, 2021

kezhenxu94 added this to the 8.7.0 milestone Jul 8, 2021

kezhenxu94 requested a review from wu-sheng July 8, 2021 04:56

sonatype-lift bot reviewed Jul 8, 2021

View reviewed changes

oap-server/analyzer/log-analyzer/pom.xml Show resolved Hide resolved

wu-sheng reviewed Jul 9, 2021

View reviewed changes

perf: optimize Groovy-based LAL DSL with static compilation

f9879d7

Groovy naturally supports many dynamic features that we don't benefit for now but cost performance loss, in this patch we compile our Groovy-based DSL scripts statically to optimize performance.

kezhenxu94 force-pushed the perf/groovy branch from 453e988 to f9879d7 Compare July 9, 2021 08:48

kezhenxu94 requested a review from wu-sheng July 9, 2021 09:16

wu-sheng approved these changes Jul 9, 2021

View reviewed changes

kezhenxu94 merged commit fd92f46 into master Jul 9, 2021

kezhenxu94 deleted the perf/groovy branch July 9, 2021 10:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

perf: optimize Groovy-based LAL DSL with static compilation#7263

perf: optimize Groovy-based LAL DSL with static compilation#7263
kezhenxu94 merged 1 commit intomasterfrom
perf/groovy

kezhenxu94 commented Jul 8, 2021

Uh oh!

codecov bot commented Jul 8, 2021 •

edited

Loading

Uh oh!

Uh oh!

wu-sheng commented Jul 8, 2021

Uh oh!

wu-sheng left a comment

Uh oh!

wu-sheng commented Jul 9, 2021

Uh oh!

kezhenxu94 commented Jul 9, 2021

Uh oh!

wu-sheng commented Jul 9, 2021

Uh oh!

kezhenxu94 commented Jul 9, 2021 •

edited

Loading

Uh oh!

wu-sheng commented Jul 9, 2021

Uh oh!

wu-sheng left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Comments

Conversation

kezhenxu94 commented Jul 8, 2021

Improve the performance of LAL

Uh oh!

codecov bot commented Jul 8, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

wu-sheng commented Jul 8, 2021

Uh oh!

wu-sheng left a comment

Choose a reason for hiding this comment

Uh oh!

wu-sheng commented Jul 9, 2021

Uh oh!

kezhenxu94 commented Jul 9, 2021

Uh oh!

wu-sheng commented Jul 9, 2021

Uh oh!

kezhenxu94 commented Jul 9, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wu-sheng commented Jul 9, 2021

Uh oh!

wu-sheng left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codecov bot commented Jul 8, 2021 •

edited

Loading

kezhenxu94 commented Jul 9, 2021 •

edited

Loading