This repository has been archived by the owner on Aug 3, 2020. It is now read-only.

Why does the state-backend benchmark include the time for job submission and task deployment? #11

Closed
Myasuka opened this issue Oct 11, 2018 · 2 comments

Comments

Myasuka commented Oct 11, 2018

Hi all

As I understand it, the state-backend benchmark should only compare the performance of combined state operations, but the operation measured by JMH also includes the time spent submitting the job, deploying tasks, and so on.
Why not compare the standalone state-backend performance, without influence from other Flink components?

pnowojski commented Oct 12, 2018

Hi,

Yes, that's true. There are two reasons why.

  1. Large ITCase-style benchmarks for the state backend were simply much easier to implement. It is easier to change one config value and run an already existing application than to set up a whole new low-level benchmark of state writes/reads. To see how much more code is required to set up lower-level network benchmarks, check the org.apache.flink.streaming.runtime.io.benchmark package in the flink-streaming-java module.
  2. Even with lower-level pure state benchmarks, there is still a need for larger ITCases (and for even larger cluster benchmarking, which is also not covered, by the way). This is the same discussion as whether you should write a unit test, an integration test, or maybe a stress test for a given component. All of them have some value.

As far as I recall, the setup/submission logic wasn't affecting the results that much. This can be easily verified by increasing the number of processed records: if increasing it by a factor of 2 doesn't change the throughput by much, then the submission overheads are negligible.
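The doubling check can be illustrated with a small model. This is only a sketch, not code from the benchmark suite: it assumes the measured time is a fixed setup cost plus the record count divided by a constant steady-state rate. The function names and the numbers (0.5 s of setup, 1,000,000 records/s) are hypothetical.

```python
def measured_throughput(records, setup_seconds, steady_rate):
    """Records/s as reported when a fixed setup cost sits inside the timed region.

    Assumed model: total time = setup_seconds + records / steady_rate.
    """
    total_seconds = setup_seconds + records / steady_rate
    return records / total_seconds


def doubling_change(records, setup_seconds, steady_rate):
    """Relative change in reported throughput when the record count is doubled."""
    base = measured_throughput(records, setup_seconds, steady_rate)
    doubled = measured_throughput(2 * records, setup_seconds, steady_rate)
    return (doubled - base) / base


if __name__ == "__main__":
    # Hypothetical numbers: 0.5 s of submission/deployment, 1,000,000 records/s.
    # Short run: setup dominates, so doubling shifts throughput by roughly 71%.
    print(f"{doubling_change(100_000, 0.5, 1_000_000):.2%}")
    # Long run: setup is amortized, so doubling shifts throughput by about 0.25%.
    print(f"{doubling_change(100_000_000, 0.5, 1_000_000):.2%}")
```

Under this model the relative change equals setup time divided by the total time of the doubled run, so a small change after doubling means the submission overhead is a small share of the measurement.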

Having said that, it would be great to have a good low-level benchmark suite for state accesses :)

@pnowojski

I think this got resolved by #13
