test(bench): introduce bench tool for the file cache system #3889
Conversation
license-eye has checked 937 files in total.

| Valid | Invalid | Ignored | Fixed |
| --- | --- | --- | --- |
| 933 | 3 | 1 | 0 |

Invalid file list:
- src/bench/file_cache_bench/bench.rs
- src/bench/file_cache_bench/main.rs
- src/bench/file_cache_bench/utils.rs
Codecov Report
```diff
@@            Coverage Diff             @@
##             main    #3889      +/-   ##
==========================================
- Coverage   73.88%   73.60%   -0.29%
==========================================
  Files         828      832       +4
  Lines      117044   117500     +456
==========================================
+ Hits        86481    86487       +6
- Misses      30563    31013     +450
```
license-eye has checked 942 files in total.

| Valid | Invalid | Ignored | Fixed |
| --- | --- | --- | --- |
| 940 | 1 | 1 | 0 |

Invalid file list:
- src/bench/file_cache_bench/analyze.rs
Calculating quantiles seems to suffer from a performance issue after running for a long time. Investigating the cause. UPDATE: Fixed by switching to another quantile lib.
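As background on why long-running quantile calculation can degrade: keeping every raw sample and sorting at report time makes each report more expensive as samples accumulate, whereas a fixed-bucket histogram records in O(1) and reports in O(buckets). A minimal sketch of the histogram approach (illustrative only; this is not the lib the PR switched to, and all names here are hypothetical):

```rust
/// Hypothetical fixed-bucket latency histogram: recording is O(1) and
/// quantile queries are O(buckets), so report cost stays flat no matter
/// how long the bench runs (unlike sort-all-samples approaches).
struct LatencyHistogram {
    buckets: Vec<u64>, // bucket i counts latencies in [i*width, (i+1)*width) us
    width_us: u64,
    total: u64,
}

impl LatencyHistogram {
    fn new(max_us: u64, width_us: u64) -> Self {
        let n = (max_us / width_us + 1) as usize;
        Self { buckets: vec![0; n], width_us, total: 0 }
    }

    fn record(&mut self, lat_us: u64) {
        // Latencies beyond max_us all land in the last bucket.
        let i = ((lat_us / self.width_us) as usize).min(self.buckets.len() - 1);
        self.buckets[i] += 1;
        self.total += 1;
    }

    /// Returns an upper bound for the q-quantile (q in 0.0..=1.0).
    fn quantile(&self, q: f64) -> u64 {
        let rank = (q * self.total as f64).ceil() as u64;
        let mut seen = 0u64;
        for (i, c) in self.buckets.iter().enumerate() {
            seen += *c;
            if seen >= rank {
                return (i as u64 + 1) * self.width_us;
            }
        }
        self.buckets.len() as u64 * self.width_us
    }
}
```

Recording latencies as integer microseconds also lines up with the "replace f64 with u64 for lat" change in this PR's commit list.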
The get hit latency looks high. I'll run a fio benchmark to check the disk.
fio result: *(attached as a screenshot in the original comment)*
An observation: when disk pressure is high, the latency is much higher, but because of the bandwidth limit, read throughput no longer increases, and the latency can even exceed S3's. FYI, ScyllaDB has characterized this disk behavior before; see https://www.scylladb.com/2018/04/19/scylla-i-o-scheduler-3/. To solve the problem, IMO, we can introduce an IO scheduler.
It's not all bad news, though: since we cannot tell exactly how harmful a given read/write workload is to the file cache system, maybe we can simply introduce a threshold for IOs. What do you think? @hzxa21 @wenym1 @Little-Wallace
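To make the threshold idea concrete, here is a minimal sketch assuming a token-bucket limiter; the type and numbers are hypothetical, not part of this PR. Each IO acquires a token, and tokens refill at the configured sustained IOPS, so bursts beyond the threshold wait instead of saturating the disk:

```rust
use std::time::{Duration, Instant};

/// Hypothetical token-bucket IO throttle: each IO consumes one token;
/// tokens refill at a fixed rate, capping the sustained IOPS the disk sees.
struct IoThrottle {
    capacity: f64,       // max burst size, in IOs
    tokens: f64,         // currently available tokens
    refill_per_sec: f64, // sustained IOPS threshold
    last_refill: Instant,
}

impl IoThrottle {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec, last_refill: Instant::now() }
    }

    /// Blocks until one IO token is available.
    fn acquire(&mut self) {
        loop {
            // Refill based on elapsed time, capped at the burst capacity.
            let elapsed = self.last_refill.elapsed().as_secs_f64();
            self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
            self.last_refill = Instant::now();
            if self.tokens >= 1.0 {
                self.tokens -= 1.0;
                return;
            }
            // Sleep roughly until the next token is due.
            let wait = (1.0 - self.tokens) / self.refill_per_sec;
            std::thread::sleep(Duration::from_secs_f64(wait));
        }
    }
}

fn main() {
    // Hypothetical numbers: cap at 10k sustained IOPS with a small burst allowance.
    let mut throttle = IoThrottle::new(64.0, 10_000.0);
    for _ in 0..5 {
        throttle.acquire();
        // ... issue the actual cache read/write here ...
    }
}
```

The same structure could grow toward a ScyllaDB-style IO scheduler by keeping one bucket per priority class (e.g. foreground reads vs. background flushes) instead of a single global one.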
With a limited rate, the latency can be much better: compare the default configuration against `--loop-min-interval 150 --read 13 --write 1 --concurrency 4`.
So IMO the key point is:
…velabs#3889)

* test(bench): introduce bench tool for the file cache system
* add foreground write and flush metrics
* rename metrics
* add more metrics
* fix license header
* fix file early close when benching and fix timeout
* remove unused args
* replace quantile lib
* replace f64 with u64 for lat
* separate get hit and miss lat
* add more options
I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.
What's changed and what's your intention?
As titled, introduce a bench tool for the file cache system.
Quick Startup:
Usage:
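As a hedged illustration of the usage surface, the flags quoted earlier in this thread (`--loop-min-interval`, `--read`, `--write`, `--concurrency`) suggest a clap definition roughly like the following; the struct, defaults, and help texts are guesses, not the PR's actual code:

```rust
use clap::Parser;

/// Hypothetical reconstruction of the bench tool's CLI, based only on the
/// flags quoted in this thread; defaults and descriptions are assumptions.
#[derive(Parser, Debug)]
struct Args {
    /// Minimum interval of each bench loop iteration, in milliseconds.
    #[clap(long, default_value = "0")]
    loop_min_interval: u64,

    /// Number of read operations issued per loop iteration.
    #[clap(long, default_value = "1")]
    read: usize,

    /// Number of write operations issued per loop iteration.
    #[clap(long, default_value = "1")]
    write: usize,

    /// Number of concurrent bench workers.
    #[clap(long, default_value = "1")]
    concurrency: usize,
}

fn main() {
    let args = Args::parse();
    println!("{args:?}");
}
```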
Output example:
To monitor:
Checklist

- `./risedev check` (or alias, `./risedev c`)

Refer to a related PR or issue link (optional)
#198
#3556