
Add await metrics to "OS Metrics" dashboard #1343

Closed
vladzcloudius opened this issue Mar 26, 2021 · 10 comments · Fixed by #1409

Comments

vladzcloudius (Contributor) commented Mar 26, 2021

System information

  • Scylla version (you are using): All of them
  • Are you willing to contribute it (Yes/No): I'll attach queries below

Describe the feature and the current behavior/state.
"OS Metrics" dashboard is missing a very important graph: r-wait/w-wait disk graphs.
These are crucial when debugging latencies related issues.

Who will benefit with this feature?
Everybody

Any Other info.
Here is a Prometheus query that may be used for r-wait:

irate(node_disk_read_time_seconds_total[30s]) / irate(node_disk_reads_completed_total[30s])

More info may be found here: https://www.robustperception.io/mapping-iostat-to-the-node-exporters-node_disk_-metrics
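
For completeness, here is a sketch of the matching w-wait expression, assuming the standard node_exporter metric names (node_disk_write_time_seconds_total, node_disk_writes_completed_total):

irate(node_disk_write_time_seconds_total[30s]) / irate(node_disk_writes_completed_total[30s])

Both expressions divide the time spent on I/O by the number of completed operations over the same window, which is how iostat derives its await columns.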

vladzcloudius (Contributor Author) commented:

@slivne @amnonh FYI

slivne commented Apr 4, 2021 via email

vladzcloudius (Contributor Author) commented Apr 5, 2021

> Since scylla is using a user space i/o scheduler and will not submit items to the disk more than the disk can "chew", these values are expected to be 0 or close to 0.

Right. Not zero, but it should be reasonably low. Note that it will also depend on the size of the I/O operations taking place in a given time frame.

> The cases in which these values can be higher than 0 are:
> - something else (aside from scylla) is using the disk (backup upload is an example as well - in such cases we should tune the system accordingly to ensure that manager-agent is not using too much bandwidth and that scylla allows enough bandwidth for the backup upload),
> - or we have an issue with iotune / disk settings (cache in GCP causing this).
> Amnon, this can also be an advisor input - e.g. a value higher than 0.5 can be considered an indication that the system is not set up/tuned correctly.

Good point.
0.5 second is definitely a very high await value.
If memory serves me well, iotune targets a specific max I/O latency.
@xemul Could you please tell me what it is exactly? (Let's call it X for now.)

So, we should probably indicate when actual await times are higher than 10x that value. Maybe even before...

xemul commented Apr 6, 2021

> Since scylla is using a user space i/o scheduler and will not submit items to the disk more than the disk can "chew", these values are expected to be 0 or close to 0.

Depends on what "can chew" means. Note that seastar's goal is to send to the disk as much data as it can complete within a given time. Strictly speaking, that is not the same as "as much data as the disk can process without delays". From what I see on both AWS and GCE instances, pure writes are always ~2x "overdispatched", in the sense that the disk cannot process everything right away and queues some data, but still manages to complete everything within the latency goal.

> 0.5 second is definitely a very high await value.
> If memory serves me well, iotune targets a specific max I/O latency.
> @xemul Could you please tell me what it is exactly? (Let's call it X for now.)

Default latency goal is 0.75ms.
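
To relate measured await to this goal, one could divide the r-wait expression above by the goal, giving await in units of the latency goal (using 0.00075 s here is an assumption based on the default quoted in this comment):

(irate(node_disk_read_time_seconds_total[30s]) / irate(node_disk_reads_completed_total[30s])) / 0.00075

A value around 1 means the disk completes requests roughly within the goal; the 10x factor suggested earlier would correspond to this ratio exceeding 10.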

vladzcloudius (Contributor Author) commented:

> Since scylla is using a user space i/o scheduler and will not submit items to the disk more than the disk can "chew", these values are expected to be 0 or close to 0.

> Depends on what "can chew" means. Note that seastar's goal is to send to the disk as much data as it can complete within a given time. Strictly speaking, that is not the same as "as much data as the disk can process without delays". From what I see on both AWS and GCE instances, pure writes are always ~2x "overdispatched", in the sense that the disk cannot process everything right away and queues some data, but still manages to complete everything within the latency goal.

Absolutely.
@slivne note that there MUST be some concurrency if we want to get close to the optimum throughput. Hence the added latency (and therefore the added await time) will be non-zero as well. This is on top of the latency of handling a single request, which will also be part of the resulting await time.

xemul commented Apr 7, 2021

> note that there MUST be some concurrency if we want to get close to the optimum throughput

Not always. For reads, yes. For writes, on an AWS 4-disk RAID I saw that issuing 64k write requests one at a time already showed peak throughput.

amnonh added this to the monitoring 3.8 milestone on May 6, 2021
amnonh (Collaborator) commented May 18, 2021

@vladzcloudius @slivne "Close to zero" is not something I can work with. I understand it needs to be very low; can we agree on some value?
On my laptop (with an NVMe disk) I get the following:
[screenshot omitted]

vladzcloudius (Contributor Author) commented:

> @vladzcloudius @slivne "Close to zero" is not something I can work with. I understand it needs to be very low; can we agree on some value?

@amnonh I'm not sure what you are asking about here. Could you please clarify?

amnonh (Collaborator) commented May 18, 2021

@vladzcloudius There are two issues here:

  1. The comment for the graph: right now it just says "average time of read/write". Do we want to say something more? Stating that it should be low is not very meaningful for the user - in my example, is that OK? How can the user know?
  2. For an advisor alert I need a hard-coded value, so I need a number we can agree upon: if the average read or write time is higher than it, an alert will be generated suggesting to check iotune.

vladzcloudius (Contributor Author) commented:

> @vladzcloudius There are two issues here:
>
> 1. The comment for the graph: right now it just says "average time of read/write". Do we want to say something more? Stating that it should be low is not very meaningful for the user - in my example, is that OK? How can the user know?

@amnonh
I believe what you have right now is good enough. No need to add anything.

> 2. For an advisor alert I need a hard-coded value, so I need a number we can agree upon: if the average read or write time is higher than it, an alert will be generated suggesting to check iotune.

The "normal" value depends on the actual HW used for I/O.
"Agreeing" on a fixed value here is not going to cut it.

@xemul Can iotune generate an expected await time?

@amnonh In any case the value needs to be configurable: e.g. there are going to be very different values for NVMe, SSD and HDD.
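
To make the advisor idea above concrete, here is a minimal sketch of a Prometheus alerting rule. The group and alert names are hypothetical, and the threshold is a placeholder (10x the 0.75 ms default latency goal mentioned earlier) that would need to be made configurable per disk type (NVMe/SSD/HDD), as discussed:

groups:
  - name: os_metrics_advisor        # hypothetical group name
    rules:
      - alert: HighDiskReadAwait    # hypothetical alert name
        # 0.0075 s = 10x the 0.75 ms default latency goal; a placeholder, not an agreed value
        expr: irate(node_disk_read_time_seconds_total[30s]) / irate(node_disk_reads_completed_total[30s]) > 0.0075
        for: 5m
        annotations:
          description: Average disk read await is unexpectedly high; check iotune / disk settings.

A matching rule for writes would use node_disk_write_time_seconds_total and node_disk_writes_completed_total.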
