Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcdserver: read-only range request took too long #10174

Closed
DylanBLE opened this issue Oct 11, 2018 · 4 comments

Comments

@DylanBLE
Copy link

commented Oct 11, 2018

etcd version: 3.1.18
etcd logs shows sth like "etcdserver: read-only range request took too long"

2018-10-11 08:36:19.724999 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.602181876s) to execute
2018-10-11 08:36:19.725050 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/cert-manager-token-9kbqg\" " with result "range_response_count:1 size:2507" took too long (434.884389ms) to execute
2018-10-11 08:36:19.725103 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/metrics-server-token-7p8hh\" " with result "range_response_count:1 size:2521" took too long (574.519966ms) to execute
2018-10-11 08:36:19.725180 W | etcdserver: read-only range request "key:\"/registry/secrets/monitor/kube-prometheus-token-hv9v8\" " with result "range_response_count:1 size:2506" took too long (550.946669ms) to execute
2018-10-11 08:36:19.725293 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/metrics-server-token-7p8hh\" " with result "range_response_count:1 size:2521" took too long (574.778948ms) to execute
2018-10-11 08:36:19.725386 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.6021673s) to execute
2018-10-11 08:36:19.725553 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/metrics-server-token-7p8hh\" " with result "range_response_count:1 size:2521" took too long (575.017607ms) to execute
2018-10-11 08:36:28.424718 W | etcdserver: read-only range request "key:\"/registry/configmaps/kube-system/cert-manager-controller\" " with result "range_response_count:1 size:463" took too long (655.555404ms) to execute
2018-10-11 08:36:33.938379 W | wal: sync duration of 1.512199998s, expected less than 1s
2018-10-11 08:36:34.330429 W | etcdserver: read-only range request "key:\"/registry/configmaps/kube-system/cert-manager-controller\" " with result "range_response_count:1 size:463" took too long (1.877004546s) to execute
2018-10-11 08:36:34.332511 W | etcdserver: read-only range request "key:\"/registry/configmaps/kube-system/cert-manager-controller\" " with result "range_response_count:1 size:463" took too long (1.214898596s) to execute
2018-10-11 08:36:34.332604 W | etcdserver: read-only range request "key:\"/registry/configmaps/kube-system/cert-manager-controller\" " with result "range_response_count:1 size:463" took too long (368.532047ms) to execute
2018-10-11 08:36:34.332712 W | etcdserver: read-only range request "key:\"/registry/secrets/monitor/kube-prometheus-token-hv9v8\" " with result "range_response_count:1 size:2506" took too long (985.775514ms) to execute
2018-10-11 08:36:34.332782 W | etcdserver: read-only range request "key:\"/registry/secrets/monitor/kube-prometheus-token-hv9v8\" " with result "range_response_count:1 size:2506" took too long (442.857624ms) to execute
2018-10-11 08:36:38.905450 W | etcdserver: read-only range request "key:\"/registry/configmaps/kube-system/cert-manager-controller\" " with result "range_response_count:1 size:463" took too long (552.704362ms) to execute
2018-10-11 08:36:41.763813 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/cert-manager-token-9kbqg\" " with result "range_response_count:1 size:2507" took too long (1.975448008s) to execute
2018-10-11 08:36:41.765057 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.148544167s) to execute
2018-10-11 08:36:41.765186 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.148905611s) to execute
2018-10-11 08:36:41.765326 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.149057771s) to execute
2018-10-11 08:36:41.765448 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.905296299s) to execute
2018-10-11 08:36:41.765549 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.149250882s) to execute
2018-10-11 08:36:41.765651 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.149409844s) to execute
2018-10-11 08:36:41.765771 W | etcdserver: read-only range request "key:\"/registry/events/shengfen/txsql-j62xn-2.155b811d571e37d8\" " with result "range_response_count:1 size:435" took too long (1.074693192s) to execute
2018-10-11 08:36:41.765917 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/cert-manager-token-9kbqg\" " with result "range_response_count:1 size:2507" took too long (853.197282ms) to execute

etcd-metrics.txt

@hexfusion

This comment has been minimized.

Copy link
Member

commented Oct 11, 2018

2018-10-11 08:36:33.938379 W | wal: sync duration of 1.512199998s, expected less than 1s
2018-10-11 08:36:41.765917 W | etcdserver: read-only range request "key:"/registry/secrets/kube-system/cert-manager-token-9kbqg" " with result "range_response_count:1 size:2507" took too long (853.197282ms) to execute

@DylanBLE fair to say you have an I/O bottleneck here these numbers are not acceptable for healthy etcd and metrics confirm fsync spikes. Can you tell us a little about your disks that are storing the datadir SSD vs HDD IOPS etc and in general your workload?

FAQ outlines this issue in terms of general expectations pretty well. FAQ: Performance

@wenjiaswe

This comment has been minimized.

Copy link
Contributor

commented Oct 11, 2018

/cc @jingyih

@jingyih

This comment has been minimized.

Copy link
Member

commented Oct 11, 2018

Agree with @hexfusion. The reading of a single key (indicated by range_response_count:1 in the log message) is taking too long. It could be bottlenecked by disk I/O. Or, it could be bottlenecked by networking: assuming you are using a cluster of several etcd nodes, a quorum read request will involve the communication between etcd cluster leader and the remaining etcd nodes.

@xiang90

This comment has been minimized.

Copy link
Contributor

commented Oct 11, 2018

currently some write operations might block read. there is an on going effort (#9384) to resolve this problem.

you should probably fix the slow io regardless though.

@xiang90 xiang90 closed this Oct 11, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
5 participants
You can’t perform that action at this time.