Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcdserver: read-only range request took too long #10174

Closed
DylanBLE opened this issue Oct 11, 2018 · 4 comments
Closed

etcdserver: read-only range request took too long #10174

DylanBLE opened this issue Oct 11, 2018 · 4 comments

Comments

@DylanBLE
Copy link

@DylanBLE DylanBLE commented Oct 11, 2018

etcd version: 3.1.18
etcd logs shows sth like "etcdserver: read-only range request took too long"

2018-10-11 08:36:19.724999 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.602181876s) to execute
2018-10-11 08:36:19.725050 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/cert-manager-token-9kbqg\" " with result "range_response_count:1 size:2507" took too long (434.884389ms) to execute
2018-10-11 08:36:19.725103 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/metrics-server-token-7p8hh\" " with result "range_response_count:1 size:2521" took too long (574.519966ms) to execute
2018-10-11 08:36:19.725180 W | etcdserver: read-only range request "key:\"/registry/secrets/monitor/kube-prometheus-token-hv9v8\" " with result "range_response_count:1 size:2506" took too long (550.946669ms) to execute
2018-10-11 08:36:19.725293 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/metrics-server-token-7p8hh\" " with result "range_response_count:1 size:2521" took too long (574.778948ms) to execute
2018-10-11 08:36:19.725386 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.6021673s) to execute
2018-10-11 08:36:19.725553 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/metrics-server-token-7p8hh\" " with result "range_response_count:1 size:2521" took too long (575.017607ms) to execute
2018-10-11 08:36:28.424718 W | etcdserver: read-only range request "key:\"/registry/configmaps/kube-system/cert-manager-controller\" " with result "range_response_count:1 size:463" took too long (655.555404ms) to execute
2018-10-11 08:36:33.938379 W | wal: sync duration of 1.512199998s, expected less than 1s
2018-10-11 08:36:34.330429 W | etcdserver: read-only range request "key:\"/registry/configmaps/kube-system/cert-manager-controller\" " with result "range_response_count:1 size:463" took too long (1.877004546s) to execute
2018-10-11 08:36:34.332511 W | etcdserver: read-only range request "key:\"/registry/configmaps/kube-system/cert-manager-controller\" " with result "range_response_count:1 size:463" took too long (1.214898596s) to execute
2018-10-11 08:36:34.332604 W | etcdserver: read-only range request "key:\"/registry/configmaps/kube-system/cert-manager-controller\" " with result "range_response_count:1 size:463" took too long (368.532047ms) to execute
2018-10-11 08:36:34.332712 W | etcdserver: read-only range request "key:\"/registry/secrets/monitor/kube-prometheus-token-hv9v8\" " with result "range_response_count:1 size:2506" took too long (985.775514ms) to execute
2018-10-11 08:36:34.332782 W | etcdserver: read-only range request "key:\"/registry/secrets/monitor/kube-prometheus-token-hv9v8\" " with result "range_response_count:1 size:2506" took too long (442.857624ms) to execute
2018-10-11 08:36:38.905450 W | etcdserver: read-only range request "key:\"/registry/configmaps/kube-system/cert-manager-controller\" " with result "range_response_count:1 size:463" took too long (552.704362ms) to execute
2018-10-11 08:36:41.763813 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/cert-manager-token-9kbqg\" " with result "range_response_count:1 size:2507" took too long (1.975448008s) to execute
2018-10-11 08:36:41.765057 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.148544167s) to execute
2018-10-11 08:36:41.765186 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.148905611s) to execute
2018-10-11 08:36:41.765326 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.149057771s) to execute
2018-10-11 08:36:41.765448 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.905296299s) to execute
2018-10-11 08:36:41.765549 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.149250882s) to execute
2018-10-11 08:36:41.765651 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/app-manager-appmanager-token-2tv68\" " with result "range_response_count:1 size:2579" took too long (1.149409844s) to execute
2018-10-11 08:36:41.765771 W | etcdserver: read-only range request "key:\"/registry/events/shengfen/txsql-j62xn-2.155b811d571e37d8\" " with result "range_response_count:1 size:435" took too long (1.074693192s) to execute
2018-10-11 08:36:41.765917 W | etcdserver: read-only range request "key:\"/registry/secrets/kube-system/cert-manager-token-9kbqg\" " with result "range_response_count:1 size:2507" took too long (853.197282ms) to execute

etcd-metrics.txt

@hexfusion
Copy link
Member

@hexfusion hexfusion commented Oct 11, 2018

2018-10-11 08:36:33.938379 W | wal: sync duration of 1.512199998s, expected less than 1s
2018-10-11 08:36:41.765917 W | etcdserver: read-only range request "key:"/registry/secrets/kube-system/cert-manager-token-9kbqg" " with result "range_response_count:1 size:2507" took too long (853.197282ms) to execute

@DylanBLE fair to say you have an I/O bottleneck here these numbers are not acceptable for healthy etcd and metrics confirm fsync spikes. Can you tell us a little about your disks that are storing the datadir SSD vs HDD IOPS etc and in general your workload?

FAQ outlines this issue in terms of general expectations pretty well. FAQ: Performance

Loading

@wenjiaswe
Copy link
Contributor

@wenjiaswe wenjiaswe commented Oct 11, 2018

/cc @jingyih

Loading

@jingyih
Copy link
Member

@jingyih jingyih commented Oct 11, 2018

Agree with @hexfusion. The reading of a single key (indicated by range_response_count:1 in the log message) is taking too long. It could be bottlenecked by disk I/O. Or, it could be bottlenecked by networking: assuming you are using a cluster of several etcd nodes, a quorum read request will involve the communication between etcd cluster leader and the remaining etcd nodes.

Loading

@xiang90
Copy link
Contributor

@xiang90 xiang90 commented Oct 11, 2018

currently some write operations might block read. there is an on going effort (#9384) to resolve this problem.

you should probably fix the slow io regardless though.

Loading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants