Etcdctl error message may confuse users #17332

huanghj78 · 2024-01-29T07:56:55Z

Bug report criteria

This bug report is not security related, security issues should be disclosed privately via security@etcd.io.
This is not a support request or question, support requests or questions should be raised in the etcd discussion forums.
You have read the etcd bug reporting guidelines.
Existing open issues along with etcd frequently asked questions have been checked and this is not a duplicate.

What happened?

When etcdserver takes a long time to process requests, etcdctl reports an timeout error:

{"level":"warn","ts":"2024-01-29T15:39:48.719735+0800","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-0b2c668b-8c6a-4756-b20e-10d1584680c0/172.16.238.100:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
Error: context deadline exceeded

For this error message, from the user's perspective, this request may have failed, but in reality, it is still possible for it to execute successfully.

What did you expect to happen?

Ensure consistency in logic between the server and client, or refine client-side error messages.

How can we reproduce it (as minimally and precisely as possible)?

Injecting network latency or simply inserting sleep into the source code

Anything else we need to know?

No response

Etcd version (please run commands below)

$ etcd --version
# paste output here
etcd Version: 3.4.27
Git SHA: c92fb80f3
Go Version: go1.19.10
Go OS/Arch: linux/amd64
$ etcdctl version
# paste output here
etcdctl version: 3.4.27
API version: 3.4

Etcd configuration (command line flags or environment variables)

paste your configuration here

version: "3.6"
services:

node1:
image: etcd:debug1
volumes:
- node1-data:/etcd-data
expose:
- 2379
- 2380
networks:
debug_etcd:
ipv4_address: 172.16.238.100
environment:
- ETCDCTL_API=3
command:
- /usr/local/bin/etcd
- --data-dir=/etcd-data
- --name
- node1
- --initial-advertise-peer-urls
- http://172.16.238.100:2380
- --listen-peer-urls
- http://0.0.0.0:2380
- --advertise-client-urls
- http://172.16.238.100:2379
- --listen-client-urls
- http://0.0.0.0:2379
- --initial-cluster
- node1=http://172.16.238.100:2380,node2=http://172.16.238.101:2380,node3=http://172.16.238.102:2380
- --initial-cluster-state
- new
- --initial-cluster-token
- docker-etcd

node2:
image: etcd:debug1
volumes:
- node2-data:/etcd-data
networks:
debug_etcd:
ipv4_address: 172.16.238.101
environment:
- ETCDCTL_API=3
expose:
- 2379
- 2380
command:
- /usr/local/bin/etcd
- --data-dir=/etcd-data
- --name
- node2
- --initial-advertise-peer-urls
- http://172.16.238.101:2380
- --listen-peer-urls
- http://0.0.0.0:2380
- --advertise-client-urls
- http://172.16.238.101:2379
- --listen-client-urls
- http://0.0.0.0:2379
- --initial-cluster
- node1=http://172.16.238.100:2380,node2=http://172.16.238.101:2380,node3=http://172.16.238.102:2380
- --initial-cluster-state
- new
- --initial-cluster-token
- docker-etcd

node3:
image: etcd:debug1
volumes:
- node3-data:/etcd-data
networks:
debug_etcd:
ipv4_address: 172.16.238.102
environment:
- ETCDCTL_API=3
expose:
- 2379
- 2380
command:
- /usr/local/bin/etcd
- --data-dir=/etcd-data
- --name
- node3
- --initial-advertise-peer-urls
- http://172.16.238.102:2380
- --listen-peer-urls
- http://0.0.0.0:2380
- --advertise-client-urls
- http://172.16.238.102:2379
- --listen-client-urls
- http://0.0.0.0:2379
- --initial-cluster
- node1=http://172.16.238.100:2380,node2=http://172.16.238.101:2380,node3=http://172.16.238.102:2380
- --initial-cluster-state
- new
- --initial-cluster-token
- docker-etcd

volumes:
node1-data:
node2-data:
node3-data:

networks:
debug_etcd:
driver: bridge
ipam:
driver: default
config:
-
subnet: 172.16.238.0/24

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
# paste output here
+------------------+---------+-------+----------------------------+----------------------------+------------+
|        ID        | STATUS  | NAME  |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+-------+----------------------------+----------------------------+------------+
|  daf3fd52e3583ff | started | node3 | http://172.16.238.102:2380 | http://172.16.238.102:2379 |      false |
| 422a74f03b622fef | started | node1 | http://172.16.238.100:2380 | http://172.16.238.100:2379 |      false |
| ed635d2a2dbef43d | started | node2 | http://172.16.238.101:2380 | http://172.16.238.101:2379 |      false |
+------------------+---------+-------+----------------------------+----------------------------+------------+

$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here
+----------------------------+------------------+---------------+---------+-----------+------------+-----------+------------+--------------------+--------+
|          ENDPOINT          |        ID        |    VERSION    | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------------+------------------+---------------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://172.16.238.100:2379 | 422a74f03b622fef | 3.6.0-alpha.0 |   20 kB |      true |      false |         2 |          8 |                  8 |        |
| http://172.16.238.101:2379 | ed635d2a2dbef43d | 3.6.0-alpha.0 |   20 kB |     false |      false |         2 |          8 |                  8 |        |
| http://172.16.238.102:2379 |  daf3fd52e3583ff | 3.6.0-alpha.0 |   20 kB |     false |      false |         2 |          8 |                  8 |        |
+----------------------------+------------------+---------------+---------+-----------+------------+-----------+------------+--------------------+--------+

Relevant log output

No response

rittikdasgupta · 2024-01-29T19:58:42Z

@huanghj78 what value are you using for context timeout? And can you increase the context timeout to avoid context deadline errors?

huanghj78 · 2024-01-30T03:16:57Z

@rittikdasgupta Thank you for your reply! I use the default context.

And can you increase the context timeout to avoid context deadline errors?

My concern is that there will inevitably be occurrences of timeout errors at certain times. In such situations, the inconsistency between the server and client handling logic may lead to user confusion.

xulin152 · 2024-01-30T08:41:31Z

@huanghj78您正在使用什么值进行上下文超时？你能增加上下文超时以避免上下文截止日期错误吗？

The data displayed in the 'target' field is inconsistent with what actually occurs.

shreemaan-abhishek · 2024-02-02T03:31:07Z

@huanghj78 I have seen these logs many times before but never bothered to investigate. Do you have any clues on when this occurs/how to reproduce this?

nitishfy · 2024-02-13T18:19:17Z

@huanghj78 I have seen these logs many times before but never bothered to investigate. Do you have any clues on when this occurs/how to reproduce this?

Try debugging a test case such as Replacing an etcd member and you'll come across this error. However, if you run the test case directly, you won't be encountering this error since time to complete the operation didn't exceed the time allotted to complete the operation.

siyuanfoundation · 2024-04-25T18:15:45Z

@nitishfy is there any AI left for this issue?

huanghj78 added the type/bug label Jan 29, 2024

jmhbnz added the area/etcdctl label Feb 11, 2024

jmhbnz added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Feb 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Etcdctl error message may confuse users #17332

Etcdctl error message may confuse users #17332

huanghj78 commented Jan 29, 2024 •

edited

Loading

paste your configuration here

rittikdasgupta commented Jan 29, 2024 •

edited

Loading

huanghj78 commented Jan 30, 2024

xulin152 commented Jan 30, 2024

shreemaan-abhishek commented Feb 2, 2024

nitishfy commented Feb 13, 2024

siyuanfoundation commented Apr 25, 2024

Etcdctl error message may confuse users #17332

Etcdctl error message may confuse users #17332

Comments

huanghj78 commented Jan 29, 2024 • edited Loading

Bug report criteria

What happened?

What did you expect to happen?

How can we reproduce it (as minimally and precisely as possible)?

Anything else we need to know?

Etcd version (please run commands below)

Etcd configuration (command line flags or environment variables)

paste your configuration here

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

Relevant log output

rittikdasgupta commented Jan 29, 2024 • edited Loading

huanghj78 commented Jan 30, 2024

xulin152 commented Jan 30, 2024

shreemaan-abhishek commented Feb 2, 2024

nitishfy commented Feb 13, 2024

siyuanfoundation commented Apr 25, 2024

huanghj78 commented Jan 29, 2024 •

edited

Loading

rittikdasgupta commented Jan 29, 2024 •

edited

Loading