Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Etcdctl error message may confuse users #17332

Open
4 tasks
huanghj78 opened this issue Jan 29, 2024 · 6 comments
Open
4 tasks

Etcdctl error message may confuse users #17332

huanghj78 opened this issue Jan 29, 2024 · 6 comments
Labels
area/etcdctl priority/backlog Higher priority than priority/awaiting-more-evidence. type/bug

Comments

@huanghj78
Copy link

huanghj78 commented Jan 29, 2024

Bug report criteria

What happened?

When etcdserver takes a long time to process requests, etcdctl reports an timeout error:

{"level":"warn","ts":"2024-01-29T15:39:48.719735+0800","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-0b2c668b-8c6a-4756-b20e-10d1584680c0/172.16.238.100:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
Error: context deadline exceeded

For this error message, from the user's perspective, this request may have failed, but in reality, it is still possible for it to execute successfully.
image

What did you expect to happen?

Ensure consistency in logic between the server and client, or refine client-side error messages.

How can we reproduce it (as minimally and precisely as possible)?

Injecting network latency or simply inserting sleep into the source code

Anything else we need to know?

No response

Etcd version (please run commands below)

$ etcd --version
# paste output here
etcd Version: 3.4.27
Git SHA: c92fb80f3
Go Version: go1.19.10
Go OS/Arch: linux/amd64
$ etcdctl version
# paste output here
etcdctl version: 3.4.27
API version: 3.4

Etcd configuration (command line flags or environment variables)

paste your configuration here

version: "3.6"
services:

node1:
image: etcd:debug1
volumes:
- node1-data:/etcd-data
expose:
- 2379
- 2380
networks:
debug_etcd:
ipv4_address: 172.16.238.100
environment:
- ETCDCTL_API=3
command:
- /usr/local/bin/etcd
- --data-dir=/etcd-data
- --name
- node1
- --initial-advertise-peer-urls
- http://172.16.238.100:2380
- --listen-peer-urls
- http://0.0.0.0:2380
- --advertise-client-urls
- http://172.16.238.100:2379
- --listen-client-urls
- http://0.0.0.0:2379
- --initial-cluster
- node1=http://172.16.238.100:2380,node2=http://172.16.238.101:2380,node3=http://172.16.238.102:2380
- --initial-cluster-state
- new
- --initial-cluster-token
- docker-etcd

node2:
image: etcd:debug1
volumes:
- node2-data:/etcd-data
networks:
debug_etcd:
ipv4_address: 172.16.238.101
environment:
- ETCDCTL_API=3
expose:
- 2379
- 2380
command:
- /usr/local/bin/etcd
- --data-dir=/etcd-data
- --name
- node2
- --initial-advertise-peer-urls
- http://172.16.238.101:2380
- --listen-peer-urls
- http://0.0.0.0:2380
- --advertise-client-urls
- http://172.16.238.101:2379
- --listen-client-urls
- http://0.0.0.0:2379
- --initial-cluster
- node1=http://172.16.238.100:2380,node2=http://172.16.238.101:2380,node3=http://172.16.238.102:2380
- --initial-cluster-state
- new
- --initial-cluster-token
- docker-etcd

node3:
image: etcd:debug1
volumes:
- node3-data:/etcd-data
networks:
debug_etcd:
ipv4_address: 172.16.238.102
environment:
- ETCDCTL_API=3
expose:
- 2379
- 2380
command:
- /usr/local/bin/etcd
- --data-dir=/etcd-data
- --name
- node3
- --initial-advertise-peer-urls
- http://172.16.238.102:2380
- --listen-peer-urls
- http://0.0.0.0:2380
- --advertise-client-urls
- http://172.16.238.102:2379
- --listen-client-urls
- http://0.0.0.0:2379
- --initial-cluster
- node1=http://172.16.238.100:2380,node2=http://172.16.238.101:2380,node3=http://172.16.238.102:2380
- --initial-cluster-state
- new
- --initial-cluster-token
- docker-etcd

volumes:
node1-data:
node2-data:
node3-data:

networks:
debug_etcd:
driver: bridge
ipam:
driver: default
config:
-
subnet: 172.16.238.0/24

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
# paste output here
+------------------+---------+-------+----------------------------+----------------------------+------------+
|        ID        | STATUS  | NAME  |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+-------+----------------------------+----------------------------+------------+
|  daf3fd52e3583ff | started | node3 | http://172.16.238.102:2380 | http://172.16.238.102:2379 |      false |
| 422a74f03b622fef | started | node1 | http://172.16.238.100:2380 | http://172.16.238.100:2379 |      false |
| ed635d2a2dbef43d | started | node2 | http://172.16.238.101:2380 | http://172.16.238.101:2379 |      false |
+------------------+---------+-------+----------------------------+----------------------------+------------+

$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here
+----------------------------+------------------+---------------+---------+-----------+------------+-----------+------------+--------------------+--------+
|          ENDPOINT          |        ID        |    VERSION    | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------------+------------------+---------------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://172.16.238.100:2379 | 422a74f03b622fef | 3.6.0-alpha.0 |   20 kB |      true |      false |         2 |          8 |                  8 |        |
| http://172.16.238.101:2379 | ed635d2a2dbef43d | 3.6.0-alpha.0 |   20 kB |     false |      false |         2 |          8 |                  8 |        |
| http://172.16.238.102:2379 |  daf3fd52e3583ff | 3.6.0-alpha.0 |   20 kB |     false |      false |         2 |          8 |                  8 |        |
+----------------------------+------------------+---------------+---------+-----------+------------+-----------+------------+--------------------+--------+

Relevant log output

No response

@rittikdasgupta
Copy link

rittikdasgupta commented Jan 29, 2024

@huanghj78 what value are you using for context timeout? And can you increase the context timeout to avoid context deadline errors?

@huanghj78
Copy link
Author

@rittikdasgupta Thank you for your reply! I use the default context.

And can you increase the context timeout to avoid context deadline errors?

My concern is that there will inevitably be occurrences of timeout errors at certain times. In such situations, the inconsistency between the server and client handling logic may lead to user confusion.

@xulin152
Copy link

@huanghj78您正在使用什么值进行上下文超时?你能增加上下文超时以避免上下文截止日期错误吗?

The data displayed in the 'target' field is inconsistent with what actually occurs.

@shreemaan-abhishek
Copy link
Contributor

@huanghj78 I have seen these logs many times before but never bothered to investigate. Do you have any clues on when this occurs/how to reproduce this?

@nitishfy
Copy link
Contributor

@huanghj78 I have seen these logs many times before but never bothered to investigate. Do you have any clues on when this occurs/how to reproduce this?

Try debugging a test case such as Replacing an etcd member and you'll come across this error. However, if you run the test case directly, you won't be encountering this error since time to complete the operation didn't exceed the time allotted to complete the operation.

@jmhbnz jmhbnz added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Feb 14, 2024
@siyuanfoundation
Copy link
Contributor

@nitishfy is there any AI left for this issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/etcdctl priority/backlog Higher priority than priority/awaiting-more-evidence. type/bug
Development

No branches or pull requests

7 participants