Description
Bug report criteria
- This bug report is not security related; security issues should be disclosed privately via security@etcd.io.
- This is not a support request or question; support requests or questions should be raised in the etcd discussion forums.
- You have read the etcd bug reporting guidelines.
- Existing open issues along with etcd frequently asked questions have been checked and this is not a duplicate.
What happened?
Yesterday we merged #20173, which added a new type of client to both the Robustness tests and the Antithesis tests: one that connects to all members.
Within one day, both frameworks found a new issue with watch: it broke the Bookmarkable guarantee ("Progress notification events guarantee that all events up to a revision have been already delivered"). Failures:
- Antithesis: https://linuxfoundation.antithesis.com/report/UZjUP_KGxboJepL7k1q_8pa4/ZqL0Vt9a7YESiiBmGecPMkBP8YgM1IwlTZJ4dcYjmZ8.html?auth=v2.public.eyJuYmYiOiIyMDI1LTA2LTI1VDAzOjE4OjIzLjM4MDU2MDQwMFoiLCJzY29wZSI6eyJSZXBvcnRTY29wZVYxIjp7ImFzc2V0IjoiWnFMMFZ0OWE3WUVTaWlCbUdlY1BNa0JQOFlnTTFJd2xUWko0ZGNZam1aOC5odG1sIiwicmVwb3J0X2lkIjoiVVpqVVBfS0d4Ym9KZXBMN2sxcV84cGE0In19feIAsYO4-UIigcL4eMu7QUqA6XFbCU3Hnw7BeyZW06o9x11mFqleHbSbRWdIcLdTH2Xzx42DXNB7dBqYq25Ujg4
- Robustness: https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/ci-etcd-robustness-main-amd64/1937584192116232192
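For readers unfamiliar with the Bookmarkable guarantee, here is a minimal Go sketch of how it could be checked over a recorded stream of watch responses. The `Event` and `Response` types and the `ProgressNotify` field are hypothetical simplifications for illustration. They are not the robustness framework's actual report types or validation code.

```go
package main

import "fmt"

// Event and Response are hypothetical, simplified stand-ins for recorded
// watch data; they are not the types used by the etcd robustness tests.
type Event struct {
	Key      string
	Revision int64
}

type Response struct {
	Events         []Event
	ProgressNotify bool // assumed flag marking a progress notification
	Revision       int64
}

// checkBookmarkable returns an error if an event arrives with a revision at
// or below the revision of a progress notification that was seen earlier.
func checkBookmarkable(responses []Response) error {
	var bookmark int64 // highest revision announced by a progress notification so far
	for _, resp := range responses {
		for _, ev := range resp.Events {
			if ev.Revision <= bookmark {
				return fmt.Errorf("event %q at revision %d arrived after progress notification at revision %d",
					ev.Key, ev.Revision, bookmark)
			}
		}
		if resp.ProgressNotify && resp.Revision > bookmark {
			bookmark = resp.Revision
		}
	}
	return nil
}

func main() {
	// Illustrative data only: a progress notification followed by an older event.
	stream := []Response{
		{ProgressNotify: true, Revision: 148},
		{Events: []Event{{Key: "/registry/pods/default/89523", Revision: 67}}, Revision: 67},
	}
	if err := checkBookmarkable(stream); err != nil {
		fmt.Println("Bookmarkable violated:", err)
	}
}
```

The stream in `main` is made up to trigger the check; it is not taken from either report.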
Example: a watch request starting at revision 148 returns an event with revision 67.
{
  "Request": {
    "Key": "/registry/pods/",
    "Revision": 148,
    "WithPrefix": true,
    "WithProgressNotify": true,
    "WithPrevKV": true
  },
  "Responses": [
    {
      "Events": [
        {
          "Type": "put-operation",
          "Key": "/registry/pods/default/89523",
          "Value": {
            "Value": "65"
          },
          "Revision": 67,
          "PrevValue": {
            "Value": {
              "Value": "46"
            },
            "ModRevision": 35,
            "Version": 3
          }
        }
      ],
      "Revision": 67,
      "Time": 496351519816
    },
Because this issue never appeared with clients connecting to a single etcd member, I expect it is a problem in the client load-balancing or watch-resume logic.
/cc @ahrtr @fuweid @siyuanfoundation @nwnt @joshjms @marcus-hodgson-antithesis
What did you expect to happen?
For a watch request starting at revision 148, etcd should not return any event with a revision lower than 148.
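The expectation above can be sketched as a small check over a recorded watch. This is a hedged illustration in Go: `WatchEvent`, `WatchResponse`, and `firstEventBelowStart` are hypothetical names that mirror only the `Events` and `Revision` fields visible in the report excerpt, not the robustness framework's own validation.

```go
package main

import "fmt"

// WatchEvent and WatchResponse are hypothetical types carrying only the
// fields needed for this check.
type WatchEvent struct {
	Key      string
	Revision int64
}

type WatchResponse struct {
	Events   []WatchEvent
	Revision int64
}

// firstEventBelowStart scans the responses in order and returns the first
// event whose revision is lower than the revision the watch was started at.
// The boolean is false when no such event exists.
func firstEventBelowStart(startRevision int64, responses []WatchResponse) (WatchEvent, bool) {
	for _, resp := range responses {
		for _, ev := range resp.Events {
			if ev.Revision < startRevision {
				return ev, true
			}
		}
	}
	return WatchEvent{}, false
}

func main() {
	// Values copied from the excerpt in this report: request revision 148,
	// first recorded event at revision 67.
	responses := []WatchResponse{
		{Events: []WatchEvent{{Key: "/registry/pods/default/89523", Revision: 67}}, Revision: 67},
	}
	if ev, found := firstEventBelowStart(148, responses); found {
		fmt.Printf("violation: watch from revision 148 delivered %s at revision %d\n", ev.Key, ev.Revision)
	}
}
```

Running a check like this against both the client-side recording and the server-side history could help separate a client bug from a recording bug.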
How can we reproduce it (as minimally and precisely as possible)?
Report by Antithesis: var_report_dump.tar.gz
Report by Robustness: results.tar.gz
Anything else we need to know?
We need to confirm that this is a bug on the etcd client side and not on the robustness recording side.
Etcd version (please run commands below)
Robustness found it on etcd main, Antithesis on release-3.6.
Etcd configuration (command line flags or environment variables)
paste your configuration here
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
$ etcdctl member list -w table
# paste output here
$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here