Fix "Reason: Expired: too old resource version: 379140622 (380367990)" #23504
ecerulm wants to merge 1 commit into apache:main
The previous implementation relied on watch returning the events sorted by resource_version which is not guaranteed.
I think this probably closes #21087.
Although this PR works in EKS, it's not legal to assume that resourceVersion values are numeric or that they can be sorted in any meaningful way; see Resource Version Semantics.
Arf, too bad! Thanks for trying though. Ain't that weird. Because then how can Kubernetes tell you that a resource is too old? Doesn't that imply some sort of order? Maybe it's just a matter of phrasing, and what it means is that the resource just no longer exists (if it ever existed). But that's very misleading. And then that means there is no other way than dealing with the bookmark events to implement this correctly, right?
Yes, I think it's misleading too. That's why I thought I could sort by resource version, but no, resource versions are just opaque ids.
I don't think the bookmark thing solves it either as stated:
I have another PR, #23521, which just resets to a fresh watch if there is any error. It will probably get some old (already processed) events, but I don't think that breaks anything.
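To make the "reset to a fresh watch" idea concrete, here is a rough sketch of how I picture it. This is not the code from either PR: `ExpiredError`, `open_watch`, `handle` and the event dict shape are all hypothetical stand-ins, and the de-duplication set is optional, only there to show that replayed events can be skipped using equality checks alone, without ever ordering resource versions.

```python
from typing import Callable, Iterable, Optional


class ExpiredError(Exception):
    """Hypothetical stand-in for the 'too old resource version' error."""


def watch_with_reset(
    open_watch: Callable[[Optional[str]], Iterable[dict]],
    handle: Callable[[dict], None],
    max_restarts: int = 3,
) -> None:
    """Consume events; on an expired version, start over from a fresh watch.

    Assumed event shape (hypothetical): {"uid": str, "resource_version": str}.
    """
    resource_version = None  # None means "start a fresh watch"
    seen = set()  # (uid, resource_version) pairs, compared by equality only
    restarts = 0
    while True:
        try:
            for event in open_watch(resource_version):
                # Remember the version of the most recent event as an opaque
                # string. It is never parsed as a number or sorted.
                resource_version = event["resource_version"]
                key = (event["uid"], resource_version)
                if key in seen:
                    continue  # replayed after a reset, already processed
                seen.add(key)
                handle(event)
            return  # the stream ended normally
        except ExpiredError:
            restarts += 1
            if restarts > max_restarts:
                raise
            resource_version = None  # drop the stale version and start over
```

In a long-running watcher the `seen` set would need to be bounded or dropped entirely; if the handler is idempotent, reprocessing the replayed events should be harmless.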
The previous implementation relied on watch returning the events
sorted by resource_version, which is not guaranteed, at least in EKS.
So previously you could end up with KubernetesJobWatcher retrying the watch from a resource_version that is no longer valid (already too old).