Old events from the past yielded due to remembered resource_version #819

@nolar

Current behaviour

The list...() operation (actually, the Kubernetes API) returns the resources in arbitrary order. As a result, very old resources (e.g. custom resources that are one month old) can come last in the list.

In the current implementation, the watcher-streamer remembers the resource_version of the last seen resource. In that case, the old resource version (one month old) is remembered as the last seen one. When the HTTP call is disconnected for any reason, the watcher-streamer starts a new one, using that remembered old resource_version as the base.

As a result, all the changes to all the resources of that kind are yielded again, i.e. everything that happened in the past month, even though these changes were already yielded before (and presumably handled by the consumer). For objects created since that old timestamp, it yields the ADDED event, all MODIFIED events, and even the DELETED events.
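To make the mechanism concrete, here is a minimal sketch of the "remember the last seen version" behaviour described above. It does not use the real client: `listed` is a hand-picked subset of the (name, resourceVersion) pairs from the listing in this issue, and `last_seen_version` is a hypothetical helper that only mimics the described logic.

```python
# Hypothetical stand-in for a listing; the pairs are copied from the
# output shown in this issue, in the order in which they were returned.
listed = [
    ("mycrd-20190314165455", "233262799"),
    ("mycrd-20190424073015", "232894129"),
    ("mycrd-20190411073018", "223394843"),  # happens to come last
]


def last_seen_version(items):
    """Mimic the described behaviour: keep whichever version came last."""
    remembered = None
    for _name, version in items:
        remembered = version
    return remembered


remembered = last_seen_version(listed)

# Comparing as integers is only done here to show the gap; it assumes the
# versions are numeric, which the API does not promise.
highest = max(int(version) for _name, version in listed)

print("remembered:", remembered)
print("highest seen:", highest)
```

The remembered value depends purely on the order of the listing, so a reconnect from it can start far behind the newest object that was already seen.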

Example

Example for my custom resource kind:

In [59]: kubernetes.config.load_kube_config()  # developer's config files
In [60]: api = kubernetes.client.CustomObjectsApi()
In [61]: api_fn = api.list_cluster_custom_object
In [62]: w = kubernetes.watch.Watch()
In [63]: stream = w.stream(api_fn, 'example.com', 'v1', 'mycrds')
In [64]: for ev in stream: print((ev['type'], ev['object'].get('metadata', {}).get('name'), ev['object'].get('metadata', {}).get('resourceVersion'), ev['object'] if ev['type'] == 'ERROR' else None))

('ADDED', 'mycrd-20190328073027', '213646032', None)
('ADDED', 'mycrd-20190404073027', '222002640', None)
('ADDED', 'mycrd-20190408065731', '222002770', None)
('ADDED', 'mycrd-20190409073007', '222002799', None)
('ADDED', 'mycrd-20190410073012', '222070110', None)
('ADDED', 'mycrd-20190412073005', '223458915', None)
('ADDED', 'mycrd-20190416073028', '226128256', None)
('ADDED', 'mycrd-20190314165455', '233262799', None)
('ADDED', 'mycrd-20190315073002', '205552290', None)
('ADDED', 'mycrd-20190321073022', '209509389', None)
('ADDED', 'mycrd-20190322073027', '209915543', None)
('ADDED', 'mycrd-20190326073030', '212318823', None)
('ADDED', 'mycrd-20190402073005', '222002561', None)
('ADDED', 'mycrd-20190415154942', '225660142', None)
('ADDED', 'mycrd-20190419073010', '228579290', None)
('ADDED', 'mycrd-20190423073032', '232894099', None)
('ADDED', 'mycrd-20190424073015', '232894129', None)
('ADDED', 'mycrd-20190319073031', '207954735', None)
('ADDED', 'mycrd-20190403073019', '222002615', None)
('ADDED', 'mycrd-20190405073040', '222002719', None)
('ADDED', 'mycrd-20190415070301', '225374502', None)
('ADDED', 'mycrd-20190417073005', '226917625', None)
('ADDED', 'mycrd-20190418073023', '227736631', None)
('ADDED', 'mycrd-20190327073030', '212984265', None)
('ADDED', 'mycrd-20190422061326', '230661413', None)
('ADDED', 'mycrd-20190318070654', '207313230', None)
('ADDED', 'mycrd-20190401101414', '216222726', None)
('ADDED', 'mycrd-20190320073041', '208884644', None)
('ADDED', 'mycrd-20190326165718', '212611027', None)
('ADDED', 'mycrd-20190329073007', '214304201', None)
('ADDED', 'mycrd-20190325095839', '211712843', None)
('ADDED', 'mycrd-20190411073018', '223394843', None)
^C

Please note the arbitrary order of the resource_versions. Depending on your luck and the current state of the cluster, the last line can hold either a recent enough resource or the oldest one.

Let's use the last seen resource_version 223394843 (the one on the last line, not the highest one) with a new watch object:

In [76]: w = kubernetes.watch.Watch()
In [79]: stream = w.stream(api_fn, 'example.com', 'v1', 'mycrds', resource_version='223394843')
In [80]: for ev in stream: print((ev['type'], ev['object'].get('metadata', {}).get('name'), ev['object'].get('metadata', {}).get('resourceVersion'), ev['object'] if ev['type'] == 'ERROR' else None))

('ERROR', None, None, {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'too old resource version: 223394843 (226210031)', 'reason': 'Gone', 'code': 410})
('ERROR', None, None, {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'too old resource version: 223394843 (226210031)', 'reason': 'Gone', 'code': 410})
('ERROR', None, None, {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'too old resource version: 223394843 (226210031)', 'reason': 'Gone', 'code': 410})
('ERROR', None, None, {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'too old resource version: 223394843 (226210031)', 'reason': 'Gone', 'code': 410})

……… repeated infinitely ………
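A consumer that wants to stop this loop has to recognise the error event itself. The sketch below is one possible way, assuming the event shape printed above; `parse_gone` is a hypothetical helper, and the message format it parses is taken only from the output in this issue, not from a documented contract.

```python
import re

# Pattern derived from the message seen above:
# "too old resource version: 223394843 (226210031)"
_TOO_OLD = re.compile(r"too old resource version: (\d+) \((\d+)\)")


def parse_gone(event):
    """Return (requested, suggested) for a 410 Gone ERROR event, else None."""
    if event.get("type") != "ERROR":
        return None
    status = event.get("object") or {}
    if status.get("code") != 410:
        return None
    match = _TOO_OLD.search(status.get("message", ""))
    if match is None:
        return None
    return match.group(1), match.group(2)


error_event = {
    "type": "ERROR",
    "object": {
        "kind": "Status",
        "apiVersion": "v1",
        "metadata": {},
        "status": "Failure",
        "message": "too old resource version: 223394843 (226210031)",
        "reason": "Gone",
        "code": 410,
    },
}

print(parse_gone(error_event))
```

On a non-None result the caller could break out of the stream and decide how to restart, rather than consuming the same error forever.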

Well, okay, let's try the resource_version suggested in the error message (226210031), which is at least known to the API:

In [83]: w = kubernetes.watch.Watch()
In [84]: stream = w.stream(api_fn, 'example.com', 'v1', 'mycrds', resource_version='226210031')
In [85]: for ev in stream: print((ev['type'], ev['object'].get('metadata', {}).get('name'), ev['object'].get('metadata', {}).get('resourceVersion'), ev['object'] if ev['type'] == 'ERROR' else None))

('ADDED', 'mycrd-expr1', '226370109', None)
('MODIFIED', 'mycrd-expr1', '226370111', None)
('MODIFIED', 'mycrd-expr1', '226370116', None)
('MODIFIED', 'mycrd-expr1', '226370127', None)
('MODIFIED', 'mycrd-expr1', '226370549', None)
('DELETED', 'mycrd-expr1', '226370553', None)
('ADDED', 'mycrd-20190417073005', '226917595', None)
('MODIFIED', 'mycrd-20190417073005', '226917597', None)
('MODIFIED', 'mycrd-20190417073005', '226917605', None)
('MODIFIED', 'mycrd-20190417073005', '226917614', None)
('MODIFIED', 'mycrd-20190417073005', '226917625', None)
('ADDED', 'mycrd-20190418073023', '227736612', None)
('MODIFIED', 'mycrd-20190418073023', '227736613', None)
('MODIFIED', 'mycrd-20190418073023', '227736618', None)
('MODIFIED', 'mycrd-20190418073023', '227736629', None)
('MODIFIED', 'mycrd-20190418073023', '227736631', None)
('ADDED', 'mycrd-20190419073010', '228579268', None)
('MODIFIED', 'mycrd-20190419073010', '228579269', None)
('MODIFIED', 'mycrd-20190419073010', '228579276', None)
('MODIFIED', 'mycrd-20190419073010', '228579286', None)
('MODIFIED', 'mycrd-20190419073010', '228579290', None)
('ADDED', 'mycrd-20190422061326', '230661394', None)
('MODIFIED', 'mycrd-20190422061326', '230661395', None)
('MODIFIED', 'mycrd-20190422061326', '230661399', None)
('MODIFIED', 'mycrd-20190422061326', '230661411', None)
('MODIFIED', 'mycrd-20190422061326', '230661413', None)
('ADDED', 'mycrd-20190423073032', '231459008', None)
('MODIFIED', 'mycrd-20190423073032', '231459009', None)
('MODIFIED', 'mycrd-20190423073032', '231459013', None)
('MODIFIED', 'mycrd-20190423073032', '231459025', None)
('MODIFIED', 'mycrd-20190423073032', '231459027', None)
('MODIFIED', 'mycrd-20190423073032', '232128498', None)
('MODIFIED', 'mycrd-20190423073032', '232128514', None)
('MODIFIED', 'mycrd-20190423073032', '232128518', None)
('ADDED', 'mycrd-20190424073015', '232198227', None)
('MODIFIED', 'mycrd-20190424073015', '232198228', None)
('MODIFIED', 'mycrd-20190424073015', '232198235', None)
('MODIFIED', 'mycrd-20190424073015', '232198247', None)
('MODIFIED', 'mycrd-20190424073015', '232198249', None)
('MODIFIED', 'mycrd-20190423073032', '232894049', None)
('MODIFIED', 'mycrd-20190423073032', '232894089', None)
('MODIFIED', 'mycrd-20190424073015', '232894093', None)
('MODIFIED', 'mycrd-20190423073032', '232894099', None)
('MODIFIED', 'mycrd-20190424073015', '232894119', None)
('MODIFIED', 'mycrd-20190424073015', '232894129', None)
('ADDED', 'mycrd-20190425073032', '232973618', None)
('MODIFIED', 'mycrd-20190425073032', '232973619', None)
('MODIFIED', 'mycrd-20190425073032', '232973624', None)
('MODIFIED', 'mycrd-20190425073032', '232973635', None)
('MODIFIED', 'mycrd-20190425073032', '232973638', None)
('MODIFIED', 'mycrd-20190314165455', '233190859', None)
('MODIFIED', 'mycrd-20190314165455', '233190861', None)
('MODIFIED', 'mycrd-20190314165455', '233254055', None)
('MODIFIED', 'mycrd-20190314165455', '233254057', None)
('MODIFIED', 'mycrd-20190314165455', '233262797', None)
('MODIFIED', 'mycrd-20190314165455', '233262799', None)
^C

All of this is dumped immediately; nothing happens in the cluster during these operations. All these changes are old, i.e. not expected, as they were already processed before the list...() call.

Please note that even the deleted, no longer existing resources are yielded ("mycrd-expr1").

Dilemma

See kubernetes-client/python-base#131 for a suggested implementation of the monotonically increasing resource_version as remembered by the watcher.

However, one of the unit-tests says about the resource version:

rv must be treated as an opaque value we cannot interpret it and order it so rely on k8s returning the events completely and in order

Kubernetes does not keep that promise and returns the events in arbitrary order.

Way A: If the client library starts interpreting the resource versions and remembering the maximum value seen, it can break its compatibility with Kubernetes.
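A rough sketch of what Way A amounts to is shown below. `MaxVersionTracker` is a hypothetical name, not part of the client library, and the integer comparison is exactly the interpretation of the resource version that the unit-test comment warns against.

```python
class MaxVersionTracker:
    """Remember the highest resource version seen so far.

    Assumption: versions are decimal integers in string form. This breaks
    the "opaque value" rule and is shown only to illustrate Way A.
    """

    def __init__(self):
        self.version = None

    def observe(self, version):
        """Update the remembered version if this one is numerically higher."""
        if self.version is None or int(version) > int(self.version):
            self.version = version
        return self.version


tracker = MaxVersionTracker()
# Versions copied from the listing in this issue, in their returned order.
for seen in ["233262799", "205552290", "232894129", "223394843"]:
    tracker.observe(seen)

print(tracker.version)
```

With this policy the order of the listing no longer matters, at the price of depending on a numeric format that Kubernetes does not guarantee.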

Way B: If the client library decides to treat the resource version as opaque and non-interpretable, it should also stop remembering it, as it leads to the re-yielding of the events from the past (long ago), as demonstrated above.

In the latter case, resource_version support should not be in Watch.stream() at all; only the users of the watcher-streamer should decide whether to interpret the resource version, and track it by their own rules (at their own risk).
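As one illustration of consumer-side tracking, the sketch below filters out events that the consumer has already handled. It compares versions only for equality, so it stays within the "opaque value" rule. `skip_already_handled` and the `handled` set are hypothetical names, the event shape is assumed to match the output above, and the set grows without bound in this simplified form.

```python
def skip_already_handled(events, handled):
    """Yield only events whose (name, resourceVersion) pair is new.

    `handled` is a caller-owned set that is updated in place. Namespaces
    are ignored here, which assumes cluster-scoped objects as in the example.
    """
    for event in events:
        metadata = (event.get("object") or {}).get("metadata") or {}
        key = (metadata.get("name"), metadata.get("resourceVersion"))
        if None in key:
            # E.g. ERROR events carry no name or version; pass them through.
            yield event
            continue
        if key in handled:
            continue
        handled.add(key)
        yield event


# The consumer already processed this state before the reconnect.
handled = {("mycrd-20190417073005", "226917625")}

replayed = [
    {"type": "MODIFIED", "object": {"metadata": {
        "name": "mycrd-20190417073005", "resourceVersion": "226917625"}}},
    {"type": "ADDED", "object": {"metadata": {
        "name": "mycrd-20190425073032", "resourceVersion": "232973618"}}},
]

fresh = list(skip_already_handled(replayed, handled))
print([(e["type"], e["object"]["metadata"]["name"]) for e in fresh])
```

This only suppresses exact repeats; intermediate versions that the consumer never saw would still be delivered, so it is a partial mitigation rather than a fix.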
