identity: Make identity allocations observable #26373

mhofstetter · 2023-06-20T11:28:01Z

This makes the identity allocator changes observable in order to provide cells backend-agnostic access to identity allocations.

The identity allocation changes are now provided in the hive as stream.Observable[cache.IdentityChange]. When observing, the initial listing is waited for first; then the current state is replayed, followed by a sync event and subsequent updates to the state:

[ (wait for OnListDone()), Upsert, Upsert, Sync, Upsert, Delete, ... ]
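A minimal sketch of how a consumer might fold this event sequence into local state. The `Kind`, `Change`, and `state` names are hypothetical stand-ins for illustration only; they are not the real cache.IdentityChange type from the PR.

```go
package main

import "fmt"

// Kind is a hypothetical stand-in for the event kinds listed above.
type Kind int

const (
	Upsert Kind = iota
	Sync
	Delete
)

// Change is a simplified, hypothetical event (not cache.IdentityChange).
type Change struct {
	Kind Kind
	ID   int
}

// state tracks the known identities and whether the initial replay is done.
type state struct {
	ids    map[int]struct{}
	synced bool
}

func newState() *state { return &state{ids: map[int]struct{}{}} }

// apply folds a single event into the state.
func (s *state) apply(c Change) {
	switch c.Kind {
	case Upsert:
		s.ids[c.ID] = struct{}{}
	case Delete:
		delete(s.ids, c.ID)
	case Sync:
		// Everything seen before Sync was a replay of the current state.
		s.synced = true
	}
}

func main() {
	s := newState()
	events := []Change{{Upsert, 1}, {Upsert, 2}, {Sync, 0}, {Upsert, 3}, {Delete, 1}}
	for _, e := range events {
		s.apply(e)
	}
	fmt.Println(len(s.ids), s.synced) // prints "2 true"
}
```

A consumer that must not act on partial state (for example, a garbage collector) could hold off until `synced` is true.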

Related to #25898, as this lays the foundation to replace the use of Resource[CiliumIdentity] with the observable identity allocator. This has the advantage of getting rid of the duplicated K8s watcher for CiliumIdentities and adds support for the KVStore identity backend. A follow-up PR will refactor the authentication-related part (garbage collection).

Original PR by @joamaki: #26229

giorio94

Nice! I've left a few comments inline.

pkg/allocator/cache.go

pkg/allocator/allocator_test.go

pkg/allocator/cache.go

mhofstetter · 2023-06-20T15:56:13Z

@giorio94 thanks for your fast feedback 🚀 i applied your suggestions - PTAL!

giorio94

/lgtm

joestringer

Some minor comments and concerns around type reuse and lifecycle questions. I'm clearly not yet qualified enough to review code that uses streams.

pkg/identity/cache/allocator.go

pkg/allocator/cache.go

mhofstetter · 2023-06-21T08:51:03Z

Some minor comments and concerns around type reuse and lifecycle questions. I'm clearly not yet qualified enough to review code that uses streams.

@joestringer thanks for your feedback and questions. I included your feedback (labels.Labels) in the latest force push and tried to answer your questions as well as possible 🙏

mhofstetter · 2023-06-21T18:22:05Z

Added @dylandreimerink as reviewer to this PR to have an additional pair of 👀 focussing on the "Observable" aspects.

joestringer

Just minor nits remaining: mostly just documenting each go ... call to declare why it's necessary and how we know it will complete.

pkg/identity/cache/allocator.go

pkg/allocator/cache.go

mhofstetter · 2023-06-22T06:25:55Z

@joestringer @christarazi thanks for your input. i addressed your suggestions in the latest force push.

mhofstetter · 2023-06-22T13:43:16Z

/test

joestringer · 2023-06-22T17:22:25Z

pkg/identity/cache/allocator.go

+		// Calling complete from a new go routine, the same way as it would happen if the IdentityAllocator would
+		// be set. This is how it should be expected according to stream.Observable and prevents faulty assumptions.


I'm still confused, bear with me:

the same way as it would happen if the IdentityAllocator would be set

What is that way? Is there a function reference or other docs description that the reader can compare with? Given that complete() is an opaque function, it's hard to understand what complete does here and why that means it needs to be run in a goroutine. This comment assumes that the reader already knows this code very well, and then explains to them that "yes it works the way you expect". But for a fresh reader, they will not have the context, so then how will they build up the context to validate this assumption?

This is how it should be expected according to stream.Observable

Observable interface says:

// Observe a stream of values as long as the given context is valid.
// 'next' is called for each item, and finally 'complete' is called
// when the stream is complete, or an error has occurred.
//
// Observable implementations are allowed to call 'next' and 'complete'
// from any goroutine, but never concurrently.

Nothing in the above tells me that complete must be run in a new goroutine.

prevents faulty assumptions.

What are those faulty assumptions?

There is no specific contract with regards to calling complete from the same or a different goroutine as Observe. But looking at existing examples, we always call complete from a goroutine spawned inside Observe. Due to that behavior, we might have users of stream.Observable who, for some reason, don't expect complete to be called before the call to Observe returns. I assume that is what "prevents faulty assumptions" refers to.

The faulty assumption could lead to the following which would deadlock if complete is called synchronously:

mu.Lock()
someStream.Observe(ctx, func(){...}, func(err error){
	mu.Lock()
	defer mu.Unlock()
})
mu.Unlock()

So by calling complete from a goroutine we avoid deadlocks caused by uninformed users, which can be called "resilience". But it's also implementing an implicit contract, so perhaps if we want to allow this we should document it.

TL;DR calling complete without a goroutine is valid as long as users of the API don't use it wrongly
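The difference can be sketched with two toy observables. `observeSync`, `observeAsync`, and `deadlocks` are hypothetical helpers, not part of the stream package. A timeout stands in for the hang, so the blocked goroutine in the synchronous case is deliberately leaked.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// observeSync is a hypothetical observable that calls complete on the
// caller's goroutine, before Observe returns.
func observeSync(complete func(error)) { complete(nil) }

// observeAsync is a hypothetical observable that calls complete from a
// new goroutine.
func observeAsync(complete func(error)) { go complete(nil) }

// deadlocks runs the "lock around Observe, lock again in complete" pattern
// and reports whether complete failed to finish within a short timeout.
func deadlocks(observe func(complete func(error))) bool {
	finished := make(chan struct{})
	go func() {
		var mu sync.Mutex
		mu.Lock()
		observe(func(error) {
			// sync.Mutex is not reentrant: if this runs on the goroutine
			// that already holds mu, it blocks forever.
			mu.Lock()
			defer mu.Unlock()
			close(finished)
		})
		mu.Unlock()
	}()
	select {
	case <-finished:
		return false
	case <-time.After(200 * time.Millisecond):
		return true
	}
}

func main() {
	fmt.Println("synchronous complete deadlocks:", deadlocks(observeSync))   // true
	fmt.Println("asynchronous complete deadlocks:", deadlocks(observeAsync)) // false
}
```

The asynchronous variant only succeeds because its goroutine can wait until the outer `mu.Unlock()` has run; the locking pattern itself still depends on behavior the Observable doc comment does not promise.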

@dylandreimerink I see what you're saying, but it seems odd to me that a caller would grab a mutex outside the Observe and also in the complete function that they provide to Observe(). I would assume that the functions that are called in Observe() can be called from the current thread, so that pattern would just be unsafe. In fact, from the Observable documentation, it sounds like Observable implementations could call it even from the same goroutine, which says that the locking example above is not respecting the Observe() API. ie grabbing a mutex then calling Observe(...) where the Observable can choose the same goroutine to call complete() could easily trigger a deadlock. Therefore we shouldn't assume that people will write such code, we'll catch it in review and reject it because it doesn't use the Observe() function correctly.

One more bit of background: We should have reasons for goroutines, and avoid spawning goroutines if they're not necessary. They do take a little bit of resources, and they also add complexity due to (1) lifecycle management, ensuring they're completed and (2) by making things more asynchronous. I'd imagine there are good reasons for a lot of the existing complete() goroutine cases today, but it makes me nervous to see "well run it in a goroutine to avoid bad assumptions", because we can still make other bad assumptions around the goroutines and then we have bad assumptions and additional async complexity.

Maybe... could you rephrase exactly what the concrete problem is with having complete() run synchronously? I'm clearly still missing something here.

Stepping back through, I think the argument is that since CachingIdentityAllocator -> Allocator -> cache -> newCache() initializes the stream with a stream.Multicast[...], and since the implementation of stream.Multicast runs complete() from a goroutine, this code must also do the same. Broadly that seems like a consistency argument, which seems fine at face value. Based on the Observable() interface API, it should be equally fine to just run the completion function directly without the goroutine.

Assuming that my latest comment is correct, my suggestion would be either:

(a) drop the goroutine since it's not needed, or
(b) document that assumption more explicitly:

Suggested change

-		// Calling complete from a new go routine, the same way as it would happen if the IdentityAllocator would
-		// be set. This is how it should be expected according to stream.Observable and prevents faulty assumptions.
+		// Calling complete from a new go routine, the same way as it would happen from stream.Multicast() in the Observe() function above.

@joestringer @dylandreimerink

I came up with (c): removing the actual go-routine from the ID allocation business logic and using the existing stream API stream.Empty[allocator.AllocatorChange] in the case where m.IdentityAllocator is nil. I refactored the logic a little bit, so the only difference is the observable (empty or m.IdentityAllocator). IMO this way the code should be readable and understandable from an ID allocation point of view, provided the reader understands the observable patterns (there are enough pointers in the Go doc of package stream).

Even though the called stream.Empty is doing the same thing (immediately calling complete from a new go-routine), it's up to the stream package to provide the necessary (and allowed) observable patterns as API and document them well (so not every piece of business logic needs to document this again and again).
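A rough sketch of option (c), assuming local stand-ins instead of the real stream package. `Observable`, `FuncObservable`, `empty`, `fromSlice`, and `allocator` are all hypothetical names defined here for illustration. `empty` only imitates the behavior attributed to stream.Empty in this thread: it emits nothing and completes from a new goroutine.

```go
package main

import (
	"context"
	"fmt"
)

// Observable is a local stand-in shaped like the quoted doc comment.
type Observable[T any] interface {
	Observe(ctx context.Context, next func(T), complete func(error))
}

// FuncObservable adapts a plain function to Observable.
type FuncObservable[T any] func(ctx context.Context, next func(T), complete func(error))

func (f FuncObservable[T]) Observe(ctx context.Context, next func(T), complete func(error)) {
	f(ctx, next, complete)
}

// empty emits nothing and completes immediately from a new goroutine.
func empty[T any]() Observable[T] {
	return FuncObservable[T](func(_ context.Context, _ func(T), complete func(error)) {
		go complete(nil)
	})
}

// fromSlice is a demo source that emits the items, then completes.
func fromSlice[T any](items []T) Observable[T] {
	return FuncObservable[T](func(_ context.Context, next func(T), complete func(error)) {
		go func() {
			for _, it := range items {
				next(it)
			}
			complete(nil)
		}()
	})
}

// allocator is hypothetical; a nil backend means "not set".
type allocator struct {
	backend Observable[string]
}

// Observe picks the source once; the rest is identical for both cases.
func (a *allocator) Observe(ctx context.Context, next func(string), complete func(error)) {
	src := empty[string]()
	if a.backend != nil {
		src = a.backend
	}
	src.Observe(ctx, next, complete)
}

// collect observes until completion and returns everything emitted.
func collect(a *allocator) []string {
	var got []string
	done := make(chan error, 1)
	a.Observe(context.Background(),
		func(s string) { got = append(got, s) },
		func(err error) { done <- err })
	<-done
	return got
}

func main() {
	fmt.Println(len(collect(&allocator{})))                                   // 0
	fmt.Println(collect(&allocator{backend: fromSlice([]string{"a", "b"})})) // [a b]
}
```

The business logic no longer spawns a goroutine itself; whether completion is asynchronous becomes a property of the chosen source.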

It's really a good discussion and i think there are valid arguments to keep or remove the go-routine. It's just that i think the context to discuss this in this "id allocation business related" PR isn't the right one. It should be part of the stream API and is relevant for all usages of it. -> can we discuss this in a follow up? will bring this up once @joamaki is back.

WDYT?

next update:

Trying to understand the initial reason to differentiate between whether m.IdentityAllocator is initialized or not triggered me to dive into this a little bit more. I came to the conclusion that it's necessary to wrap the whole (m *CachingIdentityAllocator) Observe in an additional short-lived go-routine (🙈) which waits until the m.IdentityAllocator is initialized before starting to observe it.

The reason for this is that the global identity allocator is initialized asynchronously and might not be properly initialized when observation starts. This would result in immediate completion. IMO in these situations it's expected that the observation waits until the identity allocator is ready and initialized anyway; otherwise the dependent functionality doesn't work as expected (e.g. the auth GC job would stop listening for these events).

This comes with the "advantage" that we don't need to discuss whether complete is called from a separate go-routine by itself. (because the whole functionality runs in its own go-routine)

I documented this additional short-lived go-routine! (Same situation here: locking the mutex (necessary to ensure that m.IdentityAllocator is really set) shouldn't be a problem, because the inner observer starts a go-routine too.)
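One way such a wrapper could look, as a sketch under stated assumptions. `lazyAllocator`, `source`, `initialize`, and the `ready` channel are hypothetical; the thread does not say how the PR signals that initialization has finished.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// source is a hypothetical stand-in for the inner observable.
type source func(ctx context.Context, next func(string), complete func(error))

// lazyAllocator is hypothetical; its backend is set asynchronously.
type lazyAllocator struct {
	mu      sync.Mutex
	backend source
	ready   chan struct{} // assumed signal: closed once backend is set
}

func newLazyAllocator() *lazyAllocator {
	return &lazyAllocator{ready: make(chan struct{})}
}

// initialize sets the backend; it must be called at most once.
func (a *lazyAllocator) initialize(s source) {
	a.mu.Lock()
	a.backend = s
	a.mu.Unlock()
	close(a.ready)
}

// Observe returns immediately. A short-lived goroutine waits for
// initialization (or cancellation) and then hands over to the backend.
func (a *lazyAllocator) Observe(ctx context.Context, next func(string), complete func(error)) {
	go func() {
		select {
		case <-a.ready:
		case <-ctx.Done():
			complete(ctx.Err())
			return
		}
		a.mu.Lock()
		backend := a.backend
		a.mu.Unlock()
		backend(ctx, next, complete)
	}()
}

// demo starts observing before the allocator is initialized.
func demo() []string {
	a := newLazyAllocator()
	var got []string
	done := make(chan error, 1)
	a.Observe(context.Background(),
		func(s string) { got = append(got, s) },
		func(err error) { done <- err })
	a.initialize(func(_ context.Context, next func(string), complete func(error)) {
		go func() {
			next("identity-1")
			complete(nil)
		}()
	})
	<-done
	return got
}

func main() {
	fmt.Println(demo()) // [identity-1]
}
```

The waiting goroutine exits as soon as the backend takes over or the context is cancelled, which keeps its lifecycle easy to reason about.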

PTAL @joestringer @dylandreimerink 🙏

PS: yes, at best we would start to properly modularize the global identity allocator and register proper lifecycle hooks for the initialization (including the backends), so dependent components can expect an initialized component.

pkg/allocator/cache.go

dylandreimerink

Looks good to me overall!

pkg/identity/cache/allocator.go

mhofstetter · 2023-06-26T07:40:41Z

@dylandreimerink thanks a lot for your review: i removed the TODO, refactored towards using stream.Map (🙏) and stream.Empty (discovered after reading through the API of stream)

mhofstetter · 2023-06-26T10:34:15Z

The latest force push changes the behavior so that the global identity allocator needs to be initialized before the identity allocator is observed for changes.

This is necessary because the initialization of the global identity allocator is performed asynchronously when the daemon is started.

jrajahalme

LGTM

joestringer · 2023-06-26T22:34:45Z

/test

This makes the identity allocator changes observable in order to provide cells backend-agnostic access to identity allocations.

The identity allocation changes are now provided in the hive as stream.Observable[cache.IdentityChange]. When observing the initial listing is first waited for, then the current state is replayed followed by sync event and updates to state:

`[ (wait for OnListDone()), Upsert, Upsert, Sync, Upsert, Delete, ... ]`

Co-authored-by: Jussi Maki <jussi@isovalent.com>
Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>

mhofstetter · 2023-06-27T06:56:54Z

rebased to main (failed tests related to #26496)

mhofstetter · 2023-06-27T07:28:19Z

/test

mhofstetter requested a review from jrajahalme June 20, 2023 11:28

mhofstetter requested review from a team as code owners June 20, 2023 11:28

mhofstetter requested review from christarazi and giorio94 June 20, 2023 11:28

mhofstetter mentioned this pull request Jun 20, 2023

[DRAFT] Observable identity allocations #26229

Closed

mhofstetter force-pushed the pr/mhofstetter/observable-allocator branch from 1a077f6 to 9b74b74 Compare June 20, 2023 11:32

This was referenced Jun 20, 2023

auth: Switch to observing identity changes #26375

Merged

Auth: Support garbage collection for KVStore Identity Backend #25898

Closed

giorio94 reviewed Jun 20, 2023

mhofstetter force-pushed the pr/mhofstetter/observable-allocator branch from 9b74b74 to 4a44720 Compare June 20, 2023 15:38

mhofstetter requested a review from giorio94 June 20, 2023 15:56

giorio94 approved these changes Jun 20, 2023

joestringer reviewed Jun 20, 2023

pkg/identity/cache/allocator.go (outdated)

pkg/identity/cache/allocator.go (outdated)

pkg/allocator/cache.go

pkg/allocator/cache.go

mhofstetter force-pushed the pr/mhofstetter/observable-allocator branch from 4a44720 to f9eb988 Compare June 21, 2023 08:43

mhofstetter requested review from joestringer and dylandreimerink June 21, 2023 09:03

joestringer approved these changes Jun 21, 2023

christarazi reviewed Jun 21, 2023

pkg/identity/cache/allocator.go (outdated)

pkg/identity/cache/allocator.go (outdated)

pkg/allocator/cache.go

mhofstetter force-pushed the pr/mhofstetter/observable-allocator branch from f9eb988 to f6c1a02 Compare June 22, 2023 06:21

joestringer reviewed Jun 22, 2023

dylandreimerink approved these changes Jun 23, 2023

pkg/identity/cache/allocator.go (outdated)

mhofstetter force-pushed the pr/mhofstetter/observable-allocator branch from f6c1a02 to e6e26c0 Compare June 26, 2023 07:22

mhofstetter force-pushed the pr/mhofstetter/observable-allocator branch 2 times, most recently from e1cfa29 to 5afd28a Compare June 26, 2023 10:30

mhofstetter force-pushed the pr/mhofstetter/observable-allocator branch from 5afd28a to 941eccf Compare June 26, 2023 10:58

mhofstetter requested a review from christarazi June 26, 2023 12:15

jrajahalme approved these changes Jun 26, 2023

brb mentioned this pull request Jun 27, 2023

Revert "Revert agent/helm: Deprecate --kpr=partial|strict|disabled and use --kpr=true|false instead" #26496

Merged

mhofstetter force-pushed the pr/mhofstetter/observable-allocator branch from 941eccf to 8bbafb6 Compare June 27, 2023 06:54

christarazi approved these changes Jun 27, 2023

maintainer-s-little-helper bot added the ready-to-merge label Jun 27, 2023

borkmann merged commit 591ee68 into cilium:main Jun 27, 2023
65 checks passed

mhofstetter deleted the pr/mhofstetter/observable-allocator branch June 27, 2023 09:49

joestringer mentioned this pull request Jun 28, 2023

Prepare for release v1.14.0-rc.0 #26544

Merged
