New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
identity: Make identity allocations observable #26373
identity: Make identity allocations observable #26373
Conversation
1a077f6
to
9b74b74
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I've left a few comments inline.
9b74b74
to
4a44720
Compare
@giorio94 thanks for your fast feedback 🚀 i applied your suggestions - PTAL! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments and concerns around type reuse and lifecycle questions. I'm clearly not yet qualified enough to review code that uses streams.
4a44720
to
f9eb988
Compare
@joestringer thanks for your feedback and questions. I included your feedback ( |
Added @dylandreimerink as reviewer to this PR to have an additional pair of 👀 focussing on the "Observable" aspects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minor nits remaining: mostly just documenting each go ...
call to declare why it's necessary and how we know it will complete.
f9eb988
to
f6c1a02
Compare
@joestringer @christarazi thanks for your input. i addressed your suggestions in the latest force push. |
/test |
pkg/identity/cache/allocator.go
Outdated
// Calling complete from a new go routine, the same way as it would happen if the IdentityAllocator would | ||
// be set. This is how it should be expected according to stream.Observable and prevents faulty assumptions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still confused, bear with me:
the same way as it would happen if the IdentityAllocator would be set
What is that way? Is there a function reference or other docs description that the reader can compare with? Given that complete()
is an opaque function, it's hard to understand what complete
does here and why that means it needs to be run in a goroutine. This comment assumes that the reader already knows this code very well, and then explains to them that "yes it works the way you expect". But for a fresh reader, they will not have the context, so then how will they build up the context to validate this assumption?
This is how it should be expected according to stream.Observable
Observable
interface says:
// Observe a stream of values as long as the given context is valid.
// 'next' is called for each item, and finally 'complete' is called
// when the stream is complete, or an error has occurred.
//
// Observable implementations are allowed to call 'next' and 'complete'
// from any goroutine, but never concurrently.
Nothing in the above tells me that complete
must be run in a new goroutine.
prevents faulty assumptions.
What are those faulty assumptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no specific contract with regards to calling complete
from the same or a different routine as Observe
. But looking at existing examples, we always call complete
from a goroutine spawned inside Observe
. Due to that behavior, we might that users of stream.Observables
which for some reason don't expect complete to be called before the call to Observe
returns. I assume those are the "prevents faulty assumptions." referred to.
The faulty assumption could lead to the following which would deadlock if complete
is called synchronously:
mu.Lock()
someStream.Observe(ctx, func(){...}, func(err error){
mu.Lock()
defer mu.Unlock()
})
mu.Unlock()
So by calling complete
from a goroutine we avoid deadlocks by uninformed users which can be called "resilience". But its also implementing an implicit contract, so perhaps if we want to allow this we should document it.
TL;DR calling complete
without a goroutine is valid as long as users of the API don't use it wrongly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dylandreimerink I see what you're saying, but it seems odd to me that a caller would grab a mutex outside the Observe and also in the complete function that they provide to Observe()
. I would assume that the functions that are called in Observe()
can be called from the current thread, so that pattern would just be unsafe. In fact, from the Observable
documentation, it sounds like Observable implementations could call it even from the same goroutine, which says that the locking example above is not respecting the Observe()
API. ie grabbing a mutex then calling Observe(...)
where the Observable
can choose the same goroutine to call complete()
could easily trigger a deadlock. Therefore we shouldn't assume that people will write such code, we'll catch it in review and reject it because it doesn't use the Observe()
function correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more bit of background: We should have reasons for goroutines, and avoid spawning goroutines if they're not necessary. They do take a little bit of resources, and they also add complexity due to (1) lifecycle management, ensuring they're completed and (2) by making things more asynchronous. I'd imagine there are good reasons for a lot of the existing complete()
goroutine cases today, but it makes me nervous to see "well run it in a goroutine to avoid bad assumptions", because we can still make other bad assumptions around the goroutines and then we have bad assumptions and additional async complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe... could you rephrase exactly what the concrete problem is with having complete()
run synchronously? I'm clearly still missing something here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stepping back through, I think the argument is that since CachingIdentityAllocator
-> Allocator
-> cache
-> newCache()
initializes the stream with a stream.Multicast[...]
, and since the implementation of stream.Multicast
runs complete()
from a goroutine, this code must also do the same. Broadly that seems like a consistency argument, which seems fine at face value. Based on the Observable()
interface API, it should be equally fine to just run the completion function directly without the goroutine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming that my latest comment is correct, my suggestion would be either:
(a) drop the goroutine since it's not needed, or
(b) document that assumption more explicitly:
// Calling complete from a new go routine, the same way as it would happen if the IdentityAllocator would | |
// be set. This is how it should be expected according to stream.Observable and prevents faulty assumptions. | |
// Calling complete from a new go routine, the same way as it would happen from stream.Multicast() in the Observe() function above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i came up with c) : removing the actual go-routine from the id allocation business logic and using the existing stream
API stream.Empty[allocator.AllocatorChange]
in case where m.IdentityAllocator
is nil
. I refactored the logic a little bit - so the only difference is the observable (empty or m.IdentityAllocator). IMO this way the code should be readable and understandable from a id allocation point of view - if someone is understanding the observable patterns (there are enough pointers in the go doc of package stream).
even though the called stream.Empty
is doing the same thing (immediately calling complete from a new go-routine), it's up to the stream
package to provide the necessary (and allowed) observable patterns as API and document them well (so not every business logic need to document this again and again).
It's really a good discussion and i think there are valid arguments to keep or remove the go-routine. It's just that i think the context to discuss this in this "id allocation business related" PR isn't the right one. It should be part of the stream
API and is relevant for all usages of it. -> can we discuss this in a follow up? will bring this up once @joamaki is back.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next update:
Trying to understand the initial reason to differentiate between whether m.IdentityAllocator
is initialized or not triggered me to dive into this a little bit more. I came to the conclusion that it's necessary to wrap the whole (m *CachingIdentityAllocator) Observe
in an additional short lived go-routine (:see_no_evil:) which waits until the m.IdentityAllocator
is initialized before starting to observe it.
The reason for this is that the global identity allocator is initialized asynchronously - and might not be properly intialized once starting to observe. This would result in the immediate completion. IMO in these situations its expected that the observation waits until the identity allocator is ready and initialized anyway - otherwise the depending functionality doesn't work as expected (e.g. auth gc job would stop listening for these events)
This comes with the "advantage" that we don't need to discuss whether complete
is called from a separate go-routine by itself. (because the whole functionality runs in its own go-routine)
I documented this additional short-lived go-routine! (same situation here - locking the mutex (necessary to ensure that the identitAllocator is really set) shouldn't be a problem - because the inner observer starts a go-routine too)
PTAL @joestringer @dylandreimerink 🙏
PS: yes at best we would start to properly modularize the global identity allocator and register proper lifecycle hooks for the initialization (including the backends) - so dependent components can expect a initialized component.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me overall!
f6c1a02
to
e6e26c0
Compare
@dylandreimerink thanks a lot for your review: i removed the |
e1cfa29
to
5afd28a
Compare
The latest force push changes that the global identity allocator needs to be initialized before starting to observe the identity allocator for changes. This is necessasry, because the initialization of the global identity allocator is performed asynchronously when the daemon gets started. |
5afd28a
to
941eccf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/test |
This makes the identity allocator changes observable in order to provide cells backend-agnostic access to identity allocations. The identity allocation changes are now provided in the hive as stream.Observable[cache.IdentityChange]. When observing the initial listing is first waited for, then the current state is replayed followed by sync event and updates to state: `[ (wait for OnListDone()), Upsert, Upsert, Sync, Upsert, Deletem, ... ]` Co-authored-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
941eccf
to
8bbafb6
Compare
rebased to |
/test |
This makes the identity allocator changes observable in order to provide cells backend-agnostic access to identity allocations.
The identity allocation changes are now provided in the hive as stream.Observable[cache.IdentityChange]. When observing the initial listing is first waited for, then the current state is replayed followed by sync event and updates to state:
[ (wait for OnListDone()), Upsert, Upsert, Sync, Upsert, Delete, ... ]
Related to #25898 as this lays the foundation to replace the use of
Resource[CiliumIdentity]
with the observable identity allocator. This comes with the advantage of getting rid of the duplicated K8s Watcher for CiliumIdentities and the additional support for KVStore identity backend. A follow up PR will refactor the authentication related part (garbage collection).Original PR by @joamaki: #26229