Topology Provider permissions briefly dropped during reconciliation when the reflector watch resets #774

@NickLarsenNZ

Affected Stackable version

Any up to and including SDP 26.3.0

Affected Apache HDFS version

N/A

Current and expected behavior

In the HDFS operator (and perhaps any operator based on kube-rs), when the Reflector watch resets, the Store has to be rebuilt.
Reconciliations that run before the Store is fully consistent can lead to service accounts being dropped from (Cluster)RoleBindings. This leads to the Topology Provider not being able to determine the topology (or possibly building an incorrect topology?).

The expected behaviour is that the above doesn't happen 😅.

Possible solution

We can requeue reconciliations (at least for some operations) until the store is fully consistent.

For example:

  • On error: log error and return early
  • watcher::Event::Init -> the store is empty, waiting for InitApply events, requeue/return early.
  • watcher::Event::InitApply -> store is partially populated, requeue/return early until InitDone.
  • watcher::Event::InitDone -> store is populated, continue with reconcile
  • watcher::Event::Apply -> store is populated, continue with reconcile
  • watcher::Event::Delete -> store is populated, continue with reconcile
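
The event handling above could be sketched roughly as follows. This is only an illustration under assumptions, not the operator's actual code: `Event` is a local stand-in that mirrors the variant names of kube-rs's `watcher::Event` so the snippet compiles without the kube crate, and `StoreReadiness` and `Decision` are hypothetical names introduced here.

```rust
// Local stand-in mirroring the variant names of `kube::runtime::watcher::Event`
// (assumption: used only so this sketch compiles on its own).
#[derive(Debug)]
enum Event<K> {
    Init,
    InitApply(K),
    InitDone,
    Apply(K),
    Delete(K),
}

// Hypothetical result of looking at one watch event.
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    /// The store may be incomplete: requeue / return early.
    Requeue,
    /// The store is populated: continue with the reconcile.
    Reconcile,
}

// Hypothetical tracker for whether the store has finished (re)initialising.
#[derive(Debug, Default)]
struct StoreReadiness {
    ready: bool,
}

impl StoreReadiness {
    fn observe<K>(&mut self, event: &Result<Event<K>, String>) -> Decision {
        match event {
            // On error: log and return early; readiness is left unchanged.
            Err(err) => {
                eprintln!("watch error: {err}");
                Decision::Requeue
            }
            // The watch (re)started: the store is empty or only partially populated.
            Ok(Event::Init) | Ok(Event::InitApply(_)) => {
                self.ready = false;
                Decision::Requeue
            }
            // The initial listing is complete: the store is populated.
            Ok(Event::InitDone) => {
                self.ready = true;
                Decision::Reconcile
            }
            // Incremental changes. The readiness check is a defensive assumption:
            // these should only arrive after InitDone.
            Ok(Event::Apply(_)) | Ok(Event::Delete(_)) => {
                if self.ready {
                    Decision::Reconcile
                } else {
                    Decision::Requeue
                }
            }
        }
    }
}
```

In this sketch a fresh `Init` clears the ready flag, so anything seen during a watch reset is requeued until the next `InitDone`.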

Caution

I haven't checked whether we can and should requeue directly, or instead return an error that bubbles up as a Result to the error_policy handler, which logs and requeues.
Regardless, we need to make sure it is eventually reconciled and not just ignored.
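
To make the two options concrete, here is a minimal sketch, not a proposal for the final implementation. `Action` is a simplified stand-in for the controller action type in kube-rs, and `Error::StoreNotReady`, the `store_ready` flag and the 5 second delay are assumptions made up for illustration.

```rust
use std::time::Duration;

// Simplified stand-in for the controller action type in kube-rs
// (assumption: the real type is not an enum like this).
#[derive(Debug, PartialEq, Eq)]
enum Action {
    Requeue(Duration),
    AwaitChange,
}

// Hypothetical error for "the store is not fully consistent yet".
#[derive(Debug)]
enum Error {
    StoreNotReady,
}

// Arbitrary delay chosen for this sketch.
const RETRY_DELAY: Duration = Duration::from_secs(5);

// Option A: requeue directly from the reconciler.
fn reconcile_requeue(store_ready: bool) -> Result<Action, Error> {
    if !store_ready {
        return Ok(Action::Requeue(RETRY_DELAY));
    }
    // ... the normal reconcile would run here ...
    Ok(Action::AwaitChange)
}

// Option B: return an error and let the error policy log and requeue.
fn reconcile_error(store_ready: bool) -> Result<Action, Error> {
    if !store_ready {
        return Err(Error::StoreNotReady);
    }
    // ... the normal reconcile would run here ...
    Ok(Action::AwaitChange)
}

// The error policy logs the error and schedules a retry, so the object
// is not silently ignored.
fn error_policy(err: &Error) -> Action {
    eprintln!("reconcile failed: {err:?}");
    Action::Requeue(RETRY_DELAY)
}
```

Both paths end in a requeue, so the object is reconciled again later. They differ mainly in whether a not-yet-ready store shows up as an error in the logs.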

Additional context

My understanding of the problem/solution should be double checked with someone else.

Tip

This might only happen when the topology provider is used, but it also seems like something that could affect other SDP products whose components interact with the Kubernetes API.

Environment

No response

Would you like to work on fixing this bug?

yes
