
[YUNIKORN-2625] Refactor Clients to avoid hard-code checks #844

Open · wants to merge 3 commits into master
Conversation

arhtur007

What is this PR for?

Use a for loop instead of listing all informer checks manually.

What type of PR is it?

  • Bug Fix
  • Improvement
  • Feature
  • Documentation
  • Hot Fix
  • Refactoring

Todos

N/A

What is the Jira issue?

YUNIKORN-2625

How should this be tested?

make test

Screenshots (if appropriate)

[screenshot]

Questions:

N/A

@chia7712 self-requested a review May 20, 2024 06:06

codecov bot commented May 20, 2024

Codecov Report

Attention: Patch coverage is 82.60870% with 16 lines in your changes missing coverage. Please review.

Project coverage is 67.78%. Comparing base (8e680af) to head (3ad1a0f).
Report is 1 commit behind head on master.

| Files | Patch % | Lines |
| --- | --- | --- |
| pkg/client/clients.go | 92.06% | 4 Missing and 1 partial ⚠️ |
| pkg/client/apifactory.go | 0.00% | 4 Missing ⚠️ |
| pkg/client/kubeclient.go | 0.00% | 4 Missing ⚠️ |
| pkg/client/apifactory_mock.go | 0.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #844      +/-   ##
==========================================
+ Coverage   67.17%   67.78%   +0.60%     
==========================================
  Files          70       69       -1     
  Lines        7631     7629       -2     
==========================================
+ Hits         5126     5171      +45     
+ Misses       2287     2238      -49     
- Partials      218      220       +2     


"k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding"

"github.com/apache/yunikorn-k8shim/pkg/conf"
"github.com/apache/yunikorn-k8shim/pkg/log"
"github.com/apache/yunikorn-scheduler-interface/lib/go/api"
)

type informer interface {
Contributor commented:

Maybe we just add a new method to Clients to return all informers. For example:

func (c *Clients) informers() []cache.SharedIndexInformer

The actual implementation would collect the informers from all resource informers. WDYT?
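
A minimal sketch of what that suggestion might look like, assuming Clients keeps typed informer fields such as PodInformer and NodeInformer (the field names here are illustrative, not taken from this PR):

func (c *Clients) informers() []cache.SharedIndexInformer {
	// Collect the shared informer behind each typed resource informer.
	return []cache.SharedIndexInformer{
		c.PodInformer.Informer(),
		c.NodeInformer.Informer(),
		c.ConfigMapInformer.Informer(),
		// ... one entry per resource informer held by Clients
	}
}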

Author commented:

Is it better to call Informer() to get the cache.SharedIndexInformer every time we need it?

Author commented:

I agree with using a slice instead of a map, since the informers are set once and remain unchanged.

Contributor commented:

Is it better to call Informer() to get the cache.SharedIndexInformer every time we need it?

It seems both WaitForSync and Run are not hot methods, right? If so, creating the new collection should not cause a performance regression, I think.

Author commented:

I came across this comment in the source code:

// Always prefer using an informer factory to get a shared informer instead of getting an independent
// one. This reduces memory footprint and number of connections to the server.

There doesn't appear to be any issue with it currently.
But I think we should avoid any potential hidden bugs in the codebase.

Author commented:

I got it

Contributor commented:

I think this is getting a little out of hand. This is initialization code to handle common Kubernetes objects. We don't want to expose slices, or other internal implementation details. This code is also not hot, as it gets called once during execution. We should favor simplicity here. I think this new implementation is overly complex.

Author @arhtur007 commented on May 20, 2024:

I've updated the PR based on @chia7712's comment.

Contributor @wilfred-s commented on May 22, 2024:

I looked at using something like slices for storing all the different informers. Due to the differences between the listers of the informers, that becomes a slice of interface{}. That then triggers type casting and all kinds of other handling and checks, which is worse than having the separate variables and maintaining the two sets. Just storing the Informer() and not the Lister() for the informers does not make sense to me. The lister is normally accessed outside of startup; the informer is never accessed again as it just runs...

Any informer defined should be started and stopped. To not regress, we could have a simple unit test that checks that all informers defined in the type are also returned in the getInformers() call. If someone adds a new informer, we can then ensure that it is updated in both places.
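
A rough sketch of such a guard test, assuming Clients exposes getInformers() and keeps one typed field per resource informer; the reflection-based counting and the newMockedClients helper are illustrative assumptions, not code from this PR:

import (
	"reflect"
	"testing"

	"k8s.io/client-go/tools/cache"
)

func TestGetInformersCoversAllFields(t *testing.T) {
	c := newMockedClients(t) // hypothetical helper returning a fully wired *Clients

	// Any struct field whose type has an Informer() method counts as a resource informer.
	hasInformerType := reflect.TypeOf((*interface {
		Informer() cache.SharedIndexInformer
	})(nil)).Elem()

	want := 0
	ct := reflect.TypeOf(c).Elem()
	for i := 0; i < ct.NumField(); i++ {
		if ct.Field(i).Type.Implements(hasInformerType) {
			want++
		}
	}

	if got := len(c.getInformers()); got != want {
		t.Fatalf("getInformers() returned %d informers, but Clients defines %d informer fields", got, want)
	}
}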

Contributor commented:

Any informer defined should be started and stopped. To not regress we could have a simple unit test that checks that all informers defined in the type are also returned in the getInformers() call. If someone adds a new informer we can then ensure that they are updated in both places

Agreed that "started" and "stopped" are the most important here. It means Clients must have all "created" informers. Maybe we can add a small helper to collect all created informers into a slice, and then pass the slice to Clients. For example:

type hasInformer interface {
	Informer() cache.SharedIndexInformer
}

func save[T hasInformer](n T, collector *[]cache.SharedIndexInformer) T {
	*collector = append(*collector, n.Informer())
	return n
}

We use a bit of generics here to avoid casting, and all created informers should be wrapped by save. For example:

	var allInformers []cache.SharedIndexInformer

	nodeInformer := save(informerFactory.Core().V1().Nodes(), &allInformers)
	podInformer := save(informerFactory.Core().V1().Pods(), &allInformers)

Finally, allInformers holds all created informers, and Clients can use it to start/stop all of them.
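
A sketch of how Clients could then drive that slice; the field and method shapes below are illustrative, not the shim's actual API:

type Clients struct {
	// ... typed informers and listers stay as individual fields ...
	informers []cache.SharedIndexInformer // filled by save() during construction
}

// Run starts every collected informer; closing stopCh stops them all again.
func (c *Clients) Run(stopCh <-chan struct{}) {
	for _, inf := range c.informers {
		go inf.Run(stopCh)
	}
}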

@wilfred-s WDYT?

log.Log(log.ShimClient).Info("Waiting for informers to sync",
zap.Duration("timeElapsed", time.Since(syncStartTime).Round(time.Second)))

LOOP:
Contributor commented:

All we want to do is make sure they all get synced, right? Hence, we can check the informers one by one.

	for _, informer := range c.getInformers() {
		for !informer.HasSynced() {
			time.Sleep(time.Second)
			counter++
			if counter%10 == 0 {
				log.Log(log.ShimClient).Info("Waiting for informers to sync",
					zap.Duration("timeElapsed", time.Since(syncStartTime).Round(time.Second)))
			}
		}
	}

Also, that will create a slice only once.

@@ -67,38 +68,38 @@ func (c *Clients) GetConf() *conf.SchedulerConf {
return c.conf
}

func (c *Clients) getInformers() []cache.SharedIndexInformer {
return []cache.SharedIndexInformer{
Contributor commented:

This list should be kept in sync with the informerTypes, or the other way around. That was the issue that triggered YUNIKORN-2621.
Not sure how we do that without manual work. The informers side is simple; the listers are different for every type, which makes it tricky.

Contributor commented:

This list should be in sync with the informerTypes or the other way around.
That was the issue that triggered YUNIKORN-2621
Not sure how we do that without manual work. The informers side is simple, the listers are different for every type which makes it tricky.

Agreed that it is tricky. Maybe we can have an inner interface HasInformer to expose the informer, and then cast it to the concrete struct to create Clients. For example:

func (t Type) HasInformer(informerFactory informers.SharedInformerFactory) HasInformer {
	switch t {
	case PVInformerHandlers:
		return informerFactory.Core().V1().PersistentVolumes()
	case PodInformerHandlers:
		return informerFactory.Core().V1().Pods()
	}
	return nil
}

func NewAPIFactory(scheduler api.SchedulerAPI, informerFactory informers.SharedInformerFactory, configs *conf.SchedulerConf, testMode bool) *APIFactory {
	kubeClient := NewKubeClient(configs.KubeConfig)

	all := make(map[Type]HasInformer)
	for s := range informerTypes {
		all[Type(s)] = Type(s).HasInformer(informerFactory)
	}

	// init informers
	// volume informers are also used to get the Listers for the predicates
	nodeInformer := all[NodeInformerHandlers].(v1.NodeInformer)
	podInformer := all[PodInformerHandlers].(v1.PodInformer)
}

With this change, we have to add a new Type constant when adding a new informer in the future, as all structs come from Type(s).HasInformer. Also, the map all can be passed to Clients to be used as "all informers". BTW, we can enable the exhaustive linter to avoid missing switch statements.

In short, we can consider adding the following changes:

  1. Enable the exhaustive check - avoid missing switch statements.
  2. Type can return a new inner interface HasInformer - bind the informer to Type.
  3. Cast HasInformer to the concrete struct instead of accessing SharedInformerFactory - make sure all used informers come from Type.

Author @arhtur007 commented on May 21, 2024:

Manually initializing each informer is more intuitive. However, in order to ensure the list stays synchronized, @chia7712's method is better.

Another nit suggestion: we should move all informer-related code to clients.go, since all the informers live under the Clients struct. Additionally, interfaces.go should be combined with kubeclient.go.

Author @arhtur007 commented:

Hello @wilfred-s, @chia7712,

I've updated the PR and added a test for checking the informers.
Please take a look at it when you're free, thank you.

Contributor @pbacsko left a comment:

I don't want to approve this just yet. To me, the more compact code is actually more difficult to comprehend.

If others are OK with it, I'm fine too, but I suggest dropping the save() + hasInformers combo.

Comment on lines +98 to +110
hasInformers := []hasInformer{}

podInformer := save(informerFactory.Core().V1().Pods(), &hasInformers)
nodeInformer := save(informerFactory.Core().V1().Nodes(), &hasInformers)
configMapInformer := save(informerFactory.Core().V1().ConfigMaps(), &hasInformers)
storageInformer := save(informerFactory.Storage().V1().StorageClasses(), &hasInformers)
pvInformer := save(informerFactory.Core().V1().PersistentVolumes(), &hasInformers)
pvcInformer := save(informerFactory.Core().V1().PersistentVolumeClaims(), &hasInformers)
priorityClassInformer := save(informerFactory.Scheduling().V1().PriorityClasses(), &hasInformers)
namespaceInformer := save(informerFactory.Core().V1().Namespaces(), &hasInformers)
csiNodeInformer := save(informerFactory.Storage().V1().CSINodes(), &hasInformers)
csiDriversInformer := save(informerFactory.Storage().V1().CSIDrivers(), &hasInformers)
csiStorageCapacityInformer := save(informerFactory.Storage().V1().CSIStorageCapacities(), &hasInformers)
Contributor commented:

Honestly, I don't know. We do save some lines by introducing this intermediate type; at the same time, readability is not the best. In fact, I had to download and apply the patch locally to understand what's happening here.
I'm not a fan of this.

For example, even though this is more LOC, it's just much simpler:

	var sharedInformers []cache.SharedIndexInformer
	
	podInformer := informerFactory.Core().V1().Pods()
	sharedInformers = append(sharedInformers, podInformer.Informer())
	nodeInformer := informerFactory.Core().V1().Nodes()
	sharedInformers = append(sharedInformers, nodeInformer.Informer())
...

Then sharedInformers can be passed as a slice to Clients after it's done. This way, we also don't need to copy the slice every time we call getInformers(), which is another thing I don't really like.
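
Following that idea, the slice could be stored on Clients once and returned as-is, so getInformers() does not rebuild it on every call (a sketch with an illustrative field name):

// getInformers returns the slice built once during construction;
// no per-call allocation or copying.
func (c *Clients) getInformers() []cache.SharedIndexInformer {
	return c.sharedInformers // hypothetical field populated in NewAPIFactory
}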

Contributor commented:

For example, even though this is more LOC, it's just much simpler:

Yep, that is a good idea. However, the previous discussion (#844 (comment)) seems to want a style that avoids ignoring the SharedIndexInformer when creating an XXXInformer. Otherwise, this simple way is good to me.

Let me summarize the requests:

  1. We must save the SharedIndexInformer from every used XXXInformer (#844 (comment)).
  2. The informerTypes should be kept in sync with all used XXXInformer (#844 (comment)).

I agree with what @wilfred-s said: "Not sure how we do that without manual work". Maybe we should let this PR go and focus on manually adding the missing CSINodeInformer and NamespaceInformer to informerTypes.

@@ -64,7 +44,7 @@ type APIProvider interface {
// resource handlers defines add/update/delete operations in response to the corresponding resources updates.
// The associated the type field points the handler functions to the correct receiver.
type ResourceEventHandlers struct {
Type
InformerType
Contributor commented:

Was this changed to avoid confusion with type?
