Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use csiNode informer to prevent hitting the API server for all time #327

Conversation

@mucahitkurt
Copy link
Contributor

mucahitkurt commented Aug 12, 2019

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
Use csiNode shared index informer instead of hitting the API server for all csiNode resource request.

Which issue(s) this PR fixes:
Fixes #144

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

cc @msau42

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Aug 12, 2019

Welcome @mucahitkurt!

It looks like this is your first PR to kubernetes-csi/external-provisioner 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-csi/external-provisioner has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Aug 12, 2019

Hi @mucahitkurt. Thanks for your PR.

I'm waiting for a kubernetes-csi or kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L label Aug 12, 2019
@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from 727cb01 to ae4316b Aug 12, 2019
@msau42

This comment has been minimized.

Copy link
Collaborator

msau42 commented Aug 12, 2019

/ok-to-test
/assign @verult

@msau42 msau42 added this to In progress in Topology: GA Aug 12, 2019
@msau42

This comment has been minimized.

Copy link
Collaborator

msau42 commented Aug 14, 2019

/retest

Copy link
Contributor

verult left a comment

Thanks for the PR!

Was the Gopkg.lock change intentional? There isn't any Gopkg.toml changes.

Did you get a chance to test the provisioner end to end manually?

// TODO (#144): use informers
selectedNodeInfo, err := kubeClient.StorageV1beta1().CSINodes().Get(selectedNode.Name, metav1.GetOptions{})
if err != nil {
key := fmt.Sprintf("%s/%s", selectedNode.Namespace, selectedNode.Name)

This comment has been minimized.

Copy link
@verult

verult Aug 14, 2019

Contributor

Node objects are non-namespaced. Is this the right key format?

This comment has been minimized.

Copy link
@mucahitkurt

mucahitkurt Aug 15, 2019

Author Contributor

No it's not right, it should be only name


rand.Shuffle(len(nodeInfos.Items), func(i, j int) {
nodeInfos.Items[i], nodeInfos.Items[j] = nodeInfos.Items[j], nodeInfos.Items[i]
nodeInfos := informer.GetIndexer().List()

This comment has been minimized.

Copy link
@verult

verult Aug 14, 2019

Contributor

nit: rename all occurrences of nodeInfo to csiNode

I know you didn't make the name but it'd be super helpful if you could change it while you are at it, but totally optional

preBetaNodeVersion = "1.13.0"
testDriverName = "com.example.csi/test-driver"
preBetaNodeVersion = "1.13.0"
interval = 100 * time.Millisecond

This comment has been minimized.

Copy link
@verult

verult Aug 14, 2019

Contributor

Use a more descriptive name - what interval is it?

This comment has been minimized.

Copy link
@mucahitkurt

mucahitkurt Aug 15, 2019

Author Contributor

I'll delete this after HasSynced method usage

Labels: l,
Name: fmt.Sprintf("node-%d", i),
Labels: l,
Namespace: "testnamespace",

This comment has been minimized.

Copy link
@verult

verult Aug 14, 2019

Contributor

Node objects shouldn't have namespaces, so this isn't realistic

@@ -1528,3 +1550,20 @@ func requisiteEqual(t1, t2 []*csi.Topology) bool {

return unchecked.Len() == 0
}

func runCsiNodeInformer(kubeClient *fakeclientset.Clientset, t *testing.T, csiNodeNum int) (cache.SharedIndexInformer, chan struct{}) {

This comment has been minimized.

Copy link
@verult

verult Aug 14, 2019

Contributor

nit: csiNodeNum -> csiNodeCount . "Num" could be misinterpreted as index, for example

This comment has been minimized.

Copy link
@mucahitkurt

mucahitkurt Aug 15, 2019

Author Contributor

I'll delete it after HasSynced method usage

@@ -179,9 +181,27 @@ func main() {
provisionerOptions = append(provisionerOptions, controller.AdditionalProvisionerNames([]string{supportsMigrationFromInTreePluginName}))
}

// Create informer to prevent hit the API server for all resource request
csiNodeInformer := v1beta1.NewCSINodeInformer(clientset, 1*time.Hour, cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc})

This comment has been minimized.

Copy link
@verult

verult Aug 14, 2019

Contributor

Pull 1 * time.Hour into a named const


func waitToObserveCsiNodes(t *testing.T, csiNodeInformer cache.SharedIndexInformer, csiNodeNum int) {
if err := wait.PollImmediate(interval, informerWaitTimeout, func() (bool, error) {
objects := csiNodeInformer.GetIndexer().List()

This comment has been minimized.

Copy link
@verult

verult Aug 14, 2019

Contributor

Does this just make sure the informer is synced? Could you use informer.HasSynced() instead?

@@ -252,6 +255,7 @@ func NewCSIProvisioner(client kubernetes.Interface,
controllerCapabilities: controllerCapabilities,
supportsMigrationFromInTreePluginName: supportsMigrationFromInTreePluginName,
strictTopology: strictTopology,
informer: informer,

This comment has been minimized.

Copy link
@verult

verult Aug 14, 2019

Contributor

Need to wait for the informer to sync.

IIRC other CSI controllers have examples of using informers and they should have sync wait logic

@@ -179,9 +181,27 @@ func main() {
provisionerOptions = append(provisionerOptions, controller.AdditionalProvisionerNames([]string{supportsMigrationFromInTreePluginName}))
}

// Create informer to prevent hit the API server for all resource request
csiNodeInformer := v1beta1.NewCSINodeInformer(clientset, 1*time.Hour, cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc})
go csiNodeInformer.Run(context.TODO().Done())

This comment has been minimized.

Copy link
@verult

verult Aug 14, 2019

Contributor

IMO context.Background() is more appropriate

@@ -179,9 +181,27 @@ func main() {
provisionerOptions = append(provisionerOptions, controller.AdditionalProvisionerNames([]string{supportsMigrationFromInTreePluginName}))
}

// Create informer to prevent hit the API server for all resource request
csiNodeInformer := v1beta1.NewCSINodeInformer(clientset, 1*time.Hour, cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc})

This comment has been minimized.

Copy link
@verult

verult Aug 14, 2019

Contributor

Change 1*time.Hour into a named const. Ditto for all the other time constants

@mucahitkurt

This comment has been minimized.

Copy link
Contributor Author

mucahitkurt commented Aug 15, 2019

Thanks for the PR!

Was the Gopkg.lock change intentional? There isn't any Gopkg.toml changes.

Did you get a chance to test the provisioner end to end manually?

Thanks for the comments, I'll try to fix them.

Yes, Gopkg.lock change is intentional, Gopkg.toml is not changed because I did not add a new library dependency, I just added some package dependency under client-go.

No, I didn't make e2e manual test, TBH I'm not very familiar with the all environment of external provisioner but I'll try to run e2e test for the external provisioner.

@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from ae4316b to 0e58fc7 Aug 15, 2019
@msau42

This comment has been minimized.

Copy link
Collaborator

msau42 commented Aug 15, 2019

It might be worth implementing topology first in the hostpath csi driver: kubernetes-csi/csi-driver-host-path#54. Then, we have automated pull/ci jobs that can test the topology path. (We'll also need to update those pull jobs to bring up a multi-node cluster)

I am also working on adding additional topology tests as part of kubernetes/kubernetes#71289

@mucahitkurt

This comment has been minimized.

Copy link
Contributor Author

mucahitkurt commented Aug 16, 2019

/test pull-kubernetes-csi-external-provisioner-1-13-on-kubernetes-1-13

@mucahitkurt

This comment has been minimized.

Copy link
Contributor Author

mucahitkurt commented Aug 16, 2019

/retest

@mucahitkurt

This comment has been minimized.

Copy link
Contributor Author

mucahitkurt commented Aug 18, 2019

@verult Do you have any idea why pull-kubernetes-csi-external-provisioner-1-13-on-kubernetes-1-13 tests are failed? Thanks!

@msau42

This comment has been minimized.

Copy link
Collaborator

msau42 commented Aug 19, 2019

E0816 19:40:18.164363       1 reflector.go:125] github.com/kubernetes-csi/external-provisioner/cmd/csi-provisioner/csi-provisioner.go:186: Failed to list *v1beta1.CSINode: the server could not find the requested resource

Topology was an alpha feature in 1.13, so the beta object doesn't exist. To avoid this, we should only create the informer if the topology feature gate in the external-provisioner is enabled.

@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from 0e58fc7 to 3614485 Aug 19, 2019
@msau42

This comment has been minimized.

Copy link
Collaborator

msau42 commented Aug 20, 2019

/test pull-kubernetes-csi-external-provisioner-1-15-on-kubernetes-1-15

var csiNodeInformer cache.SharedIndexInformer
if utilfeature.DefaultFeatureGate.Enabled(features.Topology) {
// Create informer to prevent hit the API server for all resource request
csiNodeInformer := v1beta1.NewCSINodeInformer(clientset, ctrl.ResyncPeriodOfCsiNodeInformer, cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc})

This comment has been minimized.

Copy link
@msau42

msau42 Aug 20, 2019

Collaborator

I'm not sure indexing by namespace makes sense because CSINode is a non-namespaced object.

This comment has been minimized.

Copy link
@mucahitkurt

mucahitkurt Aug 20, 2019

Author Contributor

I'll change to cache.Indexers{}

@@ -235,7 +239,8 @@ func NewCSIProvisioner(client kubernetes.Interface,
pluginCapabilities connection.PluginCapabilitySet,
controllerCapabilities connection.ControllerCapabilitySet,
supportsMigrationFromInTreePluginName string,
strictTopology bool) controller.Provisioner {
strictTopology bool,
informer cache.SharedIndexInformer) controller.Provisioner {

This comment has been minimized.

Copy link
@msau42

msau42 Aug 20, 2019

Collaborator

instead of passing in the entire informer to the provisioner, how about only passing in the Lister(), since all we need is Get() and List()

This comment has been minimized.

Copy link
@mucahitkurt

mucahitkurt Aug 20, 2019

Author Contributor

I'll change to informer.GetStore()

@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from 3614485 to 3ecf51a Aug 20, 2019
@mucahitkurt

This comment has been minimized.

Copy link
Contributor Author

mucahitkurt commented Aug 20, 2019

/retest

1 similar comment
@mucahitkurt

This comment has been minimized.

Copy link
Contributor Author

mucahitkurt commented Aug 21, 2019

/retest

@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from 7bdb6d2 to b961349 Oct 1, 2019
var factory informers.SharedInformerFactory
if utilfeature.DefaultFeatureGate.Enabled(features.Topology) {
// Create informer to prevent hit the API server for all resource request
factory = informers.NewSharedInformerFactory(clientset, ctrl.ResyncPeriodOfCsiNodeInformer)

This comment has been minimized.

Copy link
@verult

verult Oct 8, 2019

Contributor

if the lister is only necessary when topology feature is used, should it use supportsTopology() instead to capture all conditions?

func (p *csiProvisioner) supportsTopology() bool {

This comment has been minimized.

Copy link
@mucahitkurt

mucahitkurt Oct 9, 2019

Author Contributor

Do you mean to change the condition utilfeature.DefaultFeatureGate.Enabled(features.Topology) with pluginCapabilities[csi.PluginCapability_Service_VOLUME_ACCESSIBILITY_CONSTRAINTS] && utilfeature.DefaultFeatureGate.Enabled(features.Topology) or do you mean to use csiNodeLister as initiated inside the csiProvisioner with the supportsTopology() check instead of a external provided property of the struct?

This comment has been minimized.

Copy link
@mucahitkurt

mucahitkurt Oct 9, 2019

Author Contributor

I added additional check for plugin capabilities.

This comment has been minimized.

Copy link
@verult

verult Oct 9, 2019

Contributor

Yup this is what I meant, though in its current form, I'm worried that someone later on will only change one place but not the other. It would be good to have a shared utility function instead in a different file (maybe package/util)

This comment has been minimized.

Copy link
@mucahitkurt

mucahitkurt Oct 10, 2019

Author Contributor

Makes sense, thanks. Putting the function inside topology.go instead of util package makes sense to me.

Copy link
Contributor

verult left a comment

Sorry about the delay, just a few more last comments and everything else looks good!

// error with the API server.
return nil, fmt.Errorf("error listing CSINodes: %v", err)
klog.Warningf("No csi nodes found")
return nil, nil

This comment has been minimized.

Copy link
@verult

verult Oct 8, 2019

Contributor

How come it's OK to not return an error now?

This comment has been minimized.

Copy link
@mucahitkurt

mucahitkurt Oct 9, 2019

Author Contributor

I couldn't remember any valid case for this change, I think I changed it in a wrong way, I'll change it to return error.

@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from b961349 to 704a706 Oct 9, 2019
@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from 704a706 to 71473f7 Oct 9, 2019
@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from 71473f7 to 70c2877 Oct 10, 2019
@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from 70c2877 to 6061117 Oct 10, 2019
@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from 6061117 to 14e1e38 Oct 22, 2019
@mucahitkurt

This comment has been minimized.

Copy link
Contributor Author

mucahitkurt commented Oct 22, 2019

@verult PTAL

@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from 14e1e38 to d5b98b8 Oct 22, 2019
@verult

This comment has been minimized.

Copy link
Contributor

verult commented Oct 24, 2019

/lgtm

Thanks @mucahitkurt !

@k8s-ci-robot k8s-ci-robot added the lgtm label Oct 24, 2019
@@ -168,10 +168,6 @@ k8s.io/api/apps/v1
k8s.io/api/apps/v1beta1

This comment has been minimized.

Copy link
@msau42

msau42 Oct 25, 2019

Collaborator

is updating this file necessary? it looks like just some lines got moved around?

This comment has been minimized.

Copy link
@mucahitkurt

mucahitkurt Oct 26, 2019

Author Contributor

Yes content is the same, just some order changes, but I'm getting error from CI like below when I don't change this file with the below command;

ERROR: vendor directory *not* up-to-date, it did get modified by 'GO111MODULE=on go mod vendor':

https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/kubernetes-csi_external-provisioner/327/pull-kubernetes-csi-external-provisioner-unit/1187927085243437058

https://travis-ci.org/kubernetes-csi/external-provisioner/builds/603113049?utm_source=github_status&utm_medium=notification

@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from d5b98b8 to b1bccd2 Oct 26, 2019
@k8s-ci-robot k8s-ci-robot removed the lgtm label Oct 26, 2019
…source request

Signed-off-by: Mucahit Kurt <mucahitkurt@gmail.com>
@mucahitkurt mucahitkurt force-pushed the mucahitkurt:use-node-and-csinodeinfo-informers branch from b1bccd2 to 17fa2a8 Oct 26, 2019
@msau42

This comment has been minimized.

Copy link
Collaborator

msau42 commented Oct 26, 2019

/lgtm
/approve

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Oct 26, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: msau42, mucahitkurt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 4160887 into kubernetes-csi:master Oct 26, 2019
5 of 6 checks passed
5 of 6 checks passed
tide Not mergeable. Needs approved, lgtm labels.
Details
cla/linuxfoundation mucahitkurt authorized
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
pull-kubernetes-csi-external-provisioner-1-14-on-kubernetes-1-14 Job succeeded.
Details
pull-kubernetes-csi-external-provisioner-1-15-on-kubernetes-1-15 Job succeeded.
Details
pull-kubernetes-csi-external-provisioner-unit Job succeeded.
Details
Topology: GA automation moved this from In progress to Done Oct 26, 2019
@mucahitkurt mucahitkurt deleted the mucahitkurt:use-node-and-csinodeinfo-informers branch Oct 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.