dynamic resource allocation: reuse gRPC connection #118619

Merged
merged 2 commits into from Aug 16, 2023

Conversation

TommyStarK
Contributor

@TommyStarK TommyStarK commented Jun 12, 2023

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Avoids creating a new gRPC connection each time NodePrepareResource or NodeUnprepareResource is called.

Which issue(s) this PR fixes:

Fixes #113832
part of #118661

Special notes for your reviewer:

Does this PR introduce a user-facing change?

dynamic resource allocation: avoid creating a new gRPC connection for every call of prepare/unprepare resource(s)

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 12, 2023
@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 12, 2023
@TommyStarK TommyStarK force-pushed the gh_113832 branch 2 times, most recently from 3638706 to c3ded02 Compare June 12, 2023 17:44
@TommyStarK TommyStarK changed the title WIP: dynamic resource allocation: reuse gRPC connection dynamic resource allocation: reuse gRPC connection Jun 12, 2023
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 12, 2023
@TommyStarK
Contributor Author

/assign @bart0sh

Contributor

@bart0sh bart0sh left a comment

Thank you for the PR! A couple of minor things, but overall lgtm.

pkg/kubelet/cm/dra/plugin/client.go (outdated, resolved)
pkg/kubelet/cm/dra/plugin/plugin.go (outdated, resolved)
pkg/kubelet/cm/dra/plugin/plugins_store.go (outdated, resolved)
@bart0sh
Contributor

bart0sh commented Jun 12, 2023

/triage accepted
/priority important-soon
@TommyStarK we need to cover this with e2e and, if possible, unit tests.

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 12, 2023
@bart0sh
Contributor

bart0sh commented Jun 12, 2023

/cc @klueska

@bart0sh bart0sh moved this from Triage to Needs Reviewer in SIG Node PR Triage Jun 12, 2023
@TommyStarK
Contributor Author

/test pull-kubernetes-e2e-gce

@TommyStarK
Contributor Author

@bart0sh @klueska After spending a lot of time trying to understand what was going wrong in the kind-dra pipeline, I came to the conclusion that the failure is related to the way we detect the version and to the changes I made after @pohly's PR was merged. I reverted to the previous behavior and kept only the reuse of the gRPC connection and the test covering it.
If you agree, I'd like to merge this as it is. I will then open another PR to clean up the flow (including all your feedback, Kevin) and update the e2e tests as well (as soon as I am more comfortable with that piece of code).
PS: pull-kubernetes-e2e-gce fails very often, but that is not related to this PR.

@bart0sh bart0sh moved this from Needs Approver to Needs Reviewer in SIG Node PR Triage Jul 20, 2023
@TommyStarK
Contributor Author

/test pull-kubernetes-kind-dra

@TommyStarK
Contributor Author

/test pull-kubernetes-e2e-gce

@sftim
Contributor

sftim commented Jul 20, 2023

Could an end user spot the difference before and after this change? If so, we should changelog it.

@TommyStarK
Contributor Author

@sftim I guess so: before this change there should be more open file descriptors than after it. What about the following?

dynamic resource allocation: avoid creating a new gRPC connection for every call of prepare/unprepare resource(s)

@sftim
Contributor

sftim commented Jul 20, 2023

I'll let the approvers check the changelog entry. In the meantime, please remove the “none” marker.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jul 20, 2023
…gRPC connection

Signed-off-by: TommyStarK <thomasmilox@gmail.com>
@bart0sh
Contributor

bart0sh commented Jul 23, 2023

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 23, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 9193c91512da0029811ab99ad742f3d9eccc04ff

@bart0sh bart0sh moved this from Needs Reviewer to Needs Approver in SIG Node PR Triage Jul 23, 2023
@TommyStarK
Contributor Author

@klueska As we agreed, this PR focuses only on the reuse of the gRPC connection, nothing else. I am starting to work on refactoring the client based on your feedback. If you are OK with that, I think we can cancel the hold.

@bart0sh
Contributor

bart0sh commented Aug 16, 2023

/unhold
to unblock refactoring work promised by @TommyStarK

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 16, 2023
@TommyStarK
Contributor Author

/retest

}

return drapb.NewNodeClient(conn), drapbv1alpha2.NewNodeClient(conn), conn, nil
pluginName string
Contributor

I don't think you need to necessarily address my comments below in this PR, but I think this code needs a major overhaul in general. It was copied from some CSI plugin code, and some remnants of that are still here.

For example, this type now consists of just the name of the plugin and a reference to the plugin itself. As such, it feels weird to have two separate types for Plugin and draPluginClient. Can they be merged into a single type? Moreover, what is the value in making the Plugin and PluginStore types public? Are they actually used anywhere externally? Also, the functions and types defined in each file don't necessarily match what the file name suggests. We should do a proper audit of the code before adding any further PRs on top of this one.

Contributor Author

After a quick look, I think you are right: we could merge Plugin and draPluginClient into a single type. The plugin store appears to be used only under dra, so we could make it private. I will take this feedback into account when proposing a rework of the client/plugin code.
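
To make the proposal concrete, here is a rough sketch of what a merged, unexported layout could look like. It is only an illustration under assumptions, not the planned rework: the `plugin` and `pluginsStore` types, their fields, and the `newPluginsStore`, `add`, `get`, and `delete` helpers are hypothetical names, and the example plugin name and endpoint are made up.

```go
package main

import (
	"fmt"
	"sync"
)

// plugin is a hypothetical single type that carries both the plugin's
// name and its endpoint, replacing a separate client wrapper.
type plugin struct {
	name     string
	endpoint string
}

// pluginsStore is an unexported, concurrency-safe registry keyed by
// plugin name (illustrative only).
type pluginsStore struct {
	mu    sync.RWMutex
	store map[string]*plugin
}

func newPluginsStore() *pluginsStore {
	return &pluginsStore{store: make(map[string]*plugin)}
}

// add registers p under its name, replacing any previous entry.
func (s *pluginsStore) add(p *plugin) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.store[p.name] = p
}

// get returns the plugin registered under name, or nil if there is none.
func (s *pluginsStore) get(name string) *plugin {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.store[name]
}

// delete removes the plugin registered under name, if any.
func (s *pluginsStore) delete(name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.store, name)
}

func main() {
	s := newPluginsStore()
	s.add(&plugin{name: "example.com/driver", endpoint: "unix:///hypothetical.sock"})
	fmt.Println(s.get("example.com/driver") != nil)
	s.delete("example.com/driver")
	fmt.Println(s.get("example.com/driver") == nil)
}
```

Keeping both types unexported would limit the surface that later refactoring has to preserve.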

@klueska
Contributor

klueska commented Aug 16, 2023

/approve
/lgtm

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: klueska, TommyStarK

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 19deb04 into kubernetes:master Aug 16, 2023
14 checks passed
SIG Node PR Triage automation moved this from Needs Approver to Done Aug 16, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.29 milestone Aug 16, 2023
@TommyStarK TommyStarK deleted the gh_113832 branch August 16, 2023 16:42
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Development

Successfully merging this pull request may close these issues.

dynamic resource allocation: reuse gRPC connection to the node plugin
7 participants