Fixes a race in deviceplugin/manager_test.go and a race in deviceplug… #52561

Merged: 1 commit into kubernetes:master on Sep 19, 2017

Conversation

jiayingz
Contributor

…in/manager.go.

What this PR does / why we need it:

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #
#52560

Special notes for your reviewer:
Tested with go test -count 50 -race k8s.io/kubernetes/pkg/kubelet/deviceplugin and all runs passed.

Release note:

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 15, 2017
@jiayingz
Contributor Author

/release-note-none

@k8s-ci-robot k8s-ci-robot added the release-note-none Denotes a PR that doesn't merit a release note. label Sep 15, 2017
@jiayingz
Contributor Author

/assign @mindprince @dchen1107

@jiayingz
Contributor Author

/assign @RenaudWasTaken

@k8s-ci-robot
Contributor

@jiayingz: GitHub didn't allow me to assign the following users: RenaudWasTaken.

Note that only kubernetes members can be assigned.

In response to this:

/assign @RenaudWasTaken

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jiayingz
Contributor Author

/retest

@jiayingz jiayingz force-pushed the deviceplugin-failure branch 3 times, most recently from ad3708b to 2ab77e0 Compare September 16, 2017 00:59
@dchen1107
Member

/approve

@dchen1107 dchen1107 added this to the v1.8 milestone Sep 16, 2017
@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 16, 2017
@RenaudWasTaken
Contributor

We are still leaking a goroutine in device_plugin_stub.go because the Stop implementation does not close the close channel.
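The leak pattern described here can be sketched as follows. This is a minimal illustration under assumptions, not the code in device_plugin_stub.go: the `stub` type, its `stop` field, and the `WaitGroup` are hypothetical. The point is that a goroutine blocked on a channel receive only exits once `Stop` closes that channel.

```go
package main

import (
	"fmt"
	"sync"
)

// stub is a hypothetical stand-in for a test device plugin; it is not the
// real Stub type from the Kubernetes tree.
type stub struct {
	stop chan struct{} // closed by Stop to release the background goroutine
	wg   sync.WaitGroup
}

func newStub() *stub {
	return &stub{stop: make(chan struct{})}
}

// Start launches a goroutine that blocks until stop is closed.
func (s *stub) Start() {
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		<-s.stop // if Stop never closes s.stop, this goroutine never exits
	}()
}

// Stop closes the channel and waits for the goroutine to finish.
// Calling Stop twice would panic on the second close.
func (s *stub) Stop() {
	close(s.stop)
	s.wg.Wait()
}

func main() {
	s := newStub()
	s.Start()
	s.Stop()
	fmt.Println("background goroutine exited")
}
```

Waiting on the `WaitGroup` in `Stop` is a design choice for this sketch: it lets a test confirm the goroutine has actually returned before the next test case starts.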

@dims
Member

dims commented Sep 18, 2017

@jiayingz @RenaudWasTaken Do we need this for 1.8?

@jiayingz
Contributor Author

/assign @vishh

@vishh
Contributor

vishh commented Sep 19, 2017

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 19, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dchen1107, jiayingz, vishh

Associated issue: 52560

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@dims
Member

dims commented Sep 19, 2017

/test all

@jiayingz
Contributor Author

/retest

1 similar comment
@dims
Member

dims commented Sep 19, 2017

/retest

@@ -70,7 +70,7 @@ func (m *Stub) Start() error {
// Wait till grpc server is ready.
for i := 0; i < 10; i++ {
Contributor

This whole loop is useless. m.server.GetServiceInfo() returns the number of services as soon as RegisterDevicePluginServer is called.

Contributor Author

I think it returns the number of services when they are ready, not as soon as RegisterDevicePluginServer is called.

Contributor

https://github.com/grpc/grpc-go/blob/master/server.go#L376

You can also test this:

m.server = grpc.NewServer([]grpc.ServerOption{}...)
pluginapi.RegisterDevicePluginServer(m.server, m)
fmt.Println(m.server.GetServiceInfo())

Contributor Author

Hmm, this is interesting to know. I thought GetServiceInfo returned the number of ready-to-serve methods, but looking at the code, it should indeed return the number of methods right after registration. However, I did verify that if I remove this code block, running 'bazel test --runs_per_test=20 //pkg/kubelet/deviceplugin:go_default_test' becomes flaky (I think mostly in endpoint_test.go), whereas with the current code all the tests pass. I think we will need to spend more time understanding this code. For now, I would like to keep it this way to make sure the tests are not flaky.

@jiayingz
Contributor Author

/retest

1 similar comment
@dims
Member

dims commented Sep 19, 2017

/retest

@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 08486ab into kubernetes:master Sep 19, 2017