Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the proxy agent retry logic to simplify the code. #108

Merged
merged 1 commit into from
Jun 12, 2020

Conversation

caesarxuchao
Copy link
Member

@caesarxuchao caesarxuchao commented May 29, 2020

Copied from #106 (comment):

"It seems that the reconnecting code is causing more and more maintenance burden....The advantage of the reconnecting is that it will retry the current packet so it surfaces less errors to the caller."

It's a nice-to-have, but we think the maintenance burden has overgrown the advantage, so we remove it.

In a follow-up PR, I'll add client-side metrics to record how many send/receive errors we got, so that we will know if the optimization is needed.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 29, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: caesarxuchao

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 29, 2020
@caesarxuchao caesarxuchao changed the title [WIP] Remove the proxy agent retry logic to simplify the code. Remove the proxy agent retry logic to simplify the code. May 29, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 29, 2020
@caesarxuchao
Copy link
Member Author

/assign @cheftako @Jefftree

@caesarxuchao
Copy link
Member Author

/assign @anfernee

Copy link
Member

@Jefftree Jefftree left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be a straightforward refactor. lgtm from me

pkg/agent/client.go Outdated Show resolved Hide resolved
@anfernee
Copy link
Member

lgtm too. not particularly familiar with the host counting logic. everything else looks good.

stream agent.AgentService_ConnectClient
agentID string
serverID string // the id of the proxy server this client connects to.
serverCount int // the number of the proxy server instances.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure we should keep serverCount here. It seems like there should only be one value for serverCount. Not sure what it means if we somehow get different clients with different values for serverCount. Seems better to just have serverCount in ClientSet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Bubbled up the serverCount as a return value.

a.recvLock.Lock()
defer a.recvLock.Unlock()

pkt, err := a.stream.Recv()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not blocking for this PR but a little worried that Recv is blocking and may stop us shutting down properly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I did some research on this and it seems fine.

When we close the agent client, we close the underlying GRPC connection, which should close the stream and unblocks Recv() with an error, according to grpc/grpc-go#3230 (comment).

return 0, err
}
scounts := md.Get(header.ServerCount)
if len(scounts) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is it ok to get more than 1 ServerCount? If you get more than 1, how do you know which one is the right value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated it to check the length is 1.

@@ -175,13 +171,13 @@ func (cs *ClientSet) syncOnce() error {
if err != nil {
return err
}
cs.serverCount = c.stream.serverCount
if err := cs.AddClient(c.stream.serverID, c); err != nil {
cs.serverCount = c.serverCount
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If cs.serverCount > 0 and cs.serverCount != c.serverCount, we should probably at least print a warning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed c.serverCount per your other comment :)

if err != nil {
return err
}
cs.serverCount = c.stream.serverCount
if err := cs.AddClient(c.stream.serverID, c); err != nil {
cs.serverCount = serverCount
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to log a warning if cs.serverCount !=0 and cs.serverCount != serverCount. Whether we are looking at a serverCount of zero creeping in, two proxy servers/apiservers with differing beliefs in the needed number of servers or a legitimate runtime change in the count, it will be hard to debug these scenarios without this information. In addition this should be a very rare event.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Updated.

Copy link
Contributor

@cheftako cheftako left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I would really like the mentioned warning added and then I'm happy :D

Copy link
Contributor

@cheftako cheftako left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 12, 2020
@k8s-ci-robot k8s-ci-robot merged commit b202c6e into kubernetes-sigs:master Jun 12, 2020
River-sh pushed a commit to River-sh/apiserver-network-proxy that referenced this pull request Sep 22, 2023
…modules/golang.org/x/sys-0.7.0

Build(deps): bump golang.org/x/sys from 0.6.0 to 0.7.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants