@TyqITstudent

Problem:

  1. A ZooKeeper cluster sometimes receives a large number of client connections, occasionally more than 10,000. When the zxid rolls over, the clients reconnect and revalidate their sessions.
  2. By design, when a follower receives a session revalidation request, it forwards the request to the leader, which is responsible for session revalidation.
  3. The leader therefore has to handle a large number of requests in a short time. Statistics I collected with a tool show that some clients wait more than 20s, which is too long for some clients, such as ResourceManager.

Solution:

  1. When the ZooKeeper cluster finishes re-election (which takes a few seconds), the leader sends a time point, timeA, to the followers. timeA is the approximate time of the rollover.
  2. Followers can then decide how to handle most session revalidations. When the session timeout is less than currentTime - timeA, the follower puts the session in the touchTable. (Every half tickTime, followers send the sessions in the touchTable to the leader for validation.)
  3. When the session timeout is larger than currentTime - timeA, the follower sends the session revalidation request to the leader right away.

So the leader will receive fewer requests from followers.
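
The routing rule in the solution steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the class `RevalidationRouter`, the `Route` enum, and the method names are hypothetical, and the comparison is a literal reading of "timeout less than currentTime - timeA".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical follower-side sketch; none of these names exist in ZooKeeper. */
final class RevalidationRouter {
    enum Route { TOUCH_TABLE, SEND_NOW }

    // Deferred sessions (sessionId -> timeout in ms), sent to the leader in one batch.
    private final Map<Long, Integer> touchTable = new HashMap<>();
    // Approximate rollover time announced by the leader, in ms.
    private final long timeA;

    RevalidationRouter(long timeAMillis) {
        this.timeA = timeAMillis;
    }

    /** Applies the rule: timeout < currentTime - timeA goes to the touchTable. */
    Route route(long sessionId, int timeoutMillis, long nowMillis) {
        long sinceRollover = nowMillis - timeA;
        if (timeoutMillis < sinceRollover) {
            touchTable.put(sessionId, timeoutMillis);
            return Route.TOUCH_TABLE;
        }
        // Otherwise the request goes to the leader right away.
        return Route.SEND_NOW;
    }

    /** Returns the deferred sessions and clears the table (e.g. once per half tickTime). */
    Map<Long, Integer> drainTouchTable() {
        Map<Long, Integer> batch = new HashMap<>(touchTable);
        touchTable.clear();
        return batch;
    }
}
```

With timeA = 1000 and a current time of 4000, a session with a 2000 ms timeout is deferred to the touchTable, while a session with a 30000 ms timeout is sent to the leader immediately.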

@asfgit

asfgit commented Nov 4, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2579/

@nkalmar
Contributor

nkalmar commented Nov 5, 2018

Please create a jira for this PR and/or mention the jira number in the commit message and PR name. Do you plan on creating PR for master and 3.5 as well? I don't think this will make 3.4

edit: Okay, I see it's ZOOKEEPER-3169.

Thanks!

Contributor

@anmolnar anmolnar left a comment

@TyqITstudent This looks like a nice performance improvement, but is this a WIP patch?

  1. Does not compile
  2. sendSessionStartTime() is not used
  3. No tests

Also please create PR for master branch first.

@TyqITstudent TyqITstudent deleted the ZOOKEEPER-3169 branch November 18, 2018 12:38
@TyqITstudent
Author

@TyqITstudent
Sorry for the delay, I will take a close look at this interesting issue this weekend. Promise!!! :D

Thanks for your help.

@TyqITstudent
Author

------------looking--(:D)----------
Take it easy.

@maoling
Member

maoling commented Mar 11, 2019

@TyqITstudent Thanks for raising this performance improvement issue. I have some questions about the solution you provided.

  • "(Every half tickTime, followers will send sessions of touchTable to leader to validate)."

    "Every half tickTime": where did you find this frequency? I found that the sessions to be revalidated by the leader are sent with the ping packet (code: Follower#processPacket).

  • If timeout < currentTime - timeA (i.e. currentTime > timeout + timeA), does this mean these sessions have all expired, so they have lower priority and need not be processed at once? Is currentTime > timeout + timeA the majority case? Do I get your idea?

  • As for the improvement, IMO, we can add a rate limiter or throttling strategy for when the leader receives too many revalidate requests from the followers, and the leader can process the revalidate requests in batches (code: Leader#revalidateSession).
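
    One possible shape for the throttling and batching idea is sketched below. It is only an illustration under assumptions: `RevalidateBatcher`, `maxPending`, and `batchSize` are hypothetical names, and nothing here reflects how Leader#revalidateSession is actually implemented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical leader-side sketch: bounded queue plus fixed-size batches. */
final class RevalidateBatcher {
    private final Queue<Long> pending = new ArrayDeque<>();
    private final int maxPending; // throttle: cap on queued revalidate requests
    private final int batchSize;  // number of sessions revalidated per round

    RevalidateBatcher(int maxPending, int batchSize) {
        this.maxPending = maxPending;
        this.batchSize = batchSize;
    }

    /** Queues a session id; returns false when the cap is reached so the caller can back off. */
    synchronized boolean offer(long sessionId) {
        if (pending.size() >= maxPending) {
            return false;
        }
        pending.add(sessionId);
        return true;
    }

    /** Removes and returns up to batchSize session ids, in arrival order. */
    synchronized List<Long> nextBatch() {
        List<Long> batch = new ArrayList<>(batchSize);
        while (batch.size() < batchSize && !pending.isEmpty()) {
            batch.add(pending.poll());
        }
        return batch;
    }
}
```

    With `new RevalidateBatcher(3, 2)`, a fourth `offer` is rejected until a batch is drained, and three queued sessions come out as one batch of two followed by a batch of one.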

@TyqITstudent
Author

  1. Your description is correct. Every half tickTime, the leader sends a ping packet to the followers,
    and the followers send back the sessions stored in the touchTable.
  2. Yes, your understanding is correct. Every half tickTime, all sessions in the follower's touchTable are sent to the leader for revalidation. If the sessions that need to be revalidated now will not have expired after half a tickTime, we can send them to the leader with the ping packets. You get my idea.
  3. Perhaps your idea is much better, but we hope to revalidate sessions as quickly as possible. For example, if a revalidate request is sent at timeA and the leader only revalidates it at timeA+6s, the session may already have expired by then.

@maoling
Member

maoling commented Mar 18, 2019

@TyqITstudent

  • I have another concern:
    comparing timestamps between the leader and followers in a distributed system may not be a good idea. E.g. due to clock skew, the currentTime on the follower's machine can be smaller than the actual value, and then currentTime < timeout + timeA always holds.
  • Overall, with the current session design, the server cannot handle thousands of clients well.
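
    To make the clock skew concern concrete, here is a small worked example. It is a sketch with made-up numbers; `ClockSkewDemo` and `defers` are hypothetical names that simply restate the proposed comparison timeout < currentTime - timeA.

```java
/** Hypothetical illustration of how follower clock skew changes the comparison. */
final class ClockSkewDemo {
    /** The proposed rule: defer when timeout < currentTime - timeA (all values in ms). */
    static boolean defers(long timeoutMs, long followerNowMs, long leaderTimeAMs) {
        return timeoutMs < followerNowMs - leaderTimeAMs;
    }

    public static void main(String[] args) {
        long timeA = 100_000L;   // rollover time measured on the leader's clock
        long trueNow = 110_000L; // actual time, 10s after the rollover
        long timeout = 6_000L;   // session timeout
        long skew = 8_000L;      // assumed: the follower's clock runs 8s behind

        System.out.println(defers(timeout, trueNow, timeA));        // accurate clock
        System.out.println(defers(timeout, trueNow - skew, timeA)); // skewed clock
    }
}
```

    With an accurate clock the comparison is 6000 < 10000, so the session is deferred. With the follower 8s behind it becomes 6000 < 2000, so the same session is sent to the leader immediately, which is the currentTime < timeout + timeA case described above.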

5 participants