
Imbalanced Initial Cluster #29

Closed
aryanet opened this issue May 30, 2012 · 7 comments

@aryanet
Contributor

aryanet commented May 30, 2012

I have launched a 12-node, multi-zone, multi-region cluster. However, the initial ownership is not even: one node (presumably the very first node in the first zone) has been assigned a far-off token, causing it to own 50+% of the ring while the others own less. I am investigating why this has happened. If you have seen this before, please advise. Here is the ring:

Address DC Rack Status State Load Owns Token
77981375752715064543690014205080921348
****** us-east 1a Up Normal 11.5 KB 54.17% 1808575600
****** us-east 1b Up Normal 11.5 KB 4.17% 7089215977519551322153637656637080005
****** us-west-2 2a Up Normal 11.5 KB 4.17% 14178431955039102644307275311624381703
****** us-west-2 2b Up Normal 11.5 KB 4.17% 21267647932558653966460912966452886108
****** us-east 1a Up Normal 11.5 KB 4.17% 28356863910078205288614550621122593220
****** us-east 1b Up Normal 11.5 KB 4.17% 35446079887597756610768188275951097625
****** us-west-2 2a Up Normal 11.5 KB 4.17% 42535295865117307932921825930938399323
****** us-west-2 2b Up Normal 11.5 KB 4.17% 49624511842636859255075463585766903728
****** us-east 1a Up Normal 13.37 KB 4.17% 56713727820156410577229101240436610840
****** us-east 1b Up Normal 11.5 KB 4.17% 63802943797675961899382738895265115245
****** us-west-2 2a Up Normal 11.5 KB 4.17% 70892159775195513221536376550252416943
****** us-west-2 2b Up Normal 11.5 KB 4.17% 77981375752715064543690014205080921348
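
For reference, here is a minimal sketch (assuming the standard RandomPartitioner token space of 2**127) that recomputes the Owns column from the tokens above; the 54.17% is exactly the wrap-around range between the last token and the unusually small first token:

# Recompute the "Owns" column from the tokens above (sketch; assumes the
# RandomPartitioner ring, whose token space is [0, 2**127)).
RING = 2 ** 127

tokens = [
    1808575600,
    7089215977519551322153637656637080005,
    14178431955039102644307275311624381703,
    21267647932558653966460912966452886108,
    28356863910078205288614550621122593220,
    35446079887597756610768188275951097625,
    42535295865117307932921825930938399323,
    49624511842636859255075463585766903728,
    56713727820156410577229101240436610840,
    63802943797675961899382738895265115245,
    70892159775195513221536376550252416943,
    77981375752715064543690014205080921348,
]

for i, t in enumerate(tokens):
    # Each node owns the range from the previous node's token up to its own;
    # the first node's range wraps around from the last token.
    owned = (t - tokens[i - 1]) % RING
    print(f"{t:>40}  {100 * owned / RING:6.2f}%")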

@Vijay2win
Contributor

Double-check your available.zones property to see the ordering of the zones. Also make sure SDB doesn't have old populated values; clean it up before you create a new cluster. (If you had a 5-node cluster, spun it down, and then spun up a 10-node cluster without clearing the SDB entries, this could happen.)

@aryanet
Contributor Author

aryanet commented May 31, 2012

I appreciate your feedback Vijay. I have the following configured for zones:

us-east-1a,us-east-1b,us-west-2a,us-west-2b

Each time I have brought up the cluster, I dropped the InstanceIdentity domain, as I thought that is where instance info is kept. Tomorrow I'll try to drop everything, create a fresh configuration, and get back to you.

@aryanet
Contributor Author

aryanet commented Jun 1, 2012

I cleared everything, and I can reproduce the same cluster every time. I am launching all auto scaling groups at the same time. Is my zone configuration incorrect? Also, I don't understand something in the code: I see that during token calculation, an offset value based on the hash of the region name is added to the token. Why is that? Please advise.

@Vijay2win
Contributor

an offset value based on the hash of the region name is added to the token. Why is that?

That's added to address the multi-region aspects of token management. You might want to look at:
http://www.datastax.com/docs/0.8/install/cluster_init
http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
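
Very roughly, the idea is something like the sketch below (illustrative only, not Priam's actual code; the hash and the offset bound are made up for the example): each region derives a small offset from its name, so nodes in different regions that land on the same slot still end up with distinct tokens.

# Illustrative sketch of a per-region token offset (not Priam's exact code).
import hashlib

RING = 2 ** 127

def region_offset(region_name, max_offset=100000):
    # Hypothetical hash-based offset derived from the region name.
    return int(hashlib.md5(region_name.encode()).hexdigest(), 16) % max_offset

def token_for(slot, total_slots, region_name):
    # Evenly spaced slot token, shifted by the region's offset.
    return (slot * (RING // total_slots) + region_offset(region_name)) % RING

print(token_for(0, 12, "us-east-1"))   # slot 0 in one region...
print(token_for(0, 12, "us-west-2"))   # ...gets a different token in the other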

Regarding the imbalance:
How many ASGs have you deployed? (Please create separate ASGs, one for each zone.)
What is the value of the priam.zones.available property?
What is the EC2_REGION environment variable set to?

@aryanet
Contributor Author

aryanet commented Jun 6, 2012

Thanks for the info. I am reading them now. To answer your questions:

4 ASGs, basically 2 in each region as described in the next line. They are separate, and each has a minimum of 3 nodes.
priam.zones.available is "us-east-1a,us-east-1b,us-west-2a,us-west-2b"
EC2_REGION is us-east-1 for the ones in east and us-west-2 for the ones in west.

@aryanet
Contributor Author

aryanet commented Jun 7, 2012

OK, I figured it out: Priam looks at the maximum number of hosts in the ASG. I had that set to 6, but my minimum and desired capacity were 3, so that was throwing off the token calculation. Thank you for your attention. I am going to close this and clarify it in the Priam wiki so others don't get confused.
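
For anyone hitting this later, the arithmetic works out like the sketch below (assuming, as observed here, that Priam carves the ring into one slot per possible host, i.e. the sum of the ASGs' max_size): 4 ASGs with max_size 6 give 24 slots, but only 12 instances launch, so the node just past the empty stretch inherits all of the unclaimed ranges.

# Sketch of the imbalance: ring sliced by ASG max_size, but fewer nodes launched.
slots = 4 * 6      # 4 ASGs x max_size 6 -> 24 token slots
launched = 4 * 3   # only 12 instances actually started (min/desired = 3)

print(f"each launched node owns {100 / slots:.2f}% of the ring")                           # 4.17%
print(f"the node after the empty slots owns {100 * (slots - launched + 1) / slots:.2f}%")  # 54.17%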

@dane-xa

dane-xa commented Jul 11, 2013

Hi all, I just hit this problem while exploring Priam. I had set my asg.max_size to a number greater than my desired number of nodes-per-AZ. As you all know by now, the result was an unbalanced ring, as Priam assumes the auto scaling group max_size is equal to the rac membership size. How about an update to the wiki clarifying the proper settings for an auto scaling group?

Perhaps something like this:
https://github.com/Netflix/Priam/wiki/Setup

Choosing ASG max_size
When setting up an ASG, ensure max_size is set to the desired number of nodes per rac (nodes per availability zone). Priam calculates tokens based on each ASG's max_size, so it is incorrect to set max_size higher than the desired number of nodes. Instead, use the DoubleRing technique if planning for future growth.

Dane
