can't start zookeeper in 0.9.2 #18

Closed
jazzl0ver opened this issue Jan 25, 2018 · 9 comments

Comments

@jazzl0ver
Collaborator

Sorry to bother you, but the 0.9.2 release is giving me a headache. Could you please check whether you can start ZooKeeper in ECS with the following command:

# ./firecamp-service-cli -op=create-service -service-type=zookeeper -region=us-east-1 -cluster=firecamp-prod -service-name=zoo-prod -replicas=3 -volume-size=20 -zk-heap-size=512

I'm getting:

The ZooKeeper heap size is less than 4096. Please increase it for production system
The zookeeper service is created, wait for all containers running
wait the service containers running, RunningCount 0
...
wait the service containers running, RunningCount 1
not all service containers are running after 120

In the end, only one ZooKeeper container is running.
The service events show:

85171af8-094f-48c8-95c1-8ddc1406cfd3
2018-01-25 20:51:55 +0300
service zoo-prod was unable to place a task because no container instance met all of its requirements. The closest matching container-instance 986e672b-838a-4215-94c0-1ae8d8cf783b encountered error "memberOf constraint unsatisfied". For more information, see the Troubleshooting section.

cec4fbb1-b192-48d1-8747-a34c160a8481
2018-01-25 20:51:42 +0300
service zoo-prod has started 1 tasks: task d4e9c873-4629-4688-8b5b-9f8b1fcda874.

The FireCamp log ends with:

...
I0125 17:54:39.218207 1 server.go:688] get service status &{1 3} requuid req-722d99f1ef0c470c463dee0fe2e1dfea &{us-east-1 firecamp-prod zoo-prod}
I0125 17:54:44.219742 1 server.go:105] request Method GET URL /?Get-Service-Status ?Get-Service-Status Host firecamp-manageserver.firecamp-prod-firecamp.com:27040 requuid req-9a1353ee054041c96c770a55a24813c3 headers map[Accept-Encoding:[gzip] User-Agent:[Go-http-client/1.1] Content-Length:[73]]
I0125 17:54:44.236612 1 ecs.go:759] service zoo-prod has 1 running containers, desired 3
I0125 17:54:44.236634 1 server.go:688] get service status &{1 3} requuid req-9a1353ee054041c96c770a55a24813c3 &{us-east-1 firecamp-prod zoo-prod}
I0125 17:54:49.238279 1 server.go:105] request Method GET URL /?Get-Service-Status ?Get-Service-Status Host firecamp-manageserver.firecamp-prod-firecamp.com:27040 requuid req-97454a904f2b4c8161b8cf499e72d06a headers map[User-Agent:[Go-http-client/1.1] Content-Length:[73] Accept-Encoding:[gzip]]
I0125 17:54:49.256414 1 ecs.go:759] service zoo-prod has 1 running containers, desired 3
I0125 17:54:49.256441 1 server.go:688] get service status &{1 3} requuid req-97454a904f2b4c8161b8cf499e72d06a &{us-east-1 firecamp-prod zoo-prod}

Any ideas what's going on?

@JuniusLuo
Contributor

Could you please check the EC2 instance type and the number of availability zones? Did you deploy other services?

@jazzl0ver
Collaborator Author

jazzl0ver commented Jan 25, 2018 via email

@JuniusLuo
Contributor

Thanks. How many nodes are in the cluster, 5 or 3?

@jazzl0ver
Collaborator Author

jazzl0ver commented Jan 25, 2018 via email

@JuniusLuo
Contributor

This might be the issue. Currently, when creating a service, FireCamp does not check whether every zone has a running node. The FireCamp manage service simply assigns the service replicas to zones in round-robin order, so a replica may be assigned to a zone where no node is running.

Is there a reason you want the cluster to span 5 zones while it has only 3 nodes? With 5 nodes in 5 zones or 3 nodes in 3 zones, this issue would not show up.
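
To illustrate the round-robin assignment described above, here is a minimal sketch in Python. It is not FireCamp's actual code: the function names, the zone list, and the placement of the 3 nodes are all assumptions made for the example. It only shows how replicas assigned to zones in order can land in zones that have no node, which ECS would then report as an unsatisfied placement constraint.

```python
from itertools import cycle


def assign_replicas_round_robin(zones, replicas):
    """Assign each replica (by index) to a zone, cycling through the zones in order."""
    zone_cycle = cycle(zones)
    return [next(zone_cycle) for _ in range(replicas)]


def unplaceable_replicas(assignment, node_zones):
    """Return (replica_index, zone) pairs whose assigned zone has no running node."""
    occupied = set(node_zones)
    return [(i, zone) for i, zone in enumerate(assignment) if zone not in occupied]


# Hypothetical layout: 5 zones configured for the cluster, but only 3 nodes.
zones = ["us-east-1a", "us-east-1b", "us-east-1c", "us-east-1d", "us-east-1e"]
node_zones = ["us-east-1a", "us-east-1d", "us-east-1e"]  # assumed node placement

assignment = assign_replicas_round_robin(zones, replicas=3)
stuck = unplaceable_replicas(assignment, node_zones)

print("assignment:", assignment)
print("replicas with no node in their zone:", stuck)
```

With this assumed layout, two of the three replicas are assigned to zones without a node, so only one container can run. With 3 zones and one node in each, the list of stuck replicas is empty.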

@jazzl0ver
Collaborator Author

jazzl0ver commented Jan 25, 2018 via email

@JuniusLuo
Contributor

How do you want to scale? Do you want to scale ZooKeeper to 5 nodes across 5 AZs?

There is one limitation imposed by the AutoScalingGroup and EBS. If the cluster has 5 AZs and 3 instances, the ASG may create the replacement instance in a 4th AZ when one instance goes down. But the previous EBS volume is not in that 4th AZ, so one member will fail to start.
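
The limitation above can be sketched as a simple check. This is an illustration only, not FireCamp or AWS code: the member names, zones, and function name are hypothetical. It relies on one fact, that an EBS volume can only be attached to an instance in the same availability zone as the volume.

```python
def members_unable_to_start(volume_zones, instance_zones):
    """Return the members whose EBS volume sits in a zone with no running instance.

    volume_zones maps a member name to the AZ of its EBS volume;
    instance_zones lists the AZs of the currently running instances.
    """
    available = set(instance_zones)
    return sorted(m for m, zone in volume_zones.items() if zone not in available)


# Hypothetical members and zones: each member's volume was created in its original AZ.
volume_zones = {
    "zoo-prod-0": "us-east-1a",
    "zoo-prod-1": "us-east-1b",
    "zoo-prod-2": "us-east-1c",
}

# Before the failure: one instance per volume zone.
before = ["us-east-1a", "us-east-1b", "us-east-1c"]
# After the failure: the instance in us-east-1c went down and the ASG
# (in this assumed scenario) launched its replacement in us-east-1d.
after = ["us-east-1a", "us-east-1b", "us-east-1d"]

print("blocked before:", members_unable_to_start(volume_zones, before))
print("blocked after:", members_unable_to_start(volume_zones, after))
```

Restricting the ASG to the same 3 AZs that hold the volumes avoids this case, because a replacement instance can only come up in a zone where a volume already exists.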

You could start with 3 AZs and 3 instances. In a future release, we will support scaling the AZs: we could add the new AZs to the ASG and register them with the FireCamp manage service. The manage service will then create the new replicas in the new AZs when scaling the ZooKeeper service.

@jazzl0ver
Collaborator Author

jazzl0ver commented Jan 26, 2018

After reducing the number of AZs to 3, everything worked like a charm! Thank you!

@cloudstax
Owner

Closing this issue, as it works with the correct number of nodes. Scaling the cluster is an advanced feature planned for a later release.
