
Storm-166: Nimbus HA design doc and implementation. #354

Merged
merged 96 commits on Aug 24, 2015

Conversation

Parth-Brahmbhatt
Contributor

I have removed the BitTorrent implementation from this pull request because the only available BitTorrent library does not support trackerless torrents. Without trackerless torrents, a single tracker becomes a single point of failure, and a multi-tracker implementation requires that if a tracker host fails, the replacement host has the same DNS/network configuration.

Some manual tests I executed:

  • Start 3 nimbuses and verify that a simple word-count topology works. Try storm list/activate/deactivate/rebalance/kill from the UI and CLI.
  • Set the replication factor to 2 and run the first test again.
  • Bring up a new nimbus and ensure it catches up and competes for the leader lock.
  • With 3 nimbuses and 2 topologies, delete one topology's code from each non-leader nimbus. After killing the master nimbus, ensure one of them eventually becomes leader.

Midpoint Applications and others added 30 commits September 12, 2014 14:17
Conflicts:
	storm-core/test/clj/backtype/storm/supervisor_test.clj
…y way. All tests pass now and I was able to run wordcount and exclamation topologies.
…HDFSCodeDistributor. Working version of HDFSCodeDistributor.
… display list of nimbus hosts and current leader.
…ot have all the active topology code locally, keeps the lock if it can verify all active topology code exists locally.
…on to be achieved before the topology is activated.
@vesense
Member

vesense commented Mar 20, 2015

+1 looks good to me.

@ptgoetz
Member

ptgoetz commented Mar 20, 2015

+1

I'll proceed with creating the necessary branches.

Conflicts:
	storm-core/src/ui/public/index.html
	storm-core/src/ui/public/templates/index-page-template.html
	storm-core/src/ui/public/templates/topology-page-template.html
@ptgoetz
Member

ptgoetz commented Mar 20, 2015

Thanks @Parth-Brahmbhatt. I've merged this to 0.11.x-branch.

@ptgoetz
Member

ptgoetz commented Mar 21, 2015

NOTE: 0.11.x-branch has been renamed to 'nimbus-ha-branch'.

@longdafeng
Contributor

@ptgoetz @revans2 @Parth-Brahmbhatt,

Sorry for the late discussion of the HA design.
I strongly recommend using JStorm's Nimbus HA design. It is pretty stable; it has been released for a year and has proven stable in that time.

The logic is very simple, and the code is less than 1,000 lines.

  1. Every nimbus tries to own the znode /nimbus_master; the winner becomes the nimbus master. Slaves watch and periodically check the znode; once it disappears, they try to own it. While checking the znode, slaves sync binaries from the master.
  2. All client APIs first connect to ZK to find out which nimbus is the master.

(1) The core code in nimbus:
https://github.com/alibaba/jstorm/blob/master/jstorm-server/src/main/java/com/alibaba/jstorm/schedule/FollowerRunnable.java

(2) How to find the master of nimbus:
https://github.com/alibaba/jstorm/blob/master/jstorm-client/src/main/java/backtype/storm/security/auth/ThriftClient.java
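For readers who want to see the pattern described above in isolation, here is a minimal sketch of ephemeral-znode ownership with the plain ZooKeeper client. It is not JStorm's actual code (that is linked above); only the /nimbus_master path comes from the description, and the session timeout, class name, and payload are illustrative assumptions.

```java
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

// Minimal sketch of the znode-ownership pattern described above.
// Not JStorm's implementation; /nimbus_master is taken from the comment,
// the rest (payload, session timeout) is illustrative only.
public class ZnodeMasterElection implements Watcher {
    private static final String MASTER_PATH = "/nimbus_master";
    private final ZooKeeper zk;
    private final byte[] myHostPort;

    public ZnodeMasterElection(String zkConnect, String hostPort) throws Exception {
        this.zk = new ZooKeeper(zkConnect, 30_000, this);
        this.myHostPort = hostPort.getBytes("UTF-8");
    }

    /** Try to own /nimbus_master; the winner is the master, losers keep watching. */
    public void compete() throws Exception {
        try {
            // EPHEMERAL: the znode disappears if this nimbus's ZK session dies.
            zk.create(MASTER_PATH, myHostPort,
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
            System.out.println("I am the master nimbus");
        } catch (KeeperException.NodeExistsException e) {
            // Someone else owns it; set a watch (and sync binaries from the master).
            Stat stat = zk.exists(MASTER_PATH, true);
            if (stat == null) {
                compete(); // it vanished between create() and exists(); retry
            }
        }
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeDeleted
                && MASTER_PATH.equals(event.getPath())) {
            try {
                compete(); // master went away; slaves race to own the znode again
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }
}
```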

@Parth-Brahmbhatt
Contributor Author

Disclaimer: I have not looked at the code thoroughly, but I am commenting based on your design description of JStorm.

@longdafeng Did you have a chance to take a look at the current design? We are using Curator for leader election, which seems to be a very well tested library, and it is not far from what you have proposed for leader election.

As for the length of the code, I don't completely agree that it is a good metric for most things. Because we use an existing library, the actual leader-election code in the current PR is much smaller, 53 lines: https://github.com/Parth-Brahmbhatt/incubator-storm/blob/STORM-166/storm-core/src/clj/backtype/storm/zookeeper.clj#L250.
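For comparison, the 53 lines of Clojure linked above wrap Curator's LeaderLatch recipe. A rough Java equivalent of that usage (illustrative only; the latch path and participant id below are placeholders, not the values used in this PR) would be:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.framework.recipes.leader.LeaderLatchListener;
import org.apache.curator.retry.ExponentialBackoffRetry;

// Illustrative use of Curator's LeaderLatch recipe, roughly what the linked
// zookeeper.clj does. The path and id here are placeholders, not the PR's values.
public class NimbusLeaderElector {
    public static LeaderLatch startLatch(String zkConnect, String nimbusHostPort)
            throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                zkConnect, new ExponentialBackoffRetry(1000, 3));
        client.start();

        // Each nimbus registers under the same latch path; Curator picks one leader
        // and re-elects automatically when the current leader's session goes away.
        LeaderLatch latch = new LeaderLatch(client, "/storm/leader-latch", nimbusHostPort);
        latch.addListener(new LeaderLatchListener() {
            @Override public void isLeader()  { System.out.println("Gained leadership"); }
            @Override public void notLeader() { System.out.println("Lost leadership"); }
        });
        latch.start();
        return latch;
    }
}
```

Callers can then check hasLeadership() on the returned latch, or use getLeader() to find the id of the current leader.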

On top of that, as part of this PR several of us had concerns about all clients connecting to ZK to identify the leader nimbus, as each new ZK connection is a write to ZK. We have partially addressed the issue by introducing thrift APIs for nimbus discovery, which should be more efficient than the original approach, and I plan to add caching at the nimbus layer, which should further improve performance.
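As a rough illustration of the discovery flow described here, a client-side lookup over thrift might look like the sketch below. The NimbusService/NimbusConnector types and the isLeader() method are hypothetical stand-ins, not the thrift API this PR actually adds.

```java
import java.util.List;

// Hypothetical sketch of leader discovery over thrift instead of every client
// hitting ZooKeeper directly. NimbusService, NimbusConnector and isLeader()
// are placeholders for whatever discovery API the PR actually exposes.
public class LeaderDiscovery {
    interface NimbusService {                  // stand-in for a generated thrift client
        boolean isLeader() throws Exception;
    }
    interface NimbusConnector {                // stand-in for thrift transport setup
        NimbusService connect(String host) throws Exception;
    }

    /** Walk the configured seed hosts and return a client talking to the leader. */
    public static NimbusService findLeader(List<String> seedHosts, NimbusConnector connector)
            throws Exception {
        for (String host : seedHosts) {
            try {
                NimbusService candidate = connector.connect(host);
                if (candidate.isLeader()) {
                    return candidate;          // only the leader accepts state-changing calls
                }
            } catch (Exception e) {
                // seed host down or unreachable; keep trying the remaining seeds
            }
        }
        throw new RuntimeException("No leader nimbus found among seeds: " + seedHosts);
    }
}
```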

As @ptgoetz mentioned in the JIRA, we do not want users' topologies to be lost once nimbus accepts them, and we also do not want to force all users to depend on a fully replicated storage layer like HDFS. In the current design, by adding a code-replication interface we guarantee that once a topology is in the active state it will be fully replicated, which seems to be a feature missing from your proposal. It is still a trade-off between availability and initial topology submission time, which users can choose based on their topology.replication.count config setting.
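To make the replication guarantee concrete, a pluggable code distributor of the kind described might expose an interface roughly like the one below. The names are illustrative assumptions, not the PR's actual interface; see the HDFSCodeDistributor mentioned in the commit log for a real implementation.

```java
import java.io.File;
import java.util.Map;

// Hypothetical shape of a pluggable code-replication interface as described above.
// Method and type names are illustrative, not the PR's actual API.
public interface CodeDistributor {
    /** Initialize with the storm configuration. */
    void prepare(Map<String, Object> conf) throws Exception;

    /** Publish a topology's jar/conf so other nimbus hosts can fetch it;
     *  returns a metadata file describing where the code lives. */
    File upload(String localDir, String topologyId) throws Exception;

    /** Fetch the topology code onto this host, given the metadata file. */
    void download(String topologyId, File metadata, String destDir) throws Exception;

    /** How many nimbus hosts currently hold a copy; the topology is only
     *  activated once this reaches the configured replication count. */
    int getReplicationCount(String topologyId) throws Exception;
}
```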

We also added a few more features: UI improvements, the nimbus summary being stored in ZK, thrift API modifications so users can see the replication factor of their topologies, and compatibility with the rolling-upgrade feature. In my opinion all of these are good admin tools, and this feature would be incomplete without them.

I appreciate any feedback you can provide based on your experience of running Nimbus HA in production for a year. Please take some time to review the current design and let us know if you have any concerns.

@longdafeng
Contributor

@Parth-Brahmbhatt, I don't suggest using "org.apache.curator.framework.recipes.leader".

In fact, we implemented HA with "org.apache.curator.framework.recipes.leader" in the first version of JStorm Nimbus HA, and it ran into a lot of problems when the ZooKeeper load was heavy. Later we used the low-level Curator API to implement HA instead; it is much more stable, even under heavy ZooKeeper load.

Right now, JStorm Nimbus HA is on its third version.
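For comparison with the LeaderLatch recipe, the "low-level Curator" approach described here boils down to creating an ephemeral znode directly and watching it. The sketch below is an assumption-laden illustration, not JStorm's code; the path and payload are placeholders.

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Watcher;

// Sketch of the "low-level Curator" approach: no recipe, just an ephemeral znode
// plus a watch. Path and payload are illustrative.
public class LowLevelElection {
    public static boolean tryBecomeMaster(CuratorFramework client, String hostPort,
                                           Watcher onMasterGone) throws Exception {
        try {
            client.create()
                  .withMode(CreateMode.EPHEMERAL)
                  .forPath("/nimbus_master", hostPort.getBytes("UTF-8"));
            return true;                         // we own the znode, we are master
        } catch (KeeperException.NodeExistsException e) {
            // Lost the race: watch the znode so we can re-compete when it disappears.
            client.checkExists().usingWatcher(onMasterGone).forPath("/nimbus_master");
            return false;
        }
    }
}
```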

@revans2
Contributor

revans2 commented Mar 24, 2015

@longdafeng I am glad to see the JStorm community starting to help out here. I respect your experience with this, especially since this is your third iteration on the code. ZK load is a big issue and concern for me, so I am all for trying to adopt what JStorm has done in the area of HA. Because we are working on a separate feature branch for this, where we can break APIs if need be, perhaps we can check in the code as it is now and file a follow-on JIRA to adopt the model, configs, and ideally some of the code that JStorm is using. This would also make combining the two simpler in the future.

@Parth-Brahmbhatt @ptgoetz do either of you have an opinion on this?

@ptgoetz
Member

ptgoetz commented Mar 24, 2015

@revans2 @longdafeng @Parth-Brahmbhatt I haven't had a chance to fully review the JStorm Nimbus HA implementation yet, but I'm open to incorporating any of its concepts/implementation.

But to be clear, we have to complete IP clearance before any decisions are made regarding the JStorm code.

Conflicts:
	storm-core/src/jvm/backtype/storm/utils/NimbusClient.java
	storm-core/test/clj/backtype/storm/security/auth/nimbus_auth_test.clj
Conflicts:
	STORM-UI-REST-API.md
	conf/defaults.yaml
	storm-core/src/clj/backtype/storm/daemon/nimbus.clj
	storm-core/src/clj/backtype/storm/ui/core.clj
	storm-core/src/jvm/backtype/storm/Config.java
	storm-core/src/jvm/backtype/storm/generated/TopologySummary.java
Conflicts:
	storm-core/src/jvm/backtype/storm/utils/NimbusClient.java
…o ensure nimbus sets up the correct code-distributor entries on startup.
Conflicts:
	storm-core/src/jvm/backtype/storm/utils/NimbusClient.java
Conflicts:
	storm-core/src/clj/backtype/storm/ui/core.clj
	storm-core/src/jvm/backtype/storm/Config.java
	storm-core/src/jvm/backtype/storm/utils/NimbusClient.java

Conflicts:
	storm-core/src/clj/backtype/storm/ui/core.clj
	storm-core/src/jvm/backtype/storm/Config.java
	storm-core/src/jvm/backtype/storm/utils/NimbusClient.java
…eral entries getting deleted. Adding a sleep before the code-sync thread executes ls /code-distributor/topology-id to ensure it gets the correct id back, so users don't have to wait for up to 5 minutes to submit a topology.
…tributor path as zookeeper does not guarantee Simultaneously Consistent Cross-Client Views unless sync is called.
@Parth-Brahmbhatt
Contributor Author

@revans2 @ptgoetz @harshach I have merged with master one more time. We are still using Curator's LeaderLatch recipe. Nimbus discovery is done via an API, so clients don't have to connect to ZooKeeper.

@harshach
Contributor

I am +1 on merging into master.

@Parth-Brahmbhatt
Contributor Author

@revans2 Can you please review this PR when you have time? Given that you have been involved in the original PR, I don't want to commit this until I get confirmation from you.

@revans2
Contributor

revans2 commented Aug 24, 2015

Sorry I took so long on this. I am +1 on merging this in. I see a +1 from @harshach before the upmerge and a +1 from @ptgoetz from a long time ago. The upmerge looks like it was mostly trivial changes so I am just going to merge this into master.
