Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Storm 1.0.0+ #177

Merged
merged 5 commits into from Nov 5, 2016
Merged

Support Storm 1.0.0+ #177

merged 5 commits into from Nov 5, 2016

Conversation

erikdw
Copy link
Collaborator

@erikdw erikdw commented Oct 31, 2016

See individual commits in this PR for more info.
Haven't tested yet, will update the PR when I have.

@erikdw erikdw mentioned this pull request Oct 31, 2016
(Also includes changes to resolve conflicts from merging erikdw/storm-mesos:1.0.0-WIP into master.)

I got the tests to work for `storm-1.0.0` based on DarinJ/storm:1.0.0-WIP with a number of changes:

Main changes for fixing tests:
* change "config maps" from `Map<String, TopologyDetails>` type (which is just *wrong*) to just `Map`.
* add in a couple of keys that are required by TopologyDetails via asserts now:
 * `topologyConf.put("topology.worker.max.heap.size.mb", 768.0);`
 * `topologyConf.put("topology.priority", 0);`
* change some constructor arguments:
 * removed final null param in `SupervisorDetails` case
 * added extra param in `Cluster` case (this one is also storm-1.0+ specific)

Other main changes:
* added storm-shim-1x module and corresponding profile to top-level pom.xml
* `import org.apache.storm.thrift.TBase;` instead of `import org.apache.thrift7.TBase;`
* updated `bin/build-release.sh` to support setting profile for 1x
* ran this sed to fixup stuff appropriately (note `gsed` from brew since I'm on Mac)
 * `find . -type f -not -path '*/\.*' -not -name '*.class' -not -path '*storm-shim-9x*' -not -path '*storm-shim-10x*' -print0 | xargs -0 gsed -i 's/backtype/org.apache/g'`
…anging the master branch to only support storm-1.0+
…of storm which are no longer supported in master, and also update the latest Mesos release to 1.0.1
@erikdw
Copy link
Collaborator Author

erikdw commented Nov 1, 2016

I've tested with some extra changes in the vagrant setup of this project. I need to spend a bit more time refining the vagrant-related settings, but the basis of the storm-1.0+ support is here. I had to make some other minor tweaks that I've rebased appropriately.

I considered collapsing the remaining shims, but opted against it for now. I suspect they might prove useful once storm-2.0 lands given that it is a huge rewrite (storm-core being changed from Clojure to Java).

Another thing I left out is support for nimbus.seeds -- so we're still reliant on the deprecated nimbus.host config option. When we understand how to use marathon we will come back to that.

@DarinJ can you please review when you have a chance?

…ns, and update default storm version from 1.0.0 to 1.0.2
@DarinJ
Copy link
Collaborator

DarinJ commented Nov 1, 2016

@erikdw my team was just discussing working on stom-1.0.2 a few days based off the WIP branch. I pulled this PR, will test out and make comments.

@DarinJ
Copy link
Collaborator

DarinJ commented Nov 1, 2016

Did a quick review of the code ... mostly just package name changes - nothing to worry about. LGTM. Going to run some test topo's a cluster.

@clharris
Copy link

clharris commented Nov 1, 2016

Hi @erikdw , you got it to run successfully on a cluster I'm guessing? I just tried running a test topo and got some version conflicts

2016-11-01 19:06:47.161 o.a.s.d.worker [ERROR] Error on initialization of server mk-worker java.lang.RuntimeException: Fail to construct messaging plugin from plugin backtype.storm.messaging.netty.Context at org.apache.storm.messaging.TransportFactory.makeContext(TransportFactory.java:53) ~[storm-core-1.0.2.jar:1.0.2] at org.apache.storm.daemon.worker$worker_data.invoke(worker.clj:266) ~[storm-core-1.0.2.jar:1.0.2] at org.apache.storm.daemon.worker$fn__8555$exec_fn__2466__auto__$reify__8557.run(worker.clj:611) ~[storm-core-1.0.2.jar:1.0.2] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.7.0_91] at javax.security.auth.Subject.doAs(Subject.java:415) ~[?:1.7.0_91] at org.apache.storm.daemon.worker$fn__8555$exec_fn__2466__auto____8556.invoke(worker.clj:609) ~[storm-core-1.0.2.jar:1.0.2] at clojure.lang.AFn.applyToHelper(AFn.java:178) ~[clojure-1.7.0.jar:?] at clojure.lang.AFn.applyTo(AFn.java:144) ~[clojure-1.7.0.jar:?] at clojure.core$apply.invoke(core.clj:630) ~[clojure-1.7.0.jar:?] at org.apache.storm.daemon.worker$fn__8555$mk_worker__8650.doInvoke(worker.clj:583) [storm-core-1.0.2.jar:1.0.2] at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.7.0.jar:?] at org.apache.storm.daemon.worker$_main.invoke(worker.clj:771) [storm-core-1.0.2.jar:1.0.2] at clojure.lang.AFn.applyToHelper(AFn.java:165) [clojure-1.7.0.jar:?] at clojure.lang.AFn.applyTo(AFn.java:144) [clojure-1.7.0.jar:?] at org.apache.storm.daemon.worker.main(Unknown Source) [storm-core-1.0.2.jar:1.0.2] Caused by: java.lang.ClassNotFoundException: backtype.storm.messaging.netty.Context at java.net.URLClassLoader$1.run(URLClassLoader.java:366) ~[?:1.7.0_91] at java.net.URLClassLoader$1.run(URLClassLoader.java:355) ~[?:1.7.0_91] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.7.0_91] at java.net.URLClassLoader.findClass(URLClassLoader.java:354) ~[?:1.7.0_91] at java.lang.ClassLoader.loadClass(ClassLoader.java:425) ~[?:1.7.0_91] at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) ~[?:1.7.0_91] at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ~[?:1.7.0_91] at java.lang.Class.forName0(Native Method) ~[?:1.7.0_91] at java.lang.Class.forName(Class.java:195) ~[?:1.7.0_91] at org.apache.storm.messaging.TransportFactory.makeContext(TransportFactory.java:38) ~[storm-core-1.0.2.jar:1.0.2] ... 14 more

Any thoughts on this?

@erikdw
Copy link
Collaborator Author

erikdw commented Nov 1, 2016

@clharris : don't recognize those errors. Yes, I tried this on a "cluster", but really it's the vagrant setup in this project -- but it uses mesos-master, mesos-slave/agent, MesosNimbus, MesosSupervisor, storm-ui, etc. Here's the storm-ui showing the worker running:

My guess would be that this is a storm conflict between previously deployed stuff and the newer version. I'm pretty sure it's not backwards compatible. i.e., if that was the case then you would need to:

  • wipe out storm state in storm-local on the Nimbus
  • wipe out storm state in ZooKeeper
  • make sure all the existing workers/supervisors are dead

@erikdw
Copy link
Collaborator Author

erikdw commented Nov 2, 2016

@clharris : upon looking closer at the error, I think this is a problem with your configuration somehow:

Error on initialization of server mk-worker java.lang.RuntimeException: Fail to construct messaging plugin from plugin backtype.storm.messaging.netty.Context

^ Note: backtype.storm.messaging.netty.Context
That should be: org.apache.storm.messaging.netty.Context
See this line in the storm.yaml of this PR.

If not your configuration, then maybe a problem with how you're building?

Here's how I built:

STORM_RELEASE=1.0.2 MESOS_RELEASE=1.0.1 bin/build-release.sh

@clharris
Copy link

clharris commented Nov 2, 2016

@erikdw, oops, you're right, I was using an old storm.yaml which had the wrong storm.messaging.transport. Sorry I didn't notice that one sooner. It still seems that I have something that's not configured quite right, but I'll have to look at it more tomorrow. Thanks for your help

@erikdw
Copy link
Collaborator Author

erikdw commented Nov 2, 2016

@clharris : phew, glad to help!

@clharris
Copy link

clharris commented Nov 2, 2016

@erikdw: just tinkered with it again this morning and everything is running great now! LGTM

@DarinJ
Copy link
Collaborator

DarinJ commented Nov 2, 2016

Tested +1 LGTM

@erikdw
Copy link
Collaborator Author

erikdw commented Nov 5, 2016

@DarinJ & @clharris : thanks for the help with testing and reviewing! Gonna merge and release the 1st storm-1.0+ image.

@erikdw erikdw merged commit f951315 into mesos:master Nov 5, 2016
@erikdw erikdw deleted the storm-1.0.0 branch November 5, 2016 21:37
@maverick2202
Copy link

maverick2202 commented Jan 9, 2017

I would like to test upgrading our storm-mesos framework to use storm 1.0. Is there documentation for upgrade step ? Can I just upgrade nimbus to storm 1.0 and have new workers/existing works on restart pull in storm 1.0 ?

This is assuming backward compatibility with a given topology running both new worker 1.0 and old worker 0.9.6.

@erikdw
Copy link
Collaborator Author

erikdw commented Jan 13, 2017

@maverick2202 : sorry for delayed response. This is not a storm-mesos question really, it's a storm question. But I can tell you that storm 1.0 is not directly backwards compatible with storm 0.9.x -- they changed the package paths for all classes from backtype.* to org.apache.*.

There is a parameter-based workaround you can try for allowing storm 0.9.x topologies to run on storm 1.0+ without needing to be recompiled, but I haven't ever tried it. It's been discussed on the storm user mailing lists. See these links:

It sounds like you need to wipe a lot of state to upgrade to storm-1.0 from the emails in that thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants