STORM-386 nodejs multilang protocol implementation and examples #177
|
Sorry for getting to this so late. This looks interesting. I'll need to experiment before I approve, but I will definitely review it. I'd be interested in hearing more about how you are using it. |
|
Also, could you file a JIRA for this and change the pull request title to include the JIRA number? |
|
The JIRA is STORM-386 as mentioned in the description (will fix the title once the developer is back in the office). We are using it in production already to migrate nodejs code to storm (low throughput low latency application). |
We generally let git keep track of the authorship of files instead of marking the files with owners.
|
The idea is definitely interesting. I have a few suggestions though. The first is with the 'done' callback. I can see how it is needed to support async processing, but to me it feels like it is not likely to be too common. Then again, I have not done anything significant with javascript and nothing with nodejs, so I just don't know what the libraries are like. My other issue is around packaging and testing. If we are going to officially support nodejs as an execution environment, similar to how we support python and ruby, I would like to see us running unit tests with it like we do with python and ruby. I would also like to see it packaged and distributed with storm, and not in the resources of storm-starter. Alternatively, we could set it up as a separate jar similar to the kafka spout, which would have storm.js in the resources directory, and then ask users to add it as a maven dependency when packaging their own topologies. Either way, though, I would like to see us test this. |
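To make the 'done' callback question concrete, here is a minimal sketch of why an asynchronous bolt needs one. The names `makeBolt`, `process`, and `done` are assumptions chosen for illustration; this is not the storm.js API from this pull request.

```javascript
// Hypothetical bolt shape for illustration only. The names makeBolt, process,
// and done are assumptions, not the storm.js API from this pull request.
function makeBolt(lookup) {
  return {
    // process() returns before the lookup finishes, so the caller only
    // learns the outcome when done() is invoked.
    process(tuple, done) {
      lookup(tuple.values[0], (err, result) => {
        if (err) {
          done(err); // the caller could fail the tuple here
          return;
        }
        done(null, [result]); // the caller could emit and ack here
      });
    },
  };
}

// Usage with a fake asynchronous lookup that answers on a later tick.
const bolt = makeBolt((word, callback) => {
  setImmediate(() => callback(null, word.toUpperCase()));
});
bolt.process({ id: '1', values: ['storm'] }, (err, values) => {
  console.log(err, values);
});
```

Without a callback of this kind, the library would have to treat the tuple as finished as soon as `process` returns, which is before the asynchronous lookup has produced a result.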
|
On your first comment: |
|
Your logic seems fine to me. Like I said, I am not super familiar with the nodejs APIs. I would, however, like to see tests and formal packaging for it. If we are going to support it, we should fully support it. I am also curious how likely others would be to use this library. Although it is a relatively small amount of code, it could become a maintenance issue. |
|
Unit Tests: We can follow the way storm.rb is handled in the repository (3 copies of the same file) and translate the ruby unit tests into nodejs. I can see why you don't want a third window being broken. But I ask that we move the multilang cleanup into another JIRA, which leads me to ... Packaging: Deploying and upgrading ruby/python/nodejs code is really different from deploying Java code. Java code in storm is JARed-with-dependencies. Packing nodejs code with its entire npm dependency tree and automating it with maven is possible, but you lose the no-need-to-do-the-compile-thingy nature of the language. So even if we do call gem/pip/npm install from within the maven build process, pack everything in a JAR and ship it with storm, I am not sure I would use it that way. Instead, the JAR would just be pure java code pointing to the folder where the ruby/python/nodejs code is. And that is also the folder where storm.rb/py/js would be copied to. So there is no reusable component you can ship with storm, other than the official storm.js file that was tested to work with this version of Storm. Maintenance: The amount of maintenance of the multilang project is approximately numberOfFeatures times numberOfLanguages times numberOfSerializationProtocols. |
…onsibility to the process method
…for json. Delete git add brain/storm.js\nDelete erroneous method override
… as the pid number and then send the pid to parent
Fixed stdin parsing to support separator split across chunks
…nodejs-clojure-test
…/dev/resources/storm.js
Anya add nodejs clojure test
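The stdin fix mentioned in the commits above concerns message framing. Here is a minimal sketch of one way to handle a separator that arrives split across chunks. It assumes the multilang framing in which each JSON message is followed by a line containing only `end`. The helper name `createMessageParser` is hypothetical, and this is not the code from the commit.

```javascript
// Hypothetical helper, not the storm.js code from this pull request.
// Assumes each JSON message is terminated by a line containing only "end".
function createMessageParser(onMessage) {
  const SEPARATOR = '\nend\n';
  let buffer = ''; // holds any incomplete trailing data between chunks

  return function feed(chunk) {
    buffer += chunk;
    let index;
    // Search the accumulated buffer, not the single chunk, so a separator
    // that straddles two chunks is still found.
    while ((index = buffer.indexOf(SEPARATOR)) !== -1) {
      const raw = buffer.slice(0, index);
      buffer = buffer.slice(index + SEPARATOR.length);
      if (raw.trim().length > 0) {
        onMessage(JSON.parse(raw));
      }
    }
  };
}

// Usage: the separator is split between the first and second chunk.
const received = [];
const feed = createMessageParser((message) => received.push(message));
feed('{"command":"next"}\nen');
feed('d\n{"command":"ack","id":"1"}\nend\n');
console.log(received);

// In a real component the parser would be wired to stdin, for example:
//   process.stdin.setEncoding('utf8');
//   process.stdin.on('data', feed);
```

Parsing each chunk on its own would miss the first message in this example, because neither chunk contains the full separator.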
|
Hi, the latest version has been working in production for quite a while, and includes unit tests. |
|
Sorry I have been fighting some fires and am trying to catch up on things now. I will try it out. |
|
Great to hear that the latest code is in prod now. I ran into a few problems, but none of them are really in your code. The test didn't fail when I didn't have nodejs installed, but this is because of a bug in util.clj (STORM-461). Also, I really hate having two copies of storm.js in there, but we have the same thing for ruby and python. I filed STORM-462 for this too. The last problem I had was with documentation. We don't indicate anywhere what needs to be installed to build/run the tests. Could you add a section to DEVELOPER.md that indicates that you need ruby, python, and nodejs for all of the tests to pass? I also assume that you are going to be able to help support this long term? |
8dc1926 to
d7e446f
Compare
Indentation seems off here.
|
I am OK with adding this to multilang if others are OK. I agree we can move it later if need be. |
|
For testing, I think I need #231 first. |
|
I've gone through the code and ran the build with node v0.10.28 installed on OS X; all the tests pass. I ran a single-node cluster and deployed WordCountTopologyNode, and it stops after less than a minute with this spout error: apache-storm-0.9.3-incubating-SNAPSHOT/storm-local/supervisor/stormdist/wordcount-1-1409680231/resources/randomsentence.js, and my topology complete latency is at 30757.945. I haven't debugged enough to find the cause. |
|
Regarding the ReferenceError: it was caused by a bug in randomsentence.js that happens only in the case of a "fail" for one of the tuples. I fixed the bug (fix committed), but I am still not sure why you got a "fail" in the split sentences example. |
|
@anyatch Thanks. Will test the changes. The above error showed up because the shell process broke, probably because of the randomsentence.js bug. |
|
@anyatch I don't see the above mentioned issue anymore. But for some reason |
Looks like you are anchoring the tuple here, but there is no ack in WordCountTopologyNode.WordCount Bolt. This might be what is causing the fail count to go up, and also the completeLatency. I removed the anchorTupleId and, in storm.js, the anchors as part of the message. There are no failed tuples and a lower completeLatency of 41.
WordCountTopologyNode.WordCount Bolt extends BaseBasicBolt, which is then wrapped by BasicBoltExecutor, which AFAIK acks.
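For readers following the anchoring discussion, here is a rough sketch of what anchored and unanchored emits and an ack look like on the wire. The JSON shapes follow the multilang protocol as I understand it (`command`, `anchors`, `tuple`, `id`). The helper names `emitMessage`, `ackMessage`, and `frame` are hypothetical and are not taken from storm.js.

```javascript
// Hypothetical helpers for illustration; not the storm.js API from this PR.
// Message shapes are assumed to follow the multilang JSON protocol.

// Build an emit message. Passing anchor ids ties the new tuple to its inputs,
// so a downstream failure or timeout is reported back to the spout.
function emitMessage(values, anchorIds) {
  const message = { command: 'emit', tuple: values };
  if (anchorIds && anchorIds.length > 0) {
    message.anchors = anchorIds;
  }
  return message;
}

// Build an ack for an input tuple that has been fully processed.
function ackMessage(tupleId) {
  return { command: 'ack', id: tupleId };
}

// Frame a message for stdout: JSON followed by a line containing only "end".
function frame(message) {
  return JSON.stringify(message) + '\nend\n';
}

// Usage: split a sentence, anchor each word to the input tuple, then ack it.
const input = { id: 'tuple-1', values: ['hello storm'] };
const output = input.values[0]
  .split(' ')
  .map((word) => frame(emitMessage([word], [input.id])))
  .concat(frame(ackMessage(input.id)))
  .join('');
console.log(output);
```

An anchored tuple that no downstream bolt ever acks will eventually time out and be counted as failed, which is why the question of whether the WordCount bolt acks matters here.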
|
RandomSentenceSpout adds a sleep(100) in nextTuple and it's not in randomsentence.js; that explains the high completeLatency. |
|
I am +1 on merging this as is and fixing the above-mentioned issues as part of another JIRA. I don't see any issues with the API beyond small fixes in the example topologies. Thanks. |
|
@harshach Isn't it better practice to set topology.sleep.spout.wait.strategy.time.ms to 100 instead of adding a sleep to the spout? Even so, I could add the equivalent of sleep(100) to randomsentence.js. |
|
@itaifrenkel I agree topology.sleep.spout.wait.strategy.time.ms is a better option. |
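As a rough illustration of the "equivalent of sleep(100)" option discussed above, a Node.js spout could delay each emit with a timer instead of blocking. The names `makeThrottledSpout`, `nextTuple`, `emit`, and `done` are assumptions for this sketch and are not the actual randomsentence.js code.

```javascript
// Hypothetical spout shape for illustration; not the randomsentence.js code.
// Node.js has no blocking sleep, so the delay is done with setTimeout and
// completion is signalled through the done callback.
function makeThrottledSpout(sentences, delayMs, emit) {
  let index = 0;
  return {
    nextTuple(done) {
      setTimeout(() => {
        emit([sentences[index % sentences.length]]);
        index += 1;
        done();
      }, delayMs);
    },
  };
}

// Usage: emit one sentence roughly every 100 ms, three times in total.
const spout = makeThrottledSpout(
  ['the cow jumped over the moon', 'an apple a day keeps the doctor away'],
  100,
  (values) => console.log(values)
);
let remaining = 3;
(function loop() {
  if (remaining === 0) return;
  remaining -= 1;
  spout.nextTuple(loop);
})();
```

The configuration approach agreed on in this thread keeps the delay out of the spout code, so a sketch like this would only apply if the delay had to live in the script itself.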
|
Sorry this took so long to get in. +1, I just merged it to master. Not sure how your team wants to be recognized in the README? There are a lot of contributors to this, but if you want to give us a list of names I can add them in. |
|
I just reviewed the merge. It seems like we placed storm.js in the wrong folder when compared to storm.py. Here are the correction steps: mkdir storm-core/src/multilang/js |
|
Could you file a new pull request for those changes? You can reopen the same JIRA if you like, or file a new one. |
|
Opened #265 |
|
If you could please add @anyatch and @itaifrenkel to the readme that would be nice. thanks. |
|
@itaifrenkel and @anyatch both of you should be in README.markdown now. Thanks again for your contribution. |
Improve Test Timeouts. [BUG 6844307]
More details at:
https://issues.apache.org/jira/browse/STORM-386