-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a (local mode) Scalding Interpreter to Zeppelin #561
Conversation
…to get console output next. And add tests, and make it work for HDFS.
…need some debugging.
…ala console output. Need to figure that out.
…- will want to move this to Scalding itself
/** | ||
* Scalding interpreter for Zeppelin. Based off the Spark interpreter code. | ||
* | ||
* @author sriramkrishnan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you be willing to remove that information from the comments, please?
It is not strictly documented yet, but Zeppelin so far, as many other ASF projects (Hadoop, Zookeeper, Avro, etc), do not encourage use of @author tags.
We definitely want and keep contributors credits, but we use git, JIRA and mailing list history, so nothing will be lost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, not a problem.
@sriramkrishnan Great contribution, thank you! I think there are just a couple of things need to be take care
|
Should we add ScaldingInterpreter to ZEPPELIN_INTERPRETERS in ZeppelinConfiguration.java? tw-172-25-131-152 incubator-zeppelin (presto) $ git diff master -- zeppelin-zengine/src/main/java/org/apache/zeppelin/conf/ZeppelinConfiguration.java
|
@bzz thanks for the review. I will address your other comments and update the PR in a day or so. @prasadwagle good point. I will add that too. |
Addressed all PR comments, except for docs, license as @bzz has suggested, which I will do next. Build is also green now. ps: I do need the hadoop-client jar as the Scalding REPL uses it even for local mode (and provided doesn't work there AFAICT). |
Added docs. @bzz I believe I have addressed all comments, except adding LICENSE info to zeppelin-distribution/src/bin_license/LICENSE. Do you happen to have a script for doing that already? I am dreading doing this by hand. |
I've tested it in my local machine and worked well. Thanks for this contribution. |
… REPL needs org.apache.hadoop.conf.Configuration even in local mode.
Thanks @Leemoonsoo I believe I have addressed all your comments, except for the jgrapht. Note that the jgrapht here is thirdparty:jgrapht, which seems to be a version that Cascading has forked. I am inclined to leave it as it, but open to suggestions. |
bump What else do we need to do to get this merged? Thanks! |
I have queried and waiting for answer about license of thirdparty:jgraph from cascading. One of the fastest way to get this merged, making it optional like we did it for geode https://github.com/apache/incubator-zeppelin/pull/379/files. That would avoid license problem of binary dependency and 3rd party maven repo. Then i think it can be merged. Eventually, i hope https://issues.apache.org/jira/browse/ZEPPELIN-546 solve the problem. |
I am OK making the Scalding interpreter optional with a -Pscalding flag. Does that also mean I have to revert all my changes to the LICENSE file? |
Yes it is, changes to LICENSE file and files that has been added to licenses directory need to be reverted. And you probably want to describe -Pscalding flag in https://github.com/apache/incubator-zeppelin/blob/master/README.md |
fyi, i have got replied about license, like
|
Thanks @Leemoonsoo. So how should we proceed? Should I add it as LGPL? Also, are you OK with using the conjars repo now? Or do you recommend that I add a -Pscalding flag. Would be great to get this merged either ways! |
Apart from license for jgrapht, using 3rd party repo (conjars) need to be avoided. How about revert commits for the LICENSE / files added under licenses dir and add -Pscalding flag, for this pullrequest. After it is merged, create another issue for removing -Pscalding and continue to work. Would this way works for you? |
@Leemoonsoo as you suggested, I have reverted commits for the LICENSE / files added under licenses dir, added -Pscalding flag, and updated the docs. Are you OK with shipping/merging this PR now? |
And we also have a green build now. |
LGTM. @sriramkrishnan Thanks for the contribution! |
Thanks @Leemoonsoo for the review. Could one of the committers please merge if there are no other objections? |
I'm merging it, if there're no more discussions. |
Thanks! |
Filed https://issues.apache.org/jira/browse/ZEPPELIN-555 to track addition of Hadoop mode to the Scalding interpreter. I may pick it up in my "free time", but I have added a description there if anyone wants to tackle it. CC @prasadwagle |
### What is this PR for? Since we avoid publish official binary package built with 3rd party maven repository, currently we can not include [Scalding](#561) dependency in binary package of Zeppelin. But the test for Scalding interpreter should be done in travis. So I just added `-Pscalding` flag to `.travis.yml` file. ### What type of PR is it? Improvement ### Todos * [x] - Add -Pscalding flag to .travis.yml. ### Is there a relevant Jira issue? No. But you may checkout [here](https://github.com/apache/incubator-zeppelin/pull/561/files#r48471634). ### How should this be tested? ### Screenshots (if appropriate) ### Questions: * Does the licenses files need update? No * Is there breaking changes for older versions? No * Does this needs documentation? No Author: Ryu Ah young <fbdkdud93@hanmail.net> Closes #594 from AhyoungRyu/MODIFY-TRAVIS-FILE and squashes the following commits: c7edbf9 [Ryu Ah young] Add -Pscalding build option to .travis.yml
RB_ID=780817
What is this PR for?
Scalding (https://github.com/twitter/scalding) is a Scala library for writing MapReduce jobs.
This issue tracks the addition of a Scalding interpreter for Zeppelin. To keep this work incremental, this PR will focus on just a local mode implementation. The Hadoop mode can be a subsequent addition.
What type of PR is it?
Feature
Todos
Is there a relevant Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-526
How should this be tested?
Run the tests in: scalding/src/test/java/org/apache/zeppelin/scalding/ScaldingInterpreterTest.java
Screenshots
### Questions: - This could use documentation, which could just be the example in the screenshot. Where can I contribute that?