ZEPPELIN-44 Interpreter for Apache Flink #75
@Leemoonsoo Wow, does it work?
Review comment on the line `// jarr up`: 'jar'?
@jongyoul Yes, it works!
I'm very excited to see Flink support in Zeppelin 👍
@Leemoonsoo Great!!
This PR has some tests that use FlinkMiniCluster. I think everything is ready except for this test failure. Any advice is much appreciated.
I forwarded the error to the Flink dev mailing list. Hopefully somebody from the Flink community can help with this.
Fabian Hueske on dev@flink.apache.org replies: the Flink interpreter PR for Apache Zeppelin is blocked by a failing test Thanks, Fabian
Till Rohrmann on dev@flink.apache.org replies:
Stephan Ewen on dev@flink.apache.org replies:
repository/org/slf4j/slf4j-log4j12/1.7.10/slf4j-log4j12- Seems that SLF4J has more than one binding. I remember that Akka crashed As a simple fix, can you try and exclude the SLF4J jar from your build
Till Rohrmann on dev@flink.apache.org replies:
Stephan Ewen on dev@flink.apache.org replies:
I figured out the problem: the protobuf-java version is wrong. As a quick workaround, you can set the protobuf-java version to 2.5.0 in the zeppelin-flink
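Both suspects raised in the replies above (a second SLF4J binding and a mismatched protobuf-java) come down to the wrong jar winning on the classpath. The following is a minimal diagnostic sketch, not something posted in this thread: it prints where a class was actually loaded from, using only JDK calls. The class name `ClassOriginSketch` and the helper `originOf` are hypothetical names chosen for illustration, and the two probed class names are assumptions about what those jars contain.

```java
import java.security.CodeSource;

// Hypothetical helper for illustration; not part of Zeppelin or Flink.
class ClassOriginSketch {

    /** Returns the jar or directory a class was loaded from, or a placeholder string. */
    static String originOf(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader (e.g. java.lang.String) have no code source.
            return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Assumed probe classes: one from protobuf-java, one from an SLF4J 1.7.x binding.
        String[] probes = {
            "com.google.protobuf.Message",
            "org.slf4j.impl.StaticLoggerBinder"
        };
        for (String name : probes) {
            System.out.println(name + " -> " + originOf(name));
        }
    }
}
```

Running this with the same classpath as the failing test would show which protobuf-java jar is picked up, which can then be compared against the 2.5.0 version suggested above.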
I really appreciate you investigating the problem.
Okay, so this pull request is not blocked by anything from the Flink side right now?
@rmetzger Not blocked by anything from the Flink side.
1b8af5d to 460cf46
After resolving the Zeppelin-side issue in #88, the tests are passing.
Merging it if there is no more discussion :-)
W000t!
…ron jobs takes long time or gets stuck

### What is this PR for?
The cron scheduler can easily get stuck when one of the cron jobs takes a long time or gets stuck itself. I sometimes come across the issue that the cron scheduler suddenly stops working. According to the thread dump of ZeppelinServer, all of the DefaultQuartzScheduler_Worker threads were waiting for their job's completion and there was no thread left to launch a new job.

Here are the contents of the thread dump:

```
"DefaultQuartzScheduler_Worker-10" #76 prio=5 os_prio=0 tid=0x00007fb41d3b4000 nid=0x1b521 sleeping[0x00007fb3daef1000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.zeppelin.notebook.Notebook$CronJob.execute(Notebook.java:889)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
    - locked <0x00000000c0a7dbf0> (a java.lang.Object)

   Locked ownable synchronizers:
    - None

"DefaultQuartzScheduler_Worker-9" #75 prio=5 os_prio=0 tid=0x00007fb41d3b2000 nid=0x1b520 waiting on condition [0x00007fb3daff2000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.zeppelin.notebook.Notebook$CronJob.execute(Notebook.java:889)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
    - locked <0x00000000c0a7a470> (a java.lang.Object)

   Locked ownable synchronizers:
    - None

...

"DefaultQuartzScheduler_Worker-2" #68 prio=5 os_prio=0 tid=0x00007fb41d3c8800 nid=0x1b519 waiting on condition [0x00007fb3da473000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.zeppelin.notebook.Notebook$CronJob.execute(Notebook.java:889)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
    - locked <0x00000000c0a7a7b0> (a java.lang.Object)

   Locked ownable synchronizers:
    - None

"DefaultQuartzScheduler_Worker-1" #67 prio=5 os_prio=0 tid=0x00007fb41d3cc800 nid=0x1b518 waiting on condition [0x00007fb3da372000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.zeppelin.notebook.Notebook$CronJob.execute(Notebook.java:889)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
    - locked <0x00000000c0a7dd90> (a java.lang.Object)

   Locked ownable synchronizers:
    - None
```

The above thread dump shows that all of the worker threads are stuck at https://github.com/apache/zeppelin/blob/v0.7.3/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Notebook.java#L889. One way to reproduce this kind of issue is to create a paragraph whose status is "READY" and set it to "disable run". That leaves the paragraph status at "READY" permanently, so `note.isTerminated()` never becomes `true`.

To fix this issue, the following two improvements have been made in this PR:

1. Remove the unnecessary `while (!note.isTerminated()) { ... }` block because the execution of all of the paragraphs is finished after `note.runAll()`.
2. Skip the cron execution if there is a running or pending paragraph. That prevents the Zeppelin cron scheduler from getting stuck on a long-running paragraph whose execution duration is greater than the cron execution cycle.

### What type of PR is it?
[Bug]

### Todos

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-3077

### How should this be tested?
* Tested manually.
  1. The cron scheduler does not get stuck if there is a paragraph whose status is "READY" and "disable run".
  2. The following message is printed to the log file when the cron job is launched while the previous cron job is still running.
     * `execution of the cron job is skipped because there is a running or pending paragraph (note id: XXXXXXXXX)`

### Screenshots (if appropriate)

### Questions:
* Do the license files need an update? No.
* Are there breaking changes for older versions? No.
* Does this need documentation? Yes. This PR changed the behavior of the cron job so that it does not run if there is a running or pending paragraph. Thus, the documentation `docs/usage/other_features/cron_scheduler.md` was also added by this PR. Its layout is as follows:

<img width="711" alt="screen shot 2017-11-28 at 18 30 54" src="https://user-images.githubusercontent.com/31149688/33312407-20664e02-d46b-11e7-9715-9e2562d5e064.png">

Author: Keiji Yoshida <kjmrknsn@gmail.com>

Closes #2687 from kjmrknsn/ZEPPELIN-3077 and squashes the following commits:

81e7218 [Keiji Yoshida] [ZEPPELIN-3077] Cron scheduler is easy to get stuck when one of the cron jobs takes long time or gets stuck
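The two changes described in that commit message can be sketched in isolation. This is a simplified model under stated assumptions, not Zeppelin's actual `Notebook.CronJob` code: `CronSkipSketch`, its nested `Note`, `Paragraph`, and `Status` types, and the `isRunningOrPending` and `executeCron` helpers are all hypothetical stand-ins, and `runAll()` here runs synchronously. Only the logged message text is taken from the commit message.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the cron behavior; not Zeppelin's real classes.
class CronSkipSketch {

    enum Status { READY, PENDING, RUNNING, FINISHED }

    static class Paragraph {
        Status status;
        final boolean enabled; // false models a paragraph set to "disable run"

        Paragraph(Status status, boolean enabled) {
            this.status = status;
            this.enabled = enabled;
        }
    }

    static class Note {
        final String id;
        final List<Paragraph> paragraphs = new ArrayList<>();

        Note(String id) {
            this.id = id;
        }

        void add(Paragraph p) {
            paragraphs.add(p);
        }

        /** True if any paragraph is still running or queued to run. */
        boolean isRunningOrPending() {
            for (Paragraph p : paragraphs) {
                if (p.status == Status.RUNNING || p.status == Status.PENDING) {
                    return true;
                }
            }
            return false;
        }

        /** Runs every enabled paragraph; disabled ones keep their current status. */
        void runAll() {
            for (Paragraph p : paragraphs) {
                if (p.enabled) {
                    p.status = Status.FINISHED;
                }
            }
        }
    }

    /** One cron tick: skip when the note is busy, otherwise run it without waiting in a loop. */
    static boolean executeCron(Note note) {
        if (note.isRunningOrPending()) {
            System.out.println("execution of the cron job is skipped because there is a "
                + "running or pending paragraph (note id: " + note.id + ")");
            return false;
        }
        note.runAll();
        // No "while (!note.isTerminated())" wait here, so a paragraph that stays
        // READY forever cannot pin the scheduler's worker thread.
        return true;
    }

    public static void main(String[] args) {
        Note note = new Note("XXXXXXXXX");
        note.add(new Paragraph(Status.READY, true));
        note.add(new Paragraph(Status.READY, false)); // "disable run" paragraph
        System.out.println("first tick ran: " + executeCron(note));

        note.paragraphs.get(0).status = Status.RUNNING; // simulate a long-running paragraph
        System.out.println("second tick ran: " + executeCron(note));
    }
}
```

In this model the disabled paragraph stays `READY` after the first tick, yet the tick still returns, which mirrors the reproduction case from the commit message.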
…ack_notifications_fixed_branch to V_1.0.0 * commit '199cb1790f5ff03eee3be91fdeedd5ff08f7d4a5': [ZP-252] add slack integration
Interpreter for Apache Flink.
The Flink folks helped a lot in writing the interpreter. Thanks so much! Some code is copied from Flink's development branch. Once Flink releases 0.9, the copied code and the snapshot repository configuration will be removed.
Build
If no options are given, the build targets Flink 0.9.0-milestone-1 by default.
In combination with Zeppelin, it is a good idea to use 0.9-SNAPSHOT, because it supports `.collect()`, which helps a lot in getting result data and displaying it in Zeppelin.
So you might want to build against 0.9-SNAPSHOT instead.
Screenshot