Create kernel launch application for Toree Scala kernels to enable pull-mode #34

kevin-bates · 2017-06-19T15:39:39Z

Similar to the launch script for the PySpark kernel (see launch_ipykernel.py), the Toree/Scala kernel invocations could also use a launch application that is responsible for constructing the information needed for location-relative connection files, thereby dramatically decreasing the likelihood of port conflicts. This would include identifying local ports and the public IP, as well as generating the key used to sign socket traffic.
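As a rough illustration of what such a launcher has to assemble, here is a minimal Python sketch. The function names (`find_free_ports`, `build_connection_info`, `write_connection_file`) are made up for this example and are not taken from launch_ipykernel.py or Toree; the field names follow the standard Jupyter connection file layout.

```python
import json
import socket
import uuid


def find_free_ports(count, ip="127.0.0.1"):
    """Ask the OS for `count` currently unused TCP ports by binding to port 0.

    All sockets stay open until every port has been chosen, so the
    returned ports are guaranteed to be distinct from each other.
    """
    sockets = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.bind((ip, 0))  # port 0 lets the OS pick a free port
            sockets.append(s)
        return [s.getsockname()[1] for s in sockets]
    finally:
        for s in sockets:
            s.close()


def build_connection_info(ip="127.0.0.1"):
    """Build the dictionary that would be written to the connection file."""
    names = ["shell_port", "iopub_port", "stdin_port", "control_port", "hb_port"]
    info = dict(zip(names, find_free_ports(len(names), ip)))
    info.update(
        ip=ip,
        key=str(uuid.uuid4()),  # assumption: a random UUID string as signing key
        transport="tcp",
        signature_scheme="hmac-sha256",
    )
    return info


def write_connection_file(path, info):
    """Write the connection information as JSON to `path`."""
    with open(path, "w") as f:
        json.dump(info, f, indent=2)
```

Note that the ports are released before the kernel binds them, so another process could in principle grab one in between. Choosing the ports on the node where the kernel runs reduces the chance of a conflict but does not eliminate it.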

One approach could be to modify Toree itself to detect a non-existent connection file and produce the applicable information at startup, writing that information out to the specified file. However, a launch application gives us a place for other forms of logic (e.g., workspace management) that we might not otherwise have.

LK-Tmac1 · 2017-06-19T16:59:19Z

If we want to modify Toree, do we want to open an issue on JIRA and later a PR?

kevin-bates · 2017-06-19T19:04:38Z

Yes - that is correct. However, I believe we should implement a launch application instead, since that gives us the most flexibility and consistency (with no kernel update necessary). Once we produce a few such launch scripts (these are probably language-specific items, not kernel-specific), we can document a template so that launchers for kernels in other languages can be produced.

ckadner · 2017-06-20T01:50:03Z

These "launchers" have 2 aspects:

1. finding available local ports and creating the connection file (different per language)
2. launching the actual kernel process (different per kernel)

In Spark we have compiled languages (Java, Scala) and scripted languages (Python, R). For the compiled languages we need to build and package a small launcher application as opposed to mere launcher scripts. For that we will need to add some build infrastructure, i.e. Maven or SBT to build and package jar files.

kevin-bates · 2017-06-20T15:58:47Z

+1 @ckadner
Regarding item 2, I was hoping the kernel process to launch would be conveyed as a parameter to the launch script/application, thereby only requiring language-specific launchers. I suspect you meant the same, but I just want to be sure we only need on the order of N launchers and not NxM.

Also, it seems like the 'language' aspect of these launchers is really a function of the kernel's implementation language. For example, the Python and R kernels in Toree require that a Scala application be launched. As a result, the launcher for these kernels would still be written in Scala.
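To make the "kernel as a parameter" idea concrete, here is a small hypothetical sketch in Python. The argument layout (connection file first, then `--`, then the kernel command) and the `{connection_file}` placeholder are assumptions invented for this illustration and are not an existing Elyra or Toree convention.

```python
def split_launcher_args(argv):
    """Split launcher arguments from the kernel command to run.

    Expected (hypothetical) layout:
        <connection_file> -- <kernel command...>
    Any "{connection_file}" placeholder in the kernel command is replaced
    with the actual path, so one launcher can start many different kernels.
    """
    if "--" not in argv:
        raise ValueError("expected '--' before the kernel command")
    idx = argv.index("--")
    launcher_args, kernel_cmd = argv[:idx], argv[idx + 1:]
    if len(launcher_args) != 1 or not kernel_cmd:
        raise ValueError("usage: <connection_file> -- <kernel command...>")
    connection_file = launcher_args[0]
    resolved = [arg.replace("{connection_file}", connection_file) for arg in kernel_cmd]
    return connection_file, resolved
```

With this shape, the launcher first creates the connection file and then starts the resolved command, so only one launcher per implementation language is needed.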

LK-Tmac1 · 2017-06-20T17:37:07Z

Yes, I agree that we'd better create a launch application instead of changing the Toree kernel.

Would the easiest way be to do something similar to the launch_ipykernel.py script?

That is, simply add another Python script that handles the non-existent connection file and have run.sh call this script before the spark-submit.

kevin-bates · 2017-06-20T18:28:39Z

Generation of the connection file needs to take place on the destination node, and it's via spark-submit that this occurs. I don't know spark-submit well enough to know if it can launch a script/application in one language and then have that thing start something in another language. I suspect spark-submit determines the interpreter based on the launch target - so I doubt switching languages would work. (Then again, I could be way off-base.)

ckadner · 2017-06-20T20:32:36Z

Kevin is correct about this:

spark-submit determines the interpreter based on the launch target

LK-Tmac1 · 2017-06-20T23:31:55Z

I see your point. Since the connection file should be on the node where the kernel will be running, we need to create the connection file after run.sh is called.

I am going to develop a layer (let's call it ToreeLauncher for now) around the Main method of Toree to generate the missing connection file, after spark-submit is issued in run.sh but before the kernel starts.

ckadner · 2017-06-21T02:05:36Z

@liukun1016 -- to better understand how the IPython kernel finds free ports and creates the connection file, start by taking a look at IPKernelApp.initialize(). For finding free ports you could also look at Akka (there the code is already in Scala), but for the session key I think we need to replicate what IPython does in ... jupyter_client/session.py
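For reference, the key matters because every Jupyter message carries an HMAC signature computed with it. The following is a minimal Python sketch of that scheme as described in the Jupyter messaging specification (HMAC-SHA256 over the serialized header, parent header, metadata and content, in that order). The helper names `new_key` and `sign` are made up for this example and are not the functions in jupyter_client/session.py.

```python
import hashlib
import hmac
import uuid


def new_key():
    """Generate a random signing key (assumption: a UUID4 string as bytes)."""
    return str(uuid.uuid4()).encode("ascii")


def sign(key, header, parent_header, metadata, content):
    """Return the hex HMAC-SHA256 signature over the four serialized parts.

    Each part is expected to be the JSON-serialized bytes of that
    section of the message, exactly as sent on the wire.
    """
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for part in (header, parent_header, metadata, content):
        mac.update(part)
    return mac.hexdigest()
```

Whatever key the launcher writes into the connection file must be the same one the client uses, otherwise the kernel will reject messages as having invalid signatures.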

Just a nit-pick on terminology 😉 ... the "launch application" you are writing is not really a "wrapper" ... it will simply call the ToreeMain application once it is done creating the connection file.

LK-Tmac1 · 2017-06-21T04:25:38Z

@ckadner I see. Yes, "wrapper" might not be a good name. (Updated the comment.)

Will take a look at your references and thanks a lot!

kevin-bates · 2017-06-27T16:53:42Z

Please take a look at PR #43. That PR makes a change to the kernelspec format (where connection_file_mode can be specified) - so just wanted to bring that to your attention since it might affect your merge. Also, once pull mode is implemented, socket mode is a small step away.

LK-Tmac1 · 2017-06-29T23:30:01Z

The source code of the Toree Launcher application: https://github.com/liukun1016/ToreeLauncher/blob/master/src/main/scala/launcher/ToreeLauncher.scala

Pull mode is already finished. Will test the socket mode.

kevin-bates · 2017-07-03T16:42:24Z

@liukun1016 - the launcher looks good! Just have a couple comments/suggestions...

We should make sure the close calls for the BufferedWriter and Socket are in finally blocks in case the preceding write calls fail.
Let's move the block of code for sending the file back on the socket into the block of code that detects the file doesn't exist (so just after creating the file). The reason is that the user may change the connection mode from 'socket' to 'push' (although that is extremely unlikely), and if they don't remove the --response-address parameter from the kernel.json file, the code will try to connect to a socket that won't exist, since Elyra has already pushed the file. (Actually, it looks like the code will exit(-1) with an invalid socket address message since the value of that parameter will be ERROR__NO__KERNEL_RESPONSE_ADDRESS, but same effect.)
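Both suggestions can be sketched together. This is a Python illustration of the control flow only, not the Scala launcher code. The function names and the `host:port` form of the response address are assumptions made for this example.

```python
import json
import os
import socket


def send_connection_info(response_address, info):
    """Send the connection info as JSON to a 'host:port' response address."""
    host, port = response_address.rsplit(":", 1)
    sock = socket.create_connection((host, int(port)), timeout=5)
    try:
        sock.sendall(json.dumps(info).encode("utf-8"))
    finally:
        # Runs even if sendall() raises, so the socket is never leaked.
        sock.close()


def ensure_connection_file(path, make_info, response_address=None):
    """Create the connection file if missing and report it back.

    Returns True if the file was created here, False if it already existed.
    The socket is only touched when this launcher created the file, so a
    stale response address is harmless when the file was pushed instead.
    """
    if os.path.exists(path):
        return False
    info = make_info()
    with open(path, "w") as f:  # the 'with' block closes the file on error too
        json.dump(info, f)
    if response_address:
        send_connection_info(response_address, info)
    return True
```

In Scala the same structure would use try/finally around the BufferedWriter and Socket, with the close calls placed in the finally blocks.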

Also, have you had a chance to see if creating a Spark Context allows us to get around the Lazy load issue?

…#49) * Toree Launcher application supports pull and socket mode * update run.sh * use play.lib.json for json serialization; update run.sh + kernel.json for launcher opt/jar * update println in writeToSocket method

LK-Tmac1 · 2017-07-06T21:28:34Z

PR merged. We are going to maintain the Toree Launcher app source code separately, so we don't have to include it in the Elyra build process.

Project created under https://github.com/SparkTC/elyra-toree-launcher

Closing this issue now.

LK-Tmac1 self-assigned this Jun 19, 2017

kevin-bates added this to the Sprint 4 milestone Jun 19, 2017

kevin-bates added the resource management label Jun 19, 2017

LK-Tmac1 mentioned this issue Jul 4, 2017

[Issue #34] Toree launcher app & update run.sh #49

Merged

LK-Tmac1 closed this as completed Jul 6, 2017

lresende unassigned LK-Tmac1 Oct 14, 2017

kevin-bates modified the milestones: Sprint 4, v0.6 Mar 26, 2018
