Create kernel launch application for Toree Scala kernels to enable pull-mode #34
Comments
If we want to modify Toree, do we open an issue on JIRA and then submit a PR?
Yes - that is correct. However, I believe we should implement a launch application instead, since it gives us unlimited flexibility and consistency (with no kernel update necessary). Once we produce a few such launch scripts (these are probably language-specific rather than kernel-specific), a template can be documented so that launchers for kernels of other languages can be produced.
These "launchers" have 2 aspects:
In Spark we have compiled languages (Java, Scala) and scripted languages (Python, R). For the compiled languages we need to build and package a small launcher application rather than a mere launcher script. For that we will need to add some build infrastructure, i.e. Maven or SBT, to build and package jar files.
+1 @ckadner. Also, it seems like the 'language' aspect of these launchers is really a function of the kernel's implementation language. For example, the Python and R kernels in Toree require that a Scala application be launched. As a result, the launcher for these kernels would still be written in Scala.
Yes, I agree that we'd better create a launch application instead of changing the Toree kernel. Would the easiest way be to do something similar to the launch_ipykernel.py script? That is, simply add another Python script that handles the non-existent connection file and have run.sh call this script before spark-submit.
Generation of the connection file needs to take place on the destination node, and it's via spark-submit that this occurs. I don't know spark-submit well enough to know if it can launch a script/application in one language and then have that thing start something in another language. I suspect spark-submit determines the interpreter based on the launch target, so I doubt switching languages would work. (Then again, I could be way off-base.)
Kevin is correct about this:
I see your idea. Since the connection file should be on the node where the kernel will be running, we need to create the connection file after I am going to develop a layer (let's call it
@liukun1016 -- to better understand how the IPython kernel finds empty ports and creates the connection file, start by taking a look at IPKernelApp.initialize(). For finding free ports you could also look at Akka (there your code is already in Scala), but for the session key I think we need to replicate what IPython does in ... Just a nit-pick on terminology 😉 ... the "launch application" you are writing is not really a "wrapper" ... it will simply call the
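For reference, the two pieces mentioned above (finding free ports and producing a session key) can be sketched in a few lines. This is a minimal Python illustration only, not the Scala launcher's code and not IPython's own implementation: the helper names `find_free_ports` and `new_session_key` are hypothetical, and using a random UUID string as the key is an assumption.

```python
import socket
import uuid


def find_free_ports(count, ip="127.0.0.1"):
    """Ask the OS for `count` distinct unused TCP ports by binding to port 0."""
    sockets = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.bind((ip, 0))  # port 0 means "let the OS choose a free port"
            sockets.append(s)
        # All sockets are still open here, so the chosen ports are distinct.
        return [s.getsockname()[1] for s in sockets]
    finally:
        for s in sockets:
            s.close()


def new_session_key():
    """Return a random string to use as the message-signing key (assumed format)."""
    return str(uuid.uuid4())


# A kernel needs five ports: shell, iopub, stdin, control and heartbeat.
ports = find_free_ports(5)
key = new_session_key()
```

Note that a small window remains between closing these sockets and the kernel binding the ports, during which another process could claim one. Picking the ports on the node where the kernel runs narrows that window but does not remove it.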
@ckadner I see. Yes, "wrapper" might not be a good name. (Updated the comment.) I will take a look at your references. Thanks a lot!
Please take a look at PR #43. That PR changes the kernelspec format (where connection_file_mode can be specified), so I just wanted to bring it to your attention since it might affect your merge. Also, once pull mode is implemented, socket mode is a small step away.
The source code of the Toree Launcher application is at https://github.com/liukun1016/ToreeLauncher/blob/master/src/main/scala/launcher/ToreeLauncher.scala Pull mode is already finished. I will test socket mode next.
@liukun1016 - the launcher looks good! Just have a couple comments/suggestions...
Also, have you had a chance to see whether creating a Spark Context lets us get around the lazy-load issue?
PR merged. We are going to maintain the Toree Launcher app source code separately, so we don't have to include it in the Elyra build process. The project was created under https://github.com/SparkTC/elyra-toree-launcher Closing this issue now.
Similar to the launch script for the Python Spark kernel (see launch_ipkernel.py), Toree/Scala kernel invocations could also use a launch application that is responsible for constructing the information needed for location-relative connection files, thereby dramatically decreasing the likelihood of port conflicts. This would include identifying local ports and the public IP, as well as generating the key used to sign socket traffic.
One approach could be to modify Toree itself to detect a non-existent connection file, produce the applicable information at startup, and write that information out to the specified file. A launch application, however, provides a place for other forms of logic (e.g., workspace management) that we might not otherwise have.
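Whichever component does it, the step described here is "if the connection file does not exist yet, produce the information and write it out". A minimal Python sketch of that step follows. The function name `write_connection_file_if_missing`, the example ports, and the example key are hypothetical; the JSON field names are those of a standard Jupyter connection file. It is an illustration, not the launcher's actual code.

```python
import json
import os
import tempfile


def write_connection_file_if_missing(path, ip, ports, key):
    """Write a Jupyter-style connection file unless one already exists.

    `ports` is a list of five port numbers in the order
    shell, iopub, stdin, control, heartbeat.
    Returns True if the file was written, False if it was already there.
    """
    if os.path.exists(path):
        return False
    info = {
        "shell_port": ports[0],
        "iopub_port": ports[1],
        "stdin_port": ports[2],
        "control_port": ports[3],
        "hb_port": ports[4],
        "ip": ip,
        "key": key,
        "transport": "tcp",
        "signature_scheme": "hmac-sha256",
    }
    with open(path, "w") as f:
        json.dump(info, f, indent=2)
    return True


# Demo with made-up values; a real launcher would discover these at startup.
demo_path = os.path.join(tempfile.mkdtemp(), "kernel-demo.json")
demo_ports = [50001, 50002, 50003, 50004, 50005]
created = write_connection_file_if_missing(demo_path, "127.0.0.1", demo_ports, "example-key")
created_again = write_connection_file_if_missing(demo_path, "127.0.0.1", demo_ports, "example-key")

with open(demo_path) as f:
    loaded = json.load(f)
```

Because the function leaves an existing file untouched, the same kernel invocation would work both when the connection file is supplied up front and when the launcher is expected to create it.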