Asynchronous initialization of Spark session (IPython)
- initialize the Spark variables to a generic wrapper object
- start a thread to initialize the SparkContext/SparkSession just before calling embed_kernel()
- the notebook will be connected (and ready) before the Spark session gets initialized (Yarn application status is still ACCEPTED)
- non-Spark related cells can be executed
- tab completion on any Spark variable will show the attribute WAITING_FOR_SPARK_SESSION_TO_BE_INITIALIZED (with value "Spark Session not yet initialized ...")
- running a notebook cell that references any of the Spark variables will wait for the initialization thread to complete (blocking) and delegate the execution to the actual Spark objects
- Yarn application status is RUNNING once Spark session is initialized

Closes #64
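The behaviour described in this commit message can be illustrated with a minimal sketch. It is not the code from this commit: the names `BackgroundInit`, `WaitingProxy`, `FakeSession` and `make_fake_session` are hypothetical, and a slow stand-in factory replaces the real SparkContext/SparkSession creation so the example runs without Spark.

```python
import threading
import time


class BackgroundInit:
    """Runs a slow factory on a background thread and hands out its result."""

    def __init__(self, factory):
        self._factory = factory
        self._result = None
        self._error = None
        self._done = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        try:
            self._result = self._factory()
        except BaseException as exc:  # keep the failure and re-raise it in get()
            self._error = exc
        finally:
            self._done.set()

    def ready(self):
        return self._done.is_set()

    def get(self):
        # Blocks the caller until the factory has finished.
        self._done.wait()
        if self._error is not None:
            raise self._error
        return self._result


class WaitingProxy:
    """Generic wrapper that delegates to the real object once it exists."""

    WAITING_FOR_SPARK_SESSION_TO_BE_INITIALIZED = "Spark Session not yet initialized ..."

    def __init__(self, init, select):
        self._init = init
        self._select = select  # picks the wrapped object out of the init result

    def __dir__(self):
        # Tab completion must not block, so show only the marker while waiting.
        if not self._init.ready():
            return ["WAITING_FOR_SPARK_SESSION_TO_BE_INITIALIZED"]
        return dir(self._select(self._init.get()))

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for delegated attributes.
        if name in ("_init", "_select"):
            raise AttributeError(name)
        return getattr(self._select(self._init.get()), name)


class FakeSession:
    """Hypothetical stand-in for a Spark session."""

    version = "fake-1.0"


def make_fake_session():
    time.sleep(0.2)  # simulates the slow session start-up
    return FakeSession()


if __name__ == "__main__":
    init = BackgroundInit(make_fake_session)
    spark = WaitingProxy(init, lambda session: session)
    init.start()          # in the commit, this happens just before embed_kernel()
    print(dir(spark))     # marker attribute only, unless init already finished
    print(spark.version)  # blocks until the session exists, then delegates
```

In this sketch, the `select` callable lets several proxies (for example one for the session and one for its context) share a single initialization thread. Non-Spark code keeps running because only attribute access on a proxy waits for the thread.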
Showing 1 changed file with 59 additions and 12 deletions.