Ques: Does this install spark #53

Open
daddydrac opened this issue Dec 16, 2020 · 7 comments
@daddydrac

Does the helm chart for Livy deploy Spark? If not, how do we configure Spark helm chart and Livy helm chart so they can "talk" to each other/submit jobs via REST services?

@jahstreet

jahstreet commented Dec 17, 2020

Hi @joehoeller, thanks for the question.
Basically there are 2 Livy job submission modes: Batch and Interactive.
For Batch mode the communication flow is the following:

  1. Livy talks to the Kubernetes API to request the Spark Driver
  2. The Spark Driver talks to the Kubernetes API to request Spark Executors and, once they are created, resolves them to communicate tasks and track their status/progress
  3. Livy resolves the Spark Driver and Executors via the Kubernetes API and tracks their statuses

So there is no direct interaction between Livy and Spark in this mode (a sketch of a Batch submission against Livy's REST API is shown below).
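
For illustration, here is a minimal sketch of a Batch submission against Livy's REST API (POST /batches) using Java 11's HttpClient. The Livy URL, jar path, and class name are assumptions for the example; they depend on how the Helm chart exposes the Livy service and on what is baked into the Spark image:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class LivyBatchSubmit {
    public static void main(String[] args) throws Exception {
        // Hypothetical Livy endpoint exposed by the chart's Service/Ingress.
        String livyUrl = "http://livy.example.com:8998";

        // Application to run; the jar must be reachable from the Spark driver pod.
        String body = "{"
            + "\"file\": \"local:///opt/spark/examples/jars/spark-examples.jar\","
            + "\"className\": \"org.apache.spark.examples.SparkPi\","
            + "\"args\": [\"100\"]"
            + "}";

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(livyUrl + "/batches"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());

        // Livy replies with the batch id and state; progress can then be polled
        // with GET /batches/{id} while Livy tracks the driver/executor pods
        // through the Kubernetes API as described above.
        System.out.println(response.statusCode() + " " + response.body());
    }
}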

For Interactive mode:

  1. Livy talks to the Kubernetes API to request the Spark Driver with a specific entrypoint JAR (for the interactive mode) and spins up its RPC server, asynchronously waiting for the client registration request
  2. The Spark Driver talks to Kubernetes to request the executors and calls the Livy RPC server to register and share its own RPC server endpoint
  3. Livy communicates with the Spark Driver via their RPC servers (see the session sketch below)
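
From a client's point of view, an interactive session is also driven through Livy's REST API (this is what Sparkmagic does under the hood): a session is created with POST /sessions and code snippets are sent with POST /sessions/{id}/statements. A minimal sketch, where the Livy URL and the hard-coded session id are assumptions to keep the example short:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class LivyInteractiveSession {
    // Small helper for JSON POSTs against the (assumed) Livy endpoint.
    static HttpResponse<String> post(HttpClient http, String url, String json) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
        return http.send(request, HttpResponse.BodyHandlers.ofString());
    }

    public static void main(String[] args) throws Exception {
        String livyUrl = "http://livy.example.com:8998"; // hypothetical Livy endpoint
        HttpClient http = HttpClient.newHttpClient();

        // 1. Create the interactive session: Livy asks Kubernetes for a driver pod.
        HttpResponse<String> session = post(http, livyUrl + "/sessions", "{\"kind\": \"pyspark\"}");
        System.out.println("Session created: " + session.body());

        // 2. Once the session reports 'idle' (poll GET /sessions/{id}), submit a statement.
        HttpResponse<String> statement =
            post(http, livyUrl + "/sessions/0/statements", "{\"code\": \"1 + 1\"}");
        System.out.println("Statement submitted: " + statement.body());

        // 3. When finished, DELETE /sessions/{id} ends the session and the driver pod goes away.
    }
}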

Note: both modes work in basically the same way on YARN; the only difference here is that Kubernetes is used as the resource manager instead.

Does that answer your question? Please let me know if you would like to know more about any specific part of these flows. Best.

@daddydrac

daddydrac commented Dec 17, 2020 via email

@jahstreet

Awesome, happy to help.

To answer the remaining 2 questions, I would suggest giving this step-by-step installation guide on Minikube a try. It shows how to spin up the required components and get JupyterHub, with a Jupyter notebook per user, exposed externally from the Kubernetes cluster, as well as direct access to the Livy UI with links to the Spark UI.

Some design details can also be found here.

In short: Sparkmagic is used to set up the Jupyter -> Livy -> Spark communication, and Nginx-Controller-backed Ingresses are used to expose the component endpoints.

@jahstreet

Hi @salinaaaaaa, is that issue fixed for you?

jahstreet self-assigned this Oct 20, 2022
@Almenon

Almenon commented Jul 10, 2023

Hi @jahstreet, so if I understand correctly, in interactive mode (the mode you would probably use w/ Sparkmagic & a Jupyter notebook), Livy would:

  1. ask Kubernetes to create a driver job
  2. The driver job would create a pod
  3. The driver asks Kubernetes to create executor pods
  4. The driver kills executor pods when they finish
  5. When the driver finishes, the pod automatically dies as it is a Kubernetes job.

@jahstreet

jahstreet commented Aug 29, 2023

Hi @Almenon, you're almost correct. If we speak about interactive mode:

  1. Ask Kubernetes to create a driver pod and establish a web-socket connection between Livy and the Spark driver RPC server
  ...
  5. When the interactive session completes, Livy deletes the driver pod, which triggers deletion of the executor pods referencing the driver pod

@Almenon

Almenon commented Aug 29, 2023

Thanks! Sorry for the basic question, but what does it mean when you say the "interactive session completes"? In this example, would the session complete when the get call finishes?

double pi = client.submit(new PiJob(samples)).get();

For context, I'm a DevOps engineer, not a Spark developer, so this stuff is new to me 😅
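
For reference, the snippet above appears to be the PiJob example from Livy's programmatic Java API. A minimal sketch of the surrounding code, assuming the standard LivyClientBuilder API and with the Livy URL and jar path as placeholders, looks roughly like this:

import java.io.File;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

import org.apache.livy.Job;
import org.apache.livy.JobContext;
import org.apache.livy.LivyClient;
import org.apache.livy.LivyClientBuilder;

public class PiApp {

    // The PiJob from Livy's examples: estimates Pi with a Spark filter/count.
    public static class PiJob implements Job<Double> {
        private final int samples;

        public PiJob(int samples) {
            this.samples = samples;
        }

        @Override
        public Double call(JobContext ctx) throws Exception {
            List<Integer> sampleList = new ArrayList<>();
            for (int i = 0; i < samples; i++) {
                sampleList.add(i);
            }
            long hits = ctx.sc().parallelize(sampleList, 10).filter(v -> {
                double x = Math.random();
                double y = Math.random();
                return x * x + y * y < 1;
            }).count();
            return 4.0d * hits / samples;
        }
    }

    public static void main(String[] args) throws Exception {
        String livyUrl = "http://livy.example.com:8998"; // hypothetical Livy endpoint
        LivyClient client = new LivyClientBuilder().setURI(new URI(livyUrl)).build();
        try {
            // The jar containing PiJob has to be shipped to the session first.
            client.uploadJar(new File("/path/to/pi-job.jar")).get();

            // submit() returns a future; get() blocks until the job result is back.
            double pi = client.submit(new PiJob(10000)).get();
            System.out.println("Pi is roughly " + pi);
        } finally {
            // Closes the client; passing true also shuts down the remote Spark context.
            client.stop(true);
        }
    }
}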
