
Spark job dynamic resource allocation with Cook #905

Open
DatPhanTien opened this issue Jul 2, 2018 · 6 comments

@DatPhanTien

Dear,

We are interested in running Spark jobs using Cook on our DC/OS cluster.
There are two types of Spark jobs in our cluster:

Batch jobs, which have a high workload but do not require a fast response.
Interactive jobs, triggered from the user side, which have strict response-time requirements (on the order of seconds or less).

Expectations:

  1. We would like to know whether it is possible to assign priorities to such jobs, so that the interactive jobs have higher priority. Once the interactive jobs are triggered, batch jobs should quickly free resources (e.g., by being killed) so that the interactive jobs can be allocated the maximum resources and meet the response-time requirement.
  2. It would be perfect if Cook supported dynamic resource allocation. By "dynamic resource allocation", we mean that the resource requirement for a job MUST NOT be fixed. For instance, instead of giving a job 3 CPUs and 5 GB of memory, it would be better to configure the job to take, say, 50% of the available resources (where 50% is of course a configuration parameter that users can set as they wish).

So far, we have not found any Cook documentation that mentions this.
Could you please enlighten us?

Best

@pschorf
Contributor

pschorf commented Jul 2, 2018

For 1, we already have a priority field on jobs that you can use, which should work for the case you described.
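
For illustration, here is a minimal sketch of setting that field when submitting a job through Cook's REST API. The endpoint URL is hypothetical, and the field names assume the job schema accepted by the `/rawscheduler` endpoint; treat it as a sketch rather than a verbatim recipe:

```python
import uuid
import requests

COOK_URL = "http://cook.example.com:12321"  # hypothetical scheduler endpoint

# Sketch only: authentication is omitted, and the field names assume Cook's
# /rawscheduler job schema (priority is an integer weight; higher wins).
job = {
    "uuid": str(uuid.uuid4()),
    "name": "interactive-query",
    "command": "spark-submit --class com.example.Query query.jar",
    "cpus": 2.0,
    "mem": 4096,        # MiB
    "max_retries": 1,
    "priority": 90,     # batch jobs could be submitted with a lower value, e.g. 10
}

resp = requests.post(f"{COOK_URL}/rawscheduler", json={"jobs": [job]})
resp.raise_for_status()
```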

For 2, do you mean that you'd want a job to start with a certain percentage of the resources available on its host, regardless of the absolute amount? Or do you want a job to receive as many resources as possible, with the percentage being the minimum amount a host must have free to start the job?

@DatPhanTien
Author

Dear,

Thank you for your fast answer.
Regarding 1, do you mean that Cook's priority setting allows a high-priority job to steal resources from a running job with lower priority (by killing it)?

Regarding 2, allowing a job to start with a certain percentage of the total available resources would be enough. If we set the percentage to 100, then the job should take all the available resources.

Best regards

@wyegelwel
Contributor

wyegelwel commented Jul 3, 2018 via email

@DatPhanTien
Author

Dear

Thank you for this information. This looks very interesting and seems to provide what we need to meet expectation number 2.

By the way, I wonder how the Cook scheduler handles priority. Assume that job A, with low priority, is running on the Spark cluster, and job B arrives with high priority. If the resources requested by B exceed what is currently available, DOES Cook send a delete request to the Spark API to kill A in order to release resources for B? Moreover, does this deletion take time, and how long is it expected to take?

Please enlighten me on this aspect.

@DaoWen
Contributor

DaoWen commented Jul 3, 2018

Regarding priorities: Job priorities in Cook allow jobs with a higher priority to preempt lower-priority jobs started by the same user. In other words, if I have a bunch of (lower-priority) batch jobs running now that are using all of my resources, and I submit a (higher-priority) Spark job, then Cook will kill some of my batch jobs to make room for the Spark job.

The batch jobs that were preempted will automatically be retried later when more of my resources become available. However, if the batch jobs are killed or fail several times, they can run out of retries, meaning Cook will no longer automatically retry them.

Note that Cook's priority value is a priority weight value (in contrast to a priority ranking value). E.g., a priority value of 80 on a Cook job is higher priority than a priority value of 1.
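
To make the weight semantics and retry behavior above concrete, here is a hedged sketch of one user submitting low-priority batch jobs alongside a high-priority Spark job. The endpoint URL is hypothetical and the field names assume the `/rawscheduler` job schema:

```python
import uuid
import requests

COOK_URL = "http://cook.example.com:12321"  # hypothetical scheduler endpoint

def make_job(command, priority, max_retries):
    # priority is a weight, so 80 outranks 1; max_retries bounds how many times
    # the job is automatically rerun after preemption or failure.
    return {
        "uuid": str(uuid.uuid4()),
        "command": command,
        "cpus": 1.0,
        "mem": 2048,           # MiB
        "priority": priority,
        "max_retries": max_retries,
    }

# Low-priority batch work: plenty of retries so preempted instances get rescheduled.
batch_jobs = [make_job("python batch_step.py", priority=1, max_retries=10)
              for _ in range(20)]

# High-priority job from the same user: Cook may preempt some of the batch jobs
# above to make room for it.
interactive = make_job("spark-submit interactive_query.py", priority=80, max_retries=1)

requests.post(f"{COOK_URL}/rawscheduler",
              json={"jobs": batch_jobs + [interactive]}).raise_for_status()
```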

@DatPhanTien
Author

Thank you.
That is very clear.
Finally, we would like to know whether Cook is going to be fully compatible with Spark any time soon.
As of now, one has to patch Spark and rebuild the Spark binaries in order to use Cook.
Also, is there a pre-built Docker image of Spark compatible with Cook on Docker Hub?

Best
