
Adding known issue for MESOS-1688 #1860

Closed
wants to merge 3 commits into from

Conversation

MartinWeindel
Contributor

When using Mesos in fine-grained mode, a Spark job can run into a deadlock when the allocatable memory on a Mesos slave is low. As a workaround, 32 MB (= Mesos MIN_MEM) are allocated for each task, to ensure that Mesos makes new offers after task completion.
From my perspective, it would be better to fix this problem in Mesos by dropping the memory constraint on offers, but as a temporary solution this patch helps to avoid the deadlock on current Mesos versions.
See [MESOS-1688] No offers if no memory is allocatable for details on this problem.
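
For illustration, a minimal sketch of this kind of workaround, assuming the Mesos Java protobuf API (org.apache.mesos.Protos); the helper name is hypothetical and this is not the actual patch:

```scala
import org.apache.mesos.Protos.{Resource, TaskInfo, Value}

// Sketch only: attach a 32 MB "mem" resource to every fine-grained task,
// so that when the task finishes the slave regains >= MIN_MEM and Mesos
// resumes making offers. `withMinMem` is a hypothetical helper name.
def withMinMem(task: TaskInfo.Builder): TaskInfo.Builder =
  task.addResources(
    Resource.newBuilder()
      .setName("mem")
      .setType(Value.Type.SCALAR)
      .setScalar(Value.Scalar.newBuilder().setValue(32.0)) // Mesos MIN_MEM, in MB
      .build())
```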

@AmplabJenkins

Can one of the admins verify this patch?

@pwendell
Contributor

Hey Martin,

I'm having a bit of trouble seeing how this works around the issue. From what I can tell, the issue is that if someone creates executors that consume all memory, Mesos will refuse to make offers for the tasks. However, this fix just adds 32 MB of memory as a requirement for each task... but it seems like if the offer is never made in the first place, this will make no difference. Can you describe a sequence of offers where this change alters the execution? Thanks for looking into this!

  • Patrick

@MartinWeindel
Contributor Author

Hey Patrick,

First of all, let me emphasize again that this is only a workaround. The real problem is that Mesos only makes offers if at least 32 MB of memory are available, which conflicts with allocating memory only for the Spark executors and none for the tasks.
You seem to be right: this workaround does not help if the executors already consume all memory (leaving a remainder of <= 31 MB), so I don't know whether it avoids the deadlock in all cases.

I can only argue from an experimental point of view: I have not seen the deadlock in my cluster anymore after applying this patch (tested under very heavy workload). I suspect the chance is very small that another executor starts before at least one task of the first executor has started. In any case, after a task finishes, at least 32 MB of memory become allocatable, so Mesos will always make offers and the deadlock is avoided.

BTW, I have also played with changing the executor memory so that some Mesos slave memory is always left over, but to my surprise this did not avoid the deadlocks reliably.

So I'm not sure whether this patch should be integrated into the Spark source code. But I hope it helps in understanding the issue, and maybe it makes fine-grained mode usable for setups similar to mine until a better solution is found.

If I can help in any way, just tell me.

Best regards,
Martin


@mateiz
Contributor

mateiz commented Aug 25, 2014

From my knowledge of Mesos, this seems like a good fix. I think we should do this until MESOS-1688 is fixed.

@mateiz
Contributor

mateiz commented Aug 25, 2014

Jenkins, test this please

@mateiz
Contributor

mateiz commented Aug 25, 2014

BTW @MartinWeindel, one small request -- can you update the docs/running-on-mesos.md page to explain that each task will consume 32 MB? Otherwise people might set Spark's executor memory to all of the memory on the Mesos worker, which would mean no tasks can be launched.

@SparkQA

SparkQA commented Aug 25, 2014

QA tests have started for PR 1860 at commit d9d2ca6.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Aug 25, 2014

QA tests have finished for PR 1860 at commit d9d2ca6.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • $FWDIR/bin/spark-submit --class org.apache.spark.repl.Main "$
    • $FWDIR/bin/spark-submit --class org.apache.spark.repl.Main "$

@mateiz
Contributor

mateiz commented Aug 25, 2014

BTW this failure is due to a style check -- you can run sbt scalastyle locally to find all style issues (the Jenkins log also lists the problem).

@iven

iven commented Aug 25, 2014

@MartinWeindel I think you should check if there's enough memory in the offer first.

@mateiz
Contributor

mateiz commented Aug 25, 2014

That's true; now that we take an extra 32 MB per task, you need to change the logic for how many tasks we can allocate. That will make it trickier.
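
To make that concrete, here is a hypothetical sketch of the extra accounting being described; the names and structure are illustrative, not Spark's actual scheduler code:

```scala
// With 32 MB reserved per task, an offer bounds the number of launchable
// tasks by its free memory as well as by its CPUs.
def maxTasksForOffer(offerCpus: Double,
                     offerMemMb: Double,
                     cpusPerTask: Double,
                     memPerTaskMb: Double = 32.0): Int = {
  val byCpu = (offerCpus / cpusPerTask).toInt
  val byMem = (offerMemMb / memPerTaskMb).toInt
  math.min(byCpu, byMem) // 0 means the offer cannot host a single task
}
```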

@pwendell
Contributor

Hey @MartinWeindel - I'm curious, which of the following cases are you in:

Case 1. You have individual executors that attempt to acquire all the memory on the node.

Case 2. You have multiple executors per node, but their total memory adds up to the total amount of memory on the node.

I could see how this would help with Case 2, because it could prevent a second executor from being launched in a way that acquires all of the host memory. But I'm still wondering whether it affects Case 1.

@MartinWeindel
Contributor Author

Yes, this becomes tricky, and I don't see a satisfying solution, as I would have to predict how many tasks will run in parallel to ensure that there is enough memory for each task.
This patch solves one problem but will introduce new ones, because it only deals with the symptoms, not the cause. I think it is better not to integrate it.
I've already created a pull request to get the cause fixed in Mesos:
apache/mesos#24


@mateiz
Contributor

mateiz commented Aug 25, 2014

After thinking about this more, it seems that another workaround is to make sure your executors always leave 32 MB free on each node (even if you launch multiple executors, make sure their sizes don't add up to quite the full memory). Would that work? If so, we can just add that to the docs.
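
As a rough illustration of that sizing rule (the helper and the 8 GB figure are assumptions, not an official recommendation):

```scala
import org.apache.spark.SparkConf

// Leave at least 32 MB of the slave's advertised memory unallocated so
// Mesos keeps making offers. `executorMemoryFor` is a hypothetical helper.
def executorMemoryFor(slaveMemMb: Int, headroomMb: Int = 32): String =
  s"${slaveMemMb - headroomMb}m"

val conf = new SparkConf()
  .set("spark.executor.memory", executorMemoryFor(8192)) // "8160m" on an 8 GB slave
```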

@MartinWeindel MartinWeindel changed the title work around for problem with Mesos offering semantic Adding known issue for MESOS-1688 Aug 25, 2014
@MartinWeindel
Contributor Author

OK, so I have reverted the work-around patch and added a known issue paragraph to the running-on-mesos documentation.

@mateiz
Contributor

mateiz commented Aug 27, 2014

Cool, thanks, that looks great.

@asfgit asfgit closed this in be043e3 Aug 27, 2014
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
When using Mesos in fine-grained mode, a Spark job can run into a deadlock when the allocatable memory on a Mesos slave is low. As a workaround, 32 MB (= Mesos MIN_MEM) are allocated for each task, to ensure that Mesos makes new offers after task completion.
From my perspective, it would be better to fix this problem in Mesos by dropping the memory constraint on offers, but as a temporary solution this patch helps to avoid the deadlock on current Mesos versions.
See [[MESOS-1688] No offers if no memory is allocatable](https://issues.apache.org/jira/browse/MESOS-1688) for details on this problem.

Author: Martin Weindel <martin.weindel@gmail.com>

Closes apache#1860 from MartinWeindel/master and squashes the following commits:

5762030 [Martin Weindel] reverting work-around
a6bf837 [Martin Weindel] added known issue for issue MESOS-1688
d9d2ca6 [Martin Weindel] work around for problem with Mesos offering semantic (see [https://issues.apache.org/jira/browse/MESOS-1688])
@timothysc

Just as a cross-reference: MESOS-1688 has been committed and will be part of the 0.21.0 release cycle.

@mateiz
Contributor

mateiz commented Sep 20, 2014

Great! I'll create a JIRA to update Spark to it when that comes out.
