@yufan-liu

Related to SPARK-3875.

@AmplabJenkins

Can one of the admins verify this patch?

@yufan-liu changed the title from "Add spark.tmp.dir to set temp directory for spark" to "[SPARK-3875] Add TEMP DIRECTORY configuration" on Oct 9, 2014
@srowen
Member

srowen commented Oct 9, 2014

Utils.getLocalDir already does basically this. I agree that files should not have to go to /tmp, since that is rarely a good place for large amounts of data on servers. But I wonder whether this local dir is in fact the standard and right place for all of these things, rather than adding another temp dir setting. Distributions already correctly configure where Utils.getLocalDir looks, so reusing it would be much better if possible.

Member

PS this can all be one line if you set the default value to ... .get("spark.tmp.dir", System.getProperty("java.io.tmpdir"))
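
A minimal sketch of that one-line default, written in plain Java so it runs without Spark on the classpath. The `Map` stands in for Spark's configuration object, `TmpDirDefault` and `resolveTmpDir` are hypothetical names used only for illustration, and `spark.tmp.dir` is the key proposed in this PR rather than an existing Spark setting.

```java
import java.util.HashMap;
import java.util.Map;

class TmpDirDefault {
    // Hypothetical helper: return the proposed spark.tmp.dir value if it is set,
    // otherwise fall back to the JVM's java.io.tmpdir in a single expression.
    static String resolveTmpDir(Map<String, String> conf) {
        return conf.getOrDefault("spark.tmp.dir", System.getProperty("java.io.tmpdir"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Nothing configured: prints the JVM default temp directory.
        System.out.println(resolveTmpDir(conf));
        // Illustrative path only; prints the configured value.
        conf.put("spark.tmp.dir", "/data/spark-tmp");
        System.out.println(resolveTmpDir(conf));
    }
}
```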

Author

Thanks @srowen for the comment. That's clearer!
I am currently running Spark 1.1 in standalone mode, with SPARK_LOCAL_DIRS set to a data disk, so broadcast files are stored in that directory. However, the executors' dependencies are still fetched and stored in the /tmp/ directory, and they are never removed.
Take the snappy dependency, for example:
"snappy-1.0.5.3-f4880c9f-95d9-4ab6-b1c8-8686d0b88f42-libsnappyjava.so"
As a result, the /tmp/ directory keeps growing.

@mridulm
Contributor

mridulm commented Oct 9, 2014

At least for YARN, this will create issues if it is overridden from the default.
Not sure about Mesos.

Why not use the standard Java property and define it for local and standalone mode where relevant?

@yufan-liu
Author

@mridulm Using the standard Java property is fine.
This PR just adds a more specific configuration argument.

@mridulm
Contributor

mridulm commented Oct 9, 2014

There is a Java property that controls this: java.io.tmpdir
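
As a rough illustration of what that property controls, the sketch below creates a temp file with the standard `java.nio.file.Files.createTempFile` call and reports its parent directory, which follows `java.io.tmpdir`. The class and method names (`TmpDirDemo`, `defaultTempParent`) are made up for this example and are not part of Spark.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class TmpDirDemo {
    // Create a temp file in the JVM's default temp directory, remember its
    // parent directory, and clean the file up again before returning.
    static Path defaultTempParent() {
        try {
            Path file = Files.createTempFile("tmpdir-demo", ".tmp");
            try {
                return file.getParent();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.io.tmpdir   = " + System.getProperty("java.io.tmpdir"));
        System.out.println("temp file parent = " + defaultTempParent());
    }
}
```

Launching the JVM with `-Djava.io.tmpdir=/data/tmp` (an illustrative path that must already exist) should move both values to that directory without any Spark-specific setting.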

@AmplabJenkins

Can one of the admins verify this patch?

@tgravescs
Contributor

Yes, as @mridulm pointed out, this should not be settable by users on YARN. It should automatically use the YARN-approved directories. We already have logic in ClientBase for setting java.io.tmpdir. If this setting is added, we would need to do something similar and not let the user override it.
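
The constraint can be sketched as a small guard. This is a hypothetical illustration of the rule described here, not the logic in ClientBase; `YarnTmpDirGuard`, `effectiveTmpDir`, and all of its parameters are assumed names.

```java
class YarnTmpDirGuard {
    // Hypothetical rule: on YARN the container-provided directory always wins
    // and any user setting is ignored; elsewhere the user setting is honoured,
    // with the JVM default as the fallback.
    static String effectiveTmpDir(boolean onYarn, String userSetting,
                                  String containerTmpDir, String jvmDefault) {
        if (onYarn) {
            return containerTmpDir;
        }
        return (userSetting != null && !userSetting.isEmpty()) ? userSetting : jvmDefault;
    }

    public static void main(String[] args) {
        String jvmDefault = System.getProperty("java.io.tmpdir");
        // All paths below are illustrative.
        System.out.println(effectiveTmpDir(true, "/data/tmp", "/yarn/container/tmp", jvmDefault));
        System.out.println(effectiveTmpDir(false, "/data/tmp", "/yarn/container/tmp", jvmDefault));
        System.out.println(effectiveTmpDir(false, null, "/yarn/container/tmp", jvmDefault));
    }
}
```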

@srowen
Member

srowen commented Feb 6, 2015

It sounds like this is a wont-fix, given the discussion. Do you mind closing this PR?

@asfgit closed this in 24f358b on Feb 17, 2015