Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

default java.io.tmp location is too long #3215

Closed
or-shachar opened this issue Jun 19, 2017 · 33 comments
Closed

default java.io.tmp location is too long #3215

or-shachar opened this issue Jun 19, 2017 · 33 comments
Labels
team-Local-Exec Issues and PRs for the Execution (Local) team type: feature request untriaged

Comments

@or-shachar
Copy link
Contributor

I have an issue with files created at temp directory having too long pathname.
Please kindly see this SO question.

I managed to solve that by explicitly adding to my test target

"jvm_flags" = ["-Djava.io.tmpdir=/tmp"],

But that hurts the hermetical principle of testing.
I suggest we offer much shorter yet unique tmp location for tests
(somehting like /tmp/bazel-<unique-7-chars>)

WDYT?

@hlopko
Copy link
Member

hlopko commented Jun 26, 2017

Stack Overflow question got answered, closing the issue. I'd say asking only on Stack Overflow would be enough in this case. Glad you found the workaround :)

@hlopko hlopko closed this as completed Jun 26, 2017
@ittaiz
Copy link
Member

ittaiz commented Jun 26, 2017

@mhlopko what about the workaround hurting the hermetically of the test?

@hlopko
Copy link
Member

hlopko commented Jun 27, 2017

Sorry was too fast scanning the issue, I assumed it's just a dup of the SO.

@hlopko hlopko reopened this Jun 27, 2017
@hlopko hlopko added category: misc > misc P3 We're not considering working on this, but happy to review a PR. (No assignee) type: feature request labels Jun 27, 2017
@jianglai
Copy link

jianglai commented Dec 4, 2017

@mhlopko Has there been any progress on this front? This is also affecting Debian based distros, which has a much larger install base than just macOS. Running this test rule results in a file path that is 224-character long, and causes gpg-agent to fail due to "filename too long". gpg-agent is known to have the same 108-character restriction.

Note that it does not affect Ubuntu based distros, just plain debian based ones.

@hlopko
Copy link
Member

hlopko commented Dec 4, 2017

Pinging @philwo as the assigned owner :)

@philwo philwo assigned laszlocsomor and unassigned philwo Dec 4, 2017
@philwo
Copy link
Member

philwo commented Dec 4, 2017

@laszlocsomor is our designated expert on all things related to temporary directories. Reassigning.

@laszlocsomor
Copy link
Contributor

laszlocsomor commented Dec 4, 2017

Starting from Bazel 0.8.0, every action has:

  • TMPDIR envvar on Linux/macOS
  • TEMP and TMP envvars on Windows.

They point to safe temp dirs that the action can use. Ironically, the reason I implemented that feature was exactly Files.createTempFile. :)

@jianglai
Copy link

jianglai commented Dec 4, 2017

@laszlocsomor Can you elaborate a bit on how to actually solve the issue? Looking at 0.8.0 release log, should I just add --sandbox_tmpfs_path=/tmp to bazel test? Or do I have to make some changes to the code as well?

EDITED: Sorry I misread a previous release doc. (Just did ctrl+F on the doc :-P)

@laszlocsomor
Copy link
Contributor

@jianglai , I'm sorry, I made the same mistake as @mhlopko earlier, thinking the issue was about something else... Anyway, are you also trying to create a short temp dir? The shortest I can get on Linux with Bazel 0.8.0 is:

  $ cat WORKSPACE 
workspace(name = 'w')

  $ cat BUILD
genrule(
    name = "x",
    outs = ["x.txt"],
    cmd = "echo $$TMPDIR > $@",
)

  $ bazel --output_base=/tmp/b build //:x.txt &>/dev/null && cat bazel-genfiles/x.txt
/tmp/b/bazel-sandbox/6463594547116701987/execroot/w/tmp

@jianglai
Copy link

jianglai commented Dec 4, 2017

@laszlocsomor yes we need the temp dir to be with in 108 chars.

We are using:

cwd = File.createTempFile(TEMP_FILE_PREFIX, "", null);

It uses /usr/local/google/home/jianglai/.cache/bazel/_bazel_jianglai/ad33e24b04732429442ef546daa88f07/bazel-sandbox/7366477988851648032/execroot/domain_registry/_tmp/75f39134e66273333c988fe3f1047940/gpgtest661830835710385500/ as the tmp directory base, which is way too long.

Do you suggest us to change null to $TMPDIR? Would that compromise the hermeticity of the tests?

@laszlocsomor
Copy link
Contributor

Do you suggest us to change null to $TMPDIR?

Yes, please try that.

Would that compromise the hermeticity of the tests?

What scenario do you have mind?

If I understand the docs correctly, createTempFile guarantees that the file name it returns is unique:

If this method returns successfully then it is guaranteed that:

  1. The file denoted by the returned abstract pathname did not exist before this method was invoked, and
  2. Neither this method nor any of its variants will return the same abstract pathname again in the current invocation of the virtual machine.

(source: https://docs.oracle.com/javase/8/docs/api/java/io/File.html#createTempFile-java.lang.String-java.lang.String-java.io.File-)

@jianglai
Copy link

jianglai commented Dec 5, 2017

What scenario do you have mind?

I was mostly concerned with if using $TMPDIR as the base directory will result in other process having access to the temp directory. createTempFile only guarantees that the JVM instance would not create filename collisions, but what if another process deletes or alters the file/directory just created by createTempFile before it is supposed to be used by the code? But this seems to be also possible if I just use the default temp base, unless those directories are special and only writable by processes spawned by bazel?

I don't quite understand how sandboxing works, from the post here it seems to be mostly restricting bazel's access to other files, not other processes' access to bazel files in the sandbox? What would be the benefits of using null compared to $TMPDIR? If everything is the same, why don't bazel just set the default temp base to $TMPDIR to begin with?

In any case, chance of such a collision is slim. I'll give $TMPDIR a try and report back.

@jianglai
Copy link

jianglai commented Dec 6, 2017

@laszlocsomor I tried $TMPDIR and it gives the same path as null, which kinda make sense as null is supposed to use the default temp directory. I tried adding bazel tmp directory too long jvm_flags = ["-Djava.io.tmpdir=/tmp"] as suggested by the OP to the test rule and it works on macOS. I'll try tomorrow on Debian.

I found this SO post, which shows that for macOS, /tmp is always added to the writable locations when sandboxing. There're some similar logic for Linux, so maybe that'll work?

In any case, do you recommend the jvm_flags trick? Any drawbacks?

@ittaiz
Copy link
Member

ittaiz commented Dec 6, 2017 via email

@jianglai
Copy link

jianglai commented Dec 6, 2017

@ittaiz I don't that is the concern here, createTempFile guarantees that it will pick a non-existing file every time it runs.

@laszlocsomor Some findings:

The command I run:

File.createTempFile("gpgtest", "", filepath):

I tried to set filepath to either null or new File(System.getenv("TMPDIR")), also setting tags = ["local"] to disable sandboxing.

Sandbox, null:
/usr/local/google/home/jianglai/.cache/bazel/_bazel_jianglai/ad33e24b04732429442ef546daa88f07/bazel-sandbox/2343690639230753867/execroot/domain_registry/_tmp/75f39134e66273333c988fe3f1047940/gpgtest2082485420151221649/

Sandbox, $TMPDIR:
/usr/local/google/home/jianglai/.cache/bazel/_bazel_jianglai/ad33e24b04732429442ef546daa88f07/bazel-sandbox/1852775272454687165/execroot/domain_registry/tmp/gpgtest7104388210170836031/

Local, null:
/usr/local/google/home/jianglai/.cache/bazel/_bazel_jianglai/ad33e24b04732429442ef546daa88f07/execroot/nomulus/_tmp/75f39134e66273333c988fe3f1047940/gpgtest7739720270355475465/

Local, $TMPDIR:
/usr/local/google/home/jianglai/.cache/bazel/_bazel_jianglai/ad33e24b04732429442ef546daa88f07/execroot/domain_registry/tmp152e_6078fe747943ce50/gpgtest5628049330197111788/

Sandboxing always results in a path that is too long. Without sandboxing, if I set filepath to null, the resulting path has nomulus in it, which is the folder that I run in. That is obviously less desirable because it can be too long as well. So it seems the choice is to disable sandbox and use $TMPDIR, which produces a path that contains domain_registry, the (deterministic) workspace name.

As I mentioned before, setting jvm_flags = ["-Djava.io.tmpdir=/tmp"] in the test rule also results in a short path that is usable, but that also get around sandboxing. More importantly, it makes the assumption that /tmp is accessible, which I'd rather not make.

It is really unfortunate that I had to forgo sandboxing to get a short enough path...

@philwo
Copy link
Member

philwo commented Dec 6, 2017

We should fix this and I very very much recommend against disabling sandboxing. Any weird hack is better than disabling sandboxing, which not only means that your action can no longer run on a remote execution platform, it can also accidentally wipe your home folder, use information from untracked sources, ... it also means using a vastly inferior process management that might leave processes running in the background after your action ends.

When using the sandbox with the "required option" --sandbox_tmpfs_path=/tmp to ensure a hermetic /tmp, we could easily set TMPDIR to /tmp, as every sandboxed action will then get its own fresh and empty /tmp directory.

When not using that option, we could create a temporary folder below /tmp and then use that as the value for TMPDIR.

I don't follow why using /tmp would be less hermetic than the current path? I mean, yes, in theory processes could write to your securely generated temporary folder that has a randomized name in /tmp - but the sandbox is not meant to (and cannot) protect against malicious acts. It is meant as a safety net so that you don't accidentally wipe your files and depend on untracked state.

@laszlocsomor What do you think?

@jianglai
Copy link

jianglai commented Dec 6, 2017

@philwo

When using the sandbox with the "required option" --sandbox_tmpfs_path=/tmp to ensure a hermetic /tmp, we could easily set TMPDIR to /tmp, as every sandboxed action will then get its own fresh and empty /tmp directory.

It'll be more desirable if sandbox_tmpfs_path can be specified in a test rule other than on the command line.

So the currently suggested solution is to not disable sandboxing, but to create tmp folders under /tmp, either with the jvm_flags or directly specify that when using createTempFile, it seems? Until bazel reworks its $TMPDIR to a shorter one.

I don't follow why using /tmp would be less hermetic than the current path? I mean, yes, in theory processes could write to your securely generated temporary folder that has a randomized name in /tmp - but the sandbox is not meant to (and cannot) protect against malicious acts. It is meant as a safety net so that you don't accidentally wipe your files and depend on untracked state.

You are right, I mis-interpreted the intention of sandboxing.

@laszlocsomor
Copy link
Contributor

laszlocsomor commented Dec 7, 2017

@jianglai : Thanks for your patience and for the info about the status quo. I discussed this issue in person with @philwo, @dslomov, and some others, here are my conclusions.

I can't think of a satisfactory mechanism in current Bazel to achieve what you need. Therefore I think we need a new feature: a means for the user to control the temp directory path.

Specifying a temp path in the build rule would make the rule platform- and machine-dependent, ruling out remote execution and building on other platforms. I don't think that's a worthy tradeoff.

Thinking about remote execution, it seems reasonable to assume that the machine wouldn't allow accessing /tmp and that neither Bazel nor the user would know where exactly the machine has a writable temp directory, so they wouldn't have control over the directory's path's length. Therefore actions that need special temp paths would have to run locally.

I prefer adding a new flag to Bazel, e.g. --local_tmp_root, that lets the user specify a local path, the temp directory root. For locally running actions Bazel would create per-action temp directories under that path. You can set java_test.local=1 to enforce running the test locally, thus get a short temp path. We can consider more room for customization, perhaps with a flag that limits the total temp path for an action to a certain length; Bazel would use the legroom to make the directory name unique.

Finally, I think it should be a flag rather than an environment variable, because flags are explicit and you can set them in the .bazelrc file, while envvars are implicit, are harder to discover (that they affect Bazel) and to document (where would you look for it?).

How does that sound? What do others think -- @philwo , @dslomov ?

@jianglai
Copy link

jianglai commented Dec 7, 2017

@laszlocsomor, @philwo, @dslomov : Looks good to me in general. Some comments/questions below:

Thinking about remote execution, it seems reasonable to assume that the machine wouldn't allow accessing /tmp and that neither Bazel nor the user would know where exactly the machine has a writable temp directory, so they wouldn't have control over the directory's path's length. Therefore actions that need special temp paths would have to run locally.

Agreed, hardcooding is always less desirable.

I prefer adding a new flag to Bazel, e.g. --local_tmp_root, that lets the user specify a local path, the temp directory root. For locally running actions Bazel would create per-action temp directories under that path. You can set java_test.local=1 to enforce running the test locally, thus get a short temp path. We can consider more room for customization, perhaps with a flag that limits the total temp path for an action to a certain length; Bazel would use the legroom to make the directory name unique.

I generally prefer this approach. I'd just set --local_tmp_root=$TMPDIR and use the system provided base temp directory. But isn't setting java_test.lcoal=1 disabling sandboxing?

I probably don't understand all the intricacies of bazel, but why can't it by default, or through a flag, use the system temp directory as the tmp base? TMPDIR is a POSIX standard envvar, and Windows has TEMP. It seems reasonable enough to have a flag like --use_system_tmp_root and to just use that (of course bazel would need to add a randomized subfolder under the system tmp directory as the true tmp root exposed to the jvm). This boolean flag will have the advantage that the user doesn't have to specify anything, which would work best out-of-the-box. You can just say in your README.md something like "run bazel test --use_system_tmp_root //foo/bar/...".

Finally, I think it should be a flag rather than an environment variable, because flags are explicit and you can set them in the .bazelrc file, while envvars are implicit, are harder to discover (that they affect Bazel) and to document (where would you look for it?).

Partially agree that whether to override the default tmp root should be configured by a flag, but like I said in the comment above, it seems to be a reasonable expectation that the system would provide some sort of tmp directory that bazel can query and use (instead of having the user provide one), if the user set the flag.

@CydeWeys
Copy link

CydeWeys commented Dec 7, 2017

Is there any better solution than adding the following option to our BUILD rule to fix the issue temporarily?

jvm_flags = ["-Djava.io.tmpdir=/tmp"]

We know it's not the best long term solution, but until --local_tmp_root is ready, is there any better stopgap?

@laszlocsomor
Copy link
Contributor

I'm not aware of any better workarounds.

jianglai added a commit to google/nomulus that referenced this issue Dec 13, 2017
Detailed discussion can be found on github (bazelbuild/bazel#3215). On newer Debian systems as well as on macOS, the temporary folder created to store gpg configs are too long, which results in an error:

gpg: can't connect to the agent: File name too long

This CL makes affected tests use /tmp as temp folder base. Other options are discussed in the github issue, but this one seems to be the mostly non-harmful one. Note that this change only affect the FOSS build, as internal builds are not affected by the temp directory issue, probably due to the internal build tool using a different base temp directory.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=178803914
@jianglai
Copy link

jianglai commented Feb 8, 2018

Just want to make sure that I understand the fix. It seems that --local_tmp_root does not end up being implemented? With bazel 0.10.0, I just used File.createTempFile("gpgtest", "", new File(System.getenv("TMPDIR"))), and got a temp file under /tmp/gpgtestXXXXXXXXXX. Supposedly /tmp is the default if TMPDIR envvar is not set. Am I doing this correctly?

@laszlocsomor
Copy link
Contributor

It seems that --local_tmp_root does not end up being implemented?

Correct. Sorry for not communicating that more clearly.

Supposedly /tmp is the default if TMPDIR envvar is not set. Am I doing this correctly?

Yes you are, and yes, it is how you said.

@ittaiz
Copy link
Member

ittaiz commented Feb 9, 2018

Just to make sure- if I used to specify “-Djava.io.tmpdir=/tmp“ in many test rules then I don’t need to do it anymore?

@laszlocsomor
Copy link
Contributor

laszlocsomor commented Feb 9, 2018

@ittaiz : On Linux/macOS: no, you no longer need to specify that flag:

// Do not use `fallbackTmpDir`, use `/tmp` instead. This way if the user didn't export TMPDIR
// in their environment, Bazel will still set a TMPDIR that's Posixy enough and plays well
// with heavily path-length-limited scenarios, such as the socket creation scenario that
// motivated https://github.com/bazelbuild/bazel/issues/4376.
p = "/tmp";

// Do not use `fallbackTmpDir`, use `/tmp` instead. This way if the user didn't export TMPDIR
// in their environment, Bazel will still set a TMPDIR that's Posixy enough and plays well
// with heavily path-length-limited scenarios, such as the socket creation scenario that
// motivated https://github.com/bazelbuild/bazel/issues/4376.
p = "/tmp";

On Windows, Bazel falls back to the tmp-in-execroot, so you either have to export TMP or TEMP, or use -Djava.io.tmpdir=C:/tmp:

@ittaiz
Copy link
Member

ittaiz commented Feb 12, 2018

Btw,
From an unscientific check I did with a few tens of colleagues using Mac OS (and without the same provisioning mechanism) they all had TMPDIR defined.
Could it be that TMPDIR is defined by default in OS X?
If so, any chance for the cmd line flag that was mentioned? We have “/tmp” sprinkled in hundreds of BUILD files due to this issue

@laszlocsomor
Copy link
Contributor

laszlocsomor commented Feb 12, 2018

Yes, it's possible that TMPDIR is defined by default, not only on OSX but on Linux too.

Do you see any practical difference between adding a flag to everyone's ~/.bazelrc (assuming Bazel supported the flag), and adding export TMPDIR=/tmp to everyone's ~/.bash_profile? I believe they'd have the same desired effect, would amount to the same work on your side, but the latter would require no work from the Bazel team. WDYT?

@ittaiz
Copy link
Member

ittaiz commented Feb 12, 2018 via email

@laszlocsomor
Copy link
Contributor

OK, I'll implement the flag then.

@laszlocsomor laszlocsomor reopened this Feb 12, 2018
bazel-io pushed a commit that referenced this issue Feb 13, 2018
Add new flag called `--local_tmp_root`, which (if
specified) tells Bazel what temp directory should
locally executed actions use.

Fixes #4621
Related to #3215

RELNOTES[NEW]: The new "--local_tmp_root=<path>" flag allows specifying the temp directory for locally executed actions.

Change-Id: Ice69a5e63d0bf4d3b5c9ef4dbdd1ed1c5025f85e
PiperOrigin-RevId: 185509555
bazel-io pushed a commit that referenced this issue Feb 16, 2018
*** Reason for rollback ***

There's already a --test_tmpdir flag, and Java tests don't pick up this new one. More info: #4621 (comment)

*** Original change description ***

tmpdir,local-exec: implement --local_tmp_root

Add new flag called `--local_tmp_root`, which (if
specified) tells Bazel what temp directory should
locally executed actions use.

Fixes #4621
Related to #3215

RELNOTES[NEW]: The new "--local_tmp_root=<path>" flag allows specifying the temp directory for locally executed actions.

Change-Id: Ice69a5e63d0bf4d3b5c9ef4dbdd1ed1c5025f85e
PiperOrigin-RevId: 185982705
@laszlocsomor
Copy link
Contributor

FYI, I decided to roll back the --local_tmp_root flag, see: #4621 (comment)

For the record, there's already a --test_tmpdir flag, which allows going down to a 42-character-long temp root, see #4621 (comment)

I'll discuss with the team how to go forward.

@laszlocsomor laszlocsomor reopened this Feb 16, 2018
@or-shachar
Copy link
Contributor Author

So we haven't visited this issue for a while. Any updates?

@meisterT meisterT added team-Local-Exec Issues and PRs for the Execution (Local) team untriaged and removed P3 We're not considering working on this, but happy to review a PR. (No assignee) category: misc > misc labels May 12, 2020
@jmmv
Copy link
Contributor

jmmv commented May 15, 2020

I generally agree with the sentiment that Bazel should be using TMPDIR-compliant paths. Unfortunately, Bazel as currently implemented, loves long paths. Addressing this specific issue here would add inconsistency, and it doesn't seem trivial in this context. I also suspect several things would break given the current model in Bazel.

We have encountered this too in some tests and have fixed it by making any such affected test use /tmp directly, and then using mkdtemp or mkstemp to create directories or files safely within that. /tmp is open for writes from the sandbox, so this works.

Also, given that Unix domain socket names are restricted in length, any test that wants to create them must ensure that they fit within those limits. Relying on any environment variable or the cwd configured by Bazel to be "short enough" is bound to fail in some cases, so I'd say that we shouldn't get into that business and that any tests relying on this specific limit must explicitly account for it.

@jmmv jmmv closed this as completed May 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Local-Exec Issues and PRs for the Execution (Local) team type: feature request untriaged
Projects
None yet
Development

No branches or pull requests

9 participants