Permission denied issue when writing to /tmp/.tensorboard-info #2010

Closed
tete1030 opened this issue Mar 14, 2019 · 24 comments · Fixed by #2131

Comments

@tete1030

  • TensorBoard version (from pip package, also printed out when running tensorboard)
    1.13.1
  • TensorFlow version if different from TensorBoard
    1.13.1
  • OS Platform and version (e.g., Linux Ubuntu 16.04)
    Ubuntu 16.04
  • Python version (e.g. 2.7, 3.5)
    2.7

Please describe the bug as clearly as possible, and if possible provide a minimal example (code, data, and/or command line) to reproduce the issue. Thanks!

The writing of TensorBoard info files introduced in #1806 can cause permission problems in a multi-user scenario. It creates the .tensorboard-info directory directly under /tmp, as in

path = os.path.join(tempfile.gettempdir(), ".tensorboard-info")

If the directory has already been created by one user, it is not writable by other users.

image
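
The failure mode described above can be reproduced without touching the real /tmp/.tensorboard-info. The following is a minimal sketch, assuming a Unix-like system; `mode_of_new_dir` is a hypothetical helper for illustration and is not part of TensorBoard. It creates a directory the same way the reported code does, under an explicitly chosen umask, and returns the resulting permission bits.

```python
import os
import stat
import tempfile


def mode_of_new_dir(umask):
    """Create a directory under `umask` and return its rwx permission bits.

    Hypothetical helper for illustration only (not TensorBoard code).
    """
    parent = tempfile.mkdtemp()  # private scratch space, not the real /tmp entry
    path = os.path.join(parent, ".tensorboard-info")
    old_umask = os.umask(umask)
    try:
        os.makedirs(path)  # default mode 0o777, reduced by the umask
    finally:
        os.umask(old_umask)  # always restore the caller's umask
    return stat.S_IMODE(os.stat(path).st_mode) & 0o777


if __name__ == "__main__":
    mode = mode_of_new_dir(0o022)
    print(oct(mode))
    # With umask 022 the "other" write bit is cleared, so a second user
    # cannot create files inside a directory made this way.
    print("writable by others:", bool(mode & stat.S_IWOTH))
```

With the common umask of 022, the directory comes out as 0o755, which matches the symptom in this report: the first user to run TensorBoard owns the directory, and nobody else can write into it.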

@tete1030 tete1030 changed the title Permission denied issue when creating /tmp/.tensorboard-info Permission denied issue when writing to /tmp/.tensorboard-info Mar 14, 2019
@wchargin
Contributor

Hi @tete1030! Thanks for the clear report. This is a good point—I’d
considered the multi-user case and determined that it wouldn’t be a
problem for reading files in this directory, but didn’t realize that
it would not be possible to write new files in the directory.

I think that the following patch should suffice, at least on Unices:

diff --git a/tensorboard/manager.py b/tensorboard/manager.py
index a86c010b..92f7601f 100644
--- a/tensorboard/manager.py
+++ b/tensorboard/manager.py
@@ -235,6 +235,7 @@ def _get_info_dir():
   The directory will be created if it does not exist.
   """
   path = os.path.join(tempfile.gettempdir(), ".tensorboard-info")
+  old_umask = os.umask(0o000)
   try:
     os.makedirs(path)
   except OSError as e:
@@ -242,6 +243,8 @@ def _get_info_dir():
       pass
     else:
       raise
+  finally:
+    os.umask(old_umask)
   return path
 

I’ll have to test this on Windows. If you’re looking for a quick fix,
you should be able to patch your TensorBoard install as above. (Or just
chmod a+w /tmp/.tensorboard-info, which will work until the next time
that /tmp/ is cleared.)
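
For readers who prefer to see the patched logic as a whole rather than as a diff, here is a self-contained sketch of the same idea. It is an illustration under assumptions, not the code that was eventually merged: the function name `get_info_dir` and its `base_dir` parameter are hypothetical, added so the sketch can be tried in a scratch directory.

```python
import errno
import os
import stat
import tempfile


def get_info_dir(base_dir=None):
    """Create (if needed) and return a world-writable info directory.

    Sketch of the umask-based patch above. `base_dir` is a hypothetical
    parameter that defaults to the system temporary directory.
    """
    if base_dir is None:
        base_dir = tempfile.gettempdir()
    path = os.path.join(base_dir, ".tensorboard-info")
    old_umask = os.umask(0o000)  # do not mask any permission bits
    try:
        os.makedirs(path)
    except OSError as e:
        # An existing directory is fine; anything else is a real error.
        if not (e.errno == errno.EEXIST and os.path.isdir(path)):
            raise
    finally:
        os.umask(old_umask)  # restore the process-wide umask
    return path


if __name__ == "__main__":
    scratch = tempfile.mkdtemp()
    info_dir = get_info_dir(scratch)
    print(info_dir, oct(stat.S_IMODE(os.stat(info_dir).st_mode) & 0o777))
```

Note that the umask is process-wide state, which is why the sketch restores it in a `finally` clause even when directory creation fails.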

@Harshini-Gadige

@tete1030 Any update on this?
@wchargin Please let me know if you want me to keep this issue open until the Windows test is done.

@tete1030
Author

tete1030 commented Apr 5, 2019

Sorry that I didn't reply. The patch works great and I have not encountered this problem again.

@wchargin
Contributor

wchargin commented Apr 5, 2019

Great—glad to hear that the patch is working, @tete1030 (and sorry for
the inconvenience).

@hgadig: Yes, please keep this open.

@lebrice

lebrice commented Apr 18, 2019

Hey there, just stumbled upon this. I thought I'd mention that this patch doesn't fix the issue from the non-sudo user perspective. I applied the patch first, but it did not fix the problem, as I do not have write access to that directory anyway. I'm going to try changing the ".tensorboard-info" name to something unique and hope that works. It might be nice to be able to customize this location on a per-user basis?

@wchargin
Contributor

@lebrice, could you clarify what you mean by the “non-sudo user
perspective”? I can see from @tete1030’s screenshot that they’re not
running as root or with sudo. Does your user account not have write
access to $TMPDIR?

(To be clear, the patch needs to be applied before the info directory
is first created. If you’ve created a write-restricted info directory as
root by running TensorBoard without this patch, then yes, you’ll need to
remove it or change its mode.)

@noisychannel

noisychannel commented Apr 18, 2019

My workaround until this issue is resolved is to set the TMPDIR environment variable. This works because TensorBoard uses tempfile, which respects user-set environment variables (TMPDIR, TMP, etc.). Make sure that the directory that TMPDIR points to exists!

You can change and test the new temp directory by running the following (which sets it to /tmp/$USER instead of /tmp).

export TMPDIR=/tmp/$USER; mkdir -p $TMPDIR; python -c "import tempfile; print(tempfile.gettempdir())"

A safe tensorboard invocation is:

export TMPDIR=/tmp/$USER; mkdir -p $TMPDIR; tensorboard --logdir $LOGDIR
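
The same check can be scripted. The sketch below, assuming a Unix-like system, starts a fresh Python process with TMPDIR set and reports what `tempfile.gettempdir()` returns there; `gettempdir_with_tmpdir` is a hypothetical helper name. A child process is used because `tempfile` caches its answer after the first call, so changing TMPDIR inside an already-running interpreter may have no effect.

```python
import os
import subprocess
import sys
import tempfile


def gettempdir_with_tmpdir(tmpdir):
    """Return tempfile.gettempdir() as seen by a child process with TMPDIR set.

    Hypothetical helper for illustration. The directory is created first,
    because tempfile skips candidate directories it cannot use.
    """
    os.makedirs(tmpdir, exist_ok=True)
    env = dict(os.environ, TMPDIR=tmpdir)
    result = subprocess.run(
        [sys.executable, "-c", "import tempfile; print(tempfile.gettempdir())"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # A throwaway location stands in for /tmp/$USER here.
    private_tmp = os.path.join(tempfile.mkdtemp(), "per-user-tmp")
    print(gettempdir_with_tmpdir(private_tmp))
```

If the printed path is the per-user directory, TensorBoard started with the same environment will create its .tensorboard-info directory there instead of in the shared /tmp.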

@noisychannel

noisychannel commented Apr 18, 2019

Also, for @wchargin, the issue being referenced as the multi-user scenario is:

  1. User A starts tensorboard and hence /tmp/.tensorboard-info is owned by that user.
  2. When user B invokes tensorboard, it will crash since it cannot create files in /tmp/.tensorboard-info which is owned by user A. This ownership cannot be changed without sudo access.
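
Whether a given user is in the position of "user B" can be checked directly. This is a minimal sketch, assuming a Unix-like system; `check_info_dir` is a hypothetical helper, not a TensorBoard function. It reports the directory's mode, whether the current user owns it, and whether the current user can create files in it.

```python
import os
import stat
import tempfile


def check_info_dir(path):
    """Summarize who can write to `path` (hypothetical helper, Unix only)."""
    st = os.stat(path)
    return {
        "mode": oct(stat.S_IMODE(st.st_mode)),
        "owned_by_me": st.st_uid == os.getuid(),
        # Creating a file in a directory needs write and search permission.
        "can_create_files": os.access(path, os.W_OK | os.X_OK),
    }


if __name__ == "__main__":
    info_dir = os.path.join(tempfile.gettempdir(), ".tensorboard-info")
    if os.path.isdir(info_dir):
        print(check_info_dir(info_dir))
    else:
        print("no info directory at", info_dir)
```

If `can_create_files` is false and `owned_by_me` is false, the current user is affected by this issue and needs either the owner's help or a different TMPDIR.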

@wchargin
Contributor

wchargin commented Apr 18, 2019

> Also, for @wchargin, the issue being referenced as the multi-user
> scenario is:

Right, I understand this; thanks. (That’s what this issue is about.) And
this should be fixed by the patch above, though changing TMPDIR is
also certainly a valid workaround.

@dby2017

dby2017 commented Apr 19, 2019

@wchargin I have encountered this problem and used the method you provided. I still can't solve it.

@tete1030
Author

> @wchargin I have encountered this problem and used the method you provided. I still can't solve it.

Make sure you have deleted the old /tmp/.tensorboard-info, or use chmod a+w /tmp/.tensorboard-info. Both require root privileges.

Be aware that if you have patched only the TensorBoard installed in one environment (e.g. a conda env, or a locally installed package), another user with an unpatched environment can still cause this problem.

wchargin added a commit that referenced this issue Apr 19, 2019
Summary:
Fixes #2010.

This patch is primarily directed at Unices. On Windows, the underlying
problem should not be an issue, because the default temporary directory
seems to be user-scoped (e.g., `C:\Users\wchargin\AppData\Local\Temp`).

Test Plan:
Unit tests added. Before this change, the first test has an assertion
error (`'0o755' != '0o777'`) and the second test has an I/O error trying
to write to a non-writable directory. After this change, all tests pass.

Tested only on Linux.

wchargin-branch: info-dir-mode
wchargin added a commit that referenced this issue Apr 19, 2019
…eful (#2131)

@minygd

minygd commented May 1, 2019

> > @wchargin I have encountered this problem and used the method you provided. I still can't solve it.
>
> Make sure you have deleted the old /tmp/.tensorboard-info, or use chmod a+w /tmp/.tensorboard-info. They both require root privilege.
>
> Be aware if you have only patched tensorboard installed in one environment (e.g. a conda env, or locally installed package), other user with unpatched env could also cause this problem

So does this mean that on one Linux server, only one user can use TensorBoard freely? And if others want to use it, they must get root privileges to fix the problem?

@wchargin
Contributor

wchargin commented May 1, 2019

@minygd: This issue has been fixed, and the fix will be in the next
TensorBoard release. You can get that release now by installing the
latest version of tb-nightly, or you can wait for TensorBoard 1.14 to
be released (in the next few weeks, probably).

@reactivetype

@wchargin I am using ubuntu with tb-nightly 1.14.0a20190506 and I still have the issue.

File "envs/tf-nightly/lib/python3.6/site-packages/tensorboard/manager.py", line 269, in write_info_file
    with open(_get_info_file_path(), "w") as outfile:
PermissionError: [Errno 13] Permission denied: '/tmp/.tensorboard-info/pid-10141.info'

@wchargin
Contributor

wchargin commented May 7, 2019

@reactivetype: Could you please run

pip freeze 2>&1 | grep -e tensor -e tf- -e tb-; printf '%s\n' ---; python -c 'print(__import__("inspect").getsource(__import__("tensorboard.manager").manager._get_info_dir))'; printf '%s\n' ---; stat /tmp/.tensorboard-info/

and post the full output here?

@reactivetype

reactivetype commented May 8, 2019

@wchargin thanks for replying. here is the output:

$ pip freeze 2>&1 | grep -e tensor -e tf- -e tb-; printf '%s\n' ---; python -c 'print(__import__("inspect").getsource(__import__("tensorboard.manager").manager._get_info_dir))'; printf '%s\n' ---; stat /tmp/.tensorboard-info/
tb-nightly==1.14.0a20190506
tf-estimator-nightly==1.14.0.dev2019042901
tf-nightly-gpu==1.14.1.dev20190506
---
def _get_info_dir():
  """Get path to directory in which to store info files.

  The directory returned by this function is "owned" by this module. If
  the contents of the directory are modified other than via the public
  functions of this module, subsequent behavior is undefined.

  The directory will be created if it does not exist.
  """
  path = os.path.join(tempfile.gettempdir(), ".tensorboard-info")
  try:
    os.makedirs(path)
  except OSError as e:
    if e.errno == errno.EEXIST and os.path.isdir(path):
      pass
    else:
      raise
  else:
    os.chmod(path, 0o777)
  return path

---
  File: '/tmp/.tensorboard-info/'
  Size: 4096            Blocks: 8          IO Block: 4096   directory
Device: fc00h/64512d    Inode: 9961546     Links: 2
Access: (0775/drwxrwxr-x)  Uid: ( 1001/a00447759)   Gid: ( 1001/a00447759)
Access: 2019-04-02 21:38:07.367113664 -0400
Modify: 2019-04-22 12:22:51.280130190 -0400
Change: 2019-04-22 12:22:51.280130190 -0400
 Birth: -

@wchargin
Contributor

wchargin commented May 8, 2019

@reactivetype: Thanks. It looks like your .tensorboard-info directory
was created with an older version of TensorBoard, before the fix went
in. You’ll need to manually change its permissions just this once:

sudo chmod 777 /tmp/.tensorboard-info
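
Why the fix does not repair an existing directory follows from the function source posted above: the `os.chmod` call sits in the `else` branch, so it runs only when `os.makedirs` actually created the directory. The sketch below reproduces that logic in scratch directories. The `base_dir` parameter and the `mode` helper are hypothetical additions for illustration; the rest mirrors the posted function.

```python
import errno
import os
import stat
import tempfile


def get_info_dir(base_dir):
    """Same logic as the posted _get_info_dir, with a hypothetical base_dir."""
    path = os.path.join(base_dir, ".tensorboard-info")
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno == errno.EEXIST and os.path.isdir(path):
            pass  # directory already exists: its mode is left untouched
        else:
            raise
    else:
        os.chmod(path, 0o777)  # only reached for a freshly created directory
    return path


def mode(path):
    """Return the rwx permission bits of `path`."""
    return stat.S_IMODE(os.stat(path).st_mode) & 0o777


# Case 1: no directory yet, so the fixed code creates it world-writable.
fresh = get_info_dir(tempfile.mkdtemp())

# Case 2: a directory left behind with mode 0775, as in the stat output above.
stale_base = tempfile.mkdtemp()
stale = os.path.join(stale_base, ".tensorboard-info")
os.mkdir(stale)
os.chmod(stale, 0o775)
get_info_dir(stale_base)

print(oct(mode(fresh)), oct(mode(stale)))  # prints: 0o777 0o775
```

The second case is the situation in this comment: the directory predates the fix, so its mode stays 0775 until someone with the right to do so changes it.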

@reactivetype

@wchargin Makes sense. Thank you very much!

wchargin added a commit that referenced this issue May 16, 2019
Summary:
Users often report problems that depend on environment-specific
configuration. Rather than asking them to find all this information and
enter it into an issue template manually, we can ask them to run a
script and paste its output verbatim into the issue. Furthermore, we can
detect and suggest fixes to common problems, such as #1907 and #2010.
The script can grow as we see fit to add new diagnoses and suggestions.

Test Plan:
Open to suggestions about how much automated testing there should be.
The structure of the script makes it robust to errors in each individual
diagnosis (in the worst case, it prints a stack trace and continues to
the next one), and I tested that the main framework works in Python 2
and 3 on Linux and in Python 3 on Windows.

To simulate a bad hostname, add

```python
socket.getfqdn = lambda: b"\xc9".decode("utf-8")
```

to the top of the script.

To simulate a bad `.tensorboard-info` directory and test the quoting
behavior, run

    export TMPDIR="$(mktemp -d)/uh oh/" &&
    mkdir -p "${TMPDIR}/.tensorboard-info" &&
    chmod 000 "${TMPDIR}/.tensorboard-info" &&
    python ./tensorboard/tools/diagnose_me.py

wchargin-branch: diagnose-me
wchargin added a commit that referenced this issue May 16, 2019
Summary:
Users often report problems that depend on environment-specific
configuration. Rather than asking them to find all this information and
enter it into an issue template manually, we can ask them to run a
script and paste its output verbatim into the issue. Furthermore, we can
detect and suggest fixes to common problems, such as #1907 and #2010.
The script can grow as we see fit to add new diagnoses and suggestions.

Test Plan:
The script is designed to be robust to errors in each individual
diagnosis: in the worst case, it prints a stack trace and continues to
the next one. I’ve manually tested that the main framework works in
Python 2 and 3 on Linux and in Python 3 on Windows.

Automated testing of this script is possible, but would take a fair
number of CPU cycles to run tests (setting up virtualenvs and Conda,
installing and importing TensorFlow many times). Given that this script
is never a production dependency and is explicitly designed to be run in
a discussion context, light testing seems reasonable.

To simulate a bad hostname, add

```python
socket.getfqdn = lambda: b"\xc9".decode("utf-8")
```

to the top of the script.

To simulate a bad `.tensorboard-info` directory and test the quoting
behavior, run

```shell
export TMPDIR="$(mktemp -d)/uh oh/" &&
mkdir -p "${TMPDIR}/.tensorboard-info" &&
chmod 000 "${TMPDIR}/.tensorboard-info" &&
python ./tensorboard/tools/diagnose_tensorboard.py
```

To cross-check the autoidentification logic:

```
$ python tensorboard/tools/diagnose_tensorboard.py |
> awk '/version / { print $NF }'
e093841ffaea564cb2410e0b430bd0c552ada208
$ git hash-object tensorboard/tools/diagnose_tensorboard.py
e093841ffaea564cb2410e0b430bd0c552ada208
$ git rev-parse HEAD:tensorboard/tools/diagnose_tensorboard.py
e093841ffaea564cb2410e0b430bd0c552ada208
$ git cat-file blob e093841ffaea564cb2410e0b430bd0c552ada208 |
> diff -u tensorboard/tools/diagnose_tensorboard.py - | wc -l
0
```

wchargin-branch: diagnose-me
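
The error-isolation behavior described in the commit message above (a failing diagnosis prints a stack trace and the script moves on to the next one) can be sketched as follows. This is not the actual diagnostic script; `run_diagnoses` and the two example checks are hypothetical names used only to illustrate the pattern.

```python
import traceback


def run_diagnoses(checks):
    """Run each check in order; a failing check does not stop the others.

    Hypothetical sketch of the pattern, not the real script's code.
    """
    results = {}
    for check in checks:
        try:
            results[check.__name__] = check()
        except Exception:
            # Worst case: show the stack trace and continue with the next check.
            traceback.print_exc()
            results[check.__name__] = None
    return results


def broken_check():
    raise ValueError("simulated failure inside one diagnosis")


def working_check():
    return "ok"


if __name__ == "__main__":
    print(run_diagnoses([broken_check, working_check]))
```

Keeping each diagnosis independent means one unexpected environment quirk does not hide the output of the remaining checks.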
@zhixuanli

Just try

sudo chmod 777 /tmp/.tensorboard-info/

@alwynmathew

alwynmathew commented Oct 21, 2019

> @reactivetype: Thanks. It looks like your .tensorboard-info directory
> was created with an older version of TensorBoard, before the fix went
> in. You’ll need to manually change its permissions just this once:
>
> sudo chmod 777 /tmp/.tensorboard-info

Hey @wchargin, I have installed TensorBoard 1.14 but I face the same issue. It seems that I have a .tensorboard-info directory created with an older version of TensorBoard. As I'm not a sudoer, I don't have permission to delete /tmp/.tensorboard-info/ or change its permissions. Is there a way to fix this without asking our admin to delete the directory or change its permissions?

@wchargin
Contributor

wchargin commented Oct 21, 2019

@alwynmathew: Who owns the /tmp/.tensorboard-info directory? (Find out
with stat --format=%U /tmp/.tensorboard-info.) That user—probably
whoever created the directory—can chmod it or remove it.
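
A rough Python equivalent of that `stat` command, for systems where the `--format` option is unavailable, might look like this. It is a sketch assuming a Unix-like system (the `pwd` module does not exist on Windows), and `owner_name` is a hypothetical helper.

```python
import os
import pwd  # Unix-only standard library module
import tempfile


def owner_name(path):
    """Return the login name of the owner of `path` (hypothetical helper).

    Falls back to the numeric uid if the account has no passwd entry.
    """
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


if __name__ == "__main__":
    info_dir = os.path.join(tempfile.gettempdir(), ".tensorboard-info")
    target = info_dir if os.path.isdir(info_dir) else tempfile.gettempdir()
    print(target, "is owned by", owner_name(target))
```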

@SimonSelg

SimonSelg commented Dec 24, 2019

@alwynmathew: I have encountered the same problem while working on a shared machine. Since the admin is on Christmas break, I had to find another solution.

It turns out that TensorBoard uses tempfile.gettempdir() to get the path to the temporary directory (usually /tmp/). The .tensorboard-info folder is created in this directory.

It is possible to change the directory returned by tempfile.gettempdir() by setting the environment variable TMPDIR.

Now we can simply force TensorBoard to use a directory other than /tmp/.tensorboard-info/, like /tmp/.selgs/.tensorboard-info:

mkdir /tmp/.selgs
env TMPDIR=/tmp/.selgs tensorboard 

@wchargin
Contributor

Yep, using TMPDIR in this format is specifically encouraged by
POSIX. That’s a good workaround; thanks for sharing!

(You probably want to use either /tmp/selgs or /tmp/.selgs in both
lines, and you can also just use TMPDIR=/tmp/selgs tensorboard to set
an environment variable for one command; no need to use env, which
spawns an extra process.)

@alwynmathew

> @alwynmathew: I have encountered the same problem while working on a shared machine. Since the admin is on christmas break, I had to find another solution.
>
> Turns out, tensorboard uses tempfile.tempdir to get the path to the temporary directory (usually /tmp/). In this directory the .tensorboard folder is created.
>
> It is possible to specify the temp directory returned by tempfile.tempdir by setting the environment variable TMPDIR.
>
> Now we can simply force tensorboard to use another directory then /tmp/.tensorboard/, like /tmp/.selgs/.tensorboard:
>
> mkdir /tmp/selgs
> env TMPDIR=/tmp/.selgs tensorboard

Sorry @SimonSelg, I was on Christmas break too. @wchargin's latest solution seems elegant, with just one command. I had a tough time fixing my issue: I tracked down the owner of the directory with stat --format=%U /tmp/.tensorboard-info, then pinged him to change the permissions. Hope you found a solution.
