Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Stemcell 3262.12] Permissions for /tmp are 700 instead of 770 #97

Closed
voelzmo opened this issue Sep 16, 2016 · 5 comments
Closed

[Stemcell 3262.12] Permissions for /tmp are 700 instead of 770 #97

voelzmo opened this issue Sep 16, 2016 · 5 comments
Labels

Comments

@voelzmo
Copy link
Contributor

voelzmo commented Sep 16, 2016

We've seen permissions for /tmp being wrong for many VMs that have been updated to use stemcell 3262.12. Instead of 770, it has 700.

Here an example from the agent logs of a Director provisioned with bosh-init on a 3262.12 stemcell:

It seems like first the monit jobs are started

2016-09-15_11:50:35.74294 [HTTPS Dispatcher] 2016/09/15 11:50:35 INFO - POST /agent
2016-09-15_11:50:35.74295 [MBus Handler] 2016/09/15 11:50:35 INFO - Received request with action start
2016-09-15_11:50:35.74295 [MBus Handler] 2016/09/15 11:50:35 DEBUG - Payload

Then the agent is re-started

2016-09-15_11:55:51.34013 [main] 2016/09/15 11:55:51 DEBUG - Starting agent

Then some chmodding happens on /tmp and the directory which is later bind-mounted to /tmp

2016-09-15_11:55:59.15746 [File System] 2016/09/15 11:55:59 DEBUG - Symlinking oldPath /var/vcap/data/sys with newPath /var/vcap/sys
2016-09-15_11:55:59.15746 [File System] 2016/09/15 11:55:59 DEBUG - Making dir /var/vcap/data/tmp with perm 493
2016-09-15_11:55:59.15749 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Running command: chown root:vcap /tmp
2016-09-15_11:55:59.15828 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Stdout: 
2016-09-15_11:55:59.15828 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Stderr: 
2016-09-15_11:55:59.15829 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Successful: true (0)
2016-09-15_11:55:59.15832 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Running command: chmod 0770 /tmp
2016-09-15_11:55:59.15886 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Stdout: 
2016-09-15_11:55:59.15887 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Stderr: 
2016-09-15_11:55:59.15888 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Successful: true (0)
2016-09-15_11:55:59.15891 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Running command: chmod 0700 /var/tmp
2016-09-15_11:55:59.15946 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Stdout: 
2016-09-15_11:55:59.15947 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Stderr: 
2016-09-15_11:55:59.15947 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Successful: true (0)
2016-09-15_11:55:59.15950 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Running command: mkdir -p /var/vcap/data/root_tmp
2016-09-15_11:55:59.16018 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Stdout: 
2016-09-15_11:55:59.16018 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Stderr: 
2016-09-15_11:55:59.16019 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Successful: true (0)
2016-09-15_11:55:59.16022 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Running command: chmod 0700 /var/vcap/data/root_tmp
2016-09-15_11:55:59.16075 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Stdout: 
2016-09-15_11:55:59.16075 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Stderr: 
2016-09-15_11:55:59.16076 [Cmd Runner] 2016/09/15 11:55:59 DEBUG - Successful: true (0)

At that point in time, the directory /var/vcap/data/root_tmp is not mounted to /tmp yet

2016-09-15_11:55:59.16076 [File System] 2016/09/15 11:55:59 DEBUG - Reading file /proc/mounts
2016-09-15_11:55:59.16082 [File System] 2016/09/15 11:55:59 DEBUG - Read content
2016-09-15_11:55:59.16083 ********************
2016-09-15_11:55:59.16083 sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
2016-09-15_11:55:59.16083 proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
2016-09-15_11:55:59.16083 udev /dev devtmpfs rw,relatime,size=3818716k,nr_inodes=954679,mode=755 0 0
2016-09-15_11:55:59.16083 devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
2016-09-15_11:55:59.16084 tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=765856k,mode=755 0 0
2016-09-15_11:55:59.16084 /dev/xvda1 / ext4 rw,relatime,data=ordered 0 0
2016-09-15_11:55:59.16084 none /sys/fs/cgroup tmpfs rw,relatime,size=4k,mode=755 0 0
2016-09-15_11:55:59.16084 none /sys/fs/fuse/connections fusectl rw,relatime 0 0
2016-09-15_11:55:59.16084 none /sys/kernel/debug debugfs rw,relatime 0 0
2016-09-15_11:55:59.16085 none /sys/kernel/security securityfs rw,relatime 0 0
2016-09-15_11:55:59.16085 none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
2016-09-15_11:55:59.16085 none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
2016-09-15_11:55:59.16085 none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0
2016-09-15_11:55:59.16085 none /sys/fs/pstore pstore rw,relatime 0 0
2016-09-15_11:55:59.16086 rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
2016-09-15_11:55:59.16086 /dev/xvdb2 /var/vcap/data ext4 rw,relatime,data=ordered 0 0
2016-09-15_11:55:59.16086 /dev/xvdb2 /var/log ext4 rw,relatime,data=ordered 0 0
2016-09-15_11:55:59.16086 tmpfs /var/vcap/data/sys/run tmpfs rw,relatime,size=1024k 0 0
2016-09-15_11:55:59.16087 /dev/xvdb2 /tmp ext4 rw,relatime,data=ordered 0 0
2016-09-15_11:55:59.16087 /dev/xvdb2 /var/tmp ext4 rw,relatime,data=ordered 0 0
2016-09-15_11:55:59.16087 /dev/xvdf1 /var/vcap/store ext4 rw,relatime,data=ordered 0 0
2016-09-15_11:55:59.16087 

I can't really find in the agent logs now when that happened. On the VM itself it seems, however, that the bind-mount happened some time

# mount
/dev/xvda1 on / type ext4 (rw)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/cgroup type tmpfs (rw)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type devtmpfs (rw,mode=0755)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
none on /run/shm type tmpfs (rw,nosuid,nodev)
none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
none on /sys/fs/pstore type pstore (rw)
rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw)
/dev/xvdb2 on /var/vcap/data type ext4 (rw)
/var/vcap/data/root_log on /var/log type none (rw,noexec,nosuid,nodev,bind)
tmpfs on /var/vcap/data/sys/run type tmpfs (rw,size=1m)
/var/vcap/data/root_tmp on /tmp type none (rw,noexec,nosuid,nodev,bind)
/var/vcap/data/root_tmp on /var/tmp type none (rw,noexec,nosuid,nodev,bind)
/dev/xvdf1 on /var/vcap/store type ext4 (rw)

And in /tmp we have some postgres .lock file which breaks Director updates

# ls -la /tmp/
total 12
drwx------  2 root vcap 4096 Sep 16 14:17 .
drwxr-xr-x 23 root root 4096 Sep 15 11:55 ..
srwxrwxrwx  1 vcap vcap    0 Sep 15 11:50 .s.PGSQL.5432
-rw-------  1 vcap vcap   56 Sep 15 11:50 .s.PGSQL.5432.lock

# cat /tmp/.s.PGSQL.5432.lock
16322
/var/vcap/store/postgres-9.4
1473940236
5432
/tmp

Note that the files were created before the above things in the agent log happened?

For comparison, here is some output for a stemcell where it actually works

Creation of /tmp and some chmodding

2016-09-16_13:44:02.44369 [File System] 2016/09/16 13:44:02 DEBUG - Making dir /var/vcap/data/tmp with perm 493
2016-09-16_13:44:02.44491 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Running command: chown root:vcap /tmp
2016-09-16_13:44:02.44580 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stdout:
2016-09-16_13:44:02.44581 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stderr:
2016-09-16_13:44:02.44581 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Successful: true (0)
2016-09-16_13:44:02.44583 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Running command: chmod 0770 /tmp
2016-09-16_13:44:02.44657 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stdout:
2016-09-16_13:44:02.44658 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stderr:
2016-09-16_13:44:02.44658 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Successful: true (0)
2016-09-16_13:44:02.44658 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Running command: chmod 0700 /var/tmp
2016-09-16_13:44:02.44723 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stdout:
2016-09-16_13:44:02.44724 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stderr:
2016-09-16_13:44:02.44724 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Successful: true (0)
2016-09-16_13:44:02.44724 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Running command: mkdir -p /var/vcap/data/root_tmp
2016-09-16_13:44:02.44953 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stdout:
2016-09-16_13:44:02.44953 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stderr:
2016-09-16_13:44:02.44954 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Successful: true (0)
2016-09-16_13:44:02.44956 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Running command: chmod 0700 /var/vcap/data/root_tmp
2016-09-16_13:44:02.45021 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stdout:
2016-09-16_13:44:02.45021 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stderr:
2016-09-16_13:44:02.45022 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Successful: true (0)

some chmodding and bind-mounting

2016-09-16_13:44:02.45043 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Running command: mount /var/vcap/data/root_tmp /tmp -o nodev -o noexec -o nosuid --bind
2016-09-16_13:44:02.45175 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stdout:
2016-09-16_13:44:02.45176 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stderr:
2016-09-16_13:44:02.45176 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Successful: true (0)
2016-09-16_13:44:02.45177 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Running command: chown root:vcap /tmp
2016-09-16_13:44:02.45268 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stdout:
2016-09-16_13:44:02.45269 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stderr:
2016-09-16_13:44:02.45269 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Successful: true (0)
2016-09-16_13:44:02.45271 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Running command: chmod 0770 /tmp
2016-09-16_13:44:02.45339 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stdout:
2016-09-16_13:44:02.45339 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stderr:
2016-09-16_13:44:02.45340 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Successful: true (0)

Then some more chmodding and actual bind-mounting

2016-09-16_13:44:02.45350 [File System] 2016/09/16 13:44:02 DEBUG - Reading file /proc/mounts
2016-09-16_13:44:02.45353 [File System] 2016/09/16 13:44:02 DEBUG - Read content
2016-09-16_13:44:02.45354 ********************
2016-09-16_13:44:02.45354 sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
2016-09-16_13:44:02.45354 proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
2016-09-16_13:44:02.45354 udev /dev devtmpfs rw,relatime,size=4077376k,nr_inodes=1019344,mode=755 0 0
2016-09-16_13:44:02.45354 devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
2016-09-16_13:44:02.45355 tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=817588k,mode=755 0 0
2016-09-16_13:44:02.45355 /dev/vda1 / ext4 rw,relatime,data=ordered 0 0
2016-09-16_13:44:02.45355 none /var/lib/ureadahead/debugfs debugfs rw,relatime 0 0
2016-09-16_13:44:02.45355 none /sys/fs/cgroup tmpfs rw,relatime,size=4k,mode=755 0 0
2016-09-16_13:44:02.45356 none /sys/fs/fuse/connections fusectl rw,relatime 0 0
2016-09-16_13:44:02.45356 none /sys/kernel/debug debugfs rw,relatime 0 0
2016-09-16_13:44:02.45356 none /sys/kernel/security securityfs rw,relatime 0 0
2016-09-16_13:44:02.45356 none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
2016-09-16_13:44:02.45356 none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
2016-09-16_13:44:02.45357 none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0
2016-09-16_13:44:02.45357 none /sys/fs/pstore pstore rw,relatime 0 0
2016-09-16_13:44:02.45357 rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
2016-09-16_13:44:02.45357 /dev/vda3 /var/vcap/data ext4 rw,relatime,data=ordered 0 0
2016-09-16_13:44:02.45358 /dev/vda3 /var/log ext4 rw,relatime,data=ordered 0 0
2016-09-16_13:44:02.45358 tmpfs /var/vcap/data/sys/run tmpfs rw,relatime,size=1024k 0 0
2016-09-16_13:44:02.45358 /dev/vda3 /tmp ext4 rw,relatime,data=ordered 0 0
2016-09-16_13:44:02.45358
2016-09-16_13:44:02.45358 ********************
2016-09-16_13:44:02.45359 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Running command: mount /var/vcap/data/root_tmp /var/tmp -o nodev -o noexec -o nosuid --bind
2016-09-16_13:44:02.45478 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stdout:
2016-09-16_13:44:02.45479 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stderr:
2016-09-16_13:44:02.45479 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Successful: true (0)
2016-09-16_13:44:02.45479 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Running command: chown root:vcap /var/tmp
2016-09-16_13:44:02.45568 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stdout:
2016-09-16_13:44:02.45569 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stderr:
2016-09-16_13:44:02.45569 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Successful: true (0)
2016-09-16_13:44:02.45569 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Running command: chmod 0770 /var/tmp
2016-09-16_13:44:02.45633 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stdout:
2016-09-16_13:44:02.45633 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Stderr:
2016-09-16_13:44:02.45634 [Cmd Runner] 2016/09/16 13:44:02 DEBUG - Successful: true (0)

Monit start seems to happen only after that:

2016-09-16_13:44:17.80942 [Action Dispatcher] 2016/09/16 13:44:17 INFO - Running sync action start

any ideas what happened there?

@voelzmo
Copy link
Contributor Author

voelzmo commented Sep 16, 2016

cc @tylerschultz @holgero @beyhan

@voelzmo
Copy link
Contributor Author

voelzmo commented Sep 16, 2016

After some more investigation:

  • we have co-located some job that re-starts the agent
  • when the agent runs a second time, it also re-sets the permissions to 700 in setupTmpDir
  • it doesn't run the code for mounting again, which in stemcell 3262.12 means it also didn't change permissions to 770, see
    if !mounted && err == nil {
    // mount
    err = bindMounter.Mount(mountSource, mountPoint, "-o", "nodev", "-o", "noexec", "-o", "nosuid")
    if err != nil {
    return bosherr.WrapErrorf(err, "Bind mounting %s dir over %s", mountSource, mountPoint)
    }
    // change permissions for mount point
    err = p.changeTmpDirPermissions(mountPoint)
    if err != nil {
    return err
    }
    } else if err != nil {
  • this seems to be fixed on develop, because permission change is now independent of bind-mounting, see https://github.com/cloudfoundry/bosh-agent/blob/develop/platform/linux_platform.go#L821-L839

@dpb587-pivotal
Copy link
Contributor

Nice investigation! I agree with what you've discovered. So, to recap, the issue happens to be fixed on bosh-agent's develop, but we should consider backporting those fixes onto 3262.x because, while most environments won't hit this initially, if the agent crashes or is intentionally restarted we will start seeing this strange and incorrect behavior during standard BOSH lifecycles.

Sounds like something @cppforlife will care about.

@cppforlife
Copy link
Contributor

I believe this could be closed at this point? @voelzmo @dpb587-pivotal

@voelzmo
Copy link
Contributor Author

voelzmo commented Oct 19, 2016

yeah, this is fixed in most recent stemcells.

@voelzmo voelzmo closed this as completed Oct 19, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants