Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osutil/sys, client: add sys.RunAsUidGid, use it for auth.json #4983

Closed
wants to merge 3 commits into from

Conversation

chipaca
Copy link
Contributor

@chipaca chipaca commented Apr 4, 2018

If /home is on NFS, trying to read, write or remove the user's
auth.json as root fails.

This PR adds code to drop privileges to the owner of the file before
attempting operating on it.

This fixes lp:1761193.

If /home is on NFS, trying to read, write or remove the user's
auth.json as root fails.

This PR adds code to drop privileges to the owner of the file before
attempting operating on it.
@chipaca
Copy link
Contributor Author

chipaca commented Apr 4, 2018

Wellp, that didn't work. back to the drawing board.


ch <- f()

// From Go 1.10, if we exit without unlocking, the thread is killed \o/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried to restore the uid/gid and then runtime.UnlockOSThread() instead of exiting?

// RunAsUidGid starts a goroutine, pins it to the OS thread, calls setuid and
// setgid, and runs the function.
func RunAsUidGid(uid UserID, gid GroupID, f func() error) error {
ch := make(chan error)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and a buffered channel instead of unbufferred one.

@chipaca chipaca reopened this Apr 6, 2018
@chipaca
Copy link
Contributor Author

chipaca commented Apr 6, 2018

we're back, baby!

@codecov-io
Copy link

codecov-io commented Apr 6, 2018

Codecov Report

Merging #4983 into master will increase coverage by 0.05%.
The diff coverage is 67.85%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4983      +/-   ##
==========================================
+ Coverage   79.12%   79.18%   +0.05%     
==========================================
  Files         478      485       +7     
  Lines       35230    35950     +720     
==========================================
+ Hits        27876    28467     +591     
- Misses       5157     5234      +77     
- Partials     2197     2249      +52
Impacted Files Coverage Δ
client/login.go 64.1% <67.85%> (-3.12%) ⬇️
cmd/snap/cmd_userd.go 66.66% <0%> (-19.05%) ⬇️
testutil/lowlevel.go 88.78% <0%> (-3.9%) ⬇️
interfaces/builtin/all.go 80.3% <0%> (-3.83%) ⬇️
overlord/snapstate/storehelpers.go 80.76% <0%> (-1.74%) ⬇️
overlord/ifacestate/handlers.go 63.25% <0%> (-1.48%) ⬇️
store/errors.go 71.18% <0%> (-0.95%) ⬇️
client/snap_op.go 80.88% <0%> (-0.79%) ⬇️
cmd/snap-update-ns/utils.go 93.93% <0%> (-0.33%) ⬇️
cmd/snap/cmd_run.go 65.14% <0%> (-0.23%) ⬇️
... and 34 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2091aae...ef022e8. Read the comment docs.

Copy link
Collaborator

@bboozzoo bboozzoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yay!

return nil
}
return sys.RunAsUidGid(uid, gid, func() error {
if err := os.MkdirAll(filepath.Dir(targetFile), 0700); err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, just return the error directly? :-)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the diff formatting confused me, never mind :)

return r1, r2, err
}

type UnrecoverableError struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document this type.

Copy link
Collaborator

@zyga zyga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments inline.

Err error
}

func (e UnrecoverableError) Error() string {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document this error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing docs :)

func retryRawSyscall(trap, a1, a2, a3 uintptr) (r1, r2 uintptr, err syscall.Errno) {
ms := &syscall.Timespec{Nsec: 1000}
for i := 0; i < 30; i++ {
r1, r2, err = syscall.RawSyscall(trap, a1, a2, a3)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a need to use RawSyscall vs Syscall? The difference, AFAIK, is that raw syscall blocks directly and Syscall will ping back to the runtime to say its about to enter and has just left a system call. If there's a need then please document why we use RawSyscall here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trusting the go standard library on the distinction here (syscall.Setreuid et al use RawSyscall). The difference is not documented and I don't know enough of the internals to have a good opinion.

@@ -107,3 +109,79 @@ func FcntlGetFl(fd int) (int, error) {
}
return int(flags), nil
}

func retryRawSyscall(trap, a1, a2, a3 uintptr) (r1, r2 uintptr, err syscall.Errno) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function warrants a comment

ruid := Getuid()
rgid := Getgid()

// do the setregid first =)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document if the order is significant and if so, why, please?

// carefully restore thread state.
defer runtime.UnlockOSThread()

ruid := Getuid()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While reading I read the code for those. Is there any reason we are not using getres{u,g}id()? It uses one less system call.

Drop retryRawSyscall (we don't do this for other syscalls, yet),
document UnrecoverableError, and make RunAsUidGid defer-less.
Copy link
Collaborator

@zyga zyga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a nitpick.

// from the docs:
// until the goroutine exits or calls UnlockOSThread, it will
// always execute in this thread, and no other goroutine can.
// that last bit means it's safe to setuid/setgid in here, as no
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick, That

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm torn between fixing it now, or landing it now and fixing it later :-)

@chipaca
Copy link
Contributor Author

chipaca commented Apr 27, 2018

After much poking around, I don't think this is safe to land as is.

It all boils down to golang/go#20676, which is fixed in 1.10 but not before.

Given that this is for use from snap and not snapd, and that contention should be low, we could make it very very unlikely to hit the bug. But with the number of users we have, very unlikely is still quite likely.

The options to fix lp:1761193 seem to me to be:

  1. write something in cgo that calls clone and then does the syscalls, and either calls a function back (via an id, as you can't pass pointers around), or does the file opening and returns the descriptor,
  2. ignore the contentious contention and land this,
  3. move to 1.10, or
  4. do the file i/o via a helper (e.g. runuser+cat).

the first one gives me hives, the second will cause very weird bugs, the third is likely a year away, so ... maybe I do the fourth?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants