Fix sandbox permission errors with Mesos 1.6.0 #28
Conversation
Does this degrade gracefully to older versions of Mesos? |
This simplifies debugging of Thermos runner issues significantly, as there is just one relevant log file to look at. For further details, please see the prior discussion at https://reviews.apache.org/r/58609/
For example, rather than just a "Permission denied" one gets a full stacktrace of which access or operation caused the issue.
Rather than taking "/path/to/mesos_task_dir/sandbox" as a constructor argument, the sandbox objects are now passed just "/path/to/mesos_task_dir" as this simplifies a few operations that need the folder created by Mesos. The "sandbox" subfolder is still created with the same name, but this is now only handled internally. There is no user-visible behaviour change.
Mesos 1.6.0 creates sandboxes with permissions 750 rather than 755. This breaks the assumption of Thermos that non-privileged processes can still read the sandbox content. This change is necessary even when container images are used via the Mesos containerizer, as certain processes such as the stdout log rotation continue to run outside of the image.
64a764a
to
13f5cff
Compare
Given our default Aurora and Thermos setup where the executor is launched as
With this change, we now let Thermos change the owner of I have verified that this works in 1.5 and 1.6 by running the end-to-end tests with both versions. I therefore think that this is degrades gracefully as expected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this fix!
Giving notice to other PMC members in the interest of getting 0.21.0 out the door: Given that this PR has been open for 27 days and is needed for our release, I'm calling for lazy consensus. If no one reviews or opposes this patch by Friday, September 7th 2018 at 12 PM San Francisco time, I'll be merging it and cutting 0.21.0-RC1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for testing backward compatibility.
This change makes Aurora compatible with Mesos 1.6.0 which has changed the permission of sandbox folders (see https://issues.apache.org/jira/browse/MESOS-8332).
Please have a look at the individual commits for further details and descriptions.