Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Polyfill OpenBSD unveil for Linux. #490

Merged
merged 1 commit into from
Jul 18, 2022
Merged

Conversation

The-King-of-Toasters
Copy link
Contributor

@The-King-of-Toasters The-King-of-Toasters commented Jul 16, 2022

I'm at this stage of debugging right now, so I'm taking a break and coming back with fresh eyes. I haven't fuzzed the API extensively, but I have confirmed it works on Linux 5.18 and OpenBSD 7.1. I've included a build tool that reads the path & permissions from stdin and calls unveil, before restricting privileges and execing a program.

Closes #485.

@The-King-of-Toasters
Copy link
Contributor Author

Just realised that this impl doesn't actually drop privileges like OpenBSD. Currently rewriting.

Copy link
Owner

@jart jart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good. I could definitely merge this change, and then play around with it, write some unit tests, and try to break it. I left some quick comments though. Please take a look.

/*
* Long living state for landlock calls.
* The bits are set at runtime to handle future API additions.
* As of 5.19, the latest abi is v2.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to define an ABI? Cosmopolitan doesn't support and won't support dynamic linking. However we do support threads, so you may want to consider surrounding access to this static memory using pthread_mutex_lock which needn't be initialized. See https://github.com/jart/cosmopolitan/blob/master/libc/rand/rand64.c as an example, as well as its unit tests. https://github.com/jart/cosmopolitan/blob/master/test/libc/rand/rand64_test.c

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ABI here is future capabilities that landlock supports. Creating a ruleset with unknown (to landlock) features returns EINVAL. Thus the plan is:

  • Create a u64 mask with all features enabled for the latest landlock abi.
  • Get the abi version on first call (could be done with a constructor?).
  • Mask off feature bits if the abi isn't the latest.
  • binand path_beneath_attr with the feature mask.

I totally forgot about threading, I'll try and get the state locked.

@jart
Copy link
Owner

jart commented Jul 17, 2022

Oh also, welcome as a new contributor. We need to go through one quick hurdle before we can proceed. Could you please email Justine Tunney jtunney@gmail.com and say that you intend to assign her the copyright to the changes you contribute to Cosmopolitan? Please send the email under your legal name rather than anonymously. See CONTRIBUTING.md for further details. We make the ask because it helps us ship the tiniest most legally permissive prim and proper binaries. You also only need to do it one time, so any future changes you send us can be quickly merged.

@The-King-of-Toasters
Copy link
Contributor Author

After doing some testing, I'm not impressed with the Landlock API. From the documentation you're supposed to create one ruleset fd, add all your rules to it, set the no_new_privs bit, and call landlock_restrict_self. To me it seemed that we should store a single ruleset fd for the life of the process/thread, and keep adding rules to it. Unfortunately, there are some issues:

  • The locking part doesn't actually kick in until landlock_restrict_self is called, so you can mess about with the FS like normal.
  • It seems that if you issue multiple rules for the same path, the one with the most permissions will be used.

If we can't use the same ruleset fd, why not create a new ruleset per unveil and restrict it at the end? Well it seems that doing this is the equivalent of wiping away the existing rules and starting fresh again - the last unveil takes precedence. I'm not sure if this is due to the no_new_privs bit (since we need to call it before restricting) or if it's just a limitation of landlock.

Anyway, I've been banging my head a lot today and I'm taking a break. Feel free to merge & play around if you'd like.

@jart
Copy link
Owner

jart commented Jul 17, 2022

Try not to worry too much. It's OK if we implement a subset of functionality, since we're still making the full Landlock API available. The main thing we need to do, is explain in the javadoc comment how to "navigate the subset" of functionality that will cause your calls to work on both Linux and BSD.

That's the best high-level advice I can offer right now. I have limited experience with unveil() and Landlock. I'm going to fix that as soon as this is merged. It's OK by me to merge this in an incomplete state. We can iterate. I'm looking forward to helping you write tests on this.

@The-King-of-Toasters
Copy link
Contributor Author

My testing has shown that the multiple ruleset approach is non-optimal, for a couple reasons:

  • Originally you had 64 rulesets per-process (including children, which inherit the parent's), but kernel 5.19 is reducing it to 16.
  • Merging multiple rulesets performs an intersection - if a path is found in one set and not the other, it is dropped. See this comment in the kernel sources.

The only workaround for this approach to be usable would be to create a copy of every unveiled path in a structure of arrays like so:

struct unveil_list {
	unsigned char *buf;
	const char *paths[];
	/* Each byte can be treated as a packed array of four `u2`s,
	 * where 0=r, 1=w, 2=x and 3=c.
	 */
	unsigned char *bits;
	size_t npaths;
};

At initialisation, a ruleset is generated for the life of the process. Since landlock_restrict_self has the effect of clearing out the existing rules, so we can use the same fd (the ruleset limit is still enforced though). Every call to unveil pushes the path and permissions onto the list (reallocated as needed), and npaths is incremented. Then, we loop through the list to call landlock_add_rule for each path, which would require opening up the paths again and regenerating the permission bits from bits. When unveiling is disabled, the list is freed and the ruleset fd is closed.

There are some obvious issues with this workaround too:

  • Userspace cannot guarantee the existence of paths like the kernel can. Since we need to reopen the paths, we would have to either skip the ones that are missing or return an error.
  • Getting access to the permission bits requires complicated (for C) bit shifting.
  • It's possible each path is PATH_MAX long (4096), which for 16 paths is 64K.

So, it seems the only reasonable approach would be the one that exists now: have a single ruleset per process and enforce it when unveiling is disabled. As discussed, the problems for this approach are:

  • Enforcement is applied at the end, so the caller has to:
    • Pinky promise that they won't access any other files before disabling.
    • Remember to actually disable unveiling.
    • Hope that they don't get exploited before disabling.
  • On OpenBSD, the unveil promise can be removed, which is equivalent to disabling unveiling. Thus the Linux seccomp wrapper would have to poke the landlock wrapper's state and disable it before repledging.

@jart
Copy link
Owner

jart commented Jul 18, 2022

Intersection? Don't you mean union? At least that's the behavior I've been observing in the tests I've been writing.

TEST(unveil, overlappingDirectories_inconsistentBehavior) {
  SPAWN();
  ASSERT_SYS(0, 0, makedirs("f1/f2", 0755));
  ASSERT_SYS(0, 0, extract("/zip/life.elf", "f1/f2/life.elf", 0755));
  ASSERT_SYS(0, 0, unveil("f1", "x"));
  ASSERT_SYS(0, 0, unveil("f1/f2", "r"));
  ASSERT_SYS(0, 0, unveil(0, 0));
  if (IsOpenbsd()) {
    // OpenBSD favors the most restrictive policy
    SPAWN();
    ASSERT_SYS(0, 0, stat("f1/f2/life.elf", &st));
    ASSERT_SYS(EACCES, -1,
               execv("f1/f2/life.elf", (char *[]){"f1/f2/life.elf", 0}));
    EXITS(0);
  } else {
    // Landlock (Linux) uses the union of policies
    SPAWN();
    ASSERT_SYS(0, 0, stat("f1/f2/life.elf", &st));
    execv("f1/f2/life.elf", (char *[]){"f1/f2/life.elf", 0});
    EXITS(42);
  }
  EXITS(0);
}

@jart
Copy link
Owner

jart commented Jul 18, 2022

As for the other concerns you mentioned, like the limit of 16, I'm not sure I follow and I'm not sure we need a workaround. It would help me understand better if you could help me write tests that demonstrate the behavior. As far as I can tell, Landlock works reasonably well. So long as we document the ways it's different from OpenBSD and can reliably show errors when things like limits are hit.

I'm going to merge this so I can push the tests I wrote. Could you please send another pull request afterwards helping me improve on this?

@jart jart marked this pull request as ready for review July 18, 2022 09:11
@jart jart merged commit 1c6b5c0 into jart:master Jul 18, 2022
@The-King-of-Toasters
Copy link
Contributor Author

Sure. I was spitballing about finding a way to preserve the OpenBSD semantics on Linux.

When I tried writing my own tests, I found that certain paths (like /proc/self/status) were being opened and denied. Can you point me to the locations where these are opened so that the tests can unveil on init?

jart added a commit that referenced this pull request Jul 18, 2022
@jart
Copy link
Owner

jart commented Jul 18, 2022

The only thing that reads /proc/self/status is IsDebuggerPresent() in libc/intrin/isdebuggerpresent.c. If the system call has an EACCES or EPERM error then it should ignore it and just keep chugging along. Shouldn't need to be unveiled.

@gnoack
Copy link

gnoack commented Jul 23, 2022

Hi! I'm the author of the go-landlock library, so I may be able to help a bit here.

I think you came to the right conclusion of only enforcing Landlock on the final unveil(NULL, NULL) -- you can't build up and apply an allow-list incrementally with Landlock -- the idea is that every landlock_restrict_self call will result in narrowing down the set of permitted file system operations.

For building up the ruleset, you have the option to do it directly when the path is passed to unveil() (that means you'll have an extra open file descriptor though), or to just store the path somewhere on the side and delay the building of the ruleset for later (this means that the file might not be in the exact same place any more by the time you build the ruleset...)...

If you want to go really fancy you could probably emulate the behaviour of OpenBSD's incremental unveiling in user space in the libc... but that might be a bit too much... %-)

I'm also leaving some more concrete points as comments on the code, in the hope that it helps.

@jart
Copy link
Owner

jart commented Jul 23, 2022

Thanks for taking a look @gnoack! There's been a few more changes since this PR so be sure to look at the files at head. I've also updated the pledge.com tool on the website https://justine.lol/pledge/ to include a -v unveil flag. I've used it successfully to sandbox programs like vim and emacs across multiple distros, including ones based on glibc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FR] Unveil support with Landlock polyfill
3 participants