Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

πŸ›(chg): don't crash when user has more than 16 groups on macOS #573

Closed
wants to merge 1 commit into from

Conversation

vegerot
Copy link
Contributor

@vegerot vegerot commented Mar 20, 2023

πŸ›(chg): don't crash when user has more than 16 groups on macOS

Summary:
D43676809 (OSS: 49ee17f) introduced a check for whether the chg server and
sl process have different permissions. We run getgroups in two different
places:

  1. using os.getgroups in sl client in Python
  2. using libc's unistd/getgroups in chg server in C

Essentially the way this works is we call os.getgroups in the client (Python)
and unistd/getgroups on the chg server (C). Then, the server compares the
results. If they're the same, then both processes have the same permissions,
otherwise they might be different.

However, getgroups on macOS has a key limitation that it will only return a
max of NGROUPS_MAX groups (16 groups on my machine, defined in
sys/param.h). You can remove this limit by defining
_DARWIN_UNLIMITED_GETGROUPS or _DARWIN_C_SOURCE1.

os.getgroups in Python is compiled with _DARWIN_C_SOURCE2 but chg does
not, meaning that the result of getgroups is different in our Python and C
code.

This commit defines _DARWIN_UNLIMITED_GETGROUPS for chg so getgroups will
have consistent behavior. (the alternative is patching CPython to unset
_DARWIN_C_SOURCE which is worse)

Special thanks to @strager for helping me debug this πŸ‘πŸ»

Test plan:

  1. have user with <=16 groups

  2. run sl --version

  3. confirm it works

  4. have user with >16 groups

  5. run sl --version

  6. notice sl failing with error "chg: abort: too many redirections."

  7. apply this patch

  8. repeat steps 4-5

  9. confirm it works

Co-authored-by: strager Strager.nds@gmail.com

Footnotes

  1. https://opensource.apple.com/source/xnu/xnu-3247.1.106/bsd/man/man2/getgroups.2.auto.html ↩

  2. https://github.com/python/cpython/blob/28d369e070652e8f2274101d72131e3140dfadf7/configure.ac#L267 ↩

@vegerot
Copy link
Contributor Author

vegerot commented Mar 20, 2023

I would like to add a test to prevent future regressions but don't know an easy way to write a reliable test for this. test-fb-ext-grpcheck just mocks os.getgroups but I'd need an actual user to confirm the difference _DARWIN_UNLIMITED_GETGROUPS makes

@vegerot vegerot force-pushed the pr573 branch 2 times, most recently from 1d83bc7 to 327cd92 Compare March 20, 2023 20:11
@@ -1669,6 +1669,12 @@ def cythonize(*args, **kwargs):
"depends": [versionhashpath],
"include_dirs": ["contrib/chg"] + include_dirs,
"extra_args": filter(None, cflags + chgcflags + [STDC99, WALL, PIC]),
# chg uses libc::unistd/getgroups() to check that chg and the
# sl cli have the same permissions (see 49ee17f2bfad).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd write D43676809 instead of the open source Git commit hash (49ee17f2bfad).

Copy link
Contributor Author

@vegerot vegerot Mar 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

off-topic: sapling doesn't support referencing commits by Revision:

❯ sl show D43676809
abort: unknown revision 'D43676809'!

so I need to sl show the git commit and find the revision#.
that'd be a neat feature to add, but it'd only make sense in open source FB repos

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a similar change to the commit message too

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it'd only make sense in open source FB repos

LLVM too!

# however, on macOS, getgroups() is limited to NGROUPS_MAX (16) groups by default.
# we can work around this by defining _DARWIN_UNLIMITED_GETGROUPS
# see https://opensource.apple.com/source/xnu/xnu-3247.1.106/bsd/man/man2/getgroups.2.auto.html
"macros": [("_DARWIN_UNLIMITED_GETGROUPS", "1")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also write a comment in the calls to getgroups? I'm thinking about the situation where a programmer ports the chg client (or server) code to Rust. They'll see a call to getgroups and just call libc::getgroups without checking that it matches the other half. This would have a chance of reintroducing the bug (because there are no tests to detect it).

In my codebases, I would write # NOTE[macos-getgroups]: ... here, and in the getgroups calls I would write # See NOTE[macos-getgroups]. to link to this note. I don't know what the convention is for the Sapling codebase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I would rather just write a test, but having to create a user and add groups to it in the sl test environment sounds very tricky to do without polluting the user's machine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly the commit message goes much more in-depth than even this comment. Should I put the commit message in some docs somewhere and reference it here and with the getgroups calls?

Summary:
D43676809 (OSS: 49ee17f) introduced a check for whether the chg server and
sl process have different permissions.  We run `getgroups` in two different
places:
1. using `os.getgroups` in sl client in Python
2. using libc's `unistd/getgroups` in chg server in C

Essentially the way this works is we call `os.getgroups` in the client (Python)
and `unistd/getgroups` on the chg server (C).  Then, the server compares the
results.  If they're the same, then both processes have the same permissions,
otherwise they might be different.

However, `getgroups` on macOS has a key limitation that it will only return a
max of `NGROUPS_MAX` groups (16 groups on my machine, defined in
`sys/param.h`).  You can remove this limit by defining
`_DARWIN_UNLIMITED_GETGROUPS` or `_DARWIN_C_SOURCE`[^1].

`os.getgroups` in Python is compiled with `_DARWIN_C_SOURCE`[^2] but chg does
not, meaning that the result of `getgroups` is different in our Python and C
code.

This commit defines `_DARWIN_UNLIMITED_GETGROUPS` for chg so `getgroups` will
have consistent behavior. (the alternative is patching CPython to unset
`_DARWIN_C_SOURCE` which is worse)

Special thanks to @strager for helping me debug this πŸ‘πŸ»

Test plan:
1. have user with <=16 groups
2. run `sl --version`
3. confirm it works

4. have user with >16 groups
5. run `sl --version`
6. notice sl failing with error "chg: abort: too many redirections."

7. apply this patch
8. repeat steps 4-5
9. confirm it works

[^1]: https://opensource.apple.com/source/xnu/xnu-3247.1.106/bsd/man/man2/getgroups.2.auto.html
[^2]: https://github.com/python/cpython/blob/28d369e070652e8f2274101d72131e3140dfadf7/configure.ac#L267


Co-authored-by: strager <Strager.nds@gmail.com>
@facebook-github-bot
Copy link
Contributor

@quark-zju has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@quark-zju merged this pull request in 5656ac6.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants