broker: deduplicate extra/owner extra groups to prevent UNIQUE-constraint failure on offline retry#1496
Conversation
…ed UserInfo Build a set of group names already in `authInfo.UserInfo.Groups` before appending configured `extraGroups` and `ownerExtraGroups`, so we never add the same group twice when the cached UserInfo already contains it. On the online auth path the calling code resets `authInfo.UserInfo.Groups` to a fresh provider response before `finishAuth` runs, so the unconditional append never produces duplicates. On the offline path that reset is skipped (no provider call possible), so `authInfo` is loaded from cache *with the previous extras already present* and the append silently doubles them. The daemon then does a transactional `DELETE; INSERT VALUES (?, ?), (?, ?), ...` against `users_to_local_groups` and the duplicates collide intra-transaction on the `(uid, group_name)` UNIQUE constraint, rejecting the user at GDM and rolling back the local groups for them entirely. Reproducible with `owner_extra_groups = sudo, docker, dialout` in broker.conf: sign in online (works, groups recorded once), `nmcli networking off`, sign in again -> "Authentication failure" + "UNIQUE constraint failed: users_to_local_groups.uid, users_to_local_groups.group_name" in `journalctl -u authd`.
nooreldeenmansour
left a comment
There was a problem hiding this comment.
Thanks a lot @yetanotheralex, nice work!
I've left a few small comments. Please add a test in broker_test.go that covers this fix :)
A few other things:
- Please refer to https://canonical.com/legal/contributors to sign the CLA
- Could you please sign your commit? I recommend using an SSH key for that: https://docs.github.com/en/authentication/managing-commit-signature-verification/telling-git-about-your-signing-key#telling-git-about-your-ssh-key
| // every offline retry would append another copy of the configured local groups, which then | ||
| // explodes the daemon's UNIQUE constraint on users_to_local_groups (uid, group_name) because | ||
| // handleUsersToLocalGroupsUpdate does DELETE-then-INSERT and the duplicates collide intra-tx. | ||
| existing := make(map[string]struct{}, len(authInfo.UserInfo.Groups)) |
There was a problem hiding this comment.
| existing := make(map[string]struct{}, len(authInfo.UserInfo.Groups)) | |
| existingGroups := make(map[string]struct{}, len(authInfo.UserInfo.Groups)) |
| for _, g := range authInfo.UserInfo.Groups { | ||
| existing[g.Name] = struct{}{} | ||
| } | ||
| addExtra := func(name string) { |
There was a problem hiding this comment.
The old paths used to log Adding extra group %q for the normal path, and Adding owner extra group %q for the owner path. To preserve those, I suggest passing a groupLabel string and building the log line inside the function:
| addExtra := func(name string) { | |
| addConfiguredGroup := func(name, groupLabel string) { |
addConfiguredGroup is also more descriptive than addExtra.
Update L950 to log.Debugf(context.Background(), "Adding %s %q", groupLabel, name) and call it as addConfiguredGroup(g, "extra group") / addConfiguredGroup(g, "owner extra group").
| // Build a set of group names already in the user info (e.g. from cached/loaded auth state) | ||
| // so we don't append a configured extra/owner group that's already in the slice. Without this, | ||
| // every offline retry would append another copy of the configured local groups, which then | ||
| // explodes the daemon's UNIQUE constraint on users_to_local_groups (uid, group_name) because | ||
| // handleUsersToLocalGroupsUpdate does DELETE-then-INSERT and the duplicates collide intra-tx. |
There was a problem hiding this comment.
The comment is overly descriptive, I suggest keeping it short, perhaps something like
// Deduplicate to avoid UNIQUE constraint violations on offline loginsyour commit message is already descriptive enough, so that's sufficient imo
Fixes #1495.
What
Broker.finishAuthnow deduplicatesb.cfg.extraGroupsandb.cfg.ownerExtraGroupsagainstauthInfo.UserInfo.Groupsbefore appending them, so configured local groups can no longer be added more than once when the cachedauthInfoalready contains them.Why
On the online auth path the calling code resets
authInfo.UserInfo.Groupsto a fresh provider response beforefinishAuthruns, so the unconditionalappendnever produces duplicates. On the offline path that reset is skipped (no provider call possible), soauthInfois loaded from cache with the previous extras already present and the append silently doubles them.The daemon (
authd) then does a transactionalDELETE; INSERT VALUES (?, ?), (?, ?), …againstusers_to_local_groupsand the duplicate rows collide intra-transaction on the(uid, group_name)UNIQUE constraint. The user is rejected at GDM and (worse) ends up with no local groups recorded at all once the transaction rolls back.See #1495 for the full root-cause writeup.
Test
Manually verified on a single-user laptop with
owner_extra_groups = sudo, docker, dialout:Groupsslice containssudo,docker,dialoutexactly once.id <user>matches.nmcli networking off, sign in again: succeeds. Cached slice unchanged. Daemon'susers_to_local_groupsrows unchanged.Without this patch, step 2 produces the
UNIQUE constraint failed: users_to_local_groups.uid, users_to_local_groups.group_nameerror and step 3 never gets started. Patch has been running on my daily driver for ~24h with no regressions.Out of scope (suggested follow-up)
The daemon-side
handleUsersToLocalGroupsUpdatewould benefit fromINSERT … ON CONFLICT DO NOTHING(or its own dedupe pass) so that a misbehaving broker can't take user provisioning down. Happy to file a separate issue for that if useful.