Add GUID unmigration to revert AD modifications in v2.7.5
Squashed commit of the following:

commit 5b32df6
Author: Nicholas Flynt <nicholas.flynt@suse.com>
Date:   Thu Aug 17 11:59:35 2023 -0400

    Turns out the token.userPrincipal.UID is not normally set

commit 064526f
Author: Nicholas Flynt <nicholas.flynt@suse.com>
Date:   Thu Aug 17 11:12:17 2023 -0400

    Pull token fields from the ldap attributes instead of the old user

commit e33bba9
Author: Nicholas Flynt <nicholas.flynt@suse.com>
Date:   Thu Aug 17 10:11:57 2023 -0400

    Outdent returns to make drone happy

commit 6c084df
Author: Nicholas Flynt <nicholas.flynt@suse.com>
Date:   Thu Aug 17 09:01:45 2023 -0400

    Squashed commit of the following:

    commit 3db22eb
    Merge: 8039207 552fb84
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Thu Aug 17 08:57:01 2023 -0400

        Merge branch 'uuid-unmigration' of github.com:nflynt/rancher into uuid-unmigration

    commit 8039207
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Thu Aug 17 08:56:53 2023 -0400

        tiny, tiny fix to logging

    commit 552fb84
    Merge: ea68517 99a1814
    Author: nflynt <nicholas.flynt@suse.com>
    Date:   Thu Aug 17 07:39:00 2023 -0400

        Merge pull request rancher#30 from crobby/migrationreview31

        Outdent else blocks to make lint happy

    commit 99a1814
    Author: Chad Roberts <chad.roberts@suse.com>
    Date:   Thu Aug 17 05:00:47 2023 -0400

        Outdent else blocks to make lint happy

    commit ea68517
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 20:28:14 2023 -0400

        Apply exponential retry logic to GRB and Token migrations

        Also, like *RTBs, these are considered non-fatal if a permanent
        error of some sort occurs. We continue to migrate the user anyway.

    commit 4a2ae0b
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 19:24:42 2023 -0400

        For CRTB/PRTBs, rework error handling to gracefully retry

        In particular, this treats internal errors (usually related to
        webhook timeouts) as transient, and retries them with a little bit
        of exponential backoff.

        Furthermore, after reviewing some scenarios with Michael, we've
        decided to consider non-internal errors from the webhook as
        non-fatal in terms of continuing to process the individual user.
        There are a few situations where old bindings to disabled templates
        would otherwise block users from migrating, and this permits those
        to have a better chance of overall success.

    commit 35d647c
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 16:58:50 2023 -0400

        When merging user tokens, copy over all relevant principal fields

        These aren't used for anything that I'm aware of, so this is really
        more just for consistency, since we want the two to be fully paired.

    commit f3e8094
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 16:52:15 2023 -0400

        Cleanup error handling, consider AD retrieval to be a harder error

    commit 90f2ec1
    Merge: ffcec58 b56138b
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 16:13:28 2023 -0400

        Merge branch 'uuid-unmigration' of github.com:nflynt/rancher into uuid-unmigration

    commit ffcec58
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 16:13:10 2023 -0400

        ... once. Add the DN-based principal once.

    commit b56138b
    Merge: 78a66e0 bfb7176
    Author: nflynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 15:47:45 2023 -0400

        Merge pull request rancher#29 from crobby/migrationreview25

        Store skipped/missing user count in configmap and do not store the actual list on the authconfig object

    commit 78a66e0
    Merge: edf3535 df507b5
    Author: nflynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 15:47:24 2023 -0400

        Merge pull request rancher#28 from crobby/migrationreview24

        Remove unnecessary json marshal/unmarshal

    commit edf3535
    Merge: b93e6d0 12020af
    Author: nflynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 15:47:10 2023 -0400

        Merge pull request rancher#27 from crobby/migrationreview23

        Give the job pod a chance to come up before tailing the log

    commit b93e6d0
    Merge: a2c2acb 58a0a1d
    Author: nflynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 15:46:52 2023 -0400

        Merge pull request rancher#26 from crobby/migrationreview22

        Now using AuthConfig annotation as source of truth to block login during migration

    commit a2c2acb
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 15:46:06 2023 -0400

        Rework allowed user migration to handle duplicates and missing users

    commit bfb7176
    Author: Chad Roberts <chad.roberts@suse.com>
    Date:   Wed Aug 16 14:38:22 2023 -0400

        Store skipped/missing user count in configmap and do not store the actual list on the authconfig object

    commit df507b5
    Author: Chad Roberts <chad.roberts@suse.com>
    Date:   Wed Aug 16 13:38:39 2023 -0400

        Remove unnecessary json marshal/unmarshal

    commit 12020af
    Author: Chad Roberts <chad.roberts@suse.com>
    Date:   Wed Aug 16 13:01:18 2023 -0400

        Give the job pod a chance to come up before tailing the log

    commit 58a0a1d
    Author: Chad Roberts <chad.roberts@suse.com>
    Date:   Wed Aug 16 12:50:57 2023 -0400

        Now using AuthConfig annotation as source of truth to block login during migration

    commit 3ef3fb0
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 12:27:23 2023 -0400

        Wait to do the AuthConfig principals until after updating users

        This kicks off some rancher-side tasks based on the updated list,
        and we'd really like to make sure that those user changes have
        been made in advance just for sanity purposes.

    commit b29bfb8
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 12:25:30 2023 -0400

        When collecting duplicates, we need to track the workunit index

    commit df0307e
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 09:23:47 2023 -0400

        Have the dry run guard writing new principal IDs

        This is mostly just to make the code clearer and more obvious.
        The safety is redundant, as the dry run also blocks making changes
        to the user object later.

    commit 59bafdf
    Merge: 2dd5250 2473062
    Author: nflynt <nicholas.flynt@suse.com>
    Date:   Wed Aug 16 09:12:08 2023 -0400

        Merge pull request rancher#25 from crobby/migrationreview21

        Append copy of user rather than pointer to duplicate list

    commit 2473062
    Author: Chad Roberts <chad.roberts@suse.com>
    Date:   Wed Aug 16 08:00:41 2023 -0400

        append copy of user rather than pointer to duplicate list

    commit 2dd5250
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Tue Aug 15 16:48:34 2023 -0400

        Explicitly check to see if AD is disabled, and exit success in this case

    commit 4a3aa80
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Tue Aug 15 16:00:25 2023 -0400

        Actually *use* the final migration status

    commit 255ef68
    Author: Nicholas Flynt <nicholas.flynt@suse.com>
    Date:   Tue Aug 15 15:36:19 2023 -0400

        Add uuid-unmigration script, prevent AD logins during execution

        Squashed commit of the following:

        commit c2bb101
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 15 15:13:12 2023 -0400

            Add a generic failure status, defer restoring logins on failure states

        commit f9c0398
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 15 13:21:29 2023 -0400

            Permit retries (with backoff) when opening the LDAP connection

            Previously we were considering a failure during open (initial or
            otherwise) to be a hard, script-ending, permanent failure. That's
            frankly a bit silly, since networks can be temperamental, so this
            fixes that somewhat.

            Notably, I can't seem to find any way to check the status of the
            connection on the lConn object, so we're tracking that manually
            using a tiny little state object. If there's a cleaner way to
            inspect this state I am all ears, but I don't think it's a majorly
            big deal.

            (Elsewhere in Rancher we don't try to share the ldap connection
            generally, but here it is a big performance boost, so it is worth
            the extra trouble.)

        commit b293d62
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 15 12:54:43 2023 -0400

            Rework token logic to mirror *RTBs

            This both collects and processes tokens that the old logic would
            have missed, and is also considerably more efficient, now needing
            to scan the list of workunits and the list of tokens just once.

        commit fcd2b34
        Merge: 005f102 3bdea12
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 15 12:12:36 2023 -0400

            Merge pull request rancher#24 from crobby/migrationreview17

            Fixing names to make ci happy

        commit 3bdea12
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Tue Aug 15 12:09:22 2023 -0400

            Fixing names to make ci happy

        commit 005f102
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 15 12:01:31 2023 -0400

            Missing users are Infof, not Errorf

        commit 540e494
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 15 11:10:27 2023 -0400

            Don't create/update the configmap object in dry run mode

            What part of "dry run" did we forget, hrm?

        commit 9ced565
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 15 11:00:51 2023 -0400

            If the config map is not found, it's fine. (Panic otherwise.)

        commit 80ea848
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 15 10:53:30 2023 -0400

            Add logic to migrate list of allowed users

        commit c12dcef
        Merge: 33f494a ce1feb4
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 15 09:25:53 2023 -0400

            Merge pull request rancher#23 from crobby/migrationreview14

            Another round of updates

        commit 33f494a
        Merge: b897e47 e944b57
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 15 09:13:15 2023 -0400

            Merge branch 'uuid-unmigration' of github.com:nflynt/rancher into uuid-unmigration

        commit b897e47
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 15 09:12:51 2023 -0400

            Rework CRTB,PRTB collection, add GRB migration logic

        commit ce1feb4
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Tue Aug 15 07:15:24 2023 -0400

            Echoing the set options at the end of the banner

        commit 089412c
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Tue Aug 15 06:44:43 2023 -0400

            Adding additional information to README

        commit a7c9484
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Tue Aug 15 06:38:19 2023 -0400

            Include agent image location in banner

        commit 8854263
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Mon Aug 14 16:31:44 2023 -0400

            Mirror script status to authconfig

        commit 5bc29d5
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Mon Aug 14 12:50:13 2023 -0400

            Update script status codes

        commit e944b57
        Merge: 14c5f72 80e928b
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 14 11:36:58 2023 -0400

            Merge pull request rancher#22 from crobby/migrationreview13

            More updates

        commit 14c5f72
        Merge: a3e85de 516bdeb
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 14 11:36:03 2023 -0400

            Merge branch 'uuid-unmigration' of github.com:nflynt/rancher into uuid-unmigration

        commit a3e85de
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 14 11:35:46 2023 -0400

            Break out migration logic into a bunch of smaller files

        commit 80e928b
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Mon Aug 14 10:51:39 2023 -0400

            Use configmap cache instead of client

        commit 516bdeb
        Merge: a899779 f8369c8
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 14 10:13:56 2023 -0400

            Merge pull request rancher#21 from crobby/migrationreview12

            Display banner before doing version check

        commit f8369c8
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Mon Aug 14 10:12:31 2023 -0400

            Display banner before doing version check

        commit a899779
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 14 10:08:24 2023 -0400

            Update cleanup/ad-guid-README.md

            Co-authored-by: Michael Bolot <michael.bolot@suse.com>

        commit 4d09212
        Merge: c110ae9 92483fa
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 14 09:58:56 2023 -0400

            Merge pull request rancher#19 from crobby/migrationreview9

            Removing unused error type check

        commit 92483fa
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Mon Aug 14 09:51:18 2023 -0400

            Removing unused error type check

        commit c110ae9
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 10 19:51:16 2023 -0400

            goimports the things

        commit 7691146
        Merge: 44d2375 6453484
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 10 19:19:39 2023 -0400

            Merge branch 'uuid-unmigration' of github.com:nflynt/rancher into uuid-unmigration

        commit 6453484
        Merge: baf84bf 50286a2
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 10 19:19:32 2023 -0400

            Merge pull request rancher#18 from crobby/migrationreview7

            Fixing error checking

        commit 44d2375
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 10 19:13:58 2023 -0400

            Use wait's exponential backoff primitive instead of manual sleeps

        commit 50286a2
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Thu Aug 10 16:27:48 2023 -0400

            Fixing error checking

        commit baf84bf
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 10 15:39:13 2023 -0400

            Only yell if the user is doing a non-dry-run on v2.7.5

        commit eed1416
        Merge: 9a71e38 ad00983
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 10 15:36:53 2023 -0400

            Merge branch 'uuid-unmigration' of github.com:nflynt/rancher into uuid-unmigration

        commit 9a71e38
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 10 15:36:08 2023 -0400

            Cleanup timeout messaging, lower job start timeout to 5 minutes

            I misunderstood the bash logic when I first extended that to one
            hour. 5 minutes for an agent download is somewhat more sensible.

        commit ad00983
        Merge: 4e18baa 344a05d
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 10 15:34:29 2023 -0400

            Merge pull request rancher#17 from crobby/migrationreview6

            Additional changes after review

        commit 344a05d
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Thu Aug 10 14:16:55 2023 -0400

            Adding version check for v2.7.5 before doing anything

        commit 682444d
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Thu Aug 10 13:50:05 2023 -0400

            Fix-up README for updated usage

        commit 4e18baa
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 10 14:54:15 2023 -0400

            Spawn relevant resources in the cattle-system namespace

        commit f96eb3a
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 10 14:12:33 2023 -0400

            Move the YAML configuration file into the bash script

            This dodges the whole "fetch it from a weird URL" thing, and also
            makes the script a self-contained single file, which is much nicer
            for support to deal with.

        commit 275f42b
        Merge: 4c98764 b99cab4
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 10 11:16:41 2023 -0400

            Merge pull request rancher#16 from crobby/migrationreview5

            More post review updates

        commit b99cab4
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Thu Aug 10 09:53:57 2023 -0400

            Fixing up handling of command line options and args

        commit 4f6da40
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Thu Aug 10 07:49:20 2023 -0400

            Fixing up LdapFoundDuplicateGUID name

        commit 9f577f6
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Thu Aug 10 07:31:20 2023 -0400

            Adding percentage done indicator to status config map

        commit 43f19e4
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Thu Aug 10 07:06:02 2023 -0400

            Adding lists of special status users to configmap

        commit fa9979e
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Thu Aug 10 06:33:46 2023 -0400

            Adding rancher-cleanup label to all cleanup objects

        commit 4c98764
        Merge: 2d59ac6 c301303
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 17:38:29 2023 -0400

            Merge pull request rancher#15 from crobby/migrationreview4

            Post review updates

        commit c301303
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Wed Aug 9 17:33:39 2023 -0400

            Updated isGUID function

        commit 2d59ac6
        Merge: c0cdc07 86330c6
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 17:14:48 2023 -0400

            Merge pull request rancher#14 from crobby/migrationreview3

            Migration review updates 3

        commit c0cdc07
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 17:12:22 2023 -0400

            Log if we need to skip a CRTB/PRTB due to the user not existing

            This feels like the safer option versus applying permissions that
            none of the users we've collected actually have, even with the
            GUID/DN matching. This situation should be relatively uncommon,
            as Rancher usually cleans these up when users are deleted, but
            with the GUID duplicate bug I'm not sure how successful that will
            have been in practice. Best to be safe (and noisy)

        commit 86330c6
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Wed Aug 9 17:09:05 2023 -0400

            Updating SA permissions for nonResourceURLs

        commit 4ae2d58
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Wed Aug 9 12:12:19 2023 -0400

            Seeding README, adding script banner

        commit f8c941b
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Wed Aug 9 11:20:10 2023 -0400

            Token collection checking userID and now setting userID and label for token updates

        commit e742102
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Wed Aug 9 11:03:04 2023 -0400

            Adding additional dry-run logging information

        commit dc46114
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 16:57:02 2023 -0400

            Rework CRTB/PRTB collection to check usernames, run through list once

            There are still nested for loops in here, but they are a bit more
            hidden :P

        commit ad32ccd
        Merge: ccb0b84 cb98c12
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 12:52:25 2023 -0400

            Merge branch 'uuid-unmigration' of github.com:nflynt/rancher into uuid-unmigration

        commit ccb0b84
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 12:50:27 2023 -0400

            Break out the user modification flow into separate functions

            This mostly cleans up the main loop, but it also separates concerns
            and makes the smaller bits of logic easier to find and follow.

        commit aa41893
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 12:19:08 2023 -0400

            Move user principal printing into its respective utility function

        commit ef909ab
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 12:12:05 2023 -0400

            Respect the adConfig's UserObjectClass when performing a GUID lookup

            This is for parity with the auth provider; most AD configurations
            shouldn't have changed this from the default.

        commit 3963205
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 11:44:10 2023 -0400

            Consider multiple users with the same GUID as a hard error

            This shouldn't be possible in practice, so it almost certainly
            indicates either a configuration error, or something wrong on the
            AD side of things. Either way we will refuse to process any user
            that trips this logic, and complain about it quite loudly.

        commit 0cebb89
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 11:27:24 2023 -0400

            We don't need the scope, so simplify -> getExternalId

        commit da7ef22
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 11:11:41 2023 -0400

            Start the scaledContext. Don't give it managers it doesn't need

        commit a60b144
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 10:34:25 2023 -0400

            Remove the ratelimiting exception. Prefer safety over speed

            We need to check the performance ramifications of this during
            testing, but considering that we will almost certainly be iterating
            over hundreds of users, we should probably let k8s itself rate
            limit us so we don't overwhelm whatever is running the control
            plane. That might otherwise be a nasty situation, especially for
            stuff like AKS and GKE.

        commit 16715df
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 10:32:57 2023 -0400

            For bonus safety, redundantly check for dryRun here

            The logic up top should make this check unnecessary, but we want
            to be extra certain that in dryRun mode no changes are made, so
            we'll explicitly guard on it every time. This protects the code
            less from itself, and more from future modifications.
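
A redundant dry-run guard of the kind described above might look like the following sketch. The `migrator` and `user` types, the `addPrincipal` method, and the principal ID string are all hypothetical and chosen only for illustration; they are not taken from the Rancher code.

```go
package main

import "fmt"

// user is a hypothetical, simplified stand-in for a user object.
type user struct {
	Name         string
	PrincipalIDs []string
}

// migrator carries the dry-run flag and counts the writes it performs.
type migrator struct {
	dryRun bool
	writes int
}

// addPrincipal checks dryRun itself, even if callers are expected to have
// checked it already, so that a future caller cannot accidentally write.
func (m *migrator) addPrincipal(u *user, principal string) {
	if m.dryRun {
		fmt.Printf("DRY RUN: would add %q to user %s\n", principal, u.Name)
		return
	}
	u.PrincipalIDs = append(u.PrincipalIDs, principal)
	m.writes++
}

func main() {
	u := &user{Name: "alice"}

	dry := &migrator{dryRun: true}
	dry.addPrincipal(u, "example_user://CN=alice")
	fmt.Println("principals after dry run:", len(u.PrincipalIDs))

	live := &migrator{}
	live.addPrincipal(u, "example_user://CN=alice")
	fmt.Println("principals after live run:", len(u.PrincipalIDs))
}
```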

        commit cb98c12
        Merge: e17d56f 4d2f735
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 9 10:20:06 2023 -0400

            Merge pull request rancher#13 from crobby/migrationreview2

            More updates based on review comments

        commit 4d2f735
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Tue Aug 8 10:17:38 2023 -0400

            More updates based on review comments

        commit e17d56f
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 7 16:38:59 2023 -0400

            EscapeUUID -> escapeUUID

        commit 139ce3c
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 7 16:37:34 2023 -0400

            Relocate environment variable use to the agent-specific code path

        commit 795c94b
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 7 16:33:13 2023 -0400

            Remove unnecessary namespace from cluster role definitions

        commit 01ea868
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 7 16:30:53 2023 -0400

            One minute is *awfully optimistic.* Let's be more realistic

        commit b9d4487
        Merge: 17250da 0efbb02
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 7 16:21:42 2023 -0400

            Merge pull request rancher#12 from crobby/migrationreview

            Update based on review comments

        commit 0efbb02
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Mon Aug 7 15:55:46 2023 -0400

            Update based on review comments

        commit 17250da
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 7 10:29:05 2023 -0400

            Don't hide the migration script from windows agents

            ... which in hindsight are probably somewhat likely to be using
            the Active Directory auth provider.

        commit cadf021
        Merge: 9b8fd58 3926f7b
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Mon Aug 7 08:18:10 2023 -0400

            Merge pull request rancher#11 from crobby/migrateimports

            Fixing imports

        commit 3926f7b
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Sat Aug 5 07:45:25 2023 -0400

            Fixing imports

        commit 9b8fd58
        Merge: de38ffe 26dd505
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Fri Aug 4 17:10:43 2023 -0400

            Merge pull request rancher#10 from crobby/dntokens

            Fix tokens going to local principal

        commit 26dd505
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Fri Aug 4 17:08:20 2023 -0400

            Fix tokens going to local principal

        commit de38ffe
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Fri Aug 4 15:36:12 2023 -0400

            Cleanup debug/info logs somewhat

        commit 1581b5d
        Merge: 5dfcda0 29c87eb
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Fri Aug 4 14:56:22 2023 -0400

            Merge pull request rancher#9 from crobby/linter2

            More cleaning up lint

        commit 29c87eb
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Fri Aug 4 14:54:40 2023 -0400

            More cleaning up lint

        commit 5dfcda0
        Merge: a119663 d37ef2f
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Fri Aug 4 14:49:55 2023 -0400

            Merge pull request rancher#8 from crobby/linter

            Cleaning up lint

        commit d37ef2f
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Fri Aug 4 14:47:44 2023 -0400

            Cleaning up lint

        commit a119663
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Fri Aug 4 14:38:46 2023 -0400

            Add an option to automatically delete missing-guid users

            This is only available when running the standalone script. At Rancher
            startup this option is set to false, so missing users will be logged
            instead and require manual intervention.

        commit 60f31f8
        Merge: 7e620d5 9d82578
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Fri Aug 4 13:22:56 2023 -0400

            Merge pull request rancher#7 from crobby/0805-migration

            Update migration start logic so an automated run will only happen if another run has not completed

        commit 9d82578
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Fri Aug 4 12:12:56 2023 -0400

            Update migration start logic so an automated run will only happen if another run has not completed

        commit 7e620d5
        Merge: 30c9f64 6c352a5
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Fri Aug 4 11:26:52 2023 -0400

            Merge pull request rancher#4 from crobby/migrateatstart

            Add guid migration to rancher startup

        commit 30c9f64
        Merge: b9aa392 72895b4
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Fri Aug 4 11:10:58 2023 -0400

            Merge pull request rancher#5 from crobby/0803-migration

            Make sure annotations/labels are not nil

        commit 72895b4
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Thu Aug 3 16:58:56 2023 -0400

            Make sure annotations/labels are not nil

        commit b9aa392
        Merge: 79762cb 7546cdf
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Fri Aug 4 10:43:30 2023 -0400

            Merge pull request rancher#6 from crobby/0804-migration

            Fix crtb, prtb collection and add token collection/migration

        commit 7546cdf
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Fri Aug 4 08:59:54 2023 -0400

            Fix crtb, prtb collection and add token collection/migration

        commit 79762cb
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 3 18:00:53 2023 -0400

            Collect CRTBs and PRTBs in a single pass

        commit b6b6085
        Merge: 3de5aa3 b3acab9
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Thu Aug 3 11:44:13 2023 -0400

            Merge pull request rancher#3 from crobby/0802-2migration

            Adding annotation/labels for migrated objects also blocking login while migration is active

        commit b3acab9
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Thu Aug 3 11:37:16 2023 -0400

            Update role for SA

        commit 673e765
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Thu Aug 3 09:33:45 2023 -0400

            Blocking login while migration is running

        commit 6c352a5
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Wed Aug 2 13:42:33 2023 -0400

            Add guid migration to rancher startup

        commit 840c5a7
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Wed Aug 2 12:20:41 2023 -0400

            Adding annotation/labels for migrated objects

        commit 3de5aa3
        Merge: 5dc7bd7 04ea1ce
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Wed Aug 2 09:57:48 2023 -0400

            Merge pull request rancher#2 from crobby/0802migration

            Fix status function and use user copies in workUnit slices

        commit 04ea1ce
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Tue Aug 1 18:02:19 2023 -0400

            Fixing status function and using copies of users in workUnit slices

        commit 5dc7bd7
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 1 16:29:15 2023 -0400

            Skip over configmap updates for now, just to get the script running

        commit ac3afe6
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 1 16:19:52 2023 -0400

            Massively overhaul main loop, check for and handle duplicate users

            This is largely untested because I'm having some trouble with the
            configmaps code, but I wanted to get this committed before I start
            troubleshooting

        commit 5295f8f
        Merge: 29f9332 552e73f
        Author: nflynt <nicholas.flynt@suse.com>
        Date:   Tue Aug 1 08:58:41 2023 -0400

            Merge pull request rancher#1 from crobby/tokenunmigrate

            Additional unmigration functionality

        commit 552e73f
        Author: Chad Roberts <chad.roberts@suse.com>
        Date:   Mon Jul 31 13:22:26 2023 -0400

            Additional unmigration functionality

        commit 29f9332
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Mon Jul 31 17:30:10 2023 -0400

            Actually perform the GUID -> DN migration on the happy path

            And it works too! Thank goodness. Now we mostly need to clean up the
            logic and handle a few dozen edge cases.

        commit 62a6747
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Mon Jul 31 12:53:43 2023 -0400

            Cleanup the logs a bit, flatten the central logic with early exits

        commit ac20a2c
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Mon Jul 31 09:58:54 2023 -0400

            Switch to using the scaledContext for everything

            Since it can do all the lookups we need, it seems silly to setup
            and use two different interfaces to the same underlying datastore.
            The UnstructuredClient is the only way we can read AD configuration
            right now, and we need that info, so let's stick to that method.

        commit 18b39d3
        Author: Nicholas Flynt <nicholas.flynt@suse.com>
        Date:   Fri Jul 28 17:38:27 2023 -0400

            First pass at migration scaffolding, enough to do GUID -> DN lookups

            There is still much work to do, but at the very least we can read
            the relevant auth configuration details from k8s and use those
            details to make LDAP queries, and that's nearly all of what we need
            to perform the migration.
nflynt authored and crobby committed Aug 25, 2023
1 parent e2c55c6 commit e5af62a
Showing 10 changed files with 1,955 additions and 13 deletions.
65 changes: 65 additions & 0 deletions cleanup/ad-guid-README.md
@@ -0,0 +1,65 @@
# Active Directory GUID -> DN reverse migration utility

**It is recommended to take a snapshot of Rancher before performing this in the event that a restore is required.**


## Critical Notes
* This script will delete and recreate CRTBs/PRTBs/GRBs, which may cause issues with tools (like terraform) which maintain external state. The original object names are stored in an annotation on the new objects.
* It is recommended to use this script on Rancher v2.7.6; running it on v2.7.5 may produce performance issues.
* This script requires that the Active Directory service account has permissions to read all users known to Rancher.


## Purpose

This utility reverses the effects of migrating Active Directory principalIDs to be based on GUID rather than DN.
It can be run manually via Rancher Agent, or it will run automatically inside Rancher at startup
if no previous run is detected.
This utility will:
* Remove any users that were duplicated during the original migration toward GUID-based principalIDs in Rancher 2.7.5
* Update objects that referenced a GUID-based principalID to reference the correct distinguished name based principalID


## Detailed description

This utility will go through all Rancher users and perform an Active Directory lookup using the configured service account to
get the user's distinguished name. Next, it will perform lookups inside Rancher for all the user's Tokens,
ClusterRoleTemplateBindings, ProjectRoleTemplateBindings, and GlobalRoleBindings. If any of those objects, including the user object
itself, reference a principalID based on the GUID of that user, they will be updated to reference
the distinguished name-based principalID. If the utility is run with --dry-run, no changes are made and the only results
are log messages indicating the changes that a run without that flag would make.

This utility will also detect and correct the case where a single ActiveDirectory GUID is mapped to multiple Rancher
users. That condition was likely caused by a race in the original migration to use GUIDs and resulted in a second
Rancher user being created. This caused Rancher logins to fail for the duplicated user. The utility remedies
that situation by mapping any tokens and bindings to the original user before removing the newer user, which was
created in error.
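
The duplicate check described above can be illustrated with a small standalone sketch. It assumes a hypothetical two-column listing of Rancher user names and Active Directory GUIDs; the utility itself performs this check in Go against the Rancher API, and the function name and sample data below are made up for illustration.

```bash
#!/bin/bash
# Hypothetical input: one "<rancher-user> <ad-guid>" pair per line.
# Prints each GUID that is claimed by more than one Rancher user,
# followed by the users that share it.
find_duplicate_guids() {
    awk '
        { users[$2] = (users[$2] == "" ? $1 : users[$2] " " $1); count[$2]++ }
        END { for (guid in count) if (count[guid] > 1) print guid ": " users[guid] }
    ' | sort
}

# Made-up sample data for illustration only.
sample_users="u-aaaaa guid-1
u-bbbbb guid-2
u-ccccc guid-1"

find_duplicate_guids <<< "$sample_users"
# prints: guid-1: u-aaaaa u-ccccc
```

In this sample, `u-ccccc` would be the newer user whose tokens and bindings are mapped back to `u-aaaaa` before it is removed.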


## Requirements

A Rancher environment that has Active Directory set up as the authentication provider. For any environment where
Active Directory is not the authentication provider, this utility will take no action and will exit immediately.


## Usage via Rancher Agent

```bash
./ad-guid-unmigration.sh <AGENT IMAGE> [--dry-run] [--delete-missing]
```
* The Agent image can be found at: docker.io/rancher/rancher-agent:v2.7.6
* The --dry-run flag will run the migration utility, but no changes to Rancher data will take place. The potential changes will be indicated in the log file.
* The --delete-missing flag will delete Rancher users that cannot be found by looking them up in Active Directory. If --dry-run is set, that will prevent users from being deleted regardless of this flag.


## Additional notes
* The utility will create a configmap named `ad-guid-migration` in the `cattle-system` namespace. This configmap contains
a data entry with a key named "ad-guid-migration-status". If the utility is currently active, that status will be
set to "Running". After the utility has completed, the status will be set to "Finished". If a run is interrupted
prior to completion, that configmap will retain the status of "Running" and subsequent attempts to run the script will
immediately exit. In order to allow it to run again, you can either edit the configmap to remove that key or you can
delete the configmap entirely.

* When migrating ClusterRoleTemplateBindings, ProjectRoleTemplateBindings, and GlobalRoleBindings it is necessary to perform the action
as a delete/create rather than an update. **This may cause issues if you use tooling that relies on the names of the objects**.
When a ClusterRoleTemplateBinding or a ProjectRoleTemplateBinding is migrated to a new name, the newly created object
will contain a label, "ad-guid-previous-name", that will have a value of the name of the object that was deleted.
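
The status check and reset described in the first note can be sketched with two small helpers. The configmap name, namespace, and key come from the note above; the function names `migration_status` and `reset_migration_status` are hypothetical and are not part of the utility.

```bash
#!/bin/bash
# Read the status the utility recorded in its configmap.
# Prints "Running" or "Finished", or nothing if the key is absent.
# kubectl reports an error if the configmap itself does not exist.
migration_status() {
    kubectl --namespace=cattle-system get configmap ad-guid-migration \
        -o jsonpath='{.data.ad-guid-migration-status}'
}

# Delete the configmap so that a later run is not blocked by a stale "Running" status.
reset_migration_status() {
    kubectl --namespace=cattle-system delete configmap ad-guid-migration
}
```

After an interrupted run, calling `migration_status` should print `Running`; calling `reset_migration_status` then clears the way for another attempt. Only reset the status once you are sure no run is still in progress.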
265 changes: 265 additions & 0 deletions cleanup/ad-guid-unmigration.sh
@@ -0,0 +1,265 @@
#!/bin/bash
# set -x
set -e

# Text to display in the banner
banner_text="This utility will go through all Rancher users and perform an Active Directory lookup using
the configured service account to get the user's distinguished name. Next, it will perform lookups inside Rancher
for all the user's Tokens, ClusterRoleTemplateBindings, and ProjectRoleTemplateBindings. If any of those objects,
including the user object itself, reference a principalID based on the GUID of that user, they will be
updated to reference the distinguished name-based principalID. If the utility is run with --dry-run, no changes are
made and the only results are log messages indicating the changes that a run without that flag would make.
This utility will also detect and correct the case where a single ActiveDirectory GUID is mapped to multiple Rancher
users. That condition was likely caused by a race in the original migration to use GUIDs and resulted in a second
Rancher user being created. This caused Rancher logins to fail for the duplicated user. The utility remedies
that situation by mapping any tokens and bindings to the original user before removing the newer user, which was
created in error.
It is also important to note that migration of ClusterRoleTemplateBindings and ProjectRoleTemplateBindings requires
a delete/create operation rather than an update. This will result in new object names for the migrated bindings.
A label with the former object name will be included in the migrated bindings.
The Rancher Agent image to be used with this utility can be found at rancher/rancher-agent:v2.7.6
It is recommended that you perform a Rancher backup prior to running this utility."

CLEAR='\033[0m'
RED='\033[0;31m'

# cluster resources, including the service account used to run the script
cluster_resources_yaml=$(cat << 'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cattle-cleanup-sa
  namespace: cattle-system
  labels:
    rancher-cleanup: "true"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cattle-cleanup-binding
  labels:
    rancher-cleanup: "true"
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cattle-cleanup-role
subjects:
  - kind: ServiceAccount
    name: cattle-cleanup-sa
    namespace: cattle-system
---
apiVersion: batch/v1
kind: Job
metadata:
  name: cattle-cleanup-job
  namespace: cattle-system
  labels:
    rancher-cleanup: "true"
spec:
  backoffLimit: 6
  completions: 1
  parallelism: 1
  selector:
  template:
    metadata:
      creationTimestamp: null
    spec:
      containers:
        - env:
            - name: AD_GUID_CLEANUP
              value: "true"
#dryrun             - name: DRY_RUN
#dryrun               value: "true"
#deletemissing             - name: AD_DELETE_MISSING_GUID_USERS
#deletemissing               value: "true"
          image: agent_image
          imagePullPolicy: Always
          command: ["agent"]
          name: cleanup-agent
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: OnFailure
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccountName: cattle-cleanup-sa
      terminationGracePeriodSeconds: 30
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cattle-cleanup-role
  labels:
    rancher-cleanup: "true"
rules:
  - apiGroups:
      - '*'
    resources:
      - '*'
    verbs:
      - '*'
  - nonResourceURLs:
      - '*'
    verbs:
      - '*'
EOF
)

# Agent image to use in the yaml file
agent_image="$1"

show_usage() {
    if [ -n "$1" ]; then
        echo -e "${RED}👉 $1${CLEAR}\n";
    fi
    echo "Usage: $0 AGENT_IMAGE [OPTIONS]"
    echo ""
    echo "Options:"
    echo -e "\t-h, --help Display this help message"
    echo -e "\t-n, --dry-run Display the resources that would be updated without making changes"
    echo -e "\t-d, --delete-missing Permanently remove user objects whose GUID cannot be found in Active Directory"
}

display_banner() {
    local text="$1"
    local border_char="="
    local text_width=$(($(tput cols)))
    local border=$(printf "%${text_width}s" | tr " " "$border_char")

    echo "$border"
    printf "%-${text_width}s \n" "$text"
    echo "$border"
    echo "Dry run: $dry_run"
    echo "Delete missing: $delete_missing"
    echo "Agent image: $agent_image"
    if [[ "$dry_run" = true ]] && [[ "$delete_missing" = true ]]
    then
        echo "Setting the dry-run option to true overrides the delete-missing option. NO CHANGES WILL BE MADE."
    fi
    echo "$border"
}

# Test getopt inside the "if": under "set -e" a separate "$?" check would never be reached
if ! OPTS=$(getopt -o hnd -l help,dry-run,delete-missing -- "$@"); then
    show_usage "Invalid option"
    exit 1
fi

eval set -- "$OPTS"

dry_run=false
delete_missing=false

while true; do
    case "$1" in
        -h | --help)
            show_usage
            exit 0
            ;;
        -n | --dry-run)
            dry_run=true
            shift
            ;;
        -d | --delete-missing)
            delete_missing=true
            shift
            ;;
        --)
            shift
            break
            ;;
        *)
            show_usage "Invalid option"
            exit 1
            ;;
    esac
done

# Ensure AGENT_IMAGE is provided
if [ $# -lt 1 ]; then
    show_usage "AGENT_IMAGE is a required argument"
    exit 1
fi
# getopt has moved all options in front of "--", so the first remaining argument is the image
agent_image="$1"

display_banner "${banner_text}"

if [ "$dry_run" != true ]
then
    # Check the Rancher version before doing anything.
    # If it is v2.7.5, make it clear that running this utility against that version is not recommended.
    rancher_version=$(kubectl get settings server-version --template='{{.value}}')
    if [ "$rancher_version" = "v2.7.5" ]; then
        echo -e "${RED}IT IS NOT RECOMMENDED TO RUN THIS UTILITY AGAINST RANCHER VERSION v2.7.5${CLEAR}"
        echo -e "${RED}IF RANCHER v2.7.5 RESTARTS AFTER RUNNING THIS UTILITY, IT WILL UNDO THE EFFECTS OF THIS UTILITY.${CLEAR}"
        echo -e "${RED}IF YOU DO WANT TO RUN THIS UTILITY, IT IS RECOMMENDED THAT YOU MAKE A BACKUP PRIOR TO CONTINUING.${CLEAR}"
        read -p "Do you want to continue? (y/n): " choice
        if [[ ! $choice =~ ^[Yy]$ ]]; then
            echo "Exiting..."
            exit 0
        fi
    fi
fi


read -p "Do you want to continue? (y/n): " choice
if [[ ! $choice =~ ^[Yy]$ ]]; then
    echo "Exiting..."
    exit 0
fi

# apply the provided rancher agent
# Quote the here-string so the YAML keeps its line breaks on every bash version
yaml=$(sed -e 's=agent_image='"$agent_image"'=' <<< "$cluster_resources_yaml")

if [ "$dry_run" = true ]
then
    # Uncomment the env var for dry-run mode
    yaml=$(sed -e 's/#dryrun // ' <<< "$yaml")
elif [ "$delete_missing" = true ]
then
    # Instead uncomment the env var for missing user cleanup
    yaml=$(sed -e 's/#deletemissing // ' <<< "$yaml")
fi

echo "$yaml" | kubectl apply -f -

# Get the pod ID to tail the logs
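retry_interval=1
max_retries=10
retry_count=0
pod_id=""
while [ $retry_count -lt $max_retries ]; do
    # kubectl exits non-zero while no pod exists yet; "|| true" keeps "set -e" from aborting the script
    pod_id=$(kubectl --namespace=cattle-system get pod -l job-name=cattle-cleanup-job -o jsonpath="{.items[0].metadata.name}" 2>/dev/null || true)
    if [ -n "$pod_id" ]; then
        break
    else
        sleep $retry_interval
        # Plain assignment: "((retry_count++))" returns status 1 when the count is 0, which trips "set -e"
        retry_count=$((retry_count + 1))
    fi
done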

# 600 is equal to 5 minutes, because the sleep interval is 0.5 seconds
job_start_timeout=600

declare -i count=0
until kubectl --namespace=cattle-system logs "$pod_id" -f
do
    if [ $count -gt $job_start_timeout ]
    then
        echo "Timeout reached, check the job by running kubectl --namespace=cattle-system get jobs"
        echo "To cleanup manually, you can run:"
        echo " kubectl --namespace=cattle-system delete serviceaccount,job -l rancher-cleanup=true"
        echo " kubectl delete clusterrole,clusterrolebinding -l rancher-cleanup=true"
        exit 1
    fi
    sleep 0.5
    count+=1
done

# Cleanup after it completes successfully
echo "$yaml" | kubectl delete -f -
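
The script enables optional environment variables by stripping a marker prefix from lines of the embedded YAML. A minimal standalone sketch of that technique follows; the `enable_marker` helper and the shortened template are illustrative assumptions, not part of the script.

```bash
#!/bin/bash
# Shortened template with an optional entry hidden behind a "#dryrun " marker.
template='env:
  - name: AD_GUID_CLEANUP
    value: "true"
#dryrun   - name: DRY_RUN
#dryrun     value: "true"'

# Strip the marker and the single space after it, which turns the
# commented-out lines into regular YAML. The whitespace that follows
# the marker becomes the indentation of the enabled line.
enable_marker() {
    local marker="$1"
    sed -e "s/#${marker} //"
}

enabled=$(enable_marker dryrun <<< "$template")
echo "$enabled"
```

Because the text after the marker becomes the line's indentation, the marker lines in the template have to carry the same indentation as their sibling entries once the prefix is removed.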
12 changes: 9 additions & 3 deletions cmd/agent/main.go
@@ -24,16 +24,18 @@ import (
 	"github.com/docker/docker/client"
 	"github.com/hashicorp/go-multierror"
 	"github.com/mattn/go-colorable"
-	"github.com/rancher/remotedialer"
-	"github.com/rancher/wrangler/pkg/signals"
-	"github.com/sirupsen/logrus"
+
 	"github.com/rancher/rancher/pkg/agent/clean"
+	"github.com/rancher/rancher/pkg/agent/clean/adunmigration"
 	"github.com/rancher/rancher/pkg/agent/cluster"
 	"github.com/rancher/rancher/pkg/agent/node"
 	"github.com/rancher/rancher/pkg/agent/rancher"
 	"github.com/rancher/rancher/pkg/features"
 	"github.com/rancher/rancher/pkg/logserver"
 	"github.com/rancher/rancher/pkg/rkenodeconfigclient"
+	"github.com/rancher/remotedialer"
+	"github.com/rancher/wrangler/pkg/signals"
+	"github.com/sirupsen/logrus"
 )
 
 var (
@@ -78,6 +80,10 @@ func main() {
 			bindingErr = multierror.Append(bindingErr, err)
 		}
 		err = bindingErr
+	} else if os.Getenv("AD_GUID_CLEANUP") == "true" {
+		dryrun := os.Getenv("DRY_RUN") == "true"
+		deleteMissingUsers := os.Getenv("AD_DELETE_MISSING_GUID_USERS") == "true"
+		err = adunmigration.UnmigrateAdGUIDUsers(nil, dryrun, deleteMissingUsers)
 	} else {
 		err = run(ctx)
 	}