Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop using PATs for Azure Repos (Attempt 2: Electric Boogaloo) #294

Merged
merged 17 commits into from
Mar 10, 2021

Conversation

mjcheetham
Copy link
Collaborator

@mjcheetham mjcheetham commented Feb 22, 2021

Add a new setting that enables the Azure Repos provider to return the Azure Active Directory/Microsoft Account access tokens to Git directly, rather than exchanging them for an Azure DevOps Personal Access Token.

The default mode is to continue using PATs at the moment. The disable PATs set the following config:

git config --global credential.azreposPATMode false

New sub-commands are added to the git-credential-manager-core azure-repos command to list and override the binding of user accounts to Azure DevOps organisations.

To avoid overcomplicating things, and a combinatorial explosion of user binding options we support the following user bindings, in order of most preferred:

1. Git remote URL

[remote "origin"]
    url = https://user%40example.com@{org}.visualstudio.com/project/_git/repo

This must be set manually by the user as GCM doesn't have enough information about which remote was used for a particular request to be able to update the remote URL with the user.

GCM doesn't directly look this up (except when issuing a azure-repos list --show-remotes command) as Git itself will send us the username in the get request.

We ignore the username value when the URL hostname is dev.azure.com as Azure DevOps decided to use the userinfo part of their clone URLs to include the organisation name (e.g. https://org@dev.azure.com/org/...) - as a result these username values cannot be trusted.

The user must encode any special characters like @ as %40.

2. Organisation binding in local Git configuration

[credential "azrepos:org/{org}"]
    username = user@example.com

If no username value was provided by Git (from the remote URL), or we are unable to use the provided username value (because it's a dev.azure.com hostname), then we consult Git for any username bound to the Azure DevOps organisation name.

3. Organisation binding in global Git configuration

We explicitly check Git's local (above) and then global configurations for an Azure DevOps user-organisation binding because GCM will automatically update only these two config "levels" after a successful credential use (store).

"Sign-in" (after a call to store)

Here is a summary of the state of user-organisation bindings before and after signing in with a user "A":

  Description
A User to sign-in
B Another user
- No user
Global Local ➡️ Global Local
- - ➡️ A -
- A ➡️ A -
- B ➡️ A -
A - ➡️ A -
A A ➡️ A -
A B ➡️ A -
B - ➡️ B A
B A ➡️ B A
B B ➡️ B A

"Sign-out" (after a call to erase)

Here is a summary of the state of user-organisation bindings before and after signing-out of an organisation:

  Description
U User
X Do not inherit (valid in local only)
- No user
Global Local ➡️ Global Local
- - ➡️ - -
- U ➡️ - -
- X ➡️ - -
U - ➡️ U X
U U ➡️ U X
U X ➡️ U X

When we are called to store/erase and we're not inside a Git repository (such as when called by some external tool that doesn't follow what Git does exactly) any writes to the --local configuration will silently fail.

This replaces attempt 1 that uses our own file for storing data.

@mjcheetham mjcheetham force-pushed the users-in-config branch 5 times, most recently from 31dbd00 to dba2aae Compare February 23, 2021 11:58
@mjcheetham mjcheetham marked this pull request as ready for review February 23, 2021 13:15
Copy link
Contributor

@vtbassmatt vtbassmatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! 🎉

docs/configuration.md Outdated Show resolved Hide resolved
docs/usage.md Outdated Show resolved Hide resolved

## User Accounts

In versions of Git Credential Manager that support Azure Access Tokens, the user
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth calling out from which version this is supported?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't know until a release is made, and the version number is dynamically assigned based on Git commit height and hash 😉

I'd be looking to change this after the release to include a version number.

be remembered.

The first time you clone, fetch or push from/to an Azure DevOps organization you
will be prompted to sign-in and select a user account. Git Credential Manager
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know on the Azure DevOps side, they say "identity" for the thing that represents a user. Genuine question, is that the same thing as "account" here (and if so, should we use their terminology)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can say "identity" if we want to be consistent with Azure DevOps.

In AAD there are users and accounts (an account != user.. one can span tenants and clouds, the other is in a particular cloud+tenant).

In Git there is only a "username".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess technically, you're selecting an Azure account when you sign-in, and not an Azure DevOps identity.. Just happens that Azure accounts are mapped to an AzDO identity.

Aside: there's also a FUN thing that if you've never signed-in to the web front-end of a particular Azure DevOps organisation, you cannot call any AzDevOps APIs because the "identity" doesn't exist yet - it's only materialised on first sign-in on the web. 🙄

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you've never signed-in to the web front-end of a particular Azure DevOps organisation, you cannot call any AzDevOps APIs

Let me know if you are seeing this - it means that someone forgot to turn on a feature flag.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this was the case back when GCM Core was first out. I remember hitting this exact issue and having to include support for surfacing human readable error messages (telling you do go log in via the web first) from Azure DevOps' REST API endpoints.

Copy link
Collaborator

@dscho dscho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed this PR purely with my Git for Windows maintainer hat on: how certain can we be that this PR minimizes the risk of breakages in case the feature flag is not turned on (i.e. in case we run without credential.azreposPatMode=false)?

Therefore, I did not review the tests, I did not review new methods/classes/interfaces, and I focused on the IsPersonalAccessTokenMode() code paths.

Theoretically, there would be a bit room to minimize above-mentioned risk: a couple of refactorings (such as renaming methods, extracting the GetAzureAccessTokenAsync() method from GenerateCredentialAsync(), etc) could be delayed, the authority cache could potentially be delayed, too.

At this point, however, I would be even warier of trying to disentangle those (good!) changes from this PR: I have tested (quite a bit) the resulting GCM Core with and without the feature flag turned on, and would be loathe to repeat that testing again (and would probably be less diligent about it due to sheer impatience about re-testing something I already tested, anyway).

So my vote is to go forward with it, merge it, release a new GCM Core version immediately. That way, we always have the escape hatch to downgrade GCM Core to GCM 2.0.374-beta (which was released, i.e. it is essentially GCM Core without this here PR).

Copy link

@mbarati0040 mbarati0040 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One technical change and a few doc comments. What a major effort!

docs/configuration.md Show resolved Hide resolved
docs/environment.md Outdated Show resolved Hide resolved
docs/azrepos-users-and-tokens.md Outdated Show resolved Hide resolved
When we moved to use the System.CommandLine library for command line
parsing, we neglected to update the exception handling to match the new
model.
Fix a bug in contruction of the remote URI when no username is provided,
but the caller wishes to include the username.
Improve the handling of input arguments with missing required fields
(e.g., protocol and host).
Replace the unused `ProgramData` and `Xdg` Git configuration level
enumeration members with an `Unknown` member. We never directly used
any of those, and don't really care(!)
Remove some unused Git configuration extension methods for querying for
entries based on 'split' keys (section.scope.property).

These are only used in tests!
Introduce a wrapper type representing a single entry in Git's
configuration, which is used in the `Enumerate` callback.
Teach `GitConfiguration::Enumerate` to parse results that include the
'level' of the Git config entry (--show-scope).
Expose the `Split` method from the `GitConfigurationKeyComparer` as a
`TrySplit` method and use this implementation for splitting keys in all
existing instances.

Also introduce a `GitConfiguration::Enumerate` extension method that
filters based on section and property name parts of Git config entries.
Change the way we interact with Git configuration so that we specify the
level filter in each method call, instead of requesting the
`GitConfiguration` object perform the filtering.
Add the ability to list the Git remotes for the current repository, as
well as resolve the current repository path.
Add a utility to extract the Azure DevOps organisation name from a
remote URL.
Add a new settings to the Azure Repos provider that instructs GCM to
return the Azure access token directly, rather than use that token to
generate a new Azure DevOps Personal Access Token (PAT).

At the moment the only indication as to what user account a user wants
to use is via the userinfo parts of the remote URL. This must be set
manually.

The default configuration is to continue to use PATs, for now.
Add a cache of the Azure backing authority for Azure DevOps orgs.
This cache is only consulted when the credential type is "oauth"
and not "pat".

We use Git's configuration as the persistence mechanism.
Add a command to enable clearing of the Azure authority cache manually.
Add a new binding manager component that can be used to 'bind' user
accounts to Azure DevOps organisations. This enables the Azure Repos
host provider to attempt silent authentication requests via MSAL - to
attempt to use an existing access token from the cache.

We only allow binding at the organisation level (and not any other
level) as this is the most common scenario: one user for an entire Azure
DevOps organisation.

If the user wishes to override the chosen user for a particular clone
they can do so by binding the user to the local repository
configuration, rather than the default global configuration.

Furthermore, if the user wishes to use a different user account for a
particular remote within a repository, they can set the username in the
remote URL.
Add commands to manually manager the user/org bindings for the Azure
Repos host provider.
Add some checks/guards against malformed data output from Git
configuration when enumerating all entries.

If we hit the unexpected end of the data stream we trace and stop
parsing.
Copy link
Contributor

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the parsing updates. This satisfies my only concern.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants