Best way to authenticate against a git repository in a build process #1601

Closed
matthid opened this issue Jun 3, 2018 · 14 comments

Comments

@matthid

matthid commented Jun 3, 2018

Description

One of our users reported fsprojects/Paket#3228.

The basic problem is authenticating against a TFS repository from within a build process.

Can we just use System.AccessToken? Is there documentation on that? Is there a credential manager we can use or implement ourselves (as there is for NuGet), or any other way to hook into the process?

@TingluoHuang
Contributor

@matthid

  1. System.AccessToken is generated per VSTS build/release job. Its lifetime equals the job's lifetime, and a new token is generated for the next job.
  2. With System.AccessToken, your requests to VSTS are authenticated as either the "Project Collection Build Service identity" or the "Project Build Service Identity" of your VSTS account, depending on your build/release definition settings.
  3. By default, that service account should have read permission on all repositories under your VSTS account, but customers can deny access to specific repositories from the web UI. https://docs.microsoft.com/en-us/vsts/pipelines/scripts/git-commands?view=vsts
  4. System.AccessToken is available to tasks by default, but using it in an ad-hoc script (PowerShell, command line) requires an additional setting. https://docs.microsoft.com/en-us/vsts/pipelines/scripts/powershell?view=vsts#oauth

I don't think I have a clear answer for you, but I'll try my best to give you more context.

Personally, I would suggest that customers always provide credentials explicitly instead of relying on a credential manager in a CI system.

@matthid
Author

matthid commented Jun 3, 2018

Personally I think the ideal solution would be if we could just "prepare" the environment in a way such that we can call git <...> and it "just works". This way third party tools will work. Is there some way to do this?

  • Set environment variables?
  • wrapper script around git?
  • A credential manager (which git itself would ask, and which would return the System.AccessToken)?

Do any of the above options exist today?

I don't think making users create PATs and set things up manually is very nice.

@TingluoHuang
Contributor

System.AccessToken is meant for this, but as I said, customers need to opt in to allow ad-hoc scripts to access System.AccessToken through environment variables.

Here is what I think we can do to make this a better experience for customers:

  1. Check whether the TF_BUILD environment variable exists. It is set when the process is running as a child process of the VSTS agent.
  2. If TF_BUILD is set, check whether SYSTEM_ACCESSTOKEN exists. If it doesn't, print a message such as "Your downstream operations may fail if they need credentials for VSTS" and link to the doc about the opt-in experience. https://docs.microsoft.com/en-us/vsts/pipelines/scripts/powershell?view=vsts#oauth
  3. Pass the token to git as an extra header: git -c http.extraheader="AUTHORIZATION: bearer <System_AccessToken>" clone https://github.com/microsoft/vsts-agent
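
The three checks above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `git_auth_args` and `clone_command` are hypothetical, and the sketch assumes the agent exposes the variables to child processes under the upper-case names `TF_BUILD` and `SYSTEM_ACCESSTOKEN`.

```python
import os


def git_auth_args(env=None):
    """Return extra git arguments carrying the job token, or [] if unavailable.

    Hypothetical helper; assumes the agent sets TF_BUILD and, once the
    opt-in is enabled, SYSTEM_ACCESSTOKEN in the environment.
    """
    env = os.environ if env is None else env
    # Step 1: not running under the agent, so leave git untouched.
    if "TF_BUILD" not in env:
        return []
    # Step 2: running under the agent, but the token was not exposed.
    token = env.get("SYSTEM_ACCESSTOKEN")
    if not token:
        print("Warning: SYSTEM_ACCESSTOKEN is not set; downstream operations "
              "may fail if they need credentials for VSTS.")
        return []
    # Step 3: pass the token as a one-off extra header via `git -c`.
    return ["-c", f"http.extraheader=AUTHORIZATION: bearer {token}"]


def clone_command(url, env=None):
    """Build the argument list for `git clone`, authenticated when possible."""
    return ["git", *git_auth_args(env), "clone", url]
```

The returned list could be handed to `subprocess.run(clone_command(url), check=True)`. Using `-c` keeps the header out of any config file, but it only affects the git process that is started directly.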

@matthid
Author

matthid commented Jun 4, 2018

"but as I said, customers need to opt in to allow ad-hoc scripts to access System.AccessToken through environment variables."

But that is only true for scripts, not for vsts-tasks, correct? We actually have a vsts-task where we want to set things up in such a way that it "just works" as well.

@TingluoHuang
Contributor

In that case, System.AccessToken is always available to a vsts-task, so you should be able to achieve your goal of making it "just work".

@matthid
Author

matthid commented Jun 4, 2018

But can I make git use it when third-party code sits between the task and the git call?

  1. VSTS task (with access to the access token) -> starts paket.exe -> starts git.exe
    How can git access the repository and use the token in that scenario?

To elaborate: with NuGet we can set up a "dummy" credential provider, and then:

  1. The VSTS task sets up the credential manager (basically, it puts a custom *CredentialProvider.exe in some path).
  2. VSTS task (with access to the access token) -> starts thirdparty.exe -> nuget.exe -> looks for credential managers and uses the one installed in step 1 -> starts OurCredentialProvider.exe, which uses the token.
  3. The VSTS task removes the credential manager.

@TingluoHuang
Contributor

@matthid
You can put the credential into the user-level git config:
git config --unset-all http.<your VSTS account>.extraheader
git config http.<your VSTS account>.extraheader "AUTHORIZATION: bearer <System_AccessToken>"
# run git operations
git config --unset-all http.<your VSTS account>.extraheader
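
The set / run / unset sequence above can be wrapped so that the cleanup runs even when the git operations fail. A minimal sketch, assuming a hypothetical helper named `scoped_extraheader`: the `scope` default of `--global` is an assumption based on the "user-level" wording (the commands as posted omit it), and `run` is injectable only so the sketch can be exercised without git.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def scoped_extraheader(account_url, token, scope="--global", run=subprocess.run):
    """Set http.<account_url>.extraheader for the duration of a `with` block.

    Hypothetical helper. `scope` is passed straight to `git config`;
    "--global" is an assumption, not something the posted commands include.
    """
    key = f"http.{account_url}.extraheader"
    unset = ["git", "config", scope, "--unset-all", key]
    # Clear any stale value first. git exits non-zero when the key is
    # absent, so the result is deliberately not checked.
    run(unset, check=False)
    run(["git", "config", scope, key, f"AUTHORIZATION: bearer {token}"], check=True)
    try:
        yield
    finally:
        # Always remove the header again, even if the body raised.
        run(unset, check=False)
```

Usage would look like `with scoped_extraheader(url, token): subprocess.run(["paket.exe", "restore"], check=True)`, where the command inside the block is whatever third-party tool ends up calling git. A hard crash of the whole process still skips the `finally` step.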

@matthid
Author

matthid commented Jun 4, 2018

Thanks. We will take a look at that, but I guess it should do.

matthid closed this as completed Jun 4, 2018

@danielmhair

@TingluoHuang Thank you for posting your last solution! I can verify that doing this fixes my issues with git log:

git config --unset-all http.<your VSTS account>.extraheader
git config http.<your VSTS account>.extraheader "AUTHORIZATION: bearer <System_AccessToken>"
# git log command here
git config --unset-all http.<your VSTS account>.extraheader

@steven-hyland

steven-hyland commented Aug 23, 2022

I came across this issue when I was searching for a solution to my problem, so I'm going to leave my solution here in case it helps someone else someday.

The above solution from TingluoHuang did not work for me (UPDATE: it does work if the --global option is included. Read on, and see my other comments below, for more context as to why). After adding that config setting, I continued to get errors like the one below. I also tried several variations with different http.xyz and credential.xyz config settings, and nothing worked. This could be because we are not using SSH keys and are just relying on the built-in Git for Windows credential manager, but I don't know for sure.

│ fatal: could not read Username for 'https://dev.azure.com': terminal
│ prompts disabled

In my case, I am running a Terraform script in my pipeline that references modules from other Git repositories (all repos are in ADO). So it's similar to matthid's case where there is an intermediary program invoking Git. Using one of the modules looks like this:

module "linux_web_app" {
  source              = "git::https://dev.azure.com/<omitted>/<omitted>/_git/<repo_name>?ref=v0.1"
  resource_group_name = var.resource_group_name
  service_name        = var.service_name
}

We have a few of these, and some of these also reference others in a nested way. On a developer's machine, this works well because the credential manager supplies the right creds to every connection. But in pipelines it didn't work because all our ADO projects/repositories are set to private. One option would be to pass in an ADO PAT, but I thought that should be unnecessary since the pipeline's access token is already available. At first, my solution was to overwrite the URL to include the token like so:

# $(System.AccessToken) is expanded by the pipeline before the script runs.
PATTERN="git::https://dev.azure.com"
REPLACEMENT="git::https://$(System.AccessToken)@dev.azure.com"
sed -i "s|$PATTERN|$REPLACEMENT|g" main.tf

This works fine but is a little clunky. It also doesn't work in the nested case, because when dependent modules are downloaded they don't have the token in the URLs, so then you have to go back and overwrite all instances in the module cache dir again, which is even more clunky.

In the end I found the insteadOf option in Git's config which works very well:

git config --global url.https://$(System.AccessToken)@dev.azure.com.insteadOf "https://dev.azure.com"
<rest of the script goes here>
git config --global --unset url.https://$(System.AccessToken)@dev.azure.com.insteadOf

This works very well in the pipeline and also preserves the behavior of just letting the credential manager do its thing when running on a developer's machine. Note that this requires the proper resources: and uses: blocks elsewhere in the pipeline.yml.
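
For a task that drives git from code, the same insteadOf rewrite can be built programmatically. The following is a sketch only: `instead_of_args` is a hypothetical helper, and it assumes the token contains no characters that would need escaping inside a URL.

```python
from urllib.parse import urlsplit, urlunsplit


def instead_of_args(base_url, token):
    """Build the `git config` commands that add and remove an insteadOf rule.

    Hypothetical helper: git will rewrite every URL starting with
    `base_url` so that it carries the token in the user part.
    """
    parts = urlsplit(base_url)
    # Insert "<token>@" in front of the host, leaving everything else as-is.
    authed = urlunsplit(parts._replace(netloc=f"{token}@{parts.netloc}"))
    key = f"url.{authed}.insteadOf"
    set_cmd = ["git", "config", "--global", key, base_url]
    unset_cmd = ["git", "config", "--global", "--unset", key]
    return set_cmd, unset_cmd
```

Running `set_cmd` before the script and `unset_cmd` in a `finally` block afterwards mirrors the two `git config` lines above. Note that the config key itself contains the token, so the rule lives in the global config file until it is unset.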

@GaTechThomas

Be careful when setting global git configs on agent machines, since global settings can cause side effects in other processes. Specifically, this injects the current job's token at a level accessible to other processes (current and later), and that token is short-lived. That makes it both a security issue and a breaking change for those other processes the moment the token is invalidated. Additionally, on rare occasions something breaks during a pipeline run and the token cleanup step is never called (we MUST assume that any call can be the last call that occurs; think BSOD). That leaves git broken on the machine until a hard cleanup is done, and it is a beast to diagnose because it only fails on subsequent pipeline runs.

We had the same terraform git reference issue, and the solution that worked for us with no side effects was to:

  1. Perform the usual checkout of the terraform repo
  2. Find all git references across all terraform files in the local copy of the repo and replace the org name in the URL with the System.AccessToken.
  3. Perform terraform init and the usual subsequent commands

This works every time, is scoped to the current pipeline run, and does not have cross-process security implications.
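
Step 2 could look roughly like the sketch below. It is an illustration rather than the script described above: the helper names are hypothetical, and it assumes module sources of the form `git::https://[<user>@]dev.azure.com/...` in `*.tf` files, with the token placed in the user part of the URL.

```python
import re
from pathlib import Path

# Matches a git::https source on dev.azure.com, with or without a "<user>@" part.
_SOURCE = re.compile(r'(git::https://)(?:[^@/"]+@)?(dev\.azure\.com/)')


def inject_token(text, token):
    """Return `text` with the token set as the user part of matching URLs."""
    # A function replacement avoids backslash handling in the token value.
    return _SOURCE.sub(lambda m: f"{m.group(1)}{token}@{m.group(2)}", text)


def rewrite_tree(root, token):
    """Rewrite every *.tf file under `root` in place; return the changed paths."""
    changed = []
    for path in sorted(Path(root).rglob("*.tf")):
        original = path.read_text(encoding="utf-8")
        updated = inject_token(original, token)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

Calling `rewrite_tree(".", token)` after the checkout and before `terraform init` keeps the change scoped to the working copy of the current run. The rewrite is idempotent, so running it twice leaves the files unchanged.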

One caveat: If the terraform has nested git references (i.e., one terraform repo references another repo that also has repo references) then it becomes much more difficult to deal with. However, the recommendation from hashicorp/terraform is that nested references should not be a practice - such practice is an indicator of other structural problems that should be resolved first.

@steven-hyland

steven-hyland commented Aug 24, 2022

Thanks for the comment, but there are a few other things that affect what I think you're saying. Sorry if these weren't clear:

  • since a downstream program is invoking git, just setting the config locally doesn't work (or maybe it's a Terraform specific thing), so the --global option is required
  • this process is occurring on ephemeral build images that are recycled after every job, so the noted concerns about security or breaking other processes do not apply

We have the nested git references problem that you describe. We were initially doing your same steps to resolve, but the problem was that terraform init would break and fail because it downloads the other modules during that step. So the process was to run terraform init and fix up the URLs in a loop until it didn't break anymore. This is dumb, so we are very incentivized to find a cleaner approach, as I'm sure you can understand.

The current method does effectively the same thing - replacing the URL to include the token - just at the git config level instead of in each individual file. It just makes sense to me that if the git credential manager on my dev machine can handle this scenario fine, there's got to be a similarly simple way to make it work in CI.

The recommendation from HashiCorp is noted and thanks for mentioning it. I would be interested to know the specifics of the "structural problems" they mention if you have a link.

@steven-hyland

steven-hyland commented Aug 24, 2022

Well, either way, your comment got me thinking again, and after some experimentation I just discovered that the previously posted solution with the extraHeader option does seem to work if I include the --global option. Without that, it just continues prompting for a username/password, which it can't do because this is running in CI.

I guess that method and the one I posted are essentially equivalent, but after thinking about it more I do take your point about breaking other processes. Maybe someday in the future this will not be running on an ephemeral image and it will suddenly start breaking other things. Future me won't like that very much. I'll go back and include a git config --global --unset command at the end of the jobs to account for that. I'll update my first comment above too. Thanks again for your comment - always something new to learn :)

@PoulNielsen

We do this in a bash task during the build, with 'Allow scripts to access the OAuth token' enabled on the agent phase (this 'Enables scripts and other processes launched by tasks to access the OAuth Token through the System.AccessToken variable').
On a single line we run:

GIT_CONFIG_COUNT=1 GIT_CONFIG_KEY_0="http.extraHeader" GIT_CONFIG_VALUE_0="Authorization: Bearer $SYSTEM_ACCESSTOKEN" composer update

The same works with 'git' directly:
GIT_CONFIG_COUNT=1 GIT_CONFIG_KEY_0="http.extraHeader" GIT_CONFIG_VALUE_0="Authorization: Bearer $SYSTEM_ACCESSTOKEN" git clone XXXX

This way the token is never expanded into, or stored in, any file.
(From https://stackoverflow.com/questions/11262010/shell-variable-expansion-in-git-config)
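
The same environment-variable approach can be used from a task that launches a child process. A minimal sketch, assuming hypothetical helper names and a git version that supports `GIT_CONFIG_COUNT` (Git 2.31 or later, as far as I know):

```python
import os
import subprocess


def git_env_with_header(token, base_env=None):
    """Return a copy of the environment that injects http.extraHeader into git.

    Hypothetical helper. It overwrites any GIT_CONFIG_COUNT already present
    in the environment, which is a simplification.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GIT_CONFIG_COUNT"] = "1"
    env["GIT_CONFIG_KEY_0"] = "http.extraHeader"
    env["GIT_CONFIG_VALUE_0"] = f"Authorization: Bearer {token}"
    return env


def run_with_token(cmd, token):
    """Run `cmd` so that any git process it spawns inherits the header."""
    return subprocess.run(cmd, env=git_env_with_header(token), check=True)
```

For example, `run_with_token(["composer", "update"], os.environ["SYSTEM_ACCESSTOKEN"])` corresponds to the one-liner above. Because the setting lives only in the child's environment, nothing has to be cleaned up afterwards.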
