Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gitlab: duplicate pipelines in gitlab merge requests #3373

Open
paulozava opened this issue May 3, 2023 · 14 comments
Open

Gitlab: duplicate pipelines in gitlab merge requests #3373

paulozava opened this issue May 3, 2023 · 14 comments
Labels
bug Something isn't working provider/gitlab

Comments

@paulozava
Copy link

paulozava commented May 3, 2023

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.

Overview of the Issue

We updated our Atlantis from v0.20.1 to v0.23.5 and since then we detect that the Atlantis steps are running in a different pipeline than the project, so we now have double pipelines.

Reproduction Steps

Logs

Environment details

  • Atlantis version: v0.23.5
  • Deployment method: ecs
  • Atlantis flags:

Atlantis server-side config file:

atlantis server --gitlab-hostname="$git_hostname" \
                         --gitlab-token="$git_token" \
                         --gitlab-user="$git_username" \
                         --gitlab-webhook-secret="$git_webhook" \
                         --checkout-strategy='merge' \
                         --enable-diff-markdown-format \
                         --repo-whitelist='*' \
                         --autoplan-file-list='**/*.tf,**/*.tfvars,**/*.hcl,*.tf,*.tfvars,*.hcl' \
                         --repo-config="/atlantis/repos.yaml"
repos:
- id: /.*/
  allowed_overrides: [apply_requirements, workflow]
  allow_custom_workflows: true
  pre_workflow_hooks:
    - run: python3 /scripts/inject_policy_checker.py
- id: /.*comp\/realestate.*/
  allowed_overrides: [apply_requirements, workflow]
  allow_custom_workflows: true
  pre_workflow_hooks:
    - run: python3 /scripts/inject_policy_checker.py
    - run: python3 /scripts/inject_infracost.py
  post_workflow_hooks:
    - run: |
        if [ ! -d "/tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM" ]; then
          exit 0
        fi

        infracost comment gitlab --repo $BASE_REPO_OWNER/$BASE_REPO_NAME \
                                 --merge-request $PULL_NUM \
                                 --path /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/'*'-infracost.json \
                                 --gitlab-server-url https://git.example.com \
                                 --gitlab-token $ATLANTIS_GITLAB_TOKEN \
                                 --behavior new

        rm -rf /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM
- id: /.*infrastructure\/vv\/.*/
  allowed_overrides: [apply_requirements, workflow]
  allow_custom_workflows: true
  pre_workflow_hooks:
    - run: python3 /scripts/inject_policy_checker.py
    - run: python3 /scripts/inject_infracost.py
  post_workflow_hooks:
    - run: |
        if [ ! -d "/tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM" ]; then
          exit 0
        fi

        infracost comment gitlab --repo $BASE_REPO_OWNER/$BASE_REPO_NAME \
                                 --merge-request $PULL_NUM \
                                 --path /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/'*'-infracost.json \
                                 --gitlab-server-url https://git.example.com \
                                 --gitlab-token $ATLANTIS_GITLAB_TOKEN \
                                 --behavior new

        rm -rf /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM
      description: Sending Gitlab comment with the table
- id: /.*comp\/motors\/.*/
  allowed_overrides: [apply_requirements, workflow]
  allow_custom_workflows: true
  pre_workflow_hooks:
    - run: python3 /scripts/inject_policy_checker.py
    - run: python3 /scripts/inject_infracost.py
  post_workflow_hooks:
    - run: |
        if [ ! -d "/tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM" ]; then
          exit 0
        fi

        infracost comment gitlab --repo $BASE_REPO_OWNER/$BASE_REPO_NAME \
                                 --merge-request $PULL_NUM \
                                 --path /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/'*'-infracost.json \
                                 --gitlab-server-url https://git.example.com \
                                 --gitlab-token $ATLANTIS_GITLAB_TOKEN \
                                 --behavior new

        rm -rf /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM

Repo atlantis.yaml file:

version: 3
automerge: false
parallel_plan: true
parallel_apply: true
projects:
  - name: production
    dir: infrastructure
    terraform_version: v1.0.7
    workflow: production
    workspace: production
    apply_requirements:
      - approved
    autoplan:
      when_modified: [../**/*]
  - name: staging
    dir: infrastructure
    terraform_version: v1.0.7
    workflow: staging
    workspace: staging
    apply_requirements:
      - approved
    autoplan:
      when_modified: [../**/*]
  - name: terragrunt-example
    dir: infrastructure-terragrunt
    workflow: terragrunt
    terraform_version: v1.0.7
    apply_requirements:
      - approved
    autoplan:
      when_modified: [../**/*]
workflows:
  production:
    plan:
      steps:
        - run: rm -rf .terraform*
        - init:
            extra_args:
              - -backend-config=../backends/production.backend.tfvars
        - plan:
            extra_args:
              - -compact-warnings
              - -var 'region=us-west-1'
              - -var 'account=xxx'
              - -var 'role_name=atlantis'
    apply:
      steps:
        - apply
  staging:
    plan:
      steps:
        - run: rm -rf .terraform*
        - init:
            extra_args:
              - -backend-config=../backends/staging.backend.tfvars
        - plan:
            extra_args:
              - -compact-warnings
              - -var 'region=us-west-1'
              - -var 'account=xxx'
              - -var 'role_name=atlantis'
    apply:
      steps:
        - apply
  terragrunt:
    plan:
      steps:
        - env:
            name: TF_IN_AUTOMATION
            value: true
        - env:
            name: TERRAGRUNT_IAM_ROLE
            value: arn:aws:iam::xxx:role/atlantis
        - run: terragrunt init --terragrunt-source-update --terragrunt-non-interactive
        - run: terragrunt plan -out=$PLANFILE
    apply:
      steps:
        - env:
            name: TF_IN_AUTOMATION
            value: true
        - env:
            name: TERRAGRUNT_IAM_ROLE
            value: arn:aws:iam::xxx:role/atlantis
        - run: terragrunt apply $PLANFILE
@paulozava paulozava added the bug Something isn't working label May 3, 2023
@jamengual
Copy link
Contributor

did you check the other issues? there was a PR that fixed this a while back.

@paulozava
Copy link
Author

I didn't find anything similar open, but found this PR #2745

@paulozava
Copy link
Author

It is the same behaviour reported here #2743

@jamengual
Copy link
Contributor

@jukie do you have any ideas?

@michelmzs
Copy link
Contributor

michelmzs commented May 3, 2023

@paulozava Please, could you confirm the GitLab version are you using (gitlab.com or self-hosted)? And the .gitlab-ci.yaml config too.

@paulozava
Copy link
Author

Sure, we are using GitLab self-hosted at version 15.9.4-ee

The .gitlab-ci.yaml:

image: registry.corp.com/infrastructure/verticals/linters:latest

stages:
  - Lint
  - Deploy

YAML lint:
  stage: Lint
  script:
    - find . -name "*.yml" -o -name "*.yaml" | xargs -n 1 yamllint -d relaxed --no-warnings
  only:
    - merge_requests
  tags: [k8s-motors]

Kustomize Lint:
  stage: Lint
  variables:
    K8S_INFRASTRUCTURE_FILES_FOLDER: k8s-clusterconfig
  script:
    - kustomize build $K8S_INFRASTRUCTURE_FILES_FOLDER/staging > /dev/null
    - kustomize build $K8S_INFRASTRUCTURE_FILES_FOLDER/production > /dev/null
    - kustomize build argocd-apps/production > /dev/null
    - kustomize build argocd-apps/staging > /dev/null
  only:
    - merge_requests
  tags: [k8s-motors]

```

@rjmsilveira
Copy link

rjmsilveira commented Jun 29, 2023

Hi everyone,

Meanwhile I was able to perform some tests and identify why we get duplicate pipelines and I believe the problem may be the way the implementation of #2745 was done.

Let me try to explain.

If you have a project which contains workflows to create MR pipelines, the MR 2745 works just fine and Atlantis pipeline will appear in the same MR pipeline (as expected)

If you enable in gitlab MR options the setting "Enable merged results pipelines" and then push a commit, the MR pipeline will run in a different pipeline than the Atlantis which will now run a branch pipeline.
This in reality is not a big deal. The problem happens when you revert the setting above

When you revert it, you would expect the normal behavior (both Atlantis and the project pipeline in the same line) but in reality what happens is that the behavior above is still kept (2 separate pipelines)

From what I could understand the issue is with this bit of code which gets the head pipeline from the MR

mr, err := g.GetMergeRequest(pull.BaseRepo.FullName, pull.Num)
if err != nil {
	return err
}

The issue is that the api call above will return the Atlantis pipeline as being the head pipeline and mr.Pipeline.Source will be external

From that moment on, unless you trigger a new pipeline for the project, the last pipeline will always be the one from Atlantis which is identified as external and the pipelines will never "merge"

I was also able to test that if the function is just

func (g *GitlabClient) UpdateStatus(repo models.Repo, pull models.PullRequest, state models.CommitStatus, src string, description string, url string) error {
	gitlabState := gitlab.Pending
	switch state {
	case models.PendingCommitStatus:
		gitlabState = gitlab.Running
	case models.FailedCommitStatus:
		gitlabState = gitlab.Failed
	case models.SuccessCommitStatus:
		gitlabState = gitlab.Success
	}

	options := &gitlab.SetCommitStatusOptions{
		State:       gitlabState,
		Context:     gitlab.String(src),
		Description: gitlab.String(description),
		TargetURL:   &url,
	}

	_, _, err := g.Client.Commits.SetCommitStatus(repo.FullName, pull.HeadCommit, options)

	return err
}

Deliberately not passing refTarget seems to work for all cases and I guess Gitlab will always try to see what use-case it fits best? 🤷‍♂️

I don't have any use-case or example of when an event is anything else than a merge_request_event to test and would be cool if someone could present some examples of the pipelines that originally were being duplicated and that got solved with the MR above.

I also noticed that (at least for Gitlab) the supported Webhook events are Comments and Merge Request events and I am wondering if all the logic from the MR is ever going to be needed.

Thank you in advance for your time and help, and happy to propose an MR with any sort of fix but just want to better understand first the motivation for the code above and how we can make it work for all use-cases 😃

@jseiser
Copy link

jseiser commented Jul 11, 2023

We self host 15.11.2-ee and have this same problem.

@rjmsilveira
Copy link

Pinging @michelmzs (I should've done it sooner) as you were the author of the initial MR

@WarpRat
Copy link

WarpRat commented Oct 19, 2023

Is anyone currently working on this? This has been a pain point for us for a while. I'd be willing to take a shot at implementing a fix if nobody else has work in flight.

@jamengual
Copy link
Contributor

jamengual commented Oct 19, 2023 via email

@WarpRat
Copy link

WarpRat commented Oct 19, 2023

Yes, we're still seeing this issue on 0.26.0 with merged results pipelines enabled. Based on the description in this comment I'm not sure if it's even possible to have converged merged results pipelines and external status pipelines which would be ideal. However, we still see the behavior described in that comment where even disabling merged results pipelines still results in a merge request pipeline with our jobs and a separate branch pipeline with the Altantis jobs. If we don't use merge request pipelines at all then everything works as expected in the branch pipeline. If we have to stick to branch pipelines only we can do that but it sounds like it would be a minor fix to get things at least working in merge request pipelines again.

We're on Gitlab SaaS which is currently 16.5.

@michelmzs
Copy link
Contributor

michelmzs commented Oct 21, 2023

Hi everyone,

Meanwhile I was able to perform some tests and identify why we get duplicate pipelines and I believe the problem may be the way the implementation of #2745 was done.

Let me try to explain.

If you have a project which contains workflows to create MR pipelines, the MR 2745 works just fine and Atlantis pipeline will appear in the same MR pipeline (as expected)

If you enable in gitlab MR options the setting "Enable merged results pipelines" and then push a commit, the MR pipeline will run in a different pipeline than the Atlantis which will now run a branch pipeline. This in reality is not a big deal. The problem happens when you revert the setting above

When you revert it, you would expect the normal behavior (both Atlantis and the project pipeline in the same line) but in reality what happens is that the behavior above is still kept (2 separate pipelines)

From what I could understand the issue is with this bit of code which gets the head pipeline from the MR

mr, err := g.GetMergeRequest(pull.BaseRepo.FullName, pull.Num)
if err != nil {
	return err
}

The issue is that the api call above will return the Atlantis pipeline as being the head pipeline and mr.Pipeline.Source will be external

From that moment on, unless you trigger a new pipeline for the project, the last pipeline will always be the one from Atlantis which is identified as external and the pipelines will never "merge"

I was also able to test that if the function is just

func (g *GitlabClient) UpdateStatus(repo models.Repo, pull models.PullRequest, state models.CommitStatus, src string, description string, url string) error {
	gitlabState := gitlab.Pending
	switch state {
	case models.PendingCommitStatus:
		gitlabState = gitlab.Running
	case models.FailedCommitStatus:
		gitlabState = gitlab.Failed
	case models.SuccessCommitStatus:
		gitlabState = gitlab.Success
	}

	options := &gitlab.SetCommitStatusOptions{
		State:       gitlabState,
		Context:     gitlab.String(src),
		Description: gitlab.String(description),
		TargetURL:   &url,
	}

	_, _, err := g.Client.Commits.SetCommitStatus(repo.FullName, pull.HeadCommit, options)

	return err
}

Deliberately not passing refTarget seems to work for all cases and I guess Gitlab will always try to see what use-case it fits best? 🤷‍♂️

I don't have any use-case or example of when an event is anything else than a merge_request_event to test and would be cool if someone could present some examples of the pipelines that originally were being duplicated and that got solved with the MR above.

I also noticed that (at least for Gitlab) the supported Webhook events are Comments and Merge Request events and I am wondering if all the logic from the MR is ever going to be needed.

Thank you in advance for your time and help, and happy to propose an MR with any sort of fix but just want to better understand first the motivation for the code above and how we can make it work for all use-cases 😃

Sorry for the very late response, guys.

Removing the refTarget from setCommitStatus will work for Merge Result pipelines, but the behavior described in #2484 will occur again. Without the ref from head_pipeline, when a user checkout from a branch, the CommitStatus will be issued to the HEAD branch.

Also, I think that the duplicated pipeline problem not solved by the mentioned PRs is the race condition of GitLab CI Pipeline creation x Atlantis Webhook process, sometimes Atlantis will be faster than GitLab Pipeline creation, so the CommitStatus will be set in a different Pipeline ID.
There is a discussion about it in #3887

@jamengual
Copy link
Contributor

@X-Guardian @lukemassa

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working provider/gitlab
Projects
None yet
Development

No branches or pull requests

7 participants