Fix flaky CI with a complex test case #114

Merged
maruina merged 82 commits into main from flaky-ci
Jul 8, 2021

Conversation

nebojsa-prodana
Contributor

@nebojsa-prodana nebojsa-prodana commented Jul 2, 2021

Fix #103

We should change this test to better reflect a real-world scenario. Internally, we have been using the ProgressiveSync reported in #103 as our default, so this PR changes the test to use that ProgressiveSync.

In addition, I changed a couple of helper functions to be more generic, for when we want to add more tests with different combinations of clusters.

TODO

This is a fork of #104 on commit cab8fff

maruina and others added 30 commits June 17, 2021 15:44
@maruina maruina changed the title Flaky ci Fix flaky CI with a complex test case Jul 5, 2021
@maruina
Contributor

maruina commented Jul 5, 2021

We found multiple business logic issues:

  • 271996a: we now also re-schedule the progressing applications for sync so we can annotate them. This protects us against the case where someone syncs manually from the ArgoCD UI and we were not annotating the app
  • 40824ef: we were missing a corner case in the scheduling logic
  • fab6f11: we were not setting Requeue: true
  • if the label selector returns an empty selection, meaning apps == nil, we decided to mark the stage as completed
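
The last two points can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not the controller's actual code: `Result` stands in for controller-runtime's `ctrl.Result`, and `App`, `StageStatus`, and `reconcileStage` are hypothetical names introduced only for this example.

```go
package main

import "fmt"

// Result is a stand-in for controller-runtime's ctrl.Result.
// Assumption: only the Requeue field matters for this sketch.
type Result struct {
	Requeue bool
}

// App is a hypothetical, simplified view of an ArgoCD Application.
type App struct {
	Name   string
	Synced bool
}

// StageStatus is a hypothetical status for a ProgressiveSync stage.
type StageStatus string

const (
	StageCompleted   StageStatus = "Completed"
	StageProgressing StageStatus = "Progressing"
)

// reconcileStage decides the stage status and whether to requeue.
func reconcileStage(apps []App) (StageStatus, Result) {
	// An empty selection (apps == nil or zero length) has nothing to sync,
	// so the stage is treated as completed.
	if len(apps) == 0 {
		return StageCompleted, Result{}
	}
	for _, app := range apps {
		if !app.Synced {
			// Work is still pending: ask for another reconcile pass.
			return StageProgressing, Result{Requeue: true}
		}
	}
	return StageCompleted, Result{}
}

func main() {
	status, res := reconcileStage(nil)
	fmt.Println(status, res.Requeue)

	status, res = reconcileStage([]App{{Name: "foo-bar", Synced: false}})
	fmt.Println(status, res.Requeue)
}
```

Treating the empty selection as completed lets a later stage proceed instead of waiting on a stage that has no applications to sync.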

Makefile Outdated
@@ -23,7 +23,7 @@ lint: fmt vet

# Run tests
test: tools generate fmt vet manifests
-	ginkgo -r --randomizeAllSpecs --randomizeSuites --failOnPending --cover -coverprofile=../coverage.out --trace --race --progress
+	ginkgo -reportPassed -r --failOnPending --cover -coverprofile=../coverage.out --trace --race --progress
Contributor

We need to remove `-reportPassed` before merging

@maruina maruina mentioned this pull request Jul 5, 2021
3 tasks
Contributor

@DimitarHristov111 DimitarHristov111 left a comment

We reviewed it together yesterday and the logic seems fine. We still need to verify that it works as expected using the snapshot image. There are a few comments that came up during the review and I left them below.

controllers/progressivesync_controller_test.go (outdated, resolved)
controllers/progressivesync_controller_test.go (outdated, resolved)
with:
  # list of Docker images to use as base name for tags
  images: |
    ghcr.io/skyscanner/applicationset-progressive-sync
Contributor

We're now pushing to GitHub Packages when opening a PR. This allows us to pull a specific Docker image for e2e testing before merging it into main.

@@ -23,7 +23,7 @@ lint: fmt vet

# Run tests
test: tools generate fmt vet manifests
-	ginkgo -r --randomizeAllSpecs --randomizeSuites --failOnPending --cover -coverprofile=../coverage.out --trace --race --progress
+	ginkgo -r --failOnPending --cover -coverprofile=../coverage.out --trace --race --progress
Contributor

Remove the randomize options, at least until we address #111.
Until the controller is scoped to a single namespace, the ginkgo random options might cause different objects to pollute the test namespace.
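
To illustrate the pollution concern, here is a rough sketch of the difference between an unscoped and a namespace-scoped listing. `Object`, `listAll`, and `listInNamespace` are hypothetical names for this example only; the real controller lists objects through the Kubernetes client rather than a slice.

```go
package main

import "fmt"

// Object is a hypothetical, minimal stand-in for a namespaced Kubernetes object.
type Object struct {
	Namespace string
	Name      string
}

// listAll mimics an unscoped controller: it sees every object in the cluster.
func listAll(cluster []Object) []Object {
	return cluster
}

// listInNamespace mimics a controller scoped to a single namespace.
func listInNamespace(cluster []Object, ns string) []Object {
	var out []Object
	for _, o := range cluster {
		if o.Namespace == ns {
			out = append(out, o)
		}
	}
	return out
}

func main() {
	cluster := []Object{
		{Namespace: "test-a", Name: "app-1"},
		{Namespace: "test-b", Name: "app-2"}, // created by a different spec
	}
	// The unscoped listing also returns the object from the other spec.
	fmt.Println(len(listAll(cluster)), len(listInNamespace(cluster, "test-a")))
}
```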


// Get the Applications to update
-	scheduledApps := scheduler.Scheduler(apps, stage)
+	scheduledApps := scheduler.Scheduler(log, apps, stage)
Contributor

The scheduler logging has proved to be quite valuable. I propose we keep it in this PR and fix forward in #115

@@ -745,10 +1104,10 @@ func TestSync(t *testing.T) {

testAppName := "foo-bar"

-	application, error := r.syncApp(ctx, testAppName)
+	application, err := r.syncApp(ctx, testAppName)
Contributor

The IDE was complaining that `error` is a built-in type
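
For context, a small sketch of why the rename matters. In Go, `error` is a predeclared identifier, so a variable with that name shadows the type for the rest of the scope. The `syncApp` function below is a hypothetical stand-in for the `r.syncApp` method in the test, with a simplified signature.

```go
package main

import (
	"errors"
	"fmt"
)

// syncApp is a hypothetical stand-in for the r.syncApp call in the test.
func syncApp(name string) (string, error) {
	if name == "" {
		return "", errors.New("empty application name")
	}
	return name, nil
}

func describe(name string) string {
	// Writing `application, error := syncApp(name)` would also compile, but
	// from then on `error` would refer to the variable in this scope, and a
	// later declaration such as `var other error` would fail to compile.
	application, err := syncApp(name)
	if err != nil {
		return "sync failed: " + err.Error()
	}
	return "synced " + application
}

func main() {
	fmt.Println(describe("foo-bar"))
	fmt.Println(describe(""))
}
```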

Collaborator

@sledigabel sledigabel left a comment

Some small comments, overall LGTM

controllers/progressivesync_controller.go (outdated, resolved)
controllers/progressivesync_controller.go (resolved)
controllers/progressivesync_controller.go (resolved)
internal/utils/utils.go (resolved)
Collaborator

@sledigabel sledigabel left a comment

LGTM!

@maruina maruina merged commit 7c57da0 into main Jul 8, 2021
@maruina maruina deleted the flaky-ci branch July 8, 2021 07:43
Development

Successfully merging this pull request may close these issues.

Infinite loop when multiple stage targeting the same clusters
5 participants