Skip to content

Conversation

huikang
Copy link
Contributor

@huikang huikang commented Nov 30, 2020

Description

Update the metric service and e2e test after upgrading to operator-sdk v1.2

  • expose operator's metric service at port 8080
  • switch to ginkgo for test-e2e

/cc @ewolinetz @alanconway
/assign @periklis

Links

@huikang
Copy link
Contributor Author

huikang commented Dec 1, 2020

/retest

Copy link
Contributor

@periklis periklis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work regarding enabling the metrics endpoint again. I would like a couple of concerns for discussion:

  • Why do we need to migrate our test-suite to ginkgo?
  • Using the controller-runtime envtest package might disconnect our E2E test from a real API-Server IIRC, doesn't it?
  • Does the new testing approach ensure that CRs/Secrets are removed after each test?

In general the PR looks like to solve two things at a time, i.e. metrics and refactoring the E2E test suite. I might have missed something, but let's discuss if we can split these two changes in separate PRs.

Comment on lines +195 to +198
echo "Complete e2e olm test"
$(JUNITMERGE) $$(find $$JUNIT_REPORT_OUTPUT_DIR -iname "*.xml") > $(JUNIT_REPORT_OUTPUT_DIR)/junit.xml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this temporary or to be replaced by merge-junit target?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is temporary and will be removed in the final version. I added this because sometimes I forgot to remove the tmp directory after each iteration.

@huikang
Copy link
Contributor Author

huikang commented Dec 1, 2020

/retest

1 similar comment
@huikang
Copy link
Contributor Author

huikang commented Dec 1, 2020

/retest

@huikang
Copy link
Contributor Author

huikang commented Dec 1, 2020

Awesome work regarding enabling the metrics endpoint again. I would like a couple of concerns for discussion:

* Why do we need to migrate our test-suite to ginkgo?

Since v1.0, operatro-sdk generates the suit-test.go based on ginkgo and gomega (https://sdk.operatorframework.io/docs/building-operators/golang/migration/#migrate-your-tests). To make the code repo consistent with the sdk, I think we'd better switch to ginkgo.

* Using the controller-runtime `envtest` package might disconnect our E2E test from a real API-Server IIRC, doesn't it?

Could you elaborate this? Or we can discuss this on google meeting.

* Does the new testing approach ensure that CRs/Secrets are removed after each test?

yes, secret and CRs are cleaned up after each iteration (https://github.com/openshift/elasticsearch-operator/pull/591/files#diff-b0fe4827e62a6e2e85531c9c733b581e143e23bed137002c15ec52d450c40319R44-R66)

In general the PR looks like to solve two things at a time, i.e. metrics and refactoring the E2E test suite. I might have missed something, but let's discuss if we can split these two changes in separate PRs.

The reason they are combined in a single PR is that since sdk 1.0 (https://sdk.operatorframework.io/docs/upgrading-sdk-version/v1.0.0/#library-changes), pkg/kube-metrics, pkg/metrics, pkg/test are all gone. Updating go.mod with one of metrics or test suite will break the CI. Therefore, it is inconvenient to split them into two PRs.

BTW, the reason I upgrade to v1.2 is because of an issue of packing service into CSV in v1.1 (https://groups.google.com/g/operator-framework/c/MFS9Ezf1f7I).

@huikang
Copy link
Contributor Author

huikang commented Dec 2, 2020

/retest

1 similar comment
@huikang
Copy link
Contributor Author

huikang commented Dec 2, 2020

/retest

@periklis
Copy link
Contributor

periklis commented Dec 2, 2020

Since v1.0, operatro-sdk generates the suit-test.go based on ginkgo and gomega (sdk.operatorframework.io/docs/building-operators/golang/migration/#migrate-your-tests). To make the code repo consistent with the sdk, I think we'd better switch to ginkgo.

If the E2E tests work for now without switching to ginkgo, I suggest to extract the ginkgo change in a separate PR. In general the team is still on a stand to keep ginkgo or not and we need to discuss this first before we move our suite. WDYT?

@huikang
Copy link
Contributor Author

huikang commented Dec 2, 2020

/retest

@huikang huikang force-pushed the LOG-980-library-change-sdk-1.2 branch from 1d7cd2d to e050fe3 Compare December 8, 2020 18:52
@huikang
Copy link
Contributor Author

huikang commented Dec 8, 2020

If the E2E tests work for now without switching to ginkgo, I suggest to extract the ginkgo change in a separate PR. In general the team is still on a stand to keep ginkgo or not and we need to discuss this first before we move our suite. WDYT?

Hi, @periklis , I have updated the commit of the test code in sdk 1.2: the ginkgo part has been removed.

@huikang huikang force-pushed the LOG-980-library-change-sdk-1.2 branch from e050fe3 to 2ae856b Compare December 8, 2020 20:58
@huikang
Copy link
Contributor Author

huikang commented Dec 8, 2020

/retest

3 similar comments
@huikang
Copy link
Contributor Author

huikang commented Dec 9, 2020

/retest

@huikang
Copy link
Contributor Author

huikang commented Dec 9, 2020

/retest

@huikang
Copy link
Contributor Author

huikang commented Dec 9, 2020

/retest

Copy link
Contributor

@periklis periklis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work! Way more concise!

@huikang
Copy link
Contributor Author

huikang commented Dec 9, 2020

/retest

@huikang huikang force-pushed the LOG-980-library-change-sdk-1.2 branch 2 times, most recently from ac11d72 to 12a5496 Compare December 9, 2020 15:26
@huikang
Copy link
Contributor Author

huikang commented Dec 9, 2020

/retest

Copy link
Contributor

@periklis periklis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work. One minor thing in deleting objects and we are set to go.

@huikang huikang force-pushed the LOG-980-library-change-sdk-1.2 branch from 12a5496 to 37f7d0c Compare December 10, 2020 17:53
@huikang
Copy link
Contributor Author

huikang commented Dec 10, 2020

/test e2e-operator

@huikang
Copy link
Contributor Author

huikang commented Dec 10, 2020

/retest

Comment on lines 78 to 86
for i := 0; i < 2; i++ {
name := fmt.Sprintf("elasticsearch-%v-cdm-%v-%d", esUUID, dataUUID, i+1)
key = types.NamespacedName{Name: name, Namespace: operatorNamespace}
esDeployment := &apps.Deployment{}
err = wait.Poll(cleanupRetryInterval, cleanupTimeout, func() (done bool, err error) {
err = k8sClient.Get(context.Background(), key, esDeployment)
if err != nil && errors.IsNotFound(err) {
return true, nil
}
fmt.Printf("Wait for cleaning up elasticsearch deployment %s\n", key.String())
return false, nil
})
if err != nil {
t.Error(err)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking of this a little bit more, we don't need to wait for deployments to be deleted. The work is done by the k8s GC, since we delete the elasticsearch CR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I changed to checking the status of the elasticsearch CR because somehow in the my local cluster the e2e failed between tests due to unchanged status after deleting the CR.

@huikang huikang force-pushed the LOG-980-library-change-sdk-1.2 branch from 64fc7e0 to ddce3d4 Compare December 11, 2020 16:16
@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 15, 2020
Copy link
Contributor

@alanconway alanconway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve
/lgtm

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 16, 2020
@jcantrill
Copy link
Contributor

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 16, 2020
@periklis periklis removed their assignment Dec 22, 2020
@ewolinetz
Copy link
Contributor

@jcantrill why did you place a hold on this?

Hui Kang added 3 commits January 11, 2021 19:27
- remove pinned operator-sdk
- upgrade controller-runtime to 0.6.3
- use operator-sdk v1.2.0 which resolves the problem of generating
service and serviceMonitor in previous version
- Remove imported test, test/e2eutil from old operator-sdk
- use k8s client from controller-runtime
@huikang huikang force-pushed the LOG-980-library-change-sdk-1.2 branch from 0c06f0d to 674b0ea Compare January 11, 2021 19:28
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jan 11, 2021
@huikang huikang changed the base branch from master to master-post-release January 11, 2021 19:30
@ewolinetz ewolinetz added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jan 11, 2021
@ewolinetz
Copy link
Contributor

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 11, 2021
@ewolinetz
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 11, 2021
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alanconway, ewolinetz, huikang, periklis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [alanconway,ewolinetz]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

5 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@ewolinetz
Copy link
Contributor

/refresh

@jcantrill
Copy link
Contributor

Manually merging as all tests passed and includes appropriate labels

@jcantrill jcantrill merged commit 4da3d86 into openshift:master-post-release Jan 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants