-
Notifications
You must be signed in to change notification settings - Fork 345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improves reliability of e2e testing #3382
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great workflow improvements! There are only a few nitpicks.
Also, I'm curious -- after the changes should we choose global e2e testing as default, and only go for namespaced testing only when it's truly necessary? |
The way I've approached splitting the tests is only if the test has to be namespaced should it be so. For example, those tests that require adding an env var to the operator need to be namespaced whilst those tests that add changes to the integration platform can still be global. Therefore, the namespaced tests tend to be defined as install-related, cli-related and change-to-the-operator-related. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
@@ -49,6 +51,8 @@ rules: | |||
- messaging.knative.dev | |||
resources: | |||
- subscriptions | |||
- channels | |||
- inmemorychannels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is being fixed by the proper addressable-resolver cluster role binding in #3400
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it is no longer necessary to add these to the permissions and can be removed from this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this is obsolete and you can remove that because it is already covered by the addressable-resolver cluster role provided by Knative itself. In the end Camel K just uses a cluster role binding in order to bind the operator service account to this addressable-resolver cluster role. This should give all read permissions to Knative resources such as brokers, channels, etc.
Not sure about these create/delete permissions that you have also added in this PR for brokers and routes. Do we need that for some of the e2e tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@christophd
All additions to the permissions have been required due to tests failure on OCP4 and log messages to the affect that 'systemaccount... was forbidden from ${verb}ing x:y:z'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phantomjinx yep makes sense, I wonder though if the write permissions are only required for the e2e tests. I can not recall that Camel K creates Knative brokers in production code
The OpenShift workflow keeps failing, so there seems to be something which may not be just flaky. |
@squakez |
It really fails one of the promotion tests:
|
Openshift still failing but the Promote is now passing. |
* Permissions necessary for knative to operate with camel-k
* Migrate to global operator when appropriate * Relocated tests to differentiate between cases that can run as with a global operator, if preferred, and those requiring a namespaced operator * New KamelInstall function that will not install an operator if the env var CAMEL_K_FORCE_GLOBAL_TEST is set. Instead it'll check if a global operator is available then install just the IntegrationPlatform * If test is being run as global test then used the global operator namespace rather than the given test namespace * Add channel to global operator install * Use global operator namespace for getting logs * Fixes OperatorPod function in test_support * Use KamelInstall rather than Kamel("install", ...) to support global * Fix order of uninstall cmd steps * Add ServicesByType function for testing a service's spec type * Add debugging and logging * Removes panic() in favour of failing test to allow for proper teardowns * Requires assigning the test (locus) to test support * Adds pre-cleanup jobs to each workflow * The name of the catalogsource was being assumed buried in the bundle build. Instead, move it to config-cluster, allowing pre-clean to occur after config but before build. * pre-clean must be moved since it clears the catalog source which has just been installed. * Rename e2e-kubernetes to e2e-common * Separates out e2e-install tests - namespace scoped tests * Specifies catalogsource name and namespace as inputs * If using OLM, the CRDs will be uninstalled prior to the call to uninstall the integration platform resulting in an error that integrationPlatform type cannot be found. * Cleanup preflight namespace after testing * Do not throw error if uninstall returns an error condition * Removes the catalogsource to ensure the index image of the catalog source is no longer cached. * Clean up orphan resources in cluster - integrations, platforms, kamelets, operatorgroups * Don't rely on kamel binary for uninstallation * Seems on some clusters it takes tests up to 10 minutes to get the integrations up and running * Sets tests to PROBLEMATIC for later fixing * Fixes flaky tests to make more reliable * Fix Test: Rest Query needs to look at correct namespace * KNative test fix: Move knative install after pre-clean * service-trait test failure * On OCP4, its possible for the services to be created in a different order and the ExternalName svc ends up being named "platform-http-server" * Adds ServicesByType function that fetches services according to the type field
* If not the profile's of integrations are updated to "Knative"
* Seems camel-k operator sometimes starts then immediately restarts so mitigate by extra waits and extra checks to ensure it is running and stable
* A system:puller rolebinding is necessary on openshift to ensure the images in one namespace can be pulled by another * Continue collecting the source kamelets even if there are no traits * Avoid json syntax errors if the rawmessages from the traits are empty * When validating kamelets, need to drop any namedconfig path extensions from the kamelet URIs so that the names can be directly compared * promotion_test.go * Have the test await for the readiness of the platform before doing the tests. Promotion depends on the info of the platform being complete.
* Stop 415 error occurring on Openshift 3.11
@squakez @christophd @tadayosi The oppenshift tests are failing but the errors are much more random (quarkus-native tests not spinning up, Environment tests failing then passing ....). The same tests are comfortably passing on OCP4 test suite so the tests themselves and camel-k seem less a problem than the OCP 3.11 platform. I can continue looking but the priority feels a lot lower now. |
@phantomjinx After merging this, it appears that the
is this something we can improve? |
Major changes