E2E development allowing injection of secrets denoting testing on custom clusters #2711

Merged
astefanutti merged 40 commits into apache:main from the e2e-refactor branch on Jan 31, 2022

Contributor

@phantomjinx phantomjinx commented Oct 21, 2021

Extensive refactoring of the e2e tests to support injecting secrets that describe custom clusters on which the tests can be executed.

Tests should use the following secrets:

  • E2E_CLUSTER_CONFIG: A key=value plaintext file providing configuration of the cluster

    • kube-admin-user-ctx: The context of the admin user as detailed in the kube config
    • kube-user-ctx: The context of the user as detailed in the kube config
    • image-registry-push-host: The URL of the registry to push images to
    • image-registry-pull-host: The URL of the registry the tests should pull images from (can be the same as the push-registry but may be different, e.g. when using OpenShift's exposed internal registry)
    • image-registry-user: A configured user that can push to the push-registry
    • image-registry-token: A token or password allowing the user to push to the push-registry
    • image-registry-insecure: Whether the pull-registry is insecure (true or false)
    • image-namespace: The namespace to push images to on the push-registry
    • has-olm: Whether the cluster has OLM capability
    • catalog-source-namespace: If OLM is supported, the namespace in which to install the catalog-source resource that provides details of the built kamel bundle
  • E2E_KUBE_CONFIG: A base64 encoding of a kube config file, including both a user and an admin context

  • E2E_UPSTREAM_REPOSITORY: The upstream repository from which to check out the e2e test actions (usually this repository)

  • E2E_UPSTREAM_BRANCH: The upstream repository branch from which to check out the e2e test actions (either main or a number branch, depending on requirements)
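
As a rough illustration of how a script might consume these secrets, the sketch below reads one value from a key=value file and decodes an encoded kube config. It assumes one key=value pair per line and a base64-encoded kube config; the function names `get_config_value` and `decode_kube_config` are hypothetical and are not taken from the PR.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not the PR's actual scripts.

# Print the value stored under a key in a key=value config file.
# $1 = config file, $2 = key
get_config_value() {
  # Match lines that start with "key=" and print everything after the
  # first '=', so values that themselves contain '=' are preserved.
  awk -v key="$2" '
    index($0, key "=") == 1 { print substr($0, length(key) + 2); exit }
  ' "$1"
}

# Decode a base64-encoded kube config into a file.
# $1 = base64 text, $2 = output path
# Assumes a base64 tool that accepts -d (GNU coreutils, BusyBox).
decode_kube_config() {
  printf '%s' "$1" | base64 -d > "$2"
}
```

A caller could then write, for example, `get_config_value "$cfg" has-olm` to decide whether the OLM-specific steps apply. Nothing is echoed beyond the requested value, which matters when the file holds registry tokens.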

Release Note

NONE

Contributor

@squakez squakez left a comment

Some concerns about the presence of minishift

.github/actions/kamel-config-cluster-custom/action.yml (outdated, resolved)
.github/actions/kamel-config-cluster-minishift/action.yml (outdated, resolved)
@phantomjinx phantomjinx force-pushed the e2e-refactor branch 7 times, most recently from 9d54d11 to e4c04ed Compare October 27, 2021 15:47
@phantomjinx phantomjinx force-pushed the e2e-refactor branch 5 times, most recently from f3cda66 to dbb0bfd Compare November 4, 2021 10:25
@phantomjinx phantomjinx changed the title Refactors e2e testing to allow for platform delegation E2E development allowing injection of secrets denoting testing on custom clusters Dec 13, 2021
@phantomjinx phantomjinx force-pushed the e2e-refactor branch 2 times, most recently from ffbb183 to 46c2ae8 Compare December 15, 2021 08:12
@phantomjinx
Contributor Author

@astefanutti @christophd
I'm at the stage where I'm confident that the refactor is working correctly. The tests that are still failing seem to be problematic tests (@squakez is already working on the kubernetes metric test, I think). Not sure what's up with YAKS.
Probably needs a final review then merge?

@astefanutti
Member

astefanutti commented Dec 15, 2021

@phantomjinx awesome!

For the metrics e2e test, it's fixed with #2833 and #2836, so you may want to rebase. For YAKS tests, I think we need @christophd eagle eyes :)

@phantomjinx
Contributor Author

> @phantomjinx awesome!
>
> For the metrics e2e test, it's fixed with #2833 and #2836, so you may want to rebase. For YAKS tests, I think we need @christophd eagle eyes :)

Looks like the metrics fix may have sorted it for kind, but it is now failing on OpenShift:

Expected
            <uint>: 12
        to be ==
            <uint64>: 11

@squakez

@astefanutti
Member

@phantomjinx that's hopefully fixed with #2836.

@phantomjinx phantomjinx force-pushed the e2e-refactor branch 3 times, most recently from 53e92b6 to 7d3da2f Compare January 10, 2022 12:53
@phantomjinx
Contributor Author

@astefanutti

We're down to the YAKS test in the Knative suite consistently failing (any ideas appreciated, @christophd).

Otherwise, will try to fix the tests I've labelled as PROBLEMATIC unless you would like to go ahead and merge?

@christophd
Contributor

@phantomjinx I'll take a look at the YAKS test and let you know.

@astefanutti
Member

@phantomjinx thanks. I'd suggest waiting for @christophd's feedback and merging this after the upcoming 1.8.0 release.

@christophd
Contributor

So I have had a look, and there is one test failing: apache-kamelet-catalog.

The test installs Camel K in a new namespace via kamel install and then tries to start an integration that uses the timer-source Kamelet. The IntegrationPlatform does not seem to be available in the namespace, and therefore no Kamelets are available. The test fails because it cannot find the timer-source Kamelet when running the integration.

Here is the log:

YAKS operator installation completed
YAKS installed in namespace default  (global mode)
Creating new test namespace yaks-67cb9fc4-99bf-4c17-a157-73169c06cacf
Unable to find existing YAKS instance - adding new operator instance to temporary namespace by default
Added Knative addon to YAKS operator in namespace 'yaks-67cb9fc4-99bf-4c17-a157-73169c06cacf'
Added CamelK addon to YAKS operator in namespace 'yaks-67cb9fc4-99bf-4c17-a157-73169c06cacf'
Running installation:
OLM is not available in the cluster. Fallback to regular installation.
Camel K installed in namespace yaks-67cb9fc4-99bf-4c17-a157-73169c06cacf 
Integration "logger" created
Progress: integration "logger" in phase Waiting For Platform
Condition "IntegrationPlatformAvailable" is "False" for Integration logger: yaks-67cb9fc4-99bf-4c17-a157-73169c06cacf/camel-k
Integration "logger" in phase "Waiting For Platform"
Cannot reconcile Integration logger: error during trait customization: kamelets  found, timer-source not found in repositories: (Kubernetes[namespace=yaks-67cb9fc4-99bf-4c17-a157-73169c06cacf], Empty[])
Failed to run installation: 
signal: killed
AutoRemove namespace yaks-67cb9fc4-99bf-4c17-a157-73169c06cacf

Why is no IntegrationPlatform created as part of kamel install?
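
One way to guard against the race described above is to poll a readiness check before creating the integration. The sketch below is a generic retry helper, not the fix that was applied in the PR. `wait_for` and `platform_is_ready` are hypothetical names, and the `kubectl` query in the comment assumes the platform is named camel-k (as in the log) and reports `Ready` in `.status.phase`.

```shell
#!/bin/sh
# Sketch only: a generic "retry until ready" helper.
# wait_for TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_for() {
  wf_timeout=$1
  wf_interval=$2
  shift 2
  wf_elapsed=0
  # Run the check; on failure sleep and retry until the time budget is spent.
  until "$@"; do
    if [ "$wf_elapsed" -ge "$wf_timeout" ]; then
      echo "timed out after ${wf_timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$wf_interval"
    wf_elapsed=$((wf_elapsed + wf_interval))
  done
  return 0
}

# Example (assumption, not part of the PR):
#   platform_is_ready() {
#     [ "$(kubectl get integrationplatform camel-k -n "$1" \
#          -o jsonpath='{.status.phase}')" = "Ready" ]
#   }
#   wait_for 300 5 platform_is_ready "$NAMESPACE"
```

Keeping the check as a separate command means the same helper can wait for a Kamelet, a build, or an integration, each with its own timeout.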

phantomjinx and others added 22 commits January 27, 2022 10:44
* Timeout for build defaults to 60 seconds, which is not long enough on OpenShift
* Timeout for integration start is 60 seconds, which is not long enough
  on OpenShift
* If an env var is set then marked tests will be skipped

* Meant as interim option to allow test suites to avoid failure due to
  problematic tests rather than regressions in the coding.
* If set-version has been called, e.g. when building the bundle, the
  image name in operator-deployment.yaml can differ from that
  defined by IMAGE_NAME. This can cause issues when calling commands
  such as `kustomize edit set image $(IMAGE_NAME)=....`: the command
  succeeds, but the image name in the deployment is never updated
  (wrong mapping).

* Use a shell function to find the latest value of IMAGE_NAME and assign
  it each time a Makefile rule in install is executed.
* Discontinues use of json-to-variables and converts the secret to a simple
  key-value list

* Converts all environment variables into inputs and outputs, as these are
  not logged

* Creates bash scripts that are called from run: steps, as these scripts do
  not get logged and set-output and add-mask can be used without leakage
* Flagged to be fixed on OCP4
* Adds specific run variables to each integration test execution to
  allow for filtering tests by a given regexp. Callers would need to
  ensure a "-run" is prefixed to the value of the env var
* Tries to install kamel using the camel-operator service
  account, which on OCP4 makes use of the OLM. This SA does
  not have the permissions for the OLM so the test needs to
  use a different SA - don't want to extend the operator's
  permissions.
* Provides coverage of all failing tests not just the tests up to the
  failed test.
* Allows for easier local unit testing of the functions
* Pre-flight
 * Adds action to execute a pre-flight test to ensure the kamel operator
   is the same version as that built by the workflow

* Cleanup
 * Adds function to clean up any image streams left around by pushing
   images to exposed cluster registries
* Ensures that the ImagePullPolicy is set to Always in the bundle csv to
  avoid target clusters retaining out-of-date cached camel-k images
Make sure to wait for the integration platform to be running before starting the test integration
* Allows for easier debugging and quicker problem isolation.
* Allows for better control and checking of parameters

* Better debugging for scripts to be tested locally
* Useful for when running scheduled night-time test suites and ignoring
  problematic tests

* Action that scans the requisite e2e directory to report those tests still
  marked with the PROBLEMATIC flag.
Explicitly wait for timer-source Kamelet to be ready before starting the test integration
* Wait until the integration is up and running again after the rebuild
  before checking its version

* Slow down the kamel installation requests as errors are
  being returned concerning CRDs not being the latest version
* If skip-problematic is not set then insert a brief log
  entry instead

* Adds ability to skip-problematic and other inputs to
  manual executions of upstream test workflows
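
The skip-problematic behaviour described in these commits can be sketched roughly as follows. This is an illustration of the idea only: the variable name `SKIP_PROBLEMATIC` and the function `should_run_problematic` are hypothetical, and the e2e tests in the PR may implement the check differently.

```shell
#!/bin/sh
# Sketch only: SKIP_PROBLEMATIC and should_run_problematic are
# hypothetical names used for illustration.

# Decide whether a test marked as problematic should run.
# $1 = test name; returns 0 to run the test, 1 to skip it.
should_run_problematic() {
  if [ "${SKIP_PROBLEMATIC:-false}" = "true" ]; then
    echo "SKIP ${1}: marked PROBLEMATIC"
    return 1
  fi
  # When skipping is disabled, leave a brief log entry and run the test.
  echo "RUN ${1}: marked PROBLEMATIC, skip-problematic not set"
  return 0
}
```

A scheduled nightly run could export `SKIP_PROBLEMATIC=true` so that known-flaky tests do not mask regressions, while a manual run leaves it unset to see the full picture.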
@phantomjinx
Contributor Author

@astefanutti
Any thoughts on what else we'd like to see in this before committing?

@astefanutti
Member

@phantomjinx I was about to ask whether you were ready to have this merged :) This is good to go for me 👍🏼.

@phantomjinx
Contributor Author

> @phantomjinx I was about to ask whether you were ready to have this merged :) This is good to go for me 👍🏼.

Tests still fail on OCP4 and I have the problematic tests to work on but the failures seem random like the test runs here upstream. So, would like to merge and then keep working on fixes on main.

@astefanutti
Member

> Tests still fail on OCP4 and I have the problematic tests to work on but the failures seem random like the test runs here upstream. So, would like to merge and then keep working on fixes on main.

Sounds good, let's merge it!

@astefanutti astefanutti merged commit 333f1a3 into apache:main Jan 31, 2022
@astefanutti
Member

Thanks for this monumental work 🚀!

@christophd
Contributor

@phantomjinx totally agree! monumental! many thanks! 👍

Labels
area/continuous integration Related to CI and automated testing

5 participants