OLM CI Tracking #2401

Open
10 of 43 tasks
timflannagan opened this issue Oct 6, 2021 · 7 comments
Labels
kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test.

Comments

@timflannagan
Contributor

timflannagan commented Oct 6, 2021

CI Improvements

Controller Improvements

Flakes

Misc/Needs Home/Triage/etc.

@timflannagan timflannagan added the kind/bug Categorizes issue or PR as related to a bug. label Oct 6, 2021
@timflannagan
Contributor Author

Note: failures in the "Garbage collection for dependent resources when a bundle with configmap and secret objects is installed when the CSV is deleted OLM ..." test blocks are increasingly reproducible. While investigating the "should have removed the old configmap and put the new configmap in place" test, it appears that the catalog operator hot-loops when processing a Subscription that previously failed resolution, and that there is contention because it always tries to remove that status condition by firing off blind Update calls.
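
One possible way to reduce that kind of contention, sketched here under stated assumptions rather than taken from the OLM code: skip the write entirely when the condition is already absent, and on a conflict re-read the object before retrying. The `store`, `subscription`, and `removeCondition` names are hypothetical stand-ins (not OLM or client-go types), the in-memory store only mimics resourceVersion-style optimistic concurrency, and the condition strings are illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errConflict mimics the error returned when an update carries a stale
// resourceVersion.
var errConflict = errors.New("conflict: object was modified")

// subscription is a hypothetical, heavily reduced stand-in for a Subscription.
type subscription struct {
	resourceVersion int
	conditions      []string
}

// store is a hypothetical in-memory object store with optimistic concurrency.
type store struct {
	mu     sync.Mutex
	obj    subscription
	writes int // number of successful updates, for illustration
}

// get returns a copy of the stored object.
func (s *store) get() subscription {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := s.obj
	out.conditions = append([]string(nil), s.obj.conditions...)
	return out
}

// update stores in only if its resourceVersion matches the stored one.
func (s *store) update(in subscription) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if in.resourceVersion != s.obj.resourceVersion {
		return errConflict
	}
	in.resourceVersion++
	in.conditions = append([]string(nil), in.conditions...)
	s.obj = in
	s.writes++
	return nil
}

// removeCondition drops cond from the stored object. It issues no write when
// the condition is already absent, and re-reads before retrying on conflict.
func removeCondition(s *store, cond string, maxAttempts int) error {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		cur := s.get()
		var kept []string
		changed := false
		for _, c := range cur.conditions {
			if c == cond {
				changed = true
				continue
			}
			kept = append(kept, c)
		}
		if !changed {
			return nil // nothing to remove: avoid a blind update
		}
		cur.conditions = kept
		err := s.update(cur)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return err
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxAttempts, errConflict)
}

func main() {
	s := &store{obj: subscription{conditions: []string{"ResolutionFailed", "CatalogSourcesUnhealthy"}}}
	fmt.Println(removeCondition(s, "ResolutionFailed", 3), s.writes)
	fmt.Println(removeCondition(s, "ResolutionFailed", 3), s.writes)
}
```

In this sketch the second call finds nothing to remove and performs no write, so the write counter stays at one. Whether the same guard fits the catalog operator's sync loop is an open question for whoever picks this up.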

@timflannagan
Contributor Author

Misc: we need an automatic rebasing mechanism for open PRs whenever a new PR is merged into master.

@timflannagan
Contributor Author

Misc: the test provisioner should be updated to also attempt to gather testing artifacts before deleting the cluster.

@timflannagan
Contributor Author

Misc: seeing quite a few connection-refused errors in the catalog-operator logs when firing off ListBundles calls:

E1006 17:13:09.466730       1 queueinformer_operator.go:290] sync "operators" failed: [error using catalog test-catalog (in namespace operators): failed to list bundles: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 10.96.133.46:50051: connect: connection refused", error using catalog operatorhubio-catalog (in namespace operator-lifecycle-manager): failed to list bundles: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 10.96.33.109:50051: connect: connection refused"]

@njhale njhale changed the title OLM flake tracking OLM CI Tracking Oct 12, 2021
@timflannagan
Contributor Author

#2420 - another quality of life issue when running e2e locally.

@exdx
Member

exdx commented Oct 14, 2021

The TestConnectionEvents series of unit tests occasionally panics when it hits the 10-minute test timeout. This is seen in https://github.com/operator-framework/operator-lifecycle-manager/pull/2425/checks?check_run_id=3899261291

@timflannagan timflannagan added kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. and removed kind/bug Categorizes issue or PR as related to a bug. labels Nov 9, 2021
@akihikokuroda
Member

As of today (01/21/2022), I see the following e2e failures.

  • should have copied CSVs in all other Namespaces
  • should create a Subscription for the latest entry providing the required GVK 
  • delete internal registry pod triggers recreation
  • can satisfy an associated ClusterServiceVersion's ownership requirement
  • should surface components in its status

In addition to these, I see some failures caused by a timeout while waiting for the InstallPlan to be created. Their test logs contain the following:

waiting for catalog pod scoped-catsrc-hzt42 to be available (for sync) - TRANSIENT_FAILURE
catalog scoped-catsrc-hzt42 pod with address scoped-catsrc-hzt42.scoped-ns-cfw9r.svc:50051
03:47:22.1316:  (): nil
waiting for scoped-sub-wz8bw to have installplan ref
03:47:23.131:  (): nil
waiting for scoped-sub-wz8bw to have installplan ref
03:47:24.1319:  (): nil
waiting for scoped-sub-wz8bw to have installplan ref
03:47:25.1315:  (): nil
waiting for scoped-sub-wz8bw to have installplan ref

.........

waiting for scoped-sub-wz8bw to have installplan ref
03:52:21.1343: never got correct status: v1alpha1.SubscriptionStatus{CurrentCSV:"", InstalledCSV:"", Install:(*v1alpha1.InstallPlanReference)(nil), State:"", Reason:"", InstallPlanGeneration:0, InstallPlanRef:(*v1.ObjectReference)(nil), CatalogHealth:

I'll open issues for them later.

3 participants