Move ClusterResourceQuota to CRD #22425

Merged
merged 15 commits into from Apr 10, 2019

Conversation

mfojtik
Member

@mfojtik mfojtik commented Mar 28, 2019

The goal of this change is to move ClusterResourceQuota from the OpenShift API server to a CRD, so that the admission plugins that run in the Kubernetes API server do not depend on an OpenShift API resource. With that dependency, any downtime of the OpenShift API server delays every request to the Kubernetes API server.

This PR requires openshift/cluster-config-operator#25 to land first; otherwise CRQ is broken.
This will also disable etcd storage for the OpenShift clusterresourcequota so we don't accidentally write to the wrong bucket (and we can actually prove this works).

TODO:

  • Fix TestIntegration/TestClusterQuota
  • Fix TestIntegration/TestEtcd3StoragePath
  • Fix test-cmd (need to create CRD)
  • Fix appliedresourcequota

/cc @sttts
/cc @deads2k
/cc @enj
/cc @derekwaynecarr

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 28, 2019
@mfojtik mfojtik force-pushed the crq-to-crd branch 3 times, most recently from 812ea3e to ae48874 Compare March 28, 2019 12:27
@mfojtik
Member Author

mfojtik commented Mar 28, 2019

/retest

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 2, 2019
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 4, 2019
@mfojtik
Member Author

mfojtik commented Apr 4, 2019

/retest

@mfojtik mfojtik force-pushed the crq-to-crd branch 3 times, most recently from 9a8916b to 6578398 Compare April 4, 2019 08:25
@mfojtik
Member Author

mfojtik commented Apr 5, 2019

/retest

@mfojtik mfojtik force-pushed the crq-to-crd branch 4 times, most recently from 79c1e0b to 71f9a7b Compare April 5, 2019 09:28
@mfojtik
Member Author

mfojtik commented Apr 5, 2019

/retest

@mfojtik
Member Author

mfojtik commented Apr 5, 2019

The unit test failure is a flake.
test-cmd fails with {\"message\":\"rolebindings.rbac.authorization.k8s.io \\\"system:image-pullers\\\" is forbidden: caches not synchronized\"}]} when creating a project request.
The integration job gives me a Python error.

EDIT: test-cmd failed because the CRD was not installed. That caused new project creation to fail, because the cluster resource quota admission plugin's informer had not synced...

@mfojtik
Member Author

mfojtik commented Apr 10, 2019

@deads2k test-cmd passed ! \o/

@mfojtik
Member Author

mfojtik commented Apr 10, 2019

/retest

unit test flake

@mfojtik
Member Author

mfojtik commented Apr 10, 2019

OK, looks like I got one green test-cmd pass, but it is flaking again. Given that the integration test fails because it cannot observe a change in status within 30s, I suspect this is related to either the reconciliation controller being slow to update the status, or something messing up the status in storage.

@mfojtik
Member Author

mfojtik commented Apr 10, 2019

/retest

@mfojtik
Member Author

mfojtik commented Apr 10, 2019

/retest

@deads2k
Contributor

deads2k commented Apr 10, 2019

what an adventure.

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 10, 2019
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, mfojtik, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mfojtik
Member Author

mfojtik commented Apr 10, 2019

/retest

@sjenning
Contributor

https://bugzilla.redhat.com/show_bug.cgi?id=1698538 for TestNewManagerImplStartProbeMode test flake

@mfojtik
Member Author

mfojtik commented Apr 10, 2019

/retest

@deads2k
Contributor

deads2k commented Apr 10, 2019

@mfojtik might be a failure to build

@mfojtik
Member Author

mfojtik commented Apr 10, 2019

@deads2k
E0410 14:57:09.710110 13362 plugin_watcher.go:120] error dial failed at socket /tmp/volume/device_plugin236288530/device-plugin.sock, err: failed to dial socket /tmp/volume/device_plugin236288530/device-plugin.sock, err: context deadline exceeded when handling create event: "/tmp/volume/device_plugin236288530/device-plugin.sock": CREATE

Looks more like an infra issue; it happens a lot in the unit test job.

@mfojtik
Member Author

mfojtik commented Apr 10, 2019

== RUN TestEventCreated
E0410 15:15:36.490666 27410 horizontal.go:212] failed to compute desired number of replicas based on listed metrics for ReplicationController/test-namespace/test-rc: failed to get cpu utilization: no pods returned by selector while calculating replica count

/retest

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

8 participants