THREESCALE-9001 / Added Zync Routes Resync Function #846
Conversation
Hi @den-rgb. Thanks for your PR. I'm waiting for a 3scale member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Verified that a route got recreated after being deleted, and left one small comment.
Force-pushed from e295097 to 3ddbed4
/lgtm
/ok-to-test
Force-pushed from 00878d5 to 0abb861
Force-pushed from 0abb861 to c58eddd
Code Climate has analyzed commit c58eddd and detected 10 issues on this pull request. Here's the issue category breakdown:
View more on Code Climate.
Overall, I have serious doubts about whether this is the right approach. I'm not saying it is wrongly implemented, just that we are taking some decisions that are controversial.
- The operator is acquiring knowledge about the expected routes. This is something that belongs to 3scale internals. Although it is true that the status reconciler reads the basic installation's expected routes, this implementation takes one step further.
- The implementation is part of the 3scale reconciler, blocking 3scale deployment while doing the task. The operator installation pattern is about creating resources and reading state to take actions. It should never be blocked executing a process in a pod or waiting for some resource to be ready.
- The process can be run multiple times in parallel. Note there are multiple event sources that trigger apimanager controller reconciliation. For instance, when any of the deploymentconfigs change status, the reconcile loop is executed.
- Have you considered a different approach? For instance, a one-shot CRD (like the ProxyConfigPromote CRD) that would trigger one (and only one) process to recreate the routes. So, when the user is missing some routes, or restores a deleted namespace, an instance can be created to run the re-sync. I understand that it is a much better experience if the operator automatically detects missing routes and self-heals the installation. Just raising the discussion, as the implementation adds complexity.
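To make the one-shot CRD suggestion concrete, here is a minimal sketch of what such a resource could look like. All names (ZyncRouteResync, the fields, the reconcile helper) are hypothetical illustrations, not part of the actual 3scale-operator API; a real version would embed metav1.TypeMeta/metav1.ObjectMeta and be registered with the scheme.

```go
package main

import "fmt"

// Hypothetical one-shot CRD along the lines of ProxyConfigPromote.
// Names and fields here are illustrative only.
type ZyncRouteResyncSpec struct {
	// Name of the APIManager whose routes should be recreated.
	APIManagerName string `json:"apiManagerName"`
}

type ZyncRouteResyncStatus struct {
	// Completed is set once the resync has run exactly once.
	Completed bool `json:"completed"`
}

type ZyncRouteResync struct {
	Spec   ZyncRouteResyncSpec
	Status ZyncRouteResyncStatus
}

// reconcile runs the resync at most once: after completion the CR is
// inert, so repeated reconcile events do not spawn more processes.
func reconcile(cr *ZyncRouteResync, runResync func() error) error {
	if cr.Status.Completed {
		return nil // already done; further reconciles are no-ops
	}
	if err := runResync(); err != nil {
		return err
	}
	cr.Status.Completed = true
	return nil
}

func main() {
	cr := &ZyncRouteResync{Spec: ZyncRouteResyncSpec{APIManagerName: "apimanager-sample"}}
	runs := 0
	resync := func() error { runs++; return nil }
	_ = reconcile(cr, resync)
	_ = reconcile(cr, resync) // second reconcile is a no-op
	fmt.Println(runs) // prints 1: the resync body ran exactly once
}
```

The key property is that the user opts in by creating an instance, and the status field guarantees the exec runs once rather than on every reconcile event.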
@@ -160,6 +160,32 @@ func (r *ZyncReconciler) Reconcile() (reconcile.Result, error) {
	return reconcile.Result{}, err
}

if len(r.apiManager.Status.Deployments.Starting) == 0 && len(r.apiManager.Status.Deployments.Stopped) == 0 && len(r.apiManager.Status.Deployments.Ready) > 0 {
This check could be implemented in apimanager_types.go in a method that can be reused if needed.
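A sketch of the reusable helper the reviewer suggests, assuming simplified stand-ins for the status types (the real definitions live in apimanager_types.go and may differ in detail):

```go
package main

import "fmt"

// Simplified stand-ins for the APIManager status types defined in
// apimanager_types.go; field names mirror the condition in the diff.
type APIManagerDeploymentsStatus struct {
	Starting []string
	Stopped  []string
	Ready    []string
}

type APIManagerStatus struct {
	Deployments APIManagerDeploymentsStatus
}

// AllDeploymentsReady captures the inline check from the zync
// reconciler in one reusable method: nothing starting, nothing
// stopped, and at least one deployment ready.
func (s *APIManagerStatus) AllDeploymentsReady() bool {
	return len(s.Deployments.Starting) == 0 &&
		len(s.Deployments.Stopped) == 0 &&
		len(s.Deployments.Ready) > 0
}

func main() {
	st := &APIManagerStatus{Deployments: APIManagerDeploymentsStatus{Ready: []string{"system-sidekiq"}}}
	fmt.Println(st.AllDeploymentsReady()) // prints true
}
```

The reconciler's condition then collapses to `if r.apiManager.Status.AllDeploymentsReady() { … }`.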
	return reconcile.Result{}, err
}
if exist {
	return reconcile.Result{}, nil
why return here? What if we want to reconcile more resources related to zync?
// we can force a resync of routes. see below for more details on why this is required:
// https://access.redhat.com/documentation/en-us/red_hat_3scale_api_management/2.7/html/operating_3scale/backup-restore#creating_equivalent_zync_routes
// This scenario will manifest during a backup and restore and also if the product ns was accidentally deleted.
podName, err := r.findSystemSidekiqPod(r.apiManager)
My main concern with this PR is this implementation. How do you know whether the pod already has a process running the same command? For many reasons this zync reconciliation can run multiple times, and each time you are spawning a process in the sidekiq pod.
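One possible mitigation for the overlapping-exec concern is an in-flight guard so concurrent reconcile loops skip the resync instead of spawning another process. This is a sketch of the idea, not code from the PR (and it only guards within one operator process; it would not survive an operator restart):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// resyncGuard ensures at most one zync resync exec is in flight at a
// time; overlapping reconciles skip rather than double-spawn.
type resyncGuard struct {
	running atomic.Bool
}

// tryRun executes fn only if no other resync is currently running.
// It reports whether fn was actually invoked.
func (g *resyncGuard) tryRun(fn func()) bool {
	if !g.running.CompareAndSwap(false, true) {
		return false // another reconcile already started the resync
	}
	defer g.running.Store(false)
	fn()
	return true
}

func main() {
	var g resyncGuard
	var started int32
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.tryRun(func() { atomic.AddInt32(&started, 1) })
		}()
	}
	wg.Wait()
	fmt.Println(started >= 1) // prints true; overlapping calls were skipped
}
```

The reviewer's one-shot CRD suggestion is the more robust fix, since its completion state is persisted in the cluster rather than in operator memory.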
listOps := []client.ListOption{
	client.InNamespace(namespace),
	client.MatchingLabels(map[string]string{"deploymentConfig": "system-sidekiq"}),
these labels have been defined elsewhere in this code base. Try to reuse them.
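A small sketch of the reuse the reviewer asks for: declare the selector once and reference it everywhere. The constant and function names here are illustrative; the existing definitions elsewhere in the code base may be named differently.

```go
package main

import "fmt"

// Illustrative shared definitions for the sidekiq pod selector; the
// real code base already defines these labels somewhere, so the inline
// map in the diff should reference that single source of truth.
const (
	deploymentConfigLabel = "deploymentConfig"
	systemSidekiqName     = "system-sidekiq"
)

// SystemSidekiqLabels returns the selector used to find sidekiq pods.
func SystemSidekiqLabels() map[string]string {
	return map[string]string{deploymentConfigLabel: systemSidekiqName}
}

func main() {
	fmt.Println(SystemSidekiqLabels()["deploymentConfig"]) // prints system-sidekiq
}
```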
// Wait until system-sidekiq deployments are ready
for !helper.ArrayContains(apimanager.Status.Deployments.Ready, "system-sidekiq") {
	r.Logger().Info("system-sidekiq deployments not ready. Waiting", "APIManager", apimanager.Name)
	time.Sleep(5 * time.Second)
Bad smell. The operator reconciliation pattern is about returning fast. If you are waiting for something, return with requeue enabled.
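The requeue pattern the reviewer describes looks roughly like this. The Result type below is a minimal local stand-in for sigs.k8s.io/controller-runtime's reconcile.Result, just to keep the sketch self-contained:

```go
package main

import (
	"fmt"
	"time"
)

// Minimal stand-in for controller-runtime's reconcile.Result,
// illustrating the requeue fields it exposes.
type Result struct {
	Requeue      bool
	RequeueAfter time.Duration
}

// reconcileSidekiq returns quickly instead of sleeping in a loop: if
// system-sidekiq is not ready yet, it asks the controller runtime to
// call the reconciler again after a delay.
func reconcileSidekiq(readyDeployments []string) (Result, error) {
	for _, d := range readyDeployments {
		if d == "system-sidekiq" {
			return Result{}, nil // ready; continue with the rest of reconcile
		}
	}
	// Not ready: no time.Sleep, no blocking; just requeue.
	return Result{Requeue: true, RequeueAfter: 5 * time.Second}, nil
}

func main() {
	res, _ := reconcileSidekiq(nil)
	fmt.Println(res.Requeue, res.RequeueAfter) // prints true 5s
}
```

This keeps each reconcile invocation short, so other events (and other APIManagers) are not starved while sidekiq comes up.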
func (r *BaseAPIManagerLogicReconciler) waitForSystemSidekiq(apimanager *appsv1alpha1.APIManager) error {

// Wait until system-sidekiq deployments are ready
for !helper.ArrayContains(apimanager.Status.Deployments.Ready, "system-sidekiq") {
What if it never gets ready? This is blocking all the reconciliation of 3scale.
return false, fmt.Errorf("base routes not found")
}

func (r *BaseAPIManagerLogicReconciler) getAccountUrls() ([]string, []string, error){
There is a library where we have this implemented: https://github.com/3scale/3scale-porta-go-client
}

func (p PodExecutor) ExecuteRemoteContainerCommand(ns string, podName string, container string, command []string) (string, string, error) {
	kubeClient, restConfig, err := getClient()
About this client: is it not available from the controller manager?
buf := &bytes.Buffer{}
errBuf := &bytes.Buffer{}

err = exec.Stream(remotecommand.StreamOptions{
This is blocking, right? Meaning that the controller's runtime will be waiting for the process in the pod to be finished.
Closing this as won't do.
THREESCALE-9001
WHAT
There are instances where "bundle exec rake zync:resync:domains" needs to run on the system-sidekiq pod to recreate routes.
This was previously done in RHOAM, but now it will be managed by the 3scale operator.
TEST
Checkout this branch
Run the following commands:
make docker-build-only IMG=docker.io/<user>/3scale-operator:<tag>
make operator-image-push IMG=docker.io/<user>/3scale-operator:<tag>
make bundle-custom-build IMG=docker.io/<user>/3scale-operator:<tag> BUNDLE_IMG=docker.io/<user>/3scale-operator-bundles:<tag>
make bundle-image-push BUNDLE_IMG=docker.io/<user>/3scale-operator-bundles:<tag>
make bundle-run BUNDLE_IMG=docker.io/<user>/3scale-operator-bundles:<tag>
Once that runs and the operator is installed, create the apimanager:
oc new-project 3scale-test
oc get apimanager apimanager-sample -oyaml | yq '.status'
Navigate to Networking -> Routes in the 3scale-test namespace once all deployments are in the Ready status. If a route gets deleted it should be restored.
Testing multitenancy
If the system-provider or system-developer routes associated with the new tenant are deleted, they should also be recreated.
TODO
Implement unit testing for the additional logic in this PR.