
THREESCALE-9001 / Added Zync Routes Resync Function #846

Closed
wants to merge 1 commit

Conversation

@den-rgb commented Jul 26, 2023

THREESCALE-9001

WHAT

There are instances where "bundle exec rake zync:resync:domains" needs to be run on the system-sidekiq pod to recreate routes.
This was previously done in RHOAM, but it will now be managed by the 3scale operator.

TEST

  • Check out this branch

  • Run the following commands:

    • make docker-build-only IMG=docker.io/<user>/3scale-operator:<tag>
    • make operator-image-push IMG=docker.io/<user>/3scale-operator:<tag>
    • make bundle-custom-build IMG=docker.io/<user>/3scale-operator:<tag> BUNDLE_IMG=docker.io/<user>/3scale-operator-bundles:<tag>
    • make bundle-image-push BUNDLE_IMG=docker.io/<user>/3scale-operator-bundles:<tag>
    • make bundle-run BUNDLE_IMG=docker.io/<user>/3scale-operator-bundles:<tag>
  • Once that completes and the operator is installed, create the APIManager:

    • oc new-project 3scale-test
    • Create the test secret:
oc apply -f - << EOF
kind: Secret
apiVersion: v1
metadata:
  name: s3-credentials
  namespace: 3scale-test
data:
  AWS_ACCESS_KEY_ID: UkVQTEFDRV9NRQ==
  AWS_BUCKET: UkVQTEFDRV9NRQ==
  AWS_REGION: UkVQTEFDRV9NRQ==
  AWS_SECRET_ACCESS_KEY: UkVQTEFDRV9NRQ==
type: Opaque
EOF
    • Create the APIManager (make sure to change the wildcard domain to match your cluster; you can find it in the cluster URL):
oc create -f - << EOF
apiVersion: apps.3scale.net/v1alpha1
kind: APIManager
metadata:
  name: apimanager-sample
  namespace: 3scale-test
spec:
  system:
    fileStorage:
      simpleStorageService:
        configurationSecretRef:
          name: s3-credentials
  wildcardDomain: apps.<cluster>.devshift.org

EOF
    • Check the progress of the APIManager:
    • oc get apimanager apimanager-sample -oyaml | yq '.status'
  • Navigate to Networking -> Routes in the 3scale-test namespace once all deployments are in the Ready status.

  • If a route gets deleted, it should be restored.

Testing multitenancy

  • Go to Secrets -> system-seed and copy the master password.
  • Open the master route and log in with username master and the password you copied.
  • Go to Accounts -> Create account.
  • In Networking -> Routes there should be new routes linked to the account you just created.
  • Delete one of the system-provider or system-developer routes associated with the new tenant; it should also be recreated.

TODO

Implement unit tests for the additional logic in this PR.

@den-rgb den-rgb requested a review from a team as a code owner July 26, 2023 14:10
openshift-ci bot commented Jul 26, 2023

Hi @den-rgb. Thanks for your PR.

I'm waiting for a 3scale member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Patryk-Stefanski (Contributor) left a comment


Verified that a route got recreated after being deleted, and left one small comment.

pkg/helper/podHelper.go (outdated, resolved)
@Patryk-Stefanski (Contributor)

/lgtm

@thomasmaas (Member)

/ok-to-test

@den-rgb force-pushed the THREESCALE-9001 branch 6 times, most recently from 00878d5 to 0abb861 on August 4, 2023 at 15:57
codeclimate bot commented Aug 10, 2023

Code Climate has analyzed commit c58eddd and detected 10 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 2
Style 8

View more on Code Climate.

@eguzki (Member) left a comment


Overall, I have serious doubts about whether this is the right approach. I am not saying it is wrongly implemented, just that we are taking some decisions that are controversial.

  1. The operator is acquiring knowledge about the expected routes. This is something that belongs to 3scale internals. Although it is true that the status reconciler reads the basic installation's expected routes, this implementation takes it one step further.
  2. The implementation is part of the 3scale reconciler, blocking the 3scale deployment while doing the task. The operator installation pattern is about creating resources and reading state to take actions. It should never be blocked executing a process in a pod or waiting for some resource to be ready.
  3. The process can be run multiple times in parallel. Note that there are multiple event sources that trigger apimanager controller reconciliation. For instance, when any of the deploymentconfigs changes status, the reconcile loop is executed.
  4. Have you considered a different approach? For instance, a one-shot CRD (like the ProxyConfigPromote CRD) that would trigger one (and only one) process to recreate the routes. So, when the user is missing some routes, or restores a deleted namespace, an instance can be created to run the re-sync (a rough sketch follows below). I understand that it is a much better experience if the operator automatically detects missing routes and self-heals the installation. Just raising the discussion, as the implementation adds complexity.
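For illustration only, such a one-shot CRD could look roughly like this (a minimal sketch; the type and field names are hypothetical and not part of this PR or the existing API, and the usual metav1 import is assumed):

// Hypothetical one-shot CRD, sketched for discussion only. A dedicated
// controller would run "bundle exec rake zync:resync:domains" exactly once
// per resource instance and record the outcome in Status, similar in spirit
// to how ProxyConfigPromote is handled.
type ZyncRouteResync struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec   ZyncRouteResyncSpec   `json:"spec,omitempty"`
    Status ZyncRouteResyncStatus `json:"status,omitempty"`
}

type ZyncRouteResyncSpec struct {
    // APIManagerName identifies the APIManager whose routes should be re-synced.
    APIManagerName string `json:"apiManagerName"`
}

type ZyncRouteResyncStatus struct {
    // Completed is set to true once the resync command has been executed.
    Completed bool `json:"completed,omitempty"`
}

The user (or a backup/restore procedure) would create an instance when routes are missing, and its controller would exec the rake task once, outside the APIManager reconcile path.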

@@ -160,6 +160,32 @@ func (r *ZyncReconciler) Reconcile() (reconcile.Result, error) {
return reconcile.Result{}, err
}

if len(r.apiManager.Status.Deployments.Starting) == 0 && len(r.apiManager.Status.Deployments.Stopped) == 0 && len(r.apiManager.Status.Deployments.Ready) > 0 {
Review comment (Member):

This check could be implemented in apimanager_types.go in a method that can be reused if needed.
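For illustration, such a helper might look like this (a minimal sketch; the method name is only a suggestion, and the receiver mirrors the Status fields used in the snippet above):

// Sketch of a reusable helper in apimanager_types.go (the name is a
// suggestion, not part of this PR). It captures the "all deployments ready"
// condition checked by the Zync reconciler above.
func (s *APIManagerStatus) AllDeploymentsReady() bool {
    return len(s.Deployments.Starting) == 0 &&
        len(s.Deployments.Stopped) == 0 &&
        len(s.Deployments.Ready) > 0
}

The reconciler condition would then read: if r.apiManager.Status.AllDeploymentsReady() { ... }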

return reconcile.Result{}, err
}
if exist {
return reconcile.Result{}, nil
Review comment (Member):

Why return here? What if we want to reconcile more resources related to zync?

// we can force a resync of routes. see below for more details on why this is required:
// https://access.redhat.com/documentation/en-us/red_hat_3scale_api_management/2.7/html/operating_3scale/backup-restore#creating_equivalent_zync_routes
// This scenario will manifest during a backup and restore and also if the product ns was accidentally deleted.
podName, err := r.findSystemSidekiqPod(r.apiManager)
Review comment (Member):

My main concern with this PR is this implementation. How do you know whether the pod already has a process running the same command? For many reasons this zync reconciliation can run multiple times, and each time you are launching a process in the sidekiq pod.
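One possible guard, shown here only as an illustration (the annotation key, the helper names and the Client() accessor are assumptions, not part of this PR; the standard context import is assumed), would be to record on the APIManager that the resync has already been triggered and skip the exec on later reconciles:

// Hypothetical guard against launching the rake task on every reconcile.
const zyncRoutesResyncedAnnotation = "apps.3scale.net/zync-routes-resynced"

func (r *ZyncReconciler) routesResyncAlreadyTriggered() bool {
    return r.apiManager.Annotations[zyncRoutesResyncedAnnotation] == "true"
}

func (r *ZyncReconciler) markRoutesResyncTriggered(ctx context.Context) error {
    if r.apiManager.Annotations == nil {
        r.apiManager.Annotations = map[string]string{}
    }
    r.apiManager.Annotations[zyncRoutesResyncedAnnotation] = "true"
    // Persist the marker so subsequent reconciles skip the exec.
    return r.Client().Update(ctx, r.apiManager)
}

Even with such a marker, two reconciles racing before the update lands could still both exec, which is another argument for a one-shot CRD handled by its own controller.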


listOps := []client.ListOption{
client.InNamespace(namespace),
client.MatchingLabels(map[string]string{"deploymentConfig": "system-sidekiq"}),
Review comment (Member):

These labels have already been defined elsewhere in this code base. Try to reuse them.

// Wait until system-sidekiq deployments are ready
for !helper.ArrayContains(apimanager.Status.Deployments.Ready, "system-sidekiq") {
r.Logger().Info("system-sidekiq deployments not ready. Waiting", "APIManager", apimanager.Name)
time.Sleep(5 * time.Second)
Review comment (Member):

Bad smell. The operator reconciliation pattern is about returning fast. If you are waiting for something, return with requeue enabled.
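For reference, the non-blocking version would return a requeue instead of sleeping in a loop (a sketch against the snippet above; the surrounding helper would have to propagate the requeue result rather than return only an error):

// Sketch: return quickly and let controller-runtime call Reconcile again later.
if !helper.ArrayContains(apimanager.Status.Deployments.Ready, "system-sidekiq") {
    r.Logger().Info("system-sidekiq deployment not ready. Requeueing", "APIManager", apimanager.Name)
    return reconcile.Result{RequeueAfter: 5 * time.Second}, nil
}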

func (r *BaseAPIManagerLogicReconciler) waitForSystemSidekiq(apimanager *appsv1alpha1.APIManager) error {

// Wait until system-sidekiq deployments are ready
for !helper.ArrayContains(apimanager.Status.Deployments.Ready, "system-sidekiq") {
Review comment (Member):

What if it never gets ready? This blocks all of the 3scale reconciliation.

return false, fmt.Errorf("base routes not found")
}

func (r *BaseAPIManagerLogicReconciler) getAccountUrls() ([]string, []string, error){
Review comment (Member):

There is a library where we have this implemented: https://github.com/3scale/3scale-porta-go-client

}

func (p PodExecutor) ExecuteRemoteContainerCommand(ns string, podName string, container string, command []string) (string, string, error) {
kubeClient, restConfig, err := getClient()
Review comment (Member):

About this client: is it not available from the controller manager?
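For comparison, the rest config the controller manager is already built from could be reused for the exec clientset, instead of constructing a fresh client inside the helper (a sketch; how the clientset would be threaded into PodExecutor is an assumption, not part of this PR):

// Sketch: derive the pods/exec clientset from the same config the manager uses.
package main

import (
    "k8s.io/client-go/kubernetes"
    ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
    cfg := ctrl.GetConfigOrDie() // same rest.Config the controller manager is created with

    clientset, err := kubernetes.NewForConfig(cfg)
    if err != nil {
        panic(err)
    }
    _ = clientset // would be injected into PodExecutor instead of calling getClient() each time
}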

buf := &bytes.Buffer{}
errBuf := &bytes.Buffer{}

err = exec.Stream(remotecommand.StreamOptions{
Review comment (Member):

This is blocking, right? Meaning that the controller runtime will be waiting for the process in the pod to finish.
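Stream does block until the remote process exits. One way to at least bound it (a sketch; StreamWithContext requires a reasonably recent client-go, roughly v0.26 or later) is to run the exec under a context with a timeout:

// Sketch: bound the remote command so a hung rake task cannot stall the caller indefinitely.
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
defer cancel()

err = exec.StreamWithContext(ctx, remotecommand.StreamOptions{
    Stdout: buf,
    Stderr: errBuf,
})

Even with a timeout, the reconcile goroutine is still tied up for the duration, which again points to moving the exec out of the main reconcile path.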

@austincunningham (Contributor)

closing this as won't do
