Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use cached discovery client #5908

Merged
merged 3 commits into from Jul 7, 2022
Merged

Use cached discovery client #5908

merged 3 commits into from Jul 7, 2022

Conversation

dharmit
Copy link
Member

@dharmit dharmit commented Jul 1, 2022

Signed-off-by: Dharmit Shah shahdharmit@gmail.com

What does this PR do / why we need it:
This PR changes odo to use a cached discovery client.
This PR adds a cached discovery client along with the existing discovery
client. Functions can use the client appropriate for their use.

A cached discovery client refers its cache under ~/.kube/cache instead
of checking with the cluster every time.

Which issue(s) this PR fixes:

Fixes part of #5872

PR acceptance criteria:

  • Unit test

  • Integration test

  • Documentation

How to test changes:

  • Spin up a k8s/ocp cluster other than one on the local box. For example, use ClusterBot.
  • Execute odo list -v2 and look for the line that reads: all.go:119] queried api discovery in. Also, worth noting is the line that reads: all.go:61] all goroutines have returned in.
  • Note the time it takes and compare the same with odo built from current main branch.

Here are two comparative runs that I did. odo is the binary built using code in this PR, while odo-main is the binary built using main branch.

$ for i in `seq 1 10`; do time odo list 1>/dev/null; done     
odo list > /dev/null  0.60s user 0.15s system 14% cpu 5.357 total
odo list > /dev/null  0.46s user 0.07s system 13% cpu 4.009 total
odo list > /dev/null  0.48s user 0.08s system 13% cpu 4.190 total
odo list > /dev/null  0.46s user 0.07s system 13% cpu 3.815 total
odo list > /dev/null  0.51s user 0.08s system 12% cpu 4.686 total
odo list > /dev/null  0.47s user 0.07s system 15% cpu 3.646 total
odo list > /dev/null  0.46s user 0.08s system 13% cpu 4.120 total
odo list > /dev/null  0.55s user 0.11s system 9% cpu 6.706 total
odo list > /dev/null  0.49s user 0.09s system 14% cpu 3.853 total
odo list > /dev/null  0.49s user 0.07s system 15% cpu 3.558 total


$ for i in `seq 1 10`; do time odo-main list 1>/dev/null; done
odo-main list > /dev/null  0.65s user 0.13s system 13% cpu 5.633 total
odo-main list > /dev/null  0.55s user 0.09s system 5% cpu 11.381 total
odo-main list > /dev/null  0.48s user 0.09s system 11% cpu 4.915 total
odo-main list > /dev/null  0.58s user 0.09s system 10% cpu 6.531 total
odo-main list > /dev/null  0.61s user 0.12s system 10% cpu 6.999 total
odo-main list > /dev/null  0.50s user 0.08s system 11% cpu 4.917 total
odo-main list > /dev/null  0.58s user 0.12s system 13% cpu 5.420 total
odo-main list > /dev/null  0.51s user 0.08s system 12% cpu 4.603 total
odo-main list > /dev/null  0.58s user 0.08s system 16% cpu 3.965 total
odo-main list > /dev/null  0.59s user 0.11s system 8% cpu 8.230 total

@netlify
Copy link

netlify bot commented Jul 1, 2022

Deploy Preview for odo-docusaurus-preview ready!

Name Link
🔨 Latest commit 5edd7a0
🔍 Latest deploy log https://app.netlify.com/sites/odo-docusaurus-preview/deploys/62c56bdc758a2d000875058a
😎 Deploy Preview https://deploy-preview-5908--odo-docusaurus-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@openshift-ci openshift-ci bot requested review from cdrage and feloy July 1, 2022 12:40
@odo-robot
Copy link

odo-robot bot commented Jul 1, 2022

Unit Tests on commit 0e96671 finished successfully.
View logs: TXT HTML

@odo-robot
Copy link

odo-robot bot commented Jul 1, 2022

Windows Tests (OCP) on commit 0e96671 finished successfully.
View logs: TXT HTML

@odo-robot
Copy link

odo-robot bot commented Jul 1, 2022

OpenShift Tests on commit 0e96671 finished successfully.
View logs: TXT HTML

@odo-robot
Copy link

odo-robot bot commented Jul 1, 2022

Kubernetes Tests on commit 0e96671 finished successfully.
View logs: TXT HTML

@odo-robot
Copy link

odo-robot bot commented Jul 1, 2022

Validate Tests on commit 0e96671 finished successfully.
View logs: TXT HTML

@dharmit
Copy link
Member Author

dharmit commented Jul 4, 2022

Test failures are real. We are getting the output The project something does not exist instead of the expected output Unauthorized to access the cluster because of the usage of cached discovery client introduced in this PR.

Logs from failing tests
•••
------------------------------
• Failure [6.168 seconds]
odo project command tests
/go/odo_1/tests/integration/project/cmd_project_test.go:12
  when user is logged out
  /go/odo_1/tests/integration/project/cmd_project_test.go:56
    should show login message when setting project and not login [It]
    /go/odo_1/tests/integration/project/cmd_project_test.go:69

    Expected
        <string>: I0701 12:54:47.677616   28023 util.go:767] Cached response used.
        I0701 12:54:47.685740   28023 util.go:188] path /go/odo_1/tests/integration/project/devfile.yaml doesn't exist, skipping it
        I0701 12:54:47.685815   28023 util.go:188] path /go/odo_1/tests/integration/project/.devfile.yaml doesn't exist, skipping it
        I0701 12:54:47.803845   28023 kclient.go:201] Checking if "projects" resource is supported
         ✗  The project something does not exist
        
    to contain substring
        <string>: Unauthorized to access the cluster

    /go/odo_1/tests/integration/project/cmd_project_test.go:71
------------------------------
• Failure [6.612 seconds]
odo project command tests
/go/odo_1/tests/integration/project/cmd_project_test.go:12
  when user is logged out
  /go/odo_1/tests/integration/project/cmd_project_test.go:56
    should show login message when deleting project and not login [It]
    /go/odo_1/tests/integration/project/cmd_project_test.go:73

    Expected
        <string>: I0701 12:54:47.435941   28006 util.go:754] Response will be cached in /tmp/odohttpcache for 1h0m0s
        I0701 12:54:47.438116   28006 util.go:767] Cached response used.
        I0701 12:54:47.447973   28006 util.go:188] path /go/odo_1/tests/integration/project/devfile.yaml doesn't exist, skipping it
        I0701 12:54:47.448063   28006 util.go:188] path /go/odo_1/tests/integration/project/.devfile.yaml doesn't exist, skipping it
        I0701 12:54:47.596967   28006 kclient.go:201] Checking if "projects" resource is supported
         ✗  The project "something" does not exist. Please check the list of projects using `odo project list`
        
    to contain substring
        <string>: Unauthorized to access the cluster

Currently, this PR replaces DiscoveryClient with CachedDiscoveryClient, which causes the entire odo code to use the replaced client. I'll change the behaviour such that both clients are available in odo code, so that functions can choose the client that they would prefer to use instead of being forced to use cached data each time.

@@ -158,6 +161,12 @@ func NewForConfig(config clientcmd.ClientConfig) (client *Client, err error) {
return nil, err
}

config_flags := genericclioptions.NewConfigFlags(true)
client.cachedDiscoveryClient, err = config_flags.ToDiscoveryClient()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to cache between calls to odo, or only during an odo execution.

If we want to cache only during an odo execution, it seems that we could call Invalidate just after creating the client

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to cache only during an odo execution, it seems that we could call Invalidate just after creating the client

If we were to invalidate the cache each time we create a client, I think the gain we see in odo list performance in this PR would be gone. But I need to check that to be sure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the gain we see in odo list performance in this PR would be gone. But I need to check that to be sure.

12s without invalidating the cache, 16s if we invalidate it right after creation.

$ odo list
 ✓  Listing components from namespace 'myproject' [12s]
 NAME                     PROJECT TYPE  RUNNING IN  MANAGED 
 * devfile-nodejs-deploy  nodejs        Deploy      odo     

$ odo list
 ✓  Listing components from namespace 'myproject' [16s]
 NAME                     PROJECT TYPE  RUNNING IN  MANAGED 
 * devfile-nodejs-deploy  nodejs        Deploy      odo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the client caching? the discovered resources (CRD level), or also the resources themselves (CR level)?

How is the cache consistent: Do we always get the latest versions of what we query (at the price of some back-and-forths to know if the cached one is the last one), of is there some period during which we get old versions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the client caching? the discovered resources (CRD level), or also the resources themselves (CR level)?

You can look at the cache under ~/.kube/cache/discovery. I took the cue from kubectl api-resources cached=true command.

How is the cache consistent: Do we always get the latest versions of what we query (at the price of some back-and-forths to know if the cached one is the last one), of is there some period during which we get old versions?

There's a ttl, beyond which it falls back to using regular discovery client:
https://github.com/kubernetes/client-go/blob/master/discovery/cached/disk/cached_discovery.go#L155-L157

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the solutions if the user doesn't want to wait for 10 minutes

We could document that the user could remove the cache directory from their system using rm -rf if they don't want to wait. However, this would be relevant for users using minikubs/CRC/kinD. Folks in the more realistic environment might not even have the permissions to install an Operator/CRD.

For the time being, I will add this info in odo list docs in this PR.

We discussed this topic when being in Paris, with Serena and the team. odo is dedicated to two different kind of people:

  • the "lead" developer, who will experiment odo, create and update the Devfile, test different operators, bindings, etc. He will probably have admin access to the cluster to make experiments (on its own CRC cluster for example)
  • the "normal" developer, who will have a tailor-made environment (prepared by the first "lead" developer"), and with minimal access to cluster.

The conclusion, as I remember, were that we need mostly to document for the "lead" developer, with the maximum of information for him to understand how to create and update the Devfile efficiently. The doc for the "normal" developer would be very simple, and would be prepared by the "lead" developer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conclusion, as I remember, were that we need mostly to document for the "lead" developer, with the maximum of information for him to understand how to create and update the Devfile efficiently. The doc for the "normal" developer would be very simple, and would be prepared by the "lead" developer.

I wish someone had put this so nicely after that meeting. Anyway, better late than never.

Considering our lead developer scenario here, do you think that covering how to clear the cache manually for the 10-minute scenario would be a good thing to do? Do you think we should cover anything else as well in the docs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so. Don't forget to say it is a cache common to the kubectl command

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A cached discovery client refers its cache under `~/.kube/cache` instead
of checking with the cluster every time.

Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>
Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>
@sonarcloud
Copy link

sonarcloud bot commented Jul 6, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

No Coverage information No Coverage information
0.0% 0.0% Duplication

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. Required by Prow. label Jul 6, 2022
@cdrage
Copy link
Member

cdrage commented Jul 6, 2022

/approve

@openshift-ci
Copy link

openshift-ci bot commented Jul 6, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cdrage

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. Required by Prow. label Jul 6, 2022
@openshift-ci
Copy link

openshift-ci bot commented Jul 6, 2022

@dharmit: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/unit 5edd7a0 link true /test unit

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@feloy
Copy link
Contributor

feloy commented Jul 7, 2022

/override ci/prow/unit-and-validate-test
/override ci/prow/v4.10-integration-e2e

@openshift-ci
Copy link

openshift-ci bot commented Jul 7, 2022

@feloy: Overrode contexts on behalf of feloy: ci/prow/unit-and-validate-test, ci/prow/v4.10-integration-e2e

In response to this:

/override ci/prow/unit-and-validate-test
/override ci/prow/v4.10-integration-e2e

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot merged commit dec65a5 into redhat-developer:main Jul 7, 2022
cdrage pushed a commit to cdrage/odo that referenced this pull request Aug 31, 2022
* Use cached discovery client

A cached discovery client refers its cache under `~/.kube/cache` instead
of checking with the cluster every time.

Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>

* Have both cached and regular discovery clients

Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>

* Docs for odo list's use of cache
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. Required by Prow. lgtm Indicates that a PR is ready to be merged. Required by Prow.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants