Backport: UI/API Slowness #19443

Closed
cjellick opened this issue Apr 8, 2019 · 2 comments
Labels: kind/bug (Issues that are defects reported by users or that we know have reached a real release)

cjellick commented Apr 8, 2019

Backport #18870

Fixes
NOTE: These fixes should improve performance for all types of clusters whether RKE, non-RKE, imported or not.

A. Filter by namespaces:
What: When a project is selected, norman retrieves its associated namespaces. Rancher then uses those namespaces to query workloads and other project resources.
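
A minimal Python sketch of the idea in A, not Rancher's or norman's actual code: resolve the selected project to its namespaces first, then read resources only from those namespaces. The `Workload` type, the two lookup tables, and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Workload:
    """Hypothetical stand-in for a project resource."""
    name: str
    namespace: str


def namespaces_for_project(project_id: str,
                           namespace_to_project: Dict[str, str]) -> Set[str]:
    """Return the namespaces assigned to the given project."""
    return {ns for ns, proj in namespace_to_project.items() if proj == project_id}


def workloads_for_project(project_id: str,
                          namespace_to_project: Dict[str, str],
                          workloads_by_namespace: Dict[str, List[Workload]]) -> List[Workload]:
    """Collect workloads only from the namespaces that belong to the project."""
    result: List[Workload] = []
    for ns in sorted(namespaces_for_project(project_id, namespace_to_project)):
        # Namespaces owned by other projects are never read, so a small
        # project does not pay for the resources of a large one.
        result.extend(workloads_by_namespace.get(ns, []))
    return result
```

Under this sketch, an empty project touches no workload data at all, which matches the "After fix" expectation that projects with fewer resources load faster.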

B. Properly respond to the -1 pagination flag:
What: When -1 is passed to the API as the limit, the max limit is used. The max limit has been raised to 10k because pagination is applied on the way out of the backend: if the UI has to request X pages, the total takes about X times as long as a single non-paginated response.
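
The arithmetic behind B can be sketched in Python. This is an illustration under stated assumptions, not the actual API code: `effective_limit` and `requests_needed` are hypothetical names, capping explicit limits at the maximum is an assumption, and only the `-1` behavior and the 10k maximum come from the description above.

```python
import math

MAX_LIMIT = 10_000  # the raised maximum mentioned in this issue


def effective_limit(requested: int, max_limit: int = MAX_LIMIT) -> int:
    """Resolve the limit a client asked for into the limit actually applied."""
    if requested == -1:
        # -1 means "no pagination requested": fall back to the maximum.
        return max_limit
    if requested < 1:
        raise ValueError("limit must be -1 or a positive integer")
    # Assumption for this sketch: explicit limits are capped at the maximum.
    return min(requested, max_limit)


def requests_needed(total_items: int, limit: int) -> int:
    """Number of round trips needed to fetch every item at a given page size."""
    return max(1, math.ceil(total_items / limit))
```

For the 2050 workloads used in the reproduction steps, `requests_needed(2050, effective_limit(-1))` is 1, whereas a page size of 1000 (chosen here purely for illustration) would need 3 requests.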

C. Increase size of parent cache:
What: The size of the cache used to map pods to their parents has been increased to 100k. When it was 1k, loading more than 1k resources would cause all entries to be evicted before they could be reused. This made the cache nearly useless for users with many resources.
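
The eviction effect described in C can be reproduced with a small Python sketch. It assumes a least-recently-used eviction policy, which the issue does not state; `LRUCache` and `hits_on_second_pass` are hypothetical names and not Rancher code.

```python
from collections import OrderedDict

_MISSING = object()  # sentinel so that any stored value counts as a hit


class LRUCache:
    """Tiny LRU cache: the least recently used entry is evicted first."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._data: OrderedDict = OrderedDict()

    def get(self, key, default=_MISSING):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the oldest entry


def hits_on_second_pass(capacity: int, n_keys: int) -> int:
    """Look up n_keys pod keys twice in order; count hits on the second pass."""
    cache = LRUCache(capacity)
    hits = 0
    for pass_index in range(2):
        for key in range(n_keys):
            if cache.get(key) is _MISSING:
                cache.put(key, f"parent-of-{key}")
            elif pass_index == 1:
                hits += 1
    return hits
```

With a capacity of 1000 and 1001 keys, every entry is evicted just before it is needed again, so the second pass gets no hits; with a capacity of 100,000 the same second pass is served entirely from the cache.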

How to test: Check the "Steps to reproduce" section for steps corresponding to each fix.

Steps to reproduce (fewest steps possible):
I have been able to reproduce the behavior with the following steps:
NOTE: I mention the type of cluster I used; however, any type of cluster should work for reproducing the issues, as mentioned above.

A:

  1. Use RKE on an 8 GB DO instance.

  2. Import to Rancher

  3. Create new project and namespace.

  4. Create additional project and namespace.

  5. Launch 220 workloads with a large environment variable (https://github.com/rmweir/many-workloads/) in the first project/namespace pair.

  6. Launch a couple of workloads without the large environment variable in the second project/namespace pair.

  7. Create one more empty project with one empty namespace.

  8. Switch between projects.

Before fix: Both projects and project-related API calls will take similar times to load.

After fix: Projects containing fewer resources should load faster.
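
Step 5 links to the script that was actually used. As a rough, independent illustration of what such a load generator can look like, the Python sketch below builds a Kubernetes `List` of Deployment manifests, optionally with one large environment variable each. The function names, the `load-test` name prefix, the `nginx` image, the `LARGE_VAR` variable name, and the variable size are all assumptions, not taken from that repository.

```python
import json


def make_deployment(name: str, namespace: str, env_value: str = "") -> dict:
    """Build a minimal apps/v1 Deployment manifest as a plain dict."""
    container = {"name": "app", "image": "nginx"}  # image is an assumption
    if env_value:
        # Variable name is hypothetical; only its size matters for the test.
        container["env"] = [{"name": "LARGE_VAR", "value": env_value}]
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


def make_workload_list(count: int, namespace: str,
                       env_bytes: int = 0, prefix: str = "load-test") -> dict:
    """Build a v1 List holding `count` Deployments in one namespace."""
    env_value = "x" * env_bytes
    items = [make_deployment(f"{prefix}-{i}", namespace, env_value)
             for i in range(count)]
    return {"apiVersion": "v1", "kind": "List", "items": items}


def render(workload_list: dict) -> str:
    """Serialize to JSON, which `kubectl apply -f -` accepts on stdin."""
    return json.dumps(workload_list)
```

For example, `render(make_workload_list(220, "my-namespace", env_bytes=10_000))` could be piped to `kubectl apply -f -` for case A, and `make_workload_list(2050, "my-namespace")` covers the case without the environment variable.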

B:

  1. Use GKE default instance.

  2. Launch 2050 workloads in a single project and namespace. You can use the script from the previous steps, but do NOT include the environment variable.

  3. Navigate to the project and inspect the page (browser Network tab).

  4. Refresh the page. Observe the number of calls made to project-related resources such as pods and workloads.

Before fix: multiple requests of varying yet similar length
After fix: 1 request for each project related resource

C:

  1. Follow steps from B.

  2. Navigate to the project containing the large number of pods.

Before fix: Project-related resource API calls, particularly for pods, take a long time to load (if hosting Rancher locally, this will likely be >50 seconds).

After fix: API calls complete and the page is usable in a fraction of the time (if Rancher is hosted locally, API calls should take <10 seconds).

@cjellick cjellick added the kind/bug Issues that are defects reported by users or that we know have reached a real release label Apr 8, 2019
@cjellick cjellick added this to the v2.2.2 milestone Apr 8, 2019
cjellick commented Apr 8, 2019

@rmweir I need you to enumerate each enhancement/fix you made and how they can be tested

sowmyav27 commented Apr 9, 2019

Verified in rancher:v2.2.2-rc5.
A.
Steps taken:

  1. Created an Amazon EC2 cluster.
  2. Created a project with 220 workloads, with a large environment variable.
  3. Created a project/namespace with 3 workloads without the large environment variable.
  4. Created a project/namespace without any workloads.
  5. Created a project with 100 namespaces, with one workload each.
    Verified the below:
  6. Projects containing fewer or no resources loaded faster.
  7. Switching between projects did not take much time. It was significantly faster than in rancher:v2.2.1.
  8. Verified in v2.2.1 that projects with few or no resources took longer to load, similar to the project with 220 workloads.

B.
Steps taken:

  1. Created an Amazon EC2 Cluster.
  2. Launched 2050 workloads in a single project/namespace.
  3. Navigated to the project and inspected the page (browser Network tab).
  4. Refreshed the page.
    Verified:
  5. The number of calls made for workloads and pods was 1 each.
  6. Verified in rancher:v2.2.1 that the number of calls made for workloads and pods was 3 each.
  7. Switching between projects did not take much time. It was significantly faster than in rancher:v2.2.1.

C.
Steps taken:

  1. Created an Amazon EC2 Cluster.
  2. Launched 2050 workloads in a single project/namespace.
  3. Went to the Workloads tab.
    Verified:
  4. On loading the Workloads tab, the API call took less than 30 seconds to fetch the workloads and around 25 seconds to load the pods.
  5. In rancher:v2.2.1, it took more than 3 minutes to load the workloads.
  6. Switching between projects did not take much time. It was significantly faster than in rancher:v2.2.1.
