UI list artifacts too slow when there is much trivy scan data #18013

Closed
geoger opened this issue Dec 20, 2022 · 19 comments · Fixed by #18610

Comments

@geoger

geoger commented Dec 20, 2022

When there are many images in one repo, for example more than 50,
and each image has Trivy scan data, for example 100-200 (sometimes more) CVE issues,
the UI takes more than 10 seconds to list the images for the repo, even with only 15 items per page.
We deploy Harbor on AWS EKS, using a separate RDS instance as the database.
Expected behavior and actual behavior:
It would be good to check the UI performance with a lot of Trivy data, and to have the UI show the images within 5~6 seconds.

Steps to reproduce the problem:
Refer to the description above.

Versions:
v2.6.1

@geoger
Author

geoger commented Dec 20, 2022

This issue can be narrowed down to the following:
Calling the API to list artifacts with the parameter "with_scan_overview" set to true takes too much time.
With page_size set to 10, it takes about 10-16 seconds.
But with "with_scan_overview" set to false and all other parameters kept the same, it takes only about 1~2 seconds.
The same API call takes 7-8 times longer just by enabling "with_scan_overview".
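
The comparison described above could be scripted roughly as follows. This is only a sketch: the URL shape is taken from the request quoted in this thread, while the function names, the injected `fetch` callable, and the omission of authentication are assumptions made for illustration. It also assumes project and repository names without slashes.

```python
import time
from urllib.parse import quote, urlencode


def list_artifacts_url(base_url, project, repository, with_scan_overview,
                       page_size=10, page=1):
    """Build the list-artifacts URL for one project/repository pair."""
    query = urlencode({
        "with_scan_overview": str(with_scan_overview).lower(),
        "page_size": page_size,
        "page": page,
    })
    return (f"{base_url.rstrip('/')}/api/v2.0/projects/{quote(project, safe='')}"
            f"/repositories/{quote(repository, safe='')}/artifacts?{query}")


def time_call(fetch, url):
    """Return the wall-clock seconds taken by fetch(url)."""
    start = time.perf_counter()
    fetch(url)
    return time.perf_counter() - start


def compare(fetch, base_url, project, repository, page_size=10):
    """Time the same request with the scan overview disabled, then enabled."""
    timings = {}
    for flag in (False, True):
        url = list_artifacts_url(base_url, project, repository, flag, page_size)
        timings[flag] = time_call(fetch, url)
    return timings
```

Any HTTP client can be passed in as `fetch`, for example a small wrapper around `urllib.request.urlopen` that adds the credentials your Harbor instance requires. The returned dictionary maps `False` and `True` to the elapsed seconds for each variant.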

@geoger
Author

geoger commented Dec 20, 2022

Going a step further, I found that even calling the API (/projects/{project_name}/repositories/{repository_name}/artifacts/{reference}) to get the scan overview of a single artifact (with_scan_overview set to true, other parameters set to false) takes about 2-3 seconds, which doesn't make sense. It seems there is a performance issue here.

@chlins chlins self-assigned this Dec 22, 2022
@guillaumelfv

guillaumelfv commented Jan 6, 2023

We also face the same issue running Harbor 2.4.3.

Previously we were running version 2.3.4 and had no issue, so the issue might have been introduced between those versions.

The issue happens with both the UI and the API. With the API, it goes away if with_scan_overview=false is set in a request like the following:
https://harbor/api/v2.0/projects/myproject/repositories/myimage/artifacts?with_tag=false&with_scan_overview=true&with_label=true&page_size=15&page=1

@chlins
Member

chlins commented Jan 6, 2023

We'll investigate this performance issue.

@chlins
Member

chlins commented Jan 9, 2023

@geoger @guillaumelfv Hi, could you share the size of the task table by running the following SQL: select count(*) from task;

@guillaumelfv

Here is the result for us:

registry=# select count(*) from task;
  count
---------
 1688830
(1 row)

@geoger
Author

geoger commented Jan 9, 2023

Here is the result for us:

registry=> select count(*) from task;
  count
---------
 3961435
(1 row)

@geoger
Author

geoger commented Jan 9, 2023

Would it be possible to fix this issue in 2.7.x or 2.6.x?

@guillaumelfv

guillaumelfv commented Jan 16, 2023

It would also be great to have a workaround while awaiting the release, if possible. Not sure whether the root cause has been found yet.

We also need to know whether this is planned to be backported to earlier versions like 2.4.x / 2.5.x / 2.6.x / 2.7.x.

Right now it makes the Harbor UI useless, as no one can see any tags from there; they just get an empty list and a 504. And using the API as a workaround is not practical for users.

@chlins
Member

chlins commented Jan 16, 2023

> It would also be great to have a workaround while awaiting the release, if possible. Not sure whether the root cause has been found yet.
>
> We also need to know whether this is planned to be backported to earlier versions like 2.4.x / 2.5.x / 2.6.x / 2.7.x.
>
> Right now it makes the Harbor UI useless, as no one can see any tags from there; they just get an empty list and a 504. And using the API as a workaround is not practical for users.

2.4.x and 2.5.x are out of support. We are looking for the root cause and working out a solution, thanks.

@github-actions

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

@geoger
Author

geoger commented Mar 30, 2023

Can anyone provide a solution or workaround for this issue?
We cannot wait until 2.9.0 is ready.

@googer-zhang

Will this issue be fixed soon? In 2.6.x?

@googer-zhang

Have you located the root cause? Or can we provide some help to locate the issue?

@chlins
Member

chlins commented Apr 20, 2023

> Can anyone provide a solution or workaround for this issue? We cannot wait until 2.9.0 is ready.

@geoger Hi, there is currently no easy workaround. However, one way to alleviate the problem is to manually clean up useless records in the database to reduce the size of the task table.

@chlins
Member

chlins commented Apr 20, 2023

> Have you located the root cause? Or can we provide some help to locate the issue?

@googer-zhang Hi, we found that the root cause is a query that is slow to filter records from the task table when the table is large. We may need to refactor and redesign part of the scan logic.
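
To illustrate why filtering a large task table can be slow, and why the commits referenced later in this thread add a GIN (generalized inverted) index on task.extra_attrs.report_uuids, here is a minimal pure-Python sketch. It is not Harbor's code or its actual query: the row layout is a simplified, hypothetical stand-in for the task table, and the function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical, simplified task rows; the real table has more columns.
tasks = [
    {"id": 1, "extra_attrs": {"report_uuids": ["uuid-a", "uuid-b"]}},
    {"id": 2, "extra_attrs": {"report_uuids": ["uuid-c"]}},
    {"id": 3, "extra_attrs": {"report_uuids": ["uuid-b"]}},
]


def find_by_scan(tasks, report_uuid):
    """Sequential scan: inspects every row, so cost grows with table size."""
    return [t["id"] for t in tasks
            if report_uuid in t["extra_attrs"].get("report_uuids", [])]


def build_inverted_index(tasks):
    """Map each report UUID to the ids of the tasks that contain it."""
    index = defaultdict(list)
    for t in tasks:
        for uuid in t["extra_attrs"].get("report_uuids", []):
            index[uuid].append(t["id"])
    return index


def find_by_index(index, report_uuid):
    """Indexed lookup: a single dictionary access per query."""
    return index.get(report_uuid, [])
```

Both lookups return the same task ids. The difference is that the scan touches every row on each request, which adds up with millions of task rows, while the inverted lookup goes straight to the matching entries.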

@googer-zhang

That sounds right. We wrote scripts that run Trivy scans on a schedule for recently pushed images (within 6 months). Users need to open the UI to look up Trivy CVEs, but every time it takes too long (2~3 minutes) for the UI to show the list of images. Hope there is a fix in the near future. Thanks.

@chlins
Member

chlins commented Apr 25, 2023

> That sounds right. We wrote scripts that run Trivy scans on a schedule for recently pushed images (within 6 months). Users need to open the UI to look up Trivy CVEs, but every time it takes too long (2~3 minutes) for the UI to show the list of images. Hope there is a fix in the near future. Thanks.

@googer-zhang Yes, it is a bad experience to wait so long in the UI; we will fix the issue ASAP. It would be helpful to know the size of your task table and the elapsed time of the list artifacts API with with_scan_overview=false compared to with_scan_overview=true.

chlins added a commit to chlins/harbor that referenced this issue Apr 27, 2023
1. Change the query for listing tasks of scan which can use the db
   index.
2. Add the gin index for task.extra_attrs.report_uuids

Fixes: goharbor#18013

Signed-off-by: chlins <chenyuzh@vmware.com>
chlins added a commit to chlins/harbor that referenced this issue Apr 28, 2023
1. Change the query for listing tasks of scan which can use the db
   index.
2. Add the gin index for task.extra_attrs.report_uuids

Fixes: goharbor#18013

Signed-off-by: chlins <chenyuzh@vmware.com>
chlins added a commit to chlins/harbor that referenced this issue Apr 29, 2023
1. Change the query for listing tasks of scan which can use the db
   index.
2. Add the gin index for task.extra_attrs.report_uuids

Fixes: goharbor#18013

Signed-off-by: chlins <chenyuzh@vmware.com>
chlins added a commit that referenced this issue Apr 30, 2023
1. Change the query for listing tasks of scan which can use the db
   index.
2. Add the gin index for task.extra_attrs.report_uuids

Fixes: #18013

Signed-off-by: chlins <chenyuzh@vmware.com>
chlins added a commit to chlins/harbor that referenced this issue May 4, 2023
1. Change the query for listing tasks of scan which can use the db
   index.
2. Add the gin index for task.extra_attrs.report_uuids

Fixes: goharbor#18013

Signed-off-by: chlins <chenyuzh@vmware.com>
chlins added a commit to chlins/harbor that referenced this issue May 5, 2023
1. Change the query for listing tasks of scan which can use the db
   index.
2. Add the gin index for task.extra_attrs.report_uuids

Fixes: goharbor#18013

Signed-off-by: chlins <chenyuzh@vmware.com>
chlins added a commit that referenced this issue May 5, 2023
fix: improve the performance of list artifacts

1. Change the query for listing tasks of scan which can use the db
   index.
2. Add the gin index for task.extra_attrs.report_uuids

Fixes: #18013

Signed-off-by: chlins <chenyuzh@vmware.com>
WilfredAlmeida pushed a commit to WilfredAlmeida/harbor that referenced this issue Jul 8, 2023
1. Change the query for listing tasks of scan which can use the db
   index.
2. Add the gin index for task.extra_attrs.report_uuids

Fixes: goharbor#18013

Signed-off-by: chlins <chenyuzh@vmware.com>
Signed-off-by: Wilfred Almeida <60785452+WilfredAlmeida@users.noreply.github.com>