This repository has been archived by the owner on Dec 7, 2022. It is now read-only.

find_repo_content_units: skip db query if expected result is empty (#4008)

* find_repo_content_units: skip db query if expected result is empty

For performance reasons, this change does a server-side count
of the units expected in the result before the actual db query
happens. If the expected count is zero, skip the db query and
continue to the next chunk of unit ids.

The performance problem is noticeable in RHSM-pulp for big repos
with 200k or more associated units. During the associate action,
most of the queries in the affected code return empty results,
so we can skip querying the db this way.

Without this fix, the code does e.g. 200 unnecessary queries for a
repo with 200k units, which takes a long time to finish and leads
to a major performance regression.
rbikar committed Dec 4, 2020
1 parent 09fa329 commit 033b2e6
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions server/pulp/server/controllers/repository.py
@@ -235,6 +235,12 @@ def find_repo_content_units(
if unit_fields:
qs = qs.only(*unit_fields)

# for performance reasons, do a server-side count of the units that will be returned in
# the result before we do the actual query to the db
# if the expected count of units in the result is zero, continue to the next chunk of unit_ids
if qs.count() == 0:
continue

for unit in qs:
if skip and skip_count < skip:
skip_count += 1
@@ -1830,3 +1836,4 @@ def validate_file(file_path, checksum_algorithm, checksum):
else:
if not os.path.isfile(file_path):
raise IOError(_("The path '{path}' does not exist").format(path=file_path))
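
The early exit described in the commit message can be illustrated with a minimal, self-contained sketch. It does not use Pulp or a real database: `FakeQuerySet`, `chunked`, `find_units`, and the `stats` counter are hypothetical names introduced only for illustration, and the chunk size of 1000 is an assumption inferred from the "200 queries for 200k units" figure. The sketch only shows the control flow: count first, and iterate the result set only when the count is non-zero.

```python
from itertools import islice

# Hypothetical counter of "expensive" full fetches, for illustration only.
stats = {"fetches": 0}


class FakeQuerySet:
    """Hypothetical stand-in for a lazy query set; not the real ODM API."""

    def __init__(self, store, unit_ids):
        self._store = store
        self._unit_ids = unit_ids

    def count(self):
        # Models a cheap server-side count: no documents are transferred.
        return sum(1 for uid in self._unit_ids if uid in self._store)

    def __iter__(self):
        # Models the expensive path: fetch and yield the matching documents.
        stats["fetches"] += 1
        return (self._store[uid] for uid in self._unit_ids if uid in self._store)


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def find_units(store, unit_ids, chunk_size=1000):
    """Yield units chunk by chunk, skipping chunks whose count is zero."""
    for chunk in chunked(unit_ids, chunk_size):
        qs = FakeQuerySet(store, chunk)
        if qs.count() == 0:
            continue  # nothing to fetch for this chunk of unit ids
        for unit in qs:
            yield unit


if __name__ == "__main__":
    store = {5: "unit-5", 2500: "unit-2500"}
    print(list(find_units(store, range(5000))))
    print(stats["fetches"])
```

The trade-off is that every non-empty chunk now costs one extra count call, so the check pays off only when most chunks are empty, which is the situation the commit message describes for the associate action.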
