[Artifacts] Improve list artifacts querying #5655

Merged

@TomerShor TomerShor commented May 29, 2024

Previously, when listing artifacts we did several intensive operations against the DB:

  1. When a specific tag was requested, we ran an SQL query on the tags, then filtered the artifacts by the UIDs from the returned tags.
  2. To add tags to the artifact objects, we:
    1. Queried the DB for all artifacts matching the given filters.
    2. Iterated over the returned artifacts.
    3. For each artifact, queried the DB for its tags and added them to the struct.

All of the above added a lot of overhead: the tag lookup alone cost one extra query per returned artifact.
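
To make the cost of the old flow concrete, here is a minimal sketch of the per-artifact tag lookup. It uses the standard-library `sqlite3` module and a made-up two-table schema (`artifacts`, `artifact_tags`); the table names, columns, and the `run` / `list_artifacts_old` helpers are assumptions for illustration, not the actual code in `server/api/db/sqldb/db.py`.

```python
import sqlite3

# Hypothetical schema for illustration only; not the real MLRun tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artifacts (uid TEXT PRIMARY KEY, key TEXT);
    CREATE TABLE artifact_tags (name TEXT, artifact_uid TEXT);
    INSERT INTO artifacts VALUES ('u1', 'model'), ('u2', 'dataset'), ('u3', 'plot');
    INSERT INTO artifact_tags VALUES ('latest', 'u1'), ('v1', 'u1'), ('latest', 'u2');
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, so the number of round trips is visible."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def list_artifacts_old():
    # One query for the artifacts themselves...
    artifacts = [
        {"uid": uid, "key": key, "tags": []}
        for uid, key in run("SELECT uid, key FROM artifacts ORDER BY uid")
    ]
    # ...then one extra query per artifact to fetch its tags.
    for artifact in artifacts:
        rows = run(
            "SELECT name FROM artifact_tags WHERE artifact_uid = ? ORDER BY name",
            (artifact["uid"],),
        )
        artifact["tags"] = [name for (name,) in rows]
    return artifacts


old_result = list_artifacts_old()
print(query_count)  # 1 query for the artifacts + 1 per artifact
```

With three artifacts this issues four queries; the count grows linearly with the number of artifacts returned.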

Now we:

  1. Query all artifacts, joined* on the artifact tags and filtered by tag name. (The join already returns one row per artifact-tag combination.)
  2. For each artifact, add the tag to its metadata.
  3. Filter by producer_uri in the same for loop as (2), so we don't iterate over all artifacts again.

*An outer join is used when no specific tag was requested, so that tag-less artifacts are still returned; a simpler join is used when querying a specific tag, since tag-less artifacts are irrelevant in that case.
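
The new flow can be sketched roughly as follows, again with the standard-library `sqlite3` module and a made-up schema. The table and column names, the `list_artifacts` helper, and storing `producer_uri` as a plain column are all assumptions for illustration; the real implementation lives in `server/api/db/sqldb/db.py` and may differ.

```python
import sqlite3

# Hypothetical schema for illustration only; not the real MLRun tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artifacts (uid TEXT PRIMARY KEY, key TEXT, producer_uri TEXT);
    CREATE TABLE artifact_tags (name TEXT, artifact_uid TEXT);
    INSERT INTO artifacts VALUES
        ('u1', 'model', 'proj/run-a'),
        ('u2', 'dataset', 'proj/run-b'),
        ('u3', 'plot', 'proj/run-a');
    INSERT INTO artifact_tags VALUES ('latest', 'u1'), ('v1', 'u1'), ('latest', 'u2');
""")


def list_artifacts(tag=None, producer_uri=None):
    if tag is None:
        # No tag requested: outer join, so tag-less artifacts are kept (tag is NULL).
        sql = """
            SELECT a.uid, a.key, a.producer_uri, t.name
            FROM artifacts a
            LEFT OUTER JOIN artifact_tags t ON t.artifact_uid = a.uid
            ORDER BY a.uid, t.name
        """
        params = ()
    else:
        # Specific tag requested: inner join, tag-less artifacts cannot match anyway.
        sql = """
            SELECT a.uid, a.key, a.producer_uri, t.name
            FROM artifacts a
            JOIN artifact_tags t ON t.artifact_uid = a.uid
            WHERE t.name = ?
            ORDER BY a.uid
        """
        params = (tag,)

    results = []
    # Single pass over the joined rows: attach the tag and filter by producer_uri.
    for uid, key, uri, tag_name in conn.execute(sql, params):
        if producer_uri is not None and uri != producer_uri:
            continue
        results.append({"uid": uid, "key": key, "tag": tag_name})
    return results


print(list_artifacts())                           # one row per artifact-tag pair
print(list_artifacts(tag="latest"))               # only artifacts tagged "latest"
print(list_artifacts(producer_uri="proj/run-a"))  # filtered in the same loop
```

Each call issues a single query regardless of how many artifacts match, and an artifact with several tags appears once per tag.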

https://iguazio.atlassian.net/browse/ML-6642


@quaark quaark left a comment

Looks good! Couple of things

server/api/db/sqldb/db.py (outdated)
server/api/db/sqldb/db.py (outdated)

@quaark quaark left a comment

One more minor thing

server/api/db/sqldb/db.py (outdated)

@quaark quaark left a comment

👑

@TomerShor TomerShor changed the title from "[Artifacts] Improve list artifacts tag querying" to "[Artifacts] Improve list artifacts querying" May 30, 2024
@TomerShor TomerShor merged commit f0184b3 into mlrun:development May 30, 2024
12 checks passed
TomerShor added a commit to TomerShor/mlrun that referenced this pull request May 30, 2024
@TomerShor TomerShor deleted the bugfix/list-artifacts-join-tags branch May 30, 2024 13:29
2 participants