New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
repoquery: fix rich deps matching by using provide expansion from libdnf #1432
Conversation
There was an error while running a copr build. You can re-trigger copr build by adding a comment ( |
@lukash Did you make any performance measurements? |
@j-mracek I did not. From what I remember the command from the bug was always quite slow - but I might remember that wrong. Are you concerned about anything in particular? |
I would like to hear that there is no performance issue with the code change. The original code is slow, but it could be even worse. |
I'll check it. |
☔ The latest upstream changes (presumably 5ccde9f) made this pull request unmergeable. Please resolve the merge conflicts. |
I've done some rudimentary performance measurements. As a summary, it's mostly the same, the new version is slightly slower. There's a big performance hit for Here's a table with the measurements. I've picked the
|
Oh, just to add: The stars in the QUERIES are there to increase the amount of matches for the queries, so that the times are longer and performance differences more significant. They're not the usual queries users would run most of the time, obviously. |
@lukash Please could you rebase it? |
I have discovered a performance problem with --whatconflicts: time sudo dnf repoquery -C --disableplugin=* --whatconflicts $(rpm -qa --queryformat="%{name},") | The original dnf is able to perform the command in 3.835s but with patches in 18.262s. |
Please kindly read my #1432 (comment). Thanks. |
There was an error while running a copr build:
You can re-trigger copr build by adding a comment ( |
dnf/cli/commands/repoquery.py
Outdated
depquery = depquery.union(query.filter(requires=resolved_nevras)) | ||
|
||
if all_dep_types: | ||
# TODO this is actually very inefficient, as it builds the same list of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about the comment. I would recommend to remove it because it has not much informative value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment has a lot of value, it just may not be phrased well. The problem is each call to query.filter IIRC resolves the names
to packages or does some similarly expensive operation (sorry, I forgot the details, I can check them if you find it important here), which is the same for all the five total calls that are happening here. Therefore, if the code was written in a way this same operation was done only once, the code could be five times faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please could you remove the comment anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to give me a good reason for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simple I don't think that the code should be used as a wish list. Additionally I don't think that there is any real impact on performance. Also the implementation is limited by API available in libsolv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a wish, it is a performance issue with the code that I've discovered.
There is a real impact on performance. I've done a very rudimentary performance check (as I kind of hope it will be enough for you) and for a repoquery like:
dnf3 repoquery --whatdepends "libdnf*"
The pyseq_to_reldeplist()
function (https://github.com/rpm-software-management/libdnf/blob/20ba333f9a54e73e95257e1a78f0f55bb6e88809/python/hawkey/iutil-py.cpp#L237) is being called 5 times with the same arguments and the same result. It is taking about 50% of the total time (I've eyeballed the time tbh., didn't go through measuring it exactly...) of the command run time.
This is obviously not a libsolv limitation, it is a shortcoming of the current API design and the comment is there to bring attention to it, because it is not apparent at first glance this may be happening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then please enhance the comment and provide additional information or remove it. The issue is with resolving string as a glob to reldep. It could be enhance by providing API for the conversion and then using list of reldeps for query. To general notes have nearly no value and the author of the note will not remember details. Just remember how many time you already mentioned that you do not remember details about your code, in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've adjusted the comment, sorry about being imprecise in the first place.
FTR I think the right API fix is to have support for depends
, instead of separating out the resolving of globs. I think that's better for a high-level API.
dnf/cli/commands/repoquery.py
Outdated
return done | ||
|
||
def by_all_deps(self, names, query, all_dep_types=False): | ||
# in case of arguments being NEVRAs, resolve them to packages | ||
resolved_nevras = self._resolve_nevras(names, query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using nevra could be confusing because we have NEVRA object. Option parser use REQ for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is confusing, the naming? The comment? I'm not sure what you mean...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolved_nevras
is confusing. Inside the variable there is a query. What about nevras_query
, REQs_query
, specs_query
, subjects_query
or anything else that is better than resolved_nevras
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've renamed it to resolved_nevras_query
, as the _query
suffix seems to be what you really want and while a bit long, it still fits on the lines.
Moves the check for alldeps/exactdeps switches to the configure() method along with the other checks. Raises dnf.cli.CliError instead of dnf.exceptions.Error.
Fixes a bug introduced in bd00c04, where the by_all_deps() arguments were changed to an array, but in tree_seed() it still passes a single string. Untested, I have actually no idea what the code does there, it looks kind of suspect.
Separates whatrequires and whatdepends handling into two separate code branches, the original was overly complex, confusing and presented multiple unnecessary corner cases. The by_all_deps() arguments now also have a much better defined role.
I believe that information in doc/api_queries.rst is outdated. @lukash Please could you take a look? |
Could you be more specific? |
At lease please could you update a table between lines 90 and 115. For requires an argument type query/package sequence is not described. |
Ok, I see, I'll try to add that. I also see there's no mention of
I suppose, but I'm not entirely sure how to phrase that and I'm also having a bit of a hard time remembering what exactly are the differences here (rich deps evaluation being the main difference I'm introducing), do you have any suggestions what to mention? |
That rich deps are evaluated differently between arguments type. Like package |
I've added a commit updating the documentation in |
Uses the new feature of filtering by solvables to properly resolve rich dependencies for packages. The new code now, along with serching for the arguments as reldeps, also searches for them as NEVRAs and then looks up the dependencies for the packages it found. https://bugzilla.redhat.com/show_bug.cgi?id=1534123 https://bugzilla.redhat.com/show_bug.cgi?id=1698034
…RhBug:1534123,1698034) To properly list the conflicts, suggests, recommends, enhances and supplements dependencies of packages defined through provides or even rich dependencies, it is required to resolve them to packages and query for them, because that's where the dependencies are defined. https://bugzilla.redhat.com/show_bug.cgi?id=1534123 https://bugzilla.redhat.com/show_bug.cgi?id=1698034
Add the new types of arguments supported for dependency filtering.
📌 Commit 757f9ad has been approved by |
Separates whatrequires and whatdepends handling into two separate code branches, the original was overly complex, confusing and presented multiple unnecessary corner cases. The by_all_deps() arguments now also have a much better defined role. Closes: #1432 Approved by: j-mracek
Uses the new feature of filtering by solvables to properly resolve rich dependencies for packages. The new code now, along with serching for the arguments as reldeps, also searches for them as NEVRAs and then looks up the dependencies for the packages it found. https://bugzilla.redhat.com/show_bug.cgi?id=1534123 https://bugzilla.redhat.com/show_bug.cgi?id=1698034 Closes: #1432 Approved by: j-mracek
…RhBug:1534123,1698034) To properly list the conflicts, suggests, recommends, enhances and supplements dependencies of packages defined through provides or even rich dependencies, it is required to resolve them to packages and query for them, because that's where the dependencies are defined. https://bugzilla.redhat.com/show_bug.cgi?id=1534123 https://bugzilla.redhat.com/show_bug.cgi?id=1698034 Closes: #1432 Approved by: j-mracek
Add the new types of arguments supported for dependency filtering. Closes: #1432 Approved by: j-mracek
☀️ Test successful - status-papr |
Moves the check for alldeps/exactdeps switches to the configure() method along with the other checks. Raises dnf.cli.CliError instead of dnf.exceptions.Error. Closes: #1432 Approved by: j-mracek
Separates whatrequires and whatdepends handling into two separate code branches, the original was overly complex, confusing and presented multiple unnecessary corner cases. The by_all_deps() arguments now also have a much better defined role. Closes: #1432 Approved by: j-mracek
Uses the new feature of filtering by solvables to properly resolve rich dependencies for packages. The new code now, along with serching for the arguments as reldeps, also searches for them as NEVRAs and then looks up the dependencies for the packages it found. https://bugzilla.redhat.com/show_bug.cgi?id=1534123 https://bugzilla.redhat.com/show_bug.cgi?id=1698034 Closes: #1432 Approved by: j-mracek
…RhBug:1534123,1698034) To properly list the conflicts, suggests, recommends, enhances and supplements dependencies of packages defined through provides or even rich dependencies, it is required to resolve them to packages and query for them, because that's where the dependencies are defined. https://bugzilla.redhat.com/show_bug.cgi?id=1534123 https://bugzilla.redhat.com/show_bug.cgi?id=1698034 Closes: #1432 Approved by: j-mracek
Add the new types of arguments supported for dependency filtering. Closes: #1432 Approved by: j-mracek
This is meant to fix a bug where rich dependencies would match a package even if the require specified a different version range than doesn't match the version of the provide.
Uses new libdnf (PR linked below) that can expand package provides while filtering so that it doesn't haveto be implemented in dnf in a non-straightforward way that was causing it to match rich dependencies incorrectly. Further description is with the commits, along with some shortcomings this currently has.
rpm-software-management/libdnf#762
Tests for this are here:
rpm-software-management/ci-dnf-stack#577
https://bugzilla.redhat.com/show_bug.cgi?id=1534123
https://bugzilla.redhat.com/show_bug.cgi?id=1698034