Expose component sort_score as public API? #269
Comments
|
Constructing two AsPool instances seems to be the way most software centers handle that situation currently, and I don't think that's a terrible thing to do. The match score is also relatively self-contained, so you could compare components from two pools with each other. However, the original intent of the API was to be used with just one pool that collects data from all the sources the software center wants it to read from, so that the SC doesn't even have to do its own merging.
AppStream takes a conservative approach to what API is public, because the less there is public, the more can be changed without breaking other people's code or breaking the API/ABI. For example, the score isn't public because it is an implementation detail and I originally wasn't sure whether its format or even datatype would be kept stable. Once the API had been made public, we would have had to support it until the next API break.
Anyway, back to the "which packaging system was the component from?" question: The "single
At the moment, there is an advantage to using one pool for "stuff from the operating system" and a second pool for "everything else" when you are on Debian or Ubuntu, though. Those two distributions (and their derivatives) create a system-wide cache of all the components in their repositories, which libappstream will just load directly if it is up-to-date. This effectively gives close-to-zero startup delay for an
tl;dr: The situation is complicated, but I am very interested in improving it, by either making the one-pool use case more attractive or even exposing the raw match scores in search results (or both). |
|
Thank you for your quick and very helpful response!
I've had a quick look into what would be involved in AppCenter to move to a single-pool model and it would be a relatively large piece of refactoring. The Flatpak and PackageKit backends are fairly self-contained, almost to the point of being separate modules, and handle their own AppStream data, so centralising that is no small feat. However, I can also see that if we did combine both pools, it would reduce a lot of the complexity in AppCenter, so I'm convinced that it should be a long-term goal.
I've opened an issue in our issue tracker to investigate it further so I can provide some more detailed feedback in the future about what works well and what doesn't when I experiment with that. I'll likely look at throwing together some quick command line tools I can use to investigate the various requirements of AppCenter.
However, the quality of search results is one of the most complained-about things within AppCenter today, I think largely because we can't order the results in any meaningful way. So, in the short term, I'd be very interested in having the raw match scores exposed if possible.
Do you have any preference on how you'd expose those scores? Is it as simple as exposing the
My GLib C knowledge isn't great, so I could probably achieve a PR for the first option there if you like, but anything more complex may take me a while!
Thanks again! |
That's what I feared... This is precisely the reason why SCs do have separate pools. For example, KDE's Discover tool (which also uses libappstream via its Qt bindings) also has independent pools for very self-contained backends. Sadly, on some occasions this means that SCs duplicate a lot of the abstraction that AppStream already gives them (I haven't taken a close look at AppCenter yet though).
Neat! I kind of hoped this was something to quickly test, but if it isn't, we could also make the score API public as a stop-gap solution. I intend to make a bigger API break for AppStream's 1.0 release. If the "one pool" approach has been tested well in AppCenter by then, and we know which parts are missing for you and which API could be better, we could make any necessary changes at that point.
That would likely be easiest, but I'll think about this a bit more, and also take another look at the scoring algorithm just to make sure that the scores are indeed comparable between disconnected pools, so they actually help you with your problem.
Neat :-) I'd likely want to keep this as simple as possible :-) Libappstream hides all its private symbols from its ABI, and separates the two visibilities into regular (public) and private headers (*-private.h). I'm pretty glad I went that route, because it makes ABI break checkers work very well, and making API public is just a matter of moving a declaration between headers (the GIR and subsequently your vapi file will be adjusted automatically). The only tricky thing is that the scoring integer type is a custom type (because I wasn't sure whether it would stay the same type of integer) - not sure what the vapi file will make of that. In any case, since I am not using libappstream for a software-center application anymore myself, any feedback from you on certain aspects of AppStream is highly valued. For specification changes, I usually ask the maintainers of KDE Discover and GNOME Software for feedback already, and usually ping cassidyjames as well - even if there's no feedback, it means SC authors will know about a new feature, like the new controls support. If you want, I can also keep you in the loop. |
Indeed, we have a lot of methods that are very similar in name to AppStream methods (
Yes, I think making the score API public at this stage is best for us in terms of offering a quick solution to the search issues our users experience. Given that the Ubuntu 20.04 release is upon us, we now have our work cut out for us preparing the next release of elementary OS. So, it's unlikely I'll be able to spend much time looking at the single-pool approach given the amount of work that's potentially involved in doing that. But I do feel it's a good goal and I'll definitely be sure to share any feedback.
I'll have a play around with this and see what kind of vapi we get out when that type is exposed.
I speak regularly with Cassidy and the rest of the team so I'm usually reasonably across changes like that. I usually get more involved with the technical implementation of such changes into AppCenter rather than the initial definition of the specifications. But if you ever want a sense check on anything that's going to cause vapi changes, feel free to ping me. Thanks again for your time! |
I'm opening this mostly as a request for discussion and/or advice rather than a 100% confirmed feature request.
In elementary AppCenter, we obviously use AppStream and semi-recently we added support for managing Flatpak applications alongside the applications managed by PackageKit.
In doing this, we found it easiest to build two AppStream pools, one by default pointing at the DEP-11 data, and another pointing at a pre-processed set of the Flatpak appstream data. This was done so we could clearly tell in code which components had come from which "backend" and we knew if we wanted to install/update/remove a particular component, it was already associated with either Flatpak or PackageKit.
Now, this brings around some issues with searching and sorting as obviously we have to run a given search twice across both pools and then combine the results somehow. I understand that the results that come back from AppStream are already sorted in a kind of relevance order. But without these relevance metrics accessible to us, we can't combine the search results from the two pools in a meaningful way, at least not without implementing a (probably inefficient) relevancy algorithm of our own.
Do you have any opinions on whether these relevancy metrics should be exposed publicly, or indeed whether they'd be of any use as a comparison between two pools? I guess if those metrics are somehow based on the relative relevancy between components in the pool, then they're not that useful for comparing two different pools. However, if an identical component in two different pools always returned the same relevancy for a given search term, then that's probably a useful metric to have?
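To make the idea concrete, here is a rough sketch of how two already-sorted result lists could be combined if a comparable score were available for each hit. The SearchHit struct, the Backend enum and merge_hits are hypothetical names made up for this example and are not AppStream API. The sketch also assumes that scores from the two pools are directly comparable, which is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend tag; not an AppStream type. */
typedef enum { BACKEND_PACKAGEKIT, BACKEND_FLATPAK } Backend;

/* Hypothetical search hit: component ID, match score and originating pool. */
typedef struct {
    const char  *id;
    unsigned int score;
    Backend      backend;
} SearchHit;

/* Merge two arrays that are each sorted by descending score into `out`,
 * which must have room for n_a + n_b entries. On equal scores, entries
 * from `a` come first. Returns the number of entries written. */
size_t
merge_hits (const SearchHit *a, size_t n_a,
            const SearchHit *b, size_t n_b,
            SearchHit *out)
{
    size_t i = 0, j = 0, k = 0;

    assert (out != NULL);

    while (i < n_a && j < n_b) {
        if (a[i].score >= b[j].score)
            out[k++] = a[i++];
        else
            out[k++] = b[j++];
    }
    /* Copy whatever is left in either input. */
    while (i < n_a)
        out[k++] = a[i++];
    while (j < n_b)
        out[k++] = b[j++];

    return k;
}
```

Because each hit carries its backend tag through the merge, the combined list would stay ordered by relevance while still recording which pool every component came from.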
If you don't feel exposing this is the correct solution, can you think of any way we might combine both of these pools into one while still having a generic way to figure out where a component came from, so we can associate it with a backend? I appreciate there's as_component_get_origin, but considering it can be set to anything, it doesn't feel like a foolproof enough way of doing this.
As an aside, from a quick glance, it looks like gnome-software doesn't use the built-in AppStream search functionality and instead implements its own on top of libxmlb, again having the AppStream data from different backends in different "silos".