Does not read MediaBaseUrl field in DEP-11 files #70

Closed
ximion opened this Issue Dec 16, 2015 · 3 comments

Collaborator

ximion commented Dec 16, 2015

Hi!
The MediaBaseUrl field in DEP-11 files is used as a "prefix" for media URLs, i.e. URLs of screenshots or icons of the remote type.
See the specification: http://www.freedesktop.org/software/appstream/docs/sect-AppStream-DEP11.html#spec-dep11-general

It fills a role similar to the one Origin already has, and makes it easy to switch the server that serves the media. This is a desired feature at Debian for derivatives, and it will also become useful in a setup where multiple mirrors exist for media content.

With the current implementation, asglib prevents GNOME-Software from showing any screenshots at all on Debian.
Reference files for testing: ftp://ftp.debian.org/debian/dists/sid/main/dep11/

Cheers,
Matthias

Owner

hughsie commented Dec 17, 2015

So, the reasons I'm not particularly happy about MediaBaseUrl are the following:

  • I'm unclear what problem it's trying to solve. Using MediaBaseUrl for the Fedora XML metadata the file size is reduced by 2%, but simply gzipping it reduces it by a further 60% (which is what we do with the XML metadata) and as a consequence it actually loads faster than the uncompressed metadata. If the argument is that it makes it easier to re-write the URL, you're no better off than just rewriting each URL as you still have to load and resave the entire file anyway.
  • It's unclear how to implement this. Does MediaBaseUrl supplement remote screenshots without a prefix (which incidentally makes them filenames or paths, not URLs), or does it overwrite all the remote screenshots? If it's the former then it's not actually a way to set up a screenshot mirror. Derivatives can't just rsync the screenshots and change the MediaBaseUrl as they need to produce AppStream metadata themselves to avoid the screenshots and the references in the files changing.
  • It makes it hard to merge files: due to the unclear semantics of MediaBaseUrl we just have to remove it and make the URLs full, which kinda negates the purpose of the tag in the first place.
  • It's not clear what "media" it's prefixing; do remote icons get checked too?
  • It's going to increase the RSS for any files that use MediaBaseUrl and decrease the parsing speed for both files that do and files that don't.
  • I don't see how it can be used in a multiple mirror setup as you can only specify one URL for the MediaBaseUrl; it's not even useful for round-robin functionality.
Collaborator

ximion commented Dec 17, 2015

  • I'm unclear what problem it's trying to solve. Using MediaBaseUrl for the Fedora XML metadata the file size is reduced by 2%, but simply gzipping it reduces it by a further 60% (which is what we do with the XML metadata) and as a consequence it actually loads faster than the uncompressed metadata. If the argument is that it makes it easier to re-write the URL, you're no better off than just rewriting each URL as you still have to load and resave the entire file anyway.

Still, in case I want to rewrite the URL, I would have to go through the whole file, know the previous URL and string-replace it with the new URL. By having MediaBaseUrl, I just have to change one field.
Also, it permits having a list of mirrors somewhere from which data can be selected, so you just need to concatenate the mirror-url and the URL piece found in the DEP-11 document.
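
To make the difference concrete, here is a minimal Python sketch of the two rewriting strategies. The function names, the round-robin helper and the example hosts are made up for illustration; this is neither libappstream nor appstream-glib code, and a mirror list is not part of the specification.

```python
def retarget_full_urls(urls, old_prefix, new_prefix):
    """Without MediaBaseUrl: the old prefix must be known and every URL is touched."""
    return [
        new_prefix + url[len(old_prefix):] if url.startswith(old_prefix) else url
        for url in urls
    ]


def retarget_base(header, new_base):
    """With MediaBaseUrl: only one header field changes, the entries stay as they are."""
    updated = dict(header)
    updated["MediaBaseUrl"] = new_base
    return updated


def pick_mirror(mirrors, index):
    """Hypothetical round-robin choice from a mirror list kept outside the document."""
    return mirrors[index % len(mirrors)]


# Example values; all hosts are placeholders.
full_urls = ["http://example.org/a/ab/abcde.desktop/image.png"]
moved = retarget_full_urls(full_urls, "http://example.org", "http://mirror.example.net")

header = {"Origin": "example", "MediaBaseUrl": "http://example.org"}
mirrors = ["http://m1.example.net", "http://m2.example.net"]
switched = retarget_base(header, pick_mirror(mirrors, 3))
```

In the second case the relative paths in the document never need to be rewritten, whichever mirror is chosen.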

  • It's unclear how to implement this. Does MediaBaseUrl supplement remote screenshots without a prefix (which incidentally makes them filenames or paths, not URLs), or does it overwrite all the remote screenshots? If it's the former then it's not actually a way to set up a screenshot mirror. Derivatives can't just rsync the screenshots and change the MediaBaseUrl as they need to produce AppStream metadata themselves to avoid the screenshots and the references in the files changing.

Not sure what you mean here... The thing is: if there is a MediaBaseUrl field in the document, all remote URLs are relative to that URL. So if MediaBaseUrl is "http://example.org", a screenshot "url" will be "a/ab/abcde.desktop/image.png" and the base URL is simply prepended to it.
There is no mixing of MediaBaseUrl with full URLs, it's an all-or-nothing approach.
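
As a rough illustration of that rule, the lookup could be sketched like this in Python. The function name and the dict-shaped header are assumptions made for the example, not the actual parser code; inserting a "/" between the base and the relative part is also an assumption, since the example values above would otherwise run together.

```python
def resolve_media_url(header, url):
    """Return the URL a client should fetch for a remote screenshot or icon.

    All-or-nothing: if the header carries MediaBaseUrl, every remote URL in
    the document is treated as relative to it; otherwise URLs are used as-is.
    """
    base = header.get("MediaBaseUrl")
    if base is None:
        return url
    # Assumption: join with exactly one slash between base and relative part.
    return base.rstrip("/") + "/" + url.lstrip("/")


with_base = resolve_media_url(
    {"MediaBaseUrl": "http://example.org"}, "a/ab/abcde.desktop/image.png"
)
without_base = resolve_media_url({}, "http://example.org/a/ab/abcde.desktop/image.png")
```

Both calls end up with the same full URL, one built from the prefix and one passed through untouched.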

  • It makes it hard to merge files: due to the unclear semantics of MediaBaseUrl we just have to remove it and make the URLs full, which kinda negates the purpose of the tag in the first place.

Why would you want to merge files? We don't do that with DEP-11 data, so it's kind of irrelevant. And yes, you would remove the MediaBaseUrl from the resulting file on a merge and use full URLs, which isn't a problem, since the default case is non-merged, "pure" DEP-11 YAML documents, and not merged ones.
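
For the merge case, a sketch of "drop MediaBaseUrl and use full URLs" might look like the following. The document layout used here (a "header" dict plus a list of components with a flat "screenshots" list) is a deliberately simplified stand-in, not the real DEP-11 schema, and the function names are invented.

```python
def expand_components(doc):
    """Return copies of a document's components with full screenshot URLs."""
    base = doc["header"].get("MediaBaseUrl")
    expanded = []
    for comp in doc["components"]:
        shots = comp.get("screenshots", [])
        if base is not None:
            # Assumption: join base and relative path with a single slash.
            shots = [base.rstrip("/") + "/" + s.lstrip("/") for s in shots]
        expanded.append({**comp, "screenshots": list(shots)})
    return expanded


def merge_documents(docs, origin):
    """Merge documents into one whose header has no MediaBaseUrl."""
    merged = {"header": {"Origin": origin}, "components": []}
    for doc in docs:
        merged["components"].extend(expand_components(doc))
    return merged


doc_a = {
    "header": {"Origin": "a", "MediaBaseUrl": "http://example.org"},
    "components": [{"ID": "abcde.desktop", "screenshots": ["a/ab/abcde.desktop/image.png"]}],
}
doc_b = {
    "header": {"Origin": "b"},
    "components": [{"ID": "other.desktop", "screenshots": ["http://example.net/other.png"]}],
}
merged = merge_documents([doc_a, doc_b], origin="merged")
```

Each input keeps its own prefix until it is expanded, so documents with different MediaBaseUrl values, or none at all, can be combined without ambiguity.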

  • It's not clear what "media" it's prefixing; do remote icons get checked too?

Yes - didn't I state that already? Although, right now, remote icons aren't used, so this mainly affects screenshots.

  • It's going to increase the RSS for any files that use MediaBaseUrl and decrease the parsing speed for both files that do and files that don't.

Why? When I implemented this in libappstream, it didn't impact parsing speed at all, since it's just a string to hold in memory, and a couple of extra string concatenations if the string is found to be non-NULL when reading remote media.

  • I don't see how it can be used in a multiple mirror setup as you can only specify one URL for the MediaBaseUrl; it's not even useful for round-robin functionality.

That will probably happen in the future (e.g. by having a list of mirrors in the header). The point is more to have an extensible specification from the beginning.

Collaborator

ximion commented Apr 18, 2016

This is fixed now, thanks Robert!

ximion closed this Apr 18, 2016
