Fix and improve Guix support #218

Closed
ghost opened this issue Apr 25, 2017 · 30 comments

Comments

@ghost

ghost commented Apr 25, 2017

Guix and its system distribution GuixSD run from one repository, like NixOS and nixpkgs.

Our website is at https://gnu.org/s/guix and our master source is at https://git.savannah.gnu.org/cgit/guix.git/log/

Parsing depends on where you read from. All packages are definitions in Scheme modules located in "gnu/packages/". The listing on our website is currently not very third-party friendly, though someone else might be able to provide more insight on the website.

@AMDmi3
Member

AMDmi3 commented Apr 25, 2017

Guix support has been there since February: https://repology.org/repository/gnuguix.
I've used the GNU prefix because it's widely used inside the project; maybe it is confusing and should be removed?
It currently parses data from the website, but if there were some JSON available with more info, it would be nice.

@ghost
Author

ghost commented Apr 26, 2017

Thanks for the reply.

Oh, okay. That wasn't very obvious to me but it makes sense to name it like that.
Having the name "GNU Guix" is correct and should be kept.

In the link you pasted, the links to the gnu.org page could be renamed in my opinion.
Either:
"GNU Guix and GuixSD home"
"GNU Guix and GuixSD packages"
or just
"GNU Guix home"
"GNU Guix packages"
as both exist and Guix is not limited to GuixSD.

Moving on:

The "packages" pages on the website are generated nightly directly from master.
There is no json but something more programmer friendly can be achieved with Guile.
https://lists.gnu.org/archive/html/guix-devel/2017-04/msg00603.html

@AMDmi3
Member

AMDmi3 commented Apr 26, 2017

could be renamed in my opinion

Done

There is no json but something more programmer friendly can be achieved with Guile.

Can do. Is there an example on how to extract metadata for all packages with guile and dump it into something which can be easily parsed from python?

AMDmi3 added a commit that referenced this issue Apr 26, 2017
AMDmi3 changed the title from "Add Guix support" to "Improve Guix support" on Apr 26, 2017
@ghost
Author

ghost commented Apr 26, 2017

Thanks!

I'm afraid that the extraction option is currently nothing I can help with; I have too many other issues to work out at the moment.
Your best bet is to try the guix-devel@gnu.org mailing list; some people there are on GitHub as well and should be able to comment.

@ghost
Author

ghost commented May 2, 2017

Apparently there is json available. For the how, what, which I can't help you.

https://lists.gnu.org/archive/html/guix-devel/2017-05/msg00027.html

@AMDmi3
Member

AMDmi3 commented May 3, 2017

This contains too little data: only name+version+homepage, while website also has summary and license, and I could make use of even more data (maintainer, category, source download urls).

@ghost
Author

ghost commented May 3, 2017

In this case I advise you to file a bug report as a feature request (details on the right list are given at the bottom of https://www.gnu.org/software/guix/), pointing to this GitHub issue as a reference.

@AMDmi3
Member

AMDmi3 commented Jul 18, 2019

Guix support was disabled after the website layout change, since it's no longer possible to extract the needed data.

AMDmi3 changed the title from "Improve Guix support" to "Fix and improve Guix support" on Jul 18, 2019
@ghost
Author

ghost commented Jul 19, 2019

Someone else should pick this up eventually or communicate it to the right channels (forwarded to the guix-devel ML); I'm currently not active in Guix or Guix System.

@nico202

nico202 commented Jul 24, 2019

Which info do you need? A simple scheme program like this:

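;; (gnu packages) provides fold-available-packages; (json) provides scm->json-string.
(use-modules (gnu packages)
             (json))

;; Collect (name . version) pairs into an alist and print it as a JSON object.
(display
 (scm->json-string
  (fold-available-packages
   ;; Any further arguments are collected in REST and ignored.
   (lambda* (name version result #:rest rest)
     (cons (cons name version) result)) '())))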

produces a JSON dictionary of the form {"name":"version", ...}.

I'm new to Scheme but I think I'll be able to provide at least the URL and license as well. Probably it's possible to clone the repo and generate the json in a CI

@AMDmi3
Member

AMDmi3 commented Jul 25, 2019

Which info do you need?

https://repology.org/addrepo

A simple scheme program like this:

Sorry, running scheme code in Repology is not an option.

Probably it's possible to clone the repo and generate the json in a CI

I'd prefer official dump. CI based updates do not sound reliable.

@nico202

nico202 commented Aug 22, 2019

Finally we added the json here: https://guix.gnu.org/packages.json
If anybody can take a look I'd be glad, thanks!

@AMDmi3
Member

AMDmi3 commented Aug 22, 2019

Great, I'm on it!

Could more info be included there, such as license, download URLs (these were available to the legacy website parser and were useful to Repology, which could also report broken URLs back) and a source reference (e.g. gnu/packages/shells.scm#n395) so we could link to the source directly?

@AMDmi3
Member

AMDmi3 commented Aug 22, 2019

Also, some homepages are set to boolean false. This doesn't look correct; it'd be better to omit them or set them to null.

@AMDmi3
Member

AMDmi3 commented Aug 23, 2019

Also, could the version be presented in a way that lets the revision be split from it reliably? Judging by https://guix.gnu.org/manual/en/html_node/Version-Numbers.html I could detect the revision as -[0-9]+-[0-9a-f]{7,}, but there are cases which do not follow this policy, like ghmm 0.9-rc3-0.2341.
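
To make the difficulty concrete, here is a minimal Python sketch of such a split. The pattern is copied verbatim from the comment above and may not match the separators the manual actually uses; the helper name split_revision and the version string "1.2.3-4-abcdef0" are made up for illustration, and this is not Repology's actual parser.

```python
import re

# Pattern taken as-is from the comment above, anchored at the end of the string.
REVISION_RE = re.compile(r"-([0-9]+)-([0-9a-f]{7,})$")


def split_revision(version):
    """Return (upstream_version, (revision, commit)) or (version, None)."""
    match = REVISION_RE.search(version)
    if match is None:
        # The version does not follow the expected snapshot pattern.
        return version, None
    upstream = version[:match.start()]
    return upstream, (int(match.group(1)), match.group(2))


# Hypothetical version string that follows the pattern.
print(split_revision("1.2.3-4-abcdef0"))
# ghmm version quoted in the comment: the pattern does not match it.
print(split_revision("0.9-rc3-0.2341"))
```

With this pattern the ghmm version comes back unsplit, so a consumer would end up treating the whole string as the upstream version.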

@civodul

civodul commented Aug 27, 2019

@AMDmi3, to complement what @nico202 had implemented, I added what you requested (and more) to https://guix.gnu.org/packages.json . So you'll now find source code URLs, commits/revisions, package definition location, and the Common Platform Enumeration (CPE) name when available.

Let us know what you think!
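
As an illustration of how the package definition location could be turned into a direct source link, here is a rough Python sketch. It assumes the location is a "path:line" string and that the repository is browsable through a cgit "tree" URL with "#nLINE" anchors; the base URL and the helper name location_to_url are assumptions, not something confirmed in this thread.

```python
# Assumed cgit "tree" URL for the Guix repository; adjust if the layout differs.
CGIT_TREE = "https://git.savannah.gnu.org/cgit/guix.git/tree/"


def location_to_url(location):
    """Turn 'gnu/packages/foo.scm:123' into a link to that line."""
    path, sep, line = location.rpartition(":")
    if not sep or not line.isdigit():
        # No usable line number: link to the file only.
        return CGIT_TREE + location
    return f"{CGIT_TREE}{path}#n{line}"


print(location_to_url("gnu/packages/shells.scm:395"))
```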

@AMDmi3
Member

AMDmi3 commented Aug 28, 2019

This is just great! But what about homepage=false and version/revision split?

@nico202

nico202 commented Aug 28, 2019

As a note, the ones with homepage == false are:

curl https://guix.gnu.org/packages.json |  jq -c '.[] | select( .homepage == false)'

{"name":"binutils-bootstrap","version":"0","synopsis":"Bootstrap binaries of the GNU Binutils","homepage":false,"location":"gnu/packages/bootstrap.scm:106"}
{"name":"bootstrap-binaries","version":"0","synopsis":"Bootstrap binaries of Coreutils, Awk, etc.","homepage":false,"location":"gnu/packages/bootstrap.scm:106"}
{"name":"bootstrap-tarballs","version":"0","synopsis":"Tarballs containing all the bootstrap binaries","homepage":false,"location":"gnu/packages/make-bootstrap.scm:881"}
{"name":"gcc-bootstrap","version":"0","synopsis":"Bootstrap binaries of the GNU Compiler Collection","homepage":false,"location":"gnu/packages/bootstrap.scm:507"}
{"name":"glibc-bootstrap","version":"0","synopsis":"Bootstrap binaries and headers of the GNU C Library","homepage":false,"location":"gnu/packages/bootstrap.scm:436"}
{"name":"gnome-default-applications","version":"0","synopsis":"Default MIME type associations for the GNOME desktop","homepage":false,"location":"gnu/packages/gnome.scm:6558"}
{"name":"guile-bootstrap","version":"2.0","synopsis":"Bootstrap Guile","homepage":false,"location":"gnu/packages/bootstrap.scm:343"}
{"name":"static-binaries-tarball","version":"0","synopsis":"Statically-linked bootstrap binaries","homepage":false,"location":"gnu/packages/make-bootstrap.scm:814"}

@civodul

civodul commented Aug 28, 2019

I've fixed the homepage issue: it won't be emitted anymore in those cases (the change will show up in packages.json within an hour).

As for the version/revision split, source gives you the actual Git commit or SVN revision when available, which is more accurate than a second guess from the version string. WDYT, @AMDmi3?

@AMDmi3
Member

AMDmi3 commented Aug 28, 2019

I've fixed the homepage issue

👍

I've run into another technical problem:

% curl --silent -I https://guix.gnu.org/packages.json | grep Last-Modified: 
Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT

The date does not change which would prevent Repology from updating packages.json.
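
A rough Python sketch of why a frozen date blocks updates, assuming the fetcher only re-downloads when Last-Modified is newer than the value it stored last time and fetches unconditionally when the header is absent. This is an illustration of that assumption with a hypothetical helper, not Repology's actual code.

```python
from email.utils import parsedate_to_datetime


def needs_refetch(stored, current):
    """Compare two Last-Modified header values (strings or None)."""
    if stored is None or current is None:
        # Nothing to compare against: fetch unconditionally.
        return True
    return parsedate_to_datetime(current) > parsedate_to_datetime(stored)


EPOCH = "Thu, 01 Jan 1970 00:00:01 GMT"
print(needs_refetch(EPOCH, EPOCH))  # False: the file looks unchanged forever
print(needs_refetch(EPOCH, None))   # True: without the header, always fetch
```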

As for the version/revision split, source gives you the actual Git commit or SVN revision when available, which is more accurate than a second guess from the version string

This still does not allow to extract upstream version reliably. For instance, here are 4 cases with svn_revision, but they all handle it differently.

   {
      "homepage" : "http://icculus.org/smpeg/",
      "location" : "gnu/packages/video.scm:2539",
      "name" : "libsmpeg",
      "source" : {
         "svn_revision" : 399,
         "svn_url" : "svn://svn.icculus.org/smpeg/trunk/",
         "type" : "svn"
      },
      "synopsis" : "SDL MPEG decoding library",
      "version" : "0.4.5-399"
   },
   {
      "homepage" : "http://netpbm.sourceforge.net/",
      "location" : "gnu/packages/netpbm.scm:39",
      "name" : "netpbm",
      "source" : {
         "svn_revision" : 2965,
         "svn_url" : "http://svn.code.sf.net/p/netpbm/code/advanced",
         "type" : "svn"
      },
      "synopsis" : "Toolkit for manipulation of images",
      "version" : "10.78.3"
   },
   {
      "homepage" : "https://www.ctan.org/pkg/bibtex",
      "location" : "gnu/packages/tex.scm:6860",
      "name" : "texlive-bibtex",
      "source" : {
         "svn_revision" : 49435,
         "svn_url" : "svn://www.tug.org/texlive/tags/texlive-2018.2/Master/texmf-dist//bibtex",
         "type" : "svn"
      },
      "synopsis" : "Process bibliographies for LaTeX",
      "version" : "49435"
   },
   {
      "homepage" : "http://ghmm.org",
      "location" : "gnu/packages/machine-learning.scm:189",
      "name" : "ghmm",
      "source" : {
         "svn_revision" : 2341,
         "svn_url" : "http://svn.code.sf.net/p/ghmm/code/trunk",
         "type" : "svn"
      },
      "synopsis" : "Hidden Markov Model library",
      "version" : "0.9-rc3-0.2341"
   },

@civodul

civodul commented Aug 28, 2019

I noticed the Last-Modified issue as well and opened a bug: https://issues.guix.gnu.org/issue/37207. I suspect we'll just get rid of Last-Modified altogether though, which is not as good as we'd like but the simplest solution for now.

This still does not allow to extract upstream version reliably. For instance, here are 4 cases with svn_revision, but they all handle it differently.

I'm not sure what you mean. I was thinking that, in those cases, the version string that Guix uses is irrelevant to Repology given that you have the revision number (or commit ID). And precisely, in those cases, there is no "upstream version": we're using a snapshot that does not correspond to a release. (Those cases should be quite rare, fortunately.)

Or did you have something else in mind?

@AMDmi3
Member

AMDmi3 commented Aug 28, 2019

I noticed the Last-Modified issue as well and opened a bug: https://issues.guix.gnu.org/issue/37207. I suspect we'll just get rid of Last-Modified altogether though, which is not as good as we'd like but the simplest solution for now.

👍

And precisely, in those cases, there is no "upstream version": we're using a snapshot that does not correspond to a release

Not really. If the closest official version before the snapshot can be extracted (say, 1.2.3 out of 1.2.3-4.5678), it may be used directly: it may (validly) outdate older releases (e.g. 1.2.2), and it may be (validly) outdated by newer releases (e.g. 1.2.4). If extraction is not reliable, in most cases the newest official versions would be wrongly marked as outdated (by e.g. 1.2.3-4 or 1.2.3-5678). Sometimes it's not possible to reliably extract the upstream version, but additional information is available which hints that the version is a snapshot, and this may be the case for Guix. For example, we may ignore a version if it contains svn_rev or the beginning of git_rev; it thus won't produce false positives even if it doesn't follow the expected snapshot pattern. On the downside, an ignored version is not as useful, so it's always desirable to extract the upstream version.

More thoughts here: #345 (comment)
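
The "ignore the version if it contains the revision" heuristic could be sketched in Python roughly as below. The svn_revision field appears in the entries quoted earlier; the git_ref field name and the 7-character prefix are assumptions, and the function name is hypothetical.

```python
def version_mentions_revision(entry):
    """True if the version string embeds the source revision."""
    source = entry.get("source", {})
    version = entry["version"]
    svn_rev = source.get("svn_revision")
    if svn_rev is not None and str(svn_rev) in version:
        return True
    git_ref = source.get("git_ref")  # assumed field name
    if git_ref and len(git_ref) >= 7 and git_ref[:7] in version:
        return True
    return False


# Versions and revisions taken from the four entries quoted earlier.
ENTRIES = [
    {"name": "libsmpeg", "version": "0.4.5-399", "source": {"svn_revision": 399}},
    {"name": "netpbm", "version": "10.78.3", "source": {"svn_revision": 2965}},
    {"name": "texlive-bibtex", "version": "49435", "source": {"svn_revision": 49435}},
    {"name": "ghmm", "version": "0.9-rc3-0.2341", "source": {"svn_revision": 2341}},
]
for entry in ENTRIES:
    print(entry["name"], version_mentions_revision(entry))
```

Only netpbm keeps a usable version here. A plain substring check can also misfire on small revision numbers (a revision of 3 would match "1.2.3"), so a real implementation would likely need a stricter match.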

@civodul

civodul commented Aug 29, 2019

I would argue that, like you wrote, you should ignore version when git_ref or svn_rev is available. You could rebuild a human-readable version from that, for example with git describe.

Now, if you'd rather work on the version string, then I'm afraid we can't offer more than the policy you referred to earlier at https://guix.gnu.org/manual/en/html_node/Version-Numbers.html. Not explicitly mentioned in the policy is the fact that version numbers increase monotonically (per glibc's strverscmp version comparison algorithm); that should allow you to reliably tell whether X is newer than Y.

HTH!
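
For readers unfamiliar with this kind of ordering, here is a simplified Python sketch of a natural-order comparison in which runs of digits compare as numbers. It is not glibc's strverscmp (which has its own rules, for instance around leading zeros and how digits order against other characters); it only illustrates the general idea, and the function name is hypothetical.

```python
import re


def version_key(version):
    """Sort key: digit runs compare numerically, other runs as text."""
    return [
        (0, int(tok), "") if tok.isdigit() else (1, 0, tok)
        for tok in re.findall(r"[0-9]+|[^0-9]+", version)
    ]


# Version strings from the examples discussed in this thread.
print(version_key("1.2.3") < version_key("1.2.3-4.5678"))  # True
print(version_key("1.2.3-4.5678") < version_key("1.2.4"))  # True
```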

@AMDmi3
Member

AMDmi3 commented Aug 29, 2019

I would argue that, like you wrote, you should ignore version when git_ref or svn_rev is available. You could rebuild a human-readable version from that, for example with git describe.

git describe on what? Fetching source repository for each package is obviously not an option.

Well anyway we can start from this. I'll deploy new parser after some more testing and when Last-Modified issue is fixed.

AMDmi3 added a commit that referenced this issue Sep 20, 2019
AMDmi3 added a commit that referenced this issue Sep 20, 2019
@nico202

nico202 commented Oct 2, 2019

@AMDmi3 I saw you are working on this, thanks! Btw, why do you need the Last-Modified? Isn't ETag preferred?

@AMDmi3
Member

AMDmi3 commented Oct 2, 2019

It isn't. Repology uses Last-Modified.

@ghost
Author

ghost commented Oct 4, 2019

A humble request: next time please open a new ticket, github's nice UI took me long enough to not get notification emails about this and I still can't move it out of my own open tickets. So once you all have fixed this issue, open a new one if new issues with Guix arise.

Thanks.

@AMDmi3
Member

AMDmi3 commented Nov 1, 2019

I've deployed it. It won't update because of the still unfixed last-modified: Thu, 01 Jan 1970 00:00:01 GMT problem, but this does not concern me any more. As asked by @ng-0, please open a new ticket if any other problems arise.

@cbaines

cbaines commented May 25, 2020

@AMDmi3

Well anyway we can start from this. I'll deploy new parser after some more testing and when Last-Modified issue is fixed.

So the Last-Modified issue should be somewhat dealt with now, as the header is no longer being set.

→ wget --spider --server-response https://guix.gnu.org/packages.json
Spider mode enabled. Check if remote file exists.
--2020-05-25 10:55:30--  https://guix.gnu.org/packages.json
Resolving guix.gnu.org (guix.gnu.org)... 141.80.181.40
Connecting to guix.gnu.org (guix.gnu.org)|141.80.181.40|:443... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Server: nginx
  Date: Mon, 25 May 2020 09:55:30 GMT
  Content-Type: application/json
  Content-Length: 9824829
  Connection: keep-alive
  Accept-Ranges: bytes
Length: 9824829 (9.4M) [application/json]
Remote file exists.

This has only just changed, so maybe this will lead to more up to date information about Guix, but let me know if more is needed to make that happen.

@AMDmi3
Member

AMDmi3 commented May 25, 2020

Nice work, it took half a year to remove a header. Anyway, this has fixed guix update, as can be seen on graphs:

AMDmi3 added a commit that referenced this issue May 27, 2020
danjelalura pushed a commit to danjelalura/guix-artwork that referenced this issue Jul 22, 2020
Reported at <repology/repology-updater#218 (comment)>.

* website/apps/packages/builder.scm (packages-json-builder): Do not emit
"homepage" when it's false.