Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

preview of changes done by shpc update #541

Merged
merged 12 commits into from
Apr 8, 2022
Merged

Conversation

vsoch
Copy link
Member

@vsoch vsoch commented Apr 5, 2022

I'm opening this against #538 so we only see changes to the container.yaml files!

I'll look this carefully over tonight and make comments about any questions or issues that I see - just decided I was very hungry and dinner sounds good :)

muffato and others added 3 commits April 1, 2022 17:45
…d to ask the user (#528)

* Only ask for confirmation once
* The .version file is only in module_dir, not container_dir
* When deleting containers, also delete the parent directories as long as they are empty
`shpc add` now supports Docker
@vsoch vsoch mentioned this pull request Apr 5, 2022
@vsoch
Copy link
Member Author

vsoch commented Apr 5, 2022

okay already have a bug! The previous version of python shouldn't have been removed (causing the error here) I'll debug after dinner.

Copy link
Contributor

@muffato muffato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments. Overall it looks quite nice ! It feels good to see the updates coming through,.

By the way, for all software that are available in both https://hub.docker.com/r/biocontainers/ and https://quay.io/repository/biocontainers/ I would remove the former from the registry. As far as I could see, only the latter is regularly updated from BioConda

registry/golang/container.yaml Outdated Show resolved Hide resolved
3.10.0a7-alpine: sha256:9b7958e47cd5bd4d092c3b28802493ad1870bce988b2f6ff97f6c81d96fcda80
3.9.10: sha256:3aae21920963df3205fba69826cc07fcf2fad91f9e062add921766b36e89e6e8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you have the 3.9.* filter. Do you think it'd be possible to configure the updated to get the latest version of each 3.x ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep will change to 3[.]*I was trying to pin to just 3 so I would weed out most of the namespace.

Comment on lines 8 to +14
2.0.14--hdcc8f71_0: sha256:275f3b3c587f8a40a39693db8acd91da8be6f053ff1426863da22061bd4e7957
0.9.36--h56fc30b_0: sha256:0f80076288d495263598b24d8684da078a86f359bd58e39958cb491abd817399
0.8.36--h8b12597_4: sha256:13476bcb2a7077eae93db47fcc9f7f7662fdcb06193dc95139c7a6cf7c2de824
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks funny but that's normal. They skipped v1.* 🤷🏼
But why are they no other 2.0.* versions ? For a given x.y stem does it only select the highest (z,t) – assuming the pattern x.y.z--*_t ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I'm going to call "too many tag" syndrome. Because crane only gives us the top 50, we only see a sample and not any with 2.0. This is an example that would need to have them manually added. https://crane.ggcr.dev/ls/quay.io/biocontainers/diamond

I saved the original registry this time so I should not need to open a new PR for every attempt

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
@vsoch
Copy link
Member Author

vsoch commented Apr 5, 2022

Bless all the developers that use semantic versions! lol

* adding more clear error message when tag is not known
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
Copy link
Contributor

@muffato muffato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's looking good ! Just a few questions regarding which tags are selected

v1.139_cv3: sha256:05a6dd401d47c930191fa8c017b2cf60c099c4da236f134586a2f3610e426229
2.3.0: sha256:5e1bc7788a3aa1329e59821e3f89b5bb209372926a67ed1a95f7b9a71350e2be
v2.3.0_cv3: sha256:8624a229b70fc495e29f55f67ec630e30fa43db3f2a5e276e0e0a68551fee0ea
'1.141': sha256:675128282859c9edf1c8679478c828a48d980a4e73b95abba6a087d315d01cfc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about this. Without the quotes, the yaml parser would transform this to a float ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the tags should be parsed as strings and the pieces are split into int.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if the quotes are unnecessary, can they be removed ? I find it disturbing that some tags have quotes and some don't

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The quotes are added by the yaml parser, and we use a specific one that preserves comments. I don't know any way to remove them, but I don't think they hurt. Is there a reason to not want them there other than aesthetic?

registry/elasticsearch/container.yaml Outdated Show resolved Hide resolved
registry/ghcr.io/autamus/gawk/container.yaml Outdated Show resolved Hide resolved
Comment on lines 17 to 23
7.4.0: sha256:5248d0e8f7f6ad2095c3a053d5461daa17b02097410f0e9f6397f8f4dedc34bf
latest: sha256:3c50ce7b943306ede1ca0d502a41a8861014ee5377352557716fcc9a8f877992
'7': sha256:3c50ce7b943306ede1ca0d502a41a8861014ee5377352557716fcc9a8f877992
7-jdk17: sha256:3c50ce7b943306ede1ca0d502a41a8861014ee5377352557716fcc9a8f877992
7-jdk16: sha256:f174c0dcf9a84b4035f1fcb62f0340ddc69c0b93320e0d35f097d20ce2ca89d5
7-jdk11: sha256:d920da5cedc06e09c031b748673f2f99456fe8a5d3f437e211fd3cf0ca859fcc
7-jdk-openj9: sha256:acd908af42e1bee2f841eeac031d41317ab3fddcc3ec9d0d4e1cb4b28be24f5f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one looks strange. First of all, you said that crane only gives the top 50. But for gradle https://crane.ggcr.dev/ls/gradle gives a ton of tags, including more in the 7.* series.

  1. Is 50 the limit imposed by quay.io maybe ?
  2. Why didn't the update pick up the others 7.* ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be the same bug as before with allowing the duplicates - let me run again and we will see if it's still an issue.

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
@vsoch
Copy link
Member Author

vsoch commented Apr 6, 2022

okay that did something bad because now it looks like the latest isn't accurate.

@vsoch
Copy link
Member Author

vsoch commented Apr 6, 2022

I'll need to work on this after work, it's going to be more than a quick push.

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
@@ -3,10 +3,15 @@ url: https://hub.docker.com/r/jupyter/datascience-notebook
maintainer: '@vsoch'
description: Jupyter Datascience Notebook from https://github.com/jupyter/docker-stacks
latest:
"4.0": sha256:5e5bf78bfa351c0255cea5269a0461afbf6b50b51e923a7436229208ea8487f9
b78201340369: sha256:027e159b6780294aa930623f0f7e6c5094972be48d5e15027024a811fdc7e5c7
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@muffato this looks like a red flag - I'm wondering if we should try to filter out patterns that look like commits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. This looks like a commit or a checksum. It's not something that shpc could rank against the other tags
(especially this particular one is 2 year old https://hub.docker.com/layers/datascience-notebook/jupyter/datascience-notebook/b78201340369/images/sha256-027e159b6780294aa930623f0f7e6c5094972be48d5e15027024a811fdc7e5c7?context=explore whereas other tags are less then a week old. Hopefully they're legacy tags from a time when they didn't have proper versioning in place ?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And to answer #541 (comment)

Let's maybe allow strings of length <=8 made of digits only because some project may use timestamps e.g. 20220406.

Option 1. To exclude commit IDs: ignore the strings of length > 8 that are only composed of digits + hex characters only.

Option 2. Am I understanding right that you need some integers to be able to rank the versions ? You could do something after the parsing and splitting. Could we say that the tuple must start with at least 1 integer and that the lengths of the integers must be 8 characters at most ? So it would:

  • allow '4.b78201340369' because it looks like a version 4 something, which can still be compared against other version numbers
  • exclude all commit IDs that include a letter
  • include strings that are entirely made of digits as long as they're not too long

@vsoch
Copy link
Member Author

vsoch commented Apr 6, 2022

okay I think this is more sane - I had a bug that was removing most versions :P I think we might still want to try and remove "stuff that looks like a commit hash" in which case we should figure out a pattern for that, some length of lowercase letters and numbers I think?

@vsoch
Copy link
Member Author

vsoch commented Apr 6, 2022

Don't worry about the failure for now - it's a test that needs to be updated given we change a version.

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
@vsoch
Copy link
Member Author

vsoch commented Apr 7, 2022

okay just pushed an updated version that does a pretty good job filtering out the commit tags I think.

@muffato
Copy link
Contributor

muffato commented Apr 7, 2022

The quotes are added by the yaml parser, and we use a specific one that preserves comments. I don't know any way to remove them, but I don't think they hurt. Is there a reason to not want them there other than aesthetic?

OK, I understand that the need to use that parser trumps the quoting style it uses. No worries

Copy link
Contributor

@muffato muffato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, @vsoch . A last batch of comments :)

Comment on lines +18 to 25
5.0.6: sha256:fed6248ae0bb0d54c0448eb786c87120737eedc522172ee1536ad47789782348
latest: sha256:fed6248ae0bb0d54c0448eb786c87120737eedc522172ee1536ad47789782348
'5': sha256:fed6248ae0bb0d54c0448eb786c87120737eedc522172ee1536ad47789782348
5-nanoserver-ltsc2022: sha256:28b9079c157d0f2c23fec6352a37d85e2c20947c721214024b3aea6338341ff7
5-windowsservercore-ltsc2016: sha256:9620d60541d45e913c30f78623ee4b5ec8f650399339982dda11e9cd1d1b9c78
5-nanoserver-1809: sha256:89c4b070fbc207f5b28f69cff5dec9b841ace73c3c7baeb09b598c060f173699
'5.0': sha256:fed6248ae0bb0d54c0448eb786c87120737eedc522172ee1536ad47789782348
aliases:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it inserting the flavours of 5 but not the flavours of 5.0.6 ? Is it because nanoserver etc end up like minor version number ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only consider the major and minor (e.g., 5.0) so if the parser has already seen 5.0 (without a third) it will not include 5.0.6. The idea behind that is that 5.0.6 might be deprecated soon, but the 5.0 will be a "last update of 5.0 with all changes for that version family). This also assumes that the person is not saying that 5.0 == 5.0.0 - if they used 5.0.0 it would be ordered as less than 5.0.6, and that would be chosen first.

2020.06-arm64: sha256:2d881faf0b20866826a2f0ebbd87ed9791b64222489814901a090e5ed1aa3252
2020.06-x86_64: sha256:0bc28fe0b15b172241f785079782447251139ad0ef0b978f3f1483d5caa9e56d
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very glad that this one made it through. It shows that the commit-id filter is not overzealous 👍🏼

Comment on lines 15 to 18
19-ea-10: sha256:1dceb32239eb8a2558be7bbc6fcf53b9759799b99a75f1f3a73f483cc79ac028
19-alpine3.15: sha256:fd8c856fb7e3fc2d93815c4b9fa1d112c8d9f46d01f64c6ea858d33518555e5f
19-ea-15: sha256:7e593cb75001103d515e7f3a1fc81598c5d7753339368399b7b5226816036c7f
19-ea-14: sha256:ad5eab9ef6f4e4d0dd971ea0e94eaf992f4d8fb1853ee5d8dfac7e273f4dfed3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not actively developing with Java, but as far as I can see, -ea means "early access". Would it make sense to completely filter them out ? Otherwise, I think there should ideally only be the last one, i.e. 19-ea-15, but it's probably not worth modifying the shpc code for that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah we can add a custom filter here to avoid tags with ea!

tags:
2.11.0--h5ef6573_1: sha256:6d35d716aa12ba7b0c715fa8a30359b43ba9151e854dc4407b949e7b57c3a50a
2.12.0--h5ef6573_0: sha256:4045560dad56fee946c9bcccd2ba0e373292a5adf8a2f7ba09917cbaeaa74323
2.12.0--ha140323_1: sha256:2736617a96c30bce93a149fe5f47b4f4a290a7d21c1503619430db619d9bc3a0
2.13.2--ha140323_0: sha256:727ae4211b1a2fdfa33c1bb30fbf8ffc231d610c6497fe7150b031fa9a70b7b1
date.2011_11_26--ncurses5.9_8: sha256:eabe346a2adac1bb30aaaa62764f299638daa2a83ef6408722b0f89f8c490e20
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is 4 years old. Can it be filter out ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely.

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
@vsoch
Copy link
Member Author

vsoch commented Apr 7, 2022

okay I wound up doing two changes:

  • remove the 2011 date here d274624
  • redid the openjdk because I think we probably don't need to provide windows containers for hpc :) a0ee49a

Copy link
Contributor

@muffato muffato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you got it 👏🏼

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
@vsoch
Copy link
Member Author

vsoch commented Apr 8, 2022

Agree! And I'm good to tweak the recipes of individual containers as we move forward.

@vsoch vsoch merged commit 586f9e0 into add/updater Apr 8, 2022
@vsoch vsoch deleted the add-and-test-updated branch April 8, 2022 02:47
vsoch added a commit that referenced this pull request Apr 8, 2022
* adding support for update
* preview of changes done by shpc update (#541)

* If the container is deleted, the module has to be deleted too. No need to ask the user (#528)
* Only ask for confirmation once
* The .version file is only in module_dir, not container_dir
* When deleting containers, also delete the parent directories as long as they are empty
* `shpc add` now supports Docker

cf #520 (comment)
* round 2 of attempted updates!
I saved the original registry this time so I should not need to open a new PR for every attempt
* remove debug ipython
* Adding more clear error message when tag is not known (#543)
* adding more clear error message when tag is not known

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants