Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Build links to Docker images for Chart.yaml #6963

Merged
merged 18 commits into from
Feb 9, 2023

Conversation

opello
Copy link
Contributor

@opello opello commented Feb 4, 2023

Description

The current Chart.yaml sources generation logic does not generate non-URLs but does generate quite a few invalid URLs. This is a continuation of improving the links to help attribution and discoverability.

⚒️ Fixes

Various broken Docker image links across several container registry hosts.

⚙️ Type of change

  • ⚙️ Feature/App addition
  • 🪛 Bugfix
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 🔃 Refactor of current code

🧪 How Has This Been Tested?

By placing the sync_tag function in a separate file and running it against several individual charts as well as all of the charts like the CI job would.

Example of the changes after this has run:

diff --git a/charts/stable/plex/Chart.yaml b/charts/stable/plex/Chart.yaml
index 28afd14c27..f0a7615d14 100644
--- a/charts/stable/plex/Chart.yaml
+++ b/charts/stable/plex/Chart.yaml
@@ -19,8 +19,7 @@ maintainers:
 name: plex
 sources:
   - https://github.com/truecharts/charts/tree/master/charts/stable/plex
-  - https://ghcr.com/ghcr.io/onedr0p/plex
-  - https://hub.docker.com/ghcr.io/onedr0p/plex
+  - https://ghcr.io/onedr0p/plex
   - https://github.com/k8s-at-home/container-images/pkgs/container/plex
 type: application
 version: 12.0.10
diff --git a/charts/enterprise/traefik/Chart.yaml b/charts/enterprise/traefik/Chart.yaml
index f5dd7834c1..6993b57a3e 100644
--- a/charts/enterprise/traefik/Chart.yaml
+++ b/charts/enterprise/traefik/Chart.yaml
@@ -19,7 +19,7 @@ maintainers:
 name: traefik
 sources:
   - https://github.com/truecharts/charts/tree/master/charts/enterprise/traefik
-  - https://hub.docker.com/traefik
+  - https://hub.docker.com/_/traefik
   - https://github.com/traefik/traefik
   - https://github.com/traefik/traefik-helm-chart
   - https://traefik.io/

The diffstat reports 738 files changed, 727 insertions(+), 739 deletions(-) so there are still quite a few "bad" URLs.

📃 Notes:

Supersedes #6821.

There is a change of indent to be consistent with what seemed prevalent in the file. There is a move from an if-ladder to a case statement for nicer string matching. There is the use of parameter expansion to manipulate strings. Any of these might be worth careful review or an alternative, more accessible approach.

✔️ Checklist:

  • ⚖️ My code follows the style guidelines of this project
  • 👀 I have performed a self-review of my own code
  • #️⃣ I have commented my code, particularly in hard-to-understand areas
  • 📄 I have made corresponding changes to the documentation
  • ⚠️ My changes generate no new warnings
  • 🧪 I have added tests to this description that prove my fix is effective or that my feature works
  • ⬆️ I increased versions for any altered app according to semantic versioning

➕ App addition

If this PR is an app addition please make sure you have done the following.

  • 🪞 I have opened a PR on truecharts/containers adding the container to TrueCharts mirror repo.
  • 🖼️ I have added an icon in the Chart's root directory called icon.png

Please don't blindly check all the boxes. Read them and only check those that apply.
Those checkboxes are there for the reviewer to see what is this all about and
the status of this PR with a quick glance.

Signed-off-by: Dan Christensen <opello@opello.org>
The explanation is also meant to remind anyone that sees it that the
code could inadvertently remove a sources sequence entry that was
intentionally added, because it can not tell.

Signed-off-by: Dan Christensen <opello@opello.org>
Signed-off-by: Dan Christensen <opello@opello.org>
This is a faithful move from the if-ladder to a case statement that
preserves the existing behavior, with optimization to follow.  The
behavior of the function before and after this change is the same.

Signed-off-by: Dan Christensen <opello@opello.org>
No "container source" entry from description_list.md has a scheme.  The
values are parsed from the Dockerfiles and would not have one there
either.

Signed-off-by: Dan Christensen <opello@opello.org>
Parse the tccr.io prefix specifically instead of just checking for the
substring tccr which could result in a false positive.

The generated link was also going to point to a truecharts subdirectory
under mirror in the containers repository that does not exist.

Signed-off-by: Dan Christensen <opello@opello.org>
Parse the lscr.io prefix specifically instead of just checking for the
substring lscr which could result in a false positive.

The generated link would also return a 404 because the web interface
requires the image name to be passed in the query string.

Signed-off-by: Dan Christensen <opello@opello.org>
Parse the gcr.io prefix specifically instead of just checking for the
substring gcr which could result in a false positive.

Signed-off-by: Dan Christensen <opello@opello.org>
The intent of this code is to generate URLs to be included in
documentation to attribute inputs to the chart.  If a publicly
accessible URL can not be generated from the image name it makes sense
to not add anything and instead rely on a manual edit to the Chart.yaml.

Signed-off-by: Dan Christensen <opello@opello.org>
There does not seem to be a general purpose web index to the azurecr.io
hosted images.

Signed-off-by: Dan Christensen <opello@opello.org>
Signed-off-by: Dan Christensen <opello@opello.org>
Parse the public.ecr.aws prefix specifically instead of just checking
for the substring public.ecr.aws which could result in a false positive.

Signed-off-by: Dan Christensen <opello@opello.org>
There does not seem to be a general purpose web index to the ocir.io
hosted images.

Signed-off-by: Dan Christensen <opello@opello.org>
From the perspective of linking to image details on the Docker Hub web
interface, there are two types of images:

  1. Docker Official Images
  2. all of the other images, regardless of their trustworthiness

The Docker Official Images can be referenced several ways, either on the
command line when passed to docker pull, or in the FROM instruction of a
Dockerfile:

  * busybox
  * library/busybox
  * docker.io/busybox
  * docker.io/library/busybox

Furthermore, over the years there have been several domains used for the
official Docker Hub registry:

  * docker.io
  * index.docker.io
  * registry-1.docker.io
  * registry.hub.docker.com

The goal here is handling each possible case, which makes Docker Hub
images more complex than the handling for other registries.

It also makes the case block's '*' (default) case harder to find in the
sequence of glob expressions, but this is necessary to avoid repeating
the parsing or adding another helper function.

Reference:
docker/hub-feedback#2113
docker/cli#3793

Signed-off-by: Dan Christensen <opello@opello.org>
Signed-off-by: Dan Christensen <opello@opello.org>
Signed-off-by: Dan Christensen <opello@opello.org>
By assuming image names that are not handled by other cases are Docker
Hub images there is a risk of generating bad links.  Minimize this risk
by not generating a link if the image name for a Docker Hub link has two
slashes.  This is a case that should not happen and would likely mean an
unsupported registry is being used.

There is still a risk of an unsupported registry being treated as Docker
Hub and an invalid link being generated.  That case is if the domain and
image name is example.com/busybox where there is only one slash.

Signed-off-by: Dan Christensen <opello@opello.org>
Sort the cases from longest to shortest prioritizing any case with a
suffix only glob over any case with a prefix glob.  The intention is to
avoid having a case that can not be reached.

The combined Docker Hub and default case is last.  It might make sense
to split the default case handling off but it does not seem to be a
problem right now.

Signed-off-by: Dan Christensen <opello@opello.org>
@truecharts-admin truecharts-admin added the size/M Categorises a PR that changes 30-99 lines, ignoring generated files. label Feb 4, 2023
Copy link
Contributor Author

@opello opello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed sync_tag also exists in renovate-bump.yaml. I'm not sure how difficult it would be to share it as a real script from the workflows directory, but that seems like a nice option. Otherwise, maybe a job template. But that seems pretty ugly.

Either way, as another commit in this PR or something soon after the synchronization of the two should probably be addressed.

Comment on lines +99 to +121
curr_sources=$(
go-yq '
.sources[] |
select(
. != "https://github.com/truecharts*" and
. != "https://ghcr*" and
. != "docker.io*" and
. != "https://docker.io*" and
. != "https://hub.docker*" and
. != "https://fleet.*" and
. != "https://github.com/truecharts/containers/tree/master/mirror/*" and
. != "https://public.ecr.aws*" and
. != "https://ocir.io*" and
. != "https://gcr*" and
. != "https://azurecr*" and
. != "https://quay*" and
. != "https://lscr*" and
. != "https://github.com/truecharts/containers*" and
. == "http*"
)
' \
"${chart}/Chart.yaml"
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be controversial ...

;;
# There have been a number of domains used for the Docker Hub registry over the years.
# NOTE: This is also the default case!
docker.io/*|index.docker.io/*|registry-1.docker.io/*|registry.hub.docker.com/*|*)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default case is a little sneakily hidden here. I don't think it's awesome but it minimizes repetition without adding a new function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okey with that tbh.

# Bail out if the image name has more than 1 slash.
if [ ${#slashes} -gt 1 ]; then
prefix=""
echo "WARNING: Not assuming '$container' is a Docker Hub image"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduced printing this warning but didn't redirect it to stderr as I wasn't sure of how such things should be handled, but it seemed appropriate feedback to generate. But maybe it belongs somewhere else, like where the Dockerfile gets added to the containers repository instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to warn about this here :)

@truecharts-admin truecharts-admin added precommit:ok CI status: pre-commit validation successful lint:ok CI status: linting successful install:ok CI status: install successful labels Feb 4, 2023
@Ornias1993 Ornias1993 self-requested a review February 4, 2023 08:12
@Ornias1993 Ornias1993 self-assigned this Feb 4, 2023
@Ornias1993 Ornias1993 merged commit 0d4e9d0 into truecharts:master Feb 9, 2023
@truecharts-admin
Copy link
Collaborator

This PR is locked to prevent necro-posting on closed PRs. Please create a issue or contact staff on discord if you want to further discuss this

@truecharts truecharts locked as resolved and limited conversation to collaborators Feb 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
install:ok CI status: install successful lint:ok CI status: linting successful precommit:ok CI status: pre-commit validation successful size/M Categorises a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants