Implement registry and registry cache cleaning by ianpittwood · Pull Request #253 · posit-dev/images-shared

ianpittwood · 2025-10-28T19:48:13Z

Closes #247

github-actions · 2025-10-28T19:51:49Z

Test Results

797 tests: 797 ✅ passed, 0 💤 skipped, 0 ❌ failed
1 suite, 1 file, 8m 59s ⏱️

Results for commit fdf2add.

♻️ This comment has been updated with latest results.

ianpittwood · 2025-11-07T20:48:15Z

Successful CI run: https://github.com/posit-dev/images-shared/actions/runs/19180771318/job/54836720185

bschwedler

This all makes sense to me.

bschwedler · 2025-11-13T19:44:16Z

.github/workflows/ci.yml

+  clean-caches:
+    name: Clean Caches
+    permissions:


Do you anticipate consuming workflows will invoke this as a separate job as is done here?

It could be a separate job or a separate workflow; I'm not sure. I'd lean toward a job that runs at the end of builds to clean up. Did you have a preference?

For repos that trigger CI at least daily, running it there makes a lot of sense.

This job has no needs, so it will continue to clean up old images even if the builds fail. Yay!

.github/workflows/clean-caches.yml

posit-bakery/posit_bakery/registry_management/ghcr/models.py

bschwedler · 2025-11-13T20:32:00Z

posit-bakery/posit_bakery/registry_management/ghcr/clean.py

+        untagged_versions = package_versions.filter_untagged()
+        untagged_old_versions = untagged_versions.filter_older_than(remove_untagged_older_than)


Would it make sense to chain the methods here?
The example below uses the method names without the filter_ prefix.

Suggested change

-        untagged_versions = package_versions.filter_untagged()
-        untagged_old_versions = untagged_versions.filter_older_than(remove_untagged_older_than)
+        untagged_old_versions = package_versions.untagged().older_than(remove_untagged_older_than)

That would select items that are untagged and older than the given time. Don't we want to always remove images that are older than that date regardless of their tagging?

Good point. I was thinking about this in terms of a subfilter.

For the cache, it makes sense to clean up anything older than a certain age.

For production repos we probably want to support this type of subselection. The logic of the larger function should help accomplish this though.
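To make the distinction in this thread concrete, here is a minimal sketch of a chainable filter collection. The `Version` and `Versions` classes, the `timedelta` argument, and the explicit `now` parameter are assumptions for illustration only; they are not the PR's `GHCRPackageVersions` implementation. Chaining `untagged().older_than(...)` selects the intersection (untagged and old), while calling `older_than(...)` alone selects everything past the age limit regardless of tags, which is the behavior discussed for cache cleanup.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Version:
    """Hypothetical stand-in for one package version in a registry."""

    name: str
    tags: tuple
    created_at: datetime


class Versions:
    """Hypothetical chainable collection; every filter returns a new Versions."""

    def __init__(self, items):
        self.items = list(items)

    def untagged(self):
        # Keep only versions that carry no tags.
        return Versions(v for v in self.items if not v.tags)

    def older_than(self, age, now):
        # Keep only versions created before (now - age).
        cutoff = now - age
        return Versions(v for v in self.items if v.created_at < cutoff)

    def names(self):
        return [v.name for v in self.items]


# Time-zone aware timestamps avoid naive/aware comparison errors.
now = datetime(2025, 12, 8, tzinfo=timezone.utc)
versions = Versions(
    [
        Version("tagged-new", ("latest",), now - timedelta(days=1)),
        Version("untagged-new", (), now - timedelta(days=1)),
        Version("tagged-old", ("v1",), now - timedelta(days=30)),
        Version("untagged-old", (), now - timedelta(days=30)),
    ]
)
max_age = timedelta(days=14)

# Subfilter: untagged AND older than the limit.
untagged_and_old = versions.untagged().older_than(max_age, now)

# Cache-style cleanup: anything older than the limit, tagged or not.
all_old = versions.older_than(max_age, now)
```

Because each filter returns a new collection, both selections can be built from the same starting set without mutating it.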

bschwedler

This looks good to me. A few small thoughts, but take 'em or leave 'em!

bschwedler · 2025-12-05T15:32:12Z

posit-bakery/posit_bakery/registry_management/dockerhub/api.py

+    def get_repositories(self, namespace: str = None) -> list[dict]:
+        if namespace is None:
+            namespace = self.identifier
+        target = urljoin(self.BASE_URL, self.ENDPOINTS["repositories"].format(namespace=namespace))


What do you think about making this a class method?

Suggested change

-        target = urljoin(self.BASE_URL, self.ENDPOINTS["repositories"].format(namespace=namespace))
+        target = urljoin(self.BASE_URL, self.endpoints("repositories", namespace=namespace))
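A minimal sketch of what such a class method could look like follows. The class name `RegistryAPI` and the "repositories" template are assumptions; only the "tag" template is quoted elsewhere in this review, and the method that was eventually merged may differ in name and behavior.

```python
class RegistryAPI:
    """Hypothetical API wrapper; only the endpoint lookup is sketched."""

    # The "tag" template appears in the reviewed diff.
    # The "repositories" template is an assumption made for this example.
    ENDPOINTS = {
        "repositories": "/namespaces/{namespace}/repositories",
        "tag": "/namespaces/{namespace}/repositories/{repository}/tags/{tag}",
    }

    @classmethod
    def endpoints(cls, name, **params):
        """Return the named endpoint path with its placeholders filled in."""
        try:
            template = cls.ENDPOINTS[name]
        except KeyError:
            raise ValueError(f"unknown endpoint: {name!r}") from None
        return template.format(**params)


repos_path = RegistryAPI.endpoints("repositories", namespace="posit")
tag_path = RegistryAPI.endpoints(
    "tag", namespace="posit", repository="example", tag="1.0"
)
```

Centralizing the lookup keeps the dictionary access and `str.format` call in one place, so callers pass only the endpoint name and its parameters.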

bschwedler · 2025-12-05T15:40:49Z

posit-bakery/posit_bakery/registry_management/ghcr/api.py

+        """Get details on a package."""
+        target_url = self.ENDPOINTS["package"].format(organization=organization, package=quote(package, safe=""))
+        log.debug(f"GET {target_url}")
+        headers, response = self.client.requester.requestJsonAndCheck(


nit: unused variable

Suggested change

-        headers, response = self.client.requester.requestJsonAndCheck(
+        _, response = self.client.requester.requestJsonAndCheck(

bschwedler · 2025-12-05T15:41:03Z

posit-bakery/posit_bakery/registry_management/ghcr/api.py

+                organization=organization, package=quote(package, safe="")
+            )
+            log.debug(f"GET {target_url} (page {page}/{page_count})")
+            headers, response = self.client.requester.requestJsonAndCheck(


nit: unused variable

Suggested change

-            headers, response = self.client.requester.requestJsonAndCheck(
+            _, response = self.client.requester.requestJsonAndCheck(

bschwedler · 2025-12-05T15:54:40Z

posit-bakery/posit_bakery/registry_management/dockerhub/api.py

+        "tag": "/namespaces/{namespace}/repositories/{repository}/tags/{tag}",
+    }
+
+    def __init__(self, identifier: str = None, secret: str = None):


There are several places in this PR where we should use the union type for nullable params.

Suggested change

-    def __init__(self, identifier: str = None, secret: str = None):
+    def __init__(self, identifier: str | None = None, secret: str | None = None):

ianpittwood marked this pull request as ready for review November 7, 2025 20:49

ianpittwood requested a review from bschwedler as a code owner November 7, 2025 20:49

bschwedler approved these changes Nov 13, 2025

ianpittwood requested a review from bschwedler November 20, 2025 20:57

bschwedler approved these changes Dec 5, 2025

bschwedler reviewed Dec 5, 2025

ianpittwood added 22 commits December 8, 2025 13:00

Implement a flow for deleting dangling caches in GHCR

60ad6ae

Add a clean flow for normal registries

b7466f1

Add a lightweight Dockerhub API wrapper

a523f35

Add a clean_registry function for Dockerhub

19b8cbb

Refactor GHCR implementation as an API wrapper class

9994aa1

Update clean implementation for refactor

a8c0d0d

Add clean command for cache cleanup

b433c17

Add tests for config.clean_caches

4627bb7

Add dry run option

443491f

Bug fixes

cf3d908

Add shared workflow and append to CI

9a0d937

Fix quoting and operators

a7b35ab

Fix command name

c956d42

Temporarily override version for CI pull

2c3642e

Fix tests for time-zone aware DT patch

94531cd

Add defaults on ternaries

12d7eb7

Set GITHUB_TOKEN for cache cleaning

ef22102

Revert version setting on clean-caches job

b3ba986

Line separate arguments

e07f641

Change target version for testing

51b3483

Revert "Change target version for testing"

eee1a56

This reverts commit 36f2745.

Remove filter prefix from GHCRPackageVersions functions

2cd4a39

ianpittwood and others added 4 commits December 8, 2025 13:00

Update .github/workflows/clean-caches.yml

2f9485a

Co-authored-by: Benjamin R. J. Schwedler <ben@posit.co>

Use endpoint as a classmethod

5542c49

Fix type hints on API classes

a84e751

Fix poetry.lock conflicts

5ce8332

ianpittwood force-pushed the clean-command branch from 14c71a6 to 5ce8332 Compare December 8, 2025 20:02

Fix multiplatform_plan.json datetimes

fdf2add

ianpittwood merged commit a9dd4ab into main Dec 8, 2025
6 of 7 checks passed

ianpittwood deleted the clean-command branch December 8, 2025 21:01
