Various improvements to tools/{analyze-layer-reuse.py,inspect-layers.sh} #69
These changes came out of a session with Opus to generalize this script a bit more so it doesn't have the magic "auto-discovery based on tag" behaviour anymore, but instead you just pass in all the images you want in the order of "updates" directly on the CLI. That makes it more intuitive to use and generally useful.

While we're here, simplify a bunch more things like dropping the "compare originals" feature. That can just be a separate invocation essentially.

Also support the fallback to image history for component info similar to what we did for `inspect-layers.sh`.

And finally, rename `--show-components` to `--show-changed-components` and add a `--show-unchanged-components`, which is also useful to know sometimes. Sort those components by layer size.

Assisted-by: Claude Opus 4.6
Show total compressed size alongside the layer count in the header.

Assisted-by: Claude Opus 4
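For readers who want a feel for the CLI shape described in the first commit message, here is a minimal sketch. It is not the script's actual code: the positional `images` argument, `build_parser`, and `update_pairs` are hypothetical names chosen for illustration; only the two `--show-*-components` flags come from the commit message.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: images are passed explicitly, in update order."""
    parser = argparse.ArgumentParser(
        description="Analyze layer reuse across a sequence of image updates."
    )
    # Assumption: one or more positional image references, oldest first.
    parser.add_argument("images", nargs="+", metavar="IMAGE")
    # Flag names taken from the commit message; semantics are assumed.
    parser.add_argument("--show-changed-components", action="store_true")
    parser.add_argument("--show-unchanged-components", action="store_true")
    return parser


def update_pairs(images):
    """Pair each image with its successor, i.e. one pair per 'update'."""
    return list(zip(images, images[1:]))


args = build_parser().parse_args(
    ["img:v1", "img:v2", "img:v3", "--show-changed-components"]
)
pairs = update_pairs(args.images)
```

Comparing two unrelated images (the dropped "compare originals" case) would then simply be a separate invocation with those two images as arguments.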
Code Review
This pull request introduces significant improvements to analyze-layer-reuse.py and a minor enhancement to inspect-layers.sh. The Python script is refactored to be more generic, accepting a list of images instead of discovering them by prefix. It now calculates data reuse based on layer sizes (bytes) rather than layer counts, which is more accurate. It also adds a fallback mechanism to read component information from OCI history for older images. The shell script is updated to display the total image size.
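To illustrate why byte-weighted reuse can differ from count-based reuse, here is a small sketch. It assumes each image is represented as a mapping from layer digest to compressed size in bytes; `reuse_stats` and the sample digests are hypothetical and not taken from the script.

```python
def reuse_stats(old_layers: dict, new_layers: dict) -> dict:
    """Compare reuse by layer count and by bytes for one update.

    Both arguments map layer digest -> compressed size in bytes
    (an assumed representation, not the script's real data model).
    """
    shared = [digest for digest in new_layers if digest in old_layers]
    total_bytes = sum(new_layers.values())
    reused_bytes = sum(new_layers[digest] for digest in shared)
    return {
        "layer_reuse": len(shared) / len(new_layers) if new_layers else 0.0,
        "byte_reuse": reused_bytes / total_bytes if total_bytes else 0.0,
    }


# One large layer changes while two small layers are reused.
old = {"sha256:aaa": 900, "sha256:bbb": 50, "sha256:ccc": 50}
new = {"sha256:ddd": 900, "sha256:bbb": 50, "sha256:ccc": 50}
stats = reuse_stats(old, new)
```

In this made-up case two of three layers are reused, but they account for only 100 of 1000 bytes, so a client would still download most of the image.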
My review identifies a couple of areas for improvement in the Python script. One is a logic issue in how component sizes are aggregated, which could lead to misleading output. The other is a suggestion to refactor away from using a global variable for better maintainability. The changes are otherwise excellent and greatly improve the utility of these tools.
See individual commit messages.