Dockerfile: Use a fork of ComboStrike's base image, since it disappeared. #729
Merged
ComboStrike's docker-rails base image doesn't seem to exist on Docker Hub anymore, so our production image builds are failing. This switches the base image to a fork of their image that I created through some very sketchy methods.

Since we don't have access to the base image anymore, the best I could do was take one of our most recent askdarcel-api prod images and dissect it in order to recover the ComboStrike image. The trickiest part is that the original ComboStrike image contained `ONBUILD` commands, which are only executed when some other image uses the ComboStrike image as a base image. These `ONBUILD` commands are no longer part of the Docker image after we build our prod image, so I had to restore them manually.

The full process I followed to create this image was the following:

- Download the most recent `latest`-tagged askdarcel-api image with `docker pull sheltertechsf/askdarcel-api`. In retrospect, I should have picked a tag that was a numbered version, but for posterity, this was https://hub.docker.com/layers/sheltertechsf/askdarcel-api/latest/images/sha256-8aad42d50d705c9a9f1c7178fa7a7c7dfb7839315002305d15b0bfba6a7cebd2?context=explore
- Export it to a tarball: `docker save -o image.tar sheltertechsf/askdarcel-api:latest`
- Unpack the tarball
- Open up `manifest.json` and note the SHA specified under the `Config` key
- Open up the JSON file corresponding to that `Config` key:
  - Count the number of layers as well as the `history` items that do not have `"empty_layer": true`
  - Identify the layer corresponding to the ComboStrike base image by carefully reading the `created_by` keys and finding the right one based on what our Dockerfile actually runs and which commands appear in ComboStrike's open source Dockerfile, which is still on GitHub
  - Delete all the `history` items that correspond to layers that are not part of the ComboStrike base image
  - Delete all the `rootfs` `diff_ids` that correspond to layers that are not part of the ComboStrike base image
  - The original `OnBuild` key was set to `null`. Replace this with an array of strings where each string is one of the `ONBUILD` commands in ComboStrike's original Dockerfile. I reverse engineered the format by building a different Dockerfile that used `ONBUILD` commands and seeing how the `OnBuild` key in its config JSON file was formatted.
- Save this config JSON file
- Delete all the actual image layer files (which are tar files with the same SHAs as the layers) for the layers that were removed
- Back in `manifest.json`, delete the layers that have been removed
- tar everything back up, being careful to erase the user IDs and group IDs for every file, since not doing this causes problems
- Import it back into Docker: `docker load -i newimage.tar`
- Attempt to build the prod image
- And fail on my M2 MacBook Pro due to a segfault when installing Ruby gems
- Tag the image and upload it to Docker Hub
- Download the image onto my old Intel MacBook, and try building the prod image again, this time successfully

References:

- https://github.com/ComboStrikeHQ/docker-rails/blob/6cbd1f229455086288f6e49065da708391520940/onbuild/Dockerfile

  This is the open source copy of ComboStrike's Dockerfile, but it itself starts with another base image that I'm not exactly sure how to recreate, so I thought it would be easier to recover it by modifying our prod image. This was useful for grabbing all the `ONBUILD` commands.
- https://medium.com/htc-research-engineering-blog/restore-rollback-layers-from-docker-image-a4e4b117d7e6

  This was a super useful guide for explaining how to open up and modify an existing Docker image. I still had to do a few extra things on top of what was in the guide, but it was really useful for figuring out what the major steps were.
- http://h2.jaguarpaw.co.uk/posts/reproducible-tar/

  When packing the edited image back into a tar file, I initially had issues using it as a base image because the file ownership wasn't correct: the files ended up with my own user's ID. I used the commands in this guide for reproducible tar files to erase the user ID.
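For anyone repeating the process, the JSON edits and the re-tar step described above could be sketched in Python roughly as follows. This is not the procedure that was actually run (that was done by hand plus the `tar` commands from the reproducible-tar guide), only an illustration under some assumptions: the function names are made up, `OnBuild` is assumed to live under the top-level `config` object of the config JSON, and "erasing" the owner is assumed to mean resetting the IDs to 0. The number of `history` entries to keep still has to be found by reading the `created_by` values manually.

```python
import copy
import os
import tarfile


def layer_count(history):
    """Count history entries that actually produced a filesystem layer."""
    return sum(1 for entry in history if not entry.get("empty_layer", False))


def trim_config(config, keep_history, onbuild):
    """Return (trimmed config, kept layer count) without mutating `config`.

    keep_history: number of leading `history` entries that belong to the
    base image (determined by hand from the `created_by` values).
    onbuild: list of strings to store under `OnBuild` (format is an
    assumption; verify it against a throwaway image that uses ONBUILD).
    """
    trimmed = copy.deepcopy(config)
    history = trimmed["history"]
    diff_ids = trimmed["rootfs"]["diff_ids"]

    # Sanity check: non-empty history entries should map 1:1 onto diff_ids.
    if layer_count(history) != len(diff_ids):
        raise ValueError("history and rootfs.diff_ids disagree on layer count")

    trimmed["history"] = history[:keep_history]
    keep_layers = layer_count(trimmed["history"])
    trimmed["rootfs"]["diff_ids"] = diff_ids[:keep_layers]
    # Assumption: OnBuild sits under the top-level "config" object.
    trimmed.setdefault("config", {})["OnBuild"] = list(onbuild)
    return trimmed, keep_layers


def trim_manifest(manifest, keep_layers):
    """Drop removed layers from every entry of a `docker save` manifest.json."""
    trimmed = copy.deepcopy(manifest)
    for entry in trimmed:
        entry["Layers"] = entry["Layers"][:keep_layers]
    return trimmed


def reset_owner(tarinfo):
    """tarfile filter: replace the packer's user/group with neutral values."""
    tarinfo.uid = 0
    tarinfo.gid = 0
    tarinfo.uname = ""
    tarinfo.gname = ""
    return tarinfo


def pack_image(src_dir, out_path):
    """Tar up the edited image directory with ownership reset on every file."""
    with tarfile.open(out_path, "w") as tar:
        for name in sorted(os.listdir(src_dir)):
            tar.add(os.path.join(src_dir, name), arcname=name, filter=reset_owner)
```

In use, the config and manifest would be read with `json.load`, passed through `trim_config` and `trim_manifest`, written back with `json.dump`, and the directory packed with `pack_image` before running `docker load -i newimage.tar`. The layer files that were removed still need to be deleted from the unpacked directory first.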
schroerbrian
approved these changes
Jan 3, 2024
Finally getting back to this after the holiday haze. This is amazing! Thank you for fixing this and for the detailed explanation. I don't think I'd have been able to do it myself.
Thanks for reviewing it! Hope that the deploys to staging and prod work now.