Feature request: checkpoints #48

Open
zabealbe opened this issue Dec 6, 2021 · 3 comments

Comments


zabealbe commented Dec 6, 2021

The build-try-fix-iterate approach does not really work when building from a .pimod file: rebuilding an image from scratch because of a typo takes far too long.
A very cool feature would be Dockerfile-like checkpoints, or a CHECKPOINT macro to use in the .pimod file.

Member

oxzi commented Dec 7, 2021

I see your point, but I don't know how one could implement this without it becoming too complex for this specific issue.

Back then, we addressed this problem through an iterative approach. Similar to, e.g., Ansible roles, we had a "basic" Pifile extending the vendor's image. Other Pifiles then consumed the already altered image and performed further changes. This is sketched in figure 3 on page 7 of the paper.
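
A minimal sketch of such a chain, assuming two hypothetical Pifiles. The file names, image names and packages are illustrative and not taken from the paper; only FROM, TO and RUN are pimod commands.

```sh
# --- Base.Pifile (hypothetical): extends the vendor's image once ---
FROM vendor.img
TO base.img
RUN apt-get update
RUN apt-get install -y git

# --- App.Pifile (hypothetical): consumes the already altered image ---
# A typo in this file only costs a rebuild of this second step,
# because base.img is reused as-is.
FROM base.img
TO app.img
RUN git --version
```

Each Pifile is built on its own, so the intermediate image acts as a manual checkpoint.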

Of course, I am open to better ideas.

Author

zabealbe commented Dec 9, 2021

In your case, creating a checkpoint at every RUN would be very expensive, as we can't implement the layer mechanism Docker has. My idea is to add a CHECKPOINT directive that saves the image in the .cache and restores it as needed.

So basically it's like having multiple .pimod files daisy-chained, but done automatically and marked explicitly with a CHECKPOINT directive.
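
To make the idea concrete, here is a rough sketch of what such a cache could look like. None of these function names exist in pimod; they are hypothetical. The sketch assumes that commands take effect in file order, that the cache key only needs to cover the Pifile text up to a given CHECKPOINT line (it ignores the base image and any installed files), and that GNU `sha256sum` is available.

```sh
# Hypothetical helpers, not part of pimod.
CACHE_DIR="${CACHE_DIR:-.cache}"

# checkpoint_key PIFILE N
# Hash the Pifile text up to and including the N-th CHECKPOINT line.
checkpoint_key() {
  awk -v n="$2" '{ print } $1 == "CHECKPOINT" && ++seen == n + 0 { exit }' "$1" \
    | sha256sum | cut -d' ' -f1
}

# checkpoint_save IMAGE KEY
# Store a full copy of the image under its key.
checkpoint_save() {
  mkdir -p "$CACHE_DIR"
  cp "$1" "$CACHE_DIR/$2.img"
}

# checkpoint_restore KEY IMAGE
# Copy a cached image back; returns non-zero on a cache miss.
checkpoint_restore() {
  [ -f "$CACHE_DIR/$1.img" ] && cp "$CACHE_DIR/$1.img" "$2"
}
```

With this, editing a line after a CHECKPOINT leaves the key unchanged, so the cached image could be restored, while editing a line before it produces a new key and forces a rebuild.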

oxzi added a commit that referenced this issue Dec 12, 2021
Does not work yet; #48.
Member

oxzi commented Dec 12, 2021

Having experimented a bit, I can say that this doesn't work well with the current architecture.

The fundamental difference between Docker's and pimod's designs is how the data is stored.
Docker uses multiple overlays, one for each of the linear steps within the Dockerfile.
In contrast, pimod does not execute its commands in the order they appear, but within their stage. Between those stages there might be functions which are executed when entering or leaving a stage. Commands of the 30-chroot and 40-postprocess stages might be executed within the guest or the image, respectively, resulting in the image being mounted while pimod is within this stage. Thus, sadly, determinism is difficult to achieve.
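
The difference between file order and stage order can be illustrated with a toy model. This is not pimod's actual code: the `run_stage` helper, the `early` stage name and the assignment of PUMP to that stage are assumptions made for illustration. Only the 30-chroot stage name comes from the description above.

```sh
# A tiny example Pifile in which PUMP appears between two RUN commands.
pifile=$(mktemp)
cat > "$pifile" <<'EOF'
RUN echo first
PUMP 100M
RUN echo second
EOF

# run_stage STAGE (hypothetical): re-read the whole Pifile, but only the
# commands belonging to STAGE do anything; all others are no-ops.
run_stage() {
  stage=$1
  PUMP() { :; }
  RUN() { :; }
  case $stage in
    early)     PUMP() { echo "[$stage] PUMP $*"; } ;;
    30-chroot) RUN()  { echo "[$stage] RUN $*"; } ;;
  esac
  . "$pifile"
}

for s in early 30-chroot; do
  run_stage "$s"
done
```

In this model the PUMP line takes effect before both RUN lines, even though it sits between them in the file. A checkpoint placed "after the first RUN" therefore has no single obvious point in the execution to attach to.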

I tried implementing a simple hash-based cache for those pimod commands which should alter the image. While doing so, I experienced unexpected hash changes, e.g., after entering the chroot. It seems that merely mounting the image introduces non-determinism.

As we are working with complete images, not just overlays, both saving and loading create I/O load. When using a CoW filesystem like btrfs, this can be reduced by telling cp to use the CoW feature. However, on, e.g., an ext4 file system, the caching takes longer than simple RUNs.
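
For reference, a small sketch of such a copy, assuming GNU coreutils `cp`. The `cache_copy` wrapper name is hypothetical. With `--reflink=auto`, cp shares data blocks where the filesystem supports it and silently falls back to a regular full copy elsewhere, which is where the I/O cost remains.

```sh
# cache_copy SRC DST (hypothetical helper)
# Copy an image, using a CoW clone if the filesystem offers one.
cache_copy() {
  cp --reflink=auto -- "$1" "$2"
}
```

Using `--reflink=always` instead would make the copy fail on filesystems without CoW support rather than fall back to a full copy.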

At the moment, I feel that with both the current architecture and the limitations/complexity of Bash, this would be better left unimplemented. However, please feel free to have a look at the non-working checkpoint branch: https://github.com/Nature40/pimod/compare/checkpoint.
