This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
buildah to use cache, layers for bash builds as well #1292
Comments
/cc @jeremyeder |
This is an awesome suggestion! I wrote a tool that uses buildah for building images (with ansible as a frontend) and I had to solve the same issue: caching layers. It would be awesome if buildah had an API for caching. On the other hand, it will be really tricky because you would need to tell buildah up front what the script is and how you are changing the filesystem, and then buildah would need to figure out whether there is a matching entry in the cache. |
@umohnani8 What do you think? Would this even be possible? We might be able to figure out that buildah from fedora Also how would we know that two buildah from fedora calls were related? |
I guess that any of those
|
@umohnani8 Any update on this issue? |
@QiWang19 Can you look into this? |
I'm very interested in this concept, great idea |
unbaked thought: could a design like redo be a good fit? A core idea of There are multiple implementations of redo, recording dependencies in different ways; each puts its own
But maybe a contract like "I'm doing whatever I want with So if we want a flat script, how would a script with arbitrary bash steps skip those steps? Maybe a 2-directional contract, with a buildah command reporting info on cache hit/miss? BTW, is it technically possible for buildah to "fast forward" the same container from current state to an image found in cache? Or would we need |
@cben interested in opening a PR for this? |
I was also looking into this cache idea and actually spent some time playing about with That said, I think for
Using Redo
The short time I used
With this information, you can split a
NAME=stage-name # could be random but keep in mind the lingering images
OUTPUT=$3 # redo passes the temporary output file as $3
CONTAINER=$(buildah from fedora)
# do buildah steps
buildah commit "$CONTAINER" "$NAME"
echo "$NAME" > "$OUTPUT"
This, in itself, is a do file, though it could also be written generically:
and the accompanying build instructions:
A stage leaves behind a file containing the output it made, in
Built in support
Instead of relying on an external tool (although a tool that has many implementations and is not difficult to obtain),
The process of how it works:
So as an example:
export CACHEFILE=.mybuild
buildah from fedora # saves its details (or hash of these details) and fedora. Also creates .mybuild.fedora-working-container
buildah run $CONTAINER -- yum install nginx # saves its details and advances the pointer
buildah copy $CONTAINER www /var/www # saves its details (including hash of the content it copied) and advances pointer
buildah commit end-image # saves its details and also end-image
Now the same thing is executed again:
export CACHEFILE=.mybuild
buildah from fedora # recognizes the start of .mybuild
buildah run $CONTAINER -- yum install nginx # recognizes the command hasn't changed and moves pointer
buildah copy $CONTAINER www/ /var/www # if www hasn't changed, recognizes and moves pointer
buildah commit second-image # recognizes end-image and recommits it as second-image
It must be noted, if
export CACHEFILE=.mybuild
buildah from fedora
buildah run --add-history $CONTAINER -- yum install nginx # now becomes a saved point
buildah copy $CONTAINER www/ /var/www # now if this changes, it can continue from after yum install nginx
buildah commit third-image
With this system, even if instructions diverge, this creates multiple alternative paths that can still resolve to a saved image. |
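The pointer idea described above can be sketched as a plain shell wrapper rather than as a buildah feature. This is only an illustration under stated assumptions: `cache_step`, `CACHE_DIR`, `PREV_KEY`, and `CACHE_RESULT` are hypothetical names, nothing here is an existing buildah API, and `sha256sum` is assumed to be available. Each step's key is the hash of the previous key plus the command, so changing one step invalidates every step after it.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of buildah: chain step hashes so that each
# step's cache key depends on every step before it.

CACHE_DIR="${CACHE_DIR:-.mybuild.d}"   # assumed location for cache records
PREV_KEY="start"                       # the "pointer" into the cache chain
CACHE_RESULT=""                        # set to "hit" or "miss" by cache_step

# cache_step CMD [ARGS...]
# Runs CMD only if this exact step, given all earlier steps, has not been
# recorded before. Call it directly (not inside $(...)) so the pointer
# update survives.
cache_step() {
    local key
    key=$(printf '%s\n' "$PREV_KEY" "$@" | sha256sum | cut -d' ' -f1)
    mkdir -p "$CACHE_DIR"
    if [ -e "$CACHE_DIR/$key" ]; then
        CACHE_RESULT="hit"
    else
        "$@" || return 1               # run the real step; failures are not recorded
        : > "$CACHE_DIR/$key"
        CACHE_RESULT="miss"
    fi
    PREV_KEY="$key"                    # advance the pointer
}
```

The sketch only decides whether a step can be skipped. To actually resume from a hit, each recorded key would also need a committed image to restart from, and a copy step would need the hash of the copied content mixed into its key.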
Note that the |
It might also be noted that for the If the user wants to speed up its rebuild, they should also apply strategic uses of |
Lastly, if this addition to |
Sounds good, I don't know when we can get someone else to look at it. |
I will then try to familiarize myself with the buildah code and see if my idea is somewhat feasible. I'm not sure when I can work on it, but as I'm migrating away from docker and dockerfiles (appreciating the daemonless, rootless abilities of buildah/podman and the ability to use system tools during image steps of buildah and mount dirs (e.g. for extracting test results)), some form of buildah cache would be beneficial. While builds in general seem to be faster, nothing beats a "well I can skip this 10 minute build process as no code changed". I guess it's a bit of a bad example, as that could also be fixed by keeping a dirty build directory around, and others could be fixed by a cache mount, but for the sake of repeatability, the cache can help without resorting to mounts and dirty build dirs that have their share of potential issues. I predict it will take me some time, and I made some assumptions that I might run up against when trying to implement this, but I'll keep this issue updated with any news I have on that topic. |
@dsonck92 Did you ever get a chance to work on this? |
Not yet as I'm pretty occupied with other things but I'm considering picking this up in December. |
I've tried playing around a little with the
Meanwhile, using a
My belief that buildah's image cache does not reuse layers arises from some experimentation I did myself with this approach (where I found that simply cloning an image, e.g. My understanding is that Docker's cache does not suffer from this issue because its cache does store things in a layered manner, unlike |
I recommend not over-engineering here. Ask yourself why you really need a cache or layers. |
Well, meanwhile I changed my personal build system quite a bit. I'm using GitLab's ability (other CIs probably have similar features) to run build steps in a container directly. This made it possible to extract the different stages and reuse them later on, making the buildah step essentially a "from, add, commit". This lowered the complexity of builds considerably and essentially runs the build steps through podman on kubernetes, which adds some parallelism. Too bad it doesn't (yet) have a generic container runner and is tied to either kubernetes or docker. Now I know that this essentially evades the problem, but I find it relatively elegant. I did discuss this at work, where it was basically shot down with "but now it's not a single dockerfile that you can just execute" (or buildah-dependent shell script). Then again, you can put the stages in a shell script and execute them one after another in the gitlab file, with separate buildah steps, though that will limit its usability somewhat. Complex matter. |
A friendly reminder that this issue had no activity for 30 days. |
I don't think this should be closed. The concern @kwshi had with disk usage is resolved, but the main topic of this issue still has no solution as far as I am aware. |
@flouthoc PTAL |
@kwshi @dsonck92 I have a different perspective on a solution for this issue, so I am just posting it here. Could you please go through it and give your thoughts?
Working of Build-from-bash
So
TLDR: CONS: |
If buildah currently caches At least, that's what I'm getting from your For me, the original feature is not that important anymore, since I'm generating all my files inside a container inside CI, and only require a single copy into the final container, the CI has some caching capabilities. But I do think an intermediate containerfile generator could help. |
Translating a bash script directly to Containerfile syntax is not feasible because of the complexity available in a full scripting language (if statements, for loops, external commands, etc.). If your bash script is simple enough that it can be translated to a Containerfile, then you might as well write it in Containerfile syntax to start with. The benefit of bash scripts is the additional flexibility not expressible in Containerfile syntax. A Containerfile generator is a great in-between idea: not as flexible as a full bash script, but a lot more flexible than a static Containerfile. Something like that would be useful in a lot of cases. That said, it seems to me that a generator like that should probably be a separate project instead of part of buildah. |
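A rough sketch of what such a generator could look like: bash decides what goes into the Containerfile (loops, conditionals), and buildah's existing layer cache for Containerfile builds decides what to rebuild. `generate_containerfile`, the file name `Containerfile.generated`, and the image name `my-image` are hypothetical names made up for this example, and the package names are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical generator: emit a Containerfile on stdout from bash logic.

# generate_containerfile BASE [PACKAGE...]
generate_containerfile() {
    local base="$1"
    shift
    echo "FROM $base"
    local pkg
    for pkg in "$@"; do
        # One RUN per package keeps each install in its own cacheable layer.
        echo "RUN yum install -y $pkg"
    done
    echo "COPY www/ /var/www"
}

# Usage (requires buildah and a www/ directory; not executed here):
#   generate_containerfile fedora nginx git > Containerfile.generated
#   buildah bud --layers -f Containerfile.generated -t my-image .
```

Because the generated file is deterministic for the same inputs, a second run produces the same instructions and the cached layers can be reused; anything the script cannot express as an instruction still falls outside the cache.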
The original comment mentioned this:
What would this look like in practice? |
@rhatdan I'm guessing you were the "Dan" mentioned in the original issue. Could you provide some more information on how to achieve this with
|
I was guessing something like:
|
The problem with this seems to me that it does not use the cache at all if this is used in a sequential manner such as a bash script. It will always create a container from What I would be looking for is a mechanism to skip the run steps if |
You are correct, this has never been implemented. Only caching for Containerfiles exists. This discussion has always been about a mechanism to build a generic bash version of caching. |
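Until something like that exists, the "commit in between and base the next step on it" approach can be wrapped in a small helper so it clutters the script less. This is a sketch, not a buildah feature: `build_stage` and the stage names are hypothetical, while `buildah from`, `run`, `commit`, `rm`, and `inspect --type image` are existing commands. It skips a stage only when an image with that name already exists; it does not notice changed inputs, so the stage name would have to encode them (for example by appending a hash).

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a stage only if its image does not exist yet.

# build_stage IMAGE BASE CMD [ARGS...]
# Creates IMAGE from BASE by running CMD inside a working container.
# If IMAGE is already in local storage, the stage is skipped.
build_stage() {
    local image="$1"
    local base="$2"
    shift 2
    if buildah inspect --type image "$image" >/dev/null 2>&1; then
        echo "skip $image"
        return 0
    fi
    local ctr
    ctr=$(buildah from "$base") || return 1
    if buildah run "$ctr" -- "$@" && buildah commit "$ctr" "$image" >/dev/null; then
        echo "built $image"
    else
        buildah rm "$ctr" >/dev/null
        return 1
    fi
    buildah rm "$ctr" >/dev/null       # the working container is no longer needed
}

# Usage (requires buildah; not executed here):
#   build_stage stage-nginx fedora yum install -y nginx
#   build_stage stage-final stage-nginx sh -c 'echo ok > /var/www/index.html'
```

Each stage builds on the image committed by the previous one, so a rerun skips every stage whose image is still present.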
Description
Buildah can use cache/layers for building with buildah bud (#767). It would be beneficial if buildah could use a cache/layers for bash builds as well. I am using bash builds extensively as they are more convenient/useful for all of my use cases.
Dan mentioned that I could use commits in between and base the next step off them.
Nice idea, but I think this will clutter up the script and the signal-to-noise ratio would be rather low.
BTW using bash scripts to build the images is/was a great idea, this way I can use everything, really everything as a tool that I can install on my Linux and I am not tied to a DSL that I cannot extend or is limited.
Describe the results you received:
Executing the buildah script a second time, buildah executes each step again.
Describe the results you expected:
Skip steps that did not change compared to the last invocation of the script.
Output of rpm -q buildah or apt list buildah:
Output of buildah version: