New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(docker): adding git context support #86
Conversation
d426acd
to
77cd56c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should sort out the key situation, but otherwise it's "clean as a whistle." Well done again, sir.
01f33aa
to
9939be1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some things to consider; we can discuss whether / if some of my nits should be dealt with now or later.
But despite any comments (which are about both clarity and things where a better approach is possible), it's well considered and pretty clear on the whole, so nice job again.
requires "https://github.com/viega/zippy == 0.10.7" | ||
requires "https://github.com/aruZeta/QRgen == 3.0.0" | ||
requires "https://github.com/crashappsec/con4m#16f9597a4609df60c167085f8d309f69df463de9" | ||
requires "https://github.com/viega/zippy == 0.10.7" # MIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of annotating this with the license?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initially was using another dep for the semver parsing/comparing but it was GPL so then checked the licenses of existing deps. no specific reason, just seems useful information if inspecting deps as chalk is a static binary ans so we need to be careful with included deps/libs
src/configs/chalk.c42spec
Outdated
@@ -2561,6 +2561,28 @@ Note that, when chalk is invoked with 'docker' as its EXE name, the default IO c | |||
""" | |||
} | |||
|
|||
field git_exe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for these it might make more sense to make this a list of places to look, then you can use the findAllExePaths
call, which is in nimutils and has the signature:
proc findAllExePaths*(cmdName: string, extraPaths: seq[string] = @[], usePath = true): seq[string]
Basically, you are fine to take the first item in the list... we don't in some contexts because we do extra checking (to make sure we don't run another chalk executable for instance).
The docs for the above:
## The priority here is to the passed command name, but if and only
## if it is a path; we're assuming that they want to try to run
## something in a particular location. Generally, we're disallowing
## this in config files, but it's here just in case.
##
## Our second priority is to the the extraPaths array, which is
## basically a programmer supplied PATH, in case the right place
## doesn't get picked up in our environment.
##
## If all else fails, we search the PATH environment variable.
##
## Note that we don't check for permissions problems (including
## not-executable), and we do not open the file, so there's the
## chance of the executable going away before we try to run it.
##
## The point is, the caller should anticipate failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed these configs as eventually well want to use libgit anyway. for now it just searches PATH
src/docker_base.nim
Outdated
|
||
dockerPathOpt = findExePath("docker", userPath, ignoreChalkExes = true) | ||
dockerExeLocation = dockerPathOpt.get("") | ||
dockerExeLocation = findExePath("docker", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style point, I don't love chaining calls like this when a single function is so large as to span multiple lines. It's much better clarity wise to have individual assignments, with "bite-sized" logic. That's why I often end up with a variable with "Opt" or ?
in the name that is just an intermediate. It generates the same code, but the more discrete statements makes it much easier to follow. Here, the get()
is so insignificant relative to the rest of the assignment that I missed it on my first read, for instance, and I already know the API...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
src/docker_base.nim
Outdated
# Docker version 1.13.0, build 49bf474 | ||
# Docker version 23.0.0, build e92dd87 | ||
# Docker version 24.0.6, build ed223bc820 | ||
let (output, exitcode) = execCmdEx(dockerExeLocation & " --version") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Nim API for this stuff sucks. If docker changes its semantics about what gets printed on stdin and stdout this can silently break (since execCmdEx merges the two streams, OR refuses to give you stderr if you pass whatever the flag was to say not to combine them).
Additionally, remember that issue I fixed a few months back where a file was too big to fit into the OS buffer when passing over stdin? Yeah, execCmdEx does not deal with this problem and can easily hang if you pass too much into it.
All in all, the Nim team doesn't do a good job w/ posix subprocess stuff and I don't trust anything in their API at all. You really should use runDockerGetEverything()
below this (which manages getting the exe location anyway) or go straight to the underlying runCommand()
in nimutils if you need more customization of behavior.
I know you're just changing some code that is old and hasn't been touched for a long time, but if we're touching it, I think we should do it right :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
switched all usages I could find to use runCmd*
which normalizes to runCommand
from nimutils
src/docker_base.nim
Outdated
once: | ||
trace("Docker injection method: --build-context") | ||
|
||
var chmodstr = "" | ||
|
||
if chmod: | ||
if chmod or hasUser: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hasUser flag doesn't seem right. The original code only passes the chmod flag when we're copying in an executable that needs to be chalked, and just always makes it executable.
If we're copying non-executable bits in yet need to set permissions (and I'm not sure where we'd need to do that), then I would instead make the parameter take a mode, not a boolean, and get the mode right. I don't see why, when you might want to chmod, the user is even relevant... if chmod might be warranted, just do it, esp since we don't have full visibility into all versions of docker and associated semantics.
But I think that's probably overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if user is set in the dockerfile, /chalk.json
would not be accessible as it would be set with root permissions hence the need to change its permissions but Ill change to pass the permission mode
src/docker_git.nim
Outdated
error(strip(result.stdOut & result.stdErr)) | ||
raise newException(ValueError, "Git failed") | ||
|
||
proc getDefaultBranch(git: DockerGitContext): string = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I'm not really satisfied with requiring the presence of git
. I guess it's fine for a first pass, but I'm thinking we should be wrapping and shipping libgit (and maybe libssh if we have to??)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, if we do go out with this, then we basically have to wait a long time to deprecate the config vars that ideally won't be needed for very long. Why not just say, if it's not in PATH, make sure it's in your path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removing config option and only checking PATH now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also just fyi docker
internally calls git
command so requiring git
be on PATH
wont be a new requirement for people to add chalk
to existing docker
flows
src/semver.nim
Outdated
# it only handles basic dot-separated version format | ||
|
||
type Version* = ref object | ||
parts: seq[int] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't parts a three-tuple that has default values if you want to allow short version strings? It would make the comparison more efficient and more clear too (Many people aren't going to get the 'zip' idiom anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to tuple
src/semver.nim
Outdated
proc parseVersion*(version: string): Version = | ||
new result | ||
result.parts = @[] | ||
for i in version.strip(chars={'v', ',', '.'}).split('.'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also relies on parseInt to throw an error for malformed stuff, which makes this highly error prone as a parser. If people accidentally pass in the wrong thing, they could get spurious versions, etc.
OR it could blow up on things it probably should accept, like if people do semver but w/ release candidates... we've all seen v2.1.0-rc1
for instance. Personally I'd be much more precise and explicit when it comes to what is or is not allowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for now this will support versions consisting of 1-3 version parts so only:
1
1.2
1.2.3
no other things will be allowed like rc1
or whatnot but eventually if we need that we can add support for that
src/semver.nim
Outdated
return not (self == other) | ||
|
||
proc `>`*(self: Version, other: Version): bool = | ||
for (a, b) in zipLongest(self.parts, other.parts, 0): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, if you use a tuple, you can just rely on the default tuple comparison, which will be far faster and far more clear.
For instance, something like:
type Version = tuple[major: uint, minor: uint, patch: uint]
And then >
and '!=' both work right without you even having to overload something. And it allows people to clearly ask for somevar.major
.
Or if you wanted to track any dash and tag name after, you could do it in an object, then just compare the tuple fields.
(and, more specifically, declare Version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed comparison to use tuple however as in some places it prints original version back keeping it as object so that it can keep the original version name so things like 1.0
dont print as 1.0.0
if the tuple always has 3 versions
492c3d8
to
c00b750
Compare
5d2497f
to
f8b62cc
Compare
f8b62cc
to
400bb0b
Compare
Adding compatibility with docker git contexts. For example: ``` docker build -t chalk http://github.com/crashappsec/chalk.git ``` This is done by: 1. Doing shallow fetch of git context into a bare git repo. ``` git fetch --depth=1 origin <ref> ``` 2. From there we can extract `Dockerfile` with `git show` 3. Adjust Dockerfile as usual and pass it to build via `stdin` 4. If `stdin` is not supported, git context is checked out locally to disk and git context is replaced with path on disk. NOTE this is only necessary for non-buildx builds and it has edge cases as docker binary does more magic to checkout git context such as following all sub-modules recursively however for now chalk is kept simple by only checking out the main git context repo without any recursion. --- In addition fixing providing files to docker build as buildx version check was incorrect as it: * never called getBuildXVersion() so version comparison was against 0 * did not compare versions semantically so 0.8 was always greater than 0.11 which is incorrect as per semver To handle semver comparison, instead of using 3rd party lib such as https://github.com/euantorano/semver.nim added very simple `semver.nim` with basic comparison operators. refactor(semver): removing semver dep in favor of very simple local implementation
when using ssh scheme in git context, SSH needs to be customized to parse known hosts correctly
…rmissions looks like docker when installed via snap cant access tmp files directly inside /tmp but it can access them when they are nested inside another folder in /tmp this is important as chalk passes chalkmark and Dockerfile in some cases via tmp files and so it needs to write it to a place which will be accessible by docker
moving exception handling to docker command logic vs cmd parsing which allows to correctly handle exceptions without segfaulting
in order to facilitate that, git fetch logic was reworked to correctly fetch the git ref including related tags/branches/etc
otherwise git honors local git repo configs (if any) and therefore might be executing with duplicate/conflicting configs unfortunately there does not seem to be a way to ignore all local configs and only honor global git configs
for now it will look for a binary on PATH, and if not found blows up in some cases. eventually we can switch to using libgit/etc
also * renamed runProc* to runCmd* so prefix is common across commands * refactored codecDocker to ensure * file streams are closed during extraction * reusing common logic to extract chalk.json from layers
this allows to non-obtrusively show chalk reports in github actions
8f6affd
to
32b7927
Compare
Issue
Support for git context is required for default docker push action as it uses git context by default
https://github.com/docker/build-push-action
Otherwise chalk github action cant run with default parameters
Description
Adding compatibility with docker git contexts. For example:
This is done by:
Dockerfile
withgit show
stdin
stdin
is not supported, git context is checked out locallyto disk and git context is replaced with path on disk.
NOTE this is only necessary for non-buildx builds
and it has edge cases as docker binary does more magic to checkout
git context such as following all sub-modules recursively
however for now chalk is kept simple by only checking out the
main git context repo without any recursion.
In addition fixing providing files to docker build as buildx version
check was incorrect as it:
0.11 which is incorrect as per semver
To handle semver comparison, instead of using 3rd party lib
such as https://github.com/euantorano/semver.nim
added very simple
semver.nim
with basic comparison operators.refactor(semver): removing semver dep in favor of very simple local implementation
Testing
tests:--slow
TODOs before merge: