support for identifying out of date docker images #13061

Open
josh-m-sharpe opened this issue May 19, 2022 · 7 comments

Comments

@josh-m-sharpe

Proposal

Nomad should tell me that an image I have deployed is not the latest version available. Preferably with a bell icon and red dot.

This may seem like an unlikely thing for Nomad to solve for, but if not Nomad, what other tool would do it? As best as I can tell, the nomad server agent is the only running process that could have knowledge of the running image versions at any given time as well as have access to check the repository for a more current version.

Use-cases

Security.

Attempted Solutions

None within the scope of nomad. I'm thinking of manhandling those job files and parsing out the current versions and then notifying myself somehow.

@tgross
Member

tgross commented May 19, 2022

I love the idea of this kind of thing, but it has some architectural complications...

As best as I can tell, the nomad server agent is the only running process that could have knowledge of the running image versions at any given time as well as have access to check the repository for a more current version.

As it turns out, the server has no idea what a Docker image is: anything that falls in the task.config is handled by the task driver on the client (which could be a third-party driver and not inside Nomad at all!). Even the client doesn't really know what the schema of the config is. Also, the client itself runs as root on the host, so we probably don't want to make it in charge of making third-party scanning requests.

That being said, the Job and Allocation APIs are readable by any application with the appropriate Nomad ACL token. So a potentially interesting idea here is to run a job on the cluster that has minimally scoped Nomad privileges (read-only on the Job/Allocation) and have it periodically scan the set of allocations, get their task.config.image (or other artifacts for drivers like qemu!) and then send those results to whatever third-party scanning service we'd want to use.
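
A minimal sketch of the extraction step described above, assuming the job document has the shape returned by the Nomad Job API (`TaskGroups` → `Tasks` → `Config`). The function name `iter_task_images` and the sample data are hypothetical; fetching the JSON with a read-only ACL token and forwarding the results to a scanner are left out.

```python
from typing import Iterator


def iter_task_images(job: dict) -> Iterator[tuple[str, str, str, str]]:
    """Yield (job_id, group, task, image) for every task that declares an image.

    Tasks whose driver config has no "image" key (exec, raw_exec, ...) are skipped,
    because the config schema is driver-specific and opaque to Nomad itself.
    """
    for group in job.get("TaskGroups") or []:
        for task in group.get("Tasks") or []:
            config = task.get("Config") or {}
            image = config.get("image")
            if image:
                yield (
                    job.get("ID", ""),
                    group.get("Name", ""),
                    task.get("Name", ""),
                    image,
                )


# Hypothetical job document, trimmed to the fields the sketch reads.
sample_job = {
    "ID": "cache",
    "TaskGroups": [
        {
            "Name": "db",
            "Tasks": [
                {"Name": "redis", "Driver": "docker", "Config": {"image": "redis:7"}},
                {"Name": "helper", "Driver": "exec", "Config": {"command": "/bin/true"}},
            ],
        }
    ],
}

images = list(iter_task_images(sample_job))
```

A scanner job would run this over every job (or over the `Job` embedded in each allocation) on a timer and hand the collected image references to whatever scanning service is in use.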

@tgross tgross added this to Needs Triage in Nomad - Community Issues Triage via automation May 19, 2022
@tgross tgross moved this from Needs Triage to Needs Roadmapping in Nomad - Community Issues Triage May 19, 2022
@josh-m-sharpe
Author

As an aside, do such "third party scanning" tools exist? The only thing in this realm I'm familiar with is GitHub's Dependabot, but that's not an API that data could be sent to for scanning.

@tgross
Member

tgross commented May 19, 2022

Without recommending any one in particular, I know that Docker has an integration with Snyk for their registry. See https://docs.docker.com/develop/scan-images/#scan-using-docker-hub for example.

@dasavick
Contributor

dasavick commented May 20, 2022

I really like the idea. However, unless the Nomad cluster operator practices hardcore version pinning (specific commit/patch), scanning task.config.image would be of limited to no use.

This is not only a problem when there is no version pinning at all: using major versions like redis:7 can put us severely behind the update schedule when a job runs uninterrupted for a long time, while potentially giving a false sense of security and of being up to date.

I feel like scanning for the string in the job files is more a job for tools like the previously mentioned Dependabot. I don't think there is anything that would prevent that, and it does not give a false sense of runtime up-to-dateness.


That said, docker stores runtime image version as digest (docker inspect mycontainer) in Config.Image and that does not help, I don't think there is a way to reverse the digest as many tags can have the same one.

I don't see how that would make it easy to obtain a specific version distance, but comparing it to the digest of a fresh pull of task.config.image would at least allow an equality check, yielding "there is some new version" information. A bit underwhelming, but at least something.
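
A rough sketch of that equality check, under a few assumptions: the running image's digests are taken from the `RepoDigests` list of `docker image inspect`, and the registry side is the digest the tag currently resolves to (for example the `Docker-Content-Digest` header of a manifest request). The helper names are hypothetical, and multi-arch images may need care so that both sides refer to the same kind of manifest.

```python
def normalize_digest(value: str) -> str:
    """Return the 'sha256:...' part of either 'repo@sha256:...' or a bare digest."""
    return value.rsplit("@", 1)[-1].strip().lower()


def has_newer_image(running_repo_digests: list[str], registry_digest: str) -> bool:
    """True when the tag's current digest is not among the running image's digests.

    This only answers "is it the same image or not"; it says nothing about how
    many versions behind the running image is. An empty digest list (for example
    a locally built image) is also reported as "different".
    """
    current = normalize_digest(registry_digest)
    known = {normalize_digest(d) for d in running_repo_digests}
    return current not in known


# Hypothetical digests, shortened for readability.
same = has_newer_image(["redis@sha256:aaa111"], "sha256:aaa111")
changed = has_newer_image(["redis@sha256:aaa111"], "sha256:bbb222")
```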


It also came to my attention that nomad cannot redeploy currently running jobs with unchanged jobspec (same image tag), which complicates things further (#1576, #2038) for any external tool to do that without some hacky env/meta changes (#698, #3949). I looked into force_pull but it does not seem to do anything for re-submit of already running jobs. Updates from the UI/in general would be even better if there was image prefetch (#6380) available.

@tgross
Member

tgross commented May 23, 2022

Those are all excellent points @PinkLolicorn: basically it comes down to the fact that Docker image tags aren't immutable (and are frequently mutated!).

That said, docker stores runtime image version as digest (docker inspect mycontainer) in Config.Image and that does not help, I don't think there is a way to reverse the digest as many tags can have the same one.

I think that's ok... the Registry API does support query by manifest reference for the Detail API. That still leaves the matter of an API for the scans themselves, which I don't see published anywhere.

It also came to my attention that nomad cannot redeploy currently running jobs with unchanged jobspec (same image tag)

Yeah, that comes back to the task.config being opaque to the server. Pinning by SHA is probably the way to go if you want to do this kind of thing. It's not super user-friendly if you're deploying via the command line, but if you've got a scenario where you're trying to drive change based on scans, maybe a CI-driven workflow is better anyways?

@nierob

nierob commented Jul 22, 2022

Actually, I have solved the problem by attacking it from a different angle. Instead of detecting an old image, I'm very aggressive with updating. Whenever there is a new version of an image, I test it and deploy it automatically (well, almost).

Inside docker config in nomad spec:

  config {
    image = trimprefix(file(join("/", [ [[.basedir]], "Dockerfile.FROM.only" ])), "FROM ")
  }

I'm using a template, where basedir is just the base directory for importing the local file, and Dockerfile.FROM.only looks like:

FROM foo/bar:latest@sha256:abcabcabcabcabcbacbabcabc

Then Dependabot or another tool can update that Dockerfile at will. So one gets not only "detection" but one step further: a pull request. Note that the image digest is used to guarantee immutability. Using latest is generally OK here: because of the digest, it is not an "unknown" version. One can use a non-generic tag too, but it seems that Dependabot sometimes ignores it (bug?). The solution depends on a strong, blocking CI. Depending on the content of the image, some version updates are not straightforward; for example, a DB image may need a data migration that requires additional steps. YMMV.
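
For a CI step that wants to verify the pin before deploying, here is a small sketch of splitting such a single-line file into repository, tag, and digest. `parse_from_line` and `PinnedImage` are hypothetical names, and the parser assumes exactly the one-line `FROM <ref>` form shown above (no `AS` stage names, no build arguments).

```python
import re
from typing import NamedTuple, Optional


class PinnedImage(NamedTuple):
    repository: str
    tag: Optional[str]
    digest: Optional[str]


_FROM_RE = re.compile(r"^FROM\s+(?P<ref>\S+)\s*$")


def parse_from_line(line: str) -> PinnedImage:
    """Split 'FROM repo[:tag][@digest]' into its parts."""
    match = _FROM_RE.match(line.strip())
    if not match:
        raise ValueError(f"not a single-image FROM line: {line!r}")
    name, _, digest = match.group("ref").partition("@")
    # A tag separator is a colon that comes after the last slash, so a
    # registry port such as 'localhost:5000/foo' is not mistaken for a tag.
    last_slash = name.rfind("/")
    colon = name.rfind(":")
    if colon > last_slash:
        repository, tag = name[:colon], name[colon + 1:]
    else:
        repository, tag = name, None
    return PinnedImage(repository, tag, digest or None)


pinned = parse_from_line("FROM foo/bar:latest@sha256:abcabc")
unpinned = parse_from_line("FROM localhost:5000/foo")
```

A blocking CI check could then refuse any file where `digest` is `None`, which keeps the immutability guarantee from silently eroding.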

@ppacher

ppacher commented Jan 17, 2023

Hi,

we're also trying to figure out a way to get notified when a new container image is available for any task that uses the docker driver. There are already projects like watchtower but it needs to talk to containerd directly which is not really suitable in a Nomad environment. I played around with the Nomad API a bit and think a straightforward solution would be that task-drivers can append custom meta-data to a task allocation (i.e. adding something like DriverInfo to Allocation.TaskStates). This way, the docker/podman driver could append the image hash (which it knows; maybe also the docker container ID) to the task. That would allow external tools that query the Nomad API to immediately know which exact version of an image is used for each allocation and notify/act accordingly.

Right now, one would need a system job that runs on all clients and can talk to the Docker daemon, plus one "control" server that queries the Nomad API to detect which allocations are scheduled on which clients and then tries to aggregate the information. While this could work, it requires a lot of care to correctly map the container image to the task allocation, as multiple tasks using the same Docker image might be executed on a client (correctly attributing the Nomad task per container is still possible by parsing the docker-inspect output and extracting the Nomad allocation ID from the container environment).
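
The attribution step in the parentheses above could look roughly like this. It is a sketch that assumes the containers carry the `NOMAD_ALLOC_ID` and `NOMAD_TASK_NAME` environment variables Nomad sets for tasks, and that the input is the JSON array printed by `docker inspect`; the function name and sample data are hypothetical.

```python
import json


def map_allocs_to_images(inspect_output: str) -> dict[str, list[dict]]:
    """Group containers by Nomad allocation ID using their environment."""
    result: dict[str, list[dict]] = {}
    for container in json.loads(inspect_output):
        config = container.get("Config") or {}
        # Env is a list of "KEY=VALUE" strings in docker inspect output.
        env = dict(
            item.split("=", 1) for item in (config.get("Env") or []) if "=" in item
        )
        alloc_id = env.get("NOMAD_ALLOC_ID")
        if alloc_id is None:
            continue  # not started by Nomad, ignore
        result.setdefault(alloc_id, []).append(
            {
                "task": env.get("NOMAD_TASK_NAME"),
                "image": config.get("Image"),      # reference as requested
                "image_id": container.get("Image"),  # resolved image ID
            }
        )
    return result


# Hypothetical, heavily trimmed `docker inspect` output.
sample_inspect = json.dumps([
    {
        "Image": "sha256:1111",
        "Config": {
            "Image": "redis:7",
            "Env": ["NOMAD_ALLOC_ID=alloc-a", "NOMAD_TASK_NAME=redis", "PATH=/usr/bin"],
        },
    },
    {"Image": "sha256:2222", "Config": {"Image": "busybox", "Env": ["PATH=/bin"]}},
])

by_alloc = map_allocs_to_images(sample_inspect)
```

The per-client system job would report this mapping to the control server, which can then join it against the allocations it reads from the Nomad API.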

What are your thoughts on this? Or is there any other (better) way that I missed?
