Proposal: allow for remote repository inspection #14258

Closed
wants to merge 1 commit into from

miminar
Contributor

@miminar miminar commented Jun 29, 2015

Added new flag to docker inspect command:

$ docker inspect --remote <imageName>[:<tag>]...

Which inspects remote images. Additional [:TAG] suffix may further
specify desired <image>.

This allows for remote inspection without downloading image layers.

Log Tag, Digest and Registry to stderr as debug messages. Need to pass
-D flag to Docker client to see them.
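
For context, here is a minimal sketch of the kind of metadata-only request that makes inspection possible without fetching layers. It assumes a registry that speaks the Registry HTTP API V2, where a manifest is served at `/v2/<name>/manifests/<reference>`. The function name `manifestURL`, the host `registry.example.com`, and the fallback to `latest` are illustrative assumptions, not code from this PR, and authentication is left out entirely.

```go
package main

import (
	"fmt"
	"strings"
)

// manifestURL builds the V2 manifest endpoint for a repository.
// registry, repo and tag are illustrative inputs (not taken from the PR);
// an empty tag falls back to "latest".
func manifestURL(registry, repo, tag string) string {
	if tag == "" {
		tag = "latest"
	}
	// Strip stray slashes so the path segments join cleanly.
	return fmt.Sprintf("https://%s/v2/%s/manifests/%s", registry, strings.Trim(repo, "/"), tag)
}

func main() {
	// Only the manifest is addressed here; no layer blobs are requested.
	fmt.Println(manifestURL("registry.example.com", "rhel7", ""))
}
```

A real client would still have to handle authentication and older registry API versions, which is the part the daemon already implements.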

Resolves #14257

@rhatdan
Contributor

rhatdan commented Jul 13, 2015

Anyone want to look at this? We want to use it to verify that an image is up to date.

The idea is that you have an image, say foobar, which is based on rhel7-1.2. The Atomic tool goes through all of the images and checks the registry for newer versions of rhel7. When it finds one, it reports to the user that they might want to rebuild the foobar image on the newer base image, or contact the provider of the foobar image to ask for an updated version.
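
The check described above can be sketched roughly as follows. Everything here is hypothetical: the `LocalImage` type, the `staleImages` helper, and the idea that the tool records each image's base repository and base image ID are assumptions made for illustration, not the Atomic tool's actual data model. Comparing IDs only shows that the registry serves something different, not that it is newer.

```go
package main

import "fmt"

// LocalImage is a hypothetical record of a locally stored layered image.
type LocalImage struct {
	Name   string // e.g. "foobar"
	Base   string // repository the image was built from, e.g. "rhel7"
	BaseID string // ID of the base image the local image was built on
}

// staleImages returns the names of local images whose recorded base ID
// differs from the ID the registry currently reports for that base.
// remoteIDs maps a base repository to the registry's current ID;
// bases missing from the map are skipped.
func staleImages(local []LocalImage, remoteIDs map[string]string) []string {
	var stale []string
	for _, img := range local {
		remoteID, ok := remoteIDs[img.Base]
		if ok && remoteID != img.BaseID {
			stale = append(stale, img.Name)
		}
	}
	return stale
}

func main() {
	local := []LocalImage{
		{Name: "foobar", Base: "rhel7", BaseID: "id-old"},
		{Name: "mongodb", Base: "rhel7", BaseID: "id-new"},
	}
	// In practice this map would be filled from a remote inspection.
	remote := map[string]string{"rhel7": "id-new"}
	fmt.Println(staleImages(local, remote))
}
```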

@crosbymichael @tianon @jfrazelle Anyone?

@LK4D4
Contributor

LK4D4 commented Jul 13, 2015

Idea makes sense to me. Maybe we should ping the registry guys too.
ping @docker/core-maintainers @stevvooe @dmcgowan

)

type LookupRemoteConfig struct {
Contributor

Please document data structures according to Go's documentation guidelines.

Contributor Author

Fixed.

@tiborvass
Contributor

@stevvooe this is in design review; code review is not required yet, otherwise it would need a rebase ;)

@stevvooe
Contributor

@tiborvass If you lay code down before doing a proper exploration of the design space, it can be up for review.

@stevvooe
Contributor

Before proceeding with this proposal, at least the following questions should be addressed:

  1. What are the common use cases this addresses?
  2. What other solutions or workarounds have been tried to solve this same problem? Why are they insufficient?
  3. Has the concept of a "shallow" pull been explored?
  4. Does the concept of a "remote" image even make sense? Is this concept orthogonal to other concepts of what an image is? Why is this differentiated? How does the concept of a "remote" image apply to the rest of the platform?

@rhatdan
Contributor

rhatdan commented Jul 13, 2015

  1. The common use case is to be able to download and examine one or more image json files from the registry to discover which images you want to pull. Our goal is to look at the LABEL data associated with images, for example: show me all of the images based on RHEL7, or show me all of the images with this license field.
  2. One option would be to pull all of the images that match the search criteria.
  3. Please define "shallow" pull.
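
The LABEL queries in point 1 could look roughly like the sketch below once the image json has been fetched. `ImageInfo`, `filterByLabel`, and the label keys `base` and `license` are hypothetical names chosen for illustration; the real image json carries many more fields.

```go
package main

import "fmt"

// ImageInfo is a hypothetical, trimmed-down stand-in for the image json.
type ImageInfo struct {
	Name   string
	Labels map[string]string
}

// filterByLabel returns the names of images whose label key equals value.
// Images without the label are skipped as long as value is non-empty.
func filterByLabel(images []ImageInfo, key, value string) []string {
	var out []string
	for _, img := range images {
		if img.Labels[key] == value {
			out = append(out, img.Name)
		}
	}
	return out
}

func main() {
	images := []ImageInfo{
		{Name: "foobar", Labels: map[string]string{"base": "rhel7", "license": "GPLv2"}},
		{Name: "mongodb", Labels: map[string]string{"base": "rhel7"}},
		{Name: "other", Labels: map[string]string{"base": "fedora"}},
	}
	fmt.Println(filterByLabel(images, "base", "rhel7"))
}
```

The point of the proposal is that the `images` slice could be filled from registry metadata alone, without pulling any layers.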

@rhatdan
Contributor

rhatdan commented Jul 13, 2015

Michal is in CZ, so he will respond tonight.

Added new flag to `docker inspect` command:

    $ docker inspect --remote <imageName>[:<tag>]...

Which inspects remote images. Additional `[:TAG]` suffix may further
specify desired `<image>`.

This allows for remote inspection without downloading image layers.

Log Tag, Digest and Registry to stderr as debug messages. Need to pass
`-D` flag to Docker client to see them.

Signed-off-by: Michal Minar <miminar@redhat.com>
@miminar
Contributor Author

miminar commented Jul 14, 2015

@stevvooe Maybe there's a better name than "remote image". The point is that downloading binary blobs and turning a remote image into a local one just to inspect it is quite an expensive operation. The object of inspection is really an image stored either locally or at a remote registry. But I don't think there are other areas where the concept of a remote image would fit. You can't run, build or delete a remote image. You may only pull it, or create/update it with a push. To avoid changing the terminology for these commands, the help message could be changed to something like "Inspect image stored remotely". WDYT?

@stevvooe
Contributor

@miminar I see. What if we explored the concept of a shallow pull? This would just pull the manifest and image json. The image could then be inspected and run, but running it would cause a full pull:

docker pull --shallow ubuntu

This is a much better approach than adding the concept of "remotes" to the inspect command. Otherwise, we risk over-coupling remote image operations to commands that don't really have any notion of remote.
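
The shallow-pull idea above can be modeled as a lazily completed image: metadata is available right away, and layer data is fetched on first run. The sketch below is only a toy model under that reading. `ShallowImage`, its methods, and the `pullLayers` callback are hypothetical and do not correspond to any existing Docker type or flag.

```go
package main

import "fmt"

// ShallowImage is a hypothetical model of a shallowly pulled image:
// metadata is present locally, layer data is fetched on first use.
type ShallowImage struct {
	Name         string
	Config       map[string]string // stands in for the manifest and image json
	layersPulled bool
}

// Inspect needs only metadata, so it never triggers a layer download.
func (s *ShallowImage) Inspect() map[string]string { return s.Config }

// Run needs the layers; pullLayers is a caller-supplied stand-in for a full pull.
func (s *ShallowImage) Run(pullLayers func(name string) error) error {
	if !s.layersPulled {
		if err := pullLayers(s.Name); err != nil {
			return err
		}
		s.layersPulled = true
	}
	return nil
}

func main() {
	img := &ShallowImage{Name: "ubuntu", Config: map[string]string{"os": "linux"}}
	fmt.Println(img.Inspect())

	pulls := 0
	pull := func(string) error { pulls++; return nil }
	_ = img.Run(pull) // first run triggers the full pull
	_ = img.Run(pull) // later runs reuse the layers
	fmt.Println("full pulls:", pulls)
}
```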

@rhatdan
Contributor

rhatdan commented Jul 15, 2015

How would this work? The end goal is to be able to just run docker inspect on content in a remote registry, for multiple images.
Let's look at an example. I have a mongodb image based on the rhel7 base image rhel7-7.1.2.

Would the tooling need to do

docker pull --shallow rhel7:latest
docker inspect rhel7:latest

Would the inspect work? Would it pull down the image?

@cpuguy83
Member

I would rather have something that actually performs the desired action.
I don't think --shallow makes much sense, and --remote does dirty up inspect quite a bit.
If we want to compare a local image to a remote, why not have a command that can do that?

Going beyond that, I think this fits into the overall security strategy, where there ought to be a way to easily determine (w/o pulling) if your images are up to date (and their parents are up to date).

@rhatdan
Contributor

rhatdan commented Jul 15, 2015

Well, it is bigger than just security. Basically you want to look at the image metadata on the registry.
I could envision wanting to run specific queries on images before pulling. I don't want to wrap this up in security. In the example I gave, it would not necessarily be about security, just that there is a newer image available than the layered image my application is based on.

I might want to search for images with maintainer @cpuguy83, all images based on RHEL7, Or all images from Company XYZ...

@ghost

ghost commented Jul 15, 2015

@cpuguy83 the intent is really a remote inspect. The output we are looking for is the same as a local inspect would provide, with minor adjustments for some data that is not applicable in one of the cases.

@stevvooe
Contributor

What about a tool that can inspect the remote registry? Adding more interaction with the registry to other commands will further couple the two commands. A local image and a remote image may have very different semantics. Some information output by the inspect command may not make sense for a remote image. By adding this command, we make the decision that these continue to remain coupled, which is directly against the current goal for the registry.

@ghost

ghost commented Jul 15, 2015

There are some projects along those lines - like libdoug.

But in this case we are deliberately trying to get to exactly the same
semantics. The differences in data are minimal. My experience with
package distribution from the dpkg/apt and rpm/yum tradition actually has
convinced me that keeping them coupled is a good idea. It's actually one of
the things that we really like about Docker - an integrated transport with
consistency between the remote and local representation.

On Wed, Jul 15, 2015 at 2:20 PM, Stephen Day notifications@github.com
wrote:

What about a tool that can inspect the remote registry? Adding more
interaction with the registry to other commands will further couple the two
commands. A local image and a remote image may have very different
semantics. Some information output by the inspect command may not make
sense for a remote image. By adding this command, we make the decision that
these continue to remain coupled, which is directly against the current
goal for the registry.


@rhatdan
Contributor

rhatdan commented Jul 15, 2015

You can add other METADATA that is different than the docker image json data, but we really just want to examine the local image json data the same way we examine the remote. The tooling we are building will first examine all local images before going to remote registries. Having two different tools for examining the same data seems a little crazy.

@willmtemple

There's really no reason I should have to pull and store any amount of data permanently within my local image cache to inspect a remote image. I also think --shallow pull is a bad idea in general. What happens if I have fooimage:latest older than the same tag on the docker hub, and then I docker pull --shallow fooimage:latest? Does it untag my image? That seems very silly if all I was interested in was checking the metadata on the registry. We can already do this with standard http libraries, or even calls to curl, but that requires negotiating different registry API versions and authentication. Since the docker daemon must already implement that functionality, it doesn't make sense to have to write our own client to check that data.

I think that --remote is the most semantically obvious way to get to this functionality given what we consider to be a "remote" (a tag on a registry), in the current state of docker.

IMHO, however, in an ideal world, locals and remotes would be mostly indistinguishable, similar to git repos, and information about the "remoteness" of a repository should be syntactically encoded into how we refer to it on the CLI e.g. docker.io/repo:tag or localhost:5000/repo:tag is always considered "remote".
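
One way to read "syntactically encoded" is a check on the first path component of the reference. The sketch below assumes a heuristic in which a leading component counts as a registry host when it is `localhost` or contains a dot or a colon. The function name `hasRegistryHost` is hypothetical, and the rule is a simplification for illustration rather than a statement of how the Docker CLI resolves names.

```go
package main

import (
	"fmt"
	"strings"
)

// hasRegistryHost reports whether ref starts with a component that looks
// like a registry host (assumed rule: "localhost", or contains "." or ":").
func hasRegistryHost(ref string) bool {
	i := strings.Index(ref, "/")
	if i < 0 {
		return false // a bare name such as "fooimage:latest" carries no host
	}
	first := ref[:i]
	return first == "localhost" || strings.ContainsAny(first, ".:")
}

func main() {
	refs := []string{
		"docker.io/repo:tag",
		"localhost:5000/repo:tag",
		"fooimage:latest",
		"library/ubuntu",
	}
	for _, ref := range refs {
		fmt.Printf("%-26s remote=%v\n", ref, hasRegistryHost(ref))
	}
}
```

Under this rule a plain `fooimage:latest` stays ambiguous between local and remote, which is exactly the gap the `--remote` flag is meant to close.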

@stevvooe
Contributor

@rhatdan I agree the use case is valid but just adding this to docker inspect because it doesn't obviously belong anywhere else is the kind of scope creep that makes projects unusable.

The assumption that remote images have the same data is flawed. Images don't necessarily exist remotely. They may be only assembled when they reach a docker client. There is a massive effort to decouple the image distribution model to make it more flexible (at your request, even). The existence of features like this, where remote access sprawls over the UI, serves only to hold these efforts back.

@willmtemple In fact, the suggestion of a "shallow pull" comes directly from git. While git may make remote and local seem indistinguishable, it is git's design that provides this very effective illusion. The local references in git provide a view into what is available remotely. Operations on remote repositories are only available through a restricted set of commands (push, pull, fetch, clone). We still have this proven design in docker. Right now, only push, pull, run, build and search interact with remote services. Adding --remote to inspect breaks this design and departs from the principles that make git's, and docker's, design so effective.

I am not suggesting that users write their own clients to access this data. I am suggesting that we don't rush in a half-baked feature that violates the design goals.

@willmtemple

@stevvooe I think that makes sense in the context of git, where I can have a local repository with multiple configured remotes, and can fetch a remote without clobbering my own branches in the local scope. But when I docker pull image-already-on-system:latest it pulls that image, untagging the old image in the process. A "shallow" pull seems like it would have to erase my local tags to store that information locally in a sane manner without a real notion of remotes. While I don't think that comparisons with git are unfair or unwarranted, IMHO they also aren't fully accurate.

If I understand this correctly: a "shallow" local repository would obey the normal contracts of an image except that when the user actually requires the image data, it is pulled? I'll retract my previous statement about this being a "generally bad idea," now that I've given it some thought. I can definitely see the use-case for that, but I think that the contract of docker pull requires it to modify the local repo state with new information, and I think it would be a real shame if viewing the remote metadata required me to modify my local repos.

Ultimately I don't think it will really matter whether it's an extension to inspect or its own remote-inspect (or whatever it wants to be called, though I do think that's a bit uneconomical), or even a "shallow" pull as long as there's a facility for easily, automatically testing to see if an image is up-to-date with the registry.

@stevvooe
Contributor

@willmtemple I do agree that the missing piece between docker and git is discrete "remote" and "local" "branch heads". This isn't as bad with shallow if one runs a strict tag (say "2.0"), whereas a shallow pull might update "latest", and with the conjoined model that would be surprising.

Taking this PR forward, it seems like we can do one of two things:

  1. Add the --remote flag to docker inspect with the caveat that we can remove it in any version at any time without notice. This is not something that we want to continue supporting in the future.
  2. Add a very restricted docker remote inspect --as-image command. This could be the kernel of a tool intended to allow interaction with remote docker resources. We'd still reserve the right to change the behavior of this command in the future, but we could at least ensure --as-image always works correctly.

Also, remember that any solution should support digest references, as well.
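
To illustrate the difference between tag and digest references, here is a simplified parser sketch. The `Reference` type and `parseReference` function are hypothetical, the digest in the example is a placeholder rather than a real one, and the code does no validation. It only shows that `name:tag` and `name@digest` need to be told apart, and that a colon before the last slash belongs to a registry `host:port` rather than a tag.

```go
package main

import (
	"fmt"
	"strings"
)

// Reference is a hypothetical, simplified view of an image reference.
type Reference struct {
	Name   string // repository name, possibly with a registry host prefix
	Tag    string // set for name:tag references
	Digest string // set for name@algorithm:hex references
}

// parseReference splits s into name plus tag or digest. It does not
// validate its input and does not handle the combined name:tag@digest form.
func parseReference(s string) Reference {
	// A digest reference uses "@" as the separator.
	if i := strings.Index(s, "@"); i >= 0 {
		return Reference{Name: s[:i], Digest: s[i+1:]}
	}
	// A tag follows the last ":" only if that colon comes after the last "/";
	// otherwise the colon is part of a registry host:port prefix.
	if i := strings.LastIndex(s, ":"); i > strings.LastIndex(s, "/") {
		return Reference{Name: s[:i], Tag: s[i+1:]}
	}
	return Reference{Name: s, Tag: "latest"}
}

func main() {
	for _, s := range []string{
		"ubuntu",
		"rhel7:7.1",
		"localhost:5000/repo:tag",
		"repo@sha256:abc123", // placeholder digest, not a real one
	} {
		fmt.Printf("%+v\n", parseReference(s))
	}
}
```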

@willmtemple

@stevvooe Sounds good. I like either of those solutions from a purely utilitarian standpoint as long as the ability to inspect a remote repository is obvious to the user. Going forward, @miminar has had to devote his attention to other areas, so after having a short conversation with him, I'm going to adopt this Pull Request.

I do think that --remote will be more economical, as I'm not sure what other components would fall under the "remote" umbrella. pull, push, and search are inherently remote, and run only interfaces with remote resources (afaik) if it can't find the image locally. I wouldn't think that we would put these under a remote subcommand. What would be some other examples of commands that would run in that space? Since #13375 introduced large changes, I'll spend the next few cycles reworking this code with the possibility of a remote subcommand in mind.

@stevvooe
Contributor

@willmtemple docker remote would allow for configuration of remote resources, such as registries and search index, similar to git remote. docker remote inspect would depart from the git model by allowing one to interact with registry objects (tags, manifests, etc.). docker remote delete might allow you to delete, along with some sugar to do so with docker push. This idea is very undeveloped.

It sounds like you are headed down the option 1 path. Let's not forget the following when we do eventually delete this functionality:

Add the --remote flag to docker inspect with the caveat that we can remove it in any version at any time without notice. This is not something that we want to continue supporting in the future.

@rhatdan
Contributor

rhatdan commented Jul 23, 2015

That is fine with us, since we plan on wrapping this functionality. We could even start out with it marked as deprecated if you want. I would prefer to see the idea of docker remote evolve and show up before we spend a lot of time building it, since you seem to understand it much better than we do.

We do want to get access to the API via docker-py also, so we can talk to them about this.

@rhatdan
Contributor

rhatdan commented Jul 23, 2015

As I explained above, #14258 (comment), we want to use this capability in the atomic tool to tell users whether or not they are using older layered images.

@icecrime
Contributor

icecrime commented Aug 3, 2015

I haven't read this PR in every detail, but I understand you reached an agreement on the path forward, so I'm assuming you'll be ok with me closing this.

Development

Successfully merging this pull request may close these issues.

Proposal: pull down JSON file independent from an image
9 participants