
[FEATURE] add import-image command for importing single image #83

Closed
iwilltry42 wants to merge 3 commits from feature/load-images

Conversation

iwilltry42 (Member) commented Jul 1, 2019

This PR adds a stand-alone k3d import-image command that can be used like this:

k3d import-image -n test0 -i nginx:local

This would do the following (a rough shell equivalent is sketched below the list):

  1. Run docker save to save the image from the local docker daemon to a tar archive (in a directory that's bind-mounted into all the node containers)
  2. Exec into each node container and run ctr image import to make the image usable by containerd inside the containers
  3. Delete the archive locally
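For illustration, here is a rough manual equivalent of what the command automates, assuming a cluster named test0 with a single worker and the images directory mounted at /images inside the node containers (container names and paths are illustrative):

```bash
# Hypothetical manual walk-through; k3d drives the same steps via the Docker API.
docker save nginx:local -o "<clusterDir>/images/nginx.tar"          # 1. save to the shared dir
docker exec k3d-test0-server ctr image import /images/nginx.tar     # 2. import on every node
docker exec k3d-test0-worker-0 ctr image import /images/nginx.tar
rm "<clusterDir>/images/nginx.tar"                                  # 3. clean up
```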

Requirements:

  • internal: always needs to bind-mount a directory (currently <clusterDir>/images) into all the node containers
  • external: rancher/k3s with a tag >= v0.7.0-rc2 because of the inclusion of ctr in k3s

Up for improvement:

  • make use of args instead of --image/-i
  • allow submitting a comma- or space-separated list of images to import multiple images with a single run
  • bind-mount a directory into all the containers that can be used by containerd to cache images
    • That way, we'd only need to import the image into a single node and have it shared with the others
      • mainly to save space on the local disk

Note:

This is still in a very early phase, so this PR is mainly meant for contributors and testers to have a look and give their opinions :)

Issue Reference: #19

iwilltry42 added the 'enhancement' label Jul 1, 2019
iwilltry42 self-assigned this Jul 1, 2019
iwilltry42 (Member, Author) commented Jul 1, 2019

Update: the changes in this branch leverage goroutines to run the ctr image import in all nodes concurrently.
Pro/Con:

  • Pro: faster than issuing the command sequentially per container
  • Unsure: we're only printing (not returning) an error when the import fails for one container
  • Unsure: the same could be achieved by submitting the exec job for each container to the docker API and keeping the connections in a list; reading from those connections would then indicate which of them finished successfully (see the sketch below)
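For illustration, a CLI-level sketch of the same idea: run the imports in parallel and collect per-node results instead of only printing failures (node names and the /images path are illustrative):

```bash
# Hypothetical sketch: parallel imports with per-node failure collection.
pids=()
for node in k3d-test0-server k3d-test0-worker-0 k3d-test0-worker-1; do
  docker exec "$node" ctr image import /images/nginx.tar &
  pids+=($!)
done
failures=0
for pid in "${pids[@]}"; do
  wait "$pid" || failures=$((failures + 1))   # non-zero exit -> import failed on that node
done
[ "$failures" -eq 0 ] || echo "$failures node(s) failed to import the image" >&2
```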

andyz-dev (Contributor) commented:

Here are some high-level comments from a quick glance; I have not reviewed the code line by line.

  1. I'd think 'load' will be easier to type than 'import-image', so how about

k3d load -n test1 image1 image2 ... imageN

  2. Notice that the -i option is also dropped from the above example; image1 .. imageN are arguments to the load subcommand. We can thus avoid requiring a comma-separated list. (I think you have the same suggestion within the PR?)

  3. Instead of using a host bind mount, how about using a docker volume? I'd think a docker volume is more portable regardless of the host file system, and we don't have to worry about writing into the host file system directly. It may also help when the docker machine is running remotely. Since we have access to the docker API directly, we can just create a docker volume for each cluster for loading images.

iwilltry42 (Member, Author) commented:

> 1. I'd think 'load' will be easier to type than 'import-image', so how about
>
> k3d load -n test1 image1 image2 ... imageN

Let's add load as an alias, so that the more verbose (explanatory) option is still present. WDYT?

> 2. Notice that the -i option is also dropped from the above example; image1 .. imageN are arguments to the load subcommand. We can thus avoid requiring a comma-separated list. (I think you have the same suggestion within the PR?)

I already dropped the -i flag in favor of using args, where the list can either be comma- or space-separated 👍

> 3. Instead of using a host bind mount, how about using a docker volume? I'd think a docker volume is more portable regardless of the host file system, and we don't have to worry about writing into the host file system directly. It may also help when the docker machine is running remotely. Since we have access to the docker API directly, we can just create a docker volume for each cluster for loading images.

You're right, I'll have a look into that 👍

iwilltry42 (Member, Author) commented Jul 2, 2019

One problem I found while working on a solution using a docker volume is that, with default settings, we may not have access to the volume directory that docker uses, so we'll have to figure out a way to either
a) put the directory in a location without protected access, or
b) save the tarball there somehow via the API.

UPDATE: An idea could be to spawn a container that has the Docker CLI in it (or a script using the API), shares the docker socket (or has access to the API), and mounts the named volume. It could then run docker save directly into the named volume.
The idea I had before was a combination of docker save and docker cp, but in a scenario where you're connected to a remote docker daemon, this would result in twice the network traffic.

My brain is hurting a bit from thinking about this for too long... maybe I'm missing something obvious here? @andyz-dev

andyz-dev (Contributor) commented:

>> 1. I'd think 'load' will be easier to type than 'import-image', so how about
>>
>> k3d load -n test1 image1 image2 ... imageN
>
> Let's add load as an alias, so that the more verbose (explanatory) option is still present. WDYT?

Your call. In general, I am not a big fan of aliases, but it probably makes sense in this case, as you have pointed out.

>> 2. Notice that the -i option is also dropped from the above example; image1 .. imageN are arguments to the load subcommand. We can thus avoid requiring a comma-separated list. (I think you have the same suggestion within the PR?)
>
> I already dropped the -i flag in favor of using args, where the list can either be comma- or space-separated 👍

Thank you!

>> 3. Instead of using a host bind mount, how about using a docker volume? I'd think a docker volume is more portable regardless of the host file system, and we don't have to worry about writing into the host file system directly. It may also help when the docker machine is running remotely. Since we have access to the docker API directly, we can just create a docker volume for each cluster for loading images.
>
> You're right, I'll have a look into that 👍

Nice!

andyz-dev (Contributor) commented:

> One problem I found while working on a solution using a docker volume is that, with default settings, we may not have access to the volume directory that docker uses, so we'll have to figure out a way to either
> a) put the directory in a location without protected access, or
> b) save the tarball there somehow via the API.
>
> UPDATE: An idea could be to spawn a container that has the Docker CLI in it (or a script using the API), shares the docker socket (or has access to the API), and mounts the named volume. It could then run docker save directly into the named volume.
> The idea I had before was a combination of docker save and docker cp, but in a scenario where you're connected to a remote docker daemon, this would result in twice the network traffic.
>
> My brain is hurting a bit from thinking about this for too long... maybe I'm missing something obvious here? @andyz-dev

Actually, you just need to create a container (without running it) that mounts the volume; the docker API can then access the volume directly. We can just reuse the k3s image for the dummy container mounting the volume, so we don't have to pull another docker image unnecessarily.

A nice touch would be to have the dummy container mount the volume as read-write for doing 'docker cp', while the volume should be attached to the k3s nodes as read-only.

iwilltry42 (Member, Author) commented:

> Actually, you just need to create a container (without running it) that mounts the volume; the docker API can then access the volume directly. We can just reuse the k3s image for the dummy container mounting the volume, so we don't have to pull another docker image unnecessarily.

I think we're talking about the same thing in the end, but what do you mean by without running?
Because we need to have some logic present in that container to query the API (e.g. the Docker CLI or some self-made script). Or would you simply curl it or something similar? 🤔

> A nice touch would be to have the dummy container mount the volume as read-write for doing 'docker cp', while the volume should be attached to the k3s nodes as read-only.

Definitely a good idea, I'd say 👍
Though, since we'll have a volume there anyway now, we could also store other cluster-related stuff in it if needed.

andyz-dev (Contributor) commented:

>> Actually, you just need to create a container (without running it) that mounts the volume; the docker API can then access the volume directly. We can just reuse the k3s image for the dummy container mounting the volume, so we don't have to pull another docker image unnecessarily.
>
> I think we're talking about the same thing in the end, but what do you mean by without running?
> Because we need to have some logic present in that container to query the API (e.g. the Docker CLI or some self-made script). Or would you simply curl it or something similar? 🤔

The following example may be helpful to illustrate what I mean (a command-line sketch follows the list):

  1. The user (or the API equivalent) runs docker volume create cluster-volume and attaches cluster-volume to all k3s cluster nodes (as read-only).

  2. Since docker doesn't allow images to be added to a volume directly, we need to mount the volume into a container, so do the (API equivalent of) docker container create --name dummy -v cluster-volume:/opt/images k3s-image. Since we only create the container (without running it), we don't really care which image it uses, so we might as well use the k3s image to avoid a potential extra image pull.

  3. Now we can run docker cp tar-ball dummy:/opt/images to copy a tarball into the volume. Once done, we can remove the dummy container. Using golang code against the Docker API, I believe you can also avoid saving the tarball into a local file system.

  4. Since we are not using any binaries within the container and we are not even running it, we don't need to prepare a special image for it.
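A command-line sketch of those four steps (cluster, volume, and image names are illustrative; k3d would issue the API equivalents from Go):

```bash
# Step 1: one volume per cluster, attached read-only to the k3s nodes.
docker volume create cluster-volume

# Step 2: create (but never start) a dummy container that mounts the volume read-write.
docker container create --name dummy -v cluster-volume:/opt/images rancher/k3s:v0.7.0-rc2

# Step 3: copy the tarball into the volume, then discard the dummy container.
docker cp nginx.tar dummy:/opt/images/
docker rm dummy

# Each node that mounts the volume can now import the image from it:
docker exec k3d-test0-server ctr image import /opt/images/nginx.tar
```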

iwilltry42 (Member, Author) commented:

I understand what you want to do, but since docker cp runs from your local machine, it copies the tarball from your local machine (potentially over the network) to the dummy container running on the docker host.

That means that you first have to get the tarball onto your local machine.

As the goal of this PR is to export an image that you've built before using docker (which might be running on a remote machine), you have to run docker save first to create the tarball.

Thus, you'd copy the tarball twice: from the docker host to your localhost (docker save) and then back (docker cp), which I'd say is quite inefficient.

If you already have the tarball locally for some reason, then the proposed --tar option could be used to issue only the docker cp.

That's why I thought we could completely skip transferring the tarball from the docker host to localhost and back by simply spawning a k3d-helper container on the docker host that mounts the same volume (just like you described in steps 1 and 2).
That helper container would run in privileged mode, mounting /var/run/docker.sock (shouldn't be a problem, since the k3s containers also run privileged), thus having access to the docker daemon/API and to the volume.
The container (which can be FROM scratch with only a few MB of binary) would then run docker save directly into the volume it has mounted (see the sketch below).
Thus, we'd avoid any additional transfer of the image from the docker host to localhost.
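A hypothetical sketch of that helper, using the stock docker image (which ships the Docker CLI) in place of a dedicated k3d-helper image:

```bash
# Hypothetical: the helper talks to the host's daemon through the mounted socket
# and writes the tarball straight into the named volume, so the image data
# never leaves the docker host.
docker run --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v cluster-volume:/images \
  docker:stable \
  docker save nginx:local -o /images/nginx.tar
```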

I don't think this is a super clean way of doing it, but it has two big benefits:

  • It works in any environment, since we don't rely on the local filesystem but only talk to the docker API (even if it's remote)
  • It should be faster, since we skip transferring the tarball between localhost and the docker host (which might mean network traffic)

The drawback that I see:

  • one additional container image that needs to be pulled (though we might even be able to integrate it into the k3s image)

WDYT?

andyz-dev (Contributor) commented:

I agree with your analysis. The k3d-helper approach would save more network bandwidth than using the dummy container.

Here is my take on the design: I believe the majority of users will not be using a remote docker daemon, since k3d is mainly aimed at laptop development, and not having to maintain another image makes k3d easier to use. I view k3d-helper as an optional solution that can be developed later if we hear more bandwidth concerns. On the other hand, I would not oppose making k3d-helper the primary solution.

iwilltry42 (Member, Author) commented:

> I agree with your analysis. The k3d-helper approach would save more network bandwidth than using the dummy container.
>
> Here is my take on the design: I believe the majority of users will not be using a remote docker daemon, since k3d is mainly aimed at laptop development, and not having to maintain another image makes k3d easier to use. I view k3d-helper as an optional solution that can be developed later if we hear more bandwidth concerns. On the other hand, I would not oppose making k3d-helper the primary solution.

I agree that it's mostly being used with a local docker daemon (tbh, I often didn't consider the other case, e.g. when I first created this PR). For the short term, I will go with the solution you proposed (dummy container + save + cp), since it's easy to implement right now.
Anyway, I will give the other solution a try, since I don't think it adds too much complexity, but I'd like to benchmark it against this solution 👍

Thanks for the healthy discussion and the honest feedback :)

iwilltry42 (Member, Author) commented:

Closing this for now, until we find a proper solution without too many drawbacks that works with both a local and a remote docker daemon.

minhnguyenvan95 commented:

Can we use a local docker registry as the image repo for k3d, e.g. localhost:5000/some-repo:v23456?

iwilltry42 (Member, Author) commented:

If you have a registry running, then yes, but you have to edit the containerd config to be able to pull from there. There is another issue open regarding pulling from private registries, and there are some discussions in Slack (rancher-users, #k3d).

Is that what you mean?
The feature in this PR is for migrating an image from the local docker daemon (e.g. created by docker build) into k3s' containerd.

iwilltry42 deleted the feature/load-images branch September 3, 2019