Remote snapshotter in podman #4739

Closed
siscia opened this issue Dec 21, 2019 · 35 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@siscia

siscia commented Dec 21, 2019

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind feature

Description Remote snapshotter in podman

We are very interested in using a remote snapshotter, like the one provided in containerd, in podman as well.

A remote snapshotter is a containerd plugin that is given the name of the layer containerd needs; it either mounts the correct directory or returns an error.

This whole process is managed by "user-level" code (the plugin).

Is something like this even possible in podman?

@openshift-ci-robot openshift-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Dec 21, 2019
@baude
Member

baude commented Dec 22, 2019

care to contribute?

@siscia
Author

siscia commented Dec 22, 2019

Maybe I can find the time myself, or maybe we can find the resources to have somebody work on it.
Is this something interesting for podman? If I had a PR ready, would it be merged?
I haven't really found the time to explore the codebase yet. Who is responsible for this particular part of it?

@AkihiroSuda
Collaborator

cc @ktock

@mheon
Member

mheon commented Dec 22, 2019 via email

@ktock

ktock commented Dec 23, 2019

Thanks a lot for opening it! I'm keen to contribute to it.

I'm currently working on an implementation of a remote snapshotter plugin which enables us to mount layers without pulling the actual contents. We can plug any filesystem into it, so we currently support CRFS's stargz-based filesystem and are discussing support for CernVM-FS and other filesystems.
https://github.com/ktock/remote-snapshotter

Remote snapshotter currently supports containerd's snapshotter API but I think it's not hard to support graphdriver API as well.

I'll take a deeper look at the codebase.

@siscia What do you think about this implementation strategy?

@ktock

ktock commented Dec 23, 2019

I opened the discussion on containers/storage#498 .

@giuseppe
Member

The low-level bits are being worked on in https://github.com/giuseppe/crfs-plugin and fuse-overlayfs.

The rest of the implementation, as @mheon said, should go into containers/storage.

Other remote file systems can be added in a similar way to crfs-plugin, which can be used with fuse-overlayfs to look up files from the image/lower layers.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Jan 24, 2020

@giuseppe Any more progress on this?

@siscia
Author

siscia commented Jan 24, 2020

I am still trying to figure out what the best approach for this is.

I was expecting the storage to be layer-digest based, with each layer indexed by its own hash.
Something like:

- storage
|- 123...
|- abc...
|- ....

where 123... and abc... are the hashes of the layers themselves.

This does not seem to be the case.

Unfortunately, I still haven't understood which hash is used to index the layers.

This is quite a complication, because a read-only remote snapshotter needs to know how podman looks up layers.

As I understand it, this is managed by the containers/image codebase, which I am exploring now. However, it is a big codebase and it is taking time.

Any feedback or help is much appreciated.

I may be missing something in my analysis as well!

@rhatdan
Member

rhatdan commented Feb 17, 2020

@siscia Are you still working on this?

@siscia
Author

siscia commented Feb 17, 2020

Honestly, we were planning to propose this as a GSoC project.

Progress on this front is a little slow at the moment, since the program hasn't started yet.

Regarding this project, I am busy with the bureaucracy on our side (mostly sorted out) and with coming up with a good test for prospective students.

@ktock

ktock commented Jun 9, 2020

I recently thought about the design for this and wrote a PoC. We might need changes in both containers/image and containers/storage, so I opened threads (Pull Requests) for each repo:

Though they are still drafts, could I get comments on them? Both are written from the stargz perspective, so I'd be happy if the CVMFS people gave feedback on them.

@giuseppe
Member

giuseppe commented Jun 9, 2020

there is a GSOC student working on adding CVMFS support to containers/storage.

@Mohitty could you take a look at the proposal?

@ktock we are planning on using the concept of "additional store" we have in containers/storage to emulate a remote snapshotter. Have you had a look at it?

@ktock

ktock commented Jun 10, 2020

Good to hear! Please let me know if there is anything I can help with, because I'm currently working on a remote snapshotter implementation (containerd/stargz-snapshotter) in the containerd community.

IIUC, the additional store functionality doesn't support layer discovery? I used the Driver.Create API instead, because stargz uses container registries as the backing remote store and it's hard to sync all layer metadata from these registries to nodes in advance. So the stargz snapshotter discovers the target layer from registries and dynamically mounts it on each query for a layer digest. Please tell me if I'm missing something. BTW, can CVMFS sync all layer metadata from the backing remote store to nodes in advance?

@Mohitty

Mohitty commented Jun 10, 2020

> there is a GSOC student working on adding CVMFS support to containers/storage.
>
> @Mohitty could you take a look at the proposal?
>
> @ktock we are planning on using the concept of "additional store" we have in containers/storage to emulate a remote snapshotter. Have you had a look at it?

Thanks @giuseppe
I'll take a look at it.

@ktock

ktock commented Jun 17, 2020

@giuseppe @siscia @Mohitty

Thanks for the comments. Based on #4739 (comment), I rethought the design of this functionality to leverage the additional layer store. What I've done is add layer discovery functionality to the store, which should also be needed for CVMFS integration. (containers/storage#644, containers/image#956)

For some filesystems, including the stargz-based one, it is difficult to recognize all available layers and store the exhaustive list of *store.Layer in the additional store in advance. We need something like "layer discovery" functionality here, which allows clients (e.g. *storageImageSource.TryReusingBlob) to tell the store which layers they want, along with some additional information (e.g. layer digest, diffID, image reference, etc.). This allows the store to discover the specified layers from remote stores and to add the corresponding *store.Layer information to the list in the additional layer store. Later calls to the store APIs can then recognize these layers.

For more details of the design, please see:

@rhatdan
Member

rhatdan commented Sep 10, 2020

@ktock @giuseppe Still working on this?

@siscia
Author

siscia commented Sep 14, 2020

We are close to merging a PR that will allow us to create the correct file-system structure to be used by containers/storage as additionalStorage.

It is CVMFS-specific work, though.

@ktock

ktock commented Sep 14, 2020

@siscia Do you need changes like discussed in containers/storage#644 (comment) ?

@siscia
Author

siscia commented Sep 14, 2020

No, we don't. We just use the additionalStorage interface of containers/storage.

However, while technically we don't need it, layer discoverability would arguably be a nice feature to have.

At the moment all our layers are encoded in a huge JSON blob, which is suboptimal. It still works, but...

Also, to be honest, it is not yet clear to me how we would use the interface you are proposing. Would we need Go code inside podman? Maybe we should discuss this in the other issue.

@rhatdan
Member

rhatdan commented Dec 24, 2020

@siscia @ktock What is the latest on this issue?

@ktock

ktock commented Dec 24, 2020

I'm currently working on an "additional layer store" implementation based on containers/storage#644 (comment), which allows the storage driver to use (possibly remotely mounted) exploded layers from that store without pulling them. This also enables the store to discover layers based on the annotations appended to layer blobs. I'll open draft PRs this week.

@ktock

ktock commented Dec 26, 2020

Opened PRs for enabling this. #8837, containers/storage#795, containers/image#1109.

The structure will be easier to implement than the current additional image store, from a filesystem implementer's perspective. This patch also enables layer discovery, so filesystems don't need to hold a large JSON blob that contains all layers available in the remote store. This is a great fit for registry-based lazy pulling, e.g. stargz/zstd:chunked.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Jan 27, 2021

This is still being worked on.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@ktock

ktock commented Mar 30, 2021

Getting reviews in containers/storage#795 and containers/image#1109.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@ktock

ktock commented Apr 30, 2021

Both containers/image#1109 and containers/storage#795 are merged! Thanks for the reviews. Though there are still limitations (containers/storage#795 (comment)), lazy pulling will be available in Podman/CRI-O soon. I'll keep working on eliminating these limitations.

I'll work on bumping c/storage and c/image in Podman and CRI-O to apply these patches.
@giuseppe Can we have a new release tag of containers/image for containers/image#1109 ?

An example implementation of "Additional Layer Store" (a plugin for lazy pulling) for eStargz is available at containerd/stargz-snapshotter#301.

@giuseppe
Member

giuseppe commented May 3, 2021

@vrothberg could we have a new tag for c/image?

@vrothberg
Member

> @vrothberg could we have a new tag for c/image?

Roger that. @rhatdan anything else you want in? Have to cut a new minor release (5.12.0).

@rhatdan
Member

rhatdan commented May 3, 2021

No that is fine with me.

@vrothberg
Member

I think this work is done now. Please reopen if I am wrong.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 21, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023