clustering: zot scale-out cluster #125

Open

hallyn opened this issue Aug 5, 2020 · 4 comments
hallyn (Contributor) commented Aug 5, 2020

We will want to support running a cluster of zot servers.

When a blob is uploaded, it should be distributed to all the nodes.

When fetching an image, the client should be able to fetch each layer blob from a different server to load balance.
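
A minimal client-side sketch of that layer-fetch load balancing, assuming every node holds every blob as proposed above; the replica URLs and the fetchBlob helper are hypothetical, while the /v2/<name>/blobs/<digest> path is the standard OCI distribution endpoint:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

// Hypothetical replica set; in the model above every node holds every blob,
// so the client can spread layer fetches across members.
var replicas = []string{
	"http://zot-0:5000",
	"http://zot-1:5000",
	"http://zot-2:5000",
}

// fetchBlob GETs a single layer blob, picking a replica round-robin by layer index.
func fetchBlob(i int, repo, digest string) error {
	url := fmt.Sprintf("%s/v2/%s/blobs/%s", replicas[i%len(replicas)], repo, digest)
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s for %s", resp.Status, url)
	}
	_, err = io.Copy(os.Stdout, resp.Body) // a real client would write to its layer store
	return err
}

func main() {
	// Placeholder digests; each layer lands on a different replica.
	layers := []string{"sha256:aaaa...", "sha256:bbbb...", "sha256:cccc..."}
	for i, d := range layers {
		if err := fetchBlob(i, "myrepo", d); err != nil {
			fmt.Fprintln(os.Stderr, err)
		}
	}
}
```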

tych0 (Contributor) commented Aug 5, 2020

Might be nice to do it with redirects, so you only have to talk to any zot in the cluster instead of finding the right one.
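
A rough sketch of the redirect idea, assuming a hypothetical static membership list that all nodes agree on: any node accepts the request, maps the blob digest onto an owner (naive modulo placement here), and answers with a 307 when the owner is another node, so clients only ever need one entry point:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"log"
	"net/http"
	"strings"
)

// Hypothetical static membership; self is this node's own address.
var members = []string{"http://zot-0:5000", "http://zot-1:5000", "http://zot-2:5000"}
var self = members[0]

// ownerOf maps a content-addressable digest onto one member.
func ownerOf(digest string) string {
	h := fnv.New32a()
	h.Write([]byte(digest))
	return members[int(h.Sum32()%uint32(len(members)))]
}

// blobHandler serves the blob if this node owns it, otherwise redirects.
// Path shape is /v2/<name>/blobs/<digest> per the OCI distribution spec.
func blobHandler(w http.ResponseWriter, r *http.Request) {
	parts := strings.Split(r.URL.Path, "/")
	digest := parts[len(parts)-1]
	if owner := ownerOf(digest); owner != self {
		http.Redirect(w, r, owner+r.URL.Path, http.StatusTemporaryRedirect)
		return
	}
	fmt.Fprintf(w, "serving blob %s locally\n", digest) // real storage lookup goes here
}

func main() {
	http.HandleFunc("/v2/", blobHandler)
	log.Fatal(http.ListenAndServe(":5000", nil))
}
```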

rchincha (Contributor) commented

A couple of design considerations:

  1. Is the client aware of the members of the cluster, or do we do some sort of proxying?
  2. Given unique content-addressable blobs, routing can assume the quorum to be either stable or unstable (DHT assumptions) - see the rendezvous-hashing sketch below.
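
On point 2, if the quorum is assumed unstable, something like rendezvous (highest-random-weight) hashing keeps digest-to-node routing stable under membership changes: when a node leaves, only the blobs it owned get re-routed. A minimal sketch, with node names hypothetical:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// owner scores every (node, digest) pair and routes the digest to the
// highest-scoring node. Unlike modulo placement, removing one node only
// re-routes the digests that node owned.
func owner(nodes []string, digest string) string {
	var best string
	var bestScore uint64
	for _, n := range nodes {
		h := fnv.New64a()
		h.Write([]byte(n))
		h.Write([]byte(digest))
		if s := h.Sum64(); s > bestScore {
			best, bestScore = n, s
		}
	}
	return best
}

func main() {
	nodes := []string{"zot-0", "zot-1", "zot-2"}
	fmt.Println(owner(nodes, "sha256:abcd..."))     // deterministic: same digest, same node
	fmt.Println(owner(nodes[:2], "sha256:abcd...")) // drop zot-2: most digests keep their owner
}
```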

rchincha changed the title from "zot cluster" to "clustering: zot scale-out cluster" on Jan 29, 2021
rchincha (Contributor) commented

#2041

andaaron (Contributor) commented

Considerations on storing various data, in the context of the clustering discussion. I wrote these down a while back; some may have been addressed since.

  • We support the local file system and S3 (in the AWS case) for the image stores (responsible for storing image blobs).
  • In the case of zot sync we only support local storage, because the third-party library we use for syncing uses local storage as an intermediate destination for the copy - a note for the Kubernetes/cloud use case.
  • The information about dedupe is stored in a cache DB - on the local file system (cache.db under the root dir) or in DynamoDB in the AWS case. There are multiple such cache DBs, one per image store; these would probably need to be shared by all zot instances (see the config sketch after this list).
  • Trivy uses local disk space to store CVE information and scan results (in the folder _trivy under the root directory) - two DBs, one for Java scanning and one for the rest, the Java one being huge (hundreds of MB if I remember correctly). Right now we are not storing CVE scan results anywhere else; we rely on Trivy and an in-memory cache of the latest results.
  • zot user session authentication (for zui) uses a folder _sessions under the root directory to store session information - if we have multiple zot instances, we also need to design for the session authentication use case.
  • Right now we advertise zot with an embedded UI - in the case of a cluster, would we have multiple UIs, or a separate UI service?
  • We have a metadata DB (meta.db stored locally under the root directory, or as DynamoDB tables in the cloud case) - this DB needs to be the same for all zots (we store information on manifests, configs, download counters, and signature verification results).
  • We store certificates and private keys for signature verification locally regardless of storage type (we want to also support AWS); these files are under the root dir (folders _notation and _cosign) -> not sure if this is still applicable today for AWS.
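
For the shared cache and metadata points above, zot's existing cloud configuration already lets multiple instances point at the same S3 bucket and the same DynamoDB tables. A sketch along the lines of zot's S3 example config (bucket, region, endpoint, and table name are placeholders):

```json
{
  "distSpecVersion": "1.1.0",
  "storage": {
    "rootDirectory": "/tmp/zot",
    "dedupe": true,
    "remoteCache": true,
    "storageDriver": {
      "name": "s3",
      "region": "us-east-2",
      "bucket": "zot-storage",
      "secure": true,
      "skipverify": false
    },
    "cacheDriver": {
      "name": "dynamodb",
      "endpoint": "http://dynamodb.us-east-2.amazonaws.com",
      "region": "us-east-2",
      "cacheTablename": "BlobTable"
    }
  },
  "http": {
    "address": "0.0.0.0",
    "port": "5000"
  }
}
```

Every zot instance started with the same driver settings would then share dedupe and metadata state; the local-only pieces (_trivy, _sessions, _notation, _cosign) are the ones the list above flags as still needing a design.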
