image layout: move to slash separation for blob storage layout #208

Closed
stevvooe opened this issue Aug 24, 2016 · 12 comments

@stevvooe
Contributor

Currently, blobs are stored with the following format:

blobs/<digest>-<hex bytes>

This has caused confusion in the context of the :-separated hash format.

To make this clearer, blobs should be stored in the following format:

blobs/sha256/<hex bytes>

cc @vbatts @philips
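
As a rough illustration of the proposed change, the mapping from a `:`-separated digest to a blob path might be sketched as below. The function name `blob_path` and the `slash_layout` flag are hypothetical, and the sketch assumes that `<digest>` in the current format stands for the algorithm name (for example `sha256`).

```python
def blob_path(digest: str, slash_layout: bool = True) -> str:
    """Map an '<algorithm>:<hex>' digest to a path relative to the layout root.

    Hypothetical helper for illustration; not part of any existing tooling.
    """
    algorithm, sep, hex_digest = digest.partition(":")
    if not sep or not algorithm or not hex_digest:
        raise ValueError(f"expected '<algorithm>:<hex>', got {digest!r}")
    if slash_layout:
        # Proposed layout: blobs/<algorithm>/<hex bytes>
        return f"blobs/{algorithm}/{hex_digest}"
    # Current layout (assumed reading): blobs/<algorithm>-<hex bytes>
    return f"blobs/{algorithm}-{hex_digest}"
```

With the slash layout, the algorithm becomes a directory, so the file name is just the hex portion of the digest.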

@stevvooe stevvooe added this to the v0.5.0 milestone Aug 24, 2016
@wking
Contributor

wking commented Aug 24, 2016

Or we could use <algo>-<hex-digest> for both the blob names and the blob filenames, in which case there would be even less confusion.

I'm ok with using a path separator, but Windows folks would have to be careful to use the appropriate separator for the engine (e.g. / for image-layout backed by HTTPS, tar, … and \ for image-layout backed by local directories).
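
A minimal sketch of the separator concern, assuming a tool that knows which kind of backend it is writing to. The backend names (`"https"`, `"tar"`, `"windows-dir"`) and the function `join_blob_path` are made up for this example; only `posixpath.join` and `ntpath.join` are real standard-library calls.

```python
import ntpath
import posixpath


def join_blob_path(algorithm: str, hex_digest: str, backend: str) -> str:
    """Join blob path components with the separator the backend expects.

    The backend labels are hypothetical and exist only for this sketch.
    """
    if backend in ("https", "tar"):
        # URL paths and tar member names use forward slashes on every platform.
        return posixpath.join("blobs", algorithm, hex_digest)
    if backend == "windows-dir":
        # A local directory on Windows uses backslashes.
        return ntpath.join("blobs", algorithm, hex_digest)
    raise ValueError(f"unknown backend: {backend!r}")
```

Using `os.path.join` unconditionally would pick the separator of the host platform, which is the mistake the comment above warns about for non-local backends.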

@philips
Contributor

philips commented Aug 24, 2016

+1 to this concept.

@xiekeyang
Contributor

Should we also add a subdirectory keyed on the highest 8 bits of the hash digest (its first two hex characters), like blobs/sha256/1a/<hex bytes>?
This mapping can speed up retrieval and avoids having lots of blob files in one folder.
In the future, the storage will act as a repository, so it will collect lots of blobs.
This approach is widely used in CAS systems.
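
For readers unfamiliar with this kind of sharding, the path shape being suggested could be sketched as follows. The name `sharded_blob_path` and the `prefix_len` parameter are hypothetical, and the sketch assumes the file name keeps the full hex digest rather than only the remainder after the prefix. It shows the layout only and makes no claim about retrieval performance.

```python
def sharded_blob_path(digest: str, prefix_len: int = 2) -> str:
    """Build blobs/<algorithm>/<prefix>/<hex> from an '<algorithm>:<hex>' digest.

    Two hex characters encode the highest 8 bits of the digest.
    Hypothetical helper for illustration only.
    """
    algorithm, sep, hex_digest = digest.partition(":")
    if not sep or not algorithm or len(hex_digest) < prefix_len:
        raise ValueError(f"cannot shard digest {digest!r}")
    # Assumption: the leaf file name repeats the full hex digest.
    return f"blobs/{algorithm}/{hex_digest[:prefix_len]}/{hex_digest}"
```

With a two-character prefix, blobs are spread over at most 256 subdirectories per algorithm.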

@glestaris
Contributor

That would be nice. We can store multiple images under the same directory structure.

@stevvooe
Contributor Author

@xiekeyang @glestaris We have discussed the convention of using a portion of the digest to partition the CAS files but have been unable to find any recent examples of where this actually speeds up retrieval.

Do you have any data supporting this design? What is the actual performance impact?

@wking
Contributor

wking commented Aug 25, 2016

On Thu, Aug 25, 2016 at 07:06:12AM -0700, Stephen Day wrote:

> @xiekeyang @glestaris We have discussed the convention of using a
> portion of the digest to partition the CAS files but have been
> unable to find any recent examples of where this actually speeds up
> retrieval.

Previous discussion of hash sharding starting here 1.

@xiekeyang
Contributor

@stevvooe @wking Got it, thanks a lot!

@philips
Contributor

philips commented Aug 29, 2016

@runcom @s-urbaniak could you work up a patch to make this change in the tooling? Then @stevvooe or I can make the English-language change.

@stevvooe stevvooe self-assigned this Aug 29, 2016
@stevvooe
Contributor Author

@philips I can take this on!

@philips
Contributor

philips commented Aug 31, 2016

@stevvooe are you going to do the code, the English, or both?

@philips
Contributor

philips commented Aug 31, 2016

@runcom volunteered to do the code side of this one.

@runcom
Member

runcom commented Aug 31, 2016

created #230 for the code side
