storage: add "proxycache" storage type #443

Closed
bradfitz opened this issue May 14, 2014 · 9 comments

Comments

@bradfitz
Contributor

The FUSE code already does its own caching of blobs, but that code isn't available via a
registered storage type.

We should add pkg/blobserver/proxycache adding the "proxycache" storage type,
which requires:

-- an "origin" storage prefix
-- a "cache" storage prefix
-- a "maxCacheSize"
-- a "usageMeta" sorted.KeyValue for tracking access times

Related: issue #443 so diskpacked can be an efficient cache.
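
For illustration only, the shape of the proposal above might look roughly like the following sketch. It is not the Camlistore code: `Store`, `MemStore`, `ProxyCache` and `ErrNotFound` are hypothetical names, the real blobserver interfaces are richer, and an in-memory LRU list stands in for the "usageMeta" sorted.KeyValue that would track access times.

```go
package main

import (
	"container/list"
	"errors"
	"sync"
)

// ErrNotFound is a hypothetical sentinel for a blob that a store does not hold.
var ErrNotFound = errors.New("blob not found")

// Store is a simplified, hypothetical stand-in for a blob storage backend.
type Store interface {
	Fetch(ref string) ([]byte, error)
	Receive(ref string, data []byte) error
	Remove(ref string) error
}

// MemStore is a map-backed Store for illustration; not safe for concurrent use.
type MemStore map[string][]byte

func (m MemStore) Fetch(ref string) ([]byte, error) {
	if b, ok := m[ref]; ok {
		return b, nil
	}
	return nil, ErrNotFound
}

func (m MemStore) Receive(ref string, data []byte) error { m[ref] = data; return nil }
func (m MemStore) Remove(ref string) error               { delete(m, ref); return nil }

// entry records one cached blob and its size for eviction accounting.
type entry struct {
	ref  string
	size int64
}

// ProxyCache serves blobs from cache when possible and falls back to origin.
type ProxyCache struct {
	mu            sync.Mutex
	origin, cache Store
	maxCacheSize  int64                    // upper bound on cached bytes
	size          int64                    // bytes currently cached
	order         *list.List               // front = most recently used
	entries       map[string]*list.Element // ref -> element holding *entry
}

func NewProxyCache(origin, cache Store, maxCacheSize int64) *ProxyCache {
	return &ProxyCache{origin: origin, cache: cache, maxCacheSize: maxCacheSize,
		order: list.New(), entries: make(map[string]*list.Element)}
}

// Fetch returns the blob from the cache on a hit; on a miss it reads the
// origin and populates the cache. The lock is held throughout for simplicity.
func (p *ProxyCache) Fetch(ref string) ([]byte, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if el, ok := p.entries[ref]; ok {
		if data, err := p.cache.Fetch(ref); err == nil {
			p.order.MoveToFront(el) // record the access
			return data, nil
		}
		p.forget(el) // tracked but gone from the cache; fall through to origin
	}
	data, err := p.origin.Fetch(ref)
	if err != nil {
		return nil, err
	}
	p.add(ref, data)
	return data, nil
}

// add stores data in the cache (best effort) and evicts least recently used
// blobs until the total is back under maxCacheSize.
func (p *ProxyCache) add(ref string, data []byte) {
	n := int64(len(data))
	if n > p.maxCacheSize || p.cache.Receive(ref, data) != nil {
		return
	}
	p.entries[ref] = p.order.PushFront(&entry{ref: ref, size: n})
	p.size += n
	for p.size > p.maxCacheSize {
		oldest := p.order.Back()
		p.cache.Remove(oldest.Value.(*entry).ref)
		p.forget(oldest)
	}
}

// forget drops the bookkeeping for one cached blob.
func (p *ProxyCache) forget(el *list.Element) {
	e := p.order.Remove(el).(*entry)
	delete(p.entries, e.ref)
	p.size -= e.size
}
```

With two 4-byte blobs in the origin and a `maxCacheSize` of 6, fetching the first and then the second leaves only the second in the cache. A persistent access-time index would be needed for the eviction order to survive restarts, which this sketch does not attempt.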
@bradfitz
Contributor Author

Comment 1:

I meant: Related: issue #442 so diskpacked can be an efficient cache.

@edrex
Contributor

edrex commented May 14, 2014

Comment 2:

I'm interested in having fast storage be both an LRU cache and a write buffer. Could both
use cases be served by a single storage type?

@bradfitz
Contributor Author

Comment 4:

Labels changed: added storage-blobs.

Owner changed to ---.

Status changed to HelpWanted.

@stephens2424
Contributor

Just as an FYI, I've started working on this here. I've got a few things to fix and clean up before I'd call it review-ready. This basically fills in a few TODO items I found in proxycache and adds some tests. The approach steps away from sorted.KeyValue and uses the lru package and the stats blobserver. There's a bit of redundant data being stored, so I'm not in love with it, but I do think the implementation is simpler using those imports. Still hacking a bit, but feel free to make suggestions if you find the time.

@edrex
Contributor

edrex commented Dec 5, 2016

A nice (later) validation of this might be to remove the caching code from pkg/fs and have cmd/cammount wire one up at runtime (maybe with a flag to turn it off, which could be useful for debugging cache issues).

@stephens2424
Contributor

This change is getting close, so I uploaded to the review server in case anyone wants a head start.

@stephens2424
Contributor

Something in particular I could use a little input on: when we verify the cache, what should we do with the different errors?

  • The cache is missing a ref the caller asked to check.
    This seems like an easy call: just return the error.

  • The origin is missing a ref the cache has.
    This probably means the item was deleted from the origin. Should we just delete it from the cache? Or should we put it back onto the origin?

  • The sizes of a ref in the origin and the cache differ.
    In theory, this means a hash collision, but it's possible this case will occur at some point because of a bug of some kind. Maybe actually panic in this case? Lots of hijinks could probably happen if equal refs could show up with differing sizes (i.e. different content).

@stephens2424
Contributor

So I went with returning an error in all the cases. (The cache missing a ref only becomes an error when you ask to verify specific refs.)
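
As a rough sketch of that decision (not the code under review), verification could report each of the three cases as a distinct error. Everything here is an assumption made for illustration: the `Sizes` map stands in for whatever stat information the real stores expose, and `Verify` and the three error values are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors, one per case discussed in the thread.
var (
	ErrCacheMissing  = errors.New("cache is missing ref")
	ErrOriginMissing = errors.New("origin is missing ref held by cache")
	ErrSizeMismatch  = errors.New("origin and cache sizes differ")
)

// Sizes is a hypothetical stand-in for a store's stat data: ref -> blob size.
type Sizes map[string]uint32

// Verify checks the cache against the origin and returns the first problem
// found. With no refs given, every ref in the cache is checked, so the
// "cache missing" case cannot arise. With refs given, a ref that is absent
// from the cache is itself reported as an error.
func Verify(origin, cache Sizes, refs ...string) error {
	if len(refs) == 0 {
		for ref := range cache {
			refs = append(refs, ref)
		}
	}
	for _, ref := range refs {
		cacheSize, inCache := cache[ref]
		if !inCache {
			return fmt.Errorf("%w: %s", ErrCacheMissing, ref)
		}
		originSize, inOrigin := origin[ref]
		if !inOrigin {
			return fmt.Errorf("%w: %s", ErrOriginMissing, ref)
		}
		if originSize != cacheSize {
			return fmt.Errorf("%w: %s (origin %d, cache %d)",
				ErrSizeMismatch, ref, originSize, cacheSize)
		}
	}
	return nil
}
```

Wrapping with `%w` lets a caller tell the cases apart with `errors.Is` and decide later whether, say, a ref missing from the origin should trigger a cache delete instead of a hard failure.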

I decided to punt on the BlobHub issue for now, but I think that's okay. Deleting blobs seems like it'd be a rare use case and changing them would involve finding a hash collision ... so even more rare.

I also think I found a bug in the go4.org/jsonconfig package while writing a test for the configuration piece of this. That last test is currently failing, but passes with this change applied: https://review.gerrithub.io/#/c/346661/.

Apart from that last item, this change is "done" and I would be delighted with some code review :)

https://camlistore-review.googlesource.com/#/c/8926/
