FeatureRequest: provide a way to reuse byte slice allocated in readEntry #229

Open
choleraehyq opened this issue Jun 19, 2020 · 7 comments

@choleraehyq

Currently we allocate a byte slice directly in encoding.go:readEntry. Can we provide a way to reuse the slice here?

Here are two possible solutions.
The first is to provide a new function like:

func (c *BigCache) GetWithReuseFunc(key string) (value []byte, reuseFunc func(*[]byte), err error)

Once users no longer need value, they can call reuseFunc(&value) to give the slice back to BigCache; internally we could maintain a set of sync.Pools to reuse slices of different sizes.
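
As a rough sketch of what the pool side could look like (all names below are hypothetical; none of this exists in bigcache today): readEntry would take its destination from a size-bucketed pool, and reuseFunc would simply be a closure over put.

import (
	"math/bits"
	"sync"
)

// sizedPools keeps one sync.Pool per power-of-two size class so a
// returned slice can serve later reads of a similar size.
type sizedPools struct {
	pools [64]sync.Pool
}

// classFor returns the smallest c such that 1<<c >= n.
func classFor(n int) int {
	if n <= 1 {
		return 0
	}
	return bits.Len(uint(n - 1))
}

func (p *sizedPools) get(n int) []byte {
	c := classFor(n)
	if b, ok := p.pools[c].Get().(*[]byte); ok {
		return (*b)[:n]
	}
	return make([]byte, n, 1<<c)
}

func (p *sizedPools) put(b *[]byte) {
	c := classFor(cap(*b))
	if cap(*b) == 1<<c { // only pool exact size classes, so get never returns a too-small slice
		p.pools[c].Put(b)
	}
}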

The second is:

func (c *BigCache) GetWithPreallocatedSlice(key string, preallocated []byte) (value []byte, err error)

Users can provide a preallocated slice to that function, just like io.Reader or https://github.com/golang/snappy/blob/master/decode.go#L57. If that slice is large enough, we use and return it directly; otherwise, we return a newly allocated slice.
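
A minimal sketch of the core of this second option, using a hypothetical helper copyEntry (not bigcache code): reuse the caller's slice when it is big enough and allocate only when it is not, mirroring how snappy.Decode treats its dst argument.

// copyEntry copies the wrapped entry into preallocated if it fits,
// otherwise into a fresh allocation, and returns the result.
func copyEntry(preallocated, wrapped []byte) []byte {
	if cap(preallocated) >= len(wrapped) {
		preallocated = preallocated[:len(wrapped)]
	} else {
		preallocated = make([]byte, len(wrapped))
	}
	copy(preallocated, wrapped)
	return preallocated
}

A caller could then keep a single buffer across many lookups, passing it in each time and reassigning the returned value back to the buffer.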

@cristaloleg
Collaborator

Hi @choleraehyq, thank you for the idea. I had this in mind at one point but set it aside. I'm not 100% sure this will work with the underlying bytes queue (used internally), but if it does, this feature would be pretty useful (with a sync.Pool or any byte pool outside of BigCache).

I think the second API is clearer and follows a pretty popular pattern; we can try to implement it, provided it doesn't conflict with the bytes queue behaviour.

@siennathesane
Collaborator

I don't think this will work as intended with the internal BytesQueue because (and I could be wrong) we shard data across multiple queues to spread things out more evenly, so passing a slice into a queue would break things by default. If you appended the slice to the end of the queue, that could work, but then the cache configuration wouldn't be correct.

Since you have to pass byte slices anyway, what is the specific use case that distinguishes what you're suggesting from what already exists? Are you asking whether bigcache could reference a slice it isn't inherently managing?

@rupor-github

If I am not mistaken, something similar could be achieved with a slightly different approach, without changing the underlying bytes queue too much, by providing a "processing" closure to the API. This closure runs under the shard lock, and it is up to the user whether an actual copy needs to be made and where to keep the copied data; bigcache itself never needs to copy data while working under the lock.

// CacheEntry describes a stored entry; Key and Data point directly
// into the bytes queue and are only valid while the shard lock is held.
type CacheEntry struct {
	TS   uint64 // timestamp the entry was stored at
	Hash uint64 // hashed key
	Key  []byte
	Data []byte
}

// Processor is invoked under the shard lock with a view of the entry.
type Processor func(*CacheEntry) error

func (c *BigCache) GetWithProcessing(key string, processor Processor) error {
	hashedKey := c.hash.Sum64(key)
	shard := c.getShard(hashedKey)
	_, err := shard.get(key, hashedKey, processor)
	return err
}

For testing purposes I implemented this in my fork; it seems to work very well.
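
For illustration, calling code against the definitions above might look like this (the key and the copy strategy are the caller's choice):

var data []byte

err := cache.GetWithProcessing("some-key", func(e *CacheEntry) error {
	// This runs under the shard lock: e.Data points into the bytes
	// queue, so copy anything that must outlive the call.
	data = append(data[:0], e.Data...)
	return nil
})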

@janisz
Collaborator

janisz commented Aug 14, 2020

@rupor-github I like this approach: it's backward compatible and easy to implement.

On the other hand, it's really powerful and might cause hard-to-debug problems if not used correctly (e.g. when a Processor changes the data).
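
To make the hazard concrete, a hypothetical misuse might look like this:

_ = cache.GetWithProcessing("some-key", func(e *CacheEntry) error {
	// BUG: writes through to the shared bytes queue, silently
	// corrupting the cached entry for every other reader.
	e.Data[0] = 0
	return nil
})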

@siennathesane
Collaborator

I do like the "processor" idea, but I am also concerned that it opens up too much surface area for issues when not used correctly. How will we surface an error from the user's processor implementation in a way that is clearly separate from a BigCache issue?
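
One possible answer, purely as an assumption and not an agreed design, is to wrap processor failures in a sentinel so callers can separate them from bigcache's own errors:

import (
	"errors"
	"fmt"
)

// ErrProcessor marks failures that originated in user code rather
// than in bigcache itself (hypothetical sentinel).
var ErrProcessor = errors.New("bigcache: processor failed")

func runProcessor(p Processor, e *CacheEntry) error {
	if err := p(e); err != nil {
		return fmt.Errorf("%w: %v", ErrProcessor, err)
	}
	return nil
}

Callers would then check errors.Is(err, ErrProcessor) to tell the two apart.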

@rupor-github

rupor-github commented Sep 2, 2020

@mxplusb - I share your concerns, however I do not see any efficient way to give the user control over low-level memory while keeping safety guarantees (which would require explicit memory duplication). In my fork I implemented everything (including internal access) in terms of CacheEntry and defined a set of Copy routines on it (replacing readKey, readEntry, etc.), hoping to provide better documentation. However, this is only documentation. Pinning entries in the queue, IMHO, would require much more complex queue implementations/strategies.
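
As a sketch of what such a Copy routine might look like on the CacheEntry type defined earlier (the name is assumed from the comment, not taken from the fork):

// Copy returns a CacheEntry whose Key and Data no longer alias the
// bytes queue, so it is safe to retain after the lock is released.
func (e *CacheEntry) Copy() CacheEntry {
	return CacheEntry{
		TS:   e.TS,
		Hash: e.Hash,
		Key:  append([]byte(nil), e.Key...),
		Data: append([]byte(nil), e.Data...),
	}
}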

@siennathesane
Collaborator

Let's get this into the discussion for v3. I think it's a viable design decision, but I want to save it for a major version, as it's nuanced and I want to make sure it's very well documented with examples.

// ref: #238
