Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(cache) Don't cache failed calls to resourceStore #1894

Merged
merged 6 commits into from
Apr 30, 2021

Conversation

lahabana
Copy link
Contributor

@lahabana lahabana commented Apr 27, 2021

We were caching errors to the resourceStore in the XDS cache
This was causing problems when the resourceStore was returning an error
we wouldn't attempt to get a value until the item was purged from the cache.
We now never cache errors, thus ensuring we recover quickly from resourceStore errors

Signed-off-by: Charly Molter charly@koyeb.com

@lahabana lahabana requested a review from a team as a code owner April 27, 2021 15:13
We were caching errors to the resourceStore in the XDS cache
This was causing problems when the resourceStore was returning an error
we wouldn't attempt to get a value until the item was purge from the cache.
We now never cache errors, thus ensuring we recover quickly from resourceStore errors

Signed-off-by: Charly Molter <charly@koyeb.com>
Copy link
Contributor

@jakubdyszkiewicz jakubdyszkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a good one, thanks for introducing the test

Copy link
Contributor

@jakubdyszkiewicz jakubdyszkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about defer c.onceMap.Delete(mesh) as a first line?

Could you also do the same change in pkg/xds/cache/cla/cache.go?

@lahabana
Copy link
Contributor Author

Both sound like good suggestions. Updating this

@lahabana
Copy link
Contributor Author

@jakubdyszkiewicz there was a lot of redundancy between these 2 implementations I've refactored a bit to simplify and added an "error" result in the metric.

…mplementation

Signed-off-by: Charly Molter <charly@koyeb.com>
Copy link
Contributor

@jakubdyszkiewicz jakubdyszkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lobkovilya could you check the refactor?

}, nil
}

func (c *Cache) Get(ctx context.Context, key string, fn func(context.Context, string) (interface{}, error)) (interface{}, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there was duplication indeed, but it's quite problematic to introduce a well-understandable and reusable abstraction here. I don't mind going with this once.Cache abstraction, but we have to do something with the interface because it's really unclear what Get does.

Maybe we can introduce new types:

type Key string
type Value interface{}
type Refresher func(context.Context, Key) (Value, error)

and rename Get to GetOrRefresh so it will look like:

func (c *Cache) GetOrRefresh(ctx context.Context, key Key, refresher Refresher) (Value, error)

and document this method with a comment something like:

// GetOrRefresh method returns cached Value if it was created within the last `expirationTime` time period.
// If the Value is older than `expirationTime` then Refresher will be launched and will update the Value.
// Regardless of the number of goroutines that have run into the expired Value, Refresher will be called only
// once, all goroutines will be blocked until the new Value returned by Refresher.
func (c *Cache) GetOrRefresh(ctx context.Context, k Key, r Refresher) (Value, error) {

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!
How about GetOrCompute() as Refresh sounds like it will only work when the item has been in the cache?

For the Key what's the motivation for adding the subtype?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason for questioning the introduction of subtypes is that the underlying lib go-cache uses string as key and interface{} as value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end went for GetOrRetrieve and didn't add subtypes for key and value. Let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like GetOrRetrieve 👍

Subtypes help to make the signature more readable. If you define it like:

type FetcherFunc func(context.Context, string) (interface{}, error)

then it's not clear that FetcherFunc takes key as a parameter and returns value.

Also, interface{} is used twice - as a return value of GetOrRetrieve and as a return value of FetcherFunc. I think it makes sense to show that those interface{} are the same type. For example, if in the future we would like to introduce a new type for Value.

But I don't mind going with just string and interface{}

Signed-off-by: Charly Molter <charly@koyeb.com>
@lahabana
Copy link
Contributor Author

@Mergifyio update

@mergify
Copy link
Contributor

mergify bot commented Apr 30, 2021

Command update: success

Branch has been successfully updated

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ lahabana
❌ mergify[bot]
You have signed the CLA already but the status is still pending? Let us recheck it.

Signed-off-by: Charly Molter <charly@koyeb.com>
@jakubdyszkiewicz jakubdyszkiewicz merged commit e73c990 into kumahq:master Apr 30, 2021
@jakubdyszkiewicz
Copy link
Contributor

same thing fails on master, I don't want to block this PR from merging

mergify bot pushed a commit that referenced this pull request Apr 30, 2021
Signed-off-by: Charly Molter <charly@koyeb.com>
(cherry picked from commit e73c990)
jakubdyszkiewicz pushed a commit that referenced this pull request Apr 30, 2021
Signed-off-by: Charly Molter <charly@koyeb.com>
@lahabana lahabana deleted the fix/cacheCachingErrors branch March 29, 2024 12:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants