[MIG-699] Add wrapper around cached client that waits for cache to catch up before returning from Create, Update #1035

Closed
djwhatle opened this issue Mar 27, 2021 · 2 comments · Fixed by #1037
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

djwhatle commented Mar 27, 2021

This cache-sync approach is block-on-write. I wrote about a block-on-read approach in #1034, but I don't think that will work with deletions.

See #1034 for problem statement.

Describe the solution you'd like
Add a wrapper around the cached client. When Create or Update is called, wait until the cache catches up by issuing Get calls against the cache until the new resourceVersion is retrievable from it.

Put a sane timeout (e.g. 3 seconds) on this block in case a resource we modified was immediately deleted and will never show up in the cache.
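A minimal sketch of what such a wrapper could look like, assuming controller-runtime v0.7+ (where `client.Object` exists and `ObjectKeyFromObject` returns a single value). All names here are illustrative, not taken from the actual implementation in #1037; note also that resourceVersion is formally an opaque string, so parsing it as an integer leans on etcd's monotonically increasing versions:

```go
package compat

import (
	"context"
	"fmt"
	"strconv"
	"time"

	k8serrors "k8s.io/apimachinery/pkg/api/errors"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// cacheSyncTimeout bounds how long a write blocks waiting for the cache.
const cacheSyncTimeout = 3 * time.Second

// syncingClient wraps a delegating client (cached reads, live writes) and
// blocks after each write until the cache reflects the written object.
type syncingClient struct {
	client.Client
}

// Create writes through to the API server, then waits for the cache to
// catch up. Update would follow the same pattern.
func (c *syncingClient) Create(ctx context.Context, obj client.Object, opts ...client.CreateOption) error {
	if err := c.Client.Create(ctx, obj, opts...); err != nil {
		return err
	}
	return c.waitForCache(ctx, obj)
}

// waitForCache polls the cache until it serves the object at (or beyond)
// the resourceVersion produced by the write, or until the timeout elapses
// (e.g. because the object was deleted right after the write).
func (c *syncingClient) waitForCache(ctx context.Context, obj client.Object) error {
	written, err := strconv.ParseUint(obj.GetResourceVersion(), 10, 64)
	if err != nil {
		return err
	}
	deadline := time.Now().Add(cacheSyncTimeout)
	for time.Now().Before(deadline) {
		latest := obj.DeepCopyObject().(client.Object)
		err := c.Client.Get(ctx, client.ObjectKeyFromObject(obj), latest)
		switch {
		case err == nil:
			if cached, _ := strconv.ParseUint(latest.GetResourceVersion(), 10, 64); cached >= written {
				return nil // cache has caught up
			}
		case !k8serrors.IsNotFound(err):
			return err // a real error, not just "cache hasn't seen it yet"
		}
		time.Sleep(10 * time.Millisecond)
	}
	return fmt.Errorf("cache did not catch up to resourceVersion %d", written)
}
```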

Also, we could provide a parameter on client calls so the caller can decide whether to block on cache sync. This would be useful when we know we won't look the resource up later, but is potentially unsafe.
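As a sketch of that opt-out (the method name is hypothetical, not from #1037), a caller that knows it will never read the object back could use a non-blocking variant:

```go
// CreateNoSync is a hypothetical non-blocking variant of Create: it skips
// the cache-sync wait for callers that never read the object back.
// Using it for an object that *is* read later reintroduces the race.
func (c *syncingClient) CreateNoSync(ctx context.Context, obj client.Object, opts ...client.CreateOption) error {
	return c.Client.Create(ctx, obj, opts...)
}
```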

@djwhatle djwhatle added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 27, 2021
@djwhatle djwhatle changed the title Add wrapper around cached client that waits for cache to catch up before returning Add wrapper around cached client that waits for cache to catch up before returning from Create, Update, Delete. Mar 27, 2021
djwhatle commented Mar 28, 2021

I put together a proof of concept for this and it's working well so far. I've observed the cache taking anywhere from 10 ms to 500 ms to catch up after an update is posted, but because the client blocks, we never hit race conditions where a resource we expected to find is missing.

When I first added the cache (before implementing sync blocking), migrations failed every time because DVM expected resources (e.g. Secrets) it had just created to exist. This problem went away immediately once cache-sync blocking was in place for Create and Update.

I think it will be important to let the developer decide whether a given API call should block on cache sync.

[Screenshot: PV discovery with cache (354 ms)]

[Screenshot: PV discovery without cache (10.01 seconds)]

[Screenshot: new phase timing]

@djwhatle djwhatle linked a pull request Mar 28, 2021 that will close this issue
djwhatle commented
Remaining potential problems:

  • Need to make sure there aren't complications from migration operations starting to run before the cache is full.
    • E.g. if mig-controller starts in the middle of a migration, will the migration fail when it otherwise would not?
      • Is there a way for us to block anything from happening until the cache fills? (Sketch below.)
      • OR is there a way for us to just not start using the cached client until it catches up with the most recent resourceVersion?
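A hedged sketch of that first option (assuming a recent controller-runtime where `Cache.WaitForCacheSync` takes a context): hold back the cached client until the informer cache completes its initial list/watch sync.

```go
package main

import (
	"context"
	"log"

	"sigs.k8s.io/controller-runtime/pkg/client/config"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

func main() {
	cfg, err := config.GetConfig()
	if err != nil {
		log.Fatal(err)
	}
	mgr, err := manager.New(cfg, manager.Options{})
	if err != nil {
		log.Fatal(err)
	}
	ctx := context.Background()
	// mgr.Start runs the informer cache; run it in the background so we
	// can block here until the initial sync finishes.
	go func() {
		if err := mgr.Start(ctx); err != nil {
			log.Fatal(err)
		}
	}()
	if !mgr.GetCache().WaitForCacheSync(ctx) {
		log.Fatal("informer cache failed its initial sync")
	}
	// Only from this point is mgr.GetClient() safe for cached reads.
}
```

Reconcilers wired up through the manager already wait for cache sync before their first reconcile, so this matters mainly where the cached client is used outside that path.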

@djwhatle djwhatle changed the title Add wrapper around cached client that waits for cache to catch up before returning from Create, Update, Delete. Add wrapper around cached client that waits for cache to catch up before returning from Create, Update May 20, 2021
@djwhatle djwhatle changed the title Add wrapper around cached client that waits for cache to catch up before returning from Create, Update [MIG-699] Add wrapper around cached client that waits for cache to catch up before returning from Create, Update Jun 2, 2021