
Build A MVP Persistent Caching Proxy #342

Closed
9 of 18 tasks
Tracked by #1225
sttts opened this issue Dec 21, 2021 · 10 comments
Labels
area/sharding (Issues or PRs related to sharding changes), kind/feature (Categorizes issue or PR as related to a new feature)
Comments

@sttts
Member

sttts commented Dec 21, 2021

Certain data in a kcp cluster is critical for operation and a SPOF in a multi-region/AZ environment, e.g. org workspaces. It is feasible to build a consistent cache hierarchy which

  1. persists objects to a local etcd, or
  2. keeps data in memory, if the proxies are made highly available through multiple instances,

and which serves consistent data by checking the freshness of the cache on quorum reads. Such a setup could give read-only availability to e.g. org workspaces (see the sketch below).
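
A minimal sketch of what the freshness check on quorum reads could look like, in Go. The idea: before serving from the cache, do one linearizable (quorum) read of the authoritative store's current revision, then wait until the cache has replicated at least that far. The `Store`/`Cache` interfaces and their method names are illustrative assumptions, not an existing kcp API:

```go
// Package cacheproxy sketches a freshness gate for the caching proxy:
// answer reads from the cache only once it has observed the revision
// returned by a quorum read against the authoritative store.
package cacheproxy

import (
	"context"
	"fmt"
	"time"
)

// Store is the authoritative backend (e.g. a shard's etcd). QuorumRevision
// is assumed to perform a linearizable read of the current revision.
type Store interface {
	QuorumRevision(ctx context.Context) (int64, error)
}

// Cache reports the latest revision it has replicated from the store.
type Cache interface {
	ObservedRevision() int64
}

// WaitFresh blocks until the cache has caught up to a quorum-read revision
// or the context expires. After it returns nil, reads served from the cache
// are at least as fresh as the moment WaitFresh was called.
func WaitFresh(ctx context.Context, store Store, cache Cache) error {
	target, err := store.QuorumRevision(ctx)
	if err != nil {
		return fmt.Errorf("quorum read failed: %w", err)
	}
	tick := time.NewTicker(10 * time.Millisecond)
	defer tick.Stop()
	for cache.ObservedRevision() < target {
		select {
		case <-ctx.Done():
			return fmt.Errorf("cache stale: observed %d < quorum %d: %w",
				cache.ObservedRevision(), target, ctx.Err())
		case <-tick.C:
		}
	}
	return nil
}
```

Note that during an outage the quorum read itself fails; the proxy can then still answer, but only with explicitly stale data, which leads to the question below.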

Big question: do we run into time-travel problems like we do with pods and kubelets? We must be super careful when answering consistent reads with potentially stale data in an outage situation. But for certain operations, like the personal workspace virtual workspace from @davidfestal, a stale read is good enough.

Acceptance Criteria

We're just looking for an MVP implementation here:

  • shards push data to the proxy
  • no auth{n,z} needed
  • APIExport + APIResourceSchema hard-coded as types to push
  • dynamically determine which types to serve. For example, the built-in types (APIExports and APIResourceSchema) could be automatically served as CRDs.
  • turn on the watch cache in the cache server
  • have secondary informers in kcp reconcilers use the cache server (see the sketch after this list)
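
For the last item, a minimal sketch of pointing secondary informers at the cache server using the stock dynamic informer machinery from client-go. The cache server URL, the elided auth details, and the `apis.kcp.io` group/resource names are assumptions for illustration:

```go
package main

import (
	"time"

	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/rest"
)

func main() {
	// Assumption: the cache server speaks plain Kubernetes REST on this URL;
	// credentials and CA config are elided for the sketch.
	cfg := &rest.Config{Host: "https://cache-server.kcp.svc:6443"}

	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Secondary informers list/watch against the cache server instead of
	// hitting the shard's apiserver directly.
	factory := dynamicinformer.NewDynamicSharedInformerFactory(client, 10*time.Minute)

	// APIExport is one of the hard-coded types the shards push (the group
	// and resource names here are assumptions).
	gvr := schema.GroupVersionResource{Group: "apis.kcp.io", Version: "v1alpha1", Resource: "apiexports"}
	informer := factory.ForResource(gvr).Informer()

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	_ = informer // event handlers for the reconciler would be registered here
}
```

Using the dynamic client keeps the sketch independent of the generated kcp clientsets; the real wiring would likely reuse those instead.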

Items

authorization

  • shards should only be able to write their own shard partition of the cache (a sketch follows below)
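
A minimal sketch of such write isolation as a k8s.io/apiserver authorizer. The shard identity convention (`system:kcp:shard:<name>`) and the partitioned request path are assumptions for illustration, not the actual kcp wiring:

```go
// Package authz sketches per-shard write isolation for the cache server.
package authz

import (
	"context"
	"strings"

	"k8s.io/apiserver/pkg/authorization/authorizer"
)

// shardAuthorizer denies writes that target another shard's partition.
type shardAuthorizer struct{}

var _ = authorizer.Authorizer(shardAuthorizer{})

func (shardAuthorizer) Authorize(ctx context.Context, attr authorizer.Attributes) (authorizer.Decision, string, error) {
	// Reads are left to the rest of the authorizer chain.
	if attr.IsReadOnly() {
		return authorizer.DecisionNoOpinion, "", nil
	}
	// Assumed convention: shards authenticate as system:kcp:shard:<name>.
	const prefix = "system:kcp:shard:"
	u := attr.GetUser()
	if u == nil || !strings.HasPrefix(u.GetName(), prefix) {
		return authorizer.DecisionNoOpinion, "not a shard identity", nil
	}
	shard := strings.TrimPrefix(u.GetName(), prefix)
	// Assumed convention: the request path carries the shard partition,
	// e.g. /services/cache/shards/<shard>/apis/....
	if strings.HasPrefix(attr.GetPath(), "/services/cache/shards/"+shard+"/") {
		return authorizer.DecisionAllow, "", nil
	}
	return authorizer.DecisionDeny, "a shard may only write its own partition", nil
}
```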

data removal

  • unused resources could be automatically removed

shard management, a few loose ideas:

  • what will decide when a new shard needs to be created?
  • which cluster does it get deployed to?
  • how do we make other shards aware of the new shard?

on kcp server

  • wire informers both ways; this includes a controller to cope with not-yet-synced informers
  • start the second informer factory (aka TemporaryRootShardKcpSharedInformerFactory) in a dedicated post-start-hook (a sketch follows below)
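
A minimal sketch of the dedicated post-start-hook, using the k8s.io/apiserver hook API as it looked at the time of this issue (`PostStartHookContext.StopCh`); the hook name and factory wiring are illustrative assumptions:

```go
// Package server sketches starting the second informer factory (the one
// backed by the cache server) from a dedicated post-start-hook.
package server

import (
	"fmt"

	genericapiserver "k8s.io/apiserver/pkg/server"
	"k8s.io/client-go/informers"
)

// addCacheInformerHook registers a hook that starts the factory only once
// the server is up, so reconcilers never see a not-yet-started factory.
func addCacheInformerHook(s *genericapiserver.GenericAPIServer, factory informers.SharedInformerFactory) error {
	return s.AddPostStartHook("kcp-start-cache-informers", func(ctx genericapiserver.PostStartHookContext) error {
		factory.Start(ctx.StopCh) // StopCh closes on shutdown and stops the informers
		for typ, synced := range factory.WaitForCacheSync(ctx.StopCh) {
			if !synced {
				return fmt.Errorf("informer %v failed to sync", typ)
			}
		}
		return nil
	})
}
```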
@sttts added the kind/feature label Dec 21, 2021
@stevekuznetsov
Contributor

Option 2 seems quite a lot harder than option 1, FWIW.

@sttts
Member Author

sttts commented Jan 5, 2022

We have the freshness problem either way, so I don't think one option is harder than the other. The first is reboot-safe; the second is operationally simpler.

@ncdc added this to the Prototype 4 milestone Feb 23, 2022
@ncdc removed this from the v0.4.0 milestone Apr 13, 2022
@ncdc
Member

ncdc commented Apr 13, 2022

Clearing milestone to re-triage

@ncdc added this to the v0.6.0 milestone May 31, 2022
@sttts
Member Author

sttts commented Jun 14, 2022

Part of multi-release epic #1225

@stevekuznetsov
Contributor

@p0lyn0mial FYI for tracking: this is the issue for implementing the cache server; it would be cool to link PRs to it as new ones come in.

@stevekuznetsov changed the title from "Consistent caching proxy" to "Build A MVP Persistent Caching Proxy" Sep 16, 2022
@ncdc modified the milestones: v0.9, v0.10 Oct 5, 2022
@ncdc
Member

ncdc commented Dec 5, 2022

@p0lyn0mial @sttts is this going to be completed this week for v0.10? It looks like we still have several outstanding items?

@p0lyn0mial
Contributor

@p0lyn0mial @sttts is this going to be completed this week for v0.10? It looks like we still have several outstanding items?

We need to finish the workspace refactoring before we can finish this feature.

@ncdc modified the milestones: v0.10, v0.11 Dec 6, 2022
@p0lyn0mial
Contributor

p0lyn0mial commented Feb 22, 2023

I think we can close this issue. Not all items have been implemented, but it is unclear to me whether we will use the cache server in the long run. Perhaps it will be replaced by CRDB (CockroachDB).

@ncdc
Member

ncdc commented Feb 22, 2023

👍
/close

@openshift-ci bot closed this as completed Feb 22, 2023
@openshift-ci
Contributor

openshift-ci bot commented Feb 22, 2023

@ncdc: Closing this issue.

In response to this:

👍
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
