Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage controller: tech debt #7165

Merged
merged 10 commits into from Mar 19, 2024
Merged

storage controller: tech debt #7165

merged 10 commits into from Mar 19, 2024

Conversation

jcsp
Copy link
Contributor

@jcsp jcsp commented Mar 18, 2024

This is a mixed bag of changes split out for separate review while working on other things, and batched together to reduce load on CI runners. Each commits stands alone for review purposes:

  • do_tenant_shard_split was a long function and had a synchronous validation phase at the start that could readily be pulled out into a separate function. This also avoids the special casing of ApiError::BadRequest when deciding whether an abort is needed on errors
  • Add a 'describe' API (GET on tenant ID) that will enable storcon-cli to see what's going on with a tenant
  • the 'locate' API wasn't really meant for use in the field. It's for tests: demote it to the /debug/ prefix
  • The Single placement policy was a redundant duplicate of Double(0), and Double was a bad name. Rename it Attached. (storage controller: remove 'Single' policy and rename Double #7107)
  • Some neon_local commands were added for debug/demos, which are now replaced by commands in storcon-cli (controller: add storcon-cli #7114 ). Even though that's not merged yet, we don't need the neon_local ones any more.

Closes #7107

Backward compat of Single/Double -> Attached(n) change

There are no prod sharded tenants yet: when this merges I will fix up staging databases by hand (it's a one-liner). It's not worth carrying backward compatibility deserialization code for this.

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@jcsp jcsp added c/storage/pageserver Component: storage: pageserver a/tech_debt Area: related to tech debt c/storage/controller Component: Storage Controller and removed c/storage/pageserver Component: storage: pageserver labels Mar 18, 2024
@jcsp jcsp requested review from arpad-m and VladLazar March 18, 2024 15:13
Copy link

github-actions bot commented Mar 18, 2024

2706 tests run: 2574 passed, 0 failed, 132 skipped (full report)


Code coverage* (full report)

  • functions: 28.3% (7131 of 25195 functions)
  • lines: 46.9% (43720 of 93287 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
f6c776f at 2024-03-19T13:25:50.717Z :recycle:

libs/pageserver_api/src/controller_api.rs Outdated Show resolved Hide resolved
@jcsp jcsp marked this pull request as ready for review March 19, 2024 09:01
@jcsp jcsp requested a review from a team as a code owner March 19, 2024 09:01
@jcsp jcsp enabled auto-merge (squash) March 19, 2024 09:02
@jcsp jcsp merged commit a5d5c2a into main Mar 19, 2024
53 checks passed
@jcsp jcsp deleted the jcsp/monday-tech-debt branch March 19, 2024 16:08
jcsp added a commit that referenced this pull request Mar 21, 2024
Stacks on:
- #7165

Fixes while working on background optimization of scheduling after a
split:
- When a tenant has secondary locations, we weren't detaching the parent
shards' secondary locations when doing a split
- When a reconciler detaches a location, it was feeding back a
locationconf with `Detached` mode in its `observed` object, whereas it
should omit that location. This could cause the background reconcile
task to keep kicking off no-op reconcilers forever (harmless but
annoying).
- During shard split, we were scheduling secondary locations for the
child shards, but no reconcile was run for these until the next time the
background reconcile task ran. Creating these ASAP is useful, because
they'll be used shortly after a shard split as the destination locations
for migrating the new shards to different nodes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a/tech_debt Area: related to tech debt c/storage/controller Component: Storage Controller
Projects
None yet
Development

Successfully merging this pull request may close these issues.

storage controller: remove 'Single' policy and rename Double
2 participants