docs/rfcs: add RFC for fast tenant migration/failover #5029

Merged: 10 commits, Sep 28, 2023

@jcsp (Contributor) commented Aug 17, 2023

Problem

Currently we have no way to migrate tenants from one pageserver to another without risking a gap in availability.

Summary of changes

This follows on from #4919

Migrating tenants between pageservers is essential to operating a service
at scale, in several contexts:

  1. Responding to a pageserver node failure by migrating tenants to other pageservers
  2. Balancing load and capacity across pageservers, for example when a user expands their
    database and they need to migrate to a pageserver with more capacity.
  3. Restarting pageservers for upgrades and maintenance

Currently, a tenant may be migrated by attaching to a new node,
re-configuring endpoints to use the new node, and then later detaching from the old node. This is safe once generation numbers are implemented, but does not meet
our seamless/fast/efficient goals.

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@jcsp jcsp added c/storage/pageserver Component: storage: pageserver t/tech_design_rfc Issue type: tech design RFC c/control-plane Component: Control Plane labels Aug 17, 2023

github-actions bot commented Aug 17, 2023

2520 tests run: 2403 passed, 0 failed, 117 skipped (full report)


Flaky tests (1)

Postgres 16

  • test_crafted_wal_end[last_wal_record_crossing_segment]: release

Code coverage (full report)

  • functions: 52.9% (8019 of 15145 functions)
  • lines: 81.2% (47016 of 57888 lines)

The comment gets automatically updated with the latest test results
1569446 at 2023-09-27T10:12:51.854Z :recycle:

@jcsp jcsp marked this pull request as ready for review August 31, 2023 09:56
@problame (Contributor) left a comment

Read up to but not including ### Database schema for locations.

I'm good with the high-level procedure.

My main worry design-wise is the AttachedStale state and how it will interact with eviction.

I'd love to see it removed from this RFC and added in a later RFC.

I'll create a stacked PR with some editorial fixes.

docs/rfcs/027-pageserver-migration.md: 10 review threads (outdated, resolved)
@problame (Contributor) commented Sep 1, 2023

Please pull in my editorial fixes from here: #5185

@koivunej (Contributor) left a comment

Some quick comments

docs/rfcs/027-pageserver-migration.md: 5 review threads (outdated, resolved)
@jcsp (Contributor, Author) commented Sep 12, 2023

Updates:

  • Safety of AttachedMulti: note that the pageserver must bound the number of deletions it enqueues, in case it is left in this state too long.
  • Safety of AttachedStale: add a behavior where the pageserver will un-block S3 writes for a tenant if it receives evict_layers calls that can only be satisfied by uploading.
  • Scope: clarify in Non-Goals that the RFC does not aim to specify all possible tenant configuration transitions, but provides an API sufficient for whatever movements the control plane might like to do.
  • Scope: clarify that defining different storage strategies for different tiers of service is not included in this RFC.
  • Clarify that flushing to S3 includes flushing the heatmap.
  • Clarify that the heatmap is advisory and secondaries may retain additional layers which are in the IndexPart but not the heatmap, if disk space permits.
  • Disambiguate the location configuration API from the existing tenant config API
  • Make the permutations on migration for node down, no secondary, and permanent migration more prominent with section headings.
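
The first update above (bounding the deletions enqueued while in AttachedMulti) could be sketched roughly as follows. This is only an illustration: the `DeferredDeletions` type, its method names, and the choice to reject new entries once the bound is reached are assumptions made for the example, not something specified in this thread.

```rust
use std::collections::VecDeque;

/// Hypothetical per-tenant queue of layer deletions that are held back
/// while the location is in AttachedMulti.
pub struct DeferredDeletions {
    pending: VecDeque<String>,
    limit: usize,
}

impl DeferredDeletions {
    /// `limit` is the maximum number of deletions that may be queued.
    pub fn new(limit: usize) -> Self {
        Self { pending: VecDeque::new(), limit }
    }

    /// Queue a deletion. If the bound is already reached, the key is handed
    /// back as `Err` (assumed policy: the caller then skips the work that
    /// would have produced the deletion).
    pub fn try_enqueue(&mut self, layer_key: String) -> Result<(), String> {
        if self.pending.len() >= self.limit {
            return Err(layer_key);
        }
        self.pending.push_back(layer_key);
        Ok(())
    }

    /// Number of deletions currently queued.
    pub fn len(&self) -> usize {
        self.pending.len()
    }

    /// Take all queued deletions in FIFO order, e.g. once the location
    /// leaves AttachedMulti and deletions are allowed again.
    pub fn drain(&mut self) -> Vec<String> {
        self.pending.drain(..).collect()
    }
}
```

The point of the bound is that memory use stays fixed even if the control plane leaves the tenant in AttachedMulti for a long time.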

@problame (Contributor) left a comment

Sorry for the slew of comments. By and large, I'm good with this design though.

@problame (Contributor) left a comment

Good to go!

docs/rfcs/028-pageserver-migration.md: 1 review thread (resolved)
@jcsp jcsp enabled auto-merge (squash) September 27, 2023 10:21
@jcsp jcsp merged commit 6b4bb91 into main Sep 28, 2023
39 checks passed
@jcsp jcsp deleted the jcsp/rfc-migration branch September 28, 2023 09:07
jcsp added a commit that referenced this pull request Oct 5, 2023
…ocations (#5299)

## Problem

These changes are part of building seamless tenant migration, as
described in the RFC:
- #5029

## Summary of changes

- A new configuration type `LocationConf` supersedes `TenantConfOpt` for
storing a tenant's configuration in the pageserver repo dir. It contains
`TenantConfOpt`, as well as a new `mode` attribute that describes what
kind of location this is (secondary, attached, attachment mode etc). It
is written to a file called `config-v1` instead of `config` -- this
prepares us for neatly making any other profound changes to the format
of the file in future. Forward compat for existing pageserver code is
achieved by writing out both old and new style files. Backward compat is
achieved by checking for the old-style file if the new one isn't found.
- The `TenantMap` type changes, to hold `TenantSlot` instead of just
`Tenant`. The `Tenant` type continues to be used for attached tenants
only. Tenants in other states (such as secondaries) are represented by a
different variant of `TenantSlot`.
- Where `Tenant` & `Timeline` used to hold an `Arc<Mutex<TenantConfOpt>>`,
they now hold a reference to an `AttachedTenantConf`, which includes the
extra information from `LocationConf`. This enables them to know the
current attachment mode.
- The attachment mode is used as an advisory input to decide whether to
do compaction and GC (AttachedStale is meant to avoid doing uploads,
AttachedMulti is meant to avoid doing deletions).
- A new HTTP API is added at `PUT /tenants/<tenant_id>/location_config`
to drive new location configuration. This provides a superset of the
functionality of attach/detach/load/ignore:
  - Attaching a tenant is just configuring it in an attached state
  - Detaching a tenant is configuring it to a detached state
  - Loading a tenant is just the same as attaching it
  - Ignoring a tenant is the same as configuring it into Secondary with
    warm=false (i.e. retain the files on disk but do nothing else).
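
As a rough illustration of the list above, the sketch below models the `mode` attribute and the mapping from the legacy operations onto location configurations. All type and function names here (`LocationMode`, `LegacyOp`, `location_for`, `may_upload`, `may_delete`) and the `AttachedSingle` variant name are assumptions made for the example, not the pageserver's actual definitions. Whether an AttachedStale location may delete is not stated here, so the sketch conservatively disallows it.

```rust
/// Hypothetical model of the `mode` attribute carried by `LocationConf`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LocationMode {
    /// Ordinary attachment (variant name is an assumption).
    AttachedSingle,
    /// Other attached locations may exist, so deletions are avoided.
    AttachedMulti,
    /// A newer attachment may exist elsewhere, so uploads are avoided.
    AttachedStale,
    /// Files stay on disk; `warm` says whether anything else is done.
    Secondary { warm: bool },
    /// Nothing is kept for the tenant on this node.
    Detached,
}

/// The legacy operations that `location_config` is described as covering.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LegacyOp {
    Attach,
    Detach,
    Load,
    Ignore,
}

/// Map a legacy operation onto the equivalent location mode.
pub fn location_for(op: LegacyOp) -> LocationMode {
    match op {
        // Loading is treated exactly like attaching.
        LegacyOp::Attach | LegacyOp::Load => LocationMode::AttachedSingle,
        LegacyOp::Detach => LocationMode::Detached,
        // Ignoring keeps the files on disk but does nothing else.
        LegacyOp::Ignore => LocationMode::Secondary { warm: false },
    }
}

/// Advisory hint: may this location upload to remote storage?
pub fn may_upload(mode: LocationMode) -> bool {
    matches!(mode, LocationMode::AttachedSingle | LocationMode::AttachedMulti)
}

/// Advisory hint: may this location delete from remote storage?
/// (Assumption: only an ordinary attachment may delete.)
pub fn may_delete(mode: LocationMode) -> bool {
    matches!(mode, LocationMode::AttachedSingle)
}
```

Under these assumptions, compaction and GC would consult `may_upload` and `may_delete` before scheduling work. They are hints rather than a safety mechanism, in line with the "advisory input" wording above.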

Caveats:
- AttachedMulti tenants don't do compaction in this PR, but they do in
the follow-up #5397
- Concurrent updates to the `location_config` API are not handled
elegantly in this PR; a better mechanism is added in the follow-up
#5367
- Secondary mode is just a placeholder in this PR: the code to upload
heatmaps and do downloads on secondary locations will be added in a
later PR (but that shouldn't change any external interfaces)
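
Returning to the `config-v1` / `config` compatibility scheme from the summary, a minimal sketch of the write-both, read-new-then-old behaviour might look like the following. Only the two file names come from the description; the function names, the `LoadedConfig` type, and the use of plain strings for the file bodies are assumptions for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

pub const NEW_CONFIG_NAME: &str = "config-v1";
pub const LEGACY_CONFIG_NAME: &str = "config";

/// Which on-disk format a tenant's configuration was read from.
#[derive(Debug, PartialEq, Eq)]
pub enum LoadedConfig {
    New(String),
    Legacy(String),
}

/// Forward compatibility: write both files so that older pageserver code,
/// which only knows about `config`, can still start from this directory.
pub fn save_config(tenant_dir: &Path, new_body: &str, legacy_body: &str) -> io::Result<()> {
    fs::write(tenant_dir.join(LEGACY_CONFIG_NAME), legacy_body)?;
    fs::write(tenant_dir.join(NEW_CONFIG_NAME), new_body)
}

/// Backward compatibility: prefer `config-v1`, and fall back to the
/// old-style `config` only when the new file does not exist.
pub fn load_config(tenant_dir: &Path) -> io::Result<LoadedConfig> {
    match fs::read_to_string(tenant_dir.join(NEW_CONFIG_NAME)) {
        Ok(body) => Ok(LoadedConfig::New(body)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => {
            fs::read_to_string(tenant_dir.join(LEGACY_CONFIG_NAME)).map(LoadedConfig::Legacy)
        }
        // Any other I/O error is reported rather than masked by the fallback.
        Err(e) => Err(e),
    }
}
```

Falling back only on `NotFound` is a deliberate choice in this sketch: a permissions or disk error on `config-v1` should surface rather than silently load a possibly outdated legacy file.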

Closes: #5379

---------

Co-authored-by: Christian Schwarz <christian@neon.tech>