very first version of tenant migration from one pageserver to another via remote storage #995

Merged · 3 commits into main from tenant-rebalancing-v0 · Jan 24, 2022

Conversation

@LizardWizzard (Contributor) commented Dec 14, 2021

Currently this contains a Python test which makes some assumptions and uses a rather hacky setup with a second pageserver, without zenith CLI support.

Resolves #896
Resolves #897
Resolves #898
Resolves #900

@SomeoneToIgnore (Contributor) left a comment


Looks nice, actually, despite all the WIP stuff.

I'm a bit concerned by the sleep(n) things, but this is not a problem right now.
IMO later, we could call some http method periodically and limit the number of calls instead.

Review threads: pageserver/src/http/routes.rs (outdated, resolved) · test_runner/batch_others/test_tenant_relocation.py (outdated, resolved) · zenith_utils/src/zid.rs (resolved)

new_pageserver_config_file_path = new_pageserver_dir / 'pageserver.toml'

# add remote storage mock to pageserver config
Contributor:

Maybe worth a TODO: IMO we would like to enable remote storage for every Python integration test with one env var/command, so the file-appending approach is rather crude and temporary.

Contributor Author:

Yeah, I think once your config changes land it will be convenient to support multiple pageservers directly from the zenith CLI.
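
A minimal sketch of the env-var-gated approach suggested above; the variable name and the appended config body are hypothetical, not from this PR:

```python
import os

def maybe_enable_remote_storage(config_path: str):
    # Hypothetical helper: gate the remote storage mock on a single env var
    # so every integration test can opt in without hand-editing its config.
    if os.environ.get('ZENITH_TEST_REMOTE_STORAGE') != '1':
        return
    # Append a local-fs remote storage section to pageserver.toml.
    with open(config_path, 'a') as f:
        f.write("\n[remote_storage]\nlocal_path = '/tmp/test_remote_storage'\n")
```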

@LizardWizzard (Contributor Author):

@SomeoneToIgnore Thanks for the review!

> IMO later, we could call some http method periodically and limit the number of calls instead.

Yes, I fully agree. We can have some utility in our Python tests to wait for a certain predicate to become true, with a timeout on the waiting period. When all the bits are in place I'll remove these sleep calls.
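
A minimal sketch of such a utility; the names and signature are hypothetical, not from this PR:

```python
import time

def wait_until(attempts: int, interval: float, func):
    """Hypothetical polling helper: retry `func` until it stops raising,
    bounding the total wait instead of using a bare sleep(n)."""
    last_exc = None
    for _ in range(attempts):
        try:
            return func()
        except Exception as e:
            last_exc = e
            time.sleep(interval)
    raise TimeoutError(f'wait_until timed out: {last_exc}')

# Usage (assumed helper names): poll the pageserver HTTP API until the
# tenant shows up, e.g.
# wait_until(20, 0.5, lambda: check_tenant_attached(http_client, tenant_id))
```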

@SomeoneToIgnore (Contributor):

#1079 helps with hacks around the remote storage

@LizardWizzard LizardWizzard force-pushed the tenant-rebalancing-v0 branch 4 times, most recently from ac3c7d2 to 189924e Compare January 13, 2022 08:57
@LizardWizzard LizardWizzard marked this pull request as ready for review January 13, 2022 12:48
@LizardWizzard (Contributor Author) commented Jan 13, 2022

I think this patch can be accepted in this form, with the TODOs closed via follow-ups; I linked each of them to the corresponding issue.

@arssher @lubennikovaav Please take a look at safekeeper changes, this mostly touches callmemaybe stuff

There is also a small document in docs/, so if you have any questions I'll be glad to cover them.

@SomeoneToIgnore (Contributor) left a comment


I have not looked thoroughly at the Python code, but LGTM overall

Review threads: docs/pageserver-tenant-migration.md (outdated, resolved) · pageserver/src/http/routes.rs (resolved) · pageserver/src/remote_storage/local_fs.rs (resolved) · test_runner/fixtures/zenith_fixtures.py (outdated, resolved) · walkeeper/src/callmemaybe.rs (outdated, resolved) · walkeeper/src/handler.rs (outdated, resolved)
@LizardWizzard LizardWizzard force-pushed the tenant-rebalancing-v0 branch 5 times, most recently from b2c1ca3 to b10dc00 Compare January 17, 2022 11:29
@LizardWizzard (Contributor Author):

I've rebased this on top of the shutdown improvements patch and added margins to some exact checks to reduce possible flakiness.

@stepashka linked an issue Jan 17, 2022 that may be closed by this pull request
@SomeoneToIgnore (Contributor) left a comment


If I compile walkeeper package alone, I get an error:

❯ cargo check --package walkeeper --all-targets
    Checking walkeeper v0.1.0 (/Users/someonetoignore/Work/zenith/zenith/walkeeper)
error[E0433]: failed to resolve: could not find `select` in `tokio`
   --> walkeeper/src/callmemaybe.rs:220:16
    |
220 |         tokio::select! {
    |                ^^^^^^ could not find `select` in `tokio`

feels like tokio is missing a feature there?

Otherwise, the same "LGTM but not sure about Python magic code" impression.

Review threads: walkeeper/src/send_wal.rs (two threads, outdated, resolved)
@LizardWizzard (Contributor Author):

Thanks for the review @SomeoneToIgnore!

> If I compile walkeeper package alone, I get an error:

It seems that the error appears on main too. I wonder whether we should pin the tokio dependency somewhere (in zenith_utils?) with all the needed features. This should simplify things a bit, and AFAIK cargo uses the union of all features needed by a crate and its dependencies, so this shouldn't affect compile times, but it's worth checking.
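
For reference, tokio::select! is gated behind tokio's "macros" feature, so the fix would be something like the sketch below in walkeeper's Cargo.toml (or wherever the shared pin ends up); the exact version and feature list here are assumptions:

```toml
# tokio::select! requires the "macros" feature; with cargo's feature
# unification, enabling it here should not bloat other crates' builds.
[dependencies]
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
```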


### Implementation details

Now safekeeper needs to track which pageserver it is replicating to. This introduces complications into replication code and callmemaybe subscription state handling.
Contributor:

Could you elaborate on the "complications in the replication code"?

I wonder if we need special treatment of pageserver's feedback in get_replicas_state() on safekeeper, given that two streams can exist for the same timeline?

Contributor Author:

By complications I mean the additional handling of the pageserver connection string, and the requirement to track which pageserver is actually the primary one, to avoid reconnections to it (not currently implemented, but present in the follow-ups section of the epic).

Contributor Author:

> given that two streams can exist for the same timeline?

At first glance everything looks OK, because we keep a Vec of replicas for a particular timeline. Do you have some invariants in mind that should be checked?
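
A rough sketch of that shape, with hypothetical names (the real safekeeper code differs):

```rust
// Hypothetical sketch only: per-timeline replica bookkeeping where several
// replication streams (e.g. old and new pageserver during migration) can
// coexist, so feedback is tracked per replica rather than per timeline.
use std::collections::HashMap;

#[derive(Debug)]
struct ReplicaState {
    pageserver_connstr: String,
    last_received_lsn: u64, // feedback reported by this stream
}

#[derive(Default, Debug)]
struct TimelineState {
    replicas: Vec<ReplicaState>,
}

fn main() {
    let mut timelines: HashMap<String, TimelineState> = HashMap::new();
    let tl = timelines.entry("timeline-1".into()).or_default();
    tl.replicas.push(ReplicaState { pageserver_connstr: "old-ps".into(), last_received_lsn: 100 });
    tl.replicas.push(ReplicaState { pageserver_connstr: "new-ps".into(), last_received_lsn: 80 });
    // A get_replicas_state()-style aggregation must decide how to combine
    // these, e.g. taking the minimum LSN so neither stream's WAL is trimmed
    // too early.
    let min_lsn = tl.replicas.iter().map(|r| r.last_received_lsn).min();
    println!("{:?}", min_lsn);
}
```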

let timelineid = spg.timeline.get().timelineid;
// TODO this expect is ugly, will it be better with node id?
// can we guarantee that this code is not executed during walproposer recovery?
let pageserver_connstr = pageserver_connstr
Contributor Author:

@lubennikovaav what do you think about this expect call? It was not triggered in tests, so I assume this code path is not followed during walproposer recovery, but can we guarantee somehow that recovery won't touch this?

Contributor:

If walproposer is in recovery, stop_lsn is set, and this is checked in the code above this line.
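
A self-contained sketch of the guard being described; names are hypothetical, not the actual walkeeper code:

```rust
// When walproposer is in recovery, stop_lsn is Some(..) and the branch
// that unwraps pageserver_connstr is never reached.
fn handle_replication(stop_lsn: Option<u64>, pageserver_connstr: Option<String>) {
    if let Some(lsn) = stop_lsn {
        // Recovery: stream WAL up to stop_lsn; no callmemaybe/pageserver
        // connection string is involved on this path.
        println!("recovery: streaming up to {}", lsn);
        return;
    }
    // Normal replication: the connstr must have been provided by callmemaybe.
    let connstr = pageserver_connstr
        .expect("pageserver_connstr must be set outside walproposer recovery");
    println!("replicating to {}", connstr);
}

fn main() {
    handle_replication(Some(42), None); // recovery path, expect not hit
    handle_replication(None, Some("host=127.0.0.1 port=6400".to_string()));
}
```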


### Implementation details

Now safekeeper needs to track which pageserver it is replicating to. This introduces complications into replication code:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, do you think we can address both of these with neondatabase/rfcs#16?

Contributor Author:

For the first one: if callmemaybe goes away, there is no need to tweak it for these changes.

The second one can go away too if safekeeper doesn't initiate connections (and doesn't reconnect) to the pageserver. As you've answered, this is exactly how it is proposed in the RFC, so yeah, both problems are absent with this RFC implemented.

This patch includes attach/detach HTTP endpoints in pageservers, some changes in callmemaybe handling inside safekeeper, and an integration test to check migration with and without load. There are still some rough edges that will be addressed in follow-up patches.
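
As an illustration of the migration flow these endpoints enable (the endpoint paths, port, and tenant id below are assumptions for the sketch, not taken from this page):

```python
import requests  # pip install requests

TENANT_ID = 'de200bd42b49cc1814412c7e592dd6e9'  # hypothetical tenant id

# 1. Ask the new pageserver to attach the tenant from remote storage.
requests.post(f'http://new-pageserver:9898/v1/tenant/{TENANT_ID}/attach') \
    .raise_for_status()

# 2. Once compute is repointed and the new pageserver has caught up,
#    detach the tenant from the old pageserver.
requests.post(f'http://old-pageserver:9898/v1/tenant/{TENANT_ID}/detach') \
    .raise_for_status()
```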
@LizardWizzard LizardWizzard merged commit 458bc0c into main Jan 24, 2022
@LizardWizzard LizardWizzard deleted the tenant-rebalancing-v0 branch January 24, 2022 14:20