Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tenant create / update-config: reject unknown fields #4267

Merged
merged 9 commits into from
May 19, 2023

Conversation

problame
Copy link
Contributor

@problame problame commented May 17, 2023

This PR enforces that the tenant create / update-config APIs reject requests with unknown fields.

This is a desirable property because some tenant config settings control the lifetime of user data (e.g., GC horizon or PITR interval).

Suppose we inadvertently rename the pitr_interval field in the Rust code.
Then, right now, a client that still uses the old name will send a tenant config request to configure a new PITR interval.
Before this PR, we would accept such a request, ignore the old name field, and use the pageserver.toml default value for what the new PITR interval is.
With this PR, we will instead reject such a request.

One might argue that the client could simply check whether the config it sent has been applied, using the /v1/tenant/.../config endpoint.
That is correct for tenant create and update-config.

But, attach will soon 1 grow the ability to have attach-time config as well.
If we ignore unknown fields and fall back to global defaults in that case, we risk data loss.
Example:

  1. Default PITR in pageservers is 7 days.
  2. Create a tenant and set its PITR to 30 days.
  3. For 30 days, fill the tenant continuously with data.
  4. Detach the tenant.
  5. Attach tenant.

Attach must use the 30-day PITR setting in this scenario.
If it were to fall back to the 7-day default value, we would lose 23 days of PITR capability for the tenant.

So, the PR that adds attach-time tenant config will build on the (clunky) infrastructure added in this PR

Implementation Notes

This could have been a simple #[serde(deny_unknown_fields)] but sadly,
that is documented- but silent-at-compile-time-incompatible with
#[serde(flatten)].

neon_local tenant config now uses the .remove() pattern + bail if
there are leftover config args. That's in line with what
neon_local tenant create does. We should dedupe that logic in a future PR.

Footnotes

  1. https://github.com/neondatabase/neon/pull/4255

…een used

This does it like `tenant create`.

We should dedupe these at a future point in time.
This could have been a simple `#[serde(deny_unknown_fields)]` but sadly,
that is documented, but compile-time-silently incompatible with
`#[serde(flatten)]`.
@problame problame marked this pull request as ready for review May 17, 2023 18:50
@problame problame requested review from a team as code owners May 17, 2023 18:50
@problame problame requested review from knizhnik and removed request for a team May 17, 2023 18:50
@github-actions
Copy link

github-actions bot commented May 17, 2023

992 tests run: 946 passed, 0 failed, 46 skipped (full report)


Flaky tests (2)

Postgres 14

  • test_close_on_connections_exit[endpoint]: ✅ debug
  • test_threshold_based_eviction: ✅ debug
The comment gets automatically updated with the latest test results
fc9653c at 2023-05-18T23:19:56.206Z :recycle:

@problame problame removed the request for review from knizhnik May 17, 2023 19:29
@problame problame requested a review from skyzh May 17, 2023 20:14
@problame
Copy link
Contributor Author

@skyzh you know a better way than the awful macros?

Copy link
Member

@skyzh skyzh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and we can see if it works with #[serde(deny_unknown_fields)] in the future 🤣

@skyzh
Copy link
Member

skyzh commented May 17, 2023

or I can take a look today or tomorrow to see if we can get rid of all of these... we may merge this first?

@skyzh
Copy link
Member

skyzh commented May 17, 2023

It seems that serde will apply all unrecognized fields to the flatten element, and therefore it should work by simply deny unknown fields on the flattened struct. serde-rs/serde#941 Trying to fix...

@problame
Copy link
Contributor Author

I‘m out tomorrow. If you don’t know a better way OTOH I would say let’s merge this and not waste more time on the serde problems.

Would still like to hear Shany‘s or Dmitry‘a opinion on the API-level change / rationale for this PR. They can click squash&merge tomorrow

@skyzh
Copy link
Member

skyzh commented May 17, 2023

related issue: serde-rs/serde#1600, probably we will need a HashMap in tenant config to collect unknown fields.

@skyzh
Copy link
Member

skyzh commented May 17, 2023

I just pushed a new commit and it seems to work for all test cases except test_unknown_fields_cli_config. Fixing...

@skyzh skyzh force-pushed the problame/tenant-config-reject-unknown-fields branch from 7722b2c to f60cdc2 Compare May 17, 2023 20:43
Signed-off-by: Alex Chi <iskyzh@gmail.com>
@skyzh skyzh force-pushed the problame/tenant-config-reject-unknown-fields branch from f60cdc2 to 88ad410 Compare May 17, 2023 20:44
@skyzh
Copy link
Member

skyzh commented May 17, 2023

Oh because I forgot to recompile neon_local. This PR should work now 🤪

Copy link
Contributor

@LizardWizzard LizardWizzard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me deny_unknown_fields worked. See https://github.com/neondatabase/neon/compare/dkr/unknown-fields?expand=1

I saw the in the doc that this not supported:

Note: this attribute is not supported in combination with flatten, neither on the outer struct nor on the flattened field.

IMO If we have tests it feels pretty safe to use it.

Regarding tests. I see that this can be done entirely in unit tests. For cli arguments it should be possible to craft the match object manually (I havent looked really deep, but I feels that it should be possible to just parse args from provided string).

For http API we testing only serde code here, so it feels that usual unit tests with some strings provided to serde_json::from_str should suffice and running pageserver is not needed.

@skyzh do you want to drive this? Or I can implement suggested changes

@skyzh
Copy link
Member

skyzh commented May 18, 2023

Sounds good. I think we can also use serde for cli arguments and use a single unit test for only ser/deserialize. I’ll do some refactors and push to this PR.

@skyzh
Copy link
Member

skyzh commented May 18, 2023

… and interesting, I thought I was doing exactly the same thing yesterday by adding deny unknown fields on the outer struct and it didn’t pass tests.

Signed-off-by: Alex Chi <iskyzh@gmail.com>
@skyzh
Copy link
Member

skyzh commented May 18, 2023

yeah it worked well (and developers just need to remember add deny_unknown_fields on all structs that flattens tenant config). Adding derive(Debug) to some structs because unwrap_err requires it.

Signed-off-by: Alex Chi <iskyzh@gmail.com>
@skyzh
Copy link
Member

skyzh commented May 18, 2023

Created a separate PR for parsing cli config using serde to avoid making this PR too complex -> #4273

@skyzh skyzh requested a review from LizardWizzard May 18, 2023 16:26
Signed-off-by: Alex Chi <iskyzh@gmail.com>
Copy link
Contributor

@LizardWizzard LizardWizzard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@skyzh skyzh merged commit b391c94 into main May 19, 2023
27 checks passed
@skyzh skyzh deleted the problame/tenant-config-reject-unknown-fields branch May 19, 2023 01:16
@problame
Copy link
Contributor Author

For the record, when I tried to use use_unknown_fields, I put it on the TenantConfig:

diff --git a/libs/pageserver_api/src/models.rs b/libs/pageserver_api/src/models.rs
index 3bfedd14e..0ef2f6f9e 100644
--- a/libs/pageserver_api/src/models.rs
+++ b/libs/pageserver_api/src/models.rs
@@ -132,7 +132,6 @@ pub struct TimelineCreateRequest {
 
 #[serde_as]
 #[derive(Serialize, Deserialize, Debug, Default)]
-#[serde(deny_unknown_fields)]
 pub struct TenantCreateRequest {
     #[serde(default)]
     #[serde_as(as = "Option<DisplayFromStr>")]
@@ -150,6 +149,7 @@ impl std::ops::Deref for TenantCreateRequest {
 }
 
 #[derive(Serialize, Deserialize, Debug, Default)]
+#[serde(deny_unknown_fields)]
 pub struct TenantConfig {
     pub checkpoint_distance: Option<u64>,
     pub checkpoint_timeout: Option<String>,

It compile-time-passes but runtime-fails the previously-Python-now-Rust-unit-tests in that case. So, my sanity is restored.

problame added a commit that referenced this pull request May 24, 2023
This PR adds support for supplying the tenant config upon /attach.

Before this change, when relocating a tenant using `/detach` and
`/attach`, the tenant config after `/attach` would be the default config
from `pageserver.toml`.
That is undesirable for settings such as the PITR-interval: if the
tenant's config on the source was `30 days` and the default config on
the attach-side is `7 days`, then the first GC run would eradicate 23
days worth of PITR capability.

The API change is backwards-compatible: if the body is empty, we
continue to use the default config.
We'll remove that capability as soon as the cloud.git code is updated to
use attach-time tenant config
(#4282 keeps track of this).

unblocks neondatabase/cloud#5092 
fixes #1555
part of #886 (Tenant
Relocation)

Implementation
==============

The preliminary PRs for this work were (most-recent to least-recent)

* #4279
* #4267
* #4252
* #4235
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants