Self-service private-link configuration#2944
Merged
Merged
Conversation
d41b7a5 to
e9736cb
Compare
4a99da9 to
6ab0520
Compare
6ab0520 to
55164df
Compare
6ec128a to
5bf9a1f
Compare
GregorShear
reviewed
Jun 12, 2026
5bf9a1f to
6e4b40c
Compare
…async-graphql derives
The GraphQL API and the data-plane controller need to speak the same `PrivateLink` shape: DPC reads the `private_links json[]` column off `data_planes` and ships it into the est-dry-dock Pulumi stack, and the GraphQL surface needs both typed read and a typed input for the mutation that replaces the column. Defining the types once in `models` is the only way to share them without making DPC depend on the GraphQL crate.
* Move `PrivateLink`, `AWSPrivateLink`, `AzurePrivateLink`, `GCPPrivateServiceConnect` to `models::private_links`, re-export from `data-plane-controller::shared::stack` so existing DPC callers keep compiling unchanged.
* Add async-graphql derives under the existing `async-graphql` feature: `Union` + `OneofObject` on `PrivateLink`, `SimpleObject` + `InputObject` on each provider struct. Same Rust enum drives `union PrivateLink` on output and `input PrivateLinkInput @oneOf` on input, so the two GraphQL surfaces cannot drift.
* Azure `dns_name` and `resource_type` flip from `String`-with-empty-as-sentinel to `Option<String>` with a serde adapter that normalizes both an absent field and an empty string to `None` on read and omits both `None` and `Some("")` on write. Wire format byte-for-byte unchanged; the GraphQL surface exposes them as nullable instead of requiring clients to know the empty-string convention.
* AWS `service_region: Option<String>` carries the master-side change (commit b1d51ed) into the moved type, with `skip_serializing_if = "Option::is_none"` so the column stays absent (not `null`) when unset, preserving est-dry-dock's existing fallback to `region`.
6e4b40c to
9c34298
Compare
…d `updateDataPlanePrivateLinks` mutation Exposes the configured private-link endpoints on the `dataPlanes` query as a typed union against `models::PrivateLink`, alongside the three provisioning-result endpoint columns (`awsLinkEndpoints`, `azureLinkEndpoints`, `gcpPscEndpoints`) as opaque JSON arrays. Adds a mutation that replaces the entire `private_links` column on a private data plane; the data-plane controller picks up the new configuration on its next poll and converges via Pulumi. * `dataPlanes` selects the raw `private_links` JSON once per page and parses each entry lazily in a `ComplexObject` resolver, so a malformed historical row produces a descriptive field-level error naming the data plane and the failing index rather than breaking the whole query selection. * `updateDataPlanePrivateLinks(dataPlaneName, privateLinks)` accepts `[PrivateLinkInput!]!` directly against the same `models::PrivateLink` types via async-graphql's `@oneOf` input; supplying multiple branches in a single element is rejected by GraphQL validation before the handler runs. * Name gating is a structural check: the name must sit under `ops/dp/private/<tenant>/...`. Anything more specific (cluster suffix shape, owning prefix shape) is the data plane's problem; an unknown name falls out as "not found" when the UPDATE matches zero rows. Per-provider validation is intentionally not duplicated here: the authoritative schema is the est-dry-dock Pydantic model on the consumer side, and a bad shape surfaces via DPC convergence. * Interim authorization requires `read` on the private data-plane name, mirroring the existing deployment mutation shape. This will move to `manage_dataplane` once the orthogonal capability model lands. * Returns `Boolean!` so the mutation surface is minimal; callers that need the post-write state re-query `dataPlanes`.
Adds two capability bits, `ViewDataPlanePrivateNetworking` and `ModifyDataPlanePrivateNetworking`, composed by a new `ManageDataPlane` bundle. The bundle is granted on the role-grant edge `<tenant>/ -> ops/dp/private/<tenant>/` at provisioning time, so tenant admins (whose `Admin` bundle composes `ManageDataPlane`) inherit both bits on their private data planes; non-admin tenant grantees intersect to their own bundle at the edge and never reach these bits.
* `private_links` resolver on `DataPlane` returns an empty list to callers that lack `ViewDataPlanePrivateNetworking` on the data plane name.
* `updateDataPlanePrivateLinks` mutation requires `ModifyDataPlanePrivateNetworking` (replacing the interim legacy `read` check).
* `evaluate_names_authorization` accepts any `Into<CapabilitySet> + Display + Copy`, letting call sites pass either legacy `models::Capability` or a single `models::authz::Capability` bit; legacy callers keep working unchanged via the `From<legacy>` impl.
* `models::authz::Capability` gets a `Display` impl that delegates to `Debug`, so error messages reference each bit by its Rust identifier.
* `create_data_plane.rs` stamps `bundles = '{manage_data_plane}'` on the new role grant. The accompanying two-step migration adds the enum value and backfills existing rows -- `alter type add value` cannot be used in the same transaction that subsequently references the new value.
9c34298 to
4095dbd
Compare
GregorShear
approved these changes
Jun 16, 2026
GregorShear
added a commit
that referenced
this pull request
Jun 16, 2026
…lities The service-account management surface was asymmetric: listing authorized on the fine-grained `ManageServiceAccounts` capability while every mutation required the full `Admin` bundle. That produced a class of principal (a `TeamAdmin` without `Admin`) who could see service accounts but manage none of them, and it meant the named capability the feature defines was never the actual gate for any write. Make the surface track the capabilities it defines: - Anchor checks (create/add-grant/remove-grant/create-key/revoke-key, all on the account's `catalog_name`) now require `ManageServiceAccounts`, matching the listing query. A `TeamAdmin` can fully manage accounts and their keys without holding `Admin`. - The per-grant prefix check now requires `CreateGrant` on each granted prefix rather than `Admin`. This is the anti-escalation guard — a caller still can't hand a service account reach they couldn't grant anyone — but it keys off the capability that authorizes granting, not the whole Admin bundle. Human-user grant creation still lives in PostgREST; when it migrates to GraphQL it should adopt this same `CreateGrant` gate. To express fine-grained capabilities at these call sites, generalize `evaluate_names_authorization` and `verify_authorization` to accept anything that converts into a `CapabilitySet` (legacy `models::Capability`, a single `models::authz::Capability` bit, or an explicit set). The BFS primitive already operated on `CapabilitySet`, so this only lifts the wrapper signatures; existing legacy-capability callers are unaffected. Add a `Display` impl for `models::authz::Capability` so denial messages render the required capability. These three changes mirror the same generalization in #2944 so the two branches converge cleanly on whichever merges second. Add `test_team_admin_manages_without_full_admin`, which seeds a caller holding only the `team_admin` bundle (no `Admin`) and asserts both the positive path (manage accounts, mint keys, grant prefixes within reach) and the anti-escalation boundary (cannot grant a prefix they lack `CreateGrant` on). Also correct the migration comments, which claimed API keys are never exchanged for a JWT — the token-exchange endpoint mints a short-lived access token from a key.
GregorShear
added a commit
that referenced
this pull request
Jun 17, 2026
…lities The service-account management surface was asymmetric: listing authorized on the fine-grained `ManageServiceAccounts` capability while every mutation required the full `Admin` bundle. That produced a class of principal (a `TeamAdmin` without `Admin`) who could see service accounts but manage none of them, and it meant the named capability the feature defines was never the actual gate for any write. Make the surface track the capabilities it defines: - Anchor checks (create/add-grant/remove-grant/create-key/revoke-key, all on the account's `catalog_name`) now require `ManageServiceAccounts`, matching the listing query. A `TeamAdmin` can fully manage accounts and their keys without holding `Admin`. - The per-grant prefix check now requires `CreateGrant` on each granted prefix rather than `Admin`. This is the anti-escalation guard — a caller still can't hand a service account reach they couldn't grant anyone — but it keys off the capability that authorizes granting, not the whole Admin bundle. Human-user grant creation still lives in PostgREST; when it migrates to GraphQL it should adopt this same `CreateGrant` gate. To express fine-grained capabilities at these call sites, generalize `evaluate_names_authorization` and `verify_authorization` to accept anything that converts into a `CapabilitySet` (legacy `models::Capability`, a single `models::authz::Capability` bit, or an explicit set). The BFS primitive already operated on `CapabilitySet`, so this only lifts the wrapper signatures; existing legacy-capability callers are unaffected. Add a `Display` impl for `models::authz::Capability` so denial messages render the required capability. These three changes mirror the same generalization in #2944 so the two branches converge cleanly on whichever merges second. Add `test_team_admin_manages_without_full_admin`, which seeds a caller holding only the `team_admin` bundle (no `Admin`) and asserts both the positive path (manage accounts, mint keys, grant prefixes within reach) and the anti-escalation boundary (cannot grant a prefix they lack `CreateGrant` on). Also correct the migration comments, which claimed API keys are never exchanged for a JWT — the token-exchange endpoint mints a short-lived access token from a key.
GregorShear
added a commit
that referenced
this pull request
Jun 17, 2026
…lities The service-account management surface was asymmetric: listing authorized on the fine-grained `ManageServiceAccounts` capability while every mutation required the full `Admin` bundle. That produced a class of principal (a `TeamAdmin` without `Admin`) who could see service accounts but manage none of them, and it meant the named capability the feature defines was never the actual gate for any write. Make the surface track the capabilities it defines: - Anchor checks (create/add-grant/remove-grant/create-key/revoke-key, all on the account's `catalog_name`) now require `ManageServiceAccounts`, matching the listing query. A `TeamAdmin` can fully manage accounts and their keys without holding `Admin`. - The per-grant prefix check now requires `CreateGrant` on each granted prefix rather than `Admin`. This is the anti-escalation guard — a caller still can't hand a service account reach they couldn't grant anyone — but it keys off the capability that authorizes granting, not the whole Admin bundle. Human-user grant creation still lives in PostgREST; when it migrates to GraphQL it should adopt this same `CreateGrant` gate. To express fine-grained capabilities at these call sites, generalize `evaluate_names_authorization` and `verify_authorization` to accept anything that converts into a `CapabilitySet` (legacy `models::Capability`, a single `models::authz::Capability` bit, or an explicit set). The BFS primitive already operated on `CapabilitySet`, so this only lifts the wrapper signatures; existing legacy-capability callers are unaffected. Add a `Display` impl for `models::authz::Capability` so denial messages render the required capability. These three changes mirror the same generalization in #2944 so the two branches converge cleanly on whichever merges second. Add `test_team_admin_manages_without_full_admin`, which seeds a caller holding only the `team_admin` bundle (no `Admin`) and asserts both the positive path (manage accounts, mint keys, grant prefixes within reach) and the anti-escalation boundary (cannot grant a prefix they lack `CreateGrant` on). Also correct the migration comments, which claimed API keys are never exchanged for a JWT — the token-exchange endpoint mints a short-lived access token from a key.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR exposes the
data_planes.private_linksconfiguration through the GraphQL API. Today private data-plane customers file a support request and we edit the column by hand.dataPlanesquery: private-networking fieldsprivateLinksreturns the configured endpoints as a typed union (AWSPrivateLink | AzurePrivateLink | GCPPrivateServiceConnect).awsLinkEndpoints/azureLinkEndpoints/gcpPscEndpointsexpose the provisioning results the data-plane controller writes back, as opaque JSON.ViewDataPlanePrivateNetworkingcapability and fail closed to an empty list.updateDataPlanePrivateLinksmutation[PrivateLinkInput!]!, a oneof input mirroring the union, and replaces the entireprivate_linksarray on a private data plane; partial updates are intentionally not supported. The DPC converges to the new configuration on its next poll.ModifyDataPlanePrivateNetworkingon the data-plane name, after a structural check that the name is underops/dp/private/<tenant>/.Authorization model
ViewDataPlanePrivateNetworkingandModifyDataPlanePrivateNetworking.Viewerbundle (and therefore legacyread) carries the View bit:readon a data-plane prefix already conveys deploy-level trust, so viewing the plane's networking configuration comes with it.Modifyreaches a tenant admin through the conjunction of two grants: theops/dp/private/<tenant>/role_grant must carry the newManageDataPlanebundle (create_data_planeinstalls it at provisioning time; a migration backfills existing private data planes), and the user's own bits at the tenant must include the bit, which today meansadminsinceAdminfoldsManageDataPlanein. Bits intersect along the delegation chain, so neither grant alone conveysModify.ModifyDataPlanePrivateNetworkingto a non-admin, create a user_grant on the tenant carrying{manage_data_plane, delegate}(Editoralready carriesDelegate, so editors need onlymanage_data_plane).Shared models
PrivateLinktypes moved fromdata-plane-controller::shared::stacktomodels, with feature-gated async-graphql derives so one set of structs backs the DB column, the GraphQL schema, and the DPC (which re-exports them).dns_name/resource_typeare nowOption<String>. The previous shape wasStringwithskip_serializing_if = "String::is_empty", so""and absent were already indistinguishable on the wire: both serialized as a missing field. The new deserializer normalizes an incoming""toNone, and serialization stays field-absent, so the stored bytes and the stack config est-dry-dock reads are identical for every historical row shape (absent,"", or a value); a round-trip test inmodels::private_linkspins this.""and use truthiness checks. The visible effect is confined to GraphQL, where the fields are nullable and rows that stored""surface asnull.Reviewer notes
Trade-offs discussed during review, recorded so they don't need to be re-litigated:
authorized()helper ingraphql/mod.rs; what remains per resolver is the gate call plus a clone-and-wrap body. Agated_endpointshelper bundling the gate and the JSON mapping was tried and dropped as trivial-body indirection under a vague name. Keeping the gate visible inside each resolver keeps the authorization audit local; the cost is that a change to the denial behavior must be edited in four places.Option<String>was chosen over two alternatives: keepingString(zero glue, but the""sentinel becomes part of the public GraphQL contract asString!), and mapping""tonullat the GraphQL layer only (DPC untouched, but it requires an Azure-only split of input and output types while the model keeps the sentinel). Normalizing inmodelscosts a small serde adapter plus a round-trip test, and makes the nullable contract hold everywhere the type is used.ViewDataPlanePrivateNetworkingis kept as its own bit even thoughreadcurrently implies it viaViewer. The rejected alternative was to drop the bit and let DP-node visibility be the gate, which would have deleted the field gates entirely. Keeping the bit means each field gate names its capability explicitly, and the View / Modify split survives future changes to whatreadimplies.Deployment note
Migrations are applied manually; the sequencing matters:
create_data_planebinds themanage_data_planevalue, so provisioning a private data plane on the new code errors if the value is missing. (Two migration files becausealter type add valuecannot be referenced in the same transaction.)bundlesstrictly in the snapshot loader, so any row carryingmanage_data_planebreaks theirrole_grantsfetch. For the same reason, do not provision a private data plane mid-rollout while old pods are live.Fixes #2773