Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protectedts: rework protected timestamp storage for multi-tenant model #73727

Closed
22 of 23 tasks
adityamaru opened this issue Dec 13, 2021 · 7 comments
Closed
22 of 23 tasks
Assignees
Labels
A-disaster-recovery A-kv Anything in KV that doesn't belong in a more specific category. branch-master Failures on the master branch. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-disaster-recovery

Comments

@adityamaru
Copy link
Contributor

adityamaru commented Dec 13, 2021

This is a tracking issue for the work that will be required to support protected timestamps in a multi-tenant mode. This is a very high level breakdown of tasks, the understanding of which might evolve as we get to coding them.

RFC #74685

Client side:

KV side

Stability followups (requirements):

Stability followups (nice to have):

  • Add support for tenant spans to spanconfig.Store and spanconfig.KVSubscriber datadriven tests.

Dropped approach for RPCS:

Epic: CRDB-10306

Jira issue: CRDB-11712

@adityamaru adityamaru added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-disaster-recovery labels Dec 13, 2021
@adityamaru adityamaru added this to Triage in Disaster Recovery Backlog via automation Dec 13, 2021
@blathers-crl
Copy link

blathers-crl bot commented Dec 13, 2021

cc @cockroachdb/bulk-io

@adityamaru adityamaru changed the title protectedts: rework protected timestamp storage interface for multi-tenant model protectedts: rework protected timestamp storage for multi-tenant model Dec 13, 2021
adityamaru added a commit to adityamaru/cockroach that referenced this issue Dec 22, 2021
Protected timestamp records are moving away from
the notion of protecting spans, and instead will operate
on objects. Objects will be defined as:

- Cluster
- Tenant
- Schema objects (database and table)

This change deprecates the Spans field on `ptpb.Record`.
Additionally, it adds a `oneof` target field that reflects
which of the above objects the record corresponds to.
This information will be needed by the SQLTranslator/Reconciler
in future work to emit SpanConfigurations based on the type
of object the job is protecting.

Informs: cockroachdb#73727

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Dec 22, 2021
Protected timestamp records are moving away from
the notion of protecting spans, and instead will operate
on objects. Objects will be defined as:

- Cluster
- Tenant
- Schema objects (database and table)

This change deprecates the Spans field on `ptpb.Record`.
Additionally, it adds a `oneof` target field that reflects
which of the above objects the record corresponds to.
This information will be needed by the SQLTranslator/Reconciler
in future work to emit SpanConfigurations based on the type
of object the job is protecting.

Informs: cockroachdb#73727

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Dec 23, 2021
Protected timestamp records are moving away from
the notion of protecting spans, and instead will operate
on objects. Objects will be defined as:

- Cluster
- Tenant
- Schema objects (database and table)

This change deprecates the Spans field on `ptpb.Record`.
Additionally, it adds a `oneof` target field that reflects
which of the above objects the record corresponds to.
This information will be needed by the SQLTranslator/Reconciler
in future work to emit SpanConfigurations based on the type
of object the job is protecting.

Informs: cockroachdb#73727

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Dec 27, 2021
This change adds a `BYTES` column to the system.pts_records
table, that will store the `Target` on a protected timestamp
record. We do not populate the column in this PR, that will
be done in a followup.

Additionally, we add a tenant migration to `ALTER TABLE ... ADD COLUMN`
to run on the upgrade path for older clusters.

Informs: cockroachdb#73727

Release note (sql change): `system.protected_timestamp_records` table
now has an additional `target` column that will store the target a
record protects. This target can either be the entire cluster, a list
of tenant keyspaces, or a list of database/table keyspaces.
adityamaru added a commit to adityamaru/cockroach that referenced this issue Dec 27, 2021
This change touches all jobs that create a protected timestamp
record before calling `Protect`. Previously, the created record
would contain the spans that the record was going to protect.

With this change, the record will also populate the `Target`
field on `ptpb.Record` with the object it is going to protect.
This can be one of:
- Cluster
- Tenants
- Schema object (database or table)

For backups, this target field is determined based on the targets
passed in by the user via the `BACKUP <target>` query.

For changefeeds, this target is the group of tables on which the
changefeed is being started + `system.descriptors` table.

For the streaming job, this target is the tenant that is being
streamed.

This change does not touch any of the several test files that create
a raw `ptpb.Record` for testing purposes. That will come in a follow
up PR where we actually teach `Protect` to validate and make use
of this `Target` field.

Informs: cockroachdb#73727

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Dec 28, 2021
Protected timestamp records now apply to a `ptpb.Target` instead
of `roachpb.Spans`. Protect will now write the `Target` on the record to the
`target` column in the system.pts_records table, and `GetRecord` will read
the `target` column when constructing a `ptpb.Record` to return to the
caller.

If we are running in a mixed version state where the relevant migration
to the system.pts_records table has not run, we use the existing logic
to protect Spans.

Once the migration has run, all calls to protect must
have a non-nil `Target` field on the record. While the new subsystem is
under development, we write both the `Spans` and the `Target` post migration.
Once the GC queue has been taught to use the `Target` field when making GC decisions,
we will write an empty `Spans` field in the `Protect` method, before persisting
state in the system table. All new records will no longer be reliant on
the Spans column, paving the way for deleting all Span related logic
in a future release.

This change tweaks the tests in `ptstorage_test` to populate the `Target`
field. All other test changes in `ptverifier`, `ptcache`, `kvserver` are
only to get past the "target is nil" validation step. These tests will
be changed as and when we develop those pieces of the subsystem.

Informs: cockroachdb#73727

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Dec 28, 2021
@adityamaru adityamaru removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Mar 8, 2022
@adityamaru
Copy link
Contributor Author

PTS is now wired up end to end to work in a multi-tenant cluster. We will track follow-up release blockers in their own issues.

craig bot pushed a commit that referenced this issue Mar 9, 2022
75883: ptverifier: delete the protected timestamp Verifier r=miretskiy,arulajmani a=adityamaru

Verification semantics are not supported in the new multi-tenant
protected timestamp subsystem. For details please refer to the `Verification` section in the
RFC at #74685.

This change deletes the Verifier, and all references of it.

Informs: #73727

Release note: None

Release justification: low risk change to existing functionality

Co-authored-by: Aditya Maru <adityamaru@gmail.com>
adityamaru added a commit to adityamaru/cockroach that referenced this issue Mar 10, 2022
Verification semantics are not supported in the new multi-tenant
protected timestamp subsystem. For details about why this is the
case please refer to the `Verification` section in the
RFC at cockroachdb#74685.

This change deletes the Verifier, and all references of it.

Informs: cockroachdb#73727

Release note: None

Release justification: low risk change to existing functionality
adityamaru added a commit to adityamaru/cockroach that referenced this issue Mar 10, 2022
This change switches the `EnableProtectedTimestampForMultiTenant`
testing knob to `DisableProtectedTimestampForMultiTenant`. This means
that all tests will now run with the ptpb.Target and spanconfig backed
protectedts infrastructure by default.

Informs: cockroachdb#73727

Release note: None

Release justification: non-production code changes
adityamaru added a commit to adityamaru/cockroach that referenced this issue Mar 10, 2022
This change switches the `EnableProtectedTimestampForMultiTenant`
testing knob to `DisableProtectedTimestampForMultiTenant`. This means
that all tests will now run with the ptpb.Target and spanconfig backed
protectedts infrastructure by default.

Informs: cockroachdb#73727

Release note: None

Release justification: non-production code changes
adityamaru added a commit to adityamaru/cockroach that referenced this issue Mar 10, 2022
This change adds a constructor method that returns
a new `spanconfig.Record` and conditionally performs
some validation if the target is a SystemTarget.

Informs: cockroachdb#73727

Release note: None

Release justification: low-risk updates to new functionality
adityamaru added a commit to adityamaru/cockroach that referenced this issue Mar 10, 2022
This change adds a constructor method that returns
a new `spanconfig.Record` and conditionally performs
some validation if the target is a SystemTarget.

Informs: cockroachdb#73727

Release note: None

Release justification: low-risk updates to new functionality
adityamaru added a commit to adityamaru/cockroach that referenced this issue Mar 11, 2022
This change switches the `EnableProtectedTimestampForMultiTenant`
testing knob to `DisableProtectedTimestampForMultiTenant`. This means
that all tests will now run with the ptpb.Target and spanconfig backed
protectedts infrastructure by default.

Informs: cockroachdb#73727

Release note: None

Release justification: non-production code changes
adityamaru added a commit to adityamaru/cockroach that referenced this issue Mar 11, 2022
This change switches the `EnableProtectedTimestampForMultiTenant`
testing knob to `DisableProtectedTimestampForMultiTenant`. This means
that all tests will now run with the ptpb.Target and spanconfig backed
protectedts infrastructure by default.

Informs: cockroachdb#73727

Release note: None

Release justification: non-production code changes
adityamaru added a commit to adityamaru/cockroach that referenced this issue Mar 11, 2022
This change adds a constructor method that returns
a new `spanconfig.Record` and conditionally performs
some validation if the target is a SystemTarget.

Informs: cockroachdb#73727

Release note: None

Release justification: low-risk updates to new functionality
craig bot pushed a commit that referenced this issue Mar 11, 2022
77349: spanconfig: add Record constructor and validation r=adityamaru a=adityamaru

This change adds a constructor method that returns
a new `spanconfig.Record` and conditionally performs
some validation if the target is a SystemTarget.

Informs: #73727

Release note: None

Release justification: low-risk updates to new functionality

77608: optbuilder: do not create invalid casts when building COALESCE and IF r=mgartner a=mgartner

The optbuilder no longer creates invalid casts when building COALESCE
and IF expressions that have children with different types. Expressions
that previously caused internal errors now result in user-facing errors.
Both UNION and CASE expressions had similar bugs that were recently
fixed in #75219 and #76193.

This commit also updates the `tree.ReType` function to return `ok=false`
if there is no valid cast to re-type the expression to the given type.
This forces callers to explicitly deal with situations where re-typing
is not possible and it ensures that the function never creates invalid
casts. This will make it easier to track down future related bugs
because internal errors should originate from the call site of
`tree.ReType` rather than from logic further along in the optimization
process (in the case of #76807 the internal error originated from the
logical props builder when it attempted to lookup the volatility of the
invalid cast).

This commit also adds special logic to make casts from any tuple type to
`types.AnyTuple` valid immutable, implicit casts. Evaluation of these
casts are no-ops. Users cannot construct these casts, but they are built
by optbuilder in some cases.

Fixes #76807

Release justification: This is a low-risk change that fixes a minor bug.

Release note (bug fix): A bug has been fixed that caused internal errors
when COALESCE and IF expressions had inner expressions with different
types that could not be cast to a common type.


77632: ui: new plan table on statement details r=maryliag a=maryliag

Previously, the Explain Plan tab on Statement Details was
showing only one plan. This commit introduces a table of plan
with their respective executions stats.
When a plan is clicked on the table, it shows the Plan and
its statistics.

Fixes #72129

Page on DB Console: [video](https://www.loom.com/share/0898f48021eb4037a6f86760053a5e85)
Page on CC Console: [video](https://www.loom.com/share/84d8ec40ae7e4eb19291788721ab7133)

<img width="1058" alt="Screen Shot 2022-03-10 at 2 42 25 PM" src="https://user-images.githubusercontent.com/1017486/157742210-12b79f5d-274c-48e6-8fb6-dafc74fd25b3.png">
<img width="988" alt="Screen Shot 2022-03-10 at 2 42 36 PM" src="https://user-images.githubusercontent.com/1017486/157742211-daa2a07c-b025-4b36-a49a-4cafe7117bc8.png">



Release justification: Category 4
Release note (ui change): Explain Plan tab on Statement Details
shows statistics for all the plans executed by the selected statement
on the selected period.

77688: ci: ensure all nightlies post github issues when tests fail r=rail a=rickystewart

Release justification: ensure failing nightlies post github issues
Release note: None

Co-authored-by: Aditya Maru <adityamaru@gmail.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Marylia Gutierrez <marylia@cockroachlabs.com>
Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
adityamaru added a commit to adityamaru/cockroach that referenced this issue Mar 14, 2022
This change switches the `EnableProtectedTimestampForMultiTenant`
testing knob to `DisableProtectedTimestampForMultiTenant`. This means
that all tests will now run with the ptpb.Target and spanconfig backed
protectedts infrastructure by default.

Informs: cockroachdb#73727

Release note: None

Release justification: non-production code changes
craig bot pushed a commit that referenced this issue Mar 14, 2022
77478: protectedts: flip knob to enable multi-tenant PTS in clusters r=arulajmani,samiskin a=adityamaru

This change switches the `EnableProtectedTimestampForMultiTenant`
testing knob to `DisableProtectedTimestampForMultiTenant`. This means
that all tests and clusters will now run with the ptpb.Target and spanconfig backed
protectedts infrastructure by default.



Informs: #73727

Release note: None

Release justification: high impact change to enable new functionality.

Co-authored-by: Aditya Maru <adityamaru@gmail.com>
@arulajmani
Copy link
Collaborator

I think it's safe to say this thing is done, so I'm going to close out this issue.

Disaster Recovery Backlog automation moved this from Triage to Done Apr 27, 2022
@mari-crl
Copy link
Contributor

ℹ️ Hello! I am a human and not at all a robot! Look at my very human username! 🤖 🎶
🤔 Although I tried very hard to figure out what to do with this issue, more powerful human brains will need to help me. (specifically: Both Github and Jira issues were modified after exalate problems) 😖
🔄 Please visit this issue's mirror at CRDB-11712 and try to sync the two sides up manually. 🌟
✅ When you're finished, comment saying as much asn a member of Developer Infrastructure will be along to finish linking. 🔗
🚫 Note that until this is done, this issue is not and will not be synced to Jira with Exalate. 🚫
😅 Feeling lost? Don't worry about it! A member of @cockroachdb/exalate-22-cleanup-team will be along shortly to help! 👍
👷 Developer Infrastructure members: when ready, open Exalate from the right-hand menu of the mirror issue in Jira, then choose Connect and enter this issue's URN: cockroachdb/cockroach-73727. Either way, delete this comment when you're done. 🔑
🙏 Thank you for your compliance, my fellow humans! 🤖 👋

@mwang1026
Copy link

sync done

@rail
Copy link
Member

rail commented May 26, 2022

Manually synced with Jira

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-disaster-recovery A-kv Anything in KV that doesn't belong in a more specific category. branch-master Failures on the master branch. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-disaster-recovery
Development

No branches or pull requests

5 participants