Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: use the table descriptor ADD state correctly #20714

Open
vivekmenezes opened this issue Dec 14, 2017 · 5 comments
Open

sql: use the table descriptor ADD state correctly #20714

vivekmenezes opened this issue Dec 14, 2017 · 5 comments
Labels
A-schema-changes A-schema-descriptors Relating to SQL table/db descriptor handling. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-erroneous-edge-case Database produces or stores erroneous data without visible error/warning, in rare edge cases. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@vivekmenezes
Copy link
Contributor

vivekmenezes commented Dec 14, 2017

A new table descriptor being added when referencing another table has to be added properly. An FK reference is implemented using two references (forward from new table to its FK reference, and backward from the FK referee to the new table). The forward reference is obvious, the backward reference on the other hand is used when updating the referee. Updating a referee table for example must ensure that it is not violating the FK reference. This becomes problematic if a node has cached a referee table without the backward reference and SQL is already putting data into the new table.

We solve this problem by placing a newly created table in the ADD state and then transition it into the PUBLIC state once all leased referee descriptors are using the backward link. A descriptor in the ADD state cannot be used to read/write to and so the table descriptor leasing logic disallows a node from holding a lease against such a table.

However one can have a situation where a referee sees a back reference to a table, the local node has a leased version of the new table in the ADD state, but the new table has transitioned to the PUBLIC state and another node has added data into the new table. Under the circumstances the referee must apply the backward FK checks it would normally apply.

This can be resolved by allowing the new table to be leased in the ADD state but ensure that only fk checks and cascade operations can use the descriptor. This can mean that the normal API still disallow returning descriptors in the ADD state while allowing them to be leased internally, and the FK operations use a separate API that allows grabbing references on leased descriptors in the ADD state.

Jira issue: CRDB-5914

@knz knz added A-bulkio-schema-changes A-schema-descriptors Relating to SQL table/db descriptor handling. labels Apr 28, 2018
@knz knz added this to Triage in Bulk IO via automation Apr 28, 2018
@knz knz added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Apr 28, 2018
@vivekmenezes vivekmenezes added this to Triage in Bulk IO via automation May 24, 2018
@vivekmenezes vivekmenezes moved this from Triage to Milestone June 25 - Todo in Bulk IO May 24, 2018
@maddyblue maddyblue moved this from Milestone June 25 - Todo to Milestone July 23rd in Bulk IO Jun 12, 2018
@vivekmenezes vivekmenezes moved this from Milestone July 23rd to Backlog high priority in Bulk IO Jun 20, 2018
@vivekmenezes vivekmenezes moved this from Backlog high priority to Milestone July 23rd in Bulk IO Jun 20, 2018
@vivekmenezes vivekmenezes moved this from Milestone July 23rd to Backlog high priority in Bulk IO Jun 20, 2018
@vivekmenezes vivekmenezes added this to Triage in Disaster Recovery Backlog via automation Jul 24, 2018
@vivekmenezes vivekmenezes removed this from Backlog high priority in Bulk IO Jul 24, 2018
@vivekmenezes vivekmenezes moved this from Triage to Backlog in Disaster Recovery Backlog Jul 24, 2018
@vivekmenezes vivekmenezes moved this from Backlog to New schema lease in Disaster Recovery Backlog Oct 29, 2018
@vivekmenezes
Copy link
Contributor Author

This problem will be resolved by the new schema lease mechanism which will lease the entire schema at a timestamp, thereby ensuring that a node sees the reference and the back-reference in the PUBLIC state.

@dt dt moved this from New schema lease to Backlog in Disaster Recovery Backlog May 21, 2019
@dt dt moved this from Untriaged to Backlog: Constrains in Disaster Recovery Backlog Jul 16, 2019
@dt dt moved this from Backlog: Constrains to Backlog: Table/Index Create/Drop in Disaster Recovery Backlog Sep 9, 2019
@mwang1026 mwang1026 added this to Triage in SQL Foundations via automation Jan 27, 2020
@dt dt removed this from Backlog: Table/Index Create/Drop in Disaster Recovery Backlog Mar 6, 2020
@thoszhang thoszhang added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-erroneous-edge-case Database produces or stores erroneous data without visible error/warning, in rare edge cases. and removed C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) labels Apr 28, 2020
@thoszhang thoszhang moved this from Triage to Cold storage in SQL Foundations Apr 28, 2020
@thoszhang thoszhang moved this from Cold storage to 20.2 Backlog in SQL Foundations Apr 28, 2020
@jlinder jlinder added the T-sql-schema-deprecated Use T-sql-foundations instead label Jun 16, 2021
@ajwerner
Copy link
Contributor

This seems quite real. See

// resolveTable resolves a table StableID. Returns nil if the table is in the
// process of being added, in which case it is safe to ignore any FK
// relation with the table.
func resolveTable(ctx context.Context, catalog cat.Catalog, id cat.StableID) cat.Table {
ref, isAdding, err := catalog.ResolveDataSourceByID(ctx, cat.Flags{}, id)
if err != nil {
if isAdding {
// The table is in the process of being added.
return nil
}
panic(err)
}
return ref.(cat.Table)
}

@ajwerner
Copy link
Contributor

ajwerner commented Aug 2, 2021

My current thinking on how to solve this relates to the fact that we validate cross references at the present snapshot when we perform leasing. That means that when we lease a descriptor, we know the modification timestamps and versions of all of the referenced descriptors. If the leased descriptor tracked this, then we could enforce that if any of those referenced descriptors is leased by a transaction, that it must be the newer one. This might pick up false dependencies, but they should be short-lived.

@postamar postamar moved this from Backlog to 22.2 General in SQL Foundations Mar 23, 2022
@ajwerner
Copy link
Contributor

The above analysis is barely relevant to the problem at hand. The above protocol might be useful in some other situations, but here I think we indeed need the optimizer to plan foreign key cascades to ADDing tables in order to avoid missing writes. I think we added the necessary pieces in the catalog to resolve the cascades, so really it seems like fixing this can be done from the optimizer.

@ajwerner
Copy link
Contributor

cc @mgartner

@postamar postamar moved this from General to Backlog in SQL Foundations Sep 19, 2022
@postamar postamar moved this from Backlog to Cold storage in SQL Foundations Nov 10, 2022
@exalate-issue-sync exalate-issue-sync bot added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-schema-deprecated Use T-sql-foundations instead labels May 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-schema-changes A-schema-descriptors Relating to SQL table/db descriptor handling. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-erroneous-edge-case Database produces or stores erroneous data without visible error/warning, in rare edge cases. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
SQL Foundations
  
Cold storage
Development

No branches or pull requests

5 participants