sql: potential inconsistency after certain schema changes #44007
Comments
Not that I know how gossip works, but could grouping descriptors modified in a txn into the same gossip message solve this problem? That would be difficult though, because different ranges could be affected by the descriptor writes. |
We need to define what is meant by "inconsistent state". Let's make this more concrete with descriptor versions.
Any combination of these pairs is valid (i.e.
They will be gossiped at the same time but I don't think that's relevant to the discussion at hand. Let's forget about gossip and its role in cache invalidation as it's not that critical and is more of an optimization. |
Generally, I'd imagine that either I think for operations that modify the descriptors we don't have to worry about this, so we just have to look at operations that just read the descriptors. Take an insert on a table with a foreign key reference. If the origin table has the reference, but the referenced table doesn't, should we still check that the constraint is validated on the insert? In the case of primary key changes, we aren't going to be deleting these references, but instead changing the objects on the descriptors that they point to (for example, in the interleaved case, we want to change the indexID). If we have one of these version mismatched pairs, then these references could point to objects which no longer exist on the newer version descriptors, which is definitely a problem. |
It is critical for schema changes to worry about exactly this.
I disagree; great care has been taken to make sure that the transitions between versions are safe. Generally this is done by modifying either the logical or the physical schema layout of a table in one step, but never both.
I'm still not sure I see a hazard at this point. Can you spell out a problematic case where we'll hit an error if in
This implies that the protocol for schema changes when changing a primary key will need to step through multiple versions. |
Either seems valid to me. That's a good bit of complexity to put into the optimizer to detect that case. |
Is it not wrong for one node (on A1 B2) to have different behavior than another node on (A2 B1)? Also, my understanding of the schema change system is that multiple stages are there to ensure that everything works when different versions of the same table exist in the cluster, not different versions of different tables. |
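To make the version pairs under discussion concrete, here is a minimal sketch that enumerates what a single node could hold in its cache while tables A and B each move from version 1 to version 2. The function names are hypothetical and are not part of the CockroachDB code base.

```python
from itertools import product


def cached_version_pairs(old=1, new=2):
    """All (A version, B version) pairs one node may hold mid-change."""
    return list(product((old, new), repeat=2))


def is_mixed(pair):
    """A pair is mixed when the two descriptors are at different versions."""
    a_version, b_version = pair
    return a_version != b_version


pairs = cached_version_pairs()
mixed = [p for p in pairs if is_mixed(p)]

for a_version, b_version in pairs:
    label = "mixed" if is_mixed((a_version, b_version)) else "matched"
    print(f"A{a_version} B{b_version}: {label}")
```

The two mixed pairs, (A1, B2) and (A2, B1), are the ones the question above is about.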
Not obviously, no. It seems wrong for there to be more than 2 behaviors total but if
That seems correct to me. Perhaps we need to add a dependency mechanism. When a table refers to another it should do so at a minimum version or something. Then we could make it illegal to use mixed versions which are incompatible. |
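A rough sketch of what such a dependency mechanism might look like, assuming each descriptor records the minimum version it requires of every table it refers to. The `min_versions` field and the `incompatible_refs` helper are hypothetical illustrations, not existing CockroachDB structures.

```python
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    name: str
    version: int
    # Hypothetical field: minimum version required of each referenced table.
    min_versions: dict = field(default_factory=dict)


def incompatible_refs(cache):
    """Return (referrer, target, required, cached) for each unmet requirement."""
    problems = []
    for desc in cache.values():
        for target, required in desc.min_versions.items():
            cached = cache.get(target)
            cached_version = cached.version if cached is not None else None
            if cached_version is None or cached_version < required:
                problems.append((desc.name, target, required, cached_version))
    return problems


# A node holding A2 and B1, where A2 requires B to be at version 2 or later.
mixed_cache = {
    "A": Descriptor("A", 2, {"B": 2}),
    "B": Descriptor("B", 1, {"A": 1}),
}
# The same node after it has also picked up B2.
matched_cache = {
    "A": Descriptor("A", 2, {"B": 2}),
    "B": Descriptor("B", 2, {"A": 2}),
}

print(incompatible_refs(mixed_cache))
print(incompatible_refs(matched_cache))
```

Under this assumption, a node that detects an unmet requirement could refresh the stale descriptor before using the pair instead of planning against an incompatible combination.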
Oh that could work! Though I'm not sure how we could migrate all the existing descriptors to do that. To clarify what I meant above "I think for operations that modify the descriptors we don't have to worry about this, so we just have to look at operations that just read the descriptors." -- the operations that modify descriptors don't have to worry about the mixed version cases, because they will always read (A2, B2) right? |
At least in the scope of execution of a transaction I believe that's roughly correct. A statement that modifies descriptors will have references to both the old versions and the uncommitted copies. For the rest of the transaction the uncommitted versions will get used. They are stored in the connExecutor in the
Does it matter? Certainly it matters for the upcoming primary key change work but does it matter prior to that work? I'm not clear on whether we've identified a hazardous case that exists today. |
Can you clarify this?
I'll look around more and see if I can find something, but as of now, no. |
Say we add a column to a table in a transaction. We cannot write to that column during that transaction. If we did then other concurrent transactions wouldn't know how to interpret data in the table. See #43057 / #42061 (comment) for that issue. |
FWIW, the "issue" that I'm foreseeing with primary key changes on interleaved tables exists right now when dropping an interleaved index. In this case, |
Sounds like you found a bug. |
Actually, I'm not sure. It seems like the only uses are in places which modify the table descriptor, so they wouldn't enter this mixed version state in the first place. |
@ajwerner and I discussed this offline. The summary of the discussion was:
Please update this if I missed anything or remembered incorrectly. |
In some cases (such as interleaved tables or foreign keys), a table A has a reference to table B, and table B maintains a back reference to A. At least in cases where dropping an object also requires removing the back reference from the referenced table, I believe there is a chance for nodes to be in an inconsistent state.
For example, take the code path for dropping an index, with cascade. If the index being dropped is in use for a foreign key constraint, then the constraint is also dropped. What this means is that the appropriate foreign key constraint is removed from both the origin table and the referenced table in the same transaction.
Now, because descriptors are cached and gossip is used to invalidate the caches, a node could end up in an inconsistent state with this deleted foreign key constraint. Imagine that a node gets the invalidation for the origin table but has not yet received the gossip update for the referenced table. It will then be in a state where the origin table says there isn't a foreign key, while the referenced table says that there is one (assuming my understanding of gossip is correct).
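The scenario can be modelled with a small sketch. This is a toy model of a per-node descriptor cache, not the real caching or gossip code; the `Descriptor` fields and the `dangling_refs` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    name: str
    version: int
    fk_to: frozenset = frozenset()    # tables this table references
    fk_from: frozenset = frozenset()  # tables that reference this table


def dangling_refs(cache):
    """List references in the cache that lack their matching counterpart."""
    problems = []
    for desc in cache.values():
        for target in desc.fk_to:
            if target in cache and desc.name not in cache[target].fk_from:
                problems.append(f"{desc.name} -> {target} has no back reference")
        for origin in desc.fk_from:
            if origin in cache and desc.name not in cache[origin].fk_to:
                problems.append(
                    f"{desc.name} lists a back reference from {origin}, "
                    f"but {origin} has no such foreign key"
                )
    return problems


# Committed descriptors before and after the transaction that drops the constraint.
before = {
    "origin": Descriptor("origin", 1, fk_to=frozenset({"referenced"})),
    "referenced": Descriptor("referenced", 1, fk_from=frozenset({"origin"})),
}
after = {
    "origin": Descriptor("origin", 2),
    "referenced": Descriptor("referenced", 2),
}

node_cache = dict(before)
node_cache["origin"] = after["origin"]  # invalidation for origin arrives first
mismatch = dangling_refs(node_cache)

node_cache["referenced"] = after["referenced"]  # second invalidation arrives
resolved = dangling_refs(node_cache)

print(mismatch)
print(resolved)
```

In this model the mismatch exists only in the window between the two invalidations, which is the window the paragraph above is concerned with.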
I believe this problem could affect most places where we drop resources that have this reference and back reference (such as dropping an interleaved index), and I don't see anything in place that prevents it. It also seems to affect #43759 for the same reason.
Jira issue: CRDB-5259