New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spanconfigkvsubscriber: replica of a GC'ed span does not respect SystemSpanConfigs #113867
Labels
A-kv
Anything in KV that doesn't belong in a more specific category.
branch-release-23.1
Used to mark GA and release blockers and technical advisories for 23.1
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
O-testcluster
Issues found or occurred on a test cluster, i.e. a long-running internal cluster
P-3
Issues/test failures with no fix SLA
Comments
adityamaru
added
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
A-kv
Anything in KV that doesn't belong in a more specific category.
GA-blocker
O-testcluster
Issues found or occurred on a test cluster, i.e. a long-running internal cluster
labels
Nov 6, 2023
Hi @adityamaru, please add branch-* labels to identify which branch(es) this release-blocker affects. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
I've slapped on the GA-blocker label, but technically this is not a regression from 22.2 when this infrastructure was first introduced so we can re-evaluate. |
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Nov 7, 2023
WIP Informs: cockroachdb#113867 Release note (bug fix): WIP
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Nov 8, 2023
Previously, if a range did not have an overlapping span config then `ForEachOverlappingSpanConfig` would not apply the relevant system span configs that may still apply to that range. One situation in which we could end up with such a range when a table is dropped and all of its data (including the range deletion tombstone installed by the drop) is GC'ed, the associated schema change GC job will delete the table's span config. In this case, we will not find any overlapping span configs for the table's span, but a system span config, such as a cluster wide protection policy, may still be applicable to the replica with the empty table span. A scan of that span AOST the timestamp at which we wrote the protection policy could result in a BatchTimestampBeforeGCError. This change fixes the above bug by applying a fallback config to the span with no overlapping span configs, and combining the system span configs that apply to the span. The change adds a red-green test to exercise this logic. Informs: cockroachdb#113867 Release note (bug fix): an empty range corresponding to a drop table did not respect system level span configurations such as protected timestamps, potentially causing reads above the protected timestamp to fail
craig bot
pushed a commit
that referenced
this issue
Nov 21, 2023
Previously, if a range did not have an overlapping span config then `ForEachOverlappingSpanConfig` would not apply the relevant system span configs that may still apply to that range. One situation in which we could end up with such a range when a table is dropped and all of its data (including the range deletion tombstone installed by the drop) is GC'ed, the associated schema change GC job will delete the table's span config. In this case, we will not find any overlapping span configs for the table's span, but a system span config, such as a cluster wide protection policy, may still be applicable to the replica with the empty table span. A scan of that span AOST the timestamp at which we wrote the protection policy could result in a BatchTimestampBeforeGCError. This change fixes the above bug by applying a fallback config to the span with no overlapping span configs, and combining the system span configs that apply to the span. The change adds a red-green test to exercise this logic. Informs: #113867 Release note (bug fix): an empty range corresponding to a drop table did not respect system level span configurations such as protected timestamps, potentially causing reads above the protected timestamp to fail
craig bot
pushed a commit
that referenced
this issue
Nov 21, 2023
114050: spanconfigkvsubscriber: fix missing system span configs r=stevendanna a=adityamaru Previously, if a range did not have an overlapping span config then `ForEachOverlappingSpanConfig` would not apply the relevant system span configs that may still apply to that range. One situation in which we could end up with such a range when a table is dropped and all of its data (including the range deletion tombstone installed by the drop) is GC'ed, the associated schema change GC job will delete the table's span config. In this case, we will not find any overlapping span configs for the table's span, but a system span config, such as a cluster wide protection policy, may still be applicable to the replica with the empty table span. A scan of that span AOST the timestamp at which we wrote the protection policy could result in a BatchTimestampBeforeGCError. This change fixes the above bug by applying a fallback config to the span with no overlapping span configs, and combining the system span configs that apply to the span. The change adds a red-green test to exercise this logic. Informs: #113867 Release note (bug fix): an empty range corresponding to a drop table did not respect system level span configurations such as protected timestamps, potentially causing reads above the protected timestamp to fail 114802: obsservice: creating Insights Pipeline r=maryliag a=maryliag Note to Reviewers: only last commit is relevant to this PR --- Create basic structure for Insights Pipeline. Currently the Validator (always return that is valid) and Processor (just printing the value) are very simple. This commit is to introduce that structure so can be iterated on. https://www.loom.com/share/b568f514753542d9bd5e551b530f917d Part Of [CC-26215](https://cockroachlabs.atlassian.net/browse/CC-26215) Release note: None 114823: types: make Family.Name return redact.SafeString r=yuzefovich a=yuzefovich All family names are safe constant strings. Epic: None Release note: None Co-authored-by: adityamaru <adityamaru@gmail.com> Co-authored-by: maryliag <marylia@cockroachlabs.com> Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
blathers-crl bot
pushed a commit
that referenced
this issue
Nov 21, 2023
Previously, if a range did not have an overlapping span config then `ForEachOverlappingSpanConfig` would not apply the relevant system span configs that may still apply to that range. One situation in which we could end up with such a range when a table is dropped and all of its data (including the range deletion tombstone installed by the drop) is GC'ed, the associated schema change GC job will delete the table's span config. In this case, we will not find any overlapping span configs for the table's span, but a system span config, such as a cluster wide protection policy, may still be applicable to the replica with the empty table span. A scan of that span AOST the timestamp at which we wrote the protection policy could result in a BatchTimestampBeforeGCError. This change fixes the above bug by applying a fallback config to the span with no overlapping span configs, and combining the system span configs that apply to the span. The change adds a red-green test to exercise this logic. Informs: #113867 Release note (bug fix): an empty range corresponding to a drop table did not respect system level span configurations such as protected timestamps, potentially causing reads above the protected timestamp to fail
I'll keep this open until the 23.1 backport merges |
adityamaru
added
branch-release-23.1
Used to mark GA and release blockers and technical advisories for 23.1
and removed
GA-blocker
labels
Nov 22, 2023
cockroach-teamcity
pushed a commit
to cockroach-teamcity/cockroach
that referenced
this issue
Nov 27, 2023
Previously, if a range did not have an overlapping span config then `ForEachOverlappingSpanConfig` would not apply the relevant system span configs that may still apply to that range. One situation in which we could end up with such a range when a table is dropped and all of its data (including the range deletion tombstone installed by the drop) is GC'ed, the associated schema change GC job will delete the table's span config. In this case, we will not find any overlapping span configs for the table's span, but a system span config, such as a cluster wide protection policy, may still be applicable to the replica with the empty table span. A scan of that span AOST the timestamp at which we wrote the protection policy could result in a BatchTimestampBeforeGCError. This change fixes the above bug by applying a fallback config to the span with no overlapping span configs, and combining the system span configs that apply to the span. The change adds a red-green test to exercise this logic. Informs: cockroachdb#113867 Release note (bug fix): an empty range corresponding to a drop table did not respect system level span configurations such as protected timestamps, potentially causing reads above the protected timestamp to fail
adityamaru
added a commit
to adityamaru/cockroach
that referenced
this issue
Dec 1, 2023
Previously, if a range did not have an overlapping span config then `ForEachOverlappingSpanConfig` would not apply the relevant system span configs that may still apply to that range. One situation in which we could end up with such a range when a table is dropped and all of its data (including the range deletion tombstone installed by the drop) is GC'ed, the associated schema change GC job will delete the table's span config. In this case, we will not find any overlapping span configs for the table's span, but a system span config, such as a cluster wide protection policy, may still be applicable to the replica with the empty table span. A scan of that span AOST the timestamp at which we wrote the protection policy could result in a BatchTimestampBeforeGCError. This change fixes the above bug by applying a fallback config to the span with no overlapping span configs, and combining the system span configs that apply to the span. The change adds a red-green test to exercise this logic. Informs: cockroachdb#113867 Release note (bug fix): an empty range corresponding to a drop table did not respect system level span configurations such as protected timestamps, potentially causing reads above the protected timestamp to fail
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-kv
Anything in KV that doesn't belong in a more specific category.
branch-release-23.1
Used to mark GA and release blockers and technical advisories for 23.1
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
O-testcluster
Issues found or occurred on a test cluster, i.e. a long-running internal cluster
P-3
Issues/test failures with no fix SLA
In our test cluster we observed two ranges corresponding to dropped tables not respecting protection policies enforced by SystemSpanConfigs. The replicas of these empty ranges were still evaluating batch requests and causing them to fail with
‹ERROR: batch timestamp 1698960957.937065386,0 must be after replica GC threshold 1698967771.741796894,0 (SQLSTATE XXUUU)›
. The running theory is that with the following sequence of operations:We end up in a situation where the
spanconfigstore
does not find any overlapping span configs for the empty range and so does not run the logic to check if any system span configs apply to that range - https://github.com/cockroachdb/cockroach/blob/master/pkg/spanconfig/spanconfigstore/store.go#L199. In this way it misses any protection policies that should hold up the GCThreshold and allows GC to move past the protected timestamp. We think it makes sense to apply a default zone config with the system span configs combined into it to such ranges. We are still attempting to reproduce this locally.Jira issue: CRDB-33239
The text was updated successfully, but these errors were encountered: