Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

View building crashes on large partitions with range tombstones. #14503

Closed
michoecho opened this issue Jul 4, 2023 · 9 comments · Fixed by #14502
Closed

View building crashes on large partitions with range tombstones. #14503

michoecho opened this issue Jul 4, 2023 · 9 comments · Fixed by #14502

Comments

@michoecho
Copy link
Contributor

View update routines accept mutation objects.
But what comes out of staging sstable readers is a stream of mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to convert the fragment stream into mutations. This is done by piping the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might have to be split into multiple partial mutation objects. view_update_consumer does that, but in improper way -- when the split/flush happens inside an active range tombstone, the range tombstone isn't closed properly. This is illegal, and triggers an internal error.

michoecho added a commit to michoecho/scylla that referenced this issue Jul 4, 2023
View update routines accept `mutation` objects.
But what comes out of staging sstable readers is a stream of
mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to
convert the fragment stream into `mutation`s. This is done by piping
the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might
have to be split into multiple partial `mutation` objects.
view_update_consumer does that, but in improper way -- when the
split/flush happens inside an active range tombstone, the range
tombstone isn't closed properly. This is illegal, and triggers an
internal error.

This patch fixes the problem by closing the active range tombstone
(and reopening in the same position in the next `mutation` object).

The tombstone is closed just after the last seen clustered position.
This is not necessary for correctness -- for example we could delay
all processing of the range tombstone until we see its end
bound -- but it seems like the most natural semantic.

Fixes scylladb#14503
@michoecho
Copy link
Contributor Author

michoecho commented Jul 4, 2023

Refs scylladb/scylla-enterprise#3072

@michoecho michoecho added backport/5.2 Issues that should be backported to 5.2 branch once they'll be fixed Requires-Backport-to-5.1 labels Jul 4, 2023
@michoecho michoecho self-assigned this Jul 4, 2023
michoecho added a commit to michoecho/scylla that referenced this issue Jul 4, 2023
View update routines accept `mutation` objects.
But what comes out of staging sstable readers is a stream of
mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to
convert the fragment stream into `mutation`s. This is done by piping
the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might
have to be split into multiple partial `mutation` objects.
view_update_consumer does that, but in improper way -- when the
split/flush happens inside an active range tombstone, the range
tombstone isn't closed properly. This is illegal, and triggers an
internal error.

This patch fixes the problem by closing the active range tombstone
(and reopening in the same position in the next `mutation` object).

The tombstone is closed just after the last seen clustered position.
This is not necessary for correctness -- for example we could delay
all processing of the range tombstone until we see its end
bound -- but it seems like the most natural semantic.

Fixes scylladb#14503
michoecho added a commit to michoecho/scylla that referenced this issue Jul 5, 2023
…andom_mutations

A random mutation test for view_updating_consumer's buffering logic.
Reproduces scylladb#14503.
michoecho added a commit to michoecho/scylla that referenced this issue Jul 5, 2023
…ator_buffering

This patch adds a full-range tombstone to the compacted mutation.
This raises the coverage of the test. In particular, it reproduces
issue scylladb#14503, which should have been caught by this test, but wasn't.
tgrabiec added a commit that referenced this issue Jul 5, 2023
…_consumer' from Michał Chojnowski

View update routines accept `mutation` objects.
But what comes out of staging sstable readers is a stream of mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to convert the fragment stream into `mutation`s. This is done by piping the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might have to be split into multiple partial `mutation` objects. view_update_consumer does that, but in improper way -- when the split/flush happens inside an active range tombstone, the range tombstone isn't closed properly. This is illegal, and triggers an internal error.

This patch fixes the problem by closing the active range tombstone (and reopening in the same position in the next `mutation` object).

The tombstone is closed just after the last seen clustered position. This is not necessary for correctness -- for example we could delay all processing of the range tombstone until we see its end bound -- but it seems like the most natural semantic.

Fixes #14503

Closes #14502

* github.com:scylladb/scylladb:
  test: view_build_test: add range tombstones to test_view_update_generator_buffering
  test: view_build_test: add test_view_udate_generator_buffering_with_random_mutations
  view_updating_consumer: make buffer limit a variable
  view: fix range tombstone handling on flushes in view_updating_consumer
patjed41 pushed a commit to patjed41/scylladb that referenced this issue Jul 7, 2023
View update routines accept `mutation` objects.
But what comes out of staging sstable readers is a stream of
mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to
convert the fragment stream into `mutation`s. This is done by piping
the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might
have to be split into multiple partial `mutation` objects.
view_update_consumer does that, but in improper way -- when the
split/flush happens inside an active range tombstone, the range
tombstone isn't closed properly. This is illegal, and triggers an
internal error.

This patch fixes the problem by closing the active range tombstone
(and reopening in the same position in the next `mutation` object).

The tombstone is closed just after the last seen clustered position.
This is not necessary for correctness -- for example we could delay
all processing of the range tombstone until we see its end
bound -- but it seems like the most natural semantic.

Fixes scylladb#14503
patjed41 pushed a commit to patjed41/scylladb that referenced this issue Jul 7, 2023
…andom_mutations

A random mutation test for view_updating_consumer's buffering logic.
Reproduces scylladb#14503.
patjed41 pushed a commit to patjed41/scylladb that referenced this issue Jul 7, 2023
…ator_buffering

This patch adds a full-range tombstone to the compacted mutation.
This raises the coverage of the test. In particular, it reproduces
issue scylladb#14503, which should have been caught by this test, but wasn't.
@michoecho
Copy link
Contributor Author

@scylladb/scylla-maint Pinging for backports.

@nyh
Copy link
Contributor

nyh commented Jul 11, 2023

@scylladb/scylla-maint Pinging for backports.

I'll do it now.

nyh pushed a commit that referenced this issue Jul 11, 2023
…_consumer' from Michał Chojnowski

View update routines accept `mutation` objects.
But what comes out of staging sstable readers is a stream of mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to convert the fragment stream into `mutation`s. This is done by piping the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might have to be split into multiple partial `mutation` objects. view_update_consumer does that, but in improper way -- when the split/flush happens inside an active range tombstone, the range tombstone isn't closed properly. This is illegal, and triggers an internal error.

This patch fixes the problem by closing the active range tombstone (and reopening in the same position in the next `mutation` object).

The tombstone is closed just after the last seen clustered position. This is not necessary for correctness -- for example we could delay all processing of the range tombstone until we see its end bound -- but it seems like the most natural semantic.

Fixes #14503

Closes #14502

* github.com:scylladb/scylladb:
  test: view_build_test: add range tombstones to test_view_update_generator_buffering
  test: view_build_test: add test_view_udate_generator_buffering_with_random_mutations
  view_updating_consumer: make buffer limit a variable
  view: fix range tombstone handling on flushes in view_updating_consumer

(cherry picked from commit c25201c)
nyh pushed a commit that referenced this issue Jul 11, 2023
…_consumer' from Michał Chojnowski

View update routines accept `mutation` objects.
But what comes out of staging sstable readers is a stream of mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to convert the fragment stream into `mutation`s. This is done by piping the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might have to be split into multiple partial `mutation` objects. view_update_consumer does that, but in improper way -- when the split/flush happens inside an active range tombstone, the range tombstone isn't closed properly. This is illegal, and triggers an internal error.

This patch fixes the problem by closing the active range tombstone (and reopening in the same position in the next `mutation` object).

The tombstone is closed just after the last seen clustered position. This is not necessary for correctness -- for example we could delay all processing of the range tombstone until we see its end bound -- but it seems like the most natural semantic.

Fixes #14503

Closes #14502

* github.com:scylladb/scylladb:
  test: view_build_test: add range tombstones to test_view_update_generator_buffering
  test: view_build_test: add test_view_udate_generator_buffering_with_random_mutations
  view_updating_consumer: make buffer limit a variable
  view: fix range tombstone handling on flushes in view_updating_consumer

(cherry picked from commit c25201c)
@nyh
Copy link
Contributor

nyh commented Jul 11, 2023

Backported to next-5.3 (b0d3dc5).

The backport to next-5.2 and next-5.1 wasn't clean. The conflict was just header files in test/boost/view_build_test.cc so I thought and fixed it but it turns out it uses header files (test/lib/key_utils.hh) which didn't yet exist in 5.1 or 5.2, so it won't build... (I briefly added a backport commit, and then dequeued it, sorry about the mess). So @michoecho please look into the tests and what header files you need for it (and how to avoid needing them) and prepare separate backport PRs (probably the same PR could work for both 5.1 and 5.2).

@michoecho
Copy link
Contributor Author

Backported to next-5.3 (b0d3dc5).

The backport to next-5.2 and next-5.1 wasn't clean. The conflict was just header files in test/boost/view_build_test.cc so I thought and fixed it but it turns out it uses header files (test/lib/key_utils.hh) which didn't yet exist in 5.1 or 5.2, so it won't build... (I briefly added a backport commit, and then dequeued it, sorry about the mess). So @michoecho please look into the tests and what header files you need for it (and how to avoid needing them) and prepare separate backport PRs (probably the same PR could work for both 5.1 and 5.2).

I see, thanks. I will prepare the PRs.

michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
View update routines accept `mutation` objects.
But what comes out of staging sstable readers is a stream of
mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to
convert the fragment stream into `mutation`s. This is done by piping
the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might
have to be split into multiple partial `mutation` objects.
view_update_consumer does that, but in improper way -- when the
split/flush happens inside an active range tombstone, the range
tombstone isn't closed properly. This is illegal, and triggers an
internal error.

This patch fixes the problem by closing the active range tombstone
(and reopening in the same position in the next `mutation` object).

The tombstone is closed just after the last seen clustered position.
This is not necessary for correctness -- for example we could delay
all processing of the range tombstone until we see its end
bound -- but it seems like the most natural semantic.

Fixes scylladb#14503
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
…andom_mutations

A random mutation test for view_updating_consumer's buffering logic.
Reproduces scylladb#14503.
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
…ator_buffering

This patch adds a full-range tombstone to the compacted mutation.
This raises the coverage of the test. In particular, it reproduces
issue scylladb#14503, which should have been caught by this test, but wasn't.
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
View update routines accept `mutation` objects.
But what comes out of staging sstable readers is a stream of
mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to
convert the fragment stream into `mutation`s. This is done by piping
the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might
have to be split into multiple partial `mutation` objects.
view_update_consumer does that, but in improper way -- when the
split/flush happens inside an active range tombstone, the range
tombstone isn't closed properly. This is illegal, and triggers an
internal error.

This patch fixes the problem by closing the active range tombstone
(and reopening in the same position in the next `mutation` object).

The tombstone is closed just after the last seen clustered position.
This is not necessary for correctness -- for example we could delay
all processing of the range tombstone until we see its end
bound -- but it seems like the most natural semantic.

Fixes scylladb#14503
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
…andom_mutations

A random mutation test for view_updating_consumer's buffering logic.
Reproduces scylladb#14503.
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
…ator_buffering

This patch adds a full-range tombstone to the compacted mutation.
This raises the coverage of the test. In particular, it reproduces
issue scylladb#14503, which should have been caught by this test, but wasn't.
@denesb
Copy link
Contributor

denesb commented Jul 11, 2023

5.3 backport was dequeued as it breaks the build. @michoecho please prepare a backport PR for 5.3.

@denesb denesb removed the backport/5.2 Issues that should be backported to 5.2 branch once they'll be fixed label Jul 11, 2023
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
View update routines accept `mutation` objects.
But what comes out of staging sstable readers is a stream of
mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to
convert the fragment stream into `mutation`s. This is done by piping
the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might
have to be split into multiple partial `mutation` objects.
view_update_consumer does that, but in improper way -- when the
split/flush happens inside an active range tombstone, the range
tombstone isn't closed properly. This is illegal, and triggers an
internal error.

This patch fixes the problem by closing the active range tombstone
(and reopening in the same position in the next `mutation` object).

The tombstone is closed just after the last seen clustered position.
This is not necessary for correctness -- for example we could delay
all processing of the range tombstone until we see its end
bound -- but it seems like the most natural semantic.

Fixes scylladb#14503
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
…andom_mutations

A random mutation test for view_updating_consumer's buffering logic.
Reproduces scylladb#14503.
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
…ator_buffering

This patch adds a full-range tombstone to the compacted mutation.
This raises the coverage of the test. In particular, it reproduces
issue scylladb#14503, which should have been caught by this test, but wasn't.
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
View update routines accept `mutation` objects.
But what comes out of staging sstable readers is a stream of
mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to
convert the fragment stream into `mutation`s. This is done by piping
the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might
have to be split into multiple partial `mutation` objects.
view_update_consumer does that, but in improper way -- when the
split/flush happens inside an active range tombstone, the range
tombstone isn't closed properly. This is illegal, and triggers an
internal error.

This patch fixes the problem by closing the active range tombstone
(and reopening in the same position in the next `mutation` object).

The tombstone is closed just after the last seen clustered position.
This is not necessary for correctness -- for example we could delay
all processing of the range tombstone until we see its end
bound -- but it seems like the most natural semantic.

Fixes scylladb#14503
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
…andom_mutations

A random mutation test for view_updating_consumer's buffering logic.
Reproduces scylladb#14503.
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
…ator_buffering

This patch adds a full-range tombstone to the compacted mutation.
This raises the coverage of the test. In particular, it reproduces
issue scylladb#14503, which should have been caught by this test, but wasn't.
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
View update routines accept `mutation` objects.
But what comes out of staging sstable readers is a stream of
mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to
convert the fragment stream into `mutation`s. This is done by piping
the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might
have to be split into multiple partial `mutation` objects.
view_update_consumer does that, but in improper way -- when the
split/flush happens inside an active range tombstone, the range
tombstone isn't closed properly. This is illegal, and triggers an
internal error.

This patch fixes the problem by closing the active range tombstone
(and reopening in the same position in the next `mutation` object).

The tombstone is closed just after the last seen clustered position.
This is not necessary for correctness -- for example we could delay
all processing of the range tombstone until we see its end
bound -- but it seems like the most natural semantic.

Fixes scylladb#14503
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
…andom_mutations

A random mutation test for view_updating_consumer's buffering logic.
Reproduces scylladb#14503.
michoecho added a commit to michoecho/scylla that referenced this issue Jul 11, 2023
…ator_buffering

This patch adds a full-range tombstone to the compacted mutation.
This raises the coverage of the test. In particular, it reproduces
issue scylladb#14503, which should have been caught by this test, but wasn't.
@DoronArazii DoronArazii added the backport/5.2 Issues that should be backported to 5.2 branch once they'll be fixed label Jul 11, 2023
denesb added a commit that referenced this issue Jul 11, 2023
…n view_updating_consumer' from Michał Chojnowski

View update routines accept mutation objects.
But what comes out of staging sstable readers is a stream of mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to convert the fragment stream into mutations. This is done by piping the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might have to be split into multiple partial mutation objects. view_update_consumer does that, but in improper way -- when the split/flush happens inside an active range tombstone, the range tombstone isn't closed properly. This is illegal, and triggers an internal error.

This patch fixes the problem by closing the active range tombstone (and reopening in the same position in the next mutation object).

The tombstone is closed just after the last seen clustered position. This is not necessary for correctness -- for example we could delay all processing of the range tombstone until we see its end bound -- but it seems like the most natural semantic.

Backported from c25201c. `view_build_test.cc` needed some tiny adjustments for the backport.

Closes #14619
Fixes #14503

* github.com:scylladb/scylladb:
  test: view_build_test: add range tombstones to test_view_update_generator_buffering
  test: view_build_test: add test_view_udate_generator_buffering_with_random_mutations
  view_updating_consumer: make buffer limit a variable
  view: fix range tombstone handling on flushes in view_updating_consumer
denesb added a commit that referenced this issue Jul 11, 2023
…n view_updating_consumer' from Michał Chojnowski

View update routines accept mutation objects.
But what comes out of staging sstable readers is a stream of mutation_fragment_v2 objects.
To build view updates after a repair/streaming, we have to convert the fragment stream into mutations. This is done by piping the stream to mutation_rebuilder_v2.

To keep memory usage limited, the stream for a single partition might have to be split into multiple partial mutation objects. view_update_consumer does that, but in improper way -- when the split/flush happens inside an active range tombstone, the range tombstone isn't closed properly. This is illegal, and triggers an internal error.

This patch fixes the problem by closing the active range tombstone (and reopening in the same position in the next mutation object).

The tombstone is closed just after the last seen clustered position. This is not necessary for correctness -- for example we could delay all processing of the range tombstone until we see its end bound -- but it seems like the most natural semantic.

Backported from c25201c. view_build_test.cc needed some tiny adjustments for the backport.

Closes #14622
Fixes #14503

* github.com:scylladb/scylladb:
  test: view_build_test: add range tombstones to test_view_update_generator_buffering
  test: view_build_test: add test_view_udate_generator_buffering_with_random_mutations
  view_updating_consumer: make buffer limit a variable
  view: fix range tombstone handling on flushes in view_updating_consumer
@nyh
Copy link
Contributor

nyh commented Jul 11, 2023

Done. I prepared backport PRs for all branches:

Thanks, all of these are now merged to the respective "next" branches, let's see that they promote and then we can remove all the backport-candidate tag from this issue.

@denesb denesb removed Backport candidate backport/5.2 Issues that should be backported to 5.2 branch once they'll be fixed Requires-Backport-to-5.1 labels Jul 12, 2023
@denesb
Copy link
Contributor

denesb commented Jul 12, 2023

Promotions were successful, removed the backport labels.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants