Implement GetAllHistoryTreeBranches for SQL persistence backends #3438
This just adds the functionality to fetch history tree branches, but it doesn't turn it on.
remove test_0928180931_yng
@dnr, I did some analysis of the different strategies we can use, and I'm now confident this is the best one for us. As the script below demonstrates, it's important that we avoid offset-based pagination and use cursor-based pagination instead.

SET @num_rows := 10000;
DROP PROCEDURE IF EXISTS generate_fake_data;
CREATE PROCEDURE generate_fake_data()
BEGIN
DECLARE i INT DEFAULT 0;
# noinspection SqlWithoutWhere
DELETE FROM history_tree;
test_loop :
LOOP
IF (i = @num_rows) THEN
LEAVE test_loop;
END IF;
INSERT INTO history_tree
VALUES (RAND() * 1000, RANDOM_BYTES(16), RANDOM_BYTES(16), '', '');
SET i = i + 1;
END LOOP;
END;
CALL generate_fake_data();
SELECT COUNT(*)
FROM history_tree;
-- This is the offset-based approach. You can see that we are using the index, but we still have to
-- iterate over all rows before offset, so this becomes really inefficient towards the end.
EXPLAIN ANALYZE
SELECT shard_id, tree_id, branch_id, data, data_encoding
FROM history_tree
ORDER BY shard_id, tree_id, branch_id
-- OFFSET can't be a variable unfortunately--make sure it's something like half of num_rows
LIMIT 100 OFFSET 5000;
# -> Limit/Offset: 100/5000 row(s) (cost=262.01 rows=100) (actual time=1.641..1.671 rows=100 loops=1)
# -> Index scan on history_tree using PRIMARY (cost=262.01 rows=5100) (actual time=0.020..1.530 rows=5100 loops=1)
-- This is the cursor-based approach. You can see that we are using the index and never scanning
-- too many rows.
EXPLAIN ANALYZE
SELECT shard_id, tree_id, branch_id, data, data_encoding
FROM history_tree
WHERE (shard_id, tree_id, branch_id) > (RAND(), RANDOM_BYTES(16), RANDOM_BYTES(16))
ORDER BY shard_id, tree_id, branch_id
LIMIT 100;
# -> Limit: 100 row(s) (cost=0.14 rows=100) (actual time=0.026..0.063 rows=100 loops=1)
# -> Filter: ((history_tree.shard_id,history_tree.tree_id,history_tree.branch_id) > (rand(),random_bytes(16),random_bytes(16))) (cost=0.14 rows=100) (actual time=0.024..0.056 rows=100 loops=1)
# -> Index scan on history_tree using PRIMARY (cost=0.14 rows=100) (actual time=0.019..0.047 rows=103 loops=1)
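The cursor-based loop that a caller of GetAllHistoryTreeBranches would run can be sketched as follows. This is only an illustration using Python's built-in sqlite3 (which supports the same row-value comparison syntax in SQLite 3.15+), not the actual Go persistence code; the table and column names mirror the script above, and the page size is arbitrary:

```python
import os
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE history_tree (
        shard_id INTEGER,
        tree_id BLOB,
        branch_id BLOB,
        data BLOB,
        data_encoding TEXT,
        PRIMARY KEY (shard_id, tree_id, branch_id)
    )
""")
rows = [(random.randrange(1000), os.urandom(16), os.urandom(16), b"", "")
        for _ in range(500)]
conn.executemany("INSERT OR IGNORE INTO history_tree VALUES (?, ?, ?, ?, ?)", rows)

def fetch_all_branches(conn, page_size=100):
    # Keyset (cursor-based) pagination: each page resumes strictly after the
    # last-seen primary key instead of skipping rows with OFFSET.
    cursor = (-1, b"", b"")  # sorts before every real key
    while True:
        page = conn.execute(
            """SELECT shard_id, tree_id, branch_id, data, data_encoding
               FROM history_tree
               WHERE (shard_id, tree_id, branch_id) > (?, ?, ?)
               ORDER BY shard_id, tree_id, branch_id
               LIMIT ?""",
            (*cursor, page_size),
        ).fetchall()
        if not page:
            return
        yield from page
        cursor = page[-1][:3]  # next page starts after this key

all_rows = list(fetch_all_branches(conn))
```

Because the WHERE clause and the ORDER BY use the same column order as the primary key, every page is a short index range scan, which is exactly what the EXPLAIN ANALYZE output above shows.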
-- Note that the column order is important with the cursor-based approach: it must match the
-- primary-key index. As you can see here, if we swap tree_id and branch_id, MySQL falls back
-- to a full table scan.
EXPLAIN ANALYZE
SELECT shard_id, branch_id, tree_id, data, data_encoding
FROM history_tree
WHERE (shard_id, branch_id, tree_id) > (RAND(), RANDOM_BYTES(16), RANDOM_BYTES(16))
ORDER BY shard_id, branch_id, tree_id
LIMIT 100;
# -> Limit: 100 row(s) (cost=1003.75 rows=100) (actual time=4.410..4.486 rows=100 loops=1)
# -> Sort row IDs: history_tree.shard_id, history_tree.branch_id, history_tree.tree_id, limit input to 100 row(s) per chunk (cost=1003.75 rows=10000) (actual time=4.409..4.481 rows=100 loops=1)
# -> Filter: ((history_tree.shard_id,history_tree.branch_id,history_tree.tree_id) > (rand(),random_bytes(16),random_bytes(16))) (cost=1003.75 rows=10000) (actual time=0.024..3.333 rows=9997 loops=1)
#         -> Table scan on history_tree (cost=1003.75 rows=10000) (actual time=0.019..2.566 rows=10000 loops=1)
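As an aside, for engines or driver versions where a row-value comparison like `(a, b, c) > (?, ?, ?)` isn't available, the same keyset predicate can be expanded into an equivalent OR-of-ANDs expression. The helper below is hypothetical and not part of this PR; it just shows the expansion:

```python
def keyset_predicate(cols):
    """Expand (c1, c2, c3) > (?, ?, ?) into the equivalent
    (c1 > ?) OR (c1 = ? AND c2 > ?) OR (c1 = ? AND c2 = ? AND c3 > ?)."""
    clauses = []
    for i in range(len(cols)):
        parts = [f"{c} = ?" for c in cols[:i]] + [f"{cols[i]} > ?"]
        clauses.append("(" + " AND ".join(parts) + ")")
    return " OR ".join(clauses)

predicate = keyset_predicate(["shard_id", "tree_id", "branch_id"])
```

Note that the cursor values must then be bound repeatedly (1 + 2 + 3 = 6 parameters for a three-column key), and some planners optimize the expanded form less well than the native row-value comparison.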
In this PR, I've added an implementation of the GetAllHistoryTreeBranches function for all of our SQL backends.
I made these changes for #3419. This change will allow the history scavenger to prune branches in clusters that use SQL backends for history storage.
There are existing test suites that verify we can fetch all history branches after creating some; however, those tests were disabled for non-Cassandra databases. I removed that constraint and watched the tests fail (GetAllHistoryTreeBranches simply panicked before this change), then applied my fix and watched them pass.
In the worst case, we somehow return the wrong data from GetAllHistoryTreeBranches, causing the scavenger to delete branches that it should not.
This change should not require a notification to the broader community, but there are some reactions on the description of the original issue, so we should notify them that this is now implemented.