
Fix several bugs in quicklist: #12568

Closed

wants to merge 4 commits into from

Conversation

imchuncai (Author):

- As discussed in Minor comment bug #12548: Certain calls to quicklistReplaceEntry(), quicklistInsertBefore() and quicklistInsertAfter() will cause a packed node to violate the size limit.
- As discussed in [BUG] quicklist compress bug #12563: A node will not be compressed if it does not compress small enough, so the node's member recompress will stay 0 after calling quicklistDecompressNodeForUse(). If that node's entry is changed later, calling quicklistRecompressOnly() will not make the node compressed. In this situation we should call quicklistCompress() instead; I take this approach in this commit, though it's obviously not efficient. We should redesign 'recompress' to fundamentally solve the problem (a toy model of this pitfall follows below).
- struct quicklistNode's member dont_compress is removed.
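For reference, a self-contained toy model of the recompress pitfall described in the second bullet. The struct and driver code here are illustrative stand-ins, not Redis's actual structures; only the Redis function names in the comments are real:

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct {
        bool compressed;   /* node currently holds compressed data */
        bool recompress;   /* "was temporarily decompressed, compress me again" */
    } toynode;

    /* Mirrors quicklistDecompressNodeForUse(): sets recompress only when the
     * node was actually compressed. */
    static void decompress_for_use(toynode *n) {
        if (n->compressed) { n->compressed = false; n->recompress = true; }
    }

    /* Mirrors quicklistRecompressOnly(): trusts the recompress flag blindly. */
    static void recompress_only(toynode *n) {
        if (n->recompress) { n->compressed = true; n->recompress = false; }
    }

    int main(void) {
        /* Node stayed raw because its data did not compress small enough. */
        toynode n = { .compressed = false, .recompress = false };
        decompress_for_use(&n);  /* no-op: recompress stays 0 */
        /* ... the entry is modified; the new payload might now compress ... */
        recompress_only(&n);     /* does nothing: recompress is still 0 */
        printf("compressed=%d\n", n.compressed);  /* 0: node stays raw forever */
        return 0;
    }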
Comment on lines +747 to +748
node->entry = lpReplace(node->entry, &entry->zi, data, sz);
quicklistNodeUpdateSz(node);
sundb (Collaborator):

We should avoid these changes for a better review.

imchuncai (Author):

It's related to #12548.

sundb (Collaborator):

I mean that we can revert unrelated changes, like these two lines that don't need to be changed.

        entry->node->entry = lpReplace(entry->node->entry, &entry->zi, data, sz);
        quicklistNodeUpdateSz(entry->node);

imchuncai (Author):

No, I don't think so. Don't you think 'entry->node->entry' is kind of weird? And 'entry->node' is used everywhere in quicklistReplaceEntry(); taking it out improves the readability of the code.

sundb (Collaborator):

I agree with your idea, but for the sake of a better review we should avoid so many changes; maybe you can make these changes at the end instead of now!

sundb (Collaborator), Sep 14, 2023:

Please also handle this and run ./runtest once.
I see that the runtest also fails.

imchuncai (Author):

The failed runtest is fixed.

_quicklistListpackMerge(quicklist, target, target->next);
}
unsigned int newCount = a->count + b->count;
lpMerge(&a->entry, &b->entry);
sundb (Collaborator):

Can you describe why this change was made?

imchuncai (Author):

/* else, the merge returned NULL and nothing changed. */
This comment is wrong: something did change, namely a and b were decompressed.
So this else is not unnecessary; otherwise we should recompress a and b.

imchuncai (Author):

Function _quicklistMergeNodes() is deleted; it is no longer used now.

src/quicklist.c Outdated
@@ -1171,7 +1147,7 @@ int quicklistDelRange(quicklist *quicklist, const long start,
quicklist->count -= del;
quicklistDeleteIfEmpty(quicklist, node);
if (node)
quicklistRecompressOnly(node);
quicklistCompress(quicklist, node);
sundb (Collaborator), Sep 11, 2023:

Why are you using quicklistCompress here and in other places? Because node->recompress is 0 and the node can't be recompressed?
This change may hide other bugs where a node is not compressed.
Btw: quicklistCompress is more expensive than quicklistRecompressOnly.

imchuncai (Author):

This is mentioned in the commit log. This change can fix #12563; for better efficiency, quicklistNode->recompress should be redesigned.

sundb (Collaborator):

Do you mean that quicklistDelRange can also cause uncompressed nodes?
Could you provide a smoke test or the like?

imchuncai (Author):

No, this means I can't say that quicklistDelRange() won't cause uncompressed nodes. If anyone can prove that, they should leave a comment here and keep using quicklistRecompressOnly().

sundb (Collaborator):

Note that the uncompressed node is only created because we reset iter->node early, before quicklistReleaseIterator(), not because of the decompression caused by quicklistRecompressOnly.
I'd prefer to fix the problem of creating uncompressed nodes, rather than using quicklistRecompressOnly for all potentially uncompressed nodes.

imchuncai (Author):

In the first scenario, I don't know whether it's possible for an uncompressed node to become compressible after part of the data is deleted.

sundb (Collaborator), Sep 12, 2023:

If so, we should find out why this node wasn't compressed correctly.
At any time we should be able to assume that a node is compressed correctly.

imchuncai (Author):

This node is compressed correctly, but it does not compress small enough, so it remains uncompressed. What I don't know is whether it's possible to compress it small enough after deleting part of the data.

sundb (Collaborator):

Compression algorithms are usually unlikely to do this: when a piece of data has a low compression ratio, it means that there are too few duplicate blocks, and when some of the blocks are removed, its compression ratio should become even lower.
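For reference, a simplified sketch of the check that makes a node "not compress small enough", modeled on quicklist.c's __quicklistCompressNode and liblzf; buffer handling and error paths are simplified here:

    #include <stdlib.h>
    #include "lzf.h"  /* liblzf, bundled with Redis */

    #define MIN_COMPRESS_IMPROVE 8  /* minimum bytes saved, as in quicklist.c */

    /* Returns 1 and hands back the compressed buffer only when compression
     * saves at least MIN_COMPRESS_IMPROVE bytes; otherwise the node keeps
     * its raw listpack and is left uncompressed (recompress stays 0). */
    static int try_compress(const void *data, unsigned int sz,
                            void **out, unsigned int *out_sz) {
        void *buf = malloc(sz);
        unsigned int clen = lzf_compress(data, sz, buf, sz);
        if (clen == 0 || clen + MIN_COMPRESS_IMPROVE >= sz) {
            free(buf);  /* not small enough: stays raw */
            return 0;
        }
        *out = buf;
        *out_sz = clen;
        return 1;
    }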

imchuncai (Author):

OK, then I will roll back this change.

quicklistNodeUpdateSz(new_node);
__quicklistInsertNode(quicklist, node, new_node, after);
_quicklistMergeNodes(quicklist, node);
_quicklistInsertIntoFullNode(quicklist, entry, value, sz, after);
sundb (Collaborator):

Can you explain why this method was added?
It seems that it doesn't relate to #12563.

imchuncai (Author):

It's related to #12548.

imchuncai (Author), Sep 11, 2023:

Appending value to new_node without a check can make the listpack too big.

sundb commented Sep 11, 2023:

@imchuncai Thanks. Can you also add some unit tests at the bottom of quicklist.c?

sundb commented Sep 12, 2023:

Please also use ./runtest or ./runtest --single unit/type/list --large-memory.
This PR fails on my local PC.

[err]: Test LSET with packed / plain combinations in tests/unit/type/list.tcl
Expected 'bb' to be equal to 'ddddddddddddddddddddddddddddddd...'

- fix test bug
- add unit tests for listpack limit test
- roll back changes for function quicklistDelRange()
imchuncai (Author):
@imchuncai Thanks. Can you also add some unit tests at the bottom of quicklist.c?

Added.

imchuncai (Author):
Please also use ./runtest or ./runtest --single unit/type/list --large-memory. This PR fails on my local PC.

[err]: Test LSET with packed / plain combinations in tests/unit/type/list.tcl
Expected 'bb' to be equal to 'ddddddddddddddddddddddddddddddd...'

Fixed

src/quicklist.c Outdated
Comment on lines 769 to 770
iter->offset--;
quicklistNext(iter, entry);
sundb (Collaborator):

I don't think we should engage in these dangerous behaviors.
quicklistNext() should be the responsibility of the caller of the iterator, and should not be used internally.
All we can do is update the iterator to the correct position when it changes or reset it to prevent it from being used again.
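For context, "reset it to prevent it from being used again" refers to invalidating the iterator, along the lines of quicklist.c's resetIterator macro, sketched here in simplified form:

    /* Invalidate the iterator so the next quicklistNext() call stops
     * immediately, instead of walking a list whose shape just changed. */
    #define resetIterator(iter) do { \
        (iter)->current = NULL;      \
        (iter)->zi = NULL;           \
    } while (0)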

imchuncai (Author):

It's removed; it caused a bug anyway.

@@ -1017,14 +1000,14 @@ REDIS_STATIC void _quicklistInsert(quicklistIter *iter, quicklistEntry *entry,
node->entry = lpInsertString(node->entry, value, sz, entry->zi, LP_AFTER, NULL);
node->count++;
quicklistNodeUpdateSz(node);
quicklistRecompressOnly(node);
quicklistCompress(quicklist, node);
sundb (Collaborator):

Similar changes in this file should be handled as we discussed in #12568 (comment).

imchuncai (Author):

Similar but not the same; I've added a new unit test: TEST("small listpack compress"). Try this.

- fix test bug
- add new unit test for small listpack compress
- fix another issue in function _quicklistInsert() which did not update
  node's sz
sundb commented Sep 15, 2023:

@imchuncai I see that you changed the positions of quicklistReplaceEntry() and quicklistReplaceAtIndex(), which will cause a lot of changes and be difficult to review.

sundb commented Sep 15, 2023:

@imchuncai IMHO, we can fix #12563 simply by using the following patch:

diff --git a/src/quicklist.c b/src/quicklist.c
index 301a2166..8bd15d60 100644
--- a/src/quicklist.c
+++ b/src/quicklist.c
@@ -1071,12 +1071,14 @@ REDIS_STATIC void _quicklistInsert(quicklistIter *iter, quicklistEntry *entry,
         quicklistNodeUpdateSz(new_node);
         __quicklistInsertNode(quicklist, node, new_node, after);
         _quicklistMergeNodes(quicklist, node);
+        node = NULL;
     }
 
     quicklist->count++;
 
     /* In any case, we reset iterator to forbid use of iterator after insert.
      * Notice: iter->current has been compressed in _quicklistInsert(). */
+    if (node) quicklistCompress(quicklist, node);
     resetIterator(iter); 
 }

The reason why this node doesn't compress is that we forgot to compress the iterator node before resetting the iterator.
All we have to do is just do the same things as quicklistReleaseIterator().

imchuncai commented Sep 15, 2023:

@imchuncai IMHO, we can fix #12563 simply by using the following patch: […] The reason why this node doesn't compress is that we forgot to compress the iterator node before resetting the iterator. All we have to do is just do the same things as quicklistReleaseIterator().

The reason is that we changed node->entry, which makes node->recompress no longer reliable. The poor design of node->recompress is the fundamental problem.
The reason why I replaced quicklistRecompressOnly() with quicklistCompress() is that quicklistRecompressOnly() didn't do the job it is supposed to do.

sundb commented Sep 15, 2023:

@imchuncai But we recompress the node by using quicklistCompress(), not quicklistRecompressOnly().
Please refer to the use of quicklistCompress() in the original quicklistReplaceEntry().

imchuncai (Author):

@imchuncai But we recompress the node by using quicklistCompress(), not quicklistRecompressOnly(). Please refer to the use of quicklistCompress() in the original quicklistReplaceEntry().

So I replaced quicklistRecompressOnly() with quicklistCompress(); what's the problem?

sundb commented Sep 15, 2023:

@imchuncai Yeah, you are right, this is exactly how it's handled in quicklistReplaceEntry().

imchuncai (Author):

We should redesign node->recompress: if we only changed node->entry, quicklistRecompressOnly() should always work.

imchuncai (Author):

Replacing quicklistRecompressOnly() with quicklistCompress() is just a quick fix, and not an efficient one.

sundb commented Sep 15, 2023:

@imchuncai Thanks, I finally understand the design problem you mentioned about recompress:
if a node wasn't compressed before changing it, but it can be compressed after changing it,
quicklistRecompressOnly() won't compress it at that point.
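One way to picture the redesign being asked for (purely an illustration, not code from this PR): split the boolean recompress into distinct states, so that "raw because it would not shrink" is distinguishable from "raw because it was decompressed for use":

    typedef enum {
        NODE_RAW_DEPTH,          /* inside the uncompressed head/tail window */
        NODE_RAW_INCOMPRESSIBLE, /* tried, did not shrink enough; retry after edits */
        NODE_RAW_IN_USE,         /* temporarily decompressed; recompress afterwards */
        NODE_COMPRESSED          /* holds LZF-compressed data */
    } nodeCompressState;

With a state like NODE_RAW_INCOMPRESSIBLE, quicklistRecompressOnly() could retry compression after the entry changes instead of trusting a stale 0 flag.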

Comment on lines +958 to 959
quicklistCompress(quicklist, new_node);
quicklistRecompressOnly(node);
sundb (Collaborator):

Suggested change
quicklistCompress(quicklist, new_node);
quicklistRecompressOnly(node);
quicklistCompressNode(new_node);
quicklistCompress(node);

Maybe it's more appropriate.
Since new_node has data added to it, we'll recompress it anyway.

imchuncai (Author):

No, new_node may not be within the compress depth. And node->entry is not changed, so node->recompress is reliable (see the depth-window sketch below).
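A sketch of the distinction behind this point (it assumes Redis's quicklist/quicklistNode types and simplifies __quicklistCompress's depth walk): quicklistCompressNode() compresses unconditionally, while quicklistCompress() first checks whether the node sits inside the uncompressed depth window:

    /* Returns 1 when node lies within `compress` nodes of either end and
     * therefore must stay raw; compression is only allowed when this is 0. */
    static int in_depth_window(const quicklist *ql, const quicklistNode *node) {
        if (ql->len < (unsigned long)ql->compress * 2)
            return 1;  /* list too short: every node stays raw */
        const quicklistNode *fwd = ql->head, *rev = ql->tail;
        for (unsigned int d = 0; d < ql->compress; d++) {
            if (fwd == node || rev == node) return 1;
            fwd = fwd->next;
            rev = rev->prev;
        }
        return 0;
    }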

sundb commented Sep 15, 2023:

@imchuncai I think we can fix both bugs in separate PRs, to avoid too many changes.
Please also handle the changes mentioned in #12568 (comment)

src/quicklist.c Outdated
Comment on lines 1035 to 1045
/* node->count > 1, node will not be removed */
quicklistDelEntry(iter, entry);
if (quicklistNext(iter, entry)) {
_quicklistInsert(iter, entry, data, sz, iter->direction == AL_START_HEAD);
} else {
int direction = iter->direction == AL_START_HEAD
? AL_START_TAIL : AL_START_HEAD;
quicklistIter *rev_iter = quicklistGetIterator(quicklist, direction);
quicklistNext(rev_iter, entry);
_quicklistInsert(rev_iter, entry, data, sz, direction == AL_START_HEAD);
quicklistReleaseIterator(rev_iter);
sundb (Collaborator):

I wouldn't say I like this nested use of iterators.
Can you explain why we have to do it this way?

imchuncai (Author):

We can't insert first now, because entry->node has a chance of being freed due to a merge.

sundb (Collaborator):

In that case I'd prefer an implementation like _quicklistInsert to reimplement the replacement, rather than nested iterators.
But that would introduce a lot of duplicate code and more complexity.

imchuncai (Author):

@imchuncai I see that you changed the positions of quicklistReplaceEntry() and quicklistReplaceAtIndex(), which will cause a lot of changes and be difficult to review.

What's your suggestion? Move them back and add a declaration of _quicklistInsert() above them?

sundb commented Sep 16, 2023:

Move them back to their original position for better review.

imchuncai (Author):

Move them back to their original position for better review.

Done.

sundb commented Sep 18, 2023:

This PR's fix effectively equates packed_threshold to either packed_threshold or SIZE_SAFETY_LIMIT: when an entry is larger than packed_threshold[fill] or SIZE_SAFETY_LIMIT,
it will always be kept in an isolated listpack, similar to a PLAIN node but with a listpack header.

I was thinking that maybe we could fix it in some other way.

  1. Make quicklist break the maximum limit in some cases, i.e. a node's size can be up to optimization_level[4] + packed_threshold, which is what the original code is doing.
  2. Replace the existing packed_threshold with optimization_level or SIZE_SAFETY_LIMIT, i.e. an entry larger than this threshold becomes a plain node.

I'm not sure which way is better.
Please share your thoughts, @imchuncai @oranagra

imchuncai (Author):

This PR's fix effectively equates packed_threshold to either packed_threshold or SIZE_SAFETY_LIMIT. […] I'm not sure which way is better. Please share your thoughts, @imchuncai @oranagra

It's not related to this PR. This PR fixes a packed node violating the size limit; what the limit is, or which node should be treated as a large element, is not its concern.

sundb commented Sep 18, 2023:

@imchuncai The scenarios I propose are only meant to avoid the implementations mentioned in #12568 (comment).
If we use SIZE_SAFETY_LIMIT or optimization_level[fill] instead of packed_threshold, consider the following scenarios (a sketch of the limit logic follows the list):

  1. When an entry is inserted that is smaller than SIZE_SAFETY_LIMIT, we allow the node to break the maximum limit, so its size after replacement could be < SIZE_SAFETY_LIMIT*2.
  2. When the inserted entry is larger than SIZE_SAFETY_LIMIT, we treat it as a large element in the old way, because the entry will always exist alone anyway.
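For reference, a simplified sketch of the limits being discussed, modeled on quicklist.c's optimization_level, SIZE_SAFETY_LIMIT and quicklistNodeExceedsLimit (details simplified, not the exact source):

    #include <stddef.h>

    #define SIZE_SAFETY_LIMIT 8192
    static const size_t optimization_level[] = {4096, 8192, 16384, 32768, 65536};

    /* fill < 0 selects a byte limit from optimization_level; fill >= 0 is a
     * count limit, with SIZE_SAFETY_LIMIT as the byte cap. */
    static int node_exceeds_limit(int fill, size_t new_sz, unsigned int new_count) {
        if (fill < 0)
            return new_sz > optimization_level[-fill - 1];
        return new_count > (unsigned int)fill || new_sz > SIZE_SAFETY_LIMIT;
    }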

This was linked to issues Sep 20, 2023
oranagra (Member):

disclaimer: i'm not certain i understand everything, as i only took a quick look at the code, and only read the top comment and the last 3 comments.

i understand that there's an overlap between the threshold for PLAIN nodes, and the threshold of packed nodes with just one element (in case nodes are limited by count rather than size).

as far as i remember, SIZE_SAFETY_LIMIT is actually not about safety but an optimization, i.e. we don't want memmoves and reallocs when we're dealing with large items.

that said, this mechanism was created before PLAIN nodes existed, and i think that now that we have them, maybe we can reduce the default packed_threshold to 8kb, and there's no reason why we'd ever want to hold listpacks of just one element if we're never gonna consider adding anything to that listpack.

i do think that we should take this opportunity to clean that code and simplify it (not just fix the bug), since this PR looks quite large anyway.
i.e. if it was just a 2-line fix, i'd be ok merging it as a bug fix, and leaving the cleanup for later.

while on that subject, considering the amount of changes, and the fact the bugs aren't critical ones (they don't really affect redis's behavior, right?), i don't think we should backport this fix to any existing release.
can we remove the backport tags?

sundb commented Sep 26, 2023:

while on that subject, considering the amount of changes, and the fact the bugs aren't critical ones (don't really affect redis's behavior, right?), i don't think we should backport this fix to any existing release. can we remove the backport tags?

I don't think we need to backport this, since the bug doesn't do any harm.

imchuncai (Author):

I'm not able to contribute code anymore.

imchuncai closed this Oct 2, 2023
sundb commented Oct 2, 2023:

@imchuncai Still appreciate the effort you put into it.

imchuncai pushed a commit to imchuncai/redis that referenced this pull request Dec 12, 2023
Why:
I tried to solve the issue I found earlier, but found myself stuck in a quagmire
because new issues kept coming up while I fixed the old one, so I finally decided
to rewrite it.

Issues with the old one:
- A node which should be compressed stays raw
  This is due to the poor design of quicklistNode->recompress: the design forgot
  the situation that a node could stay uncompressed if it cannot compress small
  enough. And if we change such a node, we should compress it again.
  See issue redis#12563.
- The iterator doesn't behave like an iterator
  The iterator is reset and not available for further use after a replace or
  insert; see macro resetIterator(). The only operation after which the iterator
  is not reset is quicklistDelEntry(), and it has a comment about it, but the
  comment is wrong: the iterator may not behave like the comment says.
  See issue redis#12614.
- A packed node can violate the size limit
  Certain calls to quicklistReplaceEntry(), quicklistInsertBefore() and
  quicklistInsertAfter() will cause a packed node to violate the size limit.
  See issue redis#12548.
- Merging is only performed on insert
  There is no merging on delete or replace, which can leave the quicklist with
  adjacent small nodes.
  See issue redis#12856.
- The algorithm that maintains compress depth is not efficient
  The algorithm to maintain compress depth after an add or delete checks the
  nodes on both sides of the list; its time complexity is O(n), where n is the
  uncompressed depth on both sides of the list.

All the changes:
- Partition the nodes
  Divide the nodes into three partitions: head, middle and tail. The head and
  tail partitions hold uncompressed nodes, and the middle partition holds
  compressed nodes. Therefore, the time complexity of maintaining compress depth
  after adding or deleting a node drops to O(1), moving at most one node from
  one partition to another (a toy sketch follows below).
- Removed the annoying members recompress, attempted_compress and dont_compress
  from the quicklist node structure
- Merged the structures quicklistIter and quicklistEntry
- The historical parameter packed_threshold has been removed
  This is mentioned by @sundb and @oranagra in pull request redis#12568.
- A merge strategy is added
  That is, no two adjacent nodes in the quicklist can be merged.
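A toy sketch of the head/middle/tail partition idea (illustrative only; the names and structure here are hypothetical, not the rewrite's code). With compress depth d, the first and last d nodes stay raw; inserting at the head moves at most one node across the head/middle boundary, hence O(1):

    #include <stddef.h>

    typedef struct node {
        struct node *prev, *next;
        int compressed;
    } node;

    typedef struct {
        node *head;
        node *head_boundary; /* last raw node of the head partition */
        int head_raw;        /* raw nodes at the head, <= depth */
        int depth;
    } list;

    static void compress_node(node *n) { n->compressed = 1; }

    /* O(1) bookkeeping after linking new_head at the front of the list. */
    static void after_insert_head(list *l, node *new_head) {
        new_head->prev = NULL;
        new_head->next = l->head;
        if (l->head) l->head->prev = new_head;
        l->head = new_head;
        if (l->head_raw < l->depth) {
            l->head_raw++;  /* window not full: nothing to compress */
            if (!l->head_boundary) l->head_boundary = new_head;
        } else {
            /* evict exactly one node from the head partition into the middle */
            compress_node(l->head_boundary);
            l->head_boundary = l->head_boundary->prev;
        }
    }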
imchuncai mentioned this pull request Dec 12, 2023
oranagra pushed a commit that referenced this pull request Feb 22, 2024
Following #12568

In issue #9357, when inserting an element larger than 1GB, we currently
store it in a plain node instead of a listpack.
Presently, when we insert an element that exceeds the maximum size of a
packed node, it cannot be accommodated in any other nodes, thus ending
up isolated like a large element.
I.e. it's a node with only one element, but it's listpack encoded rather
than a plain buffer.

This PR lowers the threshold for considering an element as 'large' from
1GB to the maximum size of a node.
While this change doesn't completely resolve the bug mentioned in the
previous PR, it does mitigate its potential impact.

As a result of this change, we can now only use LSET to replace an
element with another element that falls below the maximum size
threshold.
In the worst-case scenario, with a fill of -5, the largest packed node
we can create is 2GB (32k * 64k):
* 32k: The smallest element in a listpack is 2 bytes, which allows us to
store up to 32k elements.
* 64k: This is the maximum size for a single quicklist node.

## Others
To fully fix #9357, we need more work. As discussed in #12568, when we
insert an element into a quicklistNode, it may be created in a new node,
put into another node, or merged, and we can't correctly delete the node
that was supposed to be deleted.
I'm not sure it's worth it, since it involves a lot of modifications.
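For reference, a sketch of the lowered "large element" threshold this commit describes, modeled on the post-change logic; the signature and details are simplified for illustration:

    #include <stddef.h>

    #define SIZE_SAFETY_LIMIT 8192
    static const size_t optimization_level[] = {4096, 8192, 16384, 32768, 65536};

    /* An entry is 'large' (stored as a plain node) once it cannot fit in any
     * packed node, rather than only past the old 1GB packed_threshold. */
    static int is_large_element(size_t sz, int fill, size_t packed_threshold) {
        if (packed_threshold != 0)         /* explicit threshold still wins */
            return sz >= packed_threshold;
        if (fill >= 0)                     /* count-limited node: byte cap */
            return sz > SIZE_SAFETY_LIMIT;
        return sz > optimization_level[-fill - 1];  /* size-limited node */
    }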
Successfully merging this pull request may close these issues:
Minor comment bug (#12548), [BUG] quicklist compress bug (#12563)