Replace all usage of ziplist with listpack for t_hash #8887

sundb · 2021-04-29T15:26:58Z

Part one of implementing #8702 (taking hashes first before other types)

Description of the feature

Change ziplist encoded hash objects to listpack encoding.
Convert existing ziplists at RDB load time, an O(n) operation.

Rdb format changes

Add RDB_TYPE_HASH_LISTPACK rdb type.
Bump RDB_VERSION to 10

Interface changes

New hash-max-listpack-entries config is an alias for hash-max-ziplist-entries (same with hash-max-listpack-value)
OBJECT ENCODING will return listpack instead of ziplist
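
As an illustration of what "alias" means here, the following toy Python model resolves both spellings to one stored setting. It is only a sketch, not the Redis config code: the `Config` class and `CONFIG_ALIASES` table are hypothetical, which spelling is treated as canonical is an assumption, and the values 512 and 64 are placeholder defaults.

```python
from typing import Dict

# Alias name -> canonical name. Which spelling is canonical is an
# assumption of this sketch; the point is that both reach one value.
CONFIG_ALIASES: Dict[str, str] = {
    "hash-max-listpack-entries": "hash-max-ziplist-entries",
    "hash-max-listpack-value": "hash-max-ziplist-value",
}


class Config:
    """Hypothetical config store that treats alias names as the same setting."""

    def __init__(self) -> None:
        # Placeholder defaults, used only for illustration.
        self._values: Dict[str, int] = {
            "hash-max-ziplist-entries": 512,
            "hash-max-ziplist-value": 64,
        }

    def _canonical(self, name: str) -> str:
        # Names without an alias entry map to themselves.
        return CONFIG_ALIASES.get(name, name)

    def set(self, name: str, value: int) -> None:
        self._values[self._canonical(name)] = value

    def get(self, name: str) -> int:
        return self._values[self._canonical(name)]
```

Setting the value through either name and reading it back through the other returns the same number, which is the behaviour the alias is meant to give existing config files.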

Listpack improvements:

Support direct insertion and replacement of integer elements (rather than converting back and forth from strings)
Add more listpack capabilities to match the ziplist ones (like lpFind, lpRandomPairs and such)
Optimize element length fetching, avoid multiple calculations
Use inline to avoid function call overhead.
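
To illustrate the first item, here is a rough Python model of how a listpack can store an integer directly in binary form instead of as a decimal string. It is a sketch only: `lp_encode_int` is a hypothetical name, the encoding bytes (`0xC0`, `0xF1` to `0xF4`) are assumed from the listpack format and should be checked against listpack.c, and the back-length field that follows each real entry is left out.

```python
def lp_encode_int(v: int) -> bytes:
    """Encode a signed integer as a listpack-style entry (header + payload).

    Simplified model: the trailing back-length bytes are not produced.
    """
    # 7-bit unsigned: the value itself is the single header byte.
    if 0 <= v <= 127:
        return bytes([v])
    # 13-bit signed: 3 tag bits (110) + 5 high bits, then 8 low bits.
    if -4096 <= v <= 4095:
        u = v & 0x1FFF  # 13-bit two's complement
        return bytes([0xC0 | (u >> 8), u & 0xFF])
    # 16/24/32/64-bit signed: one tag byte, then a little-endian payload.
    for tag, nbytes in ((0xF1, 2), (0xF2, 3), (0xF3, 4), (0xF4, 8)):
        bits = 8 * nbytes
        if -(1 << (bits - 1)) <= v < (1 << (bits - 1)):
            payload = (v & ((1 << bits) - 1)).to_bytes(nbytes, "little")
            return bytes([tag]) + payload
    raise OverflowError("value does not fit in a signed 64-bit integer")
```

Under these assumptions `lp_encode_int(300)` takes 2 bytes, while storing the string "300" would need a header byte plus 3 characters, so skipping the string round trip saves both space and conversion work.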

Tests

Add a new test for the RDB load-time conversion
Add listpack unit tests (based on the ones in ziplist.c)
Add a few "corrupt payload: fuzzer findings" tests, and slightly modify existing ones.

oranagra

@sundb thank you for that effort.

i've added a few comments inside the code, but here are some general ones:

i don't like the term "list container". in "ziplist" and "listpack", we mention it is zipped, or packed, i.e. encoded. this container can be more easily mistaken to be something related to list data type rather than an encoding type.
maybe a term like "encoded list" or "packed list" (i.e. something that is generic, and both ziplist and listpack comply to, i.e. both ziplist and listpack are encoded and packed).
currently we only create new hashes with listpack, and when modifying existing hashes we'll always keep them as ziplists.
we need to see if we can come up with a plan to gradually convert them.
i.e. i don't want an O(N) operation at load time or any other time, but eventually i want them all gone.
we have a problem with testing. the majority of the tests work by creating a new db from scratch, so the ziplist code is now mostly unreachable.
maybe we need some DEBUG sub-command that will tell redis to default to creating ziplists, and then run the entire testsuite in that mode (in the daily CI)
i'm specifically worried about the corrupt-dump-fuzzer test. we either need to save an old rdb file in the assets folder, or let it run twice with that DEBUG tweak i suggested.

oranagra · 2021-05-06T17:14:21Z

One more thing, we need an option to convert the encoding at rdb loading time.
either when doing full sanitization, since in that case we already do O(N) operation anyway, but also maybe some people will want to do that conversion at upgrade time (slave will take longer to go online), maybe we'll even make it the default one day.

i think it would also be a good idea to benchmark such a conversion, and get a feeling of how much longer it takes to load an rdb file if we do a conversion during loading vs one that just loads the ziplists as they are. can you try doing such a benchmark?

sundb · 2021-05-07T01:54:28Z

> i don't like the term "list container". in "ziplist" and "listpack", we mention it is zipped, or packed, i.e. encoded. this container can be more easily mistaken to be something related to list data type rather than an encoding type.
> maybe a term like "encoded list" or "packed list" (i.e. something that is generic, and both ziplist and listpack comply to, i.e. both ziplist and listpack are encoded and packed).

I don't like "list container" either; it is just hard to come up with a good name.

> currently we only create new hashes with listpack, and when modifying existing hashes we'll always keep them as ziplists.
> we need to see if we can come up with a plan to gradually convert them.
> i.e. i don't want an O(N) operation at load time or any other time, but eventually i want them all gone.

Maybe we can convert ziplist to listpack when hashTypeInitIterator is called.
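
A minimal sketch of that idea, using a toy Python model rather than the Redis C code. `HashObject`, `convert_to_listpack` and `hash_iter` are hypothetical names (the last one stands in for hashTypeInitIterator), and the conversion is modelled as a plain list copy.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple

ENC_ZIPLIST = "ziplist"
ENC_LISTPACK = "listpack"


@dataclass
class HashObject:
    encoding: str
    # Flat layout: field, value, field, value, ... as in the packed encodings.
    entries: List[bytes]


def convert_to_listpack(obj: HashObject) -> None:
    """O(N) step: copy every element into the new encoding."""
    obj.entries = list(obj.entries)
    obj.encoding = ENC_LISTPACK


def hash_iter(obj: HashObject) -> Iterator[Tuple[bytes, bytes]]:
    """Convert an old-encoding hash on first iteration, then iterate pairs."""
    if obj.encoding == ENC_ZIPLIST:
        convert_to_listpack(obj)
    it = iter(obj.entries)
    return zip(it, it)  # pairs consecutive elements as (field, value)
```

With this approach each hash pays the O(N) cost once, the first time it is iterated, and hashes that are never touched stay in the old encoding.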

> we have a problem with testing. the majority of the tests work by creating a new db from scratch, so the ziplist code is now mostly unreachable.
> maybe we need some DEBUG sub-command that will tell redis to default to creating ziplists, and then run the entire testsuite in that mode (in the daily CI)
> i'm specifically worried about the corrupt-dump-fuzzer test. we either need to save an old rdb file in the assets folder, or let it run twice with that DEBUG tweak i suggested.

I have written a corrupt-dump-fuzzer test in #8761; I will complete these tests.

sundb · 2021-05-08T17:31:27Z

@oranagra I tested the speed of rdb loading before (keep ziplist) and after (convert ziplist) using the same dump.rdb, creating 100000 keys per test and filling the values with random strings.
It looks like loading with conversion takes roughly 2.5 to 2.6 times as long as loading without it.

| Entries per ziplist | Max value size | RDB loading time without conversion | RDB loading time with conversion |
|---|---|---|---|
| 256 | 64 bytes | 6.888s | 18.078s |
| 256 | 32 bytes | 6.849s | 17.869s |
| 256 | 16 bytes | 6.991s | 18.0842s |
| 128 | 64 bytes | 3.553s | 8.983s |
| 128 | 32 bytes | 3.521s | 9.0222s |
| 128 | 16 bytes | 3.558s | 8.982s |
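
For reference, the slowdown factor for each row can be recomputed from the figures above with a few lines of Python. The timings are copied from the table; the `slowdown` helper is just a convenience name for this sketch.

```python
# (entries per ziplist, max value size in bytes,
#  seconds without conversion, seconds with conversion)
rows = [
    (256, 64, 6.888, 18.078),
    (256, 32, 6.849, 17.869),
    (256, 16, 6.991, 18.0842),
    (128, 64, 3.553, 8.983),
    (128, 32, 3.521, 9.0222),
    (128, 16, 3.558, 8.982),
]


def slowdown(table):
    """Return load time with conversion divided by load time without it."""
    return [round(with_conv / without, 2) for _, _, without, with_conv in table]


ratios = slowdown(rows)
print(ratios)
```

Every ratio lands between about 2.5 and 2.6, and it barely changes with the value size or the number of entries per ziplist.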

1) Fix CR
2) In resolving the conflict, I reverted all corrupt-test tests, as some were fixed in redis#9302, and re-added the new issues caused by this PR.
3) ziplistValidateIntegrity fails to validate an empty ziplist because hash ziplist->listpack conversion is always deep sanitization; this should have been fixed in redis#9297 but was missed.

Part two of implementing #8702 (zset), after #8887.

## Description of the feature
Replaced all uses of ziplist with listpack in t_zset, and optimized some of the code to improve performance.

## Rdb format changes
New `RDB_TYPE_ZSET_LISTPACK` rdb type.

## Rdb loading improvements:
1) Pre-expansion of dict for validation of duplicate data for listpack and ziplist.
2) Simplifying the release of empty key objects when RDB loading.
3) Unify ziplist and listpack data verify methods for zset and hash, and move code to rdb.c.

## Interface changes
1) New `zset-max-listpack-entries` config is an alias for `zset-max-ziplist-entries` (same with `zset-max-listpack-value`).
2) OBJECT ENCODING will return listpack instead of ziplist.

## Listpack improvements:
1) Add `lpDeleteRange` and `lpDeleteRangeWithEntry` functions to delete a range of entries from listpack.
2) Improve the performance of `lpCompare`, converting from string to integer is faster than converting from integer to string.
3) Replace `snprintf` with `ll2string` to improve performance in converting numbers to strings in `lpGet()`.

## Zset improvements:
1) Improve the performance of `zzlFind` method, use `lpFind` instead of `lpCompare` in a loop.
2) Use `lpDeleteRangeWithEntry` instead of `lpDelete` twice to delete an element of zset.

## Tests
1) Add some unittests for `lpDeleteRange` and `lpDeleteRangeWithEntry` function.
2) Add zset RDB loading test.
3) Add benchmark test for `lpCompare` and `ziplistCompare`.
4) Add empty listpack zset corrupt dump test.

Part three of implementing #8702, following #8887 and #9366.

## Description of the feature
1. Replace the ziplist container of quicklist with listpack.
2. Convert existing quicklist ziplists at RDB load time, an O(n) operation.

## Interface changes
1. New `list-max-listpack-size` config is an alias for `list-max-ziplist-size`.
2. Replace `debug ziplist` command with `debug listpack`.

## Internal changes
1. Add `lpMerge` to merge two listpacks. (same as `ziplistMerge`)
2. Add `lpRepr` to print info of listpack which is used in debugCommand and `quicklistRepr`. (same as `ziplistRepr`)
3. Replace `QUICKLIST_NODE_CONTAINER_ZIPLIST` with `QUICKLIST_NODE_CONTAINER_PACKED` (following #9357). It represents that a quicklistNode is a packed node, as opposed to a plain node.
4. Remove `createZiplistObject` method, which is never used.
5. Calculate listpack entry size using overhead overestimation in `quicklistAllowInsert`. We prefer an overestimation, which would at worst lead to a few bytes below the lowest limit of 4k.

## Improvements
1. Call `lpShrinkToFit` after converting ziplist to listpack, which was missed in #9366.
2. Optimize `quicklistAppendPlainNode` to avoid memcpy of data.

## Bugfix
1. Fix crash in `quicklistRepr` when ziplist is compressed, introduced in #9366.

## Test
1. Add unittest for `lpMerge`.
2. Modify the old quicklist ziplist corrupt dump test.

Co-authored-by: Oran Agra <oran@redislabs.com>

dev-lemontree · 2022-05-19T12:29:02Z

@sundb @oranagra I have a question: does Redis 7 still support old RDB versions?
If we migrate from Redis 6 to Redis 7 via replication, Redis 6 may create an RDB in the old version (using ziplist). Can Redis 7 load that Redis 6 RDB?
This is a very common pattern for upgrading a Redis server.

sundb · 2022-05-19T12:30:58Z

@dev-lemontree Indeed, it is backward compatible.
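
A rough Python model of why an older dump still loads: the loader accepts any RDB version up to its own, and the old ziplist hash type is converted on load. This is a sketch under assumptions, not the code in rdb.c. The function names are hypothetical, and the numeric type codes 13 and 16 are assumed values that should be checked against rdb.h.

```python
RDB_VERSION = 10             # bumped to 10 by this PR, per the description
RDB_TYPE_HASH_ZIPLIST = 13   # assumed value; verify against rdb.h
RDB_TYPE_HASH_LISTPACK = 16  # assumed value; verify against rdb.h


def check_rdb_header(header: bytes) -> int:
    """Return the RDB version if this loader can read it, else raise ValueError."""
    if len(header) < 9 or header[:5] != b"REDIS":
        raise ValueError("wrong signature")
    version = int(header[5:9])  # four ASCII digits, e.g. b"0009"
    if version < 1 or version > RDB_VERSION:
        raise ValueError(f"can't handle RDB format version {version}")
    return version


def encoding_after_load(rdb_type: int) -> str:
    """Old and new packed hash types both end up as listpack in memory."""
    if rdb_type in (RDB_TYPE_HASH_ZIPLIST, RDB_TYPE_HASH_LISTPACK):
        return "listpack"
    raise ValueError("type not modelled in this sketch")
```

In this model a version 9 file with ziplist-encoded hashes passes the header check and comes out as listpack, while a file from a newer, unknown version is rejected.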

Remove some dead code in object.c; ziplist is no longer used in 7.0.

Some background:
zipmap - hash: replaced by ziplist in #285
ziplist - hash: replaced by listpack in #8887
ziplist - zset: replaced by listpack in #9366
ziplist - list: replaced by quicklist (listpack) in #2143 / #9740

Moved the location of ziplist.h in server.c.

sundb added 11 commits April 28, 2021 19:50

Add some interfaces to listpack for listpack migration

d08be0a

Revert some code and fix valgrind

f459af5

Remove unused dot

98856fb

Make the hash object support both listpack and ziplist

1b4a166

Revert some code

0406a9d

Remove some code

bfe9dd0

Add listContainerListpack

7e4a1a9

Fix test fail

16e7298

Fix module test fail

44ad433

Change code style and incr rdb ver

f5a7e79

Remove unused code and fix code style

b49f9a6

This was referenced May 6, 2021

listpack migration - replace all usage of ziplist with listpack #8761

Closed

Replace all usage of ziplist with listpack for quicklist #8880

Closed

oranagra added this to Backlog in 7.0 via automation May 6, 2021

oranagra moved this from Backlog to In Review in 7.0 May 6, 2021

oranagra reviewed May 6, 2021

sundb added 11 commits May 7, 2021 11:32

Change return type of listLen and listBlogLen

a3c27c7

Change list_container to packedClass

29e508f

Add default packed encoding and add test

1e86e6d

Fix lpValidateIntegrity bug

bcfe84d

Simplify lpValidateIntegrity

5b9d556

Change type of val length to size_t

9d30ddb

Fix packed_encoding error in test

b03f34f

Convert ziplist to listpack when deep sanitization

5b77c47

Add support to convert ziplist to listpack when rdb loading

06b29a6

Optimize some code

b0536d9

Convert ziplist to listpack when call hashTypeInitIterator

59e32e1

sundb added 2 commits August 5, 2021 20:23

Merge branch 'unstable' into listpack-migration-hash

46095fb

sundb force-pushed the listpack-migration-hash branch from b582e97 to 23115d0 August 6, 2021 04:04

oranagra reviewed Aug 8, 2021

sundb commented Aug 9, 2021

Merge remote-tracking branch 'upstream/unstable' into listpack-migration-hash

76ecbaa

oranagra approved these changes Aug 10, 2021

oranagra added the release-notes and state:major-decision labels Aug 10, 2021

oranagra merged commit 02fd76b into redis:unstable Aug 10, 2021

sundb mentioned this pull request Aug 10, 2021

Fix missing dismiss hash listpack memory due to ziplist->listpack migration #9353

Merged

oranagra moved this from In Review to Done in 7.0 Aug 11, 2021

sundb mentioned this pull request Aug 12, 2021

Replace all usage of ziplist with listpack for t_zset #9366

Merged

sundb deleted the listpack-migration-hash branch September 16, 2021 06:36

sundb mentioned this pull request Nov 5, 2021

Replace ziplist with listpack in quicklist #9740

Merged

enjoy-binbin mentioned this pull request May 19, 2022

Remove ziplist dead code in object.c #10751

Merged

DarrenJiang13 mentioned this pull request Jul 29, 2022

fix typo zl to lp as ziplist was replaced by listpack. #11062

Open

rhuddleston mentioned this pull request Nov 11, 2022

Redis 7 Support sripathikrishnan/redis-rdb-tools#185

Open

enjoy-binbin reviewed Nov 23, 2022

oranagra mentioned this pull request Apr 24, 2023

[CRASH] Redis 5.0.9 crash due to ziplistInsert #12099

Closed

srgsanky mentioned this pull request Feb 7, 2024

OBJECT ENCODING - needs an update to use listpack? redis/redis-doc#2658

Closed
