New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace all usage of ziplist with listpack for t_hash #8887
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sundb thank you for that effort.
i've added a few comments inside the code, but here are some general ones:
- i don't like the term "list container". in "ziplist" and "listpack", we mention it is zipped, or packed, i.e. encoded. this container can be more easily mistaken to be something related to list data type rather than an encoding type.
maybe a term like "encoded list" or "packed list" (i.e. something that is generic, and both ziplist and listpack comply to, i.e. both ziplist and listpack are encoded and packed). - currently we only create new hashes with listpack, and when modifying an existing hashes we'll always keep them as ziplists.
we need to see if we can come up with a plan to gradually convert them.
i.e. i don't want an O(N) operation at load time or any other time, but eventually i want them all gone. - we have a problem with testing. the majority of the tests work by creating a new db from scratch, so the ziplist code is now mostly unreachable.
maybe we need some DEBUG sub-command that will tell redis to default to creating ziplists, and then run the entire testsuite in that mode (in the daily CI)
i'm specifically worried about the corrupt-dump-fuzzer test. we either need to save an old rdb file in the assets folder, or let it run twice with that DEBUG tweak i suggested.
One more thing, we need an option to convert the encoding at rdb loading time. i think it would also be a good idea to benchmark such a conversion, and get a feeling of how much longer it takes to load an rdb file if we do a conversion during loading vs one that just loads the ziplists as they are. can you try doing such a benchmark? |
|
@oranagra I tested the speed of rdb loading, before(keep ziplsit) and after(convert ziplist) using the same dump.rdb, creating 100000 keys per test and filling value with random strings.
|
1) Fix CR 2) In resolving conflict, I reverted all corrupt-test tests, as some were fixed in redis#9302, and Re-add the new issue caused this pr. 3) ziplistValidateIntegrity fails to validate an empty ziplsit because hash ziplist->listpack conversion is always deep santization, this should be fixed in redis#9297 but was missed.
b582e97
to
23115d0
Compare
Part one of implementing redis#8702 (taking hashes first before other types) ## Description of the feature 1. Change ziplist encoded hash objects to listpack encoding. 2. Convert existing ziplists on RDB loading time. an O(n) operation. ## Rdb format changes 1. Add RDB_TYPE_HASH_LISTPACK rdb type. 2. Bump RDB_VERSION to 10 ## Interface changes 1. New `hash-max-listpack-entries` config is an alias for `hash-max-ziplist-entries` (same with `hash-max-listpack-value`) 2. OBJECT ENCODING will return `listpack` instead of `ziplist` ## Listpack improvements: 1. Support direct insert, replace integer element (rather than convert back and forth from string) 3. Add more listpack capabilities to match the ziplist ones (like `lpFind`, `lpRandomPairs` and such) 4. Optimize element length fetching, avoid multiple calculations 5. Use inline to avoid function call overhead. ## Tests 1. Add a new test to the RDB load time conversion 2. Adding the listpack unit tests. (based on the one in ziplist.c) 3. Add a few "corrupt payload: fuzzer findings" tests, and slightly modify existing ones. Co-authored-by: Oran Agra <oran@redislabs.com>
Part two of implementing #8702 (zset), after #8887. ## Description of the feature Replaced all uses of ziplist with listpack in t_zset, and optimized some of the code to optimize performance. ## Rdb format changes New `RDB_TYPE_ZSET_LISTPACK` rdb type. ## Rdb loading improvements: 1) Pre-expansion of dict for validation of duplicate data for listpack and ziplist. 2) Simplifying the release of empty key objects when RDB loading. 3) Unify ziplist and listpack data verify methods for zset and hash, and move code to rdb.c. ## Interface changes 1) New `zset-max-listpack-entries` config is an alias for `zset-max-ziplist-entries` (same with `zset-max-listpack-value`). 2) OBJECT ENCODING will return listpack instead of ziplist. ## Listpack improvements: 1) Add `lpDeleteRange` and `lpDeleteRangeWithEntry` functions to delete a range of entries from listpack. 2) Improve the performance of `lpCompare`, converting from string to integer is faster than converting from integer to string. 3) Replace `snprintf` with `ll2string` to improve performance in converting numbers to strings in `lpGet()`. ## Zset improvements: 1) Improve the performance of `zzlFind` method, use `lpFind` instead of `lpCompare` in a loop. 2) Use `lpDeleteRangeWithEntry` instead of `lpDelete` twice to delete a element of zset. ## Tests 1) Add some unittests for `lpDeleteRange` and `lpDeleteRangeWithEntry` function. 2) Add zset RDB loading test. 3) Add benchmark test for `lpCompare` and `ziplsitCompare`. 4) Add empty listpack zset corrupt dump test.
Part three of implementing #8702, following #8887 and #9366 . ## Description of the feature 1. Replace the ziplist container of quicklist with listpack. 2. Convert existing quicklist ziplists on RDB loading time. an O(n) operation. ## Interface changes 1. New `list-max-listpack-size` config is an alias for `list-max-ziplist-size`. 2. Replace `debug ziplist` command with `debug listpack`. ## Internal changes 1. Add `lpMerge` to merge two listpacks . (same as `ziplistMerge`) 2. Add `lpRepr` to print info of listpack which is used in debugCommand and `quicklistRepr`. (same as `ziplistRepr`) 3. Replace `QUICKLIST_NODE_CONTAINER_ZIPLIST` with `QUICKLIST_NODE_CONTAINER_PACKED`(following #9357 ). It represent that a quicklistNode is a packed node, as opposed to a plain node. 4. Remove `createZiplistObject` method, which is never used. 5. Calculate listpack entry size using overhead overestimation in `quicklistAllowInsert`. We prefer an overestimation, which would at worse lead to a few bytes below the lowest limit of 4k. ## Improvements 1. Calling `lpShrinkToFit` after converting Ziplist to listpack, which was missed at #9366. 2. Optimize `quicklistAppendPlainNode` to avoid memcpy data. ## Bugfix 1. Fix crash in `quicklistRepr` when ziplist is compressed, introduced from #9366. ## Test 1. Add unittest for `lpMerge`. 2. Modify the old quicklist ziplist corrupt dump test. Co-authored-by: Oran Agra <oran@redislabs.com>
Part three of implementing redis#8702, following redis#8887 and redis#9366 . ## Description of the feature 1. Replace the ziplist container of quicklist with listpack. 2. Convert existing quicklist ziplists on RDB loading time. an O(n) operation. ## Interface changes 1. New `list-max-listpack-size` config is an alias for `list-max-ziplist-size`. 2. Replace `debug ziplist` command with `debug listpack`. ## Internal changes 1. Add `lpMerge` to merge two listpacks . (same as `ziplistMerge`) 2. Add `lpRepr` to print info of listpack which is used in debugCommand and `quicklistRepr`. (same as `ziplistRepr`) 3. Replace `QUICKLIST_NODE_CONTAINER_ZIPLIST` with `QUICKLIST_NODE_CONTAINER_PACKED`(following redis#9357 ). It represent that a quicklistNode is a packed node, as opposed to a plain node. 4. Remove `createZiplistObject` method, which is never used. 5. Calculate listpack entry size using overhead overestimation in `quicklistAllowInsert`. We prefer an overestimation, which would at worse lead to a few bytes below the lowest limit of 4k. ## Improvements 1. Calling `lpShrinkToFit` after converting Ziplist to listpack, which was missed at redis#9366. 2. Optimize `quicklistAppendPlainNode` to avoid memcpy data. ## Bugfix 1. Fix crash in `quicklistRepr` when ziplist is compressed, introduced from redis#9366. ## Test 1. Add unittest for `lpMerge`. 2. Modify the old quicklist ziplist corrupt dump test. Co-authored-by: Oran Agra <oran@redislabs.com>
Remove some dead code in object.c, ziplist is no longer used in 7.0 Some backgrounds: zipmap - hash: replaced by ziplist in redis#285 ziplist - hash: replaced by listpack in redis#8887 ziplist - zset: replaced by listpack in redis#9366 ziplist - list: replaced by quicklist (listpack) in redis#2143 / redis#9740
@dev-lemontree Indeed, It is backward compatible. |
Remove some dead code in object.c, ziplist is no longer used in 7.0 Some backgrounds: zipmap - hash: replaced by ziplist in #285 ziplist - hash: replaced by listpack in #8887 ziplist - zset: replaced by listpack in #9366 ziplist - list: replaced by quicklist (listpack) in #2143 / #9740 Moved the location of ziplist.h in the server.c
Remove some dead code in object.c, ziplist is no longer used in 7.0 Some backgrounds: zipmap - hash: replaced by ziplist in redis#285 ziplist - hash: replaced by listpack in redis#8887 ziplist - zset: replaced by listpack in redis#9366 ziplist - list: replaced by quicklist (listpack) in redis#2143 / redis#9740 Moved the location of ziplist.h in the server.c
Part one of implementing #8702 (taking hashes first before other types)
Description of the feature
Rdb format changes
Interface changes
hash-max-listpack-entries
config is an alias forhash-max-ziplist-entries
(same withhash-max-listpack-value
)listpack
instead ofziplist
Listpack improvements:
lpFind
,lpRandomPairs
and such)Tests