Fix decrRefCount on NULL robj on corrupt KEY_META payload #15034
sundb merged 8 commits into redis:unstable
Augment PR Summary: Hardens RDB check-mode module value parsing against corrupt/truncated payloads by avoiding …
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 0518ea6.
@sundb, I think that once rdbLoadCheckModuleValue() fails while decoding a module payload, continuing to scan for RDB_MODULE_OPCODE_EOF is best-effort at best. We may already be desynchronized, so “keep going and return dummy” is not really a robust recovery strategy. If we do that, we should also update the callers so they stop parsing immediately on NULL.
@moticless Since this PR needs to be backported, I’d prefer not to make too many changes to the behavior that’s already working correctly in this PR, to avoid introducing a new bug to old versions.
## Summary

This PR fixes two issues when processing corrupt data in rdbLoadCheckModuleValue():

1. When handling `RDB_MODULE_OPCODE_STRING` opcode, rdbGenericLoadStringObject() can return NULL on a corrupt payload. The code called decrRefCount(o) unconditionally without a NULL check, resulting in a NULL pointer dereference crash.
2. The while loop condition was `!= RDB_MODULE_OPCODE_EOF`, which means a truncated payload (causing rdbLoadLen to return RDB_LENERR) would never exit the loop, since `RDB_LENERR != RDB_MODULE_OPCODE_EOF` is always true, potentially causing an infinite hang.
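The two fixes can be illustrated with a small, self-contained sketch. This is not the Redis code: `reader`, `read_len`, `read_string`, `skip_module_value`, and the `OP_*`/`LEN_ERR` constants are hypothetical stand-ins for the rio reader, `rdbLoadLen`, `rdbGenericLoadStringObject`, `rdbLoadCheckModuleValue`, the module opcodes, and `RDB_LENERR`. A one-byte length encoding is assumed purely for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the names used in the PR description. */
#define OP_EOF    0u          /* plays the role of RDB_MODULE_OPCODE_EOF */
#define OP_UINT   2u          /* an integer-valued module opcode */
#define OP_STRING 5u          /* plays the role of RDB_MODULE_OPCODE_STRING */
#define LEN_ERR   UINT64_MAX  /* plays the role of RDB_LENERR */

typedef struct { const uint8_t *buf; size_t len, pos; } reader;

/* Read one byte as a length/opcode; LEN_ERR if the input is exhausted. */
uint64_t read_len(reader *r) {
    if (r->pos >= r->len) return LEN_ERR;
    return r->buf[r->pos++];
}

/* Read a length-prefixed string; NULL if the payload is truncated. */
char *read_string(reader *r) {
    uint64_t n = read_len(r);
    if (n == LEN_ERR || n > r->len - r->pos) return NULL;
    char *s = malloc((size_t)n + 1);
    if (s == NULL) return NULL;
    memcpy(s, r->buf + r->pos, (size_t)n);
    s[n] = '\0';
    r->pos += (size_t)n;
    return s;
}

/* Skip one module value. Returns 0 on a clean EOF opcode, -1 on corruption. */
int skip_module_value(reader *r) {
    assert(r != NULL && r->buf != NULL);
    uint64_t op;
    while ((op = read_len(r)) != OP_EOF) {
        /* Fix 2: LEN_ERR is never equal to OP_EOF, so without this
         * check a truncated payload would keep the loop spinning. */
        if (op == LEN_ERR) return -1;
        if (op == OP_UINT) {
            if (read_len(r) == LEN_ERR) return -1;
        } else if (op == OP_STRING) {
            char *s = read_string(r);
            /* Fix 1: bail out before releasing a NULL object. free(NULL)
             * is harmless in C, but a refcount decrement on NULL is not. */
            if (s == NULL) return -1;
            free(s);
        } else {
            return -1; /* unknown opcode: treat as corrupt */
        }
    }
    return 0;
}
```

With this shape, a payload that ends before the EOF opcode, or whose string length overruns the buffer, is reported as corrupt instead of hanging or crashing.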

Note that the new test in corrupt-dump covers both of these issues.
To ensure that this test can be reproduced on version 8.6, the payload was modified manually.
Failed CI: https://github.com/redis/redis/actions/runs/24219825926/job/70708582753
Note
Low Risk
Changes are limited to RDB-check/corruption paths and add stricter validation/early exits, with minimal impact on normal RDB load/save behavior.
Overview
Improves corruption handling in `rdbLoadCheckModuleValue()` by detecting `RDB_LENERR`, rejecting unknown module opcodes, and avoiding `decrRefCount()` on a NULL string object read; it now supports an option to return `NULL` on corruption instead of always returning a dummy object.

Updates RDB/key-metadata and `redis-check-rdb` callers to pass the new `null_on_error` flag (returning `NULL` for KEY_META skipping paths, keeping dummy objects for regular check-mode parsing), and adds an integration test covering the corrupt `KEY_META` RESTORE regression.

Reviewed by Cursor Bugbot for commit 10c03a4.
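The calling convention described in the overview can be sketched as follows. Only the flag name `null_on_error` comes from the PR; the function names, the `corrupt` parameter (standing in for the result of scanning the payload), and the placeholder string are assumptions made for illustration, not the actual Redis signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder handed back when the caller only needs a non-NULL value. */
static const char dummy_value[] = "dummy";

/* corrupt: nonzero if the (hypothetical) payload scan hit corruption.
 * null_on_error: mirrors the flag named in the PR overview. */
const char *load_check_value(int corrupt, int null_on_error) {
    if (corrupt && null_on_error) return NULL;
    return dummy_value;
}

/* KEY_META-style skipping path: asks for NULL on corruption and stops
 * parsing as soon as it sees one. Returns 0 on success, -1 on corruption. */
int skip_key_meta(int corrupt) {
    return load_check_value(corrupt, 1) == NULL ? -1 : 0;
}

/* Regular check-mode path: keeps the older behavior of always receiving
 * a dummy object, so it never has to handle NULL. */
const char *check_mode_value(int corrupt) {
    const char *v = load_check_value(corrupt, 0);
    assert(v != NULL); /* guaranteed when null_on_error is 0 */
    return v;
}
```

Keeping the dummy-object behavior behind a flag leaves existing check-mode callers unchanged, which fits the backport concern raised in the conversation.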