Binlog JSON Parser: handle inline types correctly for large documents #8187
…e detail in binlog parsing to identify errant data Signed-off-by: Rohit Nayak <rohit@planetscale.com>
…ensive comments. Add test for reported failure mode and improve other tests Signed-off-by: Rohit Nayak <rohit@planetscale.com>
4cc1b75 to 225df50
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
This is now ready for review: applying a patch with the inline fix in this PR fixed the MoveTables workflow that was failing.
} else {
	fmt.Printf("%02d ", c)
	s += fmt.Sprintf("%02d ", c)
This concatenation will be inefficient. Consider using strings.Builder.
Actually this is only used for local debugging of the functionality. It is off in production. It will be removed soon once we have a few more prod tests with the json data type.
ah, cool. In that case, keep it as it is.
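For reference, the strings.Builder variant suggested above could look roughly like the following. This is a sketch only: formatBytes is a hypothetical helper name, not the function in this PR, and it covers just the "%02d " branch visible in the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// formatBytes is a hypothetical debugging helper (not the PR's function).
// It renders each byte as a zero-padded decimal followed by a space,
// mirroring the "%02d " format in the diff above.
func formatBytes(data []byte) string {
	var sb strings.Builder
	for _, c := range data {
		// Fprintf writes straight into the builder, avoiding the
		// repeated string reallocation that `s +=` causes in a loop.
		fmt.Fprintf(&sb, "%02d ", c)
	}
	return sb.String()
}
```

For a debug-only path that is off in production, the difference is unlikely to matter, which is consistent with the decision above to leave the code as it is.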
}

// only used for logging/debugging
var jsonTypeToName = map[uint]string{
Is there no boolean type?
The literal type is used for booleans and nulls.
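To illustrate that answer: in MySQL's binary JSON format a literal value carries a one-byte code that distinguishes null, true, and false. The sketch below shows that mapping. The Go identifiers (literalNull, decodeLiteral, and so on) are hypothetical and are not taken from this PR; only the numeric codes come from the format.

```go
package main

import "fmt"

// Literal codes from MySQL's binary JSON format. The Go names are
// hypothetical; only the numeric values come from the format.
const (
	literalNull  byte = 0x00
	literalTrue  byte = 0x01
	literalFalse byte = 0x02
)

// decodeLiteral maps the payload byte of a "literal" value to its JSON text.
// Any other code is reported as an error instead of being guessed at.
func decodeLiteral(b byte) (string, error) {
	switch b {
	case literalNull:
		return "null", nil
	case literalTrue:
		return "true", nil
	case literalFalse:
		return "false", nil
	}
	return "", fmt.Errorf("unknown literal code 0x%02x", b)
}
```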
	return length, pos
}

// getElem returns the json value found inside json objects and arrays at the provided position
func getElem(data []byte, pos int, large bool) (*ajson.Node, int, error) {
	var elem *ajson.Node
	var err error
	var offset int
	typ := jsonDataType(data[pos])
	pos++
	if isInline(typ, large) {
		elem, err = binlogJSON.getNode(typ, data, pos)
		if err != nil {
			return nil, 0, err
		}
		if large {
			pos += 4
		} else {
			pos += 2
		}
	} else {
		offset, pos = readInt(data, pos, large)
		if offset >= len(data) { // consistency check, should only come here if there is a bug in the code
			log.Errorf("unable to decode element")
			return nil, 0, fmt.Errorf("unable to decode element: %+v", data)
		}
		newData := data[offset:]
		// newPos ignored because this is an offset into the "extra" section of the buffer
		elem, err = binlogJSON.getNode(typ, newData, 1)
		if err != nil {
			return nil, 0, err
		}
	}
	return elem, pos, nil
}
gonna trust you on this 😅
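getElem depends on readInt, which is not shown in this hunk. As a rough illustration of what reading an offset field involves, here is a bounds-checked sketch. It assumes the little-endian layout of MySQL's binary JSON format; the name readOffset and the extra error return are hypothetical and differ from the two-value readInt called above.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readOffset is a hypothetical, bounds-checked stand-in for readInt.
// It reads a little-endian offset (2 bytes, or 4 bytes for large documents)
// at pos and returns the offset together with the position just past it.
func readOffset(data []byte, pos int, large bool) (int, int, error) {
	width := 2
	if large {
		width = 4
	}
	// Refuse to read past the end of the buffer instead of panicking.
	if pos < 0 || pos+width > len(data) {
		return 0, 0, fmt.Errorf("offset field at %d exceeds buffer of %d bytes", pos, len(data))
	}
	if large {
		return int(binary.LittleEndian.Uint32(data[pos:])), pos + width, nil
	}
	return int(binary.LittleEndian.Uint16(data[pos:])), pos + width, nil
}
```

Returning an error here is a design choice for the sketch: it surfaces malformed input to the caller in the same spirit as the consistency check in getElem.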
Description
In JSON objects and arrays, some data types are inlined. The existing code expected only 2-byte values (int16s, literals, null) to be inlined. However, for "large" documents (> 64K), MySQL also inlines 4-byte values (int32s). This resulted in out-of-bounds panics because the inline 32-bit integers were treated as offsets into the document.
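The inlining rule described above can be sketched as follows. This is not the isInline function from this PR: the identifiers are hypothetical, and the type codes and the treatment of the unsigned variants follow MySQL's binary JSON format as I understand it.

```go
package main

// Type codes from MySQL's binary JSON format (subset). The Go identifiers
// are hypothetical and may differ from those used in this PR.
const (
	typeLiteral byte = 0x04
	typeInt16   byte = 0x05
	typeUint16  byte = 0x06
	typeInt32   byte = 0x07
	typeUint32  byte = 0x08
	typeInt64   byte = 0x09
)

// inlineSketch reports whether a value of the given type is stored directly
// in the value entry rather than behind an offset into the document.
func inlineSketch(typ byte, large bool) bool {
	switch typ {
	case typeLiteral, typeInt16, typeUint16:
		// Fits in 2 bytes, so it is inlined in small and large documents.
		return true
	case typeInt32, typeUint32:
		// Fits only in the 4-byte value slots of large documents.
		return large
	}
	return false
}
```

With this rule, an int32 in a large document is decoded in place. Without it, the same four bytes would be read as an offset, which is the failure mode this PR addresses.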
While the main purpose of this PR is to fix the inline bug, we take this opportunity to:
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Error Snippet
A MoveTables workflow failed because the JSON parsing logic panicked on a binlog event with this error: