varbinary column data truncated in RowsEvent #477

Closed
kolbitsch-lastline opened this issue Mar 8, 2020 · 2 comments

Comments

@kolbitsch-lastline
Contributor

MySQL BINARY and VARBINARY columns do not seem to have their own type in replication events; they are simply streamed as MYSQL_TYPE_STRING or MYSQL_TYPE_VAR_STRING, respectively. At least, the documentation (see Table 14.4 Column Types) does not list a specific type:

https://dev.mysql.com/doc/internals/en/com-query-response.html#packet-Protocol::MYSQL_TYPE_STRING

In the parsing of row events, we use the hack module to convert the raw []byte to string instances:

https://github.com/siddontang/go-mysql/blob/master/replication/row_event.go#L541

		n = int(length) + 1
		v = hack.String(data[1:n])

which seems to strip trailing 0-bytes from the parsed events. For strings, this makes sense, but for binary blobs, it does not.
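One way to check whether the conversion step itself drops bytes is a minimal sketch like the following. It mimics the one-byte length-prefixed branch quoted above, but uses a plain `string(...)` conversion in place of `hack.String` (an assumption made so the example is self-contained); `decodeShortString` is a hypothetical name, not a function from the library.

```go
package main

import "fmt"

// decodeShortString is a hypothetical stand-in for the quoted branch:
// the first byte is the payload length, the payload follows.
// It uses string(...) instead of hack.String (assumption).
func decodeShortString(data []byte) (string, int) {
	length := data[0]
	n := int(length) + 1 // bytes consumed: 1 length byte + payload
	return string(data[1:n]), n
}

func main() {
	// Length prefix 4, payload "ab\x00\x00", then an unrelated trailing byte.
	buf := []byte{4, 'a', 'b', 0, 0, 0xff}
	v, n := decodeShortString(buf)
	fmt.Println(len(v), n) // 4 5
	fmt.Printf("%q\n", v)  // "ab\x00\x00"
}
```

In this sketch the zero bytes inside the length-prefixed payload survive the conversion, because Go strings can hold arbitrary bytes. If the real decoder behaves the same way, any missing trailing zeros would already be absent from the event data rather than removed by the conversion.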

While the decodeString method returns the extracted length field, the caller in decodeRows only uses this information to advance the position in the buffer:

https://github.com/siddontang/go-mysql/blob/master/replication/row_event.go#L349


		row[i], n, err = e.decodeValue(data[pos:], table.ColumnType[i], table.ColumnMeta[i])

		if err != nil {
			return 0, err
		}
		pos += n
	}

	e.Rows = append(e.Rows, row)
	return pos, nil

As a result, a client receiving parsed binlog data has no way of knowing whether a binary string was truncated due to a 0-byte or not.

@kolbitsch-lastline
Contributor Author

After reading a bit more of the MySQL internals documentation, I'm actually not sure whether this is a bug in the library or whether the replication protocol really strips away trailing 0-bytes. If the latter is the case, we have to handle it in the caller, which knows more about the schema than the replication logic does.

would love to hear some thoughts from people more familiar with the nitty-gritty details of the protocol

@kolbitsch-lastline
Contributor Author

ok, eventually ended up finding the correct passage in the documentation:

https://dev.mysql.com/doc/refman/5.7/en/binary-varbinary.html

When BINARY values are stored, they are right-padded with the pad value to the specified length. The pad value is 0x00 (the zero byte). Values are right-padded with 0x00 for inserts, and no trailing bytes are removed for retrievals.

thus, it's expected that the value we see in the binlog (and that the module returns) does not contain the trailing 0-bytes.

IMO we need to help clients handle this (and hence extend the schema.go code), but that's a separate issue I'll open and send a PR for
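As a rough illustration of what such client-side handling could look like, here is a minimal sketch that right-pads a value received from the binlog back to the declared BINARY(N) width. `padBinary` is a hypothetical helper, not part of the library, and `width` is assumed to come from the table schema (for example, from a column type such as `binary(4)`); how schema.go would expose that width is not covered in this thread.

```go
package main

import (
	"bytes"
	"fmt"
)

// padBinary is a hypothetical helper: it right-pads v with 0x00 bytes up to
// width, matching the padding the MySQL documentation describes for BINARY
// columns. Values already at or beyond width are returned unchanged.
func padBinary(v []byte, width int) []byte {
	if len(v) >= width {
		return v
	}
	out := make([]byte, width) // make zero-fills, so the tail is already 0x00
	copy(out, v)
	return out
}

func main() {
	// A BINARY(4) column whose binlog value arrived without its padding.
	got := padBinary([]byte{0xde, 0xad}, 4)
	fmt.Println(bytes.Equal(got, []byte{0xde, 0xad, 0x00, 0x00})) // true
}
```

Padding like this only makes sense for fixed-width BINARY columns. VARBINARY values have no declared pad length to restore, so a caller would need the column type from the schema before applying it.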

ghost pushed a commit to actiontech/go-mysql that referenced this issue Nov 24, 2020