VReplication: Support MySQL Binary Log Transaction Compression #12950

Merged
mattlord merged 26 commits into vitessio:main from vrepl_binlog_compression on Apr 30, 2023

Contributor

@mattlord mattlord commented Apr 21, 2023

Description

MySQL 8.0 added support for binary log compression via transaction (GTID) compression in 8.0.20. You can read more about this feature here: https://dev.mysql.com/doc/refman/8.0/en/binary-log-transaction-compression.html

At the cost of increased CPU usage, this can dramatically reduce the amount of data sent over the wire for MySQL replication, while also dramatically reducing the storage space needed to retain binary logs (for replication, backup and recovery, CDC, etc). For larger installations this was a very desirable feature. While you could technically use it with Vitess (the MySQL replica sets making up each shard could use it fine), there was one very big limitation: VReplication workflows would not work. Given the criticality of VReplication workflows within Vitess, this meant that in practice this MySQL feature was not usable within Vitess clusters.

This PR addresses the issue by adding support for processing the compressed transaction events in VReplication, without any (known) limitations.

Note: if you want to test this locally (e.g. with the local examples), create a my.cnf file that enables the option and point the EXTRA_MY_CNF environment variable at it before starting the cluster (mysqlctl looks for this). For example:

$ cat ~/.my.cnf
[mysql]
binary-as-hex=false

[mysqld]
binlog_transaction_compression = ON

$ env | grep EXTRA
EXTRA_MY_CNF=/Users/matt/.my.cnf

Related Issue(s)

Checklist

Signed-off-by: Matt Lord <mattalord@gmail.com>
@vitess-bot vitess-bot bot added the NeedsDescriptionUpdate and NeedsWebsiteDocsUpdate labels Apr 21, 2023
Contributor

vitess-bot bot commented Apr 21, 2023

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.
  • If a test is added or modified, there should be documentation on top of the test explaining what the expected behavior is and what the test does.

If a new flag is being introduced:

  • Is it really necessary to add this flag?
  • Flag names should be clear and intuitive (as far as possible)
  • Help text should be descriptive.
  • Flag names should use dashes (-) as word separators rather than underscores (_).

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow should be required, the maintainer team should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should include a link to an issue that describes the bug.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from VTop, if used there.

@github-actions github-actions bot added this to the v17.0.0 milestone Apr 21, 2023
@mattlord mattlord added the Type: Enhancement and Component: VReplication labels Apr 21, 2023
We needed more random data in order to generate the necessary
binlog size when compression is enabled.

Signed-off-by: Matt Lord <mattalord@gmail.com>
@mattlord mattlord removed the NeedsDescriptionUpdate and NeedsWebsiteDocsUpdate labels Apr 25, 2023
@mattlord mattlord marked this pull request as ready for review April 25, 2023 16:01
Contributor

@rohit-nayak-ps rohit-nayak-ps left a comment

Very nice :-)

Similar to VStreamerEventsStreamed, should we add a VStreamerCompressedEventsStreamed and possibly metrics for bytes before/after compression for visibility? Not sure if it will add value, so just thinking aloud here.

@mattlord
Contributor Author

> Very nice :-)
>
> Similar to VStreamerEventsStreamed, should we add a VStreamerCompressedEventsStreamed and possibly metrics for bytes before/after compression for visibility? Not sure if it will add value, so just thinking aloud here.

Good idea. We don't stream them, but we decode them and stream the internal events. So I added a metric for VStreamerCompressedTransactionsDecoded.

I'll open a docs PR to add that here:
https://vitess.io/docs/17.0/reference/vreplication/metrics/
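
To illustrate the distinction drawn above (compressed transactions are decoded, and only the events inside them are streamed), here is a minimal sketch. It uses the standard library's expvar as a stand-in for Vitess's own stats package, and the `event` type and `expand` function are hypothetical names invented for this example; only the metric name comes from the comment above.

```go
package main

import (
	"expvar"
	"fmt"
)

// Stand-in counter. Vitess registers metrics through its own stats
// package; expvar is used here only to keep the sketch self-contained.
var compressedTransactionsDecoded = expvar.NewInt("VStreamerCompressedTransactionsDecoded")

// event is a hypothetical, heavily simplified binlog event.
type event struct {
	compressed bool     // true if this is a compressed transaction payload
	inner      []string // events carried inside a compressed payload
	name       string   // name of a plain (uncompressed) event
}

// expand returns the events that would be streamed downstream. A compressed
// payload is never streamed itself: it is decoded (and counted), and its
// inner events are returned instead.
func expand(ev event) []string {
	if !ev.compressed {
		return []string{ev.name}
	}
	compressedTransactionsDecoded.Add(1)
	return ev.inner
}

func main() {
	out := expand(event{compressed: true, inner: []string{"BEGIN", "ROW", "COMMIT"}})
	out = append(out, expand(event{name: "ROTATE"})...)
	fmt.Println(out, compressedTransactionsDecoded.Value())
}
```

The counter increments once per decoded payload, not once per inner event, which matches the "transactions decoded" wording of the metric name.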

@mattlord mattlord requested a review from vmg April 27, 2023 17:25
}

// Create a reader that caches decompressors.
var zstdDecoder, _ = zstd.NewReader(nil, zstd.WithDecoderConcurrency(0))
Contributor

Is this concurrency safe? Would we ever have multiple decodings happening in parallel?

Contributor

Ah wait, looks like https://pkg.go.dev/github.com/klauspost/compress/zstd#NewReader answers this as we're using DecodeAll.

Would it be useful at some point to support streaming mode? As I understand it, a single compressed entry can contain multiple events, so we could emit events while decompressing rather than decompressing everything into memory first.

Contributor Author

@mattlord mattlord Apr 27, 2023

I was using the streaming mode at first and it's pretty slow. It would make sense to potentially switch to that based on the uncompressed payload size.... maybe we switch to streaming if it's > 128MiB or something?

Contributor

@dbussink dbussink Apr 27, 2023

@mattlord Hmm, interesting. How slow? Seems like maybe something we should dig into at some point but it doesn't have to hold up this PR I think.

Contributor Author

@mattlord mattlord Apr 27, 2023

Not THAT slow, but slow enough that it caused failures in the OnlineDDL VRepl stress tests — which have a bunch of threads executing small transactions against the DB while the workflow is running — as the vstream could not keep up enough to perform the final cutover before hitting the cutover timeout.

Contributor

@mattlord That reminds me. I think we should update to the latest https://github.com/klauspost/compress as well here, looks like there recently were quite a few zstd improvements.
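
The streaming idea discussed in this thread (emitting events while decompressing instead of inflating the whole payload first) can be sketched roughly as follows. This is not the PR's code: it uses the standard library's compress/gzip as a stand-in for zstd so that it runs without third-party modules, and the 4-byte length-prefixed framing plus the `compressEvents` and `streamEvents` names are assumptions made up for the illustration, not the MySQL payload format.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/binary"
	"fmt"
	"io"
)

// compressEvents packs events into one compressed payload, each event
// preceded by a 4-byte little-endian length (hypothetical framing).
func compressEvents(events [][]byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	for _, ev := range events {
		var l [4]byte
		binary.LittleEndian.PutUint32(l[:], uint32(len(ev)))
		if _, err := zw.Write(l[:]); err != nil {
			return nil, err
		}
		if _, err := zw.Write(ev); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// streamEvents decompresses incrementally and calls emit as soon as each
// event is complete, so only one event is held in memory at a time.
func streamEvents(payload []byte, emit func([]byte)) error {
	zr, err := gzip.NewReader(bytes.NewReader(payload))
	if err != nil {
		return err
	}
	defer zr.Close()
	var l [4]byte
	for {
		// io.ReadFull reports io.EOF only when no bytes at all were read,
		// which here means a clean end of the payload.
		if _, err := io.ReadFull(zr, l[:]); err == io.EOF {
			return nil
		} else if err != nil {
			return err
		}
		ev := make([]byte, binary.LittleEndian.Uint32(l[:]))
		if _, err := io.ReadFull(zr, ev); err != nil {
			return err
		}
		emit(ev)
	}
}

func main() {
	payload, err := compressEvents([][]byte{[]byte("BEGIN"), []byte("ROW"), []byte("COMMIT")})
	if err != nil {
		panic(err)
	}
	err = streamEvents(payload, func(ev []byte) { fmt.Printf("%s\n", ev) })
	if err != nil {
		panic(err)
	}
}
```

The trade-off raised in the thread still applies: the streaming path makes many small reads through the decompressor, which is why it can be slower than decoding the whole buffer at once for small transactions.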

@mattlord
Contributor Author

Thanks for the review, @dbussink ! I think that I've now addressed most of your comments.

// At what size should we switch from the in-memory buffer
// decoding to streaming mode -- which is slower, but does not
// require everything be done in memory.
const zstdInMemoryDecompressorMaxSize = 128 << (10 * 2) // 128MiB
Contributor Author

Maybe we should make this a vttablet flag? ...
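
As a small worked check of the constant above: `128 << (10 * 2)` shifts 128 left by 20 bits, which is 128 × 1024 × 1024 = 134,217,728 bytes (128 MiB). The sketch below shows how such a threshold might gate the choice of decoding path; `useStreamingDecoder` is a hypothetical helper name introduced here, not a function from the PR.

```go
package main

import "fmt"

// Mirrors the constant quoted above: 128 shifted left by 20 bits.
const zstdInMemoryDecompressorMaxSize = 128 << (10 * 2)

// useStreamingDecoder is a hypothetical helper: it reports whether a payload
// with the given uncompressed size is too large for in-memory decoding and
// should take the slower streaming path instead.
func useStreamingDecoder(uncompressedSize uint64) bool {
	return uncompressedSize > zstdInMemoryDecompressorMaxSize
}

func main() {
	fmt.Println(zstdInMemoryDecompressorMaxSize == 128*1024*1024)
	fmt.Println(useStreamingDecoder(4 << 10))   // a 4 KiB transaction
	fmt.Println(useStreamingDecoder(256 << 20)) // a 256 MiB transaction
}
```

If the threshold were exposed as a vttablet flag, as suggested, the constant would become a variable and the comparison would stay the same.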

Contributor

@shlomi-noach shlomi-noach left a comment

Just a few comments. I realize you've turned some magic numbers into constants, but while we're here, the current codebase makes false assumptions when parsing the events (it assumes CRC32 is enabled when it might not be).

Is there a potential to add a unit test file in go/mysql to test the new TransactionPayload parsing?

// Offset from 0 where the 4 byte length is stored.
binlogEventLenOffset = 9
// Byte length of the checksum suffix.
binlogChecksumLen = 4
Contributor

Please rename this as binlogCRC32ChecksumLen; there are potentially, even if not practically, other types of checksums with different lengths.

Contributor

Please make all of the above public constants; I have a use case to use those values externally (and I have, like you, created constants for those outside Vitess).

  // dataBytes returns the event bytes without header prefix and without checksum suffix
  func (ev binlogEvent) dataBytes(f BinlogFormat) []byte {
  	data := ev.Bytes()[f.HeaderLength:]
- 	data = data[0 : len(data)-4]
+ 	data = data[0 : len(data)-binlogChecksumLen]
Contributor

We should really check BinlogFormat to see which checksum is being used, before assuming that a checksum is used at all. Sample code:

	switch f.ChecksumAlgorithm {
	case mysql.BinlogChecksumAlgCRC32:
		// ... size is binlogChecksumLen
	case mysql.BinlogChecksumAlgOff:
		// ... size is 0
	default:
		return vterrors.Errorf(vtrpc.Code_INVALID_ARGUMENT, "unsupported checksum algorithm: %v", f.ChecksumAlgorithm)
	}

Contributor Author

I'd rather not expand this PR. I don't disagree with you that we can improve this but it's not directly related and I want to keep this focused.

- checksum := data[length-4:]
- data = data[:length-4]
+ checksum := data[length-binlogChecksumLen:]
+ data = data[:length-binlogChecksumLen]
Contributor

again, we must not assume CRC32 here.

Contributor Author

Again, sure. But it's not related to this PR, I only made the 4 less "magical". 🙂

Contributor Author

@mattlord mattlord Apr 28, 2023

It's also worth noting that, in both MySQL and MariaDB, the checksum is either 0 bytes (when there is no checksum) or 4 bytes regardless of the algorithm used. In both cases, though, NONE and CRC32 are the ONLY options supported today:
https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#option_mysqld_binlog-checksum

So we're being forward looking here, which I'm all for. But again, I don't want to expand the scope of this PR.

Contributor

👍
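
The follow-up suggested in this thread (checking the checksum algorithm before stripping a suffix) was left out of this PR, but it could look roughly like the sketch below. It is standalone rather than written against Vitess types: `checksumLen` and `dataBytes` are hypothetical free functions, and the algorithm values (0 for off, 1 for CRC32) are assumptions based on MySQL's binlog checksum enum.

```go
package main

import (
	"errors"
	"fmt"
)

// Assumed algorithm identifiers, following MySQL's binlog checksum enum.
const (
	checksumAlgOff   byte = 0
	checksumAlgCRC32 byte = 1
)

// Byte length of the checksum suffix when a checksum is present.
const binlogChecksumLen = 4

// checksumLen returns how many trailing bytes of an event hold the checksum.
func checksumLen(alg byte) (int, error) {
	switch alg {
	case checksumAlgOff:
		return 0, nil
	case checksumAlgCRC32:
		return binlogChecksumLen, nil
	default:
		return 0, fmt.Errorf("unsupported checksum algorithm: %v", alg)
	}
}

// dataBytes returns the event bytes without the header prefix and without
// the checksum suffix, if the given algorithm adds one.
func dataBytes(event []byte, headerLen int, alg byte) ([]byte, error) {
	n, err := checksumLen(alg)
	if err != nil {
		return nil, err
	}
	if headerLen < 0 || len(event) < headerLen+n {
		return nil, errors.New("event is shorter than its header plus checksum")
	}
	return event[headerLen : len(event)-n], nil
}

func main() {
	// 19 header bytes, a 7 byte body, and 4 trailing bytes.
	event := append(make([]byte, 19), []byte("payload\x01\x02\x03\x04")...)

	withCRC, _ := dataBytes(event, 19, checksumAlgCRC32)
	withoutCRC, _ := dataBytes(event, 19, checksumAlgOff)
	fmt.Printf("%q\n%q\n", withCRC, withoutCRC)
}
```

Returning an error for unknown algorithms keeps the forward-looking behavior the reviewer asked for, while the two supported cases cover what MySQL and MariaDB offer today.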

var transactionPayloadCompressionTypes = map[uint64]string{
transactionPayloadCompressionZstd: "ZSTD",
transactionPayloadCompressionNone: "NONE",
}
Contributor

I might have a need for these to be public too; but I will iterate in a different PR if needed.
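
A minimal sketch of how a lookup map like the one above is typically used, with a fallback for unknown values. The numeric values (0 for ZSTD, 255 for NONE) are assumptions taken from MySQL's transaction payload compression enum, since the diff snippet does not show them, and `compressionTypeName` is a hypothetical helper.

```go
package main

import "fmt"

// Assumed values, following MySQL's compression type enum.
const (
	transactionPayloadCompressionZstd uint64 = 0
	transactionPayloadCompressionNone uint64 = 255
)

var transactionPayloadCompressionTypes = map[uint64]string{
	transactionPayloadCompressionZstd: "ZSTD",
	transactionPayloadCompressionNone: "NONE",
}

// compressionTypeName returns a printable name for a compression type,
// falling back to a placeholder for values the map does not know about.
func compressionTypeName(t uint64) string {
	if name, ok := transactionPayloadCompressionTypes[t]; ok {
		return name
	}
	return fmt.Sprintf("UNKNOWN(%d)", t)
}

func main() {
	for _, t := range []uint64{0, 255, 7} {
		fmt.Println(compressionTypeName(t))
	}
}
```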

@mattlord mattlord requested a review from ajm188 as a code owner April 28, 2023 16:12
@mattlord
Contributor Author

Thank you for the review, @shlomi-noach ! ❤️ I believe that I've addressed all of your comments now (w/o expanding the scope of the PR too much). Please let me know if I missed something though.

Collaborator

@vmg vmg left a comment

This looks great. I think we can still remove some allocations from the ZSTD decompression code but I don't want to block the PR, since the behavior is correct. I'll run some benchmarks next week!

@mattlord
Contributor Author

> This looks great. I think we can still remove some allocations from the ZSTD decompression code but I don't want to block the PR, since the behavior is correct. I'll run some benchmarks next week!

@vmg Yeah, I agree. That's why I wanted you, the perf master, to review. 😄 Thank you! ❤️

@mattlord mattlord merged commit 0655342 into vitessio:main Apr 30, 2023
112 checks passed
@mattlord mattlord deleted the vrepl_binlog_compression branch April 30, 2023 21:38
frouioui pushed a commit to planetscale/vitess that referenced this pull request Nov 21, 2023
…ansaction Compression (vitessio#2047)

* cherry pick of 12950

* Fix merge conflicts

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

---------

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com>
Labels
Component: VReplication, Type: Enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MySQL 8.0 binlog compression does not work with Vitess VReplication
5 participants