Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VReplication Workflows: support noblob binlog_row_image format for MoveTables and Reshard #12905

Merged
merged 27 commits into from
Apr 26, 2023

Conversation

rohit-nayak-ps
Copy link
Contributor

@rohit-nayak-ps rohit-nayak-ps commented Apr 16, 2023

Description

Motivation

Currently, the full binlog_row_image is a hard requirement for using VReplication within Vitess. This is because
there are certain Online DDL and Materialize workflows which will not work without full.

For example, if the target table has a different primary key and doesn't have the existing primary key column, a partial
row image for a delete may not have the new primary key column to identify the target column.

However, if a row has blobs which are huge but rarely updated, we end up with a huge waste of resources (storage,
network bandwidth, cpu), in the replication (vplayer) phase, as compared to a noblob binlog_row_image.

Many common workflows like MoveTables and Reshard and indeed many Online DDL migrations will work fine. Hence, we should look at relaxing the requirement for full for such workflows.

Approach

For phase 1, to minimize impacted code and risk, we will support noblob for MoveTables and Reshard.
minimal will provide fewer benefits and requires more intrusive code changes.

VStreamer

We continue to send the same number of columns in VStreamer, but missing columns are sent as null using length:-1
for those columns. This implies that the contract on the vstreamer's consumers stays the same. An additional attribute
in the proto sends the null column bitmap. To optimize, we do this only if the flag is set and if it is not a
complete image.

VPlayer

The copy phase will do a full copy, using direct select queries against the source, so there is no impact of the binlog
row image mode.

For the VPlayer we need to generate different Insert and Update queries for different combinations of columns that are
used.

For partial queries, we compute the required ParsedQuery at runtime when we encounter one. We store it in a cache, one
each for Insert and Update, using the bitmap as the key. So we only compute it once and we then bind the variables to
this query based on the received bitmap.

Other design options and trade-offs are discussed below.

Other design options

Some other options that were considered before choosing the current approach.

Use single parsed query for update statements

We use the single Update parsed query and remove the columns by parsing the generated sql string, dropping the missing
columns and serialize the ast again.

Benefit
We don't have multiple partial update parsed queries. We cannot predict how many of such queries can exist:
theoretically we could have an combinatorial explosion depending on the number of update templates used by the app.
ORMs, for example, could generate many such depending on which column is modified.

Problem
Reparsing and regenerating the query dynamically will have a significant impact especially with a large update qps.

Status
We will export a metric to keep track of the cache size and reevaluate this later on. In general it is not expected to
require a huge number of cache entries.

Use default in insert statements

For insert statements we can continue to use the single parsed Insert query, but use the default keyword for columns
which are not present.

Benefit
Only single parsed query for insert statements

Problem
We will need to rewrite the already created query, for missing columns, in the TrackedBuffer, to replace the existing
bind variables format strings to (say) %q and then use default as the bind variable value.This needs to be done for every partial query and consequently result in non-trivial performance penalties.

Status
Same as above

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on the CI
  • Documentation was added or is not required

@github-actions github-actions bot added this to the v17.0.0 milestone Apr 16, 2023
@rohit-nayak-ps rohit-nayak-ps added Type: Feature Request Component: VReplication Do Not Merge release notes (needs details) This PR needs to be listed in the release notes in a dedicated section (deprecation notice, etc...) NeedsWebsiteDocsUpdate What it says labels Apr 16, 2023
…e2e test

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
…e bitmap in RowChanges. Optimize if there are no missing columns. Add vstreamer test.

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
…so test for noblob

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
…nit tests. Dynamically cache partial update queries.

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
… noblob binlog row image

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
…d flaky binlog_row_image in tests: need to diag/fix

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
…al tests for new changes. Set binlog mode dynamically instead of in test-suite.cnf

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
@rohit-nayak-ps rohit-nayak-ps changed the title Experimental: support noblob binlog_row_image format for MoveTables and Reshard VReplication Workflows: support noblob binlog_row_image format for MoveTables and Reshard Apr 21, 2023
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
@vitess-bot
Copy link
Contributor

vitess-bot bot commented Apr 21, 2023

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.
  • If a test is added or modified, there should be a documentation on top of the test to explain what the expected behavior is what the test does.

If a new flag is being introduced:

  • Is it really necessary to add this flag?
  • Flag names should be clear and intuitive (as far as possible)
  • Help text should be descriptive.
  • Flag names should use dashes (-) as word separators rather than underscores (_).

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow should be required, the maintainer team should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should include a link to an issue that describes the bug.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from VTop, if used there.

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
@rohit-nayak-ps rohit-nayak-ps removed release notes (needs details) This PR needs to be listed in the release notes in a dedicated section (deprecation notice, etc...) NeedsWebsiteDocsUpdate What it says labels Apr 23, 2023
@shlomi-noach
Copy link
Contributor

shlomi-noach commented Apr 24, 2023

Related and notable: #12921 ; VReplication dos not have full support for column prefix indexes. As such, #12921 rejects them. The relevance is that a MySQL index cannot cover a blob column without a prefix -- meaning VReplication does not support blob PRIMARY KEY in the first place.

This means it is safe to use binlog_row_image=noblob for vitess Online DDL; either the migration is accepted and runs correctly, or is rejected.

Copy link
Contributor

@shlomi-noach shlomi-noach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just incredible.

proto/binlogdata.proto Show resolved Hide resolved
proto/binlogdata.proto Outdated Show resolved Hide resolved
@@ -209,7 +209,7 @@ func TestGatewayBufferingWhileReparenting(t *testing.T) {
hc.SetTabletType(replicaTablet, topodatapb.TabletType_PRIMARY)
hc.SetServing(replicaTablet, true)
hc.Broadcast(replicaTablet)

time.Sleep(10 * time.Second) // temp. hack to see if CI failures are due to delays here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

… Improve comments

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Copy link
Contributor

@mattlord mattlord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Nice work on this, Rohit! I only had minor comments that you can address as you feel is best.

changelog/17.0/17.0.0/summary.md Outdated Show resolved Hide resolved
@@ -533,6 +534,17 @@ func getDebugVar(t *testing.T, port int, varPath []string) (string, error) {
return string(val), nil
}

func getDebugVars(t *testing.T, port int) map[string]any {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I had added this or similar elsewhere too. Might be worth moving it to the cluster package (if it's not already there). In this case, we could move it to go/test/endtoend/cluster/vttablet_process.go or make it more generic and just take a host:port like you are here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, changed it to use GetVars() from VTTabletProcess .

go/test/endtoend/vreplication/helper_test.go Outdated Show resolved Hide resolved
go/test/endtoend/vreplication/helper_test.go Outdated Show resolved Hide resolved
@@ -306,13 +306,20 @@ enum VEventType {
COPY_COMPLETED = 20;
}

message Bitmap {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message name feels too generic to me. Can we call it RowImageBitmap or something? In fact, it might be worth just embedding it directly in RowChange as it probably has no value/usage outside of that right? Then the name would be perfectly fine as it becomes a child. For example: https://github.com/vitessio/vitess/pull/12622/files#diff-13fb22ef91d4bfe00ede706cafee2757699abdb1358c14519b747e4eaf9e34b4R1314-R1320

func (tpb *tablePlanBuilder) createPartialUpdateQuery(dataColumns *binlogdatapb.Bitmap) *sqlparser.ParsedQuery {
bvf := &bindvarFormatter{}
buf := sqlparser.NewTrackedBuffer(bvf.formatter)
buf.Myprintf("update %v set ", tpb.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not already, I think we should make it an IdentifierCS for the potential escaping.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

colexpr already uses IndentifierCIs

if !isBitSet(dataColumns.Cols, ind) {
continue
}
buf.Myprintf("%s%v", separator, cexpr.colName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably escape the column names (if needed) too, no?

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
@rohit-nayak-ps rohit-nayak-ps merged commit 11a5c03 into vitessio:main Apr 26, 2023
@rohit-nayak-ps rohit-nayak-ps deleted the rn-noblob branch April 26, 2023 10:19
@vitess-bot
Copy link
Contributor

vitess-bot bot commented Apr 26, 2023

I was unable to backport this Pull Request to the following branches: release-16.0.

rohit-nayak-ps added a commit to planetscale/vitess that referenced this pull request May 9, 2023
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
rohit-nayak-ps added a commit to planetscale/vitess that referenced this pull request May 10, 2023
Backport: VReplication Workflows: support noblob binlog_row_image format for MoveTables and Reshard vitessio#12905
frouioui pushed a commit to planetscale/vitess that referenced this pull request Nov 21, 2023
…veTables and Reshard (vitessio#12905) (vitessio#2022)

* Initial POC of binlog_row_image=noblob. Working player unit test and e2e test



* Add vttablet flag to allow non-full binlog row image. Send only single bitmap in RowChanges. Optimize if there are no missing columns. Add vstreamer test.



* Remove noblob settings from cnf files and modify test framework to also test for noblob



* Set binlog row image dynamically for e2e tests too



* Validate that noblob is only used with MoveTables/Reshard. More e2e/unit tests. Dynamically cache partial update queries.



* Minor refactor



* Unit tests: fake workflow type as MoveTables so that we can test with noblob binlog row image



* Also support partial inserts. Needed to update test-suite.cnf to avoid flaky binlog_row_image in tests: need to diag/fix



* Don't throw errors for noblob, for unit tests



* Fix flags help



* Add metrics for cache size and number of partial queries



* Refactor flags



* Fix tests



* Fail vstreamer if binlog mode is minimal. Fix UpdateVSchema and Minimal tests for new changes. Set binlog mode dynamically instead of in test-suite.cnf



* Fix flags help



* Extra cnf was not being set correctly in tests



* More tests



* We were testing for partial queries all the time instead of only noblob



* More comments and minor refactor



* make proto



* Minor refactor, more comments



* Update release summary



* Fix cleanup after vplayer tests



* Cleanup tests



* Address review comments. Add vstreamer/vplayer tests for unique keys. Improve comments



* Make proto



* Address review comments



---------

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: Rohit Nayak <57520317+rohit-nayak-ps@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support "minimal" binlog_row_image for vreplication
3 participants