Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expression filter: UPDATE filter is ignored if all of update-old-value filters are empty #7774

Closed
kennytm opened this issue Dec 1, 2022 · 3 comments · Fixed by #7779
Closed

Comments

@kennytm
Copy link
Contributor

kennytm commented Dec 1, 2022

What did you do?

  1. Prepare a table

    -- upstream MySQL
    CREATE TABLE test.t (a BIGINT PRIMARY KEY, r VARCHAR(10), s INT);
    INSERT INTO test.t VALUES (1, 'a', 2);
    
    -- downstream TiDB
    CREATE TABLE test.t (a BIGINT PRIMARY KEY, r VARCHAR(10), s INT);
    INSERT INTO test.t VALUES (1, 'a', 2);
  2. Create a task which only the update-new-value filter is specified

    expression-filter:
        e02:
            schema: test
            table: t
            update-new-value-expr: "r = 'a'"
    mysql-instances:
      - source-id: mysql-replica-01
        expression-filters: ['e02']
    ...
  3. In the upstream MySQL, execute an UPDATE statement

    -- Upstream MySQL
    UPDATE test.t SET s = s + 1 WHERE a = 1;

What did you expect to see?

The UPDATE DML should be skipped, because the changed row has r = 'a'.

What did you see instead?

The DML is actually synchronized, i.e. SELECTing the row gives s = 3.

Versions of the cluster

v6.3.0

current status of DM cluster (execute query-status <task-name> in dmctl)

No response

@kennytm kennytm added type/bug This is a bug. area/dm Issues or PRs related to DM. affects-6.3 affects-6.4 labels Dec 1, 2022
@github-actions github-actions bot added this to Need Triage in Question and Bug Reports Dec 1, 2022
@kennytm
Copy link
Contributor Author

kennytm commented Dec 1, 2022

The issue is caused by oldValueFilters == nil inside the genAndFilterUpdateDMLs() function. The loop is thus completely skipped over and a RowChange is always generated.

tiflow/dm/syncer/dml.go

Lines 197 to 199 in bc176c3

for j := range oldValueFilters {
// AND logic
oldExpr, newExpr := oldValueFilters[j], newValueFilters[j]

The reason why oldValueFilters == nil I believe is the wrong short-circuit check used in GetUpdateExprs(). Although the doc suggests it will return two lists of the same length, if g.hasUpdateOldFilter[tableID] is missing, g.updateOldExprs[tableID] will remain an unpopulated zero-length list (instead of becoming ["1", "1", "1", …]).

// GetUpdateExprs returns two lists of expression filters for given table, to filter UPDATE events by old values and new
// values respectively. The two lists should have same length, and the corresponding expressions is AND logic.
// This function will lazy calculate expressions if not initialized.
func (g *ExprFilterGroup) GetUpdateExprs(table *filter.Table, ti *model.TableInfo) ([]expression.Expression, []expression.Expression, error) {
tableID := utils.GenTableID(table)
retOld, ok1 := g.updateOldExprs[tableID]
retNew, ok2 := g.updateNewExprs[tableID]
if ok1 || ok2 {
return retOld, retNew, nil
}
if _, ok := g.hasUpdateOldFilter[tableID]; ok {

(And I suspect filling in only update-old-value-expr will instead cause an index-out-of-bound panic in dml.go:199, but I haven't tried.)

@lance6716
Copy link
Contributor

strange that we have these tests

update_old_lt_100:
schema: "expr_filter"
table: "t5"
update-old-value-expr: "c > 100"
update_new_lt_100:
schema: "expr_filter"
table: "t5"
update-new-value-expr: "c > 100"

I'll take a look what's wrong

@kennytm
Copy link
Contributor Author

kennytm commented Dec 2, 2022

@lance6716 it is because the test contains multiple filters on expr_filter.t5. Try to have only one rule in the entire task.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

4 participants