New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
executor: improve performance for insert ignore on duplicate key update statement #6760
Conversation
Good job! |
/run-all-tests |
executor/batch_checker.go
Outdated
// Batch get values. | ||
nKeys := 0 | ||
for _, r := range toBeCheckRows { | ||
nKeys += len(r.uniqueKeys) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
handleKey
should be considered?
executor/batch_checker.go
Outdated
dupErr error | ||
} | ||
|
||
type toBeCheckRow struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/toBeCheckRow/toBeCheckedRow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I want a more meaningful name. Any suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think toBeCheckedRow
is ok.
executor/batch_checker.go
Outdated
} | ||
|
||
// batchGetInsertKeys uses batch-get to fetch all key-value pairs to be checked for ignore or duplicate key update. | ||
func (b *batchChecker) batchGetInsertKeys(ctx sessionctx.Context, t table.Table, newRows []types.DatumRow) ([]toBeCheckRow, map[string][]byte, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we assign the results to b
and only returns error?
executor/batch_checker.go
Outdated
} | ||
|
||
// batchGetOldValues gets the values of storage in batch. | ||
func (b *batchChecker) batchGetOldValues(ctx sessionctx.Context, t table.Table, handles []int64) (map[string][]byte, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just set the values into b.dupOldRowValues
?
executor/batch_checker.go
Outdated
func (b *batchChecker) initDupOldRowValue(ctx sessionctx.Context, t table.Table, newRows []types.DatumRow) (err error) { | ||
b.dupOldRowValues = make(map[string][]byte, len(newRows)) | ||
handles := make([]int64, 0, len(newRows)) | ||
for _, r := range b.toBeCheckRows { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we loop the toBeCheckedRows
two times.
The first time we only set dupOldRowValues
for handle key.
The second time we collect unique key handles, then batch get those handles.
Then we can extract two methods, initDupOldRowFromHandleKey
and initDupOldRowFromUniqueKey
.
PTAL @coocood |
LGTM |
@XuHuaiyu PTAL |
executor/batch_checker.go
Outdated
} | ||
|
||
type keyValueWithDupInfo struct { | ||
newKeyValue keyValue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we rename newKeyValue
to newKV
? I think r.handleKey.newKeyValue.key
is too long.
executor/batch_checker.go
Outdated
type batchChecker struct { | ||
// For duplicate key update | ||
toBeCheckedRows []toBeCheckedRow | ||
dupKeyValues map[string][]byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto. s/dupKeyValues/dupKVs
executor/insert.go
Outdated
return errors.Trace(err) | ||
} | ||
delete(e.dupKeyValues, string(r.handleKey.newKeyValue.key)) | ||
newRows[i] = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove this line like line142.
/run-all-tests |
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What have you changed?
insert ignore on duplicate key update
statement.What are the type of the changes?
How has this PR been tested?
unit tests
Benchmark result
When insert 27 rows in one statement,
PTAL @coocood @XuHuaiyu