This repository was archived by the owner on Aug 21, 2023. It is now read-only.

Replace go-sql-driver/mysql with siddontang/go-mysql/client to improve dumpling's performance #126

@lichunzhu

Description

Background

Currently, dumpling's performance is only 1/2 to 2/3 of mydumper's. Two parts of dumpling account for much of the time.

  1. After analyzing the torch graph and running some simple tests, we found that dumpling spends a lot of time fetching each row.

That's because driver.Value, which database/sql uses to pass column values (it is defined in database/sql/driver), is an interface{} type.

When a []byte variable is converted to interface{}, runtime.convTslice calls runtime.mallocgc to allocate memory for the conversion, which costs a lot of time. We can't change driver.Value, though.

One solution is to stop using database/sql and work directly with the []byte values read from the MySQL server. But this is a huge change for dumpling.

  2. Dumpling currently does these things serially:

read a row -> escape string -> write to buffer -> read next row ...

In fact, while we escape the values of one row we could already be reading the next row, which would improve performance. But this seems hard to implement on top of the database/sql package, which means we may have to implement the MySQL client ourselves.

Reference

  1. conversion code in go-sql-driver/mysql: https://github.com/go-sql-driver/mysql/blob/73dc904a9ece5c074295a77abb7a135797a351bf/packets.go#L770
  2. dumpling torch graph:
    image
  3. mydumper torch graph:
    image
  4. code to test the efficiency of assigning []byte to interface{}
    In this test, assigning []byte to interface{} costs much more time than assigning it to []byte.
    https://gist.github.com/lichunzhu/2433d332b4bfc57fb7c1aa3f404b4c58
  5. Test: if we run dumpling with scan only (escape and write disabled), it takes the same time as mydumper takes for both reading and writing.
    Relevant torch: image

Tasks

  • improve dumpling's performance and make it better than mydumper's (for both single-threaded and multi-threaded runs)
    • one possible approach is to replace go-sql-driver/mysql with siddontang/go-mysql/client. What's more, we need to refactor this package so that reading from the database, escaping characters and writing to disk run in parallel.

Score

  • 6600

Mentor

Recommended Skills

  • performance improvement for golang
