This repository was archived by the owner on Aug 21, 2023. It is now read-only.
Currently dumpling's performance is only 1/2 to 2/3 of mydumper. There are two parts that cost a lot of time for dumpling.
After analyzing the torch graph and running some simple tests, we found that dumpling spends a lot of time fetching each row.
That's because `driver.Value` in the `database/sql/driver` package is an `interface{}` type.
When a `[]byte` value is converted to `interface{}`, the runtime calls `runtime.mallocgc` from `runtime.convTslice` to allocate it, which costs a lot of time, and we can't change `driver.Value`.
One solution is to abandon `database/sql` and directly use the `[]byte` values read from the MySQL server. But this is a huge change for dumpling.
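The allocation described above can be observed with a small stand-alone program. This is only a sketch, not dumpling code: the function names `allocsIface` and `allocsBytes` and the package-level sink variables are made up for illustration, and it relies on `testing.AllocsPerRun` from the standard library to count heap allocations per assignment.

```go
package main

import (
	"fmt"
	"testing"
)

// Package-level sinks force the assigned values to escape,
// so the compiler cannot optimize the assignments away.
var (
	sinkIface interface{}
	sinkBytes []byte
)

// allocsIface reports the average number of heap allocations
// made when a []byte is stored into an interface{} variable.
func allocsIface(row []byte) float64 {
	return testing.AllocsPerRun(1000, func() {
		sinkIface = row // boxing the slice header allocates
	})
}

// allocsBytes reports the same figure for a plain []byte assignment.
func allocsBytes(row []byte) float64 {
	return testing.AllocsPerRun(1000, func() {
		sinkBytes = row // copies the slice header, no allocation
	})
}

func main() {
	row := []byte("some column value")
	fmt.Println("allocs per []byte -> interface{} assignment:", allocsIface(row))
	fmt.Println("allocs per []byte -> []byte assignment:", allocsBytes(row))
}
```

With one such conversion per column per row, the cost adds up over a large dump, which is consistent with the time seen in row fetching.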
Currently dumpling performs these steps serially:
read a row -> escape string -> write to buffer -> read next row ...
While escaping a value, we could already be reading the next row, which would improve performance. But this seems hard to implement on top of the `database/sql` package, which means we may have to implement the MySQL client ourselves.
Test: if dumpling only scans rows (with escaping and writing disabled), it takes about the same time as mydumper takes to both read and write.
Relevant torch graph:
Tasks
Improve dumpling's performance and make it better than mydumper (for both single-threaded and multi-threaded runs).
One possible approach is to replace go-sql-driver/mysql with siddontang/go-mysql/client. In addition, we need to refactor this package to parallelize reading from the database, escaping characters, and writing to disk.
Reference
Assigning a `[]byte` to an `interface{}` costs much more time than assigning it to a `[]byte` in this test: https://gist.github.com/lichunzhu/2433d332b4bfc57fb7c1aa3f404b4c58