Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to use multithread option on single table backup using mydumper #1427

Open
test-mydumper opened this issue Mar 7, 2024 · 14 comments
Open

Comments

@test-mydumper
Copy link

I'm trying to use mydumper and myloader to perform a backup and restore activity of the MySQL database. Below is the use case

We have 1 database, around 2 TB, out of which 1 table is around 1.3 TB. Now I want to take a backup of a single table using the multithreads option, and I used the below query, but I don't see multiple threads at the database level when I run show processlist.

mydumper -h host_name -u username --password= password -P 3306 --tables-list=db_name.table_name -t 14 -o /path

and also below

mydumper -h host_name -u username --password= password -P 3306 --tables-list=db_name.table_name --rows=1000000 -t 14 -o /path

So how to achieve multiple threading on a single table while taking the backup and also while restoring as well?

@davidducos
Copy link
Member

Has the table an integer pk ?

@davidducos davidducos added this to the Release 0.16.3-1 milestone Mar 7, 2024
@test-mydumper
Copy link
Author

Has the table an integer pk ?

Yes, the table has 'id' as the integer pk

@davidducos
Copy link
Member

Are you using the latest prerelease?

@test-mydumper
Copy link
Author

Are you using the latest prerelease?

Yes, I'm using the latest release

@davidducos
Copy link
Member

Hi @test-mydumper,
I'm not able to reproduce the issue that you reported using this test case:

date; time sysbench /usr/share/sysbench/oltp_write_only.lua --table-size=10000000 --tables=1 --threads=1 --mysql-user=root --mysql-db=sbtest --rate=10 --time=0 --report-interval=10 --create_secondary=off prepare; date
rm -rf data; ./mydumper -o data --tables-list=sbtest.sbtest1 -v 3 --rows=1000000

I got multiple threads exporting the table:

** Message: 15:36:27.065: Thread 2: dumping data for `sbtest`.`sbtest1`  WHERE (6000000 <= `id` AND `id` <= 6999999)       into data/sbtest.sbtest1.00000.00007.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:27.069: Thread 4: dumping data for `sbtest`.`sbtest1`  WHERE (9000000 <= `id` AND `id` <= 9999999)       into data/sbtest.sbtest1.00000.00010.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:27.072: Thread 3: dumping data for `sbtest`.`sbtest1`  WHERE (`id` IS NULL OR(0 <= `id` AND `id` <= 999999))       into data/sbtest.sbtest1.00000.00001.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:27.083: Thread 1: dumping data for `sbtest`.`sbtest1`  WHERE (3000000 <= `id` AND `id` <= 3999999)       into data/sbtest.sbtest1.00000.00004.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:35.067: Thread 1: dumping data for `sbtest`.`sbtest1`  WHERE (4000000 <= `id` AND `id` <= 4999999)       into data/sbtest.sbtest1.00000.00004.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:35.256: Thread 2: dumping data for `sbtest`.`sbtest1`  WHERE (7000000 <= `id` AND `id` <= 7999999)       into data/sbtest.sbtest1.00000.00007.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:35.284: Thread 3: dumping data for `sbtest`.`sbtest1`  WHERE (1000000 <= `id` AND `id` <= 1999999)       into data/sbtest.sbtest1.00000.00001.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:35.569: Thread 4: dumping data for `sbtest`.`sbtest1`  WHERE (`id` = 10000000)       into data/sbtest.sbtest1.00000.00010.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:35.584: Thread 4: dumping data for `sbtest`.`sbtest1`  WHERE (2000000 <= `id` AND `id` <= 2999999)       into data/sbtest.sbtest1.00000.00003.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:45.192: Thread 1: dumping data for `sbtest`.`sbtest1`  WHERE (5000000 <= `id` AND `id` <= 5999999)       into data/sbtest.sbtest1.00000.00006.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:45.337: Thread 2: dumping data for `sbtest`.`sbtest1`  WHERE (8000000 <= `id` AND `id` <= 8999999)       into data/sbtest.sbtest1.00000.00009.sql| Remaining jobs in this table: 7 All remaining jobs: 0

@test-mydumper
Copy link
Author

Hi @test-mydumper, I'm not able to reproduce the issue that you reported using this test case:

date; time sysbench /usr/share/sysbench/oltp_write_only.lua --table-size=10000000 --tables=1 --threads=1 --mysql-user=root --mysql-db=sbtest --rate=10 --time=0 --report-interval=10 --create_secondary=off prepare; date
rm -rf data; ./mydumper -o data --tables-list=sbtest.sbtest1 -v 3 --rows=1000000

I got multiple threads exporting the table:

** Message: 15:36:27.065: Thread 2: dumping data for `sbtest`.`sbtest1`  WHERE (6000000 <= `id` AND `id` <= 6999999)       into data/sbtest.sbtest1.00000.00007.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:27.069: Thread 4: dumping data for `sbtest`.`sbtest1`  WHERE (9000000 <= `id` AND `id` <= 9999999)       into data/sbtest.sbtest1.00000.00010.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:27.072: Thread 3: dumping data for `sbtest`.`sbtest1`  WHERE (`id` IS NULL OR(0 <= `id` AND `id` <= 999999))       into data/sbtest.sbtest1.00000.00001.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:27.083: Thread 1: dumping data for `sbtest`.`sbtest1`  WHERE (3000000 <= `id` AND `id` <= 3999999)       into data/sbtest.sbtest1.00000.00004.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:35.067: Thread 1: dumping data for `sbtest`.`sbtest1`  WHERE (4000000 <= `id` AND `id` <= 4999999)       into data/sbtest.sbtest1.00000.00004.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:35.256: Thread 2: dumping data for `sbtest`.`sbtest1`  WHERE (7000000 <= `id` AND `id` <= 7999999)       into data/sbtest.sbtest1.00000.00007.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:35.284: Thread 3: dumping data for `sbtest`.`sbtest1`  WHERE (1000000 <= `id` AND `id` <= 1999999)       into data/sbtest.sbtest1.00000.00001.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:35.569: Thread 4: dumping data for `sbtest`.`sbtest1`  WHERE (`id` = 10000000)       into data/sbtest.sbtest1.00000.00010.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:35.584: Thread 4: dumping data for `sbtest`.`sbtest1`  WHERE (2000000 <= `id` AND `id` <= 2999999)       into data/sbtest.sbtest1.00000.00003.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:45.192: Thread 1: dumping data for `sbtest`.`sbtest1`  WHERE (5000000 <= `id` AND `id` <= 5999999)       into data/sbtest.sbtest1.00000.00006.sql| Remaining jobs in this table: 7 All remaining jobs: 0
** Message: 15:36:45.337: Thread 2: dumping data for `sbtest`.`sbtest1`  WHERE (8000000 <= `id` AND `id` <= 8999999)       into data/sbtest.sbtest1.00000.00009.sql| Remaining jobs in this table: 7 All remaining jobs: 0

Sorry, below is the version I'm using:
mydumper v0.15.1-3, built against MySQL 8.0.33-25 with SSL support

@davidducos
Copy link
Member

Test with latest release please.

@test-mydumper
Copy link
Author

Test with latest release please.

Sure. Can you please share the exact query to have multiple threads on same table when using mydumper, and also the query for myloader to have multiple threads on single table while restoring.

@test-mydumper
Copy link
Author

Now I have upgraded to the latest release.

mydumper v0.16.1-1, built against MySQL 8.0.36-28 with SSL support

@test-mydumper
Copy link
Author

test-mydumper commented Mar 10, 2024

@davidducos
There are two tables remaining from two distinct databases when we backup all the databases at once, even all other database backups are completed. One of the tables is roughly 1.5 TB in size, and the other is roughly 1 TB.
My jump server where we are taking the backup is allocated with 16v CPU and 128 GB of RAM, so I'm trying to use 16 multi-threaded while taking the backup, but once all other DBs backups complete, I only see 2 threads running for those 2 tables. Could it be because of the old release?
Could you please suggest the exact query to take the backup of all the databases in such a way that when the backup of all other DBs completes, the 16 threads should be used for those 2 tables instead of only using 2 threads?
The same way, when I'm trying to restore the same data, I can see only 2 threads at the last for those 2 large tables, even though I have used 16 multi-threaded myloader. Please suggest the exact myloader query as well so that once all other tables restoration gets completed, the 16 threads should be used for those 2 tables instead of only using 2 threads.

@test-mydumper
Copy link
Author

test-mydumper commented Mar 12, 2024

Hi @davidducos could you please help me on this?

@davidducos
Copy link
Member

Hi @test-mydumper,
I'm already investigating your issue, but I have limited amount of time to review MyDumper issues. I'm doing my best effort here.

@test-mydumper
Copy link
Author

Thank you so much @davidducos

@davidducos
Copy link
Member

Hi @test-mydumper, I'm still not able to reproduce your issue. I will need your logs to understand where might be the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants