Parallelize compression/decompression during backup and restore #9856

Merged
merged 3 commits into from
Jun 15, 2018

Conversation

@pliu pliu commented May 16, 2018

Drop-in of pgzip for gzip

The backup and restore commands of influxd were able to download the data from the database quickly, but the subsequent compression and decompression ran single-threaded, which is extremely slow.

Required for all non-trivial PRs
  • [ x ] Sign CLA (if not already signed)

Drop-in of pgzip for gzip
@ghost ghost added the proposed label May 16, 2018
@e-dard
Contributor

e-dard commented May 29, 2018

@pliu thanks for the contribution. Whilst the code changes in this PR are trivial, there may be significant changes to the behaviour of the database. In order to accept this contribution it would be good to see some benchmarks or performance tests.

Further, the core team would probably need to look into the third-party library and check that it's suitable for inclusion. /cc @aanthony1243

@pliu
Author

pliu commented May 29, 2018

@e-dard Do you have internal benchmark/performance tests that you can run against the change or is this something I will need to perform?

Thanks!

Contributor

@e-dard e-dard left a comment

Hi @pliu this is looking fine. Thanks for this. We have recently started using dep as our dependency management tool. Do you mind if I add a commit to the PR with the correct dep update for pgzip?

@aanthony1243
Contributor

Agreed. Once we update for dep, this is OK to merge.

@pliu
Author

pliu commented Jun 4, 2018

@e-dard Feel free to make the necessary changes :) Thank you!

@pliu pliu force-pushed the backup_restore_parallel_compression branch from b334800 to 4035591 on June 12, 2018 07:15
@e-dard
Contributor

e-dard commented Jun 13, 2018

@pliu hi, can you cherry-pick this commit into your branch? 4712404

@pliu
Author

pliu commented Jun 13, 2018

@e-dard Done

@e-dard e-dard changed the title from "Parallel compression" to "Parallelize compression/decompression during backup and restore" Jun 13, 2018
@e-dard e-dard merged commit ab293e8 into influxdata:master Jun 15, 2018
@ghost ghost removed the proposed label Jun 15, 2018
@rbetts rbetts mentioned this pull request Jun 25, 2018
10 tasks
jacobmarble pushed a commit that referenced this pull request Jun 26, 2018
Parallelize compression/decompression during backup and restore
jacobmarble pushed a commit that referenced this pull request Jun 27, 2018
Parallelize compression/decompression during backup and restore

3 participants