Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fwrite gzip #3288

Merged
merged 65 commits into from
Apr 23, 2019
Merged

Fwrite gzip #3288

merged 65 commits into from
Apr 23, 2019

Conversation

philippechataignon
Copy link
Contributor

@philippechataignon philippechataignon commented Jan 16, 2019

Closes #2016
Closes #2974
Closes #2612

I reopen a PR to discuss about its contents. Its a complete rewrite of #3278 : it doesn't use gzopen/gzwrite/gzclose because only one thread compress and that's slow.

In fwrite, output file is always open in binary mode. The idea is to take data from uncompressed buffer in normal case and to compress it and take data from compressed buffer when gzip is active. The code for managing file output is quite the same in normal and gzip mode.

The main difference concern the buffers :

Thanks for any advice on this PR.

Philippe Chataignon and others added 30 commits January 14, 2019 10:52
Use zlib and gzopen/gzwrite/gzclose function to write buffer directly in
a gzipped csv file.

zlib is thread-safe and the gzip compression use the fwrite threads.

Option compress="gzip" is added to fwrite et is automatically set when file ends
with ".gz"
In fwrite, compress has now 3 options :

* default : gzip if file ends with .gz, else csv
* none : force csv
* gzip : force gzip
Before this commit, every column is output one by one.
@mattdowle mattdowle added this to the 1.12.4 milestone Apr 16, 2019
@codecov
Copy link

codecov bot commented Apr 16, 2019

Codecov Report

Merging #3288 into master will increase coverage by 0.29%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3288      +/-   ##
==========================================
+ Coverage   96.71%   97.01%   +0.29%     
==========================================
  Files          66       66              
  Lines       12424    12475      +51     
==========================================
+ Hits        12016    12102      +86     
+ Misses        408      373      -35
Impacted Files Coverage Δ
src/fwrite.c 97.59% <100%> (+6.58%) ⬆️
R/fwrite.R 98.24% <100%> (+1.94%) ⬆️
src/fwriteR.c 98.02% <100%> (+3.11%) ⬆️
src/nafill.c 100% <0%> (ø) ⬆️
src/freadR.c 96.28% <0%> (+0.12%) ⬆️
src/uniqlist.c 98.98% <0%> (+0.5%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c30e161...7665a8d. Read the comment docs.

@mattdowle
Copy link
Member

Fantastic PR -- thanks @philippechataignon.

@mattdowle mattdowle merged commit 9559d26 into Rdatatable:master Apr 23, 2019
mattdowle added a commit that referenced this pull request Apr 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants