Implement zip file streaming #896

Closed
cemeyer opened this issue May 8, 2016 · 1 comment


cemeyer commented May 8, 2016

When searching large zip files (e.g. multi-GB logs or something), it's desirable not to decompress the whole thing in memory at once. Instead, a pseudo-FILE object can be created with fopencookie(3) (glibc) or funopen(3) (BSD) that only needs to seek forwards and decompresses small chunks at a time; that FILE* can then be passed to the ordinary streaming search routine.

Here's an example of what fopencookie/funopen-wrapped zip streaming looks like: https://github.com/cemeyer/Zlib-FILE (that implementation is permissively licensed, so feel free to just drop it into the_silver_searcher).
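To make the mechanism concrete, here is a minimal sketch of the fopencookie(3) side only. It is not the Zlib-FILE code and not what ag does: the `chunk_cookie` struct, `open_chunk_stream()` and `count_matching_lines()` are hypothetical names made up for illustration, and a plain `memcpy` stands in for the `inflate()` call a real reader would make. It assumes glibc (or another libc that provides fopencookie).

```c
#define _GNU_SOURCE /* fopencookie(3) is a GNU extension, not POSIX */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical cookie: a forward-only cursor over an in-memory source.
 * A real zip reader would keep a z_stream and the input fd here instead. */
struct chunk_cookie {
    const char *src;
    size_t len;
    size_t pos;
};

/* Hand out at most this many bytes per read call, to mimic
 * "decompress a small chunk at a time". */
#define CHUNK_MAX 16

static ssize_t chunk_read(void *c, char *buf, size_t size) {
    struct chunk_cookie *ck = c;
    size_t remaining = ck->len - ck->pos;
    size_t n = size < CHUNK_MAX ? size : CHUNK_MAX;
    if (n > remaining)
        n = remaining;
    /* Stand-in for inflate(): copy the next chunk out unchanged. */
    memcpy(buf, ck->src + ck->pos, n);
    ck->pos += n;
    return (ssize_t)n; /* returning 0 signals end of file */
}

static int chunk_close(void *c) {
    free(c);
    return 0;
}

/* Wrap the buffer in a read-only FILE*. No seek callback is supplied,
 * so the stream can only be consumed front to back. */
FILE *open_chunk_stream(const char *src, size_t len) {
    assert(src != NULL);
    struct chunk_cookie *ck = malloc(sizeof(*ck));
    if (ck == NULL)
        return NULL;
    ck->src = src;
    ck->len = len;
    ck->pos = 0;

    cookie_io_functions_t io = {
        .read = chunk_read,
        .write = NULL,
        .seek = NULL,
        .close = chunk_close,
    };
    FILE *fp = fopencookie(ck, "r", io);
    if (fp == NULL)
        free(ck);
    return fp;
}

/* Stand-in for a streaming search routine: count lines containing needle. */
size_t count_matching_lines(FILE *fp, const char *needle) {
    char *line = NULL;
    size_t cap = 0;
    size_t hits = 0;
    while (getline(&line, &cap, fp) != -1) {
        if (strstr(line, needle) != NULL)
            hits++;
    }
    free(line);
    return hits;
}
```

The point of the sketch is that the consumer only ever sees a `FILE *`: `count_matching_lines()` has no idea the bytes arrive 16 at a time from a callback, which is why an existing stream-search routine could be reused unchanged. On BSD, funopen(3) fills the same role with a slightly different callback signature.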

Aside from the FILE-wrapper code, the actual patch should be fairly small. Something like this:

--- a/src/search.c
+++ b/src/search.c
@@ -315,14 +315,10 @@ void search_file(const char *file_full_path) {
     if (opts.search_zip_files) {
         ag_compression_type zip_type = is_zipped(buf, f_len);
         if (zip_type != AG_NO_COMPRESSION) {
-            int _buf_len = (int)f_len;
-            char *_buf = decompress(zip_type, buf, f_len, file_full_path, &_buf_len);
-            if (_buf == NULL || _buf_len == 0) {
-                log_err("Cannot decompress zipped file %s", file_full_path);
-                goto cleanup;
-            }
-            search_buf(_buf, _buf_len, file_full_path);
-            free(_buf);
+            log_debug("%s is a zip file. stream searching", file_full_path);
+            fp = decompress_file(zip_type, ...);
+            search_stream(fp, file_full_path);
+            fclose(fp);
             goto cleanup;
         }
     }
cemeyer added a commit to cemeyer/the_silver_searcher that referenced this issue Jun 17, 2017
Use the GNU fopencookie(3) mechanism to produce a FILE object for zipped
files, and then treat it the same as other non-mmapable streams (i.e., FIFOs).

Since some supported platforms do not support fopencookie(3) (Mac OS X,
maybe Cygwin, older BSDs), retain non-streaming zip file support.

Existing tests pass.

cemeyer commented Jun 17, 2017

See #1106.

@ggreer ggreer closed this as completed in 2ec3782 Jun 27, 2017
ggreer added a commit that referenced this issue Jun 27, 2017
Fix #896 - Stream decompress zipped files