Based on the #690 (comment) discussion, we could implement a new `unzip_above: 100mb` option which would extract the archive to a tmp dir (defined by a `FSCRAWLER_TMP` system option, defaulting to the system tmp dir) and then run FSCrawler on that dir.
For each generated Doc, we would then rewrite the URL so it points not to the tmp dir but into the ZIP/GZ file.
After the run, we would remove the tmp dir and its files. This cleanup could be made optional with a setting like `keep_tmp_files: true`.
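The three steps above (extract to a tmp dir, rewrite doc URLs back to the archive, clean up) could be sketched roughly as follows. This is only an illustration, not FSCrawler code: the class and method names (`UnzipAboveSketch`, `extractToTmp`, `virtualUrl`, `cleanup`) and the `zip!/path` URL convention are assumptions, and the real implementation would also need to honor the size threshold and handle GZ.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class UnzipAboveSketch {

    /** Extract every entry of the ZIP into a fresh dir under tmpRoot and return that dir. */
    static Path extractToTmp(Path zip, Path tmpRoot) throws IOException {
        Path target = Files.createTempDirectory(tmpRoot, "fscrawler-");
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                Path out = target.resolve(entry.getName()).normalize();
                if (!out.startsWith(target)) continue; // guard against zip-slip
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    try (InputStream in = zf.getInputStream(entry)) {
                        Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
        }
        return target;
    }

    /** Rewrite a doc URL so it points inside the original archive, not the tmp dir. */
    static String virtualUrl(Path zip, Path tmpDir, Path extractedFile) {
        Path relative = tmpDir.relativize(extractedFile);
        return zip.toUri() + "!/" + relative.toString().replace('\\', '/');
    }

    /** Recursively delete the tmp dir unless keep_tmp_files is set. */
    static void cleanup(Path tmpDir, boolean keepTmpFiles) throws IOException {
        if (keepTmpFiles) return;
        try (var paths = Files.walk(tmpDir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

With a `keep_tmp_files: true` setting the `cleanup` call would simply be skipped, leaving the extracted tree in place for debugging.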