feature: support parallel file processing #10

Open
drcapybara opened this issue Oct 17, 2023 · 1 comment
Labels
feature: Add new functionality to the library
Larger Project: Larger project sizing affecting multiple areas of the library. Not difficult, but more to consider.

Comments

@drcapybara
Owner

Use rayon to open a file and hash it in batches. This is part of the larger batch issue.
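A rough sketch of what "hash a file in batches" could look like. Everything here is an assumption for illustration: `CHUNK_SIZE`, `read_batch`, `hash_chunk`, `hash_batch_parallel`, and `hash_file` are hypothetical names, and `DefaultHasher` is only a non-cryptographic stand-in for the library's real hash. To keep the sketch free of external crates it uses `std::thread::scope` rather than rayon; with rayon the per-chunk loop would likely become `chunks.par_iter().map(...)`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs::File;
use std::hash::Hasher;
use std::io::{self, Read};
use std::thread;

/// Hypothetical chunk size; a real value would be tuned to memory limits.
const CHUNK_SIZE: usize = 64 * 1024;

/// Stand-in digest: std's DefaultHasher, NOT a cryptographic hash.
fn hash_chunk(chunk: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(chunk);
    hasher.finish()
}

/// Reads up to `batch` chunks of at most CHUNK_SIZE bytes each.
/// Returns fewer chunks (possibly none) once the reader is exhausted.
fn read_batch<R: Read>(reader: &mut R, batch: usize) -> io::Result<Vec<Vec<u8>>> {
    let mut chunks = Vec::new();
    for _ in 0..batch {
        let mut buf = Vec::with_capacity(CHUNK_SIZE);
        let n = reader
            .by_ref()
            .take(CHUNK_SIZE as u64)
            .read_to_end(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        chunks.push(buf);
    }
    Ok(chunks)
}

/// Hashes every chunk of one batch on its own scoped thread.
/// The output order matches the input order.
fn hash_batch_parallel(chunks: &[Vec<u8>]) -> Vec<u64> {
    thread::scope(|s| {
        let handles: Vec<_> = chunks
            .iter()
            .map(|chunk| s.spawn(move || hash_chunk(chunk)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("hash worker panicked"))
            .collect()
    })
}

/// Streams a file batch by batch, so only `batch * CHUNK_SIZE` bytes
/// are held in memory at any time.
fn hash_file(path: &str, batch: usize) -> io::Result<Vec<u64>> {
    let mut file = File::open(path)?;
    let mut digests = Vec::new();
    loop {
        let chunks = read_batch(&mut file, batch)?;
        if chunks.is_empty() {
            break;
        }
        digests.extend(hash_batch_parallel(&chunks));
    }
    Ok(digests)
}
```

This yields one digest per chunk. How those would be combined into a single file digest (for example a tree hash) is left open here, since a plain sequential hash cannot simply be split across threads.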

@drcapybara drcapybara changed the title from "support parallel file processing" to "feature: support parallel file processing" Oct 18, 2023
@drcapybara drcapybara added the feature and Larger Project labels Oct 20, 2023
@drcapybara
Owner Author

This will consist of multiple parts:

  1. We need to read in files of any size and convert them to byte arrays before passing them into the library.
  2. We should carefully take the size of the file into account. A sliding window will likely be needed so that large files are not loaded into memory all at once.
  3. There is an open question about how to handle in-place operations. They can be destructive if we do not account for interruptions in execution.
  4. We should use the rayon crate to perform file-based hashing in parallel if possible.
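On point 3, one common way to avoid leaving a half-written file behind is to write the result to a temporary file next to the original and rename it over the original only once the write has fully completed. The sketch below is a suggestion, not something decided in this issue: `replace_file_atomically` and the `.tmp` suffix are hypothetical, and for brevity it loads the whole file into memory, which a real version would replace with the windowed reading from point 2.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Replaces the contents of `path` with `transform(old contents)`.
/// The original file is left untouched until the new contents are
/// fully written and flushed to a sibling temp file.
fn replace_file_atomically<F>(path: &Path, transform: F) -> io::Result<()>
where
    F: FnOnce(&[u8]) -> Vec<u8>,
{
    // Simplification: reads everything at once (see lead-in).
    let original = fs::read(path)?;
    let output = transform(&original);

    // Hypothetical naming scheme: "<path>.tmp" in the same directory,
    // so the rename below stays on one filesystem.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(&output)?;
    tmp.sync_all()?; // ask the OS to flush data to disk before renaming
    drop(tmp);

    // If execution is interrupted before this line, `path` still holds
    // the original data; only a stray temp file may remain.
    fs::rename(&tmp_path, path)
}
```

On POSIX systems the final rename is atomic; behavior on other platforms can differ and would need checking before relying on it.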
