feat(processing): add multi-volume handler support #564
Wait, since when do you develop in Rust? :)
It is ready to be merged. I have commented on some minor things that would be better fixed now than lived with (Extractable.extractable_id).
I would also like to get #579 merged first, so that we have an up-to-date type checker as soon as possible.
The whole thing looks good! I guess we could take advantage of these changes later on to allow users to provide a directory rather than a file as part of the command line by using a |
There are certain formats where the content is split between multiple files. Currently unblob operates under the assumption that all content resides in a single file. A few examples where this might be relevant:

- multi-volume archives, such as 7zip, rar, etc.
- VM snapshots
- content + index type formats

This change introduces a DirectoryHandler, which can operate on multiple files residing in one directory, or at least under one subtree. For most formats there is a "main" file, which can be identified by a directory file name pattern. Using this first file, the handler can identify the other files and return a MultiFile object, similar to ValidChunks.

We do not support cases where a single file is part of multiple MultiFiles. A file processed and extracted in the context of a MultiFile is also not processed by traditional handlers. There is no carving step either; the files are extracted directly into an extraction directory. The original files are kept and never deleted, as these are normal files, unlike carved-out temporary chunks.

Files extracted from a MultiFile have a MultiFile as their parent. This required extending the current File -> Chunk reporting concept by introducing an abstract Blob type, which is the parent of both Chunk and MultiFile. MultiFileReports are reported under the directory Task, but contain all included file paths as well.
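To make the description above more concrete, here is a minimal sketch of the idea. It is not the code from this PR and not unblob's actual API. Only the names DirectoryHandler, MultiFile, Blob and Chunk come from the description. The class SplitArchiveDirectoryHandler, its methods, the `*.001` naming scheme and process_directory are assumptions made purely for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Blob:
    """Common parent of anything files can be extracted from (hypothetical fields)."""
    id: str


@dataclass
class Chunk(Blob):
    """A byte range carved out of a single file."""
    start_offset: int
    end_offset: int


@dataclass
class MultiFile(Blob):
    """A group of whole files that together form one extractable unit."""
    name: str
    paths: list[Path]


class SplitArchiveDirectoryHandler:
    """Toy directory handler for volumes named NAME.001, NAME.002, ... (assumed scheme)."""

    MAIN_FILE_PATTERN = "*.001"  # hypothetical pattern identifying the "main" file

    def find_main_files(self, directory: Path) -> list[Path]:
        return sorted(p for p in directory.glob(self.MAIN_FILE_PATTERN) if p.is_file())

    def calculate_multifile(self, main_file: Path) -> MultiFile:
        # Starting from the main file, collect every sibling volume with the same stem.
        stem = main_file.name[: -len(".001")]
        volume_re = re.compile(re.escape(stem) + r"\.\d{3}")
        volumes = sorted(
            p for p in main_file.parent.iterdir()
            if p.is_file() and volume_re.fullmatch(p.name)
        )
        return MultiFile(id=str(main_file), name=stem, paths=volumes)


def process_directory(
    directory: Path, handler: SplitArchiveDirectoryHandler
) -> tuple[list[MultiFile], list[Path]]:
    """Return the MultiFiles found and the files left over for per-file handlers."""
    claimed: set[Path] = set()
    multi_files: list[MultiFile] = []
    for main_file in handler.find_main_files(directory):
        multi_file = handler.calculate_multifile(main_file)
        if claimed.intersection(multi_file.paths):
            continue  # a file may belong to at most one MultiFile
        claimed.update(multi_file.paths)
        multi_files.append(multi_file)
    # Files claimed by a MultiFile are skipped by the traditional handlers.
    remaining = sorted(p for p in directory.iterdir() if p.is_file() and p not in claimed)
    return multi_files, remaining
```

In this sketch the original volume files are only read and listed, never moved or deleted, which mirrors the point that they are normal files rather than temporary carved chunks.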
#553