-
Notifications
You must be signed in to change notification settings - Fork 4
Add LevelDB #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add LevelDB #5
Conversation
|
Please let us know if there is anything we can do to move this PR forward or to ease the review process. |
If you could clone me, that'd be great. Unfortunately this is a huge PR and I simply have not gotten around to looking at it yet. Between reviewing all other PRs and working on large PRs myself, too, I'm simply stretched thin. |
Schamper
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How you doin'? There's a lot of unnecessary patterns in here (class level type hints for no apparent reason, methods that could easily be inlined or more easily replaced by inheritance, you can take my comments on that in the earlier files as generic comments over the rest as well (it's very slow to review a large PR on GitHub).
| @property | ||
| def records(self) -> Iterator[RecordKey]: | ||
| """Yield all records related to this store.""" | ||
|
|
||
| if self._records: | ||
| yield from self._records | ||
|
|
||
| # e.g. with "_https://google.com\x00\x01MyKey", the prefix would be "_https://google.com\x00" | ||
| prefix = RecordKey.prefix + self.host.encode("iso-8859-1") + b"\x00" | ||
| prefix_len = len(prefix) | ||
|
|
||
| for record in self._local_storage._leveldb.records: | ||
| if record.key[:prefix_len] == prefix: | ||
| key = RecordKey(self, record.key, record.value, record.state, record.sequence) | ||
| self._records.append(key) | ||
| yield key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things about this:
- A property generator doesn't feel very safe/stable
- The cache is dangerous: as soon as you do a single generator iteration (but don't exhaust the generator) you'll have an issue where it will only iterate the up-until-then read records.
- The cache in it's current implementation will yield duplicate records (it doesn't exit after reading from the cached records)
It's probably fine not caching this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Refactored to use lists instead of generators and no more caching.
| ) | ||
|
|
||
|
|
||
| class BlinkHostObjectHandlerDecodeError(v8serialize.DecodeV8SerializeError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Importing this file will now fail when the dependency is missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you suggest as a fix for this?
| ] | ||
|
|
||
| leveldb = [ | ||
| "cramjam>=2.11.0,<3", # required for snappy decompression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can hopefully soon be replaced with dissect.util once that's merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will remove cramjam as a dependency once fox-it/dissect.util#109 is merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't these files be in tests/_data/indexeddb for the IndexedDB ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use those files for the leveldb testing at the moment, so they are technically at the right spot.
Co-authored-by: Erik Schamper <1254028+Schamper@users.noreply.github.com>
Co-authored-by: Erik Schamper <1254028+Schamper@users.noreply.github.com>
Co-authored-by: Erik Schamper <1254028+Schamper@users.noreply.github.com>
Co-authored-by: Erik Schamper <1254028+Schamper@users.noreply.github.com>
This PR adds a LevelDB storage implementation to
dissect.database.Also adds support for serialization formats building on top of LevelDB: IndexedDB, and Chromium's LocalStorage and SessionStorage. Please let us know if these formats should be structured differently in this project.
Makes use of two (pure Python and/or Rust) dependencies: cramjam (for LevelDB Snappy decompression) and v8serialize (for IndexedDB v8 javascript object deserialization). We do not have the time or resources to port these dependencies to
dissect.utilordissect.*- hopefully these dependencies can be accepted.