Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

squashfs: add configurable block cache #206

Merged
merged 3 commits into from
Jan 2, 2024
Merged

Conversation

ncw
Copy link
Contributor

@ncw ncw commented Dec 31, 2023

This adds a configurable block cache as discussed in #205

This is done in 3 commits for ease of review - it compiles and the tests pass at each commit so they can be merged as 3 commits if desired.

With this cache this now performs within a factor of 2 of unsquashfs

  • squashfs: make readMetadata a method on FileSystem
  • squashfs: add LRU cache for blocks
  • squashfs: use the LRU cache for fragments and metadata

@ncw ncw changed the title fix 205 add cache squashfs: add configurable block cache Dec 31, 2023
deitch
deitch previously approved these changes Dec 31, 2023
Copy link
Collaborator

@deitch deitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't mind some performance test on this, but not for me to complain.

Do you want to add a comment somewhere (likely in squashfs.go, or in doc.go) about why we have a cache for this, capture some of your thoughts from the PR and issue?

@ncw
Copy link
Contributor Author

ncw commented Dec 31, 2023

I wouldn't mind some performance test on this, but not for me to complain.

Were you thinking of a Go benchmark?

Here are some with rclone - unpacking my favourite tensorflow 3GB squashfs with rclone

no caching 256 MB cache 128MB cache 64 MB cache 32 MB cache
206s 16.7s 17.5s 23.4s 54.s

Whereas unsquashfs takes 12.0s

Do you want to add a comment somewhere (likely in squashfs.go, or in doc.go) about why we have a cache for this, capture some of your thoughts from the PR and issue?

I've added some notes to the Filesystem.Read method which is where the cache gets initialized - this will appear in the Go doc and be hopefully obvious to the users.

Interestingly the squashfs kernel module uses the same design choice (which I hadn't noticed before) about having a fragment and metadata cache.

Note that this PR will conflict with #201 I think, which I'll fix up.

@ncw
Copy link
Contributor Author

ncw commented Dec 31, 2023

Note that this PR will conflict with #201 I think, which I'll fix up.

Oooh, you merged that while I was writing the above! It is now fixed.

@deitch
Copy link
Collaborator

deitch commented Dec 31, 2023

I think this is solid.

Do you want to throw a benchmarking doc in there? You captured much of it here, but this tends to get lost. Specifically either in the doc.go, or in the lru.go? Just showing how you created the file, how you used rclone, so we have some basis for understanding it in the future?

@ncw
Copy link
Contributor Author

ncw commented Jan 2, 2024

I added some info about benchmarks here

https://github.com/ncw/go-diskfs/blob/a7c64791e8ccc64f3bed0e0618e712224a1338b5/filesystem/squashfs/squashfs.go#L126-L141

What do you think - look ok?

@ncw
Copy link
Contributor Author

ncw commented Jan 2, 2024

Actually I forgot to say how I made the file, let me add that!

…#205

This also adds methods to set and read the size of the cache.
@ncw
Copy link
Contributor Author

ncw commented Jan 2, 2024

@deitch deitch merged commit dc1aa30 into diskfs:master Jan 2, 2024
19 checks passed
@ncw ncw deleted the fix-205-add-cache branch January 2, 2024 21:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants