Content/file addressed zfs dedupe layer #10552

Open
zenaan opened this issue Jul 10, 2020 · 9 comments
Labels
Type: Feature Feature request or new feature

@zenaan

zenaan commented Jul 10, 2020

A ZFS content-addressing layer, not too dissimilar to Git's content addressing, would be a simple, hero-tier plugin enhancement for ZFS filesystems.

It is simple because Git provides a well-proven and simple design for a content-addressing layer.

It is hero tier for the hopefully obvious reason that a content-addressed filesystem is naturally de-duplicating, with close to zero RAM overhead.

The basic design, as demonstrated by the many re-implementations of the Git content storage model, is simple to implement, at least in a trivial form. In the case of ZFS, other features such as compression do not need to be added, since ZFS already optionally compresses at the block layer.

To briefly hint at how content addressing works: the content of a file to be stored in the filesystem is "hashed", and this hash must be known to read the file back. To make this usable for the end user, a very simple map from end-user filename to content hash is maintained. This map could readily be a simple hierarchy of directories, say 4 layers deep, where each layer is the next character of the hash and the bottom layer contains the remaining characters of the file's hash.
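
As a rough userspace illustration of the idea above (not ZFS code), the following sketch stores each file's bytes under a path derived from its hash, with four one-character directory levels, and keeps a filename-to-hash map in a plain dictionary. SHA-256, the function names, and the in-memory map are all assumptions made for the example.

```python
import hashlib
import os

FANOUT_DEPTH = 4  # four directory levels, one hash character each


def object_path(store_root, digest):
    """Map a hex digest to <root>/h0/h1/h2/h3/<remaining characters>."""
    parts = list(digest[:FANOUT_DEPTH]) + [digest[FANOUT_DEPTH:]]
    return os.path.join(store_root, *parts)


def put(store_root, name_map, filename, data):
    """Store data under its content hash and record filename -> hash."""
    digest = hashlib.sha256(data).hexdigest()  # assumption: SHA-256 as the content hash
    path = object_path(store_root, digest)
    if not os.path.exists(path):  # identical content is written only once
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
    name_map[filename] = digest
    return digest


def get(store_root, name_map, filename):
    """Read a file back by looking up its hash in the name map."""
    with open(object_path(store_root, name_map[filename]), "rb") as f:
        return f.read()


def count_objects(store_root):
    """Count stored objects, i.e. distinct contents."""
    return sum(len(files) for _, _, files in os.walk(store_root))
```

Storing the same bytes under two different names leaves a single object on disk, which is the de-duplication effect described above.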

In order to mostly remove the extra hashing this might imply, and since ZFS already hashes every block for checksum/verification, simply reuse those hashes: e.g. XOR all the existing block hashes to create the file's "content address" or "content hash". There is no point adding a whole new layer of otherwise unneeded content hashing.
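
A minimal sketch of the XOR combination, under stated assumptions: SHA-256 over fixed-size chunks stands in for the per-block checksums ZFS already computes (this code does not read ZFS checksums), and the 128 KiB block size and function names are illustrative only.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # assumption: 128 KiB blocks


def block_checksums(data, block_size=BLOCK_SIZE):
    """Stand-in for per-block checksums: SHA-256 of each fixed-size chunk."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def xor_content_address(checksums):
    """XOR all block checksums together into one 32-byte content address."""
    acc = bytes(32)  # all-zero starting value
    for checksum in checksums:
        acc = bytes(a ^ b for a, b in zip(acc, checksum))
    return acc.hex()
```

One property of plain XOR worth keeping in mind: the result does not depend on block order, and two identical blocks cancel each other out, so distinct files can end up with the same address. An order-sensitive combination would avoid that at the cost of a little extra work.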

In most cases where it is useful, this may supplant the need for the existing ZFS dedupe layer.

This ZFS "git" backend would be a good first project for someone wanting to ease into C programming and get up to speed with ZFS internals.

See also:
- http://reprog.wordpress.com/2010/05/13/you-could-have-invented-git-and-maybe-you-already-have/
- https://stackoverflow.com/questions/8198105/how-does-git-store-files
- https://metacpan.org/pod/distribution/Perl-Repository-APC/eg/trimtrees.pl
- http://xmailserver.org/flcow.html
- https://devblogs.microsoft.com/devops/supercharging-the-git-commit-graph-iii-generations/

@snajpa
Contributor

snajpa commented Jul 10, 2020

Maybe I'm missing something, but in the case of Git, you are actually relying on the filesystem to resolve where a given hash is stored on disk, i.e. where the drive should read from to get the data.

If we transfer the concept down into ZFS, it would still have to keep an index of the hashes, not dissimilar to the dedup tables that already exist.

Or have I missed something? Maybe I didn't get the basics right here...

@zenaan
Author

zenaan commented Jul 10, 2020 via email

ahrens added the "Type: Feature" label on Jul 13, 2020

@shodanshok
Contributor

I'm not familiar with the internals of Git, but if I understand your proposal correctly, it would dedup identical files only. While that is good in itself, a more general block-level dedup approach (the one currently implemented in ZFS) seems superior.

For fast and cheap block-level dedup, one need look no further than VDO. It would be great if ZFS implemented something similar.

For file-level dedup, I think the correct long-term solution would be to support reflink and let the user trigger dedup by simply searching for identical files and reflinking them.
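
The "search for identical files" step could be sketched roughly as follows. This is only an illustration: it groups files by size and then by SHA-256 digest and reports the groups that match. The reflink step itself is left out, since it depends on filesystem support, and the function names are made up for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks to keep memory use bounded."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_groups(root):
    """Return lists of paths whose size and digest match (reflink candidates)."""
    by_size = defaultdict(list)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:  # a unique size cannot have a duplicate
            continue
        for path in paths:
            by_digest[(size, file_digest(path))].append(path)

    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Each returned group is a set of candidates that a user-triggered tool could then reflink together on a filesystem that supports it.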

@zenaan
Author

zenaan commented Aug 9, 2020 via email

@jittygitty

@zenaan @shodanshok During my research for my issue at #13349, I just came upon your discussion at https://zfsonlinux.topicbox.com/groups/zfs-discuss/Tfa22fbf65c5411f0
where @zenaan said:
"Even more reason to use a content addressed map layer (on top of zfs'
existing block etc backend) for dedupe :)

To make the content addresses require minimum cpu overhead, simply XOR the
block checksums that zfs already must calculate, and voilà, "free" dedupe."

And I thought: hmm, what you said seems similar to my thoughts on leveraging Linux FIEMAP and the existing Linux reflink code to implement cp --reflink, offline dedupe, etc.

@zenaan
Author

zenaan commented Apr 30, 2022 via email

@jittygitty

Implementing a content addressed map layer (on top of zfs' existing block etc backend) for dedupe, which would open up a world of git-like possibilities to boot, has to be conceptually one of the simplest sigma-/god- tier enhancements possible - for anyone wanting a quick start to ego boosting stardom in the FS/kernel world :)

I was afraid to say that out loud! But after 12 days of banging my head and torturing myself to pore through code and patches all the way from 2008 to now, I was starting to come to that conclusion too. That raised some uncomfortable questions about why these features, which the "community" has been begging and crying for all over the internet for the past TWELVE YEARS, have seemingly gotten very little serious attention from the project contributors/leaders. If you read my #13349 issue, you'll see I quoted someone in a Phoronix thread who accused OpenZFS of refusing to give the "real reason", which they claimed was OpenZFS's fear of license incompatibility. I didn't think that was the reason. But my ticket and my questions have been ignored by the leadership so far. Of course, they might just be busy, so I'll have to be a little patient. Yet if my questions keep getting ignored, sadly I'll have to conclude that the Phoronix guy was mouthing off about a conspiracy that turned out to be true.

Personally, I thought the reason for the lack of progress, given that these features should be very doable on Linux, was that most of the development was driven by companies working with illumos kernels or BSD. If the work would be easy on LINUX and not on their own operating systems, they weren't going to get paid by their employers to work on it.

I plan to post in #7545 and #9554 to ask whether the implementation they were attempting had to do extra work or workarounds for any GPL-only exports (i.e. EXPORT_SYMBOL_GPL), or whether it didn't hit any of those.

I ask because my questions in #11357 haven't gotten a reply from anyone for twelve days.
But again, I'm trying to be patient, since maybe "everyone" has just been too busy and they just need a few reminders.

@zenaan
Author

zenaan commented May 1, 2022 via email

@jittygitty

jittygitty commented May 1, 2022

@zenaan Hey, if you read my other post you'd see I thought the Phoronix post was not true, and that nobody was conspiring to hide the real reason as he was saying. In my opinion, it was simply that contributors had "other priorities", maybe not "LINUX" but BSD or illumos kernel based distributions. If a feature was ONLY EASY to do on "LINUX" and "NOT" on the other distributions, they would naturally be working on their own distributions instead, regardless of whether the feature would be easier to do on Linux. That said, if I never get responses to my simple questions, like those in #11357, then and only then might I start to believe that I was "wrong", and that perhaps somebody thinks the licensing issue with Linux is the reason why some things that should be easier on Linux than on all the other distros were not done.

I've always been grateful for everyone's contributions, and I even respect their wish to work on the distributions important to them, even if it's not "Linux".

Anyway, not sure if you've seen my recent post about crowdfunding issues/features:
#13397

5 participants