FUSE: reflect deduplication in allocated blocks #184

dnnr · 2015-01-23T16:13:52Z

Instead of giving all files a fixed block count of 1, this assigns each
deduplicated chunk to exactly one file. In effect, the cumulative file
size shown in the mount point accurately reflects the amount of
actual disk space needed for the repository (barring metadata overhead).

Although the block assignment is done arbitrarily, depending on the
user's access pattern, the sizes will be consistent within the entire
mount point. This facilitates the use of tools like du and ncdu for
inspecting the actual disk usage in a repository as opposed to just
looking at the original, uncompressed, non-deduplicated file sizes.
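
A minimal sketch of the idea, not the code from this PR: each chunk is
charged to the first file that references it, and later references cost
nothing. The names `assign_blocks`, `files`, and `chunk_sizes` are
hypothetical, and rounding up to 512-byte units is an assumption.

```python
BLOCK_SIZE = 512  # st_blocks is conventionally counted in 512-byte units


def assign_blocks(files, chunk_sizes):
    """Return a block count per file, charging each chunk only once.

    files:       mapping of path -> list of chunk ids, in access order
                 (hypothetical structure, for illustration only)
    chunk_sizes: mapping of chunk id -> stored size in bytes
    """
    seen = set()   # chunk ids that have already been charged to some file
    blocks = {}
    for path, chunk_ids in files.items():
        new_bytes = 0
        for chunk_id in chunk_ids:
            if chunk_id not in seen:
                seen.add(chunk_id)
                new_bytes += chunk_sizes[chunk_id]
        # Ceiling division: partially filled blocks still count as one block.
        blocks[path] = -(-new_bytes // BLOCK_SIZE)
    return blocks


if __name__ == "__main__":
    files = {"a": [1, 2], "b": [2, 3], "c": [1, 2]}
    chunk_sizes = {1: 1024, 2: 512, 3: 100}
    print(assign_blocks(files, chunk_sizes))
```

Which file ends up "owning" a shared chunk depends on the iteration
order, which is why the per-file numbers are arbitrary while the total
stays meaningful.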


ThomasWaldmann · 2015-03-06T22:09:50Z

Can we have some opinions on this PR?

Is there a chance that this might confuse users if the block counts are more or less random compared to the original file size?

dnnr · 2015-03-06T22:29:51Z

On the one hand, yes. But on the other hand, those values are currently simply set to 1, i.e., they're mostly wrong and meaningless anyway. And more importantly: I'd say that the semantics of that field are actually correct this way. It's supposed to represent the "size used on disk" and is therefore supposed to be potentially arbitrarily different from the nominal file size, exactly because of the effects caused by compression, deduplication, sparse files, or whatever else is going on in the underlying file system.

So of course someone might claim to be confused by those values, but I actually can't think of any better way of populating st_blocks that wouldn't be at least equally confusing. At least this way it's consistent.
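
For illustration, the difference between the nominal size and the "size used on disk" can be inspected with a short script like the following. This is only a sketch: the helper name `size_and_allocated` is made up here, `st_blocks` is only available on Unix-like systems, and the 512-byte unit is the usual convention rather than something guaranteed for every file system.

```python
import os
import tempfile


def size_and_allocated(path):
    """Return (nominal size, allocated bytes) for a file.

    The nominal size comes from st_size; the allocated size is derived
    from st_blocks, assuming the usual 512-byte block unit.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * 5000)
        name = f.name
    try:
        nominal, allocated = size_and_allocated(name)
        # The two values may differ in either direction, depending on
        # the file system (block rounding, compression, sparse regions).
        print(nominal, allocated)
    finally:
        os.unlink(name)
```

Tools like du report the second value, which is why populating st_blocks matters for them.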

ThomasWaldmann mentioned this pull request Mar 8, 2015

Some file operations not working correctly in mounted repo #170

Open

maltefiala mentioned this pull request May 14, 2015

Dealing with attic issues borgbackup/borg#5

Closed
