Dear g2p,
It is interesting to see that I am the first to leave an issue. This is an excellent tool and I hope that this kind of feature makes it into btrfs one day. Many thanks for putting the effort in.
I am using bedup to de-duplicate large backups stored on a btrfs filesystem. For me the advantages/features of this approach are:
- compression through btrfs
- offline file dedup through bedup
- acl/xattr stored natively
- no restore tool/user management necessary as backups are accessible as mount
My current problem is that bedup uses excessive amounts of memory (in my case more than 12GB) in dedup_tracked(sess, volset, tt). This is for an inode table with 1.6 million rows and happens when 'comm1.inodes' is executed.
Output:
41:02.0 Updated 1480885 items
00.00 Partial hash of same-size groups 0/109721
Note: None of the common sets should be larger than 20 files.
I cannot see why this much memory is necessary. Without SQL I would simply traverse the size-sorted inode list and emit each same-size group as it completes ...
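For illustration, here is a minimal sketch of the streaming traversal I have in mind (toy data and a hypothetical `same_size_groups` helper, not bedup's actual code): since the inodes arrive sorted by size, only the current group ever needs to be held in memory.

```python
from itertools import groupby
from operator import itemgetter

def same_size_groups(inodes):
    """Yield (size, [inode ids]) groups of same-size files.

    inodes: iterable of (size, inode_id) tuples, already sorted by size.
    Memory use is bounded by the largest single group, not the table size.
    """
    for size, group in groupby(inodes, key=itemgetter(0)):
        members = [ino for _, ino in group]
        if len(members) > 1:  # only groups with potential duplicates matter
            yield size, members

# Toy example: six inodes, two candidate groups
inodes = [(100, 1), (100, 2), (200, 3), (300, 4), (300, 5), (300, 6)]
print(list(same_size_groups(inodes)))
# [(100, [1, 2]), (300, [4, 5, 6])]
```

With groups capped at ~20 files, this keeps memory roughly constant regardless of how many rows are in the inode table.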
Best Regards,
Henrik