Floyddotnet / duperemove
forked from markfasheh/duperemove
Branch: digest_trigger
Commits on Nov 3, 2016
-
add "IF NOT EXISTS" to table and trigger creation statements
Floyddotnet committed Nov 3, 2016 -
outsource database creation from dbfile_create into __dbfile_create
Floyddotnet committed Nov 3, 2016 -
fix sql error in create_triggers
Floyddotnet committed Nov 3, 2016 -
rewrite GET_DUPLICATE_HASHES to use the new digest table
Floyddotnet committed Nov 3, 2016 -
call create_triggers in dbfile_create
Floyddotnet committed Nov 3, 2016 -
move create_indexes into dbfile_create
Floyddotnet committed Nov 3, 2016 -
add hashes table trigger create statement
Floyddotnet committed Nov 3, 2016 -
add digest table index create statement
Floyddotnet committed Nov 3, 2016 -
add digest table create statement
Floyddotnet committed Nov 3, 2016 -
Floyddotnet committed Nov 3, 2016
Commits on Sep 30, 2016
-
Move some FAQ items from the wiki into the man page
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 30, 2016
Commits on Sep 29, 2016
-
Use proper len of tail blocks during block dedupe
We were submitting every block at blocksize, but the last blocks of a file must be submitted at their actual length. Otherwise we were missing dedupe on those blocks and returning EINVAL.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 29, 2016 -
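The tail-block fix described above comes down to clamping the request length to what is left in the file. The following is a minimal sketch of that calculation, not the code from the commit; the function name `dedupe_block_len` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from duperemove): length to submit for
 * the block that starts at byte offset `off` in a file of `file_size`
 * bytes. Full blocks are submitted at `blocksize`; the tail block is
 * submitted at the number of bytes that actually remain.
 */
uint64_t dedupe_block_len(uint64_t file_size, uint64_t off, uint64_t blocksize)
{
    assert(blocksize > 0 && off < file_size);

    uint64_t remaining = file_size - off;
    return remaining < blocksize ? remaining : blocksize;
}
```

With a 128 KiB block size and a 300000-byte file, the first two blocks would be submitted at 131072 bytes and the third at the remaining 37856 bytes.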
run_dedupe: avoid size_list corruption in push_blocks()
Each thread frees its corresponding dups list structure, and in doing so removes it from the size list. When this happens while push_blocks() is still walking the list, the list is corrupted for push_blocks(). Remove the structure from the size_list in push_blocks() instead so this situation cannot arise later.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 29, 2016
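The size_list fix above follows a common pattern: the code that walks a list unlinks each entry itself before handing it off, so whoever frees the entry later never touches the list. The sketch below illustrates that pattern only. The structure and function names (`dupe_entry`, `push_entries`, and the list helpers) are hypothetical stand-ins, and a plain array stands in for the worker queue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dups list structure; illustrative only. */
struct dupe_entry {
    struct dupe_entry *prev, *next; /* links on the size list */
    int id;
};

/* Circular doubly linked list with a sentinel head. */
void size_list_init(struct dupe_entry *head)
{
    head->prev = head->next = head;
}

void size_list_add_tail(struct dupe_entry *head, struct dupe_entry *e)
{
    e->prev = head->prev;
    e->next = head;
    head->prev->next = e;
    head->prev = e;
}

void size_list_del(struct dupe_entry *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->prev = e->next = e; /* self-linked: a later delete is a no-op */
}

/*
 * Walk the list and hand each entry to the queue. The walker saves the
 * next pointer and unlinks the entry before handing it off, so a worker
 * that frees the entry cannot disturb the walk.
 */
size_t push_entries(struct dupe_entry *head, struct dupe_entry **queue, size_t max)
{
    assert(head != NULL && queue != NULL);

    size_t n = 0;
    struct dupe_entry *e = head->next;

    while (e != head && n < max) {
        struct dupe_entry *next = e->next;

        size_list_del(e);
        queue[n++] = e;
        e = next;
    }
    return n;
}
```

Because an unlinked entry is left self-linked, a second removal by the code that frees it is harmless in this sketch.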
Commits on Sep 27, 2016
-
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 27, 2016 -
FAQ: Add entry about interrupting the program
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 27, 2016 -
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 27, 2016 -
FAQ: Add entry about memory usage
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 27, 2016 -
Add note in FAQ about breaking up large data sets
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 27, 2016 -
Point to master branch for latest code in README
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 27, 2016 -
manpage: Add FAQ entry on hashfile size
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 27, 2016 -
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 27, 2016
Commits on Sep 26, 2016
-
The idea of find-dupes is a great one - we want to cut down on the number of extent references placed on disk by building extents out of our dupe blocks tree. The problem is that we've never been able to get this to perform reasonably well and give good dedupe results at the same time.

The design doc in our wiki has the full details but the most relevant excerpt would be:

We're trying to balance at least 3 very important resources:
- cpu usage
- memory usage
- quality of dedupe

Right now we catch all possible extents (100% dedupe quality) at the expense of a ton of memory and CPU. Turning down the quality in favor of fewer expended resources tends to get us in situations where the pattern of dedupe is seemingly random, or we always miss at least some obvious cases (such as identical files).

We can continue to experiment until we get something that works well - there are still many options going forward. In the meantime however, the number of bug reports I have received where find-dupes is a severe performance problem is too high. We want to ensure a smooth user experience, especially for those with large dedupe sets, so make find-dupes optional.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 26, 2016 -
find_dupes.c: Throttle the number of compares we alloc and queue
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 26, 2016 -
debug: Print some stats totals
Give us dupe, deduped, % totals for filerec stats
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 26, 2016 -
Allow building memstats locking without DEBUG
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 26, 2016
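The memstats commits in this log concern taking a lock around allocation counters and making that locking selectable at build time. The following is a rough sketch of what such a guarded counter can look like, assuming a pthread mutex. The `LOCK_MEMSTATS` switch, the `struct memstats` layout, and the function names are hypothetical and are not taken from duperemove.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical build switch: build with -DLOCK_MEMSTATS=0 to drop the locking. */
#ifndef LOCK_MEMSTATS
#define LOCK_MEMSTATS 1
#endif

/* Hypothetical counter pair protected by one mutex. */
struct memstats {
    pthread_mutex_t lock;
    unsigned long long nr_allocs;
    unsigned long long nr_frees;
};

#define MEMSTATS_INIT { PTHREAD_MUTEX_INITIALIZER, 0, 0 }

static void memstats_lock(struct memstats *ms)
{
#if LOCK_MEMSTATS
    pthread_mutex_lock(&ms->lock);
#else
    (void)ms; /* locking compiled out */
#endif
}

static void memstats_unlock(struct memstats *ms)
{
#if LOCK_MEMSTATS
    pthread_mutex_unlock(&ms->lock);
#else
    (void)ms;
#endif
}

/* Record one allocation. */
void memstats_count_alloc(struct memstats *ms)
{
    assert(ms != NULL);
    memstats_lock(ms);
    ms->nr_allocs++;
    memstats_unlock(ms);
}

/* Record one free. */
void memstats_count_free(struct memstats *ms)
{
    assert(ms != NULL);
    memstats_lock(ms);
    ms->nr_frees++;
    memstats_unlock(ms);
}

/* Number of objects currently allocated, read under the same lock. */
unsigned long long memstats_in_use(struct memstats *ms)
{
    assert(ms != NULL);
    memstats_lock(ms);
    unsigned long long in_use = ms->nr_allocs - ms->nr_frees;
    memstats_unlock(ms);
    return in_use;
}
```

Keeping the lock behind a compile-time switch lets the cost of locking be measured before enabling it in every build, which is the concern the Sep 22 commit message raises.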
Commits on Sep 25, 2016
-
Merge pull request markfasheh#152 from ian-kelling/master
docs: fix extra dash in man page
markfasheh committed Sep 25, 2016 -
docs: fix extra dash in man page
Signed-off-by: Ian Kelling <ian@iankelling.org>
ian-kelling committed Sep 25, 2016
Commits on Sep 22, 2016
-
Better documentation for -b switch in man page
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 22, 2016 -
Better documentation for --hash switch in man page
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 22, 2016 -
Lock around our memstats counts when a debug build is enabled.
We might be fine using this in all builds but it will require testing first to be sure we don't take a noticeable performance hit.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 22, 2016 -
factor out variable handling code in alloc tracking
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 22, 2016 -
Merge pull request markfasheh#147 from petzah/buildflags
Add C preprocessor flags
markfasheh committed Sep 22, 2016
Commits on Sep 21, 2016
-
petzah committed Sep 21, 2016 -
Print when we are a debug build
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Mark Fasheh committed Sep 21, 2016