Add hasher backend and KV store #5587

ivandeex · 2021-09-09T07:20:10Z

What is the purpose of this change?

This PR adds new hasher backend after preliminary design proposal in #949 (comment). The actual design is described below and solves the following Goals:

Fix slow hashing of large files on LocalFS, FTP, SFTP
Emulate hash types unimplemented by backends
Warm up checksum cache from external SUM files for large pre-hashed data arrays
Implemented as a transitive backend because VFS does NOT need checksums but CLI operations DO

Getting Started

So you want to cache or otherwise handle checksums for existing myRemote:path or /local/path?
Install patched rclone.
Read documentation.
Open ~/.config/rclone.conf in a text editor and add new section(s):

[Hasher]
type = hasher
remote = myRemote:path
hashes = md5
max_age = 30d

[Hasher2]
type = hasher
remote = /local/path
hashes = dropbox,sha1
max_age = 24h

The backend takes basically the following parameters:

remote (like alias, required)
hashes - comma separated list (by default md5,sha1)
max_age - maximum time to keep a checksum value, e.g. 1h30m or 30d
0 will disable caching completely,
off will cache "forever" (i.e. until the files get changed)

UPDATE: hash_names renamed to hashes, can be separated by comma only (blanks not allowed anymore)

Use it as Hasher2:subdir/file instead of base remote. Hasher will transparently update cache with new checksums when a file is fully read or overwritten, like:

rclone copy External:path/file Hasher:dest/path

rclone cat Hasher:path/to/file > /dev/null

The way to refresh all cached checksums (even unsupported by the base backend) for a subtree is to re-download all files in the subtree. For example, use hashsum --download using any supported hashsum on the command line (we just care to re-read):

rclone hashsum MD5 --download Hasher:path/to/subtree > /dev/null

rclone backend dump Hasher:path/to/subtree

How It Works

rclone hashsum (or md5sum or sha1sum):

if requested hash is supported by lower level, just pass it.
if object size is below auto_size then download object and calculate requested hashes on the fly.
if unsupported and the size is big enough, build object fingerprint (including size, modtime if supported, first-found other hash if any).
if the strict match is found in cache for the requested remote, return the stored hash.
if remote found but fingerprint mismatched, then: purge the entry and proceed to step 6.
if remote not found or had no requested hash type or after step 5: download object, calculate all supported hashes on the fly and store in cache; return requested hash.

Other operations:

whenever a file is uploaded or downloaded in full, capture the stream to calculate all supported hashes on the fly and update database
server-side move will update keys of existing cache entries
delete will remove cache entry
purge will remove all cache entries underneath
periodically prune entries (if max_age is not off)
(this helps against bit-rot or database thrashing when a file got removed externally)

Pre-Seed from a SUM File

Hasher supports two backend commands: generic SUM file import and faster but less consistent stickyimport.

rclone backend import Hasher:dir/subdir SHA1 /path/to/SHA1SUM [--checkers 4]

Instead of SHA1 it can be any hash supported by the remote. The last argument can point to either a local or an other-remote:path text file in SUM format. The command will parse the SUM file, then walk down the path given by the first argument, snapshot current fingerprints and fill in the cache entries correspondingly.

Paths in the SUM file are treated as relative to hasher:dir/subdir.
The command will not check that supplied values are correct. You must know what you are doing.
This is a one-time action. The SUM file will not get "attached" to the remote. Cache entries can still be overwritten later, should the object's fingerprint change.
The tree walk can take long depending on the tree size. You can increase --checkers to make it faster. Or use stickyimport if you don't care about fingerprints and consistency.

rclone backend stickyimport hasher:path/to/data sha1 remote:/path/to/sum.sha1

stickyimport is similar to import but works much faster because it does not need to stat existing files and skips initial tree walk. Instead of binding cache entries to file fingerprints it creates sticky entries bound to the file name alone ignoring size, modification time etc. Such hash entries can be replaced only by purge, delete, backend drop or by full re-read/re-write of the files.

Cache Storage

Note: Details updated after recent KV redesign

Cached checksums are stored as bolt database files under ~/.cache/rclone/kv/, one per base backend, named like BaseRemote~hasher.bolt. Checksums for multiple alias-es into a single base backend will be stored in the single database. All local paths are treated as aliases into the local backend (unless crypted or chunked) and stored in ~/.cache/rclone/kv/local~hasher.bolt.

By default database is opened in read-only mode and can be shared by multiple rclone instances. When a change is needed, rclone will reopen database in exclusive read-write mode for a moment and reopen it back for sharing after changing to let other instances access it.

The bolt engine shows good performance for databases with up to a million entries on ordinary average hardware. Reopening is cheap and takes less than a millisecond.

You can print or drop database using custom backend commands:

rclone backend dump Hasher:dir/subdir

rclone backend drop Hasher:

Boilerplate

Was the change discussed in an issue or in the forum before?

Fixes #949
Fixes #157
Fixes #626

Checklist

I have read the contribution guidelines.
I have added tests for all changes in this PR if appropriate.
I have added documentation for the changes if appropriate.
All commit messages are in house style.
I'm done, this Pull Request is ready for review :-)

ncw · 2021-09-09T17:21:32Z

This is a great implementation @ivandeex. I was imagining that we'd patch all the backends but this is much better.

I wonder whether we should support easy integration of this as nested remotes are a bit of a pain to set up... I had the idea that wrapping backends should support doing :hasher:remote:mypath. This is easy to do, if remote is empty in NewFs then use root instead. I made a zip and a tar backend using this and it worked quite well. What do you think?

Documentation's not in the patch yet.
After a code review I'll slightly edit the top post here and add to the commit.

I'll give it a quick once over now.

Generic backend integration test passed successfully.

Unit tests for hasher specific features not done yet. Probably in the next PR.

Code quality test is addressed by build: apply gofmt from golang 1.17 #5588
(will rebase when that's merged)

Reviewed :-)

Betas to play with are published as iva releases at https://github.com/ivandeex/rclone/releases

I'd like to get all devs on the same page with releasing test binaries to users. At the moment if you push to a branch on the main repo you'll get betas built etc. What I'd like to do is make it so that if a team member makes a PR then this also makes a beta - what do you think of that idea? Ideally we'd make some actions bot which posted a link in the PR once it was built.

ivandeex · 2021-09-09T18:05:31Z

I wonder whether we should support easy integration of this as nested remotes are a bit of a pain to set up...

👍 👍 👍

I had the idea that wrapping backends should support doing :hasher:remote:mypath. This is easy to do, if remote is empty in NewFs then use root instead.

👍 An API like fs.NewFsChain(":wrapper,wrap_opt1=val1:remote,inner_opt1=val2:mypath") for

building an FS chain from extended remote path and
returning the top-most Fs from the new chain

seems reasonable and straightforward in the case when wrapping remote has empty root relative to the inner remote (we could even stuff the chain builder right into good old fs.NewFs ???)

👍 FS chains with empty wrapper root should be straightforward on the command line too.

👎 I can't imagine INI syntax for FS chains even with empty wrapper root. Probably needs TOML, not INI ???

👎 I can't imagine what would extended remote path be for wrappers with non-empty root like :wrapper,opt1=val1[[[wrapper-root-goes-somewhere-here]]]:remote,opt2=val2:/path/to/remote/root

I'll give it a quick once over now.
[5min]
Reviewed :-)

I will add docs tomorrow. I'm not nearly as lightning fast :)

I'd like to get all devs on the same page with releasing test binaries to users. At the moment if you push to a branch on the main repo you'll get betas built etc.

I'd be happy to make a beta branch for bisync!
Upd: https://github.com/rclone/rclone/tree/pr-bisync

What I'd like to do is make it so that if a team member makes a PR then this also makes a beta - what do you think of that idea? Ideally we'd make some actions bot which posted a link in the PR once it was built.

I don't know. I just can test out whatever you make.

ivandeex · 2021-09-09T21:07:11Z

@ncw Please verify whether this patch fixes #157

ncw

I've given this a quick once over - looking great!

I'd like to have a discussion about the use of bolt? I'd like to only have one KV database in rclone (once we have disabled the cache backend). Is bolt the right one?

backend/hasher/bolt.go

backend/hasher/hasher.go

backend/hasher/hasher_test.go

backend/hasher/object.go

docs/content/hasher.md

ncw · 2021-09-10T10:04:55Z

@ivandeex wrote:

I wonder whether we should support easy integration of this as nested remotes are a bit of a pain to set up...
I had the idea that wrapping backends should support doing :hasher:remote:mypath. This is easy to do, if remote is empty in NewFs then use root instead.

An API like fs.NewFsChain(":wrapper,wrap_opt1=val1:remote,inner_opt1=val2:mypath") for
building an FS chain from extended remote path and returning the top-most Fs from the new chain seems reasonable and straightforward in the case when wrapping remote has empty root relative to the inner remote (we could even stuff the chain builder right into good old fs.NewFs ???)

I was thinking that if you called wrapper.NewFs with remote="" it would say, ah no remote, lets use root instead.

This would enable this syntax to just work...

:wrapper,wrap_opt1=val1:remote,inner_opt1=val2:mypath
wrapper.NewFs would be called with opt={wrap_opt1=val1}, root=remote,inner_opt1=val2:mypath

It would then decide that since its config remote was empty to set remote = remote,inner_opt1=val2:mypath and root="" and then call remote.NewFs with inner_opt and root=mypath.

So I don't think a new API is needed as long as we are happy with that syntax.

FS chains with empty wrapper root should be straightforward on the command line too.
I can't imagine INI syntax for FS chains even with empty wrapper root. Probably needs TOML, not INI ???
I can't imagine what would extended remote path be for wrappers with non-empty root like :wrapper,opt1=val1[[[wrapper-root-goes-somewhere-here]]]:remote,opt2=val2:/path/to/remote/root

Yes that is a potential problem. So we are saying that if we had a hasher set up on /backup how do we access a path /backup/current?

What would hasher do if it you set up a hasher for /backup and then tried to use /backup/current. Is that effectively a new hasher? The VFS cache is designed so that you can do that, but it is quite a heavy load to impose on all backends.

Here are some ideas

:wrapper,opt1=val1,remote='remote,opt2=val2:/backup':current  # this would work now
:wrapper,opt1=val1,root=current:remote,opt2=val2:/backup         # use root as a parameter as well?
:wrapper,opt1=val1:remote,opt2=val2:/backup//current                # parse off wrapper root from end?

I'd like to get all devs on the same page with releasing test binaries to users. At the moment if you push to a branch on the main repo you'll get betas built etc.

I'd be happy to make a beta branch for bisync! Upd: https://github.com/rclone/rclone/tree/pr-bisync

Great :-)

What I'd like to do is make it so that if a team member makes a PR then this also makes a beta - what do you think of that idea? Ideally we'd make some actions bot which posted a link in the PR once it was built.

I don't know. I just can test out whatever you make.

I figured out how to do everything except work out if the PR is from a team member, but I'll carry on!

[ivandeex] I intended to edit a quoted reply but edited your post by mistake. I tried to do the best to restore.

ivandeex · 2021-09-10T11:40:32Z

I'd like to only have one KV database in rclone (once we have disabled the cache backend).

Currently only hasher plans to use it.
Probably the resumer framework can pretend to have a kind of resumption store.

We could however make it a common API:

factor hasher/bolt.go to lib/kv/bolt.go
rename interface hashOp to kvOp
make --hasher-lock-time a global option --kv-lock-time
change database location from ~/.cache/rclone/hasher/REMOTE to ~/.cache/rclone/kv/REMOTE
run a central pool of database goroutines per rclone instance (one goroutine per database)
factor hasher.Fs.do(hasherOp) into lib/kv.Do(remote, kvOp)
develop new kinds of kvOp to satisfy more internal consumers if any (after this PR)
etc

I'd be happy to factor KV store away and make the backend just use an API.
Anyway it will be a transient cache-oriented KV store not suitable for object metadata as the latter will have to be shared between rclone instances on multiple computers.

Is bolt the right one?

bolt is a satisfactory one.

ivandeex · 2021-09-10T12:11:44Z

What would hasher do if it you set up a hasher for /backup and then tried to use /backup/current.

Hasher will jump in if you setup Hasher{remote=/backup} and use the path Hasher:current.
I provided examples in the top post here (and will make it into the backend docs).

Is that effectively a new hasher?

No. /backup/current is a short for local:/backup/current handled by the local backend.
Hasher does not install hooks in the local backend. It reacts to its own paths.

The VFS cache is designed so that you can do that, but it is quite a heavy load to impose on all backends.

VFS jumps in when you mount VFS controlled stuff somewhere and access the mount point.
In fact you install a hook in the kernel and let it trigger.
rclone mounts do not use file checksums so far so Hasher is irrelevant here (maybe in future mounts will return checksums as extended file attributes).

ivandeex · 2021-09-10T12:26:16Z

Currently wrapping(transient) rclone backends only provide means for wrapping other remotes. You are talking about "injecting" (or mixing in) behaviors of transient backends into other remotes. I didn't set such a goal. rclone lacks internal APIs for such mixins. It's out of scope for this PR.

ivandeex · 2021-09-10T14:25:56Z

Here are some ideas

:wrapper,opt1=val1,remote='remote,opt2=val2:/backup':current  # this would work now
:wrapper,opt1=val1,root=current:remote,opt2=val2:/backup      # use root as a parameter as well?
:wrapper,opt1=val1:remote,opt2=val2:/backup//current          # parse off wrapper root from end?

Let me experiment with your sketch for some time. Maybe I can deliver a PR with new rpath parser in 2-3 months.
At any rate I don't have enough time to complete it before 1.57.

@ncw
I planned to use September for publishing a number of patches developed this summer:
a few ftp fixes, automount support, mount helper, hasher...
and probably bisync (only docs fixes and if you accept it without --backup-dir, modtime support and sessions, otherwise I'll keep it on beta branch and postpone till 1.58).

ivandeex · 2021-09-10T19:23:51Z

I was imagining that we'd patch all the backends but this is much better.

Probably adding optional hashes parameter to every backend and wrapping backend.Object.Hash is a viable alternative......

Update: hashes was previously known as hash_names

ivandeex

:(

backend/hasher/hasher.go

ncw · 2021-09-11T13:30:11Z

I'd like to only have one KV database in rclone (once we have disabled the cache backend).

Currently only hasher plans to use it.
We could however make it a common API:

factor hasher/bolt.go to lib/kv/bolt.go

rename interface hashOp to kvOp

make --hasher-lock-time a global option --kv-lock-time

change database location from ~/.cache/rclone/hasher/REMOTE to ~/.cache/rclone/kv/REMOTE

This should probably have a separate name for each user ~/.cache/rclone/kv/hasher/REMOTE ~/.cache/rclone/kv/VFS/REMOTE (or something!)

run a central pool of database goroutines per rclone instance (one goroutine per database)

etc

Bolt is really the right solution for the VFS metadata - when there gets to be a lot of it, scanning lots of little JSON files is very slow. However I didn't want to commit to Bolt at the time and was wary of the multiple process limitations.

Another place I'd like a KV database is doing syncs larger than memory - eg sync two 20 million entry directories - without using 20 GB of RAM.

I think your idea to stick it in lib is a good one and probably worth doing now.

Is bolt the right one?

bolt is a satisfactory one.

I was looking also at badger however my main concern is longevity and stability rather than absolute performance.

Currently wrapping(transient) rclone backends only provide means for wrapping other remotes. You are talking about "injecting" (or mixing in) behaviors of transient backends into other remotes. I didn't set such a goal. rclone lacks internal APIs for such mixins. It's out of scope for this PR.

Yes lets discuss this later. I just wanted to share my ideas.

Here are some ideas
:wrapper,opt1=val1,remote='remote,opt2=val2:/backup':current  # this would work now
:wrapper,opt1=val1,root=current:remote,opt2=val2:/backup         # use root as a parameter as well?
:wrapper,opt1=val1:remote,opt2=val2:/backup//current                # parse off wrapper root from end?
I planned to use September for publishing a number of prepared patches: ftp, autofs mounts, probably bisync.
Let me experiment with your sketch for a month. Maybe I can deliver a PR with new rpath parser in 2-3 months.

It would be a good idea to finish some of the things we've started. I've got a list of things too sitting in branches here.

ivandeex · 2021-09-11T14:26:01Z

This should probably have a separate name for each user ~/.cache/rclone/kv/hasher/REMOTE ~/.cache/rclone/kv/VFS/REMOTE (or something!)

Alternatively we could use multiple bolt buckets per database, but the more files the better due to fine grained locking.

The "user" approach also calls for:

extended API like kv.Do(ctx context.Context, user string, f fs.Fs, op kvOp) error
something like a map[user_string][fs.Fs]func (...) (pool) of goroutines
a common function to clean and encode remote names like kv.CleanFsName(fs.Fs)

Another place I'd like a KV database is doing syncs larger than memory - eg sync two 20 million entry directories - without using 20 GB of RAM.

I also had an idea to use a struct Trie (details don't matter a.t.m.) for large recursive listings to replace long paths by 32-bit integers around the code. Could save some RAM especially in sync operations.

I was looking also at badger however my main concern is longevity and stability rather than absolute performance.

If we make it an API then "procedural users" will accept anything.
If we decide to switch to another database engine later, the "human users" will notice just a one-time delay due to database regeneration.

Yes lets discuss this later. I just wanted to share my ideas.

I'll create a dedicated ticket and copy the discussion there.

It would be a good idea to finish some of the things we've started.

Holy true.

ncw · 2021-09-12T12:10:50Z

This should probably have a separate name for each user ~/.cache/rclone/kv/hasher/REMOTE ~/.cache/rclone/kv/VFS/REMOTE (or something!)

Alternatively we could use multiple bolt buckets per database, but the more files the better due to fine grained locking.

I like the idea of one file per "purpose" if you see what I mean.

The "user" approach also calls for:

extended API like kv.Do(ctx context.Context, user string, f fs.Fs, op kvOp) error

something like a map[user_string][fs.Fs]func (...) (pool) of goroutines

a common function to clean and encode remote names like kv.CleanFsName(fs.Fs)

There is the stuff in the VFS cache which has had many patches and we should probably start from! This could be factored out into fs maybe?

rclone/vfs/vfscache/cache.go

Lines 78 to 100 in b35db61

    
           // Get cache root path. 
        
           // We need it in two variants: OS path as an absolute path with UNC prefix, 
        
           // OS-specific path separators, and encoded with OS-specific encoder. Standard path 
        
           // without UNC prefix, with slash path separators, and standard (internal) encoding. 
        
           // Care must be taken when creating OS paths so that the ':' separator following a 
        
           // drive letter is not encoded (e.g. into unicode fullwidth colon). 
        
           var err error 
        
           parentOSPath := config.CacheDir // Assuming string contains a local path in OS encoding 
        
           if parentOSPath, err = filepath.Abs(parentOSPath); err != nil { 
        
           	return nil, errors.Wrap(err, "failed to make --cache-dir absolute") 
        
           } 
        
           fs.Debugf(nil, "vfs cache: root is %q", parentOSPath) 
        
           parentPath := fromOSPath(parentOSPath) 
        
           // Get a relative cache path representing the remote. 
        
           relativeDirPath := fremote.Root() // This is a remote path in standard encoding 
        
           if runtime.GOOS == "windows" { 
        
           	if strings.HasPrefix(relativeDirPath, `//?/`) { 
        
           		relativeDirPath = relativeDirPath[2:] // Trim off the "//" for the result to be a valid when appending to another path 
        
           	} 
        
           } 
        
           relativeDirPath = fremote.Name() + "/" + relativeDirPath 
        
           relativeDirOSPath := toOSPath(relativeDirPath)

Another place I'd like a KV database is doing syncs larger than memory - eg sync two 20 million entry directories - without using 20 GB of RAM.

I also had an idea to use a struct Trie (details don't matter a.t.m.) for large recursive listings to replace long paths by 32-bit integers around the code. Could save some RAM especially in sync operations.

Interesting idea.

I was looking also at badger however my main concern is longevity and stability rather than absolute performance.

If we make it an API then "procedural users" will accept anything.
If we decide to switch to another database engine later, the "human users" will notice just a one-time delay due to database regeneration.

Agreed

Yes lets discuss this later. I just wanted to share my ideas.

I'll create a dedicated ticket and copy the discussion there.

👍

It would be a good idea to finish some of the things we've started.

Holy true.

:-)

ivandeex · 2021-09-12T13:07:06Z

This should probably have a separate name for each user ~/.cache/rclone/kv/hasher/REMOTE ~/.cache/rclone/kv/VFS/REMOTE (or something!)

We can consider various db file naming schemes below ~/.cache/rclone/kv:

SUBSYS/REMOTE - db files grouped in dirs by subsystem (vfs, hasher, etc)
REMOTE/SUBSYS - grouped by remote,
for easier cleanup when remotes get deprecated
REMOTE~SUBSYS - flat scheme with some delimiter,
for cleanup by shell globs *~vfs or myremote~*,
delimiter can be ~, #, wide :, wide / or whatever prohibited in remote names
✊🏻 REMOTE~SUBSYS.bolt - variation with explicit db format,
for experiments with badger or schema migrations
etc

ivandeex · 2021-09-15T06:35:46Z

common function to clean and encode remote names like kv.CleanFsName(fs.Fs)

There is the stuff in the VFS cache which has had many patches and we should probably start from! This could be factored out into fs maybe?

rclone/vfs/vfscache/cache.go

Lines 78 to 100 in b35db61

// Get cache root path.

// We need it in two variants: OS path as an absolute path with UNC prefix,

// OS-specific path separators, and encoded with OS-specific encoder. Standard path

// without UNC prefix, with slash path separators, and standard (internal) encoding.

// Care must be taken when creating OS paths so that the ':' separator following a

// drive letter is not encoded (e.g. into unicode fullwidth colon).

var err error

parentOSPath := config.CacheDir // Assuming string contains a local path in OS encoding

if parentOSPath, err = filepath.Abs(parentOSPath); err != nil {

return nil, errors.Wrap(err, "failed to make --cache-dir absolute")

}

fs.Debugf(nil, "vfs cache: root is %q", parentOSPath)

parentPath := fromOSPath(parentOSPath)

// Get a relative cache path representing the remote.

relativeDirPath := fremote.Root() // This is a remote path in standard encoding

if runtime.GOOS == "windows" {

if strings.HasPrefix(relativeDirPath, `//?/`) {

relativeDirPath = relativeDirPath[2:] // Trim off the "//" for the result to be a valid when appending to another path

}

}

relativeDirPath = fremote.Name() + "/" + relativeDirPath

relativeDirOSPath := toOSPath(relativeDirPath)

Note to myself. Look what was factored in reply to
#4547 (comment) and coordinate

ncw · 2021-10-13T15:03:41Z

@ivandeex I've stuck c437002 onto your PR and reverted the test fix - that passes the hasher tests for me - does that look OK to you?

I didn't want to force push your branch - it needs squashing and re-ordering my commit should go first if you think it is OK.

ivandeex · 2021-10-13T15:24:42Z

I didn't want to force push your branch - it needs squashing and re-ordering my commit should go first if you think it is OK.

I think you know it better...
Squashing and reordering...

ivandeex · 2021-10-14T09:23:39Z

A quick note about last two commits...
After testing concurrent calling of kv.Start and db.Stop I had to restrict more parts of these under mutex to make results deterministic without Sleep-s in the test body.
It's more safe but has potential to lock Start for up to 1-2 seconds due to db.open.
Not fond of it, will seek for ways to improve performance in future.

ncw

Let's merge this for 1.57

I see new backends as relatively low risk. This one is marked experimental so if we need to change the config/commands/interface the users have been warned!

Before this change we checked that features.ReadMimeTime was set if and only if the Object.MimeType method was implemented. However this test is overly general - we don't care if Objects advertise MimeType when features.ReadMimeTime is set provided that they always return an empty string (which is what a wrapping backend might do). This patch implements that logic.

Add bolt-based key-value database support. Quick API description: rclone#5587 (comment)

After testing concurrent calling of `kv.Start` and `db.Stop` I had to restrict more parts of these under mutex to make results deterministic without Sleep's in the test body. It's more safe but has potential to lock Start for up to 2 seconds due to `db.open`.

ivandeex · 2021-10-20T15:58:15Z

Merging
Deleting beta branch https://github.com/rclone/rclone/commits/pr-hasher

Before this change we checked that features.ReadMimeTime was set if and only if the Object.MimeType method was implemented. However this test is overly general - we don't care if Objects advertise MimeType when features.ReadMimeTime is set provided that they always return an empty string (which is what a wrapping backend might do). This patch implements that logic.

Add bolt-based key-value database support. Quick API description: #5587 (comment)

After testing concurrent calling of `kv.Start` and `db.Stop` I had to restrict more parts of these under mutex to make results deterministic without Sleep's in the test body. It's more safe but has potential to lock Start for up to 2 seconds due to `db.open`.

ivandeex · 2021-10-20T16:13:51Z

@ncw
You might want to cleanup https://beta.rclone.org/branch/pr-hasher/

ivandeex · 2021-11-02T09:43:15Z

@Sarke
This ticket is implemented and closed.
Please post your question as a new ticket.

Add bolt-based key-value database support. Quick API description: rclone/rclone#5587 (comment)

ivandeex added the hashing label Sep 9, 2021

ivandeex self-assigned this Sep 9, 2021

This was referenced Sep 9, 2021

fs/local: implement hashsum cache #949

Closed

md5sum file support #157

Closed

ivandeex requested a review from ncw September 9, 2021 07:54

ivandeex added this to the v1.57 milestone Sep 9, 2021

This comment has been minimized.

Sign in to view

ivandeex force-pushed the pr-hasher branch 2 times, most recently from 55b93aa to 4709c9c Compare September 9, 2021 18:23

ncw reviewed Sep 10, 2021

View reviewed changes

ivandeex added the new backend label Sep 10, 2021

ivandeex commented Sep 10, 2021

View reviewed changes

backend/hasher/hasher.go Outdated Show resolved Hide resolved

ivandeex changed the title ~~Add hasher backend~~ Add hasher backend and KV store Sep 12, 2021

ivandeex mentioned this pull request Sep 15, 2021

Add new Combine Remote #5600

Closed

ivandeex modified the milestones: v1.57, v1.58 Sep 26, 2021

ivandeex force-pushed the pr-hasher branch from c437002 to 81f7753 Compare October 13, 2021 15:29

ivandeex force-pushed the pr-hasher branch from 50d8e37 to a3d16b4 Compare October 15, 2021 17:50

ivandeex modified the milestones: v1.58, v1.57 Oct 18, 2021

ivandeex mentioned this pull request Oct 18, 2021

v1.57 release #5734

Closed

ncw approved these changes Oct 20, 2021

View reviewed changes

ncw and others added 5 commits October 20, 2021 18:46

lib/kv: add key-value database api rclone#5587

e1eb7ef

Add bolt-based key-value database support. Quick API description: rclone#5587 (comment)

hasher: add hasher backend rclone#5587

6693eeb

hasher: backend documentation rclone#5587

0e447b7

ivandeex force-pushed the pr-hasher branch from a3d16b4 to 0e447b7 Compare October 20, 2021 15:54

ivandeex merged commit 0d7426a into rclone:master Oct 20, 2021

ivandeex added a commit that referenced this pull request Oct 20, 2021

lib/kv: add key-value database api #5587

50df8ce

Add bolt-based key-value database support. Quick API description: #5587 (comment)

ivandeex added a commit that referenced this pull request Oct 20, 2021

hasher: add hasher backend #5587

f102ef2

ivandeex deleted the pr-hasher branch October 20, 2021 16:12

ivandeex added the refactoring label Oct 25, 2021

ivandeex mentioned this pull request Oct 27, 2021

Windows sha1sum via SFTP #5758

Closed

Sarke mentioned this pull request Nov 3, 2021

Using new Hasher BoltDB when possible to bridge different hash remotes #5780

Open

NyaMisty pushed a commit to NyaMisty/fclone that referenced this pull request Feb 5, 2022

lib/kv: add key-value database api #5587

9749ad8

Add bolt-based key-value database support. Quick API description: rclone/rclone#5587 (comment)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add hasher backend and KV store #5587

Add hasher backend and KV store #5587

ivandeex commented Sep 9, 2021 •

edited

Loading

This comment has been minimized.

ncw commented Sep 9, 2021

ivandeex commented Sep 9, 2021 •

edited

Loading

ivandeex commented Sep 9, 2021

ncw left a comment

ncw commented Sep 10, 2021 •

edited by ivandeex

Loading

ivandeex commented Sep 10, 2021 •

edited

Loading

ivandeex commented Sep 10, 2021 •

edited

Loading

ivandeex commented Sep 10, 2021 •

edited

Loading

ivandeex commented Sep 10, 2021 •

edited

Loading

ivandeex commented Sep 10, 2021 •

edited

Loading

ivandeex left a comment •

edited

Loading

ncw commented Sep 11, 2021

ivandeex commented Sep 11, 2021 •

edited

Loading

ncw commented Sep 12, 2021

ivandeex commented Sep 12, 2021 •

edited

Loading

ivandeex commented Sep 15, 2021 •

edited

Loading

ncw commented Oct 13, 2021

ivandeex commented Oct 13, 2021 •

edited

Loading

ivandeex commented Oct 14, 2021 •

edited

Loading

ncw left a comment

ivandeex commented Oct 20, 2021

ivandeex commented Oct 20, 2021

ivandeex commented Nov 2, 2021

Add hasher backend and KV store #5587

Add hasher backend and KV store #5587

Conversation

ivandeex commented Sep 9, 2021 • edited Loading

What is the purpose of this change?

Getting Started

How It Works

Pre-Seed from a SUM File

Cache Storage

Boilerplate

Was the change discussed in an issue or in the forum before?

Checklist

This comment has been minimized.

ncw commented Sep 9, 2021

ivandeex commented Sep 9, 2021 • edited Loading

ivandeex commented Sep 9, 2021

ncw left a comment

Choose a reason for hiding this comment

ncw commented Sep 10, 2021 • edited by ivandeex Loading

ivandeex commented Sep 10, 2021 • edited Loading

ivandeex commented Sep 10, 2021 • edited Loading

ivandeex commented Sep 10, 2021 • edited Loading

ivandeex commented Sep 10, 2021 • edited Loading

ivandeex commented Sep 10, 2021 • edited Loading

ivandeex left a comment • edited Loading

Choose a reason for hiding this comment

ncw commented Sep 11, 2021

ivandeex commented Sep 11, 2021 • edited Loading

ncw commented Sep 12, 2021

ivandeex commented Sep 12, 2021 • edited Loading

ivandeex commented Sep 15, 2021 • edited Loading

ncw commented Oct 13, 2021

ivandeex commented Oct 13, 2021 • edited Loading

ivandeex commented Oct 14, 2021 • edited Loading

ncw left a comment

Choose a reason for hiding this comment

ivandeex commented Oct 20, 2021

ivandeex commented Oct 20, 2021

ivandeex commented Nov 2, 2021

ivandeex commented Sep 9, 2021 •

edited

Loading

ivandeex commented Sep 9, 2021 •

edited

Loading

ncw commented Sep 10, 2021 •

edited by ivandeex

Loading

ivandeex commented Sep 10, 2021 •

edited

Loading

ivandeex commented Sep 10, 2021 •

edited

Loading

ivandeex commented Sep 10, 2021 •

edited

Loading

ivandeex commented Sep 10, 2021 •

edited

Loading

ivandeex commented Sep 10, 2021 •

edited

Loading

ivandeex left a comment •

edited

Loading

ivandeex commented Sep 11, 2021 •

edited

Loading

ivandeex commented Sep 12, 2021 •

edited

Loading

ivandeex commented Sep 15, 2021 •

edited

Loading

ivandeex commented Oct 13, 2021 •

edited

Loading

ivandeex commented Oct 14, 2021 •

edited

Loading