Rclone backend #1657

Merged
merged 28 commits into from
Apr 1, 2018

fd0
Member

@fd0 fd0 commented Mar 7, 2018

What is the purpose of this change? What does it change?

These commits add a backend based on rclone (or, an unreleased version of rclone). The corresponding PR for rclone is here: rclone/rclone#2116

Basically, restic starts an instance of rclone, talks to it via HTTP/2 on the process's stdin/stdout, and then speaks the REST backend protocol over that connection.
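To make the stdin/stdout transport more concrete, here is a minimal sketch of joining a child process's two pipes into one bidirectional stream. It is not the code from this PR: the names `stdioConn`, `startChild` and `newLoopback` are hypothetical, the HTTP/2 layer is left out to keep the example stdlib-only, and the demo uses an in-memory echo instead of launching rclone.

```go
package main

import (
	"fmt"
	"io"
	"os/exec"
)

// stdioConn joins a child's stdin (we write) and stdout (we read) into a
// single bidirectional stream. The type name is illustrative only.
type stdioConn struct {
	stdin  io.WriteCloser
	stdout io.ReadCloser
	cmd    *exec.Cmd // nil for the in-memory loopback below
}

func (c *stdioConn) Read(p []byte) (int, error)  { return c.stdout.Read(p) }
func (c *stdioConn) Write(p []byte) (int, error) { return c.stdin.Write(p) }

// Close closes our side of both pipes and, if a child exists, waits for it.
func (c *stdioConn) Close() error {
	errIn := c.stdin.Close()
	errOut := c.stdout.Close()
	if c.cmd != nil {
		_ = c.cmd.Wait() // exit status is ignored in this sketch
	}
	if errIn != nil {
		return errIn
	}
	return errOut
}

// startChild launches a program (for example "rclone" with the arguments
// "serve", "restic", "--stdio", and a remote) and wraps its pipes.
func startChild(name string, args ...string) (*stdioConn, error) {
	cmd := exec.Command(name, args...)
	stdin, err := cmd.StdinPipe()
	if err != nil {
		return nil, err
	}
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return nil, err
	}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return &stdioConn{stdin: stdin, stdout: stdout, cmd: cmd}, nil
}

// newLoopback returns a stdioConn backed by an in-memory echo, so the
// wrapper can be exercised without any external program.
func newLoopback() *stdioConn {
	inR, inW := io.Pipe()
	outR, outW := io.Pipe()
	go func() {
		_, err := io.Copy(outW, inR)
		outW.CloseWithError(err)
	}()
	return &stdioConn{stdin: inW, stdout: outR}
}

func main() {
	conn := newLoopback()
	defer conn.Close()
	if _, err := conn.Write([]byte("ping")); err != nil {
		panic(err)
	}
	buf := make([]byte, 4)
	if _, err := io.ReadFull(conn, buf); err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", buf)
}
```

An HTTP/2 client would then be layered on top of such a stream; that step needs a package outside the standard library and is deliberately omitted here.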

Was the change discussed in an issue or in the forum before?

Closes #1561

Checklist

  • I have read the Contribution Guidelines
  • I have added tests for all changes in this PR
  • I have added documentation for the changes (in the manual)
  • There's a new file in changelog/unreleased/ that describes the changes for our users (template here)
  • I have run gofmt on the code in all commits
  • All commit messages are formatted in the same style as the other commits in the repo
  • I'm done, this Pull Request is ready for review

TODO:

  • Decide how to delete/hide files, and how to communicate this to the user
  • Make number of connections configurable

url *url.URL
sem *backend.Semaphore
client *http.Client
backend.Layout
}

const (
contentTypeV1 = "application/vnd.x.restic.rest.v1"
contentTypeV2 = "application/vnd.x.restic.rest.v2"
ContentTypeV1 = "application/vnd.x.restic.rest.v1"

exported const ContentTypeV1 should have comment (or a comment on this block) or be unexported

@fd0 fd0 force-pushed the rclone-backend branch 3 times, most recently from 0f864bf to ff20223 Compare March 14, 2018 19:48
Contributor

@ncw ncw left a comment

Please find a few minor notes attached :-)

rtest "github.com/restic/restic/internal/test"
)

const rcloneConfig = `
Contributor

You could do away with this config if you want - see below

}

cfg := rclone.NewConfig()
cfg.Program = fmt.Sprintf("rclone --config %q", cfgfile)
Contributor

You can remove --config %q and the saving of the file above and just use

cfg.Remote = repodir

rclone remotes are either remote:remote/path or /local/path. (There is a little bit of ambiguity there I'm sure you'll notice which you can workaround with ./localdirwithcolon:/path)
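The ambiguity mentioned above can be illustrated with a small classifier. This is a simplified sketch, not rclone's actual parser: the function name `splitRemote` is hypothetical, and it ignores Windows drive letters and any other syntax rclone may accept.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRemote is a simplified, hypothetical classifier: a string counts as a
// named remote only if a colon appears before the first slash.
func splitRemote(s string) (name, path string, isLocal bool) {
	colon := strings.Index(s, ":")
	slash := strings.Index(s, "/")
	// No colon at all, or a slash before the first colon: local path.
	if colon < 0 || (slash >= 0 && slash < colon) {
		return "", s, true
	}
	return s[:colon], s[colon+1:], false
}

func main() {
	for _, s := range []string{
		"remote:remote/path",
		"/local/path",
		"localdirwithcolon:/path",   // read as a remote named "localdirwithcolon"
		"./localdirwithcolon:/path", // the leading "./" forces a local path
	} {
		name, path, isLocal := splitRemote(s)
		fmt.Printf("%-28s local=%-5v name=%q path=%q\n", s, isLocal, name, path)
	}
}
```

Under this rule a local directory whose name contains a colon is mistaken for a remote unless it is prefixed with `./`, which is the workaround described above.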

Member Author

Ah, nice, I didn't know that.

// Config contains all configuration necessary to start rclone.
type Config struct {
Program string `option:"program" help:"path to rclone (default: rclone)"`
Args string `option:"args" help:"arguments for running rclone (default: restic serve --stdio)"`
Contributor

That should probably be (default: serve restic --stdio)

Member Author

Yep
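With the corrected default, the way these options might turn into a command line can be sketched as follows. The `Program`, `Args` and `Remote` fields and `NewConfig` appear in this review; the helper `commandLine` is hypothetical, and splitting on whitespace is an assumption made for brevity (it does not honor quoting, so a program path containing spaces would need a real shell-style splitter).

```go
package main

import (
	"fmt"
	"strings"
)

// Config mirrors the fields discussed in the review; the struct tags are omitted.
type Config struct {
	Program string
	Args    string
	Remote  string
}

// NewConfig returns the defaults, using the corrected argument order.
func NewConfig() Config {
	return Config{
		Program: "rclone",
		Args:    "serve restic --stdio",
	}
}

// commandLine is a hypothetical helper that builds the argv for the child
// process. It splits on whitespace only and has no quoting support.
func commandLine(cfg Config) []string {
	argv := strings.Fields(cfg.Program)
	argv = append(argv, strings.Fields(cfg.Args)...)
	return append(argv, cfg.Remote)
}

func main() {
	cfg := NewConfig()
	cfg.Remote = "b2:bucket/path"
	fmt.Println(strings.Join(commandLine(cfg), " "))
}
```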


// Close terminates the backend.
func (be *Backend) Close() error {
debug.Log("exting rclone")
Contributor

Typo here! "exiting"

Member Author

Good catch!

@@ -0,0 +1,72 @@
package rclone
Contributor

It occurred to me that this is the kind of useful little object that might go in its own repo. Rclone could then import it too!

Member Author

That also occurred to me, but I prefer to not couple the source code too much for now. We can always move it out later (ideally, when vgo is merged)

@codecov-io

codecov-io commented Mar 15, 2018

Codecov Report

Merging #1657 into master will decrease coverage by 0.22%.
The diff coverage is 65.38%.

@@            Coverage Diff            @@
##           master   #1657      +/-   ##
=========================================
- Coverage   52.43%   52.2%   -0.23%     
=========================================
  Files         143     148       +5     
  Lines       11415   11639     +224     
=========================================
+ Hits         5985    6076      +91     
- Misses       4507    4627     +120     
- Partials      923     936      +13

Impacted Files                               Coverage Δ
internal/backend/location/location.go        65.71% <ø> (ø) ⬆️
cmd/restic/global.go                         28.86% <0%> (-0.87%) ⬇️
internal/backend/rclone/stdio_conn_other.go  0% <0%> (ø)
internal/backend/rclone/stdio_conn_go110.go  0% <0%> (ø)
internal/backend/foreground_unix.go          0% <0%> (ø)
internal/backend/rest/config.go              76.47% <100%> (+3.13%) ⬆️
internal/backend/rest/rest.go                63.24% <100%> (ø) ⬆️
internal/backend/test/tests.go               59.95% <33.33%> (-0.93%) ⬇️
internal/backend/rclone/stdio_conn.go        52% <52%> (ø)
internal/backend/sftp/sftp.go                62.3% <60%> (-0.34%) ⬇️
... and 12 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e4a39e0...3f48e0e.

@fd0
Member Author

fd0 commented Mar 15, 2018

So, the code is basically done, it just needs some docs. I could use some help testing the backend :)

@ncw
Contributor

ncw commented Mar 15, 2018

> So, the code is basically done, it just needs some docs. I could use some help testing the backend :)

What sort of tests?

I could run the backend unit tests (using rclone's unit test) against all the test accounts for all the different providers quite easily? Would that help?

@fd0
Member Author

fd0 commented Mar 15, 2018

> I could run the backend unit tests (using rclone's unit test) against all the test accounts for all the different providers quite easily? Would that help?

That'd be interesting! It may transfer a non-trivial amount of data though...

What I meant was more like manual tests: accessing repos that are stored in some service and were created directly with restic, but now indirectly via rclone, and so on. Report back on how it feels :)

@fd0 fd0 force-pushed the rclone-backend branch from 2beaad1 to 2763290 Compare March 15, 2018 21:48
@fd0
Member Author

fd0 commented Mar 15, 2018

Docs are done, please let me know if there's anything odd!

@fd0 fd0 removed the PR:WIP label Mar 15, 2018
@rawtaz
Contributor

rawtaz commented Mar 15, 2018

I commented on one small thing in 2beaad1.

@rawtaz
Contributor

rawtaz commented Mar 16, 2018

In addition to the comment I mentioned above, I suggest you also add rclone to the list of backends in the README.rst file :)

@fd0
Member Author

fd0 commented Mar 16, 2018

Cool, thanks!

@ncw
Contributor

ncw commented Mar 16, 2018

I ran the integration tests against all the remotes last night...

The results look good though not perfect.

Straight passes for

  • AzureBlob
  • Box
  • Cache
  • Crypt + Drive
  • Crypt + Swift
  • Drive
  • Dropbox
  • FTP
  • GoogleCloudStorage
  • Pcloud
  • QingStor
  • S3
  • Sftp
  • Swift
  • Webdav (log missing as I had to redo this as my webdav server got full!)
  • Yandex

Fails of some kind for

  • B2
  • Hubic
  • OneDrive

I'll dig into these. Here are the logs in case you are interested: integration-tests.zip

B2

There were no errors in the rclone log, but this test failed

                --- FAIL: TestBackendRESTExternalServer (95.50s)
                    --- FAIL: TestBackendRESTExternalServer/TestBackend (23.50s)
                        tests.go:782: wrong number of IDs returned: want 4, got 3

I think that is worth investigating further as it seems consistent

Hubic

Hubic failed with service problems or eventual consistency problems. rclone would deal with this sort of thing with lots of retries...

2018/03/15 22:50:54 ERROR : keys/4e54d2c721cbdb730f01b10b62dec622962b36966ec685880effa63d71c808f2: Post request put error: HTTP Error: 503: 503 Service Unavailable
2018/03/15 22:56:44 ERROR : data/2e/2e4967c43b7d3fb9d3f4b61997974240b1a5a6e6136d9756673eb03b89946126: Delete request remove error: Object Not Found
2018/03/15 22:59:26 ERROR : potato/sausage/data/24/248d6a61d20638b8e5c026930c3e6039a33ce45964ff2167f6ecedd419db06c1: Delete request remove error: Object Not Found
2018/03/15 23:04:32 ERROR : potato/sausage/data/1b/1b06f7b567c7f267319bf39f28aa39461537550f0daf6b1c92564fdfb44dda05: Delete request remove error: Object Not Found
2018/03/15 23:06:01 ERROR : keys/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2: Couldn't delete: Object Not Found
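The "lots of retries" approach mentioned for transient errors such as the 503 above can be sketched as a retry loop with exponential backoff. This is a generic illustration and not rclone's retry logic; `retry` and `errUnavailable` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnavailable stands in for a transient failure such as an HTTP 503.
var errUnavailable = errors.New("503 Service Unavailable")

// retry calls op until it succeeds or attempts are exhausted, doubling the
// delay between tries. It is a generic sketch with hypothetical naming.
func retry(attempts int, delay time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errUnavailable // fail the first two attempts
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

A real client would usually retry only errors it knows to be transient; a "not found" on delete, as in the log above, may instead need to be treated as success.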

OneDrive

Failed because of a recent commit to rclone!

@ncw
Contributor

ncw commented Mar 16, 2018

I had a look through the docs - they look great :-)

You might want to add something like this, which restic users might find easier since the restic configuration is often done with environment variables.

....

Rclone can also be configured with environment variables, so for instance if you wanted to set a bandwidth limit for rclone you could export RCLONE_BWLIMIT=1M rather than using -o rclone.args="serve restic --stdio --bwlimit 1M"
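As a rough illustration of how a flag maps to its environment variable, the sketch below applies the naming rule as described in rclone's documentation (strip the leading dashes, replace `-` with `_`, upper-case, prefix with `RCLONE_`). The helper names `envName` and `withFlagEnv` are hypothetical, and the rule should be checked against the rclone version in use.

```go
package main

import (
	"fmt"
	"strings"
)

// envName derives the environment variable name for an rclone long option,
// following the documented naming rule. The helper itself is hypothetical.
func envName(flag string) string {
	name := strings.TrimLeft(flag, "-")
	name = strings.ReplaceAll(name, "-", "_")
	return "RCLONE_" + strings.ToUpper(name)
}

// withFlagEnv returns a copy of base with NAME=value appended, suitable for
// the Env field of an exec.Cmd. The input slice is not modified.
func withFlagEnv(base []string, flag, value string) []string {
	env := append(base[:len(base):len(base)], envName(flag)+"="+value)
	return env
}

func main() {
	fmt.Println(envName("--bwlimit"))
	fmt.Println(withFlagEnv([]string{"HOME=/home/user"}, "--bwlimit", "1M"))
}
```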

@fd0
Member Author

fd0 commented Mar 17, 2018

Thanks for the comments, I've added the suggested text :)

@fd0
Member Author

fd0 commented Mar 17, 2018

@ncw hm, is it possible that the error for B2 is caused by not calling rclone with --b2-hard-delete?

@fd0
Member Author

fd0 commented Mar 17, 2018

Looks like it: running the backend tests for the same directory without --b2-hard-delete fails on the second and subsequent runs, while with --b2-hard-delete it always completes.

Shall we maybe enable this by default in rclone for the restic server? Or just hard-code this into the backend test call for rclone?

Are there any other backends which have a concept of "versions" of files for which we should do the same?

@ncw
Contributor

ncw commented Mar 18, 2018

@fd0 It should make no difference whether --b2-hard-delete is called or not - the deleted file versions should be invisible... I'm struggling trying to work out what is going on.

Apparently the test is indicating that there is an extra key returned from the delayedList function.

Looking at that function I'm wondering if there might be some sort of eventual consistency thing going on as that function seems to collect up as many list entries as it can, rather than just returning the last list which had the desired entry in it.

I wonder whether something like this might fix the problem...

--- a/internal/backend/test/tests.go
+++ b/internal/backend/test/tests.go
@@ -673,9 +673,10 @@ func (s *Suite) delayedRemove(t testing.TB, be restic.Backend, handles ...restic
 }
 
 func delayedList(t testing.TB, b restic.Backend, tpe restic.FileType, max int, maxwait time.Duration) restic.IDs {
-	list := restic.NewIDSet()
+	var list restic.IDSet
 	start := time.Now()
 	for i := 0; i < max; i++ {
+		list = restic.NewIDSet()
 		err := b.List(context.TODO(), tpe, func(fi restic.FileInfo) error {
 			id := restic.TestParseID(fi.Name)
 			list.Insert(id)
@@ -783,7 +784,7 @@ func (s *Suite) TestBackend(t *testing.T) {
 
 		list := delayedList(t, b, tpe, len(IDs), s.WaitForDelayedRemoval)
 		if len(IDs) != len(list) {
-			t.Fatalf("wrong number of IDs returned: want %d, got %d", len(IDs), len(list))
+			t.Fatalf("wrong number of IDs returned: want %d, got %d\n  want:\n  %v\n  got:\n%v\n", len(IDs), len(list), IDs, list)
 		}
 
 		sort.Sort(IDs)

I tried it - it didn't!

Any more ideas?
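The hypothesis behind the patch above (which, as noted, did not fix this particular failure) is that accumulating entries across repeated listings can keep a key that a later listing no longer contains. A minimal sketch of the difference, loosely modeled on the idea and not on restic's actual test helper; `lister`, `accumulate`, `latest` and `replay` are hypothetical names:

```go
package main

import "fmt"

// lister returns one listing per call; it stands in for a backend's List.
type lister func() []string

// accumulate unions every listing until at least want names have been seen.
func accumulate(list lister, want, attempts int) map[string]bool {
	seen := make(map[string]bool)
	for i := 0; i < attempts; i++ {
		for _, name := range list() {
			seen[name] = true
		}
		if len(seen) >= want {
			break
		}
	}
	return seen
}

// latest starts from an empty set on every attempt, keeping only the most
// recent listing.
func latest(list lister, want, attempts int) map[string]bool {
	var seen map[string]bool
	for i := 0; i < attempts; i++ {
		seen = make(map[string]bool)
		for _, name := range list() {
			seen[name] = true
		}
		if len(seen) >= want {
			break
		}
	}
	return seen
}

// replay yields the given listings in order, repeating the last one.
func replay(listings ...[]string) lister {
	i := 0
	return func() []string {
		l := listings[i]
		if i < len(listings)-1 {
			i++
		}
		return l
	}
}

func main() {
	// The first listing still shows a deleted entry; the second is consistent.
	first := []string{"a", "stale"}
	second := []string{"a", "b", "c"}
	fmt.Println("accumulated:", len(accumulate(replay(first, second), 3, 5)))
	fmt.Println("latest only:", len(latest(replay(first, second), 3, 5)))
}
```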

@fd0 fd0 force-pushed the rclone-backend branch from 3fb8917 to 3944b2b Compare March 18, 2018 19:30
@fd0
Member Author

fd0 commented Mar 18, 2018

@ncw I've managed to reproduce it while writing debug logs for restic and rclone, see here:

rclone-log.txt
restic-log.txt

Restic was started like this:

$ cd internal/backend/rest
$ RESTIC_TEST_REST_REPOSITORY=rest:http://localhost:8080/ DEBUG_FILES=*.go go test -tags debug -v -run External/TestBackend

and rclone:

rclone --verbose=2 --dump-bodies --dump-headers serve restic b2:restic-test-an/test-rest-ext/

Maybe you can see something. What I have discovered so far: restic creates and deletes the file c3ab8ff1 a few times, among others in the data/ directory. At the end it tries to list all files in data/, and does not find the file:

        tests.go:787: wrong number of IDs returned: want 4, got 3
                  want:
                  [c3ab8ff1 248d6a61 cc5d46bd 4e54d2c7]
                  got:
                [248d6a61 4e54d2c7 cc5d46bd]

It was found before and for other file types. So, a file/key is missing, instead of one additional key.

I can see in the rclone log file that listing the subdirs of data/ returns only three dirs: 24, 4e, and cc. So maybe the last upload did not succeed? It looks like rclone told restic that it did succeed...
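The comparison described here can be reproduced with a small set difference over the IDs from the test output. This is an illustrative sketch with hypothetical helper names (`diffIDs`, `subdir`); the two-character subdirectory rule is inferred from the paths and directory names shown in the logs above.

```go
package main

import "fmt"

// diffIDs reports which IDs are in want but not in got (missing) and which
// are in got but not in want (extra). The helper name is hypothetical.
func diffIDs(want, got []string) (missing, extra []string) {
	gotSet := make(map[string]bool, len(got))
	for _, id := range got {
		gotSet[id] = true
	}
	wantSet := make(map[string]bool, len(want))
	for _, id := range want {
		wantSet[id] = true
		if !gotSet[id] {
			missing = append(missing, id)
		}
	}
	for _, id := range got {
		if !wantSet[id] {
			extra = append(extra, id)
		}
	}
	return missing, extra
}

// subdir returns the two-character prefix used as the directory name under
// data/, as suggested by the paths in the logs.
func subdir(id string) string {
	if len(id) < 2 {
		return id
	}
	return id[:2]
}

func main() {
	want := []string{"c3ab8ff1", "248d6a61", "cc5d46bd", "4e54d2c7"}
	got := []string{"248d6a61", "4e54d2c7", "cc5d46bd"}
	missing, extra := diffIDs(want, got)
	fmt.Println("missing:", missing, "extra:", extra)
	for _, id := range missing {
		fmt.Println("expected subdir:", "data/"+subdir(id))
	}
}
```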

Any idea what's going on?

@fd0
Member Author

fd0 commented Mar 21, 2018

So, what's still to do before this can be merged?

From my point of view, we need to find a solution for the deletion issue on B2. I'd like to avoid users ending up with plenty of invisible files stored at B2 that they're not aware of. When restic deletes files via rclone, are they eventually deleted on B2? Is there some delayed remove for hidden files? Or do users need to manually run rclone cleanup?

@fd0 fd0 mentioned this pull request Mar 21, 2018
7 tasks
@fd0 fd0 force-pushed the rclone-backend branch from 2d756ce to 3f48e0e Compare April 1, 2018 08:36
@fd0
Member Author

fd0 commented Apr 1, 2018

I've added the two parameters for now; we can remove them from restic once the next version of rclone is released. Thanks a lot for your work @ncw!

Once the integration tests complete I'll merge this!

7 participants