Rclone backend #1657

Merged
fd0 merged 28 commits into master from rclone-backend on Apr 1, 2018

7 participants
@fd0
Member

fd0 commented Mar 7, 2018

What is the purpose of this change? What does it change?

These commits add a backend based on rclone (or rather, on a not-yet-released version of rclone). The corresponding PR for rclone is here: ncw/rclone#2116

Basically, restic starts an instance of rclone, talks to it via HTTP/2 over stdin/stdout, and then uses the REST protocol.

Was the change discussed in an issue or in the forum before?

Closes #1561

Checklist

  • I have read the Contribution Guidelines
  • I have added tests for all changes in this PR
  • I have added documentation for the changes (in the manual)
  • There's a new file in changelog/unreleased/ that describes the changes for our users (template here)
  • I have run gofmt on the code in all commits
  • All commit messages are formatted in the same style as the other commits in the repo
  • I'm done, this Pull Request is ready for review

TODO:

  • Decide how to delete/hide files, and how to communicate this to the user
  • Make number of connections configurable

@fd0 fd0 added the PR:WIP label Mar 7, 2018

@fd0 fd0 referenced this pull request Mar 7, 2018

Closed

Using rclone as a backend #1561

@fd0 fd0 force-pushed the rclone-backend branch from 3510c4d to d7e6c45 Mar 11, 2018

@fd0 fd0 force-pushed the rclone-backend branch from d7e6c45 to a61f09b Mar 13, 2018

url *url.URL
sem *backend.Semaphore
client *http.Client
backend.Layout
}

const (
contentTypeV1 = "application/vnd.x.restic.rest.v1"
contentTypeV2 = "application/vnd.x.restic.rest.v2"
ContentTypeV1 = "application/vnd.x.restic.rest.v1"

@houndci-bot

houndci-bot Mar 13, 2018

exported const ContentTypeV1 should have comment (or a comment on this block) or be unexported

@fd0 fd0 force-pushed the rclone-backend branch 3 times, most recently from 0f864bf to ff20223 Mar 13, 2018

@ncw
Contributor

ncw left a comment

Please find a few minor notes attached :-)

rtest "github.com/restic/restic/internal/test"
)

const rcloneConfig = `

@ncw

ncw Mar 14, 2018

Contributor

You could do away with this config if you want - see below

}

cfg := rclone.NewConfig()
cfg.Program = fmt.Sprintf("rclone --config %q", cfgfile)

@ncw

ncw Mar 14, 2018

Contributor

You can remove --config %q and the saving of the file above and just use

cfg.Remote = repodir

rclone remotes are either remote:remote/path or /local/path. (There is a little bit of ambiguity there, as I'm sure you'll notice, which you can work around with ./localdirwithcolon:/path)
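
To make the ambiguity concrete, here is a small sketch of how such a string might be classified. This is a hypothetical heuristic, not rclone's actual parser: the function name `isRemoteSpec` is made up, and the rule assumed here is that a colon only counts as the remote separator if no slash appears before it. Windows drive letters are ignored.

```go
package main

import (
	"fmt"
	"strings"
)

// isRemoteSpec reports whether s looks like "remote:path" rather than a local
// path. Hypothetical heuristic (an assumption, not rclone's parser): a colon
// counts as the remote separator only if no slash appears before it.
func isRemoteSpec(s string) bool {
	colon := strings.IndexByte(s, ':')
	if colon <= 0 {
		return false // no colon, or a leading colon: treat as local
	}
	return !strings.Contains(s[:colon], "/")
}

func main() {
	specs := []string{
		"b2:bucket/path",   // named remote
		"/srv/restic-repo", // absolute local path
		"dir:with/colon",   // ambiguous: read as remote "dir"
		"./dir:with/colon", // the "./" prefix forces a local reading
	}
	for _, s := range specs {
		fmt.Printf("%-20s remote=%v\n", s, isRemoteSpec(s))
	}
}
```

Under this assumed rule, prefixing a local directory whose name contains a colon with `./` puts a slash ahead of the colon, which is why the workaround above resolves the ambiguity.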

@fd0

fd0 Mar 15, 2018

Author Member

Ah, nice, I didn't know that.

// Config contains all configuration necessary to start rclone.
type Config struct {
Program string `option:"program" help:"path to rclone (default: rclone)"`
Args string `option:"args" help:"arguments for running rclone (default: restic serve --stdio)"`

@ncw

ncw Mar 14, 2018

Contributor

That should probably be (default: serve restic --stdio)

@fd0

fd0 Mar 15, 2018

Author Member

Yep


// Close terminates the backend.
func (be *Backend) Close() error {
debug.Log("exting rclone")

@ncw

ncw Mar 14, 2018

Contributor

Typo here! "exiting"

@fd0

fd0 Mar 15, 2018

Author Member

Good catch!

@@ -0,0 +1,72 @@
package rclone

@ncw

ncw Mar 14, 2018

Contributor

It occurred to me that this is the kind of useful little object that might go in its own repo. Rclone could then import it too!

@fd0

fd0 Mar 15, 2018

Author Member

That also occurred to me, but I prefer not to couple the source code too much for now. We can always move it out later (ideally, when vgo is merged).

@codecov-io

codecov-io commented Mar 15, 2018

Codecov Report

Merging #1657 into master will decrease coverage by 0.22%.
The diff coverage is 65.38%.

@@            Coverage Diff            @@
##           master   #1657      +/-   ##
=========================================
- Coverage   52.43%   52.2%   -0.23%     
=========================================
  Files         143     148       +5     
  Lines       11415   11639     +224     
=========================================
+ Hits         5985    6076      +91     
- Misses       4507    4627     +120     
- Partials      923     936      +13

Impacted Files                                Coverage Δ
internal/backend/location/location.go         65.71% <ø> (ø) ⬆️
cmd/restic/global.go                          28.86% <0%> (-0.87%) ⬇️
internal/backend/rclone/stdio_conn_other.go   0% <0%> (ø)
internal/backend/rclone/stdio_conn_go110.go   0% <0%> (ø)
internal/backend/foreground_unix.go           0% <0%> (ø)
internal/backend/rest/config.go               76.47% <100%> (+3.13%) ⬆️
internal/backend/rest/rest.go                 63.24% <100%> (ø) ⬆️
internal/backend/test/tests.go                59.95% <33.33%> (-0.93%) ⬇️
internal/backend/rclone/stdio_conn.go         52% <52%> (ø)
internal/backend/sftp/sftp.go                 62.3% <60%> (-0.34%) ⬇️
... and 12 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e4a39e0...3f48e0e.

@fd0

Member Author

fd0 commented Mar 15, 2018

So, the code is basically done, it just needs some docs. I could use some help testing the backend :)

@ncw

Contributor

ncw commented Mar 15, 2018

So, the code is basically done, it just needs some docs. I could use some help testing the backend :)

What sort of tests?

I could run the backend unit tests (using rclone's unit test) against all the test accounts for all the different providers quite easily? Would that help?

@fd0

Member Author

fd0 commented Mar 15, 2018

I could run the backend unit tests (using rclone's unit test) against all the test accounts for all the different providers quite easily? Would that help?

That'd be interesting! It may transfer a non-trivial amount of data though...

What I meant was more like manual tests: accessing repos that are stored in some service and were created directly with restic, but now indirectly via rclone, and so on. Report back on how it feels :)

@fd0 fd0 force-pushed the rclone-backend branch from 2beaad1 to 2763290 Mar 15, 2018

@fd0

Member Author

fd0 commented Mar 15, 2018

Docs are done, please let me know if there's anything odd!

@fd0 fd0 removed the PR:WIP label Mar 15, 2018

@rawtaz

Contributor

rawtaz commented Mar 15, 2018

I commented on one small thing in 2beaad1.

@rawtaz

Contributor

rawtaz commented Mar 16, 2018

In addition to the comment I mentioned above, I suggest you also add rclone to the list of backends in the README.rst file :)

@fd0

Member Author

fd0 commented Mar 16, 2018

Cool, thanks!

@ncw

Contributor

ncw commented Mar 16, 2018

I ran the integration tests against all the remotes last night...

The results look good though not perfect.

Straight passes for

  • AzureBlob
  • Box
  • Cache
  • Crypt + Drive
  • Crypt + Swift
  • Drive
  • Dropbox
  • FTP
  • GoogleCloudStorage
  • Pcloud
  • QingStor
  • S3
  • Sftp
  • Swift
  • Webdav (log missing as I had to redo this as my webdav server got full!)
  • Yandex

Fails of some kind for

  • B2
  • Hubic
  • OneDrive

I'll dig into these. Here are the logs in case you are interested: integration-tests.zip

B2

There were no errors in the rclone log, but this test failed

                --- FAIL: TestBackendRESTExternalServer (95.50s)
                    --- FAIL: TestBackendRESTExternalServer/TestBackend (23.50s)
                        tests.go:782: wrong number of IDs returned: want 4, got 3

I think that is worth investigating further as it seems consistent

Hubic

Hubic failed with service problems or eventual consistency problems. rclone would deal with this sort of thing with lots of retries...

2018/03/15 22:50:54 ERROR : keys/4e54d2c721cbdb730f01b10b62dec622962b36966ec685880effa63d71c808f2: Post request put error: HTTP Error: 503: 503 Service Unavailable
2018/03/15 22:56:44 ERROR : data/2e/2e4967c43b7d3fb9d3f4b61997974240b1a5a6e6136d9756673eb03b89946126: Delete request remove error: Object Not Found
2018/03/15 22:59:26 ERROR : potato/sausage/data/24/248d6a61d20638b8e5c026930c3e6039a33ce45964ff2167f6ecedd419db06c1: Delete request remove error: Object Not Found
2018/03/15 23:04:32 ERROR : potato/sausage/data/1b/1b06f7b567c7f267319bf39f28aa39461537550f0daf6b1c92564fdfb44dda05: Delete request remove error: Object Not Found
2018/03/15 23:06:01 ERROR : keys/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2: Couldn't delete: Object Not Found

OneDrive

Failed because of a recent commit to rclone!

@ncw

Contributor

ncw commented Mar 16, 2018

I had a look through the docs - they look great :-)

You might want to add something like this, which restic users might find easier since the restic configuration is often done with environment variables.

....

Rclone can be configured with environment variables also, so for instance if you wanted to set a bandwidth limit for rclone you could export RCLONE_BWLIMIT=1M rather than using -o rclone.args="serve restic --stdio --bwlimit 1M"

@fd0

Member Author

fd0 commented Mar 17, 2018

Thanks for the comments, I've added the suggested text :)

@fd0

Member Author

fd0 commented Mar 17, 2018

@ncw hm, is it possible that the error for B2 is caused by not calling rclone with --b2-hard-delete?

@fd0

Member Author

fd0 commented Mar 17, 2018

Looks like it, running the backend tests for the same directory without --b2-hard-delete fails on the second and subsequent runs, with --b2-hard-delete it always completes.

Shall we maybe enable this by default in rclone for the restic server? Or just hard-code this into the backend test call for rclone?

Are there any other backends which have a concept of "versions" of files for which we should do the same?

@ncw

Contributor

ncw commented Mar 18, 2018

@fd0 It should make no difference whether --b2-hard-delete is called or not - the deleted file versions should be invisible... I'm struggling trying to work out what is going on.

Apparently the test is indicating that there is an extra key returned from the delayedList function.

Looking at that function I'm wondering if there might be some sort of eventual consistency thing going on as that function seems to collect up as many list entries as it can, rather than just returning the last list which had the desired entry in it.

I wonder whether something like this might fix the problem...

--- a/internal/backend/test/tests.go
+++ b/internal/backend/test/tests.go
@@ -673,9 +673,10 @@ func (s *Suite) delayedRemove(t testing.TB, be restic.Backend, handles ...restic
 }
 
 func delayedList(t testing.TB, b restic.Backend, tpe restic.FileType, max int, maxwait time.Duration) restic.IDs {
-	list := restic.NewIDSet()
+	var list restic.IDSet
 	start := time.Now()
 	for i := 0; i < max; i++ {
+		list = restic.NewIDSet()
 		err := b.List(context.TODO(), tpe, func(fi restic.FileInfo) error {
 			id := restic.TestParseID(fi.Name)
 			list.Insert(id)
@@ -783,7 +784,7 @@ func (s *Suite) TestBackend(t *testing.T) {
 
 		list := delayedList(t, b, tpe, len(IDs), s.WaitForDelayedRemoval)
 		if len(IDs) != len(list) {
-			t.Fatalf("wrong number of IDs returned: want %d, got %d", len(IDs), len(list))
+			t.Fatalf("wrong number of IDs returned: want %d, got %d\n  want:\n  %v\n  got:\n%v\n", len(IDs), len(list), IDs, list)
 		}
 
 		sort.Sort(IDs)

I tried it - it didn't!

Any more ideas?

@fd0 fd0 force-pushed the rclone-backend branch from 3fb8917 to 3944b2b Mar 18, 2018

@fd0

Member Author

fd0 commented Mar 18, 2018

@ncw I've managed to reproduce it while writing debug logs for restic and rclone, see here:

rclone-log.txt
restic-log.txt

Restic was started like this:

$ cd internal/backend/rest
$ RESTIC_TEST_REST_REPOSITORY=rest:http://localhost:8080/ DEBUG_FILES=*.go go test -tags debug -v -run External/TestBackend

and rclone:

rclone --verbose=2 --dump-bodies --dump-headers serve restic b2:restic-test-an/test-rest-ext/

Maybe you can see something. What I've discovered so far: restic creates and deletes the file c3ab8ff1 a few times, among others in the data/ directory. At the end it tries to list all files in data/, and does not find the file:

        tests.go:787: wrong number of IDs returned: want 4, got 3
                  want:
                  [c3ab8ff1 248d6a61 cc5d46bd 4e54d2c7]
                  got:
                [248d6a61 4e54d2c7 cc5d46bd]

It was found before and for other file types. So, a file/key is missing, instead of one additional key.

I can see in the rclone log file that listing the subdirs of data/ returns only three dirs: 24, 4e, and cc. So maybe the last upload did not succeed? It looks like rclone told restic it did succeed...

Any idea what's going on?

@fd0

Member Author

fd0 commented Mar 21, 2018

So, what's still to do before this can be merged?

From my point of view, we need to find a solution for the deletion issue on B2. I'd like to avoid users ending up with plenty of invisible files stored at B2 that they're not aware of. When restic deletes files via rclone, are they eventually deleted on B2? Is there some delayed remove for hidden files? Or do users need to manually run rclone cleanup?

@fd0 fd0 referenced this pull request Mar 21, 2018

Closed

Google Drive backend #1505

6 of 7 tasks complete

@fd0 fd0 force-pushed the rclone-backend branch from 2d756ce to 3f48e0e Apr 1, 2018

@fd0

Member Author

fd0 commented Apr 1, 2018

I've added the two parameters for now, we can remove them from restic once the next version of rclone is released. Thanks a lot for your work @ncw!

Once the integration tests complete I'll merge this!

@fd0 fd0 merged commit 3f48e0e into master Apr 1, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

fd0 added a commit that referenced this pull request Apr 1, 2018

@fd0 fd0 deleted the rclone-backend branch Apr 1, 2018

@mpsrig

mpsrig commented on cmd/restic/global.go in fe99340 Jul 19, 2018

@fd0 Can you clarify why this is using rclone.Open instead of rclone.Create? Looks like as a result rclone.Create is dead code except for tests.

Member Author

fd0 replied Jul 19, 2018

Hm, looks like an oversight. Can you please open an issue about it (no need to fill out the issue template completely)? I'll never find this comment on a commit again after closing the browser tab :)

mpsrig replied Jul 19, 2018

created #1896
