-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Basic REST backend #253
Basic REST backend #253
Conversation
Hey, thanks for your contribution! We need to get this PR in shape before it can be merged, but that shouldn't be too hard. I'll have a look later today. |
It wasn't really intended for a direct merge at this point. The current PR doesn't care about HTTP Auth which is probably the minimal requirement, except if the backend is for demo purpose only. Do you have something in mind regarding this point? |
I don't have the time to look at the code directly. The server component should be split out as far as possible and the tests could be done with a mock-server (should be easy to implement). I'll have a look at this later. HTTP auth is a good idea. |
Ok, the tests are now passing. Regarding the basic authentication, is seems possible to add the credentials in the url (http://username:password@host/)instead of the Http Headers. It's probably a good option to keep things as simple as possible. |
Cool, sounds good. I didn't yet have the time to look at your code (just to give you a heads-up). |
In the following, I'll dump my thoughts I had today: Could you give a high-level overview of how the REST server should look like and what it should do? E.g., what are the paths, methods etc? We should document this below I'm thinking about adding a 'server' command to restic that can later be used to start different servers, e.g. REST, a remote server for stdin/stdout etc. Are you also interested in implementing the REST server part? Like, run |
Ok, first round of comments is done 😁 |
@@ -108,6 +108,7 @@ func testBackend(b backend.Backend, t *testing.T) { | |||
OK(t, err) | |||
|
|||
found, err := b.Test(tpe, id.String()) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this empty line?
I'm not sure if the cleanup is necessary since the router is already extracting variables from the path. In this case, a relative path can not be injected in the variable. |
Even if it isn't necessary I'd like to implement it, this approach follows the "defense-in-depth" or "defensive programming" paradigm and I think it's a good idea. It's not much additional code (a call to |
Another thought I had today: The REST backend can (with minor modifications) directly be used to restore a backup that is served via HTTP, e.g. by an Apache web server. In this case locking wouldn't work (because adding new files doesn't work), but we could ignore that with a command-line flag. Very cool! |
From a backend perspective the more defensive approach would be to test the types and the blobIDs. Since we know the types and the blobIDs are sha256 sums it shouldn't be too hard to implement very strong checks on the server side. I just tried an approach that validates the "type" and the "blobID" but it seems that the generic tests are not using valid sha256 sums. I don't want to change the tests at the moment since it will have an impact on the other backends. A different PR would be more suitable. What do you think? var blobTypes = []string{"data", "snapshot", "index", "lock", "key", "temp"}
func isBlobType(blobType string) bool {
for _, t := range blobTypes {
if blobType == t {
return true
}
}
return false
}
func isBlobID(blobID string) bool {
if len(blobID) != sha256.Size {
return false
}
if _, e := hex.DecodeString(blobID); e != nil {
return false
}
return true
} |
Regarding the command line for starting a server in restic, I think it would make sense to have a very small read only http backend over a local directory. However, I wonder if it is the right place to build a bigger backend, since it will probably come with other dependencies. What do you think? |
I agree. We could just use
Oh, they are valid, you just have a small bug in your code: In the representation as a string, the string length is The function
I haven't made my mind about that. The read-only HTTP backend has at least the problem that locking isn't possible because for that a blob of type What server implementation do you use for testing at the moment? |
return nil, err | ||
} | ||
|
||
req.Header.Add("Range", "bytes="+string(offset)+"-"+string(offset+length)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work, see http://play.golang.org/p/uVA6JAf1qD
You can just use fmt.Sprintf("bytes=%d-%d", offset, offset+length)
.
I wonder why the GetReader()
isn't tested at all, and I've added #254 to track adding such a test.
Thanks again for your contribution! I like it, and the documentation for the REST server interface is good for now. 😃 |
|
||
rb.final = true | ||
|
||
// Check key does not already exist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, do you think we can just skip the HEAD request here and just execute the POST below? In the (rather unlikely) event that a blob already exists, the POST request returns an error. This way we could save a round-trip (just one HTTP request), but maybe upload a large amount of data before receiving an error...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know the internals of the golang http server. However, if the call on HandleFunc is performed immediately when the http headers are received, it may be possible to cancel the POST request before receiving the full Body of the request (multipart). It would be something similar to the HEAD request with one round-trip. I will remove this call for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. This saves a round-trip for each newly added blob. Because the error condition we're trying to avoid here (uploading a duplicate blob) is unlikely, this should speed up the regular case (new blob) greatly.
Other minor things I noticed:
("Minor things" here means "you don't need to change existing commits, but try to apply the tips next time") |
I"ll have a look in the evening. |
I had a look, the commit indeed caches trees aggressively, but this needs A LOT of RAM (I cancelled my test backup after ~6GB)... So let's finish the REST backend in this PR and I'll deal with the tree cache in another. |
Ok, looks pretty bad. I only tested it with small backups (~3GB) and didn't noticed this memory issue. This is strange, in that case the files loaded in memory represents only 35MB. Let's focus on the REST backend for now. |
I had a look, and the backend works really well. I came across the following issues:
I tested the REST backend against a quickly thrown-together REST server in ruby, see https://gist.github.com/d83e9da3ab9364a1d912 Could you remove the aggressive caching commit and rebase against the current master, then I'll merge the PR. |
Next issue: when the REST server is not reachable, a
|
And the fuse mount with a REST backend doesn't work (at least with my test server, but I don't see any errors):
|
Thank you for the feedbacks, I will be very busy during the next days but I will try to solve these issues one evening by the end of the week. |
Sure, no need to rush anything. Thanks for your contribution! |
Regarding the singular form I mainly used it to avoid the mapping between the directories and the types. The same approach is adopted by the S3 backend. But it's not an issue to change this behavior, what do you think? Regarding the trailing slashes, it seems common to use them to distinguish a group of resources from a specific resource. The fuse mount works with my server implementation but I will check what happening with in your case. |
👍 to using plural, that's more common |
Hm, changing the behavior of the s3 backend should be backwards-compatible. I'd suggest leaving the s3 backend alone for now, but still use the plural form for the REST backend. |
I made some changes that address all the comments. Regarding fuse, it seems to work with the go server but I haven't been able to identify what goes wrong with the ruby server. I also removed completely the dependency to gorilla mux. |
Thanks a lot! Just a quick heads-up: I'm up to my neck in other work, I don't have the time to look at it right now ;) |
I've started porting the REST backend to the current code, this PR is therefore superseded by #464 |
I started working on a very basic REST backend (#23) and test server (server.go). At this point, the test server must be started in the background before running the tests. The I'm not really familiar with go and I don't really know how to stop the server after the execution. Feedbacks appreciated... ;)
Connects #23