
serve: add s3 server - the second #7062

Closed
wants to merge 11 commits into from

Conversation

@individual-it (Contributor) commented Jun 16, 2023

What is the purpose of this change?

Implements s3 server for rclone

This is basically #6461 by @Mikubill rebased on current master and solved the conflicts

Also I've removed the http auth (I don't understand why it was needed) and added the fix from @oniumy to get S3 auth working (rclone/gofakes3#1)

Give Honor to Whom It Is Due:

Checklist

  • I have read the contribution guidelines.
  • I have added tests for all changes in this PR if appropriate.
  • I have added documentation for the changes if appropriate.
  • All commit messages are in house style.
  • I'm done, this Pull Request is ready for review :-)

@individual-it individual-it marked this pull request as ready for review June 16, 2023 07:28
This was referenced Jun 16, 2023
@individual-it (Contributor, author) commented Jun 21, 2023

There was an issue when using the mc client with file/folder names that need to be URL encoded. It's fixed in the upstream project and waiting for a merge there, which is why I've added a replace line to go.mod.
I think this is ready for review

@ncw (Member) commented Jun 22, 2023

I note that this doesn't credit any of the original authors...

What I would do is squash Mikubill's work into a single commit. Rebase and amend that commit until it compiles.

Then put any changes you make in a separate commit.

@individual-it (Contributor, author)

@ncw no question, the work was done by @Mikubill; I didn't mean to claim it as mine.

I've done as you have suggested and now there are two commits:

The first makes the server basically work

The second commit:

I don't understand why the racequicktest failed in CI; it works fine locally, so I hope it will pass in the next run.

@ncw (Member) commented Jun 23, 2023

Nice - thank you.

Can you change the commits so their first lines are house style?

feat: let rclone act as an S3 compatible server
fix: file name encoding using s3 serve with mc client ...

To something like

serve s3: let rclone act as an S3 compatible server
serve s3: fix file name encoding using s3 serve with mc client ...

Thank you.

I'll re-run the CI now.

I won't be able to give this a thorough review until after (the delayed) 1.63 is out the door but I'd like to merge for 1.64.

@ncw (Member) commented Sep 1, 2023

Is this still a candidate for 1.64? We could merge it as experimental? What do you think?

@individual-it (Contributor, author)

@ncw from my point of view this is ready to be merged. Yesterday I rebased on current master and all tests pass.

@ncw (Member) left a review:

This is looking excellent :-) Especially good work on the tests!

I put some minor comments inline. The most concerning is probably the VFS vs Fs one - we should probably have a plan for changing that even if we don't do it immediately.

What is the status of this library, github.com/Mikubill/gofakes3? I see you forked it. Do you plan to upstream your changes, or pull back changes from the upstream?

I think we can get this merged for v1.64 :-)

func (db *s3Backend) getObjectsListArbitrary(bucket string, prefix *gofakes3.Prefix, response *gofakes3.ObjectList) error {

// ignore error - vfs may have uncommitted updates, such as new dir etc.
_ = walk.ListR(context.Background(), db.fs.Fs(), bucket, false, -1, walk.ListObjects, func(entries fs.DirEntries) error {

It's best to use either the VFS or the lower level Fs layer. Trying to mix and match causes problems, as the VFS's view of the world differs from the Fs's, as you've discovered.

Just wondering why you used walk.ListR here?

case gofakes3.LogInfo:
fs.Debugf(nil, fmt.Sprintln(v...))
default:
panic("unknown level")

A panic on an unknown log level seems a little unkind! Perhaps make it fall through to the error case?

// fs.Infof(nil, fmt.Sprintln(v...))
switch level {
case gofakes3.LogErr:
fs.Errorf(nil, fmt.Sprintln(v...))

Note that those nil values should probably be something like "serve s3" so the logs have a context.
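Both review suggestions (fall through to the error case instead of panicking, and tag the logs with a "serve s3" context) could look roughly like this self-contained sketch. The level constants and the helper here are stand-ins, not the actual gofakes3 or rclone APIs:

```go
package main

import (
	"fmt"
	"strings"
)

// Stand-in log levels mirroring gofakes3.LogErr/LogWarn/LogInfo.
type logLevel int

const (
	logErr logLevel = iota
	logWarn
	logInfo
)

// logWithContext routes a gofakes3-style log call to a leveled logger,
// tagging every line with a "serve s3" context instead of nil, and
// treating unknown levels as errors rather than panicking.
func logWithContext(level logLevel, v ...interface{}) string {
	msg := strings.TrimSuffix(fmt.Sprintln(v...), "\n")
	switch level {
	case logWarn:
		return "serve s3: WARN " + msg
	case logInfo:
		return "serve s3: INFO " + msg
	case logErr:
		fallthrough
	default: // unknown level: fall back to the error case, no panic
		return "serve s3: ERROR " + msg
	}
}

func main() {
	fmt.Println(logWithContext(logInfo, "bucket", "created"))
	fmt.Println(logWithContext(logLevel(42), "unexpected"))
}
```

In the real code the returned strings would instead go through fs.Errorf/fs.Infof with "serve s3" as the first argument.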

return
}

func RandString(n int) string {

This looks like a re-implementation of lib/random.String()?
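For context, a helper of this kind typically looks like the sketch below; rclone already ships one as lib/random.String, so the serve code should call that rather than carrying its own copy. This stand-alone version using crypto/rand is purely illustrative and is not rclone's actual implementation:

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

const letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

// randString is a stand-alone illustration of a random string helper.
// In rclone itself this duplicates lib/random.String, which should be
// used instead of re-implementing the same logic per package.
func randString(n int) (string, error) {
	b := make([]byte, n)
	for i := range b {
		// Pick each character uniformly from the allowed set.
		idx, err := rand.Int(rand.Reader, big.NewInt(int64(len(letters))))
		if err != nil {
			return "", err
		}
		b[i] = letters[idx.Int64()]
	}
	return string(b), nil
}

func main() {
	s, err := randString(16)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(s)) // 16
}
```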


Please note that some clients may require HTTPS endpoints.
See [#SSL](#ssl-tls) for SSL configuration.


It would be nice (future work) to get this to support the auth proxy. That allows different login credentials to choose a different vfs under the control of a short program which opens up the use of the rclone servers amazingly.


switch b := node.(type) {
case vfs.Node:
o = b.DirEntry().(fs.Object)

Note that o may be nil here.

This happens when a file is in the process of being uploaded but hasn't made it to the storage yet.

In fact I worked around this exact problem in rclone serve sftp here 7fc573d

case vfs.Node:
o = b.DirEntry().(fs.Object)
case fs.DirEntry:
o = b.(fs.Object)

Why not use case fs.Object? Then you can avoid a potential panic:

case fs.Object:
    o = b
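Putting the two inline comments together (the object may be nil mid-upload, and the unchecked b.(fs.Object) assertion can panic), a nil-safe version of the switch might look like this sketch. The DirEntry/Object/Node types here are minimal stand-ins so the example is self-contained; they are not the real rclone vfs/fs interfaces:

```go
package main

import "fmt"

// Minimal stand-ins for fs.DirEntry, fs.Object and vfs.Node.
type DirEntry interface{ Remote() string }

type Object interface {
	DirEntry
	Size() int64
}

type Node interface{ DirEntry() DirEntry }

type file struct {
	name string
	size int64
}

func (f file) Remote() string { return f.name }
func (f file) Size() int64    { return f.size }

type node struct{ entry DirEntry }

func (n node) DirEntry() DirEntry { return n.entry }

// objectOf extracts an Object without risking a panic: it uses the
// comma-ok form of the type assertion and reports failure for entries
// that have no backing object yet (e.g. a file still being uploaded).
func objectOf(v interface{}) (Object, bool) {
	switch b := v.(type) {
	case Node:
		o, ok := b.DirEntry().(Object) // may be nil mid-upload
		return o, ok && o != nil
	case Object: // checked case instead of an unchecked b.(Object)
		return b, true
	default:
		return nil, false
	}
}

func main() {
	if o, ok := objectOf(node{entry: file{name: "a.txt", size: 3}}); ok {
		fmt.Println(o.Remote())
	}
	_, ok := objectOf(node{}) // no underlying object yet
	fmt.Println(ok)
}
```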

@ncw ncw modified the milestones: v1.64, v1.65 Sep 11, 2023
@wizpresso-steve-cy-fan commented Sep 12, 2023

It seems like the operation is single-threaded. If I have a long-blocking task such as mc du, then all subsequent requests are blocked as well.

The current copy feature also seems suboptimal. With a very large, deep folder tree (40 directories with more than 10K files each), the whole copy operation gets stuck on LIST BUCKET.

Also, I suggest removing the lock entirely and turning to optimistic concurrency control instead: S3 is an eventually consistent storage provider, so strong guarantees about bucket listings or update visibility don't matter.
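For illustration, one common shape of optimistic concurrency control is a version-stamped value with a retry loop, so a lock is held only for the brief commit rather than for the whole operation. This is a generic sketch of the idea, not code from this PR:

```go
package main

import (
	"fmt"
	"sync"
)

// versioned pairs a value with a version counter, the basis of
// optimistic concurrency control: writers detect, rather than
// prevent, conflicting updates.
type versioned struct {
	value   string
	version uint64
}

type store struct {
	mu   sync.Mutex // held only for the brief read/commit, never during work
	data map[string]versioned
}

// update applies fn outside the lock and commits only if nobody else
// changed the value in the meantime; on conflict it simply retries.
func (s *store) update(key string, fn func(string) string) {
	for {
		s.mu.Lock()
		cur := s.data[key]
		s.mu.Unlock()

		next := fn(cur.value) // possibly slow work, done without the lock

		s.mu.Lock()
		if s.data[key].version == cur.version { // no concurrent write: commit
			s.data[key] = versioned{value: next, version: cur.version + 1}
			s.mu.Unlock()
			return
		}
		s.mu.Unlock() // lost the race: retry against the newer value
	}
}

func main() {
	s := &store{data: map[string]versioned{}}
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.update("k", func(v string) string { return v + "x" })
		}()
	}
	wg.Wait()
	fmt.Println(len(s.data["k"].value)) // 10: every append lands despite races
}
```

The trade-off is that slow updates may retry under contention, but concurrent uploads no longer serialize behind one long-held lock.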

@ncw (Member) commented Sep 12, 2023

@individual-it we missed v1.64 but there is always v1.65! Did you see the review comments above?

@individual-it (Contributor, author)

@ncw yes, seen that, and @saw-jan is working on it. Some of the comments will be tricky because they concern code from the original PR and we are not sure why it was implemented that way, but we will try to work it out.

@saw-jan (Contributor) commented Sep 12, 2023

@ncw I have made some requested changes. Could you review it again, please?

return result, err
}

hasher := md5.New()

I was looking at the memory usage on Mikubill's original PR and noticed that this hash object never has its Sum() method called, so it soaks up memory and CPU calculating the MD5 sum but never actually collects it for any purpose. There was an observation that memory usage rises to 2x object size as an object is uploaded and then drops back down; I suspect this MD5 object contributes to that.

Maybe just remove it and see how that affects memory usage? There doesn't seem to be anywhere Sum() would be used in the current code anyway.


@sychan I removed the hasher. Please have a look.


Looks good, thanks!

@ncw (Member) commented Sep 22, 2023

Let me know when you are ready for another review.

I'd like to merge this for v1.65. Maybe we'll mark it as experimental to start with.

@saw-jan (Contributor) commented Sep 25, 2023

@ncw PR is ready for another round of review

@maximilize commented:

I tested this feature with quite a large amount of data, serving it via s3. Everything has worked well so far. It would be great to see this merged soon.

@saw-jan (Contributor) commented Nov 13, 2023

Tests failed on Windows. What does this suggest?

--- FAIL: TestMkdirAllOnDrive (0.01s)
    mkdir_windows_test.go:82: 
        	Error Trace:	D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:82
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:117
        	Error:      	Received unexpected error:
        	            	mkdir \\?\: The filename, directory name, or volume label syntax is incorrect.
        	Test:       	TestMkdirAllOnDrive
--- FAIL: TestMkdirAllOnUnusedDrive (0.00s)
    mkdir_windows_test.go:92: 
        	Error Trace:	D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:92
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:97
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:135
        	Error:      	Should be true
        	Test:       	TestMkdirAllOnUnusedDrive
        	Messages:   	mkdir \\?\: The filename, directory name, or volume label syntax is incorrect.
    mkdir_windows_test.go:92: 
        	Error Trace:	D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:92
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:98
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:135
        	Error:      	Should be true
        	Test:       	TestMkdirAllOnUnusedDrive
        	Messages:   	mkdir \\?\: The filename, directory name, or volume label syntax is incorrect.
    mkdir_windows_test.go:92: 
        	Error Trace:	D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:92
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:99
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:135
        	Error:      	Should be true
        	Test:       	TestMkdirAllOnUnusedDrive
        	Messages:   	mkdir \\?\: The filename, directory name, or volume label syntax is incorrect.
    mkdir_windows_test.go:92: 
        	Error Trace:	D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:92
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:100
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:135
        	Error:      	Should be true
        	Test:       	TestMkdirAllOnUnusedDrive
        	Messages:   	mkdir \\?\: The filename, directory name, or volume label syntax is incorrect.
    mkdir_windows_test.go:92: 
        	Error Trace:	D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:92
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:101
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:135
        	Error:      	Should be true
        	Test:       	TestMkdirAllOnUnusedDrive
        	Messages:   	mkdir \\?\: The filename, directory name, or volume label syntax is incorrect.
    mkdir_windows_test.go:92: 
        	Error Trace:	D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:92
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:102
        	            				D:/a/rclone/rclone/lib/file/mkdir_windows_test.go:135
        	Error:      	Should be true
        	Test:       	TestMkdirAllOnUnusedDrive
        	Messages:   	mkdir \\?\: The filename, directory name, or volume label syntax is incorrect.
FAIL
FAIL	github.com/rclone/rclone/lib/file	0.092s

@ncw (Member) commented Nov 13, 2023

Tests failed on Windows. What does this suggest?

These are failing in master... Something has changed in the CI environment. I will fix this separately!

I'm planning to try to merge this PR this week. I'll start by pulling it locally and giving it a good test!

@ncw (Member) commented Nov 13, 2023

I have pulled this locally and I'm polishing it so please don't add any more commits.

There appears to be a problem with upload: if you upload a 1G file then rclone serve s3 grows in memory use by 1G, so I guess uploads are being buffered in memory. This doesn't appear to be in the code here, so it must be in gofakes3. This isn't a barrier to the PR being merged, but we should try to fix it!

Also, does anyone know why there is so much locking in backend.go? This effectively serializes the uploads so only one happens at a time. I might try removing it in a bit!

Otherwise this is looking excellent :-) I will merge it with an experimental tag in due course.

@ncw (Member) commented Nov 14, 2023

I have finished my polishing of this code!

v1.65.0-beta.7484.b80be190d.pr-7062-serve-s3 on branch pr-7062-serve-s3 (uploaded in 15-30 mins)

For some reason I couldn't push it back to this PR like I normally do.

The main changes are in b80be19

- Changes
    - Rename `--s3-authkey` to `--auth-key` to get it out of the s3 backend namespace
    - Enable `Content-MD5` integrity checks
    - Remove locking after code audit
- Documentation
    - Factor out documentation into separate file
    - Add Quickstart to docs
    - Add Bugs section to docs
    - Add experimental tag to docs
    - Add rclone provider to s3 backend docs
- Fixes
    - Correct quirks in s3 backend
    - Change fmt.Printlns into fs.Logs
    - Make metadata storage per backend not global
    - Log on startup if anonymous access is enabled
- Coding style fixes
    - rename fs to vfs to save confusion with the rest of rclone code
    - rename db to b for *s3Backend

I wrote a lot of documentation and did a lot of testing and I'm very happy with the code - well done everyone :-)

We need to fix this at some point (from the docs)

When uploading multipart files, serve s3 holds all the parts in
memory. This is a limitation of the library rclone uses for serving S3
and will hopefully be fixed at some point.

We'll fix that at some point, but I put a workaround into rclone so you can disable multipart uploads when using rclone as a client for this server.

I propose to merge this - any objections?

We can add more PRs on top, but let's get this main part of the work merged for v1.65.

@individual-it (Contributor, author)

I would be very happy if it could get merged (and maybe tested by a wider audience).

@wizpresso-steve-cy-fan

Also, does anyone know why there is so much locking in backend.go? This effectively serializes the uploads so only one happens at a time. I might try removing it in a bit!

I guess it is to prevent TOCTOU; I already raised this concern before.

@ncw ncw closed this in 93f35c9 Nov 16, 2023
@ncw (Member) commented Nov 16, 2023

I've merged this to master now, which means it will be in the latest beta in 15-30 minutes and released in v1.65.

Thank you everyone for contributing - it is a great feature.

@wizpresso-steve-cy-fan I did remove the locking. I think it is OK for an eventually consistent system like S3 which doesn't give guarantees like a file system does.

@sychan commented Nov 16, 2023

There appears to be a problem with upload - if you upload a 1G file then rclone serve s3 grows in memory use by 1G so I guess these are being buffered in memory. This doesn't appear to be in the code here, so it must be in gofakes. This isn't a barrier to the PR being merged but we should try to fix it!

The memory usage used to increase by 2x the upload size until the unused md5 hasher object was removed; now I guess it only increases by 1x object size. A quick look through the gofakes3 code seems to show that on an S3 PUT it reads the whole object into memory, which is probably why you're still seeing a lot of memory usage. This new capability will be great for downloading over S3, but pretty limited for uploading larger objects.

https://github.com/johannesboyne/gofakes3/blob/master/backend/s3mem/backend.go#L224

@ncw (Member) commented Nov 16, 2023

@sychan - it is only multipart uploads which use a lot of extra memory. I tweaked the Rclone provider in the s3 backend so it disables them. So you can do rclone -> rclone transfers without using mega memory.
