SrcFsType can't be used to reliably detect backend in metadata mapper #7848
Heads up @rclone/support - the "Support Contract" label was applied to this issue.
…apper: Before this change, on files which have unknown length (like Google Documents), the SrcFsType would be set to "memoryFs". This change fixes the problem by getting the Copy function to pass the src Fs into a variant of Rcat. Fixes #7848
Google docs are handled in a special way in rclone. We don't know their size until after they are downloaded, so we can't use the standard copy routines. Instead we use the internal equivalent of rclone rcat, and in this process rclone loses the knowledge of the source backend.

I've had a go at fixing this here - can you give it a try? Note that this includes the fix for #7845 which is very relevant!

v1.67.0-beta.7962.0681eb1c7.fix-7848-metadata-mapper on branch fix-7848-metadata-mapper (uploaded in 15-30 mins)
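To show what that change means for the mapper input, here is a toy sketch (made-up Go types; SrcFsType and the object.memoryFs / drive values are taken from this issue, the Metadata field and everything else are assumptions) of the blob the mapper would see before and after the fix:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mapperInput is a toy model of the JSON blob handed to --metadata-mapper
// for one file. SrcFsType is the field named in this issue; Metadata and
// the rest are illustrative.
type mapperInput struct {
	SrcFsType string            `json:"SrcFsType"`
	Metadata  map[string]string `json:"Metadata"`
}

// blobBeforeFix models the old streaming path: the source backend has been
// forgotten by the time the mapper runs, so it reports the in-memory Fs.
func blobBeforeFix(meta map[string]string) mapperInput {
	return mapperInput{SrcFsType: "object.memoryFs", Metadata: meta}
}

// blobAfterFix models the fix: Copy passes the real source Fs through to
// the Rcat variant, so the mapper sees e.g. "drive" for a Google Doc.
func blobAfterFix(srcFsType string, meta map[string]string) mapperInput {
	return mapperInput{SrcFsType: srcFsType, Metadata: meta}
}

func main() {
	meta := map[string]string{"content-type": "application/vnd.google-apps.document"}
	before, _ := json.Marshal(blobBeforeFix(meta))
	after, _ := json.Marshal(blobAfterFix("drive", meta))
	fmt.Println(string(before)) // SrcFsType is object.memoryFs
	fmt.Println(string(after))  // SrcFsType is drive
}
```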
Results with the beta, now vs. before:

Without …, note that the mapper is invoked twice for a single file. With …, note that the mapper is invoked only once.
It looks like the changes I made are working.

Some backends (such as onedrive) can't upload a file of unknown length. You can see these backends with StreamUpload = N in the overview table. Google docs don't have a defined length - you have to download them to find out how long they are. So either we cache them in memory (if size < --streaming-upload-cutoff) or we cache them on disk.

If we cache them in memory then you will just get the single transfer and the single invocation of the metadata mapper. If we are caching them on disk then you will get the transfer to the local disk (with a metadata invocation) followed by the transfer from local disk to destination (with a metadata invocation). Rclone takes some care that the metadata is correct when transferring via the local disk.

In an ideal world we'd change the onedrive backend to accept streaming uploads. I had a quick review of the uploading methods and still think that you can't upload files to onedrive without knowing how big they are (there is a stack overflow question which confirms this), but you may have a better idea. This would be the best solution if possible.

In the meantime I'm going to attempt to rejig the Rcat code so that it doesn't use the rclone machinery to copy an object to local disk. This will mean writing a bit more code but it would mean you'd only get one transfer. I don't think it would make the transfer less reliable. It would affect the stats slightly but I suspect currently they are quite confusing!
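As a rough illustration of that memory-vs-disk choice, here is a self-contained sketch (not rclone's implementation; the cutoff constant stands in for --streaming-upload-cutoff and temp-file cleanup is omitted):

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// cutoff stands in for rclone's --streaming-upload-cutoff; below it the
// stream is buffered in memory, above it it is cached in a temporary file.
const cutoff = 100 * 1024

// spool reads a stream of unknown length (such as an exported Google Doc)
// just far enough to learn whether it fits under the cutoff.
func spool(in io.Reader) (data io.Reader, size int64, onDisk bool, err error) {
	buf := make([]byte, cutoff+1)
	n, err := io.ReadFull(in, buf)
	if err == io.EOF || err == io.ErrUnexpectedEOF {
		// Fits in memory: one transfer, one metadata mapper invocation.
		return bytes.NewReader(buf[:n]), int64(n), false, nil
	}
	if err != nil {
		return nil, 0, false, err
	}
	// Too big for memory: spill the prefix plus the rest to a temp file.
	// Uploading this file with the normal copy machinery is what produces
	// the second transfer and second mapper invocation described above.
	tmp, err := os.CreateTemp("", "rclone-spool-")
	if err != nil {
		return nil, 0, false, err
	}
	if _, err := tmp.Write(buf[:n]); err != nil {
		return nil, 0, false, err
	}
	rest, err := io.Copy(tmp, in)
	if err != nil {
		return nil, 0, false, err
	}
	if _, err := tmp.Seek(0, io.SeekStart); err != nil {
		return nil, 0, false, err
	}
	return tmp, int64(n) + rest, true, nil
}

func main() {
	_, size, onDisk, err := spool(bytes.NewReader(bytes.Repeat([]byte("x"), 200)))
	if err != nil {
		panic(err)
	}
	fmt.Println(size, onDisk) // 200 false: small enough to stay in memory
}
```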
… twice: The --metadata-mapper was being called twice for files that rclone needed to stream to disk. This happened only for:
- files bigger than --streaming-upload-cutoff
- on backends which didn't support PutStream
This also meant that these were being logged as two transfers, which was a little strange. This fixes the problem by not using operations.Copy to upload the file once it has been streamed to disk, instead using the Put method on the backend. This should have no effect on the reliability of the transfers. This also tidies up the Rcat function to make it clear there are 3 ways of uploading the data and make it easy to see that it gets verified on all those paths. See #7848
I've had a go at fixing this by re-working the internals of rcat. I've run the local and onedrive integration tests with this and I think it is looking OK, but it could do with more testing.

v1.67.0-beta.7967.12d4964f2.fix-7848-metadata-mapper on branch fix-7848-metadata-mapper (uploaded in 15-30 mins)
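To make the before/after concrete, a toy model (not rclone code; the names, remotes, and flow are illustrative) of why the mapper used to fire twice for spooled files and, with the rework, fires once:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

var mapperCalls int

// upload stands in for one rclone transfer; as described in the comments
// above, each transfer invokes the metadata mapper once.
func upload(dst, name string, in io.Reader, srcFsType string) {
	mapperCalls++
	fmt.Printf("mapper call %d: %s -> %s (SrcFsType=%q)\n", mapperCalls, name, dst, srcFsType)
	_, _ = io.Copy(io.Discard, in)
}

// oldFlow models the pre-fix behaviour for a file streamed to disk: the
// spool to local disk is one transfer (mapper call 1) and the copy of the
// spooled file to the destination is another (mapper call 2).
func oldFlow(data []byte) {
	upload("local disk", "doc.docx", bytes.NewReader(data), "drive")
	upload("onedrive", "doc.docx", bytes.NewReader(data), "local")
}

// newFlow models the post-fix behaviour: the spooled file is handed to the
// destination backend's Put directly, so there is one transfer and one
// mapper invocation, still labelled with the real source backend.
func newFlow(data []byte) {
	upload("onedrive", "doc.docx", bytes.NewReader(data), "drive")
}

func main() {
	data := []byte("exported google doc")
	oldFlow(data) // prints two mapper calls
	mapperCalls = 0
	newFlow(data) // prints one mapper call
}
```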
My research turned up the same. Uploading files of unknown length does not seem possible with OneDrive/SharePoint. The fix in the new beta works in my testing.
… twice: The --metadata-mapper was being called twice for files that rclone needed to stream to disk. This happened only for:
- files bigger than --streaming-upload-cutoff
- on backends which didn't support PutStream
This also meant that these were being logged as two transfers, which was a little strange. This fixes the problem by not using operations.Copy to upload the file once it has been streamed to disk, instead using the Put method on the backend. This should have no effect on the reliability of the transfers as we retry Put if possible. This also tidies up the Rcat function to make the different ways of uploading the data clearer and make it easy to see that it gets verified on all those paths. See #7848
Thanks for testing. I've merged this to master now which means it will be in the latest beta in 15-30 minutes and released in v1.67.
The associated forum post URL from https://forum.rclone.org

N/A
What is the problem you are having with rclone?
The metadata mapper receives a JSON object from rclone that contains the SrcFsType. We've built our mapper to detect the values drive and onedrive so the appropriate mapping logic can be invoked, returning to rclone a JSON object compatible with the target. In some cases (e.g., a Google doc), SrcFsType is set to an unexpected value like object.memoryFs, and we don't handle that type. I could further inspect the object to look for something like content-type and infer that a value of application/vnd.google-apps.document means a drive backend, but I'd have to identify and maintain a list of content types to look for. I'm also not sure if object.memoryFs is the only oddball value I need to recognize.
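For illustration, a trimmed-down sketch of a mapper like the one described above (not the actual mapper from this report; rclone runs the program with a JSON blob on stdin and reads JSON back from stdout, and the "mapped-by" metadata keys here are purely made up):

```go
// A sketch of a --metadata-mapper program that switches on SrcFsType.
// SrcFsType comes from this issue; the Metadata field and example keys
// are assumptions for the sake of the sketch.
package main

import (
	"encoding/json"
	"log"
	"os"
)

type blob struct {
	SrcFsType string            `json:"SrcFsType"`
	Metadata  map[string]string `json:"Metadata"`
}

func main() {
	var in blob
	if err := json.NewDecoder(os.Stdin).Decode(&in); err != nil {
		log.Fatalf("decoding mapper input: %v", err)
	}
	if in.Metadata == nil {
		in.Metadata = map[string]string{}
	}
	switch in.SrcFsType {
	case "drive":
		// Google Drive specific mapping would go here.
		in.Metadata["mapped-by"] = "drive-rules"
	case "onedrive":
		// OneDrive specific mapping would go here.
		in.Metadata["mapped-by"] = "onedrive-rules"
	default:
		// This is the problem described above: values like "object.memoryFs"
		// land here and there is no reliable way to choose a mapping.
		log.Printf("unhandled SrcFsType %q - passing metadata through", in.SrcFsType)
	}
	out := map[string]any{"Metadata": in.Metadata}
	if err := json.NewEncoder(os.Stdout).Encode(out); err != nil {
		log.Fatalf("encoding mapper output: %v", err)
	}
}
```

It would be wired up with something like rclone copy gdrive: od: -M --metadata-mapper ./mapper, where the remote names are placeholders.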
What is your rclone version (output from rclone version)
)Which OS you are using and how many bits (e.g. Windows 7, 64 bit)
Windows 11, 64-bit
Which cloud storage system are you using? (e.g. Google Drive)
The command you were trying to run (e.g. rclone copy /tmp remote:tmp)

copy Source: Target:
A log from the command with the -vv flag (e.g. output from rclone -vv copy /tmp remote:tmp)