Randomly, pushing files to Dropbox results in corrupted file or directory names that contain Unicode code-points #3609

Closed
cjnaz opened this issue Oct 10, 2019 · 4 comments

@cjnaz
Contributor

cjnaz commented Oct 10, 2019

What is the problem you are having with rclone?

See the attached zip, which contains a small Python program that empties a directory and then syncs the same files back into it. If you run it a handful of times and watch the files on Dropbox in your browser, some of the file names eventually get corrupted. It takes about 5 to 10 runs for me. I haven't isolated the issue much, but I suspect it is related to creating a file or directory that recently existed. I don't see the problem when running the same test against Google Drive.

The first image shows the correct filenames. The next two images show two different filenames getting garbled.
[three screenshots]

bang.zip

What is your rclone version (output from rclone version)

$ rclone version
rclone v1.49.4

  • os/arch: linux/amd64
  • go version: go1.13.1

Which OS you are using and how many bits (eg Windows 7, 64 bit)

$ cat /etc/centos-release
CentOS Linux release 7.7.1908 (Core)

Which cloud storage system are you using? (eg Google Drive)

Dropbox

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync tdir Dropbox:tdir

A log from the command with the -vv flag (eg output from rclone -vv copy /tmp remote:tmp)

The log looks the same for both passing and failing cases:

$ ./bangremote Dropbox:
2019/10/09 20:21:57 DEBUG : rclone: Version "v1.49.4" starting with parameters ["rclone" "-vv" "delete" "Dropbox:/tdir"]
2019/10/09 20:21:57 DEBUG : Using config file from "/home/xxx/.config/rclone/rclone.conf"
2019/10/09 20:21:57 DEBUG : Dropbox root 'tdir': Using root namespace "43255175"
2019/10/09 20:21:57 INFO : Waiting for deletions to finish
2019/10/09 20:21:58 INFO : filename_contains_\u011b_.txt: Deleted
2019/10/09 20:21:58 INFO : filename_contains_\u08ba_.txt: Deleted
2019/10/09 20:21:58 INFO : \u0420\u0443\u0441\u0441\u043a\u0438\u0439.txt: Deleted
2019/10/09 20:21:58 DEBUG : 4 go routines active
2019/10/09 20:21:58 DEBUG : rclone: Version "v1.49.4" finishing with parameters ["rclone" "-vv" "delete" "Dropbox:/tdir"]
2019/10/09 20:21:58 DEBUG : rclone: Version "v1.49.4" starting with parameters ["rclone" "-vv" "sync" "tdir" "Dropbox:/tdir"]
2019/10/09 20:21:58 DEBUG : Using config file from "/home/xxx/.config/rclone/rclone.conf"
2019/10/09 20:21:59 DEBUG : Dropbox root 'tdir': Using root namespace "43255175"
2019/10/09 20:22:00 INFO : Dropbox root 'tdir': Waiting for checks to finish
2019/10/09 20:22:00 INFO : Dropbox root 'tdir': Waiting for transfers to finish
2019/10/09 20:22:00 DEBUG : \u0420\u0443\u0441\u0441\u043a\u0438\u0439.txt: DropboxHash = e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 OK
2019/10/09 20:22:00 INFO : \u0420\u0443\u0441\u0441\u043a\u0438\u0439.txt: Copied (new)
2019/10/09 20:22:01 DEBUG : filename_contains_\u08ba_.txt: DropboxHash = e250580b3f333a028c41243752bbdc4d084d1c76b672ee5abdbcc76fbefa3cc3 OK
2019/10/09 20:22:01 INFO : filename_contains_\u08ba_.txt: Copied (new)
2019/10/09 20:22:01 DEBUG : filename_contains_\u011b_.txt: DropboxHash = e250580b3f333a028c41243752bbdc4d084d1c76b672ee5abdbcc76fbefa3cc3 OK
2019/10/09 20:22:01 INFO : filename_contains_\u011b_.txt: Copied (new)
2019/10/09 20:22:01 INFO : Waiting for deletions to finish
2019/10/09 20:22:01 INFO :
Transferred: 558 / 558 Bytes, 100%, 437 Bytes/s, ETA 0s
Errors: 0
Checks: 0 / 0, -
Transferred: 3 / 3, 100%
Elapsed time: 1.2s

2019/10/09 20:22:01 DEBUG : 10 go routines active
2019/10/09 20:22:01 DEBUG : rclone: Version "v1.49.4" finishing with parameters ["rclone" "-vv" "sync" "tdir" "Dropbox:/tdir"]

@ncw
Member

ncw commented Oct 10, 2019

Thanks for reporting this.

I can replicate your issues with

  • v1.49.4
  • v1.49.5
  • the latest beta

I replicated it by using your tdir and repeating the following command line until the file names looked corrupted.

rclone -vv delete TestDropbox:tdir ; rclone -vv sync tdir/ TestDropbox:tdir ; rclone lsl TestDropbox:tdir
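The loop above can be scripted. What follows is a minimal Python sketch, not the reporter's bang.zip program (whose contents are not shown in this thread). It runs the same three rclone commands and compares the names in the `rclone lsl` output against the local file names. The helper names `parse_lsl_names`, `find_unexpected` and `run_once` are made up for illustration, and the parsing assumes a flat directory and the four-column `lsl` layout (size, date, time, path).

```python
import os
import subprocess


def parse_lsl_names(lsl_output):
    """Extract the path column from `rclone lsl` output.

    Assumes each non-empty line is "<size> <date> <time> <path>".
    """
    names = []
    # Split on "\n" only, so unusual characters in a name never split a line.
    for line in lsl_output.split("\n"):
        fields = line.split(None, 3)  # at most 3 splits keeps the path intact
        if len(fields) == 4:
            names.append(fields[3])
    return names


def find_unexpected(expected_names, lsl_output):
    """Return remote names that do not match any expected local name."""
    expected = set(expected_names)
    return [n for n in parse_lsl_names(lsl_output) if n not in expected]


def run_once(local_dir, remote_dir):
    """One delete/sync/list cycle; returns the unexpected remote names."""
    subprocess.run(["rclone", "-vv", "delete", remote_dir], check=True)
    subprocess.run(["rclone", "-vv", "sync", local_dir, remote_dir], check=True)
    listing = subprocess.run(
        ["rclone", "lsl", remote_dir],
        check=True, capture_output=True, encoding="utf-8", errors="replace",
    ).stdout
    # Assumption: local_dir is flat, so os.listdir gives every expected name.
    return find_unexpected(os.listdir(local_dir), listing)
```

Calling `run_once("tdir", "TestDropbox:tdir")` repeatedly until it returns a non-empty list would stop at the first run where a listed name differs from the local one.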

I managed to capture a corruption event. Here is the upload request:

2019/10/10 12:40:16 DEBUG : HTTP REQUEST (req 0xc000463900)
2019/10/10 12:40:16 DEBUG : POST /2/files/upload HTTP/1.1
Host: content.dropboxapi.com
User-Agent: rclone/v1.49.5
Transfer-Encoding: chunked
Authorization: XXXX
Content-Type: application/octet-stream
Dropbox-Api-Arg: {"path":"/tdir/Русский.txt","mode":{".tag":"overwrite"},"autorename":false,"client_modified":"2000-01-01T00:00:00Z","mute":false,"strict_conflict":false}
Accept-Encoding: gzip

0

with the response

2019/10/10 12:40:17 DEBUG : HTTP RESPONSE (req 0xc000463900)
2019/10/10 12:40:17 DEBUG : HTTP/1.1 200 OK
Transfer-Encoding: chunked
Cache-Control: no-cache
Connection: keep-alive
Content-Type: application/json
Date: Thu, 10 Oct 2019 11:40:17 GMT
Pragma: no-cache
Server: nginx
Vary: Accept-Encoding
X-Dropbox-Request-Id: 3a11434e22b08d87588de53cb34c047b
X-Robots-Tag: noindex, nofollow, noimageindex
X-Server-Response-Time: 553

257
{"name": "\u00d0\u00a0\u00d1\u0083\u00d1\u0081\u00d1\u0081\u00d0\u00ba\u00d0\u00b8\u00d0\u00b9.txt", "path_lower": "/tdir/\u00f0\u00a0\u00f1\u0083\u00f1\u0081\u00f1\u0081\u00f0\u00ba\u00f0\u00b8\u00f0\u00b9.txt", "path_display": "/tdir/\u00d0\u00a0\u00d1\u0083\u00d1\u0081\u00d1\u0081\u00d0\u00ba\u00d0\u00b8\u00d0\u00b9.txt", "id": "id:qqS0afUHwS0AAAAAAAPHng", "client_modified": "2000-01-01T00:00:00Z", "server_modified": "2019-10-10T11:40:17Z", "rev": "5948cdd90ee8c082af73a", "size": 0, "is_downloadable": true, "content_hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"}
0

The path looks sensible in the upload request, but the name returned in the response decodes to this:

>>> print("\u00d0\u00a0\u00d1\u0083\u00d1\u0081\u00d1\u0081\u00d0\u00ba\u00d0\u00b8\u00d0\u00b9.txt")
РÑÑÑкий.txt

This is the same corruption I saw in the lsl listing:

      279 2000-01-01 00:00:00.000000000 filename_contains_ě_.txt
      279 2000-01-01 00:00:00.000000000 filename_contains_ࢺ_.txt
        0 2000-01-01 00:00:00.000000000 Р���кий.txt

Since Dropbox returned the corrupted name in its response, I conclude that the corruption is happening at Dropbox.
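The name in the response is consistent with the UTF-8 bytes of the uploaded name being read back one byte per character, as Latin-1. The short Python check below illustrates this using the two strings from the request and response. The function names are hypothetical, and this only demonstrates the pattern of the corruption, not where on the server side it was introduced.

```python
# File name as uploaded (code points taken from the rclone log).
original = "\u0420\u0443\u0441\u0441\u043a\u0438\u0439.txt"

# "name" field as returned in the Dropbox upload response.
returned = ("\u00d0\u00a0\u00d1\u0083\u00d1\u0081\u00d1\u0081"
            "\u00d0\u00ba\u00d0\u00b8\u00d0\u00b9.txt")


def as_latin1_mojibake(name):
    """Encode as UTF-8, then misread each byte as one Latin-1 character."""
    return name.encode("utf-8").decode("latin-1")


def repair_latin1_mojibake(name):
    """Reverse the misreading: back to bytes via Latin-1, then decode UTF-8."""
    return name.encode("latin-1").decode("utf-8")


print(as_latin1_mojibake(original) == returned)      # True
print(repair_latin1_mojibake(returned) == original)  # True
```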

Here is the complete log: try1.log

Any thoughts on this @diwakergupta or suggestions as to where to report this?

@ncw ncw modified the milestones: v1.50, Known Problem Oct 10, 2019
@ncw
Member

ncw commented Oct 10, 2019

I posted about this on the Dropbox developer forum.

@ncw
Member

ncw commented Oct 10, 2019

On the forum I had this reply from Greg K:

Thanks for the report! Sending non-ASCII characters in HTTP headers is not officially supported, so please make sure you're encoding any non-ASCII characters in headers as documented here:

https://www.dropbox.com/developers/reference/json-encoding

There was a change on our server stack yesterday that affected how we handled HTTP headers without proper encoding. That resulted in malformed file paths/names like the user reported in the GitHub issue. We've reverted that change, so that should be working again, but please make sure your headers get encoded properly.

So yes, there was a problem at Dropbox, and it has now been rolled back.

However, rclone is doing something wrong too: it sends non-ASCII characters unescaped in the Dropbox-Api-Arg header.

The problem is caused by the Dropbox SDK, so I'll report a bug there next...
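To illustrate the kind of header encoding Greg K's reply asks for, here is a small Python sketch that serializes the `Dropbox-Api-Arg` value as ASCII-only JSON, escaping non-ASCII characters as `\uXXXX`. It is not the SDK's or rclone's code (both are written in Go), and `encode_api_arg` is a hypothetical helper name. The argument dictionary is copied from the captured request earlier in this thread.

```python
import json


def encode_api_arg(args):
    """Serialize args as ASCII-only JSON for use as an HTTP header value."""
    # ensure_ascii=True (Python's default) writes every non-ASCII character
    # as a \uXXXX escape, so the header itself contains only ASCII bytes.
    return json.dumps(args, ensure_ascii=True, separators=(",", ":"))


api_arg = {
    "path": "/tdir/\u0420\u0443\u0441\u0441\u043a\u0438\u0439.txt",
    "mode": {".tag": "overwrite"},
    "autorename": False,
    "client_modified": "2000-01-01T00:00:00Z",
    "mute": False,
    "strict_conflict": False,
}

header_value = encode_api_arg(api_arg)
print(header_value)
```

Decoding `header_value` with `json.loads` gives back the original path, so the escaping changes only the bytes on the wire and not the file name the server should see.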

@ncw
Member

ncw commented Oct 10, 2019

Here is the upstream issue in the Dropbox SDK: dropbox/dropbox-sdk-go-unofficial#54
