WIP: handle special filenames on various backends #3148

Merged
ncw merged 40 commits into master from filename-encode on Sep 30, 2019

Conversation

B4dM4n
Collaborator

@B4dM4n B4dM4n commented Apr 30, 2019

What is the purpose of this change?

This is the follow-up to #2955. It uses the lib/encoder package to translate filenames on backends with a restricted set of usable characters.

It defines the set of all available backend encodings in fs/encodings/encodings.go and updates the existing backends to use them.

As a quick test the change also includes a new flag for rclone mount, --vfs-name-encoding, which can be used to change the mounted filenames. For example setting --vfs-name-encoding=local-windows on a Linux platform will encode all special Windows filename characters and allow the mount to be shared via SMB to Windows clients with all filenames readable.
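To illustrate the idea of translating restricted characters, here is a minimal Python sketch. It is not the lib/encoder implementation: the function names, the reserved set, and the choice of mapping each reserved ASCII character to its fullwidth Unicode counterpart (code point + 0xFEE0) are assumptions made for this example.

```python
# Hypothetical illustration only; not rclone's lib/encoder code.
# Characters that Windows does not allow in a file name ('/' is left out
# because it is treated as the path separator here).
WINDOWS_RESERVED = '<>:"\\|?*'

# Offset between an ASCII character and its fullwidth form (U+FF01..U+FF5E).
FULLWIDTH_OFFSET = 0xFEE0


def encode_name(name, reserved=WINDOWS_RESERVED):
    """Replace every reserved character with its fullwidth look-alike."""
    return "".join(
        chr(ord(ch) + FULLWIDTH_OFFSET) if ch in reserved else ch
        for ch in name
    )


def decode_name(name, reserved=WINDOWS_RESERVED):
    """Reverse encode_name by mapping fullwidth look-alikes back to ASCII."""
    back = {chr(ord(ch) + FULLWIDTH_OFFSET): ch for ch in reserved}
    return "".join(back.get(ch, ch) for ch in name)


if __name__ == "__main__":
    original = 'report: "final"?.txt'
    encoded = encode_name(original)
    print(encoded)
    print(decode_name(encoded) == original)
```

This simple version cannot tell an encoded character apart from a fullwidth character that was already in the original name, so a real scheme needs some form of escaping for that case.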

Backends completed:

  • box
  • drive
  • dropbox
  • jottacloud: changes the set of replaced characters, based on tests with rclone info
  • local: includes a rewrite of path cleaning on Windows. Should be equal to the old implementation, but much simpler to read.
  • onedrive
  • opendrive

Backend encoding added as determined by the last test, but not implemented:

  • s3: should not need to encode CTL and invalid UTF-8 bytes
  • pcloud
  • mega
  • b2

I'm currently not sure if the documentation is detailed enough to explain the mechanics. Any help to improve it is welcome.

This PR is not meant to be merged as one, but to serve as a central issue to track the progress.
Otherwise it should be ready to be tested.
It is available as a beta under https://beta.rclone.org/branch/ with a name like v1.47.0-???-g????????-filename-encode-beta.

Was the change discussed in an issue or in the forum before?

Fixes:

@ncw
Member

ncw commented May 13, 2019

Apologies for the delay in responding - I had a bit of a backlog!

This is amazing work - thank you :-)

As a quick test the change also includes a new flag for rclone mount, --vfs-name-encoding, which can be used to change the mounted filenames. For example setting --vfs-name-encoding=local-windows on a Linux platform will encode all special Windows filename characters and allow the mount to be shared via SMB to Windows clients with all filenames readable

That is brilliant.

I'm currently not sure if the documentation is detailed enough to explain the mechanics. Any help to improve it is welcome.

I'm very happy to help with the documentation though looking at it I think you've done a good job.

(BTW did you use a tool to generate the markdown tables of encodings?)

I think one thing we should prepare is a document explaining what users might expect in the way of backwards incompatibility when moving from the old system to the new system. This will be dependent on OS and backend of course.

I read through the code. A lot of it is very straightforward. I think the biggest risk for backwards incompatibilities is probably in the local backend.

Thanks for getting rid of WinPath in fstest - that was a terrible idea in retrospect ;-)

I'm just running the integration tests for the branch - I'll post a link when they are done.

@ncw
Member

ncw commented May 13, 2019

Here are the results of the integration test: https://pub.rclone.org/integration-tests/2019-05-13-170215/

You can ignore Mega/DirRename - that was failing already.

The jottacloud one looks relevant

HTTP error 400 (400 Bad Request) returned body: "{"code":400,"message":"Illegal character in name of file/path; name=hello? sausage","cause":"","error_id":"IllegalArgumentException","x-id":"b2-onXCcaQjSgCJ"}"

The opendrive one appears to be because of missing spaces at the start of files.

expected: []string{"hello_ sausage/êé/Hello, 世界/ _ ' @ _ _ & _ + ≠"}
actual  : []string{"hello_ sausage/êé/Hello, 世界/_ ' @ _ _ & _ + ≠"}
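The expected and actual paths above differ only in the leading space of the last segment. One way an encoding could protect such names is to replace a leading space with a visible stand-in before upload and reverse it when listing. The Python sketch below is an assumption for illustration: the helper names and the use of U+2420 SYMBOL FOR SPACE are not taken from this PR.

```python
# Hypothetical illustration only; helper names and the stand-in character
# are assumptions, not the encoding defined in this PR.
LEADING_SPACE = "\u2420"  # SYMBOL FOR SPACE


def encode_segment(segment):
    """Replace a single leading space with the stand-in character."""
    if segment.startswith(" "):
        return LEADING_SPACE + segment[1:]
    return segment


def decode_segment(segment):
    """Turn a leading stand-in character back into a space."""
    if segment.startswith(LEADING_SPACE):
        return " " + segment[1:]
    return segment


def encode_path(path):
    """Apply encode_segment to every '/'-separated segment."""
    return "/".join(encode_segment(s) for s in path.split("/"))


def decode_path(path):
    """Apply decode_segment to every '/'-separated segment."""
    return "/".join(decode_segment(s) for s in path.split("/"))


if __name__ == "__main__":
    path = "hello_ sausage/ _ ' @"
    print(encode_path(path))
    print(decode_path(encode_path(path)) == path)
```

A name that already starts with the stand-in character would decode incorrectly in this sketch, so a complete scheme would also need to escape it.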

@B4dM4n
Collaborator Author

B4dM4n commented May 19, 2019

Apologies for the delay in responding - I had a bit of a backlog!

Don't worry, I was still working on it in the meantime 😄

It also took me longer to do the next iteration of testing/fixing.

(BTW did you use a tool to generate the markdown tables of encodings?)

No, they are built by hand, with the help of vscode multi cursors.

I think one thing we should prepare is a document explaining what users might expect in the way of backwards incompatibility when moving from the old system to the new system. This will be dependent on OS and backend of course.

Definitely, this is the part I didn't write yet. I expect the backwards incompatibility to be minimal.
Only names that contain the Unicode replacement or the quote character should change.
Otherwise only invalid names will now just work.
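Why names containing a replacement or quote character have to change can be shown with a small Python sketch. It assumes a simplified scheme in which reserved characters map to fullwidth look-alikes and U+201B is used as the quote character; the rules and names here are illustrative and are not claimed to match lib/encoder exactly.

```python
# Hypothetical, simplified quoting scheme; not rclone's exact rules.
QUOTE = "\u201b"  # assumed quote character for this sketch
RESERVED = '<>:"\\|?*'

# Reserved ASCII character -> fullwidth replacement, and the reverse map.
TO_WIDE = {c: chr(ord(c) + 0xFEE0) for c in RESERVED}
FROM_WIDE = {w: c for c, w in TO_WIDE.items()}


def encode(name):
    """Replace reserved characters; quote anything that would be ambiguous."""
    out = []
    for ch in name:
        if ch in TO_WIDE:
            out.append(TO_WIDE[ch])
        elif ch in FROM_WIDE or ch == QUOTE:
            # Already looks like a replacement (or is the quote itself):
            # prefix it so decode() leaves it untouched.
            out.append(QUOTE + ch)
        else:
            out.append(ch)
    return "".join(out)


def decode(name):
    """Reverse encode(): quoted characters are literal, others map back."""
    out = []
    chars = iter(name)
    for ch in chars:
        if ch == QUOTE:
            out.append(next(chars, QUOTE))
        else:
            out.append(FROM_WIDE.get(ch, ch))
    return "".join(out)


if __name__ == "__main__":
    for name in ["a?b", "a\uff1fb", "x\u201by"]:
        print(repr(name), "->", repr(encode(name)))
```

In this sketch an ordinary name passes through unchanged, while a name that already contains a fullwidth replacement or the quote character gains an extra quote, which is the kind of rename described above.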

For some remotes the encoding doesn't match the restricted set of characters perfectly,
e.g. the OneDrive Business-only restrictions are also applied to regular accounts.

When the encoding is imperfect in this way, duplicate filenames might still occur as before,
leading to files being copied or removed unintentionally.

Thanks for getting rid of WinPath in fstest - that was a terrible idea in retrospect ;-)

It's not needed anymore, so it was time for it to say goodbye 😄

I'm just running the integration tests for the branch - I'll post a link when they are done.

The jottacloud one looks relevant
The opendrive one appears to be because of missing spaces at the start of files.

I thought that I caught these during my tests, but those slipped through.
Looking at the test setup again, I can see now why those got lost.
Using bash scripts to handle ASCII control characters and invalid Unicode bytes wasn't the best idea.

I rewrote the test result handling and updated the Google Spreadsheet with the new results. This rewrite also includes a noencode build tag, which replaces every encoder with a minimal version (only 0x00 / . and ..).
The results table is now completely generated from the info output using a Go script.

That change revealed a few other missed cases, which should be fixed now.

I also ran the info command with the encoders enabled and added the results to the spreadsheet. They revealed that the encoding is not working perfectly yet, which is the next problem I will be working on.

@ncw
Member

ncw commented May 20, 2019

Apologies for the delay in responding - I had a bit of a backlog!

Don't worry, I was still working on it in the meantime

:-)

It also took me longer to do the next iteration of testing/fixing.

(BTW did you use a tool to generate the markdown tables of encodings?)

No, they are built by hand, with the help of vscode multi cursors.

They are very nice :-)

I think one thing we should prepare is a document explaining what users might expect in the way of backwards incompatibility when moving from the old system to the new system. This will be dependent on OS and backend of course.

Definitely, this is the part I didn't write yet. I expect the backwards incompatibility to be minimal.
Only names that contain the Unicode replacement or the quote character should change.
Otherwise only invalid names will now just work.

I think that the invalid names now just working is probably the most important part.

I thought of another doc we should write: adding something to the "writing a new backend" section on how to test which character encoding the new backend needs and how to add that info to the backend.

For some remotes the encoding doesn't match the restricted set of characters perfectly,
e.g. the OneDrive Business-only restrictions are also applied to regular accounts.

When the encoding is imperfect in this way, duplicate filenames might still occur as before,
leading to files being copied or removed unintentionally.

Could we change the encoding dynamically? We know we are using a onedrive for business after we have instantiated the Fs object.

I'm just running the integration tests for the branch - I'll post a link when they are done.
The jottacloud one looks relevant
The opendrive one appears to be because of missing spaces at the start of files.

I thought that I caught these during my tests, but those slipped through.
Looking at the test setup again, I can see now why those got lost.
Using bash scripts to handle ASCII control characters and invalid Unicode bytes wasn't the best idea.

:-(

I rewrote the test result handling and updated the Google Spreadsheet with the new results. This rewrite also includes a noencode build tag, which replaces every encoder with a minimal version (only 0x00 / . and ..).
The results table is now completely generated from the info output using a Go script.

Great! I see the script in cmd/info/internal/build_csv/main.go.

That change revealed a few other missed cases, which should be fixed now.

I also ran the info command with the encoders enabled and added the results to the spreadsheet. They revealed that the encoding is not working perfectly yet, which is the next problem I will be working on.

I don't fully understand the 'sheet here. In the "With Encoder" tab there are now quite a lot of "REN" meaning Renamed - I would have thought that is good? Though they mostly seem to be coloured red.

@ghost

ghost commented Jun 13, 2019

No wonder I met a problem! I hope this pr will be merged.

@ncw
Member

ncw commented Jun 13, 2019

I'm planning the 1.48 release at the weekend then I'd like to get this merged shortly after. What do you think @B4dM4n ?

@ncw
Member

ncw commented Jul 25, 2019

I'm going to take over this Pull request.

I've rebased it on master and fixed the conflicts. I've also written an integration test for it and I'll run through the backends converting them :-)

Thanks for your hard work @B4dM4n - I'll finish it off.

@ncw
Member

ncw commented Aug 11, 2019

Here is a beta for anyone wanting to test

https://beta.rclone.org/branch/v1.48.0-185-gd4053c53-filename-encode-beta/ (uploaded in 15-30 mins)

@imfms

imfms commented Aug 13, 2019

@ncw I found some errors when uploading my files for testing. Here are the details:

context

subcommand = sync
backend = onedrive/crypt
filename_encryption = obfuscate
directory_name_encryption = true

PS: I tried syncing the files without crypt and got no error.

error files: sorry, I can't find a pattern in the names. I tried renaming them to something shorter,
but then the error disappears.

独播:窦唯乐手“血腥”情歌献日本女友[全高清版].mp4
G.E.M. 邓紫棋-死了都要 • 爱.flac
Like Nobody’s Around - Big Time Rush.lrc

version

rclone v1.48.0-185-gd4053c53-filename-encode-beta
- os/arch: linux/amd64
- go version: go1.12.7

OS

LSB Version:    core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86_64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0.fake-amd64:desktop-4.0.fake-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0.fake-amd64:graphics-4.0.fake-noarch
Distributor ID: openSUSE
Description:    openSUSE Tumbleweed
Release:        20190708
Codename:       n/a

config

[product]
type = onedrive
client_id = *
client_secret = *
token = *
drive_id = *
drive_type = business

[betatest2]
type = crypt
remote = product:betatest2
filename_encryption = obfuscate
directory_name_encryption = true
password = *
password2 = *

first sync log

DEBUG : rclone: Version "v1.48.0-185-gd4053c53-filename-encode-beta" starting with parameters ["rclone" "sync" "--create-empty-src-dirs" "/mnt/data/_temp/test" "betatest2://" "-P" "-vv"]
DEBUG : Using config file from "/home/f_ms/.config/rclone/rclone.conf"
INFO  : Encrypted drive 'betatest2://': Waiting for checks to finish
INFO  : Encrypted drive 'betatest2://': Waiting for transfers to finish
DEBUG : 15.L.J.R. 邟紷棗-殇互选覍  爽.kqfh: Starting multipart upload
DEBUG : 132.hEGA jKxKzU‛O WNKQJz - XEC pEIA nQOD.HNy: Starting multipart upload
DEBUG : 19.狼撽J窶唿习扛衐腵惕歜猾旵朼妃叛[典髨渕牘].vy9: Starting multipart upload
DEBUG : 19.狼撽J窶唿习扛衐腵惕歜猾旵朼妃叛[典髨渕牘].vy9: Uploading segment 0/59 size 59
DEBUG : 15.L.J.R. 邟紷棗-殇互选覍  爽.kqfh: Uploading segment 0/59 size 59
DEBUG : 132.hEGA jKxKzU‛O WNKQJz - XEC pEIA nQOD.HNy: Uploading segment 0/59 size 59
INFO  : G.E.M. 邓紫棋-死了都要 • 爱.flac: Copied (new)
INFO  : 独播:窦唯乐手“血腥”情歌献日本女友[全高清版].mp4: Copied (new)
INFO  : Like Nobody’s Around - Big Time Rush.lrc: Copied (new)
INFO  : Waiting for deletions to finish

Transferred:           177 / 177 Bytes, 100%, 16 Bytes/s, ETA 0s
Errors:                 0
Checks:                 0 / 0, -
Transferred:            3 / 3, 100%
Elapsed time:       10.4s
2019/08/13 20:21:22 INFO  : 
Transferred:           177 / 177 Bytes, 100%, 16 Bytes/s, ETA 0s
Errors:                 0
Checks:                 0 / 0, -
Transferred:            3 / 3, 100%
Elapsed time:       10.4s

DEBUG : 12 go routines active
DEBUG : rclone: Version "v1.48.0-185-gd4053c53-filename-encode-beta" finishing with parameters ["rclone" "sync" "--create-empty-src-dirs" "/mnt/data/_temp/test" "betatest2://" "-P" "-vv"]

second sync log

DEBUG : rclone: Version "v1.48.0-185-gd4053c53-filename-encode-beta" starting with parameters ["rclone" "sync" "--create-empty-src-dirs" "/mnt/data/_temp/test" "betatest2://" "-P" "-vv"]
DEBUG : Using config file from "/home/f_ms/.config/rclone/rclone.conf"
INFO  : Encrypted drive 'betatest2://': Waiting for checks to finish
INFO  : Encrypted drive 'betatest2://': Waiting for transfers to finish
DEBUG : 15.L.J.R. 邟紷棗-殇互选覍  爽.kqfh: Starting multipart upload
DEBUG : 132.hEGA jKxKzU‛O WNKQJz - XEC pEIA nQOD.HNy: Starting multipart upload
DEBUG : 19.狼撽J窶唿习扛衐腵惕歜猾旵朼妃叛[典髨渕牘].vy9: Starting multipart upload
DEBUG : 15.L.J.R. 邟紷棗-殇互选覍  爽.kqfh: Uploading segment 0/59 size 59
DEBUG : 132.hEGA jKxKzU‛O WNKQJz - XEC pEIA nQOD.HNy: Uploading segment 0/59 size 59
DEBUG : 19.狼撽J窶唿习扛衐腵惕歜猾旵朼妃叛[典髨渕牘].vy9: Uploading segment 0/59 size 59
INFO  : G.E.M. 邓紫棋-死了都要 • 爱.flac: Copied (new)
INFO  : Like Nobody’s Around - Big Time Rush.lrc: Copied (new)
INFO  : 独播:窦唯乐手“血腥”情歌献日本女友[全高清版].mp4: Copied (new)
INFO  : Waiting for deletions to finish
INFO  : G.E.M. 邓紫棋-死了都要  爱.flac: Deleted
INFO  : Like Nobody’’s Around - Big Time Rush.lrc: Deleted
INFO  : 独播:窦唯乐手血腥情歌献日本女友[全高清版].mp4: Deleted

Transferred:           177 / 177 Bytes, 100%, 45 Bytes/s, ETA 0s
Errors:                 0
Checks:                 3 / 3, 100%
Transferred:            3 / 3, 100%
Elapsed time:        3.9s
2019/08/13 20:22:43 INFO  : 
Transferred:           177 / 177 Bytes, 100%, 45 Bytes/s, ETA 0s
Errors:                 0
Checks:                 3 / 3, 100%
Transferred:            3 / 3, 100%
Elapsed time:        3.9s

DEBUG : 18 go routines active
DEBUG : rclone: Version "v1.48.0-185-gd4053c53-filename-encode-beta" finishing with parameters ["rclone" "sync" "--create-empty-src-dirs" "/mnt/data/_temp/test" "betatest2://" "-P" "-vv"]

@ncw
Member

ncw commented Aug 13, 2019

I found some errors when uploading my files for testing. Here are the details:

I set up my system just like yours with crypt, onedrive for business and I couldn't replicate the problem :-(

After you do the upload can you use rclone lsf betatest2: to see that the files have arrived properly?

Also you should be using betatest: not betatest:// - I don't think this will fix it though.

@imfms

imfms commented Aug 14, 2019

@ncw thanks for your reply. This time I have tried to add more detail to my report.


Also you should be using betatest: not betatest:// - I don't think this will fix it though.

It didn't help.


I tried more configurations and found more information. Maybe it's a problem with obfuscate?

  • Changing only filename_encryption = standard (nothing else) gives no error.
  • Changing only the password (nothing else) gives no error, so this time I will attach my crypt passwords.

It's really strange, isn't it?


After you do the upload can you use rclone lsf betatest2: to see that the files have arrived properly?

Here is a new log with lsf and the crypt passwords:

error files

独播:窦唯乐手“血腥”情歌献日本女友[全高清版].mp4
G.E.M. 邓紫棋-死了都要 • 爱.flac
Like Nobody’s Around - Big Time Rush.lrc

config

[product]
type = onedrive
client_id = *
client_secret = *
token = *
drive_id = *
drive_type = business

[betatest2]
type = crypt
remote = product:betatest2
filename_encryption = obfuscate
directory_name_encryption = true
password = vv05-agfqdci02spfakubgqiwavl5c9k2ab9rfxtxdxg_eblh85-qfo-dlya4hyww3zq3unf7junp7mzoq5nqniytt2aiii90z7ydmbim_8ajf6bxbhmo_tof62gn7ojbngyriyku6lmvumpy1ijm-oyxptrldwilu8o_iy9dhp05uybz4qge-m0gp37ig0ui2brla-ygbnwu48h1ln7z6ntxj3qgoq0jpavao74y9h_cyjhm8pwv-ymdgemqwu_sx-jjbg5abo6flcde_jpm4rgp_31c4mkotob3tpn26duiaufleic0x1pjtpiym5n1xytyywgtnvq-pgcz0bnmel5n57qcr55aflspqp0
password2 = mnvwm9piwmzierppo9fnhofasqb0f16yhjaet3rwjwtijdrmhqwds82ybfintnys_krjaorcujg1lyt70ahlhtaylhmy6efzhmo_kuhhvns7jaxbo65yojwft8rntfgjh9elyaxk_mmh4dukiefup8jk2r40__jxdvsvtezem_6q9eruc_52d42f0y8435gj472vhwewpccnucowjvsuqxxk8lmazsnpqthdgkh1lssn5azggphjvfpqk8iiyczyftknh_mwsdtcf8_wpiqrnaaprdva7yfwgnzh0w4dimyupdnpreblq7yyppvdvcamccap25gm5-alxq0d0hiyuvfl4eeuuto3k5tcieip

first sync log

DEBUG : rclone: Version "v1.48.0-185-gd4053c53-filename-encode-beta" starting with parameters ["rclone" "sync" "--create-empty-src-dirs" "/mnt/data/_temp/test" "betatest2:" "-P" "-vv"]
DEBUG : Using config file from "/home/f_ms/.config/rclone/rclone.conf"
DEBUG : One drive root 'betatest2': Token expired but no uploads in progress - doing nothing
DEBUG : product: Loaded invalid token from config file - ignoring
DEBUG : Keeping previous permissions for config file: -rw-r--r--
DEBUG : product: Saved new token in config file
INFO  : Encrypted drive 'betatest2:': Waiting for checks to finish
INFO  : Encrypted drive 'betatest2:': Waiting for transfers to finish
DEBUG : 132.hEGA jKxKzU‛O WNKQJz - XEC pEIA nQOD.HNy: Starting multipart upload
DEBUG : 15.L.J.R. 邟紷棗-殇互选覍  爽.kqfh: Starting multipart upload
DEBUG : 19.狼撽J窶唿习扛衐腵惕歜猾旵朼妃叛[典髨渕牘].vy9: Starting multipart upload
DEBUG : 19.狼撽J窶唿习扛衐腵惕歜猾旵朼妃叛[典髨渕牘].vy9: Uploading segment 0/59 size 59
DEBUG : 15.L.J.R. 邟紷棗-殇互选覍  爽.kqfh: Uploading segment 0/59 size 59
DEBUG : 132.hEGA jKxKzU‛O WNKQJz - XEC pEIA nQOD.HNy: Uploading segment 0/59 size 59
INFO  : G.E.M. 邓紫棋-死了都要 • 爱.flac: Copied (new)
INFO  : 独播:窦唯乐手“血腥”情歌献日本女友[全高清版].mp4: Copied (new)
INFO  : Like Nobody’s Around - Big Time Rush.lrc: Copied (new)
INFO  : Waiting for deletions to finish
INFO  : 
Transferred:           177 / 177 Bytes, 100%, 26 Bytes/s, ETA 0s
Errors:                 0
Checks:                 0 / 0, -
Transferred:            3 / 3, 100%
Elapsed time:        6.5s

DEBUG : 15 go routines active
DEBUG : rclone: Version "v1.48.0-185-gd4053c53-filename-encode-beta" finishing with parameters ["rclone" "sync" "--create-empty-src-dirs" "/mnt/data/_temp/test" "betatest2:" "-P" "-vv"]

lsf log (the names are wrong)

DEBUG : rclone: Version "v1.48.0-185-gd4053c53-filename-encode-beta" starting with parameters ["rclone" "lsf" "betatest2:" "-vv"]
DEBUG : Using config file from "/home/f_ms/.config/rclone/rclone.conf"

G.E.M. 邓紫棋-死了都要  爱.flac
Like Nobody’’s Around - Big Time Rush.lrc
独播:窦唯乐手血腥情歌献日本女友[全高清版].mp4

DEBUG : 7 go routines active
DEBUG : rclone: Version "v1.48.0-185-gd4053c53-filename-encode-beta" finishing with parameters ["rclone" "lsf" "betatest2:" "-vv"]

second sync log (in the lines starting with ###, you can see the files were deleted)

DEBUG : rclone: Version "v1.48.0-185-gd4053c53-filename-encode-beta" starting with parameters ["rclone" "sync" "--create-empty-src-dirs" "/mnt/data/_temp/test" "betatest2:" "-P" "-vv"]
DEBUG : Using config file from "/home/f_ms/.config/rclone/rclone.conf"
INFO  : Encrypted drive 'betatest2:': Waiting for checks to finish
INFO  : Encrypted drive 'betatest2:': Waiting for transfers to finish
DEBUG : 15.L.J.R. 邟紷棗-殇互选覍  爽.kqfh: Starting multipart upload
DEBUG : 19.狼撽J窶唿习扛衐腵惕歜猾旵朼妃叛[典髨渕牘].vy9: Starting multipart upload
DEBUG : 132.hEGA jKxKzU‛O WNKQJz - XEC pEIA nQOD.HNy: Starting multipart upload
DEBUG : 132.hEGA jKxKzU‛O WNKQJz - XEC pEIA nQOD.HNy: Uploading segment 0/59 size 59
DEBUG : 19.狼撽J窶唿习扛衐腵惕歜猾旵朼妃叛[典髨渕牘].vy9: Uploading segment 0/59 size 59
DEBUG : 15.L.J.R. 邟紷棗-殇互选覍  爽.kqfh: Uploading segment 0/59 size 59
INFO  : Like Nobody’s Around - Big Time Rush.lrc: Copied (new)
INFO  : 独播:窦唯乐手“血腥”情歌献日本女友[全高清版].mp4: Copied (new)
INFO  : G.E.M. 邓紫棋-死了都要 • 爱.flac: Copied (new)
INFO  : Waiting for deletions to finish
### INFO  : Like Nobody’’s Around - Big Time Rush.lrc: Deleted
### INFO  : 独播:窦唯乐手血腥情歌献日本女友[全高清版].mp4: Deleted
### INFO  : G.E.M. 邓紫棋-死了都要  爱.flac: Deleted
INFO  : 
Transferred:           177 / 177 Bytes, 100%, 48 Bytes/s, ETA 0s
Errors:                 0
Checks:                 3 / 3, 100%
Transferred:            3 / 3, 100%
Elapsed time:        3.6s

DEBUG : 18 go routines active
DEBUG : rclone: Version "v1.48.0-185-gd4053c53-filename-encode-beta" finishing with parameters ["rclone" "sync" "--create-empty-src-dirs" "/mnt/data/_temp/test" "betatest2:" "-P" "-vv"]

lsf log (no results, the files were deleted)

DEBUG : rclone: Version "v1.48.0-185-gd4053c53-filename-encode-beta" starting with parameters ["rclone" "lsf" "betatest2:" "-vv"]
DEBUG : Using config file from "/home/f_ms/.config/rclone/rclone.conf"
DEBUG : 7 go routines active
DEBUG : rclone: Version "v1.48.0-185-gd4053c53-filename-encode-beta" finishing with parameters ["rclone" "lsf" "betatest2:" "-vv"]

@ncw
Member

ncw commented Aug 14, 2019

Ah, I see the problem there... I will work on a fix @imfms

@imfms

imfms commented Aug 14, 2019

@ncw

Ah, I see the problem there... I will work on a fix

Great. The problem is so well hidden that I thought it was my own mistake.

@ncw
Member

ncw commented Aug 14, 2019

What appears to have happened is that your files got uploaded with a different name, hence rclone didn't think they existed and re-uploaded them.
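The copy-then-delete pattern follows from how a one-way sync compares names. As a rough model (a simplified sketch with a hypothetical plan_sync helper, not rclone's actual sync code), anything in the source that is missing from the destination listing is copied, and anything in the destination that is missing from the source is deleted:

```python
# Simplified model of a one-way sync by name; not rclone's implementation.
def plan_sync(source_names, dest_names):
    """Return (to_copy, to_delete) for a one-way sync keyed on file names."""
    source, dest = set(source_names), set(dest_names)
    to_copy = sorted(source - dest)    # present locally, not found remotely
    to_delete = sorted(dest - source)  # present remotely, not found locally
    return to_copy, to_delete


if __name__ == "__main__":
    local = ["Like Nobody\u2019s Around - Big Time Rush.lrc"]
    # The remote listing returns the name with one character doubled.
    remote = ["Like Nobody\u2019\u2019s Around - Big Time Rush.lrc"]
    to_copy, to_delete = plan_sync(local, remote)
    print("copy:  ", to_copy)
    print("delete:", to_delete)
```

Because the altered name never matches the local one, every run copies the file again and deletes the previous copy.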

> INFO  : Like Nobody’s Around - Big Time Rush.lrc: Copied (new)
> INFO  : 独播:窦唯乐手“血腥”情歌献日本女友[全高清版].mp4: Copied (new)
> INFO  : G.E.M. 邓紫棋-死了都要 • 爱.flac: Copied (new)
> INFO  : Waiting for deletions to finish
> ### INFO  : Like Nobody’’s Around - Big Time Rush.lrc: Deleted
> ### INFO  : 独播:窦唯乐手血腥情歌献日本女友[全高清版].mp4: Deleted
> ### INFO  : G.E.M. 邓紫棋-死了都要  爱.flac: Deleted

These are not the same

>>> a="Like Nobody’s Around - Big Time Rush.lrc"
>>> b="Like Nobody’’s Around - Big Time Rush.lrc"
>>> a == b
False
>>> a="独播:窦唯乐手“血腥”情歌献日本女友[全高清版].mp4"
>>> b="独播:窦唯乐手血腥情歌献日本女友[全高清版].mp4"
>>> a == b
False
>>> a="G.E.M. 邓紫棋-死了都要 • 爱.flac"
>>> b="G.E.M. 邓紫棋-死了都要  爱.flac"
>>> a == b
False
>>> 

What appears to have happened is that some unicode characters have changed.

So ’ became ’’, “ and ” were deleted, and • was deleted.
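The exact characters that differ between each pair can also be listed programmatically. This is a minimal sketch using Python's standard difflib and unicodedata modules; the helper names diff_chars and describe are made up for this example.

```python
import difflib
import unicodedata


def diff_chars(expected, actual):
    """Return (op, expected_part, actual_part) for every differing span."""
    matcher = difflib.SequenceMatcher(a=expected, b=actual, autojunk=False)
    return [
        (op, expected[i1:i2], actual[j1:j2])
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]


def describe(text):
    """Render each character as its code point and Unicode name."""
    return ", ".join(
        f"U+{ord(c):04X} {unicodedata.name(c, '?')}" for c in text
    )


if __name__ == "__main__":
    pairs = [
        ("Like Nobody’s Around - Big Time Rush.lrc",
         "Like Nobody’’s Around - Big Time Rush.lrc"),
        ("独播:窦唯乐手“血腥”情歌献日本女友[全高清版].mp4",
         "独播:窦唯乐手血腥情歌献日本女友[全高清版].mp4"),
        ("G.E.M. 邓紫棋-死了都要 • 爱.flac",
         "G.E.M. 邓紫棋-死了都要  爱.flac"),
    ]
    for expected, actual in pairs:
        for op, removed, added in diff_chars(expected, actual):
            print(op, "|", describe(removed), "->", describe(added))
```

Printing code points rather than the glyphs themselves makes it easier to spot look-alike characters in a terminal or log.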

I can't replicate this on my onedrive for business, your test works fine for me :-(

$ rclone lsf -R betatest2:test4
G.E.M. 邓紫棋-死了都要 • 爱.flac
Like Nobody’s Around - Big Time Rush.lrc
独播:窦唯乐手“血腥”情歌献日本女友[全高清版].mp4

$ rclone-v1.48 lsf -R betatest2:test4
G.E.M. 邓紫棋-死了都要 • 爱.flac
Like Nobody’s Around - Big Time Rush.lrc
独播:窦唯乐手“血腥”情歌献日本女友[全高清版].mp4

Could this be a setting on your onedrive? Can you set it to UTF-8 encoding?

@imfms

imfms commented Aug 14, 2019

@ncw

I can't replicate this on my onedrive for business, your test works fine for me :-(

I will do more tests on different accounts and environments to try to find the pattern behind the error :)

ncw and others added 25 commits September 30, 2019 22:00
…ile name

This tests the encoder is working properly
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
This works around a bug in Ceph which doesn't encode CommonPrefixes
when using URL encoded directory listings.

See: https://tracker.ceph.com/issues/41870
@ncw ncw merged commit b9bd15a into master Sep 30, 2019
@ncw ncw deleted the filename-encode branch September 30, 2019 21:28
@ncw ncw added this to the v1.50 milestone Oct 12, 2019