WIP: handle special filenames on various backends #3148
Apologies for the delay in responding - I had a bit of a backlog! This is amazing work - thank you :-)
That is brilliant.
I'm very happy to help with the documentation, though looking at it I think you've done a good job. (BTW did you use a tool to generate the markdown tables of encodings?)
I think one thing we should prepare is a document explaining what users might expect in the way of backwards incompatibility when moving from the old system to the new system. This will be dependent on OS and backend of course.
I read through the code. A lot of it is very straightforward. I think the biggest risk for backwards incompatibilities is probably in the local backend.
Thanks for getting rid of
I'm just running the integration tests for the branch - I'll post a link when they are done.
Here are the results of the integration test: https://pub.rclone.org/integration-tests/2019-05-13-170215/ You can ignore Mega/DirRename - that was failing already. The jottacloud one looks relevant
The opendrive one appears to be because of missing spaces at the start of files.
Don't worry, I was still working on it in the meantime 😄 It also took me longer to do the next iteration of testing/fixing.
No, they are built by hand, with the help of vscode multi cursors.
Definitely, this is the part I didn't write yet. I expect the backwards incompatibility to be minimal. For some remotes the encoding isn't matching the restricted set of characters perfectly, When there is this kind of imperfection for the encoding, duplicate filenames might still occur like before,
It's not needed anymore, so it was time for it to say goodbye 😄
I thought that I caught these during my tests, but those slipped through. I rewrote the test result handling and updated the Google Spreadsheet with the new results. This rewrite also includes a That change revealed a few other missed cases, which should be fixed now. I also ran the |
:-)
They are very nice ::-)
I think that the invalid names now just working is probably the most important part. I thought of another doc we should write - adding something to the writing a new backend section on testing what character encoding the new backend uses and how to add that info to the backend.
Could we change the encoding dynamically? We know we are using a onedrive for business after we have instantiated the Fs object.
:-(
Great! I see the script in
I don't fully understand the 'sheet here. In the "With Encoder" tag there are now quite a lot of "REN" meaning Renamed - I would have thought that is good? Though they mostly seem to be coloured Red. |
No wonder I met a problem! I hope this pr will be merged. |
I'm planning the 1.48 release at the weekend then I'd like to get this merged shortly after. What do you think @B4dM4n ? |
I'm going to take over this pull request. I've rebased it on master and fixed the conflicts. I've also written an integration test for it and I'll run through the backends converting them :-) Thanks for your hard work @B4dM4n - I'll finish it off.
257155e to 69037c7
Here is a beta for anyone wanting to test https://beta.rclone.org/branch/v1.48.0-185-gd4053c53-filename-encode-beta/ (uploaded in 15-30 mins) |
@ncw I hit the same kind of error when uploading my files as a test. Here are the details: context
PS: I tried syncing the files without 'crypt' and got no error. error files: sorry, I can't find a pattern in the names. I tried renaming them to something shorter, but it will cause
version
OS
config
first sync log
second sync log
I set up my system just like yours with crypt, onedrive for business and I couldn't replicate the problem :-( After you do the upload can you use Also you should be using |
@ncw thanks for your reply, this time I tried to supplement my questions.
It didn't help. I tried more contexts and found more information. Maybe it's
It's really strange, isn't it?
this is new log with error files
config
first sync log
lsf log(found name error)
second sync log (from the log with start of
lsf log(no result, file was deleted)
Ah, I see the problem there... I will work on a fix @imfms |
Great. the problem is so hidden. I thought it was my own problem. |
What appears to have happened is that your files got uploaded with a different name, hence rclone didn't think they existed and re-uploaded them.
These are not the same
What appears to have happened is that some unicode characters have changed. so I can't replicate this on my onedrive for business, your test works fine for me :-(
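One way to check whether two names that look identical really differ is to compare them rune by rune and print the code points. The sketch below is only an illustration: the two filenames are made up (they are not taken from the logs in this thread) and `firstDifference` is a hypothetical helper, not part of rclone.

```go
package main

import "fmt"

// firstDifference returns the rune index at which a and b diverge,
// or -1 when they are identical rune for rune.
// Hypothetical helper for illustration; not part of rclone.
func firstDifference(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	for i := 0; i < len(ra) && i < len(rb); i++ {
		if ra[i] != rb[i] {
			return i
		}
	}
	if len(ra) != len(rb) {
		// One name is a prefix of the other: they diverge where the shorter ends.
		if len(ra) < len(rb) {
			return len(ra)
		}
		return len(rb)
	}
	return -1
}

func main() {
	// Made-up names: the second uses FULLWIDTH COLON (U+FF1A) instead of ':'.
	local := "notes: draft.txt"
	remote := "notes\uFF1A draft.txt"

	i := firstDifference(local, remote)
	if i < 0 {
		fmt.Println("names are identical")
		return
	}
	fmt.Printf("names differ at rune %d: %U vs %U\n",
		i, []rune(local)[i], []rune(remote)[i])
}
```

Running something like this against the name rclone uploaded and the name the remote lists back should show which character was swapped.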
Could this be a setting on your onedrive? Can you set it to UTF-8 encoding? |
I will do more tests on different accounts and environments, and try to find the pattern behind the error :)
9ba25d4 to 9feceae
…ile name This tests the encoder is working properly
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
This works around a bug in Ceph which doesn't encode CommonPrefixes when using URL encoded directory listings. See: https://tracker.ceph.com/issues/41870
9feceae to b9bd15a
What is the purpose of this change?
This is the follow-up to #2955. It uses the lib/encoder package to translate filenames on backends with a restricted set of usable characters.
It defines the set of all available backend encodings in fs/encodings/encodings.go and updates the existing backends to use them.
As a quick test, the change also includes a new flag for rclone mount, --vfs-name-encoding, which can be used to change the mounted filenames. For example, setting --vfs-name-encoding=local-windows on a Linux platform will encode all special Windows filename characters and allow the mount to be shared via SMB to Windows clients with all filenames readable.
Backends completed:
rclone info
Backend encoding added as determined by the last test, but not implemented:
I'm currently not sure if the documentation is detailed enough to explain the mechanics. Any help to improve it is welcome.
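To give a rough idea of the mechanics, here is a minimal sketch of the general technique: restricted characters are swapped for Unicode look-alikes on the way to the backend and swapped back on the way out, with a quote character protecting names that already contain a look-alike. Everything in it is an assumption made for illustration: the character table, the choice of U+201B as quote character, and the `encodeName`/`decodeName` helpers are hypothetical and are not the lib/encoder API.

```go
package main

import (
	"fmt"
	"strings"
)

// quote marks a following rune as literal. Assumed for this sketch.
const quote = '\u201B'

// toSafe maps characters that a hypothetical backend rejects to
// full-width look-alikes. The table is illustrative only.
var toSafe = map[rune]rune{
	':': '\uFF1A',
	'?': '\uFF1F',
	'*': '\uFF0A',
	'|': '\uFF5C',
}

// fromSafe is the reverse mapping, built once from toSafe.
var fromSafe = func() map[rune]rune {
	m := make(map[rune]rune, len(toSafe))
	for k, v := range toSafe {
		m[v] = k
	}
	return m
}()

// encodeName replaces restricted runes and quotes any rune that
// would otherwise be mistaken for a replacement when decoding.
func encodeName(name string) string {
	var b strings.Builder
	for _, r := range name {
		if safe, ok := toSafe[r]; ok {
			b.WriteRune(safe)
			continue
		}
		if _, clash := fromSafe[r]; clash || r == quote {
			b.WriteRune(quote)
		}
		b.WriteRune(r)
	}
	return b.String()
}

// decodeName reverses encodeName.
func decodeName(name string) string {
	runes := []rune(name)
	var b strings.Builder
	for i := 0; i < len(runes); i++ {
		r := runes[i]
		if r == quote && i+1 < len(runes) {
			next := runes[i+1]
			if _, clash := fromSafe[next]; clash || next == quote {
				b.WriteRune(next) // quoted rune is kept literally
				i++
				continue
			}
		}
		if orig, ok := fromSafe[r]; ok {
			b.WriteRune(orig)
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	for _, name := range []string{"report: final?.txt", "already\uFF1Awide"} {
		enc := encodeName(name)
		fmt.Printf("%q -> %q -> %q\n", name, enc, decodeName(enc))
	}
}
```

The quoting step is what keeps the mapping reversible: without it, a name that already contains a full-width colon would come back with an ASCII colon after a round trip.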
This PR is not meant to be merged as one, but to serve as a central issue to track the progress.
Otherwise it should be ready to be tested.
It is available as a beta under https://beta.rclone.org/branch/ with a name like v1.47.0-???-g????????-filename-encode-beta.
Was the change discussed in an issue or in the forum before?
Fixes:
Look-alike character: causes OneDrive remote files to be deleted on sync #3128