WIP: handle special filenames on various backends #3148
What is the purpose of this change?
This is the follow-up to #2955. It uses the
It defines the set of all available backend encodings in
As a quick test the change also includes a new flag for
Backend encoding added as determined by the last test, but not implemented:
I'm currently not sure if the documentation is detailed enough to explain the mechanics. Any help to improve it is welcome.
This PR is not meant to be merged as one, but to serve as a central issue to track the progress.
Was the change discussed in an issue or in the forum before?
This was referenced
May 2, 2019
Apologies for the delay in responding - I had a bit of a backlog!
This is amazing work - thank you :-)
That is brilliant.
I'm very happy to help with the documentation though looking at it I think you've done a good job.
(BTW did you use a tool to generate the markdown tables of encodings?)
I think one thing we should prepare is a document explaining what users might expect in the way of backwards incompatibility when moving from the old system to the new system. This will be dependent on OS and backend of course.
I read through the code. A lot of it is very straightforward. I think the biggest risk for backwards incompatibilities is probably in the local backend.
Thanks for getting rid of
I'm just running the integration tests for the branch - I'll post a link when they are done.
Here are the results of the integration test: https://pub.rclone.org/integration-tests/2019-05-13-170215/
You can ignore Mega/DirRename - that was failing already.
The jottacloud one looks relevant
The opendrive one appears to be because of missing spaces at the start of files.
Don't worry, I was still working on it in the meantime
It also took me longer to do the next iteration of testing/fixing.
No, they were built by hand, with the help of VS Code multi-cursors.
Definitely, this is the part I didn't write yet. I expect the backwards incompatibility to be minimal.
For some remotes the encoding isn't matching the restricted set of characters perfectly,
When there is this kind of imperfection for the encoding, duplicate filenames might still occur like before,
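The duplicate-name risk mentioned above can be illustrated with a minimal sketch. It is not the encoder from this PR: the restricted set, the names `encodeName` and `decodeName`, and the choice of mapping each restricted ASCII character to its Unicode fullwidth form are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// restricted is a hypothetical set of characters that a backend rejects.
const restricted = `\:*?"<>|`

// fullwidthOffset is the distance from an ASCII character (U+0021..U+007E)
// to its fullwidth form (U+FF01..U+FF5E).
const fullwidthOffset = 0xFEE0

// encodeName replaces each restricted character with its fullwidth form.
func encodeName(name string) string {
	return strings.Map(func(r rune) rune {
		if strings.ContainsRune(restricted, r) {
			return r + fullwidthOffset
		}
		return r
	}, name)
}

// decodeName maps fullwidth forms of restricted characters back to ASCII.
func decodeName(name string) string {
	return strings.Map(func(r rune) rune {
		if ascii := r - fullwidthOffset; ascii > 0 && strings.ContainsRune(restricted, ascii) {
			return ascii
		}
		return r
	}, name)
}

func main() {
	a := encodeName("what?.txt")  // contains the restricted ASCII '?'
	b := encodeName("what？.txt") // already contains the fullwidth '？'
	fmt.Println(a == b)          // true: two local names collide on the remote
	fmt.Println(decodeName(a))   // what?.txt
}
```

With a naive mapping like this, a name that already contains the replacement character cannot be told apart from an encoded one, so some form of escaping is needed to make the round trip lossless.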
It's not needed anymore, so it was time for it to say goodbye
I thought I had caught these during my tests, but they slipped through.
I rewrote the test result handling and updated the Google Spreadsheet with the new results. This rewrite also includes a
That change revealed a few other missed cases, which should be fixed now.
I also ran the
They are very nice :-)
I think that invalid names now just working is probably the most important part.
I thought of another doc we should write: adding something to the "writing a new backend" section on how to test which character encoding a new backend uses and how to add that info to the backend.
Could we change the encoding dynamically? We know we are using OneDrive for Business once we have instantiated the Fs object.
Great! I see the script in
I don't fully understand the spreadsheet here. In the "With Encoder" tab there are now quite a lot of "REN" entries, meaning Renamed. I would have thought that is good, though they mostly seem to be coloured red.
This was referenced
Jul 29, 2019
Here is a beta for anyone wanting to test
https://beta.rclone.org/branch/v1.48.0-185-gd4053c53-filename-encode-beta/ (uploaded in 15-30 mins)
@ncw I ran into the same kind of error when uploading my test files. Here are the details:
PS: I tried syncing the files without 'crypt' and got no errors.
Error files: sorry, I can't find a pattern in the names. I tried renaming them to something shorter, but it will cause
first sync log
second sync log
I set up my system just like yours with crypt, onedrive for business and I couldn't replicate the problem :-(
After you do the upload can you use
Also you should be using
@ncw thanks for your reply. This time I have tried to add more detail to my question.
It didn't help.
I tried in more contexts and found more information. Maybe it's
It's really strange, isn't it?
This is the new log with
first sync log
lsf log (found name error)
second sync log (from the log with start of
lsf log (no result, file was deleted)
What appears to have happened is that your files got uploaded with a different name, hence rclone didn't think they existed and re-uploaded them.
These are not the same
What appears to have happened is that some unicode characters have changed.
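Two names that look identical on screen can still differ in their underlying code points. A minimal sketch for spotting such a difference is shown below; the helper `firstDiff` and the sample names are hypothetical, and the precomposed versus decomposed "é" is only one example of how characters can change, not necessarily what happened in this report.

```go
package main

import "fmt"

// firstDiff returns the rune index of the first position where a and b
// differ, or -1 if they are identical rune for rune.
func firstDiff(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	n := len(ra)
	if len(rb) < n {
		n = len(rb)
	}
	for i := 0; i < n; i++ {
		if ra[i] != rb[i] {
			return i
		}
	}
	if len(ra) != len(rb) {
		return n // one name is a prefix of the other
	}
	return -1
}

func main() {
	local := "caf\u00e9.txt"   // hypothetical local name, precomposed é
	remote := "cafe\u0301.txt" // hypothetical listed name, e + combining acute
	// %+q prints non-ASCII runes as escapes, which makes the difference visible.
	fmt.Printf("local:  %+q\nremote: %+q\n", local, remote)
	fmt.Println("first differing rune index:", firstDiff(local, remote))
}
```

Printing both names with escapes, rather than comparing them by eye, makes it easier to report exactly which characters differ between the local listing and the remote listing.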
I can't replicate this on my onedrive for business, your test works fine for me :-(
Could this be a setting on your onedrive? Can you set it to UTF-8 encoding?