Explanation of Clashes #287

Closed
ikwyl6 opened this issue Jul 18, 2015 · 10 comments

Comments

@ikwyl6

ikwyl6 commented Jul 18, 2015

Using drive 0.2.5 and trying to do a diff or push, I get some errors like the following:

drive diff PEng/
clashes detected. use `ignore-name-clashes` to override this behavior
drive diff -ignore-name-clashes PEng/
Cannot access download link for 'NPPE_Application_2015.zsheet'
/Documents/PEng/NPPE_Application_2015.xlsx only on remote
drive diff|more
clashes detected. use `ignore-name-clashes` to override this behavior

I'm trying to find out what an actual 'clash' is and what -ignore-name-clashes does besides adding an unnecessary flag. This must have been included in a revision within the last month or so. Searching the readme, I find:

https://github.com/odeke-em/drive#pushing

But if these errors are thrown for name clashes (same file name but different size in the same directory), why won't drive diff show me this without -ignore-name-clashes? I find it redundant to always include this flag when you have the same filenames in the same directory. Shouldn't it (and I thought it did before) report that there are same filenames in the same directory in the printed report of a push or pull, before it asks the user to confirm to proceed?

I would expect the drive diff to show me these name clashes and for me not to ignore them just to properly execute the diff command.

Unless I'm missing something.....

@odeke-em
Owner

Hello @ikwyl6, thank you for reporting this.

> I'm trying to find out what an actual 'clash' is

A clash is a situation where two or more files/folders within the same folder share the same name. Google Drive handles this fine, since remote files are uniquely identified by fileId; your local file system ordinarily cannot, and will end up overwriting anything with the same name.
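
To make the definition concrete, here is a minimal sketch of how clashes could be detected in a remote listing. It is not drive's actual code: the tuple layout, the ids, the file names, and the function name `find_name_clashes` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical remote listing: (file_id, parent_folder, name). All values are made up.
remote_entries = [
    ("id-001", "docs", "report.xlsx"),
    ("id-002", "docs", "report.xlsx"),
    ("id-003", "docs", "notes.txt"),
]


def find_name_clashes(entries):
    """Return {(parent, name): [file_id, ...]} for names used more than once in a folder."""
    by_path = defaultdict(list)
    for file_id, parent, name in entries:
        by_path[(parent, name)].append(file_id)
    return {path: ids for path, ids in by_path.items() if len(ids) > 1}


clashes = find_name_clashes(remote_entries)

# A path-keyed local view can hold only one entry per name,
# so a later item silently replaces an earlier one with the same name.
local_view = {}
for file_id, parent, name in remote_entries:
    local_view[(parent, name)] = file_id
```

Remotely all three ids coexist, but the path-keyed `local_view` ends up with only two entries, which is the overwrite scenario described above.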

> and what -ignore-name-clashes does besides adding an unnecessary flag. This must have been included in a revision within the last month or so. Searching the readme, I find:

The -ignore-name-clashes flag mitigates this situation by greedily selecting the first clashing item and then trying to match it up locally. The README clearly states that drive cannot handle a situation where file names clash, and that statement has been there since day one of this project.
[Image: screen shot 2015-07-18 at 11 40 10 am]
Despite that declaration, users still expect drive to work when names clash. See #107, #159, #160, #58, #43 and others for examples of interactions with users.

Under the hood, diff uses the change resolution functionality, and that's why you'll need that flag.

> shouldn't it (and I thought it did before) report that there are same filenames in the same directory on the printed report of a push or pull before it asks the user to confirm to proceed?

It does so, but what happens when you try to diff inside a folder that you've never pulled?

@ikwyl6
Author

ikwyl6 commented Jul 19, 2015

Can drive report these name clashes only, list them, and then exit, as a separate command? If there are clashes, can drive still continue the normal 'push' or 'pull' operations, warn the user of clashes, and list them?

FYI When I was reading the README, I was only searching for 'clashes' and 'ignore-name-clashes'.

I feel like a normal 'drive push' shouldn't fail because of clashes. It should warn the user of clashes, do the push with the files that are not on remote, and then report the clashes so the user can deal with those separately, rather than exiting from the command altogether and doing no push/pull operations.
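
The behaviour proposed here can be sketched roughly as follows. This only illustrates the request, not how drive works: `plan_push` is a hypothetical helper, and it compares plain file names within a single folder.

```python
from collections import Counter


def plan_push(local_names, remote_names):
    """Split a push into names that are safe to upload and names that clash remotely.

    local_names: file names present in the local folder.
    remote_names: file names in the matching remote folder (duplicates allowed).
    """
    counts = Counter(remote_names)
    # Names that appear more than once remotely are reported, not acted on.
    clashing = sorted(name for name, count in counts.items() if count > 1)
    # Only files that do not exist remotely at all are pushed.
    to_push = sorted(name for name in local_names if name not in counts)
    return to_push, clashing


to_push, clashing = plan_push(
    ["a.txt", "b.txt", "new.txt"],
    ["a.txt", "a.txt", "b.txt"],
)
```

In this example `new.txt` would be pushed, while `a.txt` would be listed as a clash for the user to resolve separately.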

@odeke-em
Owner

I disagree: despite warning users about this as shown previously, users will still complain that pushes are failing or never completing. I've had some rather unhappy users when drive just did what it felt it should do instead of mitigating data loss scenarios; please see #57.

@ikwyl6
Author

ikwyl6 commented Jul 19, 2015

Ok I understand but issue #57 only dealt with 1 user and that issue was for user collaboration, which I didn't think drive was concentrating on.

In my opinion, most users will have files on Google Drive that will have same file names. If this is the case, each typical push/pull will need to be run twice because drive will exit with a warning stating to use -ignore-name-clashes. Shouldn't drive report what clashes exist on the first run? Is there a separate command showing how to list clashes? Shouldn't this be by default 'drive diff'?

I'm sorry I'm harping on this, but it seems I would have to run two consecutive commands just to do a typical push/pull when a directory has same file names. Doesn't this seem inefficient from a software and user process standpoint?

In the end it is your application and your decision, and I'm grateful that you forked/created it.

@odeke-em
Owner

Hello @ikwyl6, drive already reports exactly which files clash; try it out. It provides the exact ids and files. Drive even allows operations by id, so you can do a pull, push, delete etc. Try doing a push or pull and see for yourself. It has always done so, unless something broke in the last few days.

@allanstreib

> The -ignore-name-clashes mitigates this situation by greedily selecting the first clashing item and then trying to match it up locally.

Hi @odeke-em, thanks for this project. When you say "greedily selecting" what does this mean? Does it pull the most recent file, or is it in any other way deterministic?

@odeke-em
Owner

Hello @allanstreib, it is actually just the first returned result. The sort order is usually by creation (that's as deterministic as it is right now), but the sort order could change, so the only sure answer is: the first returned item.
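
A minimal sketch of this "first returned item wins" selection, assuming results arrive as `(file_id, name)` pairs. The function name `pick_first_per_name` and the ids are hypothetical, not taken from drive's source.

```python
def pick_first_per_name(results):
    """Keep only the first result seen for each name; later clashing items are ignored."""
    chosen = {}
    for file_id, name in results:
        # setdefault stores the id only if this name has not been seen yet.
        chosen.setdefault(name, file_id)
    return chosen


# Two remote files with the same name, listed oldest first.
by_creation = [("id-old", "abc.jpg"), ("id-new", "abc.jpg")]

oldest_first = pick_first_per_name(by_creation)
newest_first = pick_first_per_name(list(reversed(by_creation)))
```

The two calls pick different ids for `abc.jpg`, which shows why the result is only as deterministic as the order in which the items are returned.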

@jonbrock

Would it be possible to resolve conflicts by appending the file id in parentheses? So if your drive has abc.jpg twice, locally it would become abc (as89djv24aslkdf).jpg and abc (ij234kltjlkasdvi0i8).jpg.

The idea comes from this thread: https://productforums.google.com/forum/#!topic/drive/Yjmkd4nbhw4

Basically, the desktop client for Google Drive seems to strip out parenthesized text from filenames when uploading. So if you uploaded abc.jpg and abc (1).jpg, you now have abc.jpg as a duplicate file on your drive. But since you can get the underlying file id with the API, you can map both of those to the correct local file.
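
The naming scheme suggested above could look roughly like this. It is a sketch under assumptions: `local_names` and `parse_local_name` are hypothetical helpers, entries are `(file_id, name)` pairs for one folder, and ids are assumed to contain no parentheses.

```python
import os
import re
from collections import Counter


def local_names(entries):
    """Map file_id -> local name, adding ' (file_id)' before the extension for clashing names only."""
    counts = Counter(name for _, name in entries)
    result = {}
    for file_id, name in entries:
        if counts[name] > 1:
            stem, ext = os.path.splitext(name)
            result[file_id] = f"{stem} ({file_id}){ext}"
        else:
            result[file_id] = name
    return result


# Matches "<stem> (<id>)<optional .ext>".
_SUFFIX = re.compile(r"^(?P<stem>.*) \((?P<id>[^()]+)\)(?P<ext>\.[^.]*)?$")


def parse_local_name(local_name):
    """Recover (remote_name, file_id) from a disambiguated local name, else (local_name, None)."""
    match = _SUFFIX.match(local_name)
    if not match:
        return local_name, None
    return match.group("stem") + (match.group("ext") or ""), match.group("id")


names = local_names([
    ("as89djv24aslkdf", "abc.jpg"),
    ("ij234kltjlkasdvi0i8", "abc.jpg"),
    ("x1", "notes.txt"),
])
```

One limitation of this sketch: a file whose real name already ends in a parenthesized suffix, such as abc (1).jpg, would be misread as carrying the id `1`, so a real implementation would need a way to tell the two cases apart.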

@odeke-em
Owner

Thanks @jonbrock for that suggestion. It sounds good, but it involves renaming the remote file and a bit of fiddling, since currently we just look up by path and only then match up by id. Sure, stripping out the parenthesized text sounds like a plan, but what happens if a user already has such names on their drive? I'll explore the idea more. If I may ask, could we move this discussion and your suggestions to #156, which is more relevant and was opened to actually address clashes, rather than the question here? I had an idea for how to handle this without a lot of fiddling, but it will involve a little more research and creating a simple file system with ids as keys (just like Google Drive does remotely). That's just a thought for now.

@odeke-em
Owner

odeke-em commented Aug 1, 2015

@ikwyl6 ping!

4 participants