New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New conflict resolution modes #127

Closed
untitaker opened this Issue Oct 18, 2014 · 21 comments

Comments

Projects
None yet
4 participants
@untitaker
Member

untitaker commented Oct 18, 2014

I think the current options are not very useful. None is a sane default, but nearly useless in practice. * wins, on the other hand, might lead to data-loss.

keep both would delete the old item from both sides and upload both versions as new items. This is not very easy to implement as we would have to change or remove the item's UID. We already use icalendar for item-changing operations (e.g. splitting big VCALENDAR files) but that library is not very vcard-aware and since #70 i am kinda scared to further involve icalendar in vdirsyncer's operations.

Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

@hansbkk

This comment has been minimized.

hansbkk commented Feb 12, 2015

Complete noob suggestion:

"a wins, drop b's collision-data"
"a wins, b's collision-data appended to Note"
"b wins, drop a's collision-data"
"b wins, a's collision-data appended to Note"

Use a string like "vdirsyncer-conflict:" prefixing the note data for the user to use for searching during a cleanup session.

@untitaker untitaker changed the title from New conflict resolution mode: "keep both" to New conflict resolution modes Feb 12, 2015

@untitaker

This comment has been minimized.

Member

untitaker commented Feb 12, 2015

"a wins, drop b's collision-data"

This is already done with a wins and b wins. Another option is to vimdiff both files or something like that.

@hansbkk

This comment has been minimized.

hansbkk commented Feb 12, 2015

Yes I was just including them for completeness.

I assumed you were defining a collision as a conflict at the field level -
are you doing this on a whole-item basis?

I think Annemarie's tool I just discussed with you has a "merge" choice -
is that what you meant by diff'ing?

On Thu, Feb 12, 2015 at 3:21 PM, Markus Unterwaditzer <
notifications@github.com> wrote:

"a wins, drop b's collision-data"

This is already done with a wins and b wins. Another option is to vimdiff
both files or something like that.


Reply to this email directly or view it on GitHub
#127 (comment)
.

@untitaker

This comment has been minimized.

Member

untitaker commented Feb 13, 2015

I'm currently doing this on a per-item basis, so I can avoid parsing the items in normal situations. Perhaps per-field conflict resolution could be implemented on top of it. I don't know how that tool lools like.

@hansbkk

This comment has been minimized.

hansbkk commented Feb 13, 2015

Yes, an intra-item parse and merging when no fields in common are in
conflict is the best option if you can swing it.

"That tool" meaning Annamarie's? She really is very approachable, friendly
helpful etc, strongly encourage you to get in touch with her if you think
that would be helpful, her sync tool really seems to be close to a
vdirsyncer for Android, at least my perception, very much in the same
spirit, your tools together can really provide a great end-to-end solution.
. .

On Fri, Feb 13, 2015 at 1:22 AM, Markus Unterwaditzer <
notifications@github.com> wrote:

I'm currently doing this on a per-item basis, so I can avoid parsing the
items in normal situations. Perhaps per-field conflict resolution could be
implemented on top of it. I don't know how that tool lools like.


Reply to this email directly or view it on GitHub
#127 (comment)
.

@untitaker

This comment has been minimized.

Member

untitaker commented Feb 13, 2015

Yes, an intra-item parse and merging when no fields in common are in
conflict is the best option if you can swing it.

That'd be ideal, although I currently don't know how to do that in a reliable
and still generic way. I don't think there is any tool out there actually doing
that.

"That tool" meaning Annamarie's?

Yeah. I didn't mean to describe ContactSync in such a degoratory tone.

She really is very approachable, friendly helpful etc, strongly encourage you
to get in touch with her if you think that would be helpful, her sync tool
really seems to be close to a vdirsyncer for Android, at least my perception,
very much in the same spirit, your tools together can really provide a great
end-to-end solution.

I want to take a closer look at it before doing anything else. There are a
lot of badly written DAV clients that e.g. don't do etag checking while
uploading.

@WhyNotHugo

This comment has been minimized.

Member

WhyNotHugo commented Jul 9, 2015

If we have new conflict resolution methods, comparing the item's revision (vCard) and sequence number(iCal) would be my first choice, since software that edits the entries should be incrementing these.

Why the IETF didn't pick the same field name on both specs amazes me.

@untitaker

This comment has been minimized.

Member

untitaker commented Jul 9, 2015

👍 for comparing revisions, not sure if sequence numbers are useful for deciding which item is more up-to-date.

@geier

This comment has been minimized.

Member

geier commented Jul 14, 2015

not sure if sequence numbers are useful for deciding which item is more up-to-date.

The RFC is pretty ambiguous, quote:

It is monotonically incremented [...] each time the "Organizer" makes a significant revision to the calendar component.

So what exactly is a 'significant' revision? Also what happens if I change a parameter and change it back after saving that event, but without it being uploaded to the server? Should the SEQUENCE be incremented once or twice?

With all that said, I'd still enable the "later sequence wins" strategy as I believe it would work good enough in practice.

@untitaker

This comment has been minimized.

Member

untitaker commented Jul 14, 2015

I can't imagine a case where the sequence number is useful in the case of
conflicts
. If both sides have been changed (which implies a conflict), the
sequence number only shows which of the items has changed more often, but
that doesn't seem like a very good indicator on which item is more
up-to-date
.

Or it only shows which side increases SEQUENCE more aggressively, which is an
even less useful indicator for conflict resolution.

Also what happens if I change a parameter and change it back after saving that
event, but without it being uploaded to the server

At least in the case of vdirsyncer the item is uploaded anyway, and I think
most clients behave like that (e.g. DavDroid marks items as "dirty" after
they've been edited). I've thought about implementing some sort of diffing
feature but I think it would degrade performance for very little gain.

On Tue, Jul 14, 2015 at 02:18:48AM -0700, Christian Geier wrote:

not sure if sequence numbers are useful for deciding which item is more up-to-date.

The RFC is pretty ambiguous, quote:

It is monotonically incremented [...] each time the "Organizer" makes a significant revision to the calendar component.

So what exactly is a 'significant' revision? Also what happens if I change a parameter and change it back after saving that event, but without it being uploaded to the server? Should the SEQUENCE be incremented once or twice?

With all that said, I'd still enable the "later sequence wins" strategy as I believe it would work good enough in practice.


Reply to this email directly or view it on GitHub:
#127 (comment)

@WhyNotHugo

This comment has been minimized.

Member

WhyNotHugo commented Jul 14, 2015

Also what happens if I change a parameter and change it back after saving that event, but without it being uploaded to the server?

The program that edits the event doesn't do anything dav-related, so it would increment the revision every time there's a change.

I can't imagine a case where the sequence number is useful in the case of conflicts

You're perfectly right. Silly oversight on my part. revision might make sense in that it's a datetime, so we can just "keep latest" (but not as a default, of course).

At least in the case of vdirsyncer the item is uploaded anyway, and I think most clients behave like that (e.g. DavDroid marks items as "dirty" after they've been edited)

How does DavDroid know it has been modified? Is it an event editor as well as syncer? Or just by timestamp?

@untitaker

This comment has been minimized.

Member

untitaker commented Jul 14, 2015

As far as I understand it the dmfs tasks provider for android has a bool dirty flag for each task, which gets set to true by the dmfs Tasks app if the task is edited. When syncing, DavDroid uploads those tasks and resets the flag. Vdirsyncer uses mtime from the filesystem for that: Our "flag" is comparing the current mtime with a previously saved value. So both vdirsyncer and DavDroid are (I think) not caring if a change has been reverted: The flag/mtime did change and that's what matters.

It just occured to me we have the same problem with using REVISION as with any other property: If one editor doesn't support it, it might make vdirsyncer biased towards the other side. So maybe we should only use this if both REVISION values changed so we know both clients support it - but then we'd have to keep track of old values.

I think I'm slowly tending to be against that resolution mode because of this.

Basically for almost any mode where we inspect the item content, I think we need the item content from the previous sync as well, just like we store etags at the moment. Then we could do all the fancy things git does, i.e. merging items per property. But this would require ALL CONTENT FROM ALL ITEMS, which is an insane amount of data to fetch on first sync.

@WhyNotHugo

This comment has been minimized.

Member

WhyNotHugo commented Jul 14, 2015

But this would require ALL CONTENT FROM ALL ITEMS, which is an insane amount of data to fetch on first sync.

Uhm, aren't we fetching all this anyway?

@untitaker

This comment has been minimized.

Member

untitaker commented Jul 14, 2015

Uhm, aren't we fetching all this anyway?

Ah yes, indeed. Then the only disadvantage is storing all the content once again on disk. I'm not sure if that's worth it...

@untitaker

This comment has been minimized.

Member

untitaker commented Jul 14, 2015

Also the status file will get immensively big, so we have to change its format to avoid writing the whole file on each sync.

@WhyNotHugo

This comment has been minimized.

Member

WhyNotHugo commented Dec 23, 2015

manual would be a great choice. Leave the new, conflicting entry in the same directory as .ics.conflict, and the let the user diff that with the original one.

Eg, if qlSZGmy0Gy1vPsa1RvMKz5FG.ics has a conflict, grab the new one as qlSZGmy0Gy1vPsa1RvMKz5FG.ics.conflict.

Note that this does not violate the vdir spec, but the new file should be ignored by any vdir-complaint tool.

What I'm still unsure is how to tell vdirsyncer "hey, I'm merged this manually, keep the local version". Changing the settings might result in accidentally overwriting another, new conflicting entry.

@untitaker

This comment has been minimized.

Member

untitaker commented Dec 23, 2015

I rather had something interactive in mind:

  • Print: "vimdiff is going to be opened. Edit the files to be equal in content"
  • Let user hit enter
  • Open vimdiff on two temporary files
  • wait for the user
  • Check files for equal content

This kind of conflict resolution is much more easily discoverable, and I would make it a default.

With your suggestion I see the problem that the user might not grep the vdir for conflict files (like I myself never look for pacnew files), and the new files are "just sitting there". In such a scenario we're effectively back to the current from a and from b modes we have now.

Also it's not implementable in a generic way since it assumes that one storage of the sync pair is the filesystem storage.

@WhyNotHugo

This comment has been minimized.

Member

WhyNotHugo commented Dec 23, 2015

I really like that proposal, but it needs to deal with being run non-interactively (since I run it via cron and get the output via email, and I'm sure I'm not the only one doing so).

Maybe an extra flag can trigger --resolve to trigger this might be enough.

You can scrub my previous suggestion completely in favour of this one. :D

@untitaker

This comment has been minimized.

Member

untitaker commented Dec 23, 2015

I really like that proposal, but it needs to deal with being run non-interactively (since I run it via cron and get the output via email, and I'm sure I'm not the only one doing so).

One probably should be able to configure a command other than vimdiff. In your case you'd probably have a script that copies one file into the other and sends you a mail that it performed conflict resolution or something. Or it copies both files into a backup dir...

Would you be willing to implement this @hobarrera? :)

@WhyNotHugo

This comment has been minimized.

Member

WhyNotHugo commented Dec 23, 2015

One probably should be able to configure a command other than vimdiff.

Yes of course, I'd assumed we both had that in mind

In your case you'd probably have a script that copies one file into the other and sends you a mail that it performed conflict resolution or something. Or it copies both files into a backup dir...

I'd rather the conflict resolution do nothing if not running interactively. Maybe don't-run manually if we're not currently on a tty or something?

Would you be willing to implement this @hobarrera? :)

Sure, looks like a fun (and useful) challenge. :)

@untitaker

This comment has been minimized.

Member

untitaker commented Dec 23, 2015

I'd rather the conflict resolution do nothing if not running interactively. Maybe don't-run manually if we're not currently on a tty or something?

You could just exit with a non-zero exit code from your script for the same effect. This presumably already happens if you invoke vimdiff (which would probably be the default command) on a non-tty. Those are implementation details though, and while I don't think that we need to check for ttys ourselves, I agree with you about the user-facing behavior.

Sure, looks like a fun (and useful) challenge. :)

Great. Feel free to ask any questions about vdirsyncer's internals or things like that.

@untitaker untitaker added the planning label Mar 19, 2016

untitaker added a commit that referenced this issue Sep 18, 2016

Refactor of status handling in sync
- Using `info.idents` as new status, this saves a few operations where
  no storage actions have to be taken, but the status has to be updated.

- Rename StorageSyncer to _StorageInfo and make it a private API again.

- Ability to pass custom functions for conflict resolution. This part is
  a preparation for #127.

untitaker added a commit that referenced this issue Sep 18, 2016

Refactor of status handling in sync
- Using `info.idents` as new status, this saves a few operations where
  no storage actions have to be taken, but the status has to be updated.

- Rename StorageSyncer to _StorageInfo and make it a private API again.

- Ability to pass custom functions for conflict resolution. This part is
  a preparation for #127.

untitaker added a commit that referenced this issue Sep 18, 2016

Refactor of status handling in sync (#505)
- Using `info.idents` as new status, this saves a few operations where
  no storage actions have to be taken, but the status has to be updated.

- Rename StorageSyncer to _StorageInfo and make it a private API again.

- Ability to pass custom functions for conflict resolution. This part is
  a preparation for #127.

@untitaker untitaker closed this in 2256857 Sep 19, 2016

@untitaker untitaker removed the planning label Sep 19, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment