Skip to content

Commit

Permalink
Fix remote conflict detection
Browse files Browse the repository at this point in the history
- teach Uploader to set the dmd metadata's last_downloaded_uri
to db_entry.last_uploaded_uri
- teach Downloader to call did_upload_version properly
and set last_uploaded_uri to filecap
and set last_downloaded_uri to last_uploaded_uri
- fix Downloader remote conflict detection:
 - conflict "2a" is triggered appropriately for deletion versus modification
 - conflict "2cii" and "2ciii" shall not be triggered when the version is zero
  • Loading branch information
david415 committed Nov 4, 2015
1 parent c9b26cb commit 23e76b0
Show file tree
Hide file tree
Showing 2 changed files with 33 additions and 15 deletions.
43 changes: 28 additions & 15 deletions src/allmydata/frontends/magic_folder.py
Expand Up @@ -357,8 +357,9 @@ def _maybe_upload(val, now=None):
metadata = { 'version': new_version,
'deleted': True,
'last_downloaded_timestamp': last_downloaded_timestamp }
if db_entry.last_downloaded_uri is not None:
metadata['last_downloaded_uri'] = db_entry.last_downloaded_uri

if db_entry is not None and db_entry.last_uploaded_uri is not None:
metadata['last_downloaded_uri'] = db_entry.last_uploaded_uri

This comment has been minimized.

Copy link
@daira

daira Nov 5, 2015

This is not correct. The intent really is that the last_downloaded_uri in the metadata is the URI of the last version downloaded by this client. The last_downloaded_uri field in the database should be set only on downloads, and the last_uploaded_uri field should be set only on uploads.

If, for example, Alice downloads version x from Bob, then uploads versions x+1 followed by x+2, then the last_downloaded_uri in the metadata of version x+2 should be the URI of version x, not the URI of version x+1. This is because a Bob that initially has version x should be able to go directly to version x+2 (which might happen if both uploads happened between two of Bob's scans), and treat that as an overwrite, not a conflict. If, on the other hand, Bob downloads version x+1 first and then version x+2, both of those are still overwrites because their last_downloaded_uris are both equal to Bob's last_uploaded_uri, which is the URI of version x.

I think part of the problem with the tests is that did_upload_version is incorrectly called for downloads as well as uploads (on line 707 of magic_folder.py).

It is more complicated if there are three or more clients. I'm not entirely sure that the conflict detection algorithm works properly in that case, but that should not affect the current tests, all of which use at most two clients.

This comment has been minimized.

Copy link
@david415

david415 Nov 5, 2015

Author Owner

This algorithm definitely does not work for three or more parties.

Alice downloads version x from Bob, then uploads versions x+1 followed by x+2.
The metadata last_downloaded_uri will be set to uri of version x.
Ada downloads version x+2 from Alice.
Ada uploads version x+3, setting the last_downloaded_uri in the metadata to URI of x+2 and last_uploaded_uri to version x+3.
Bob downloads version x+3 and sees it as a conflict because last_downloaded_uri is set to URI of x+2 and is not equal to Bob's last_uploaded_uri.

This comment has been minimized.

Copy link
@daira

daira Nov 24, 2015

In the example, once Alice has uploaded version x+2, Bob will see and download that version on his next scan. So unless the updates of Alice's and Ada's DMDs (for versions x+2 and x+3 respectively) happen out-of-order from Bob's perspective, Bob has sufficient information to be able to tell that there have been no conflicts.

The current documented algorithm does not handle this case correctly, because last_uploaded_uri is only updated on uploads. I'll fix this when I get back; I think basically the solution is to maintain a last_seen_uri in the database instead of last_uploaded_uri, but it needs some careful thinking about.


empty_uploadable = Data("", self._client.convergence)
d2 = self._upload_dirnode.add_file(encoded_path_u, empty_uploadable,
Expand Down Expand Up @@ -395,9 +396,7 @@ def _failed(f):
return upload_d
elif pathinfo.isfile:
db_entry = self._db.get_db_entry(relpath_u)

last_downloaded_timestamp = now

if db_entry is None:
new_version = 0
elif is_new_file(pathinfo, db_entry):
Expand All @@ -409,8 +408,8 @@ def _failed(f):

metadata = { 'version': new_version,
'last_downloaded_timestamp': last_downloaded_timestamp }
if db_entry is not None and db_entry.last_downloaded_uri is not None:
metadata['last_downloaded_uri'] = db_entry.last_downloaded_uri
if db_entry is not None and db_entry.last_uploaded_uri is not None:
metadata['last_downloaded_uri'] = db_entry.last_uploaded_uri

This comment has been minimized.

Copy link
@daira

daira Nov 5, 2015

Same issue as above.


uploadable = FileName(unicode_from_filepath(fp), self._client.convergence)
d2 = self._upload_dirnode.add_file(encoded_path_u, uploadable,
Expand Down Expand Up @@ -696,16 +695,17 @@ def _process(self, item, now=None):

def do_update_db(written_abspath_u):
filecap = file_node.get_uri()
last_uploaded_uri = metadata.get('last_uploaded_uri', None)
last_downloaded_uri = filecap
last_downloaded_timestamp = now
last_uploaded_uri = metadata.get('last_uploaded_uri', None)

written_pathinfo = get_pathinfo(written_abspath_u)

if not written_pathinfo.exists and not metadata.get('deleted', False):
raise Exception("downloaded object %s disappeared" % quote_local_unicode_path(written_abspath_u))

self._db.did_upload_version(relpath_u, metadata['version'], last_uploaded_uri,
last_downloaded_uri, last_downloaded_timestamp, written_pathinfo)
self._db.did_upload_version(relpath_u, metadata['version'], filecap,
last_uploaded_uri, last_downloaded_timestamp, written_pathinfo)

This comment has been minimized.

Copy link
@daira

daira Nov 5, 2015

The first change (to filecap) is correct. I think that did_upload_version should not be changing last_downloaded_uri or last_downloaded_timestamp at all, so those two arguments should be removed (and we need something like a did_download_version method as well).

self._count('objects_downloaded')
def failed(f):
self._log("download failed: %s" % (str(f),))
Expand All @@ -718,20 +718,33 @@ def fail(res):
d.addCallback(fail)
else:
pathinfo = get_pathinfo(abspath_u)

db_entry = self._db.get_db_entry(relpath_u)
dmd_last_downloaded_uri = metadata.get('last_downloaded_uri', None)

# See <docs/proposed/magic-folder/remote-to-local-sync.rst#conflictoverwrite-decision-algorithm>.
is_conflict_2a = not pathinfo.exists

This comment has been minimized.

Copy link
@daira

daira Nov 5, 2015

Oops, I got this completely wrong (in the implementation, not the spec). 2a says that if the file doesn't exist, it is an overwrite (and the 2c conditions are ignored), not a conflict.

So it should be something like:

if not pathinfo.exists:
    is_conflict = False
else:
    is_conflict_2c_i = ...
    is_conflict_2c_ii = ...
    is_conflict_2c_iii = ...
    is_conflict = is_conflict_2c_i or is_conflict_2c_ii or is_conflict_2c_iii
#iii. either dmd_last_downloaded_uri or db_entry.last_uploaded_uri (or both) are absent, or they are different.
print "metadata %r" % (metadata,)
is_conflict_2a = False
if metadata['version'] != 0:
if 'deleted' in metadata.keys():

This comment has been minimized.

Copy link
@daira

daira Nov 5, 2015

Use if metadata.get('deleted', False): (if this code is still needed).

if not pathinfo.exists:
is_conflict_2a = True

This comment has been minimized.

Copy link
@daira

daira Nov 5, 2015

See above; not pathinfo.exists should always imply an overwrite.

else:
if db_entry is None:
is_conflict_2a = True

is_conflict_2c_i = self._is_upload_pending(relpath_u)
is_conflict_2c_ii = db_entry is None or is_new_file(pathinfo, db_entry)
is_conflict_2c_iii = (dmd_last_downloaded_uri is None or db_entry is None or db_entry.last_uploaded_uri is None
or dmd_last_downloaded_uri != db_entry.last_uploaded_uri)
if metadata['version'] == 0:

This comment has been minimized.

Copy link
@daira

daira Nov 5, 2015

I don't think it can be correct to treat version 0 differently from later versions, because the first version that a given client sees may legitimately not be version 0, and that shouldn't affect conflict detection.

is_conflict_2c_ii = False
is_conflict_2c_iii = False
else:
is_conflict_2c_ii = db_entry is None or is_new_file(pathinfo, db_entry)
is_conflict_2c_iii = (dmd_last_downloaded_uri is None or db_entry is None or db_entry.last_uploaded_uri is None
or dmd_last_downloaded_uri != db_entry.last_uploaded_uri)

print "conflict?", is_conflict_2a, is_conflict_2c_i, is_conflict_2c_ii, is_conflict_2c_iii

is_conflict = is_conflict_2a or is_conflict_2c_i or is_conflict_2c_ii or is_conflict_2c_iii

if is_conflict:
self._count('objects_conflicted')

Expand Down
5 changes: 5 additions & 0 deletions src/allmydata/test/test_magic_folder.py
Expand Up @@ -805,6 +805,8 @@ def Alice_to_delete_file():
os.unlink(self.file_path)
self.notify(to_filepath(self.file_path), self.inotify.IN_DELETE, magic=self.alice_magicfolder)
d.addCallback(_wait_for, Alice_to_delete_file)
d.addCallback(lambda ign: self._check_downloader_count('objects_conflicted', 0, magic=self.alice_magicfolder))
d.addCallback(lambda ign: self._check_downloader_count('objects_conflicted', 0))

def notify_bob_moved(ign):
d0 = self.bob_magicfolder.uploader.set_hook('processed')
Expand All @@ -828,6 +830,9 @@ def notify_bob_moved(ign):
d.addCallback(lambda ign: self._check_downloader_count('objects_failed', 0))
d.addCallback(lambda ign: self._check_downloader_count('objects_downloaded', 2))

d.addCallback(lambda ign: self._check_downloader_count('objects_conflicted', 0, magic=self.alice_magicfolder))
d.addCallback(lambda ign: self._check_downloader_count('objects_conflicted', 0))

def Alice_to_rewrite_file():
print "Alice rewrites file\n"
self.file_path = abspath_expanduser_unicode(u"file1", base=self.alice_magicfolder.uploader._local_path_u)
Expand Down

1 comment on commit 23e76b0

@daira
Copy link

@daira daira commented on 23e76b0 Nov 5, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1; my mistake about the sense of condition 2a is probably confusing things. I suggest reverting this commit then fixing that first, and then fixing the issue I described in 23e76b0#commitcomment-14200290

Please sign in to comment.