Automerge #30

Open
gmaclennan opened this issue Feb 23, 2017 · 5 comments

Comments

@gmaclennan
Member

The most common cause of forked documents is two users editing the same way. This is especially likely for long rivers that cross a large area. We have observed that users are very tempted to "fix" rivers by adjusting the alignment or "rounding" a sharp corner, so two users working in different areas could easily adjust parts of the same river.

One way around this is to create rivers as a relation of multiple shorter ways, which is the recommended way to represent large ways or areas in OSM. If two users edit two different way sections of a river, the relation itself does not change, and no forks are created. However, this behaviour would be broken by digidem/osm-p2p-db#49, which would modify the version of a relation for every edit of its members. That change in behaviour is likely necessary for fixing bugs related to replication, deletions and forks.

If digidem/osm-p2p-db#49 is implemented, a solution to this problem is to do "auto-merging" in osm-p2p-server or osm-p2p-api. Auto-merging would work as follows:

  • Check whether a relation is forked.
  • If yes, get the common parent and compare the versions of the members.
  • If each member's version has changed on only one of the forks (compared to the common parent), the relation can be auto-merged by including the most recent version of each of its members.
  • A forked relation where each fork has a different change to the same member cannot be auto-merged.
  • The fork can be presented to the client (e.g. iD editor) as a single unforked relation with a special version number that refers to the forks that were merged. One way of doing this would be to set the version number to a comma-separated list of the versions of the merged relations.
  • On submission, iD includes the version IDs that were edited, in this case a comma-separated list.
  • osm-p2p-server can use the comma-separated version ID to set the opts.links that is passed to hyperlog/hyperkv.
  • After editing there is only one head: the forks are actually merged.
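
The comparison step in the list above could be sketched roughly as follows. This is only an illustration under assumptions: the document shape (`{ version, members }` with members as a map from member id to member version) and the function name `autoMergeRelation` are hypothetical and not part of osm-p2p-db, and "most recent version" is taken to mean the version that differs from the common parent.

```javascript
// Hypothetical shape: { version: string, members: { [memberId]: memberVersion } }.
// A member that is absent from a fork is treated as removed on that fork.

function autoMergeRelation(parent, forkA, forkB) {
  const ids = new Set([
    ...Object.keys(parent.members),
    ...Object.keys(forkA.members),
    ...Object.keys(forkB.members),
  ]);
  const members = {};
  const conflicts = [];

  for (const id of ids) {
    const base = parent.members[id];
    const a = forkA.members[id];
    const b = forkB.members[id];

    let chosen;
    if (a === b) {
      chosen = a; // both forks agree (unchanged, or the same change)
    } else if (a === base) {
      chosen = b; // only fork B changed this member
    } else if (b === base) {
      chosen = a; // only fork A changed this member
    } else {
      conflicts.push(id); // both forks changed it differently
      continue;
    }
    if (chosen !== undefined) members[id] = chosen;
  }

  if (conflicts.length > 0) return { merged: false, conflicts };

  return {
    merged: true,
    // Comma-separated list of the fork versions, as proposed above.
    version: [forkA.version, forkB.version].join(','),
    members,
  };
}
```

For example, with a parent whose members are `{ w1: 'a1', w2: 'b1' }`, a fork that only changes `w1` and a fork that only changes `w2` would merge cleanly, while two forks that both change `w1` to different versions would be reported as a conflict on `w1`.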
@hackergrrl
Contributor

hackergrrl commented Feb 23, 2017 via email

@hackergrrl
Contributor

hackergrrl commented Feb 23, 2017 via email

@gmaclennan
Member Author

I think we're talking about slightly different approaches. What I am suggesting with "auto-merging" is presentational: there are two forks, but they are presented on the client as a single merged version. Only when the client edits the relation, or a member of the relation, does it create a new doc that points to the previous forks, which is an actual merge. I wonder how quickly this could get out of control if everybody is editing different sections of a large relation?

Regarding the second issue you mention, I think there are several ways that a relation could point to members which are not the head, and ways can point to nodes that are not the head. It is possible to create a node and add a way later. Perhaps one way to "auto-merge" these is to create a "virtual fork" that is presented to the user. In this case it would be a second relation_v2 that points to Way_v2.

@hackergrrl
Contributor

hackergrrl commented Feb 24, 2017 via email

@gmaclennan
Member Author

> osm-p2p-server will need to be smart enough to intercept any requests made against such faux documents

It has done this since the first version; we've just never used it: https://github.com/digidem/osm-p2p-server/blob/master/api/put_changes.js#L82-L84 - it was designed as the way to represent the links array in XML. The nice thing about this technique is that iD editor just treats it as a regular version number and passes it through; iD never touches the version number. Only when osm-p2p-server gets it back does it know that this "version id" is actually an array of two version numbers, comma-separated.
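
As a rough sketch of that round trip (not the actual code in put_changes.js; the helper names `toPresentedVersion` and `toLinks` are made up for illustration), the server could join the fork versions on the way out and split them again on the way back in:

```javascript
// Hypothetical helpers illustrating the comma-separated version id.

// Outgoing: present several heads to the client as one opaque version string.
function toPresentedVersion(headVersions) {
  return headVersions.join(',');
}

// Incoming: recover the list of versions the edit was based on.
// The result is what would be used as the links array (opts.links).
function toLinks(versionId) {
  if (versionId === undefined || versionId === null || versionId === '') {
    return []; // a newly created element has no previous version
  }
  return String(versionId)
    .split(',')
    .map((v) => v.trim())
    .filter((v) => v.length > 0);
}
```

A single ordinary version passes through unchanged as a one-element array, so the same code path would handle both forked and unforked documents.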

> What do you mean here by "out of control"? In terms of extra data created by rippling changes? By conflicts?

In a scenario where multiple users are continuously editing different (mergeable) segments of a long river relation, would new forks be created by merges faster than they are being merged?

> but I wonder if we can have less "magic" between the forking data model and what users actually interact with?

I'm not sure. I think it is always going to be a hard thing for users to understand. My current thinking is that it's best to present a single version to the user, based on something like modification time and a deforking step, but also to give a visual indication that more than one version exists, along with a UI that can display the DAG. That is where we need the UX work, to make this clear and understandable to the user who is actually interested in reviewing forks and resolving/merging them. I think only a subset of users even need to know about this.
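
A minimal sketch of the "single version plus indicator" idea, assuming each head carries a `version` string and a numeric `timestamp` (both field names and the function `pickDisplayedHead` are hypothetical, and the deforking step is not modelled here):

```javascript
// Hypothetical: each head is { version: string, timestamp: number }.
// Picks the most recently modified head for display and reports the others,
// so the UI can show that more than one version exists.

function pickDisplayedHead(heads) {
  if (heads.length === 0) return null;

  const sorted = [...heads].sort((a, b) => {
    if (b.timestamp !== a.timestamp) return b.timestamp - a.timestamp; // newest first
    // Tie-break on version so every peer picks the same head.
    if (a.version < b.version) return -1;
    if (a.version > b.version) return 1;
    return 0;
  });

  return {
    displayed: sorted[0],
    otherVersions: sorted.slice(1).map((h) => h.version),
    hasForks: sorted.length > 1,
  };
}
```

The tie-break on the version string is a design choice for this sketch only: it keeps the displayed version deterministic across peers when two heads share a timestamp.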

2 participants