Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Splitting Braid Patch into Range Patch and Synchronization Types #20

Merged
merged 20 commits into from
Nov 1, 2019

Conversation

mitar
Copy link
Member

@mitar mitar commented Oct 31, 2019

Fixes #10 and fixes #6 and fixes #7 and fixes #4.

@mitar mitar changed the title Renaming files Splitting Braid Patch into Range Patch and Synchronization Types Oct 31, 2019
@mitar
Copy link
Member Author

mitar commented Oct 31, 2019

@toomim I have finished a pass on the Synchronization Types spec. Now moving to the range patch one.

@mitar mitar marked this pull request as ready for review October 31, 2019 08:27
@mitar mitar requested a review from toomim October 31, 2019 08:27
@mitar
Copy link
Member Author

mitar commented Oct 31, 2019

Done. First pass over the split done. Please review. I might not have time anymore in the following days to look into this, so feel free to use it in whichever way you want (merge it, not merge it, change it completely).

@mitar
Copy link
Member Author

mitar commented Oct 31, 2019

FYI, braid http spec might require updates related to what this PR is changing, but I have left that out on purpose so that we first finalize these two.

@mitar
Copy link
Member Author

mitar commented Oct 31, 2019

I am currently still working on adding The +patch Structured Syntax Suffix section.

@mitar
Copy link
Member Author

mitar commented Oct 31, 2019

Pushed and finished another iteration. Feedback welcome. Feel free to merge if you accept things.

@mitar
Copy link
Member Author

mitar commented Oct 31, 2019

I defined the new range unit as json but we can change it if you prefer. I made it also for JSON-compatible structures, so it can be reused. But I think json is good because people will understand easily what it means without having to describe its properties.

If you have ideas for other range units we might want to add, we can add them as well. (Or you can add them, just follow the template I made. Of course, I have no idea if it is correct template.)

and integrity protection services, or which is designed for use
only within a secure environment. Types SHOULD always document
whether or not they need such services in their security
considerations.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the idea here is that some patches might have built-in hash-chain (like git) to previous versions. So it is easier to detect tampering. While some patches might be stand-alone. So in the former cases a synchronization type might detect that something is off. But in the later it would need additional things like signatures and stuff.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be part of the synchronization type. I think the synchronization type only needs to be there in order for two synchronizers to consistently resolve to the same state — that just means that they need to use the same function to merge multiple parent versions into a single resulting state.

So a synchronization type doesn't need to be more than a function merge (parent_versions) => state_of_child_version. This doesn't need to care about whether the version dag is merkle; whether it is hashes; it doesn't need to validate. The version identifiers can be determined by something separate, and validation can be a separate process and separate specification.

Overall, I think that this can make the actual Synchronizer-Type a lot more specific, which also reduces the potential security problems, and as a result, I think a lot of the security considerations don't need to be in there.

Copy link
Member

@toomim toomim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be giving feedback on one file at a time.

Feedback for Range-Patch

Good work! I like this a lot overall. There's great symmetry here. These are my concerns:

  • I think this should work for PUT in addition to PATCH, because PATCH isn't idempotent.
    • I know that a PATCH seems more specific than a PUT, but if GET+range can grab a subset of a resource, I think that PUT+range should be able to put a subset of a resource.
    • But perhaps this isn't important enough to care about right now.
  • I think we need the server to advertise that it accepts patches of a range
  • multipart/byteranges is just for byteranges. I think we need a new type multipart/ranges, that can be any range, where the range is specified internally.
    • I prefer separating the multiple parts via a content-length rather than separator string, but perhaps we don't need to prioritize this yet
  • I don't understand why we are specifying the stand-alone patch format here
  • JSON range also needs to be able to handle slices.

The part about a server advertising that it accepts ranges is relevant to the Server MUST NOT ignore the Range header field when used with a PATCH request. part. I think this only should be necessary if a server supports the Range header. In which case the client needs to know whether or not it supports it, which I think we could determine with some sort of Accepts header.

@mitar
Copy link
Member Author

mitar commented Oct 31, 2019

I think this should work for PUT in addition to PATCH, because PATCH isn't idempotent.

I disagree that the PATCH isn't idempotent, by the spec it can be, but I agree that we should just define it for all non GET request types. Will add that.

I think we need the server to advertise that it accepts patches of a range

Of course, but that is already standardized by Accept-Ranges for querying. For how different non-standard methods behave for any given resource, this I do not think is specified anywhere and we should also not specified here (but we can specified it in Braid HTTP spec). So we are not standardizing here that you must accept PATCH or PUT or any other method. But that if you accept it, you can use Range in this way to define patches. So that goes at the app level I think, not into the spec.

So we have to find some other way to communicate in Braid HTTP spec that the server is Braid compatible, if this is what you want. Maybe through extending OPTIONS for a resource.

multipart/byteranges is just for byteranges.

See here:

Despite the name, the "multipart/byteranges" media type is not limited to byte ranges. The following example uses an "exampleunit" range unit:

I don't understand why we are specifying the stand-alone patch format here

Because we can and we will need it for the Braid HTTP spec.

JSON range also needs to be able to handle slices.

Will add that.

I think this only should be necessary if a server supports the Range header.

So making OPTIONS request will return Accept-Ranges. The issue is that that is meant only for GET request types. How to identify that you can also use it for other request types, not sure. I will see if I can improve language there a bit.

@mitar
Copy link
Member Author

mitar commented Nov 1, 2019

Pushed improvements for few issues you raised. I will work on JSON range for slices later tonight.

@mitar
Copy link
Member Author

mitar commented Nov 1, 2019

I think I just addressed all issues you raised.

@toomim
Copy link
Member

toomim commented Nov 1, 2019

I'm still reading and thinking about this, but clearly this should be merged already. I'm sorry for leaving it open for so long.

@toomim toomim merged commit fc5943e into master Nov 1, 2019
@mitar mitar deleted the split-braid-patch branch November 1, 2019 01:52
@toomim
Copy link
Member

toomim commented Nov 1, 2019

Review of Synchronization Types

  • This is very cool. All we are doing is generalizing them from GET to
    other methods, and then — blammo — you have a general patch
    language!
  • Intro is good.
    • First sentence of second paragraph needs revision
    • What is "o Description" for?
  • I can take care of the page breaks. When we submit, we need to
    manually insert ^L characters, and label the page numbers in Table
    of Contents. I did this last time manually, and am happy doing it
    again.
  • I think we can clarify the difference between a Synchronization Type
    and its implementation.
    • The Type only needs to specify enough so that two implementations
      can synchronize together and obtain consistent output.
      • It does not, for instance, need to prescribe which algorithms or
        data structures the implementation uses, or care about the
        performance of the synchronizer.
    • Therefore, a type only needs to specify how to merge parallel edits
      together
      , aka how to resolve conflicts.
    • Here are some edits to clarify this (and here's the commit):
    2.  Definitions

       For the purpose of this document we define "synchronization" as the
       resolution of conflicts amongst patches.  There are multiple
       approaches to resolving conflicts, and for two synchronizers to be
       compatible, they must resolve them in the same way.  A
       "Synchronization Type" is an identifier that defines a method of
       resolving patches, along with a set of (optional) parameters.

       A Synchronization Type MUST specify the equivalent of a function that
       takes a set of parent versions as input and returns the state of the
       resulting merged version.
  • I think this means that much of the security considerations are not
    necessary. They mostly seem to be about particular algorithms or data
    structures, or validation or versioning, rather the behavior of the
    conflict-resolution function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants