Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON Range format doesn't support [n:n], [-n], or [-n:-n] #29

Closed
toomim opened this issue Nov 1, 2019 · 10 comments
Closed

JSON Range format doesn't support [n:n], [-n], or [-n:-n] #29

toomim opened this issue Nov 1, 2019 · 10 comments

Comments

@toomim
Copy link
Member

toomim commented Nov 1, 2019

As discussed here: 5729979#r35759079

Zero-length ranges

Range starts and ends should be allowed to be equal. That is how you insert new text at a zero-length location.

Negative numbers

It is also nice to be able to use negative numbers, to specify a distance from the end. That's useful if you want to write a synchronizer by hand, that just implements an append-only log.

For instance, let's say that you're writing a chat application. It's nice to make it easy to add a message to the end of the array. Your code to do that is even simpler if it doesn't have to keep track of how long the array is, and can just say "Insert this at -0".

Or let's say that you have a server in your corporate LAN that maintains logs of all your systems. Then whenever you need to add a new line to the end of the log, you just issue a patch that does a PUT http://log_server/log '[-0:-0] = "new event: alert on system Q\n"'.

Obviously for these cases it's hard to use - as the character to separate the two numbers. It's better to use :.

@mitar
Copy link
Member

mitar commented Nov 1, 2019

TODO: We also have to extend this for bytes range.

@mitar
Copy link
Member

mitar commented Nov 1, 2019

So I agree fully with zero-length ranges. It will need some explanation, but yes, we need that.

But I strongly oppose negative numbers. To me patches are fixed diffs between two known states. Not programs which convert arbitrary state to another state. So you have to know what is the previous state to issue a patch to the next state. We are building whole machinery to do that. So when we start adding stuff like negative indices, then why not also add much more of some query-programming language, like: append only if previous value is not the same as the current one, append only if value is not in the set, and so on. We then get into the question of building a custom query language/programming language and where do we draw the line.

To me, the line is: diff is between two known versions of the state. Not a program which can apply to unknown state.

@toomim
Copy link
Member Author

toomim commented Nov 1, 2019

To me patches are fixed diffs between two known states.

That is one use-case we can support. But we can also support the use-case where a client does not know the entire state that they are modifying.

We are making a general protocol. We will succeed if we can get everyone to use it for all their use-cases, instead of writing custom protocols. The more stuff people put on this protocol, the more useful and successful it becomes. If people can simply use HTTP to store their logs and chat messages, writing the code by hand, then even embedded devices, like CISCO routers, could send their log files over HTTP, to some intranet server, and your client on your phone can subscribe to that log, just like any state, and it will even synchronize when you go offline, so that anytime you get internet, your logs update, just like that automatically, and you can much more easily debug your intranet's problems, all with very simple implementations and simple code written by everyone!

If you require the embedded device to implement its own full-fledged synchronizer just to send logs... that's a non-starter.

We want this protocol to fail gracefully to low-fi situations. You shouldn't have to implement all the features if you only need to use it in a very specific way.

when we start adding stuff like negative indices, then why not also add much more of some query-programming language, like: append only if previous value is not the same as the current one, append only if value is not in the set, and so on.

I'm not saying we should add a full query language; I'm saying that supporting low-fi synchronization is a core value of the Braid extensions (it was founded with that principle), and that appending to a log is a high-value use-case that we want to support, and negative-numbered indices provide that support. They are also extremely simple to implement. We've implemented the whole parser in <60 lines of javascript, IIRC.

@mitar mitar self-assigned this Nov 2, 2019
@mitar
Copy link
Member

mitar commented Nov 2, 2019

Assigning myself for the zero-length ranges.

@mitar
Copy link
Member

mitar commented Nov 2, 2019

But we can also support the use-case where a client does not know the entire state that they are modifying.

We can, but that can also be a different spec. A different patch language. Which is a full-fledged query language. Like MongoDB query language, where you can have conditions and smart operations. Not just setting values to a value, but like also incrementing the value, and other modifications which can be useful. Why is appending the only case you care about? What about incrementing the number? So this is to me the difference, here we have a spec for the use case with known state. And for unknown state we need much more.

This was also your argument against JSON patch, that they test operation. And is also a problem of JSON patch, that they allow a bit of querying, but very little, so many issues for JSON patch2 is about "how querying is not powerful enough":

And that is what I worry about, once we try to cover the other use case, we will never be able to fully cover it unless we have a complicated query language. Which I am OK to exist, but it can always be a separate spec which instead of a static patch defines a query you apply. And the interesting thing is that once it is made it will continue to be fully compatible with Braid. I just do not think it should be part of this spec.

We are making a general protocol.

But also generally used protocol, where we can expect that any Braid endpoint talks it. So that also means that we need to limit the amount of features we expect everyone to implement. So for special cases like append-only states, I think the answer is that this is not something which needs to be general and that everyone will try to use against a random Braid site, but is particular to that particular app accepting such changes.

And Braid is well compatible with that: it means that the same resource can also accept POST requests with custom logic on what is provided and how the state gets updated. How I see it is that POST requests are object methods on the object, while state is an object. So we are defining a protocol how to sync the state of the object (properties) between peers, however the state was changed: through direct manipulation of properties or through calling methods on the object. How one defines methods and what is logic for those methods is in my view out of scope for Braid (I would do it as JSON-RPC).

Also, we already expect that the user knows the version of the parent state. So it would be really surprising if they do not know the state itself as well. And if they do not know it, you can have a proxy (app) in between which accepts method calls and translates that to state patches. (That is what method definitions would be anyway.)

If you require the embedded device to implement its own full-fledged synchronizer just to send logs...

No, they just call POST on some server app. And this is it. That app then updates the state. If peer cannot sync states fully, then they need something else which can.

We want this protocol to fail gracefully to low-fi situations. You shouldn't have to implement all the features if you only need to use it in a very specific way.

Yes, so you can have a low resource device talk to a specific server. And they have the method defined between them. App specific. But I cannot imagine a use case where we would want a low resource device to talk to any Braid site on the Internet and want it to be able to modify their state without knowing what that state is. Please give me an example for that. I do not think it exists. And if it does not, then the support for low resource devices is already there, it is is called HTTP and does not need Braid at all. The device talks to a particular server over whichever protocol it wants. It does not have to be standardized at all.

I feel very strongly about this. I really think that Braid should limit itself to state syncing and not include state manipulation. Braid will provide ways to limit resource consumption by pruning history and things like that. But when you are not talking about syncing, but just one way sending of data to another server, we do not need Braid for that.

mitar added a commit that referenced this issue Nov 2, 2019
mitar added a commit that referenced this issue Nov 2, 2019
@mitar mitar removed their assignment Nov 2, 2019
@mitar
Copy link
Member

mitar commented Nov 2, 2019

OK, after more thought I think I would support a compromise: We allow - to represent the zero-length range at the end of the array/string. This also makes it compatible with how JSON patch is using it. This would allow appending. But I would not allow more complicated ranges.

Can we agree on this?

mitar added a commit that referenced this issue Nov 2, 2019
mitar added a commit that referenced this issue Nov 2, 2019
mitar added a commit that referenced this issue Nov 2, 2019
mitar added a commit that referenced this issue Nov 2, 2019
@toomim
Copy link
Member Author

toomim commented Nov 3, 2019

To ground this discussion, here's where we've defined our core values:
https://github.com/braid-work/braid-spec/blob/master/draft-xx-httpbis-braid-http-00.txt#L126

1.1.  Design Goals

   This spec is designed to be:
     - Backwards-compatible with existing HTTP.
     - Easy to implement for simple synchronizers.
     - While also supporting arbitrary synchronization algorithms, such as
       Operational Transform and CRDTs.

We take this seriously. The Braid protocol makes all these optional:

  • Versions
  • Parents
  • Patches (you can just provide the full state instead)
  • Merge-Types (if you need consistency)

And there are use-cases for each of these optional features individually. For instance:

Also, we already expect that the user knows the version of the parent state. So it would be really surprising if they do not know the state itself as well.

No, we don't expect them to know the versions. The client can make a change without specifying the version, and/or without the parents. They can make a change without having a full copy of the state themselves. Both of these are common in the use-case of appending to a system log, that I've already described... (sigh...)

When you are appending to a system log, you don't need the full state of the log. You don't care what version it is, and you really care too much how the log events get ordered w.r.t. other log events from other clients, so you don't need to specify the parent versions that you are appending from.

@mitar
Copy link
Member

mitar commented Nov 3, 2019

So, can we then agree that appending is the only one we want to allow? My PR in #44 adds such language. I think it aligns with everything you wrote in the last comment. But I do not want it to be more than that. I think if we want a more turing-complete patch/query language, I would prefer another spec for that. So that implementors on the server side can decide if they want to support only simple patches or complicated queries. So I would keep "range patches" simple, but you are free to define another query language as well. I just think that is a big thing to define a really useful query language. I mean, MongoDB is working on it for years and they still find strange edge cases they cannot support or that their own code handles strangely. Especially when you start applying things to nested arrays.

@toomim
Copy link
Member Author

toomim commented Nov 3, 2019

I feel bad repeating myself, but here goes:

  • I don't want a full query language
  • I want to support [n:n], [-n], and [-n:-m]

This entails a little bit more than append. Append is a replacement of [-0:-0] with new content. We should support arbitrary patches to [-n:-m]. It's very elegant and simple to implement.

@mitar
Copy link
Member

mitar commented Nov 3, 2019

So appending is not supported for the json range unit, but negative indices are not.

Closing in favor of #47.

@mitar mitar closed this as completed Nov 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants