Subscriptions RFC: Are Subscriptions and Live Queries the same thing? #284

Closed
robzhu opened this issue Mar 6, 2017 · 78 comments

Contributor

@robzhu robzhu commented Mar 6, 2017

Re-define Live Queries? Can Live Queries make Subscriptions unnecessary?
@smolinari @paralin @laneyk @dschafer @taion @Siyfion @jamesgorman2 @leebyron

Continuing the conversation from: #267 (comment)

@taion taion commented Mar 6, 2017

I think in many cases, developers use subscriptions to approximate live queries, but subscriptions are more powerful and easier to implement.

For example, in my case, where I have many microservices on my backend, where some nested fields go to other services, it's not really straightforward to define how live queries would work, and I've chosen explicitly to model things as event streams.

Live queries would be a nice abstraction on top, but it's only that – it's not, in the general case, a great backend building block.

Contributor

@stubailo stubailo commented Mar 6, 2017

I don't think this discussion should be judging whether live queries are better than subscriptions, just whether they are different enough that they should be considered independently.

I think "building block" is a great way to look at it though - subscriptions are a great well-specified unit of realtime data push that can be used to build a lot of other cool stuff. The fact that it's very easy to implement a spec-compliant subscription on the server side is pretty awesome, even if it's not always the thing you want as a client-side developer.

@smolinari smolinari commented Mar 6, 2017

I'd like to know first what people consider live queries to be. What is the definition? I ask, because I think there are different perspectives or ideas in play here, and thus the discussion can run in unnecessary tangents.

So, what is a live query? 😄

Scott

Contributor

@stubailo stubailo commented Mar 6, 2017

Here's my impression of a live query in one sentence:

"A live query is a query where some or all of the parts can be marked as 'live', and the client expects to receive updates whenever any of those parts would have ended up with a different result if refetched again."

In short, they should be a drop-in replacement for polling.

Contributor Author

@robzhu robzhu commented Mar 6, 2017

I'll quote my original definition in the RFC:

Live Queries: the client issues a standard query. Whenever the answer to the query changes, the server pushes the new data to the client. The key difference between Live Queries and Event-based Subscriptions is that Live Queries do not depend on the notion of events. The data itself is live and includes mechanisms to communicate changes.

Another way @stubailo and I have described it is: "infinitely fast/cheap polling".
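To make the "polling" framing concrete, here is a minimal sketch of emulating a live query by refetching and pushing only when the answer differs. The names `makeLiveQueryEmulator`, `runQuery`, and `tick` are hypothetical and stand in for whatever executes the query; a real live query system would react to data changes rather than refetch.

```typescript
// Hypothetical stand-in for the result of executing a GraphQL query.
type QueryResult = unknown;

interface LiveQueryEmulator {
  // Re-runs the query; returns the new result if it changed, otherwise null.
  tick(): QueryResult | null;
}

function makeLiveQueryEmulator(runQuery: () => QueryResult): LiveQueryEmulator {
  // Serialized form of the last result that was pushed to the client.
  let lastPushed: string | undefined;
  return {
    tick(): QueryResult | null {
      const next = runQuery();
      const serialized = JSON.stringify(next);
      if (serialized === lastPushed) {
        return null; // the answer to the query did not change: push nothing
      }
      lastPushed = serialized;
      return next; // the answer changed: push the new data
    },
  };
}
```

Calling `tick()` on a timer gives ordinary polling; the "infinitely fast/cheap" version is the limit where `tick()` costs nothing and runs the instant the underlying data changes.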

@smolinari smolinari commented Mar 6, 2017

Here is my definition.

A live query is a query, which is designated by the client as "live". This designation is passed on to the GraphQL server (one could say it is a subscription). The server then observes for triggers or data input from the underlying data sources needed to fulfill any part of the query. This in turn means any updates from the underlying data sources will be passed to the client automatically via bi-directional communication.

Scott

@paralin paralin commented Mar 7, 2017

If you want a demo of a live query, complete with a GraphiQL editor, see http://rgraphql.org

Live queries are infinitely more powerful than subscriptions because you can model live, reactive data in a way that efficiently encodes changes all the way from the data source to the browser, into something like React and Angular. And it's not true that they cannot be done at scale - it is definitely possible with a good enough scheduler / balancer.

Contributor

@stubailo stubailo commented Mar 7, 2017

Live queries are infinitely more powerful than subscriptions

I agree with this.

And it's not true that they cannot be done at scale - it is definitely possible with a good enough scheduler / balancer.

I also agree with this.

However, I think they are significantly different from subscriptions nonetheless, and both have their place in the GraphQL ecosystem.

@paralin paralin commented Mar 7, 2017

However, I think they are significantly different from subscriptions nonetheless, and both have their place in the GraphQL ecosystem.

Agreed. I don't think live queries are necessarily difficult to do, though, either - the argument I'm arguing against is - "live queries at scale isn't a solved science, so we're going to ignore the concept entirely."

Contributor Author

@robzhu robzhu commented Mar 7, 2017

"live queries at scale isn't a solved science, so we're going to ignore the concept entirely."

I don't think anyone is ignoring the concept. But "live queries at scale isn't a solved science" has some truth to it. We hope to share more details in the coming months as we continue to learn from our live query experiments at Facebook. However, assuming live queries work perfectly, we believe live queries and subscriptions are different tools in the real-time API toolbox.

Contributor

@stubailo stubailo commented Mar 7, 2017

Yeah I think it's important that the spec proposal doesn't say "this is the only thing we will ever do for realtime data" or even "this is the best way to do realtime data" - it should just say "this is the way of doing realtime data that is understood clearly enough to specify"

Contributor Author

@robzhu robzhu commented Mar 7, 2017

For example, in my case, where I have many microservices on my backend, where some nested fields go to other services, it's not really straightforward to define how live queries would work, and I've chosen explicitly to model things as event streams.

@taion I'm curious to know if you think of subscriptions and live queries as semantically different. Suppose you had both subscriptions and live queries at your disposal, when would you use one over the other?

@smolinari smolinari commented Mar 7, 2017

That is a great question, but it blows my definition of live queries out of the water. Doesn't it? Hehehe... LOL! 😄

If I may answer too. I think with Christian's (@paralin) rgraphql system, live queries are a server-side and domain-specific decision. From what I understand from the rgraphql docs, if you want a live query, the ability to observe for updates is "baked" into one or more resolvers for that query. And, I believe this is where this concept raises a general concern (and something still missing in the spec too). It requires a front-end dev to have intimate knowledge of the back-end decisions, as the type of query (live or not) cannot be directly "seen" through introspection, whereas it should be. Sure, one could add some type of comment, but is that really a good solution for flagging queries as "potentially live" with introspection?

The other question that burns in my mind is, how does the server know who to broadcast these updates to? The docs mention killing long running processes. That is only scratching the scaling issue.

I guess I am the stupid guy on the fence between these two solutions. I don't think GraphQL should be working with events internally. They aren't needed, as Christian's rgraphql system proves. Yet, I don't think pure live queries, without some sort of subscription system, are the right solution either.

Oh. And just because a live query has a subscription system tagged to it, doesn't mean it can't be called a live query. 😉

Scott

@taion taion commented Mar 7, 2017

@robzhu

I do think of them as semantically different. It would be awkward to do a toast notification with a live query rather than an event-based subscription stream, for example.

That said, I am mostly using subscriptions as a poor man's live query system, with easier-to-understand semantics on the backend. If I had a reactive backend that supported live queries, I would mostly move to using live queries – but I don't, and I decided it wasn't worth the architectural trade-offs required to do so.

Additionally, I expect the majority of users of GraphQL subscriptions as-is to use them to do something somewhat similar to my use case of emulating live queries in an easier-to-implement manner for complex back end systems.

@paralin paralin commented Mar 7, 2017

@smolinari don't you mean my system? :)

@smolinari smolinari commented Mar 7, 2017

@paralin - Sorry about that. You are right. Me goes correcting

Scott

@Siyfion Siyfion commented Mar 7, 2017

So are we saying that the difference between a "Live Query" and a "Subscription" is essentially how the updates are pushed? An LQ will automatically send you any updates that affect the original query, whether it be an add/remove/update, while a Subscription needs a "manual" push of new data, allowing the programmer to be selective about what updates are sent?

@paralin paralin commented Mar 7, 2017

@Siyfion in rgraphql at least the developer still has fine control over what gets sent to the client, the system just manages getting those changes to the client and applying them properly.

The only difference I can see really is that subscriptions are limited to the root level of the query only, and cannot be updated after they have begun. These properties are probably good for when you're subscribing to general streams of events. I wouldn't use it for live data though.

Imagine you're trying to build a news feed with comments. What happens if someone edits a comment? Do you just push an event saying it was edited via a subscription? But then all of the logic to apply the updates has to be hand built separately for each of the types of things you might want to update. That seems wrong to me.

Instead you can just subscribe to the same streams of data on the server, interpret them correctly, and then send back updates to the client tailored to the data they already have.

@smolinari smolinari commented Mar 7, 2017

@paralin - How does your system know when to send the news feed updates or rather, to which clients?

Scott

@paralin paralin commented Mar 7, 2017

@smolinari that's up to the developer to decide. In Go we have strong concurrency patterns around streams of data, and Magellan supports all of those patterns when resolving fields. When a user subscribes to some live query the server decides how it will fill that query, and the developer's code can return many different permutations of result representations, including ones that change over time.

@smolinari smolinari commented Mar 7, 2017

When a user subscribes to some live query

I missed how this can be done with rgraphql. Can you point me to the docs (or code), where this is explained (done)?

Scott

@paralin paralin commented Mar 7, 2017

@smolinari http://github.com/rgraphql/soyuz

Not much in the way of docs yet, mostly focusing on optimizing and getting in mutations right now. But the interface is the same as in Apollo. Call query, returns an observable, subscribing to the observable triggers the query to actually be applied. The system merges together the entire tree of active queries into one query object and keeps that in sync with the server.
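As a rough illustration of merging active queries into one query object, the sketch below unions field selection trees. It is not taken from the Soyuz source; `Selection` and `mergeSelections` are hypothetical names, and arguments, aliases, and reference counting for unsubscribes are left out.

```typescript
// A selection tree: each requested field maps to its nested selections.
interface Selection {
  [field: string]: Selection;
}

// Union several active queries into a single tree to keep in sync with the server.
function mergeSelections(queries: Selection[]): Selection {
  const merged: Selection = {};
  for (const query of queries) {
    for (const field of Object.keys(query)) {
      const children = query[field];
      if (children === undefined) continue;
      const existing = merged[field];
      // Recurse so that nested fields requested by different queries are combined.
      merged[field] = mergeSelections(
        existing === undefined ? [children] : [existing, children],
      );
    }
  }
  return merged;
}
```

For example, one component asking for `user { name }` and another asking for `user { email } feed` would be served by a single merged selection covering `user { name email } feed`.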

There is a lot of information on how it works in the protocol.md doc under I think Magellan (I'm on my phone right now, apologies for the lack of a link)

@laneyk laneyk commented Mar 7, 2017

What happens if someone edits a comment? Do you just push an event saying it was edited via a subscription?

Yup, that's how we would do it.

But then all of the logic to apply the updates has to be hand built separately for each of the types of things you might want to update. That seems wrong to me.

For us, the subscription payload that gets pushed to the client is the same type as a comment_edit mutation payload, and the client already has logic for updating the comments UI in response to a comment_edit mutation response. In general, on our native clients and in Relay, we have client-side infra that is smart about taking GraphQL responses, sticking them into a GraphQL cache, and updating the UIs accordingly, so it's not actually as bad as you make it sound to add logic to handle a subscription response.
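A minimal sketch of why a shared payload type keeps the client logic small: one handler can serve both the mutation response and the subscription push. The names `CommentEditPayload`, `NormalizedCache`, and `applyCommentEdit` are hypothetical and do not reflect Relay's or Facebook's actual client code.

```typescript
// Hypothetical payload shape shared by the comment_edit mutation and the subscription.
interface CommentEditPayload {
  comment: { id: string; body: string };
}

// A toy normalized cache: one record per object id.
class NormalizedCache {
  private records = new Map<string, Record<string, unknown>>();

  // Merge the given fields into the record, keeping fields that were not sent.
  merge(id: string, fields: Record<string, unknown>): void {
    const existing = this.records.get(id) ?? {};
    this.records.set(id, { ...existing, ...fields });
  }

  get(id: string): Record<string, unknown> | undefined {
    return this.records.get(id);
  }
}

// The same function handles a mutation response and a subscription payload,
// because both carry the same type.
function applyCommentEdit(cache: NormalizedCache, payload: CommentEditPayload): void {
  cache.merge(payload.comment.id, payload.comment);
}
```

UI components that read the comment from the cache then update the same way regardless of whether the edit came from the local user's mutation or from another user's pushed event.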

@paralin paralin commented Mar 7, 2017

And yet you have to base every change on mutations.

I'm building an app right now that is extremely reliant on outside data - that is, sensor data, position data, connectivity, etc from a large number of sources. To make a mutation to affect every little change to this data would be impossible. This type of live data is something well suited to GraphQL, because the client can subscribe to only what it needs. It's also something that cannot be done with subscriptions in any tractable way.

This example I believe reveals that there are actually two types of live data that a GraphQL user might want to have: streams of updates to individual fields, along with batch updates as a result of measurable transactions.

I believe this is the best argument yet for building two different live mechanisms into GraphQL.

@laneyk laneyk commented Mar 7, 2017

Just catching up on everything in this thread. I'm seeing two general questions being discussed here:
(1) How hard is it to implement live queries?
(2) Are GraphQL subscriptions useful in their own right, even in a world with working live queries?

Re: (1), we believe based on experience at Facebook and discussions with other folks that the general problem of implementing live queries at scale is not easy. This doesn't mean that it is always hard; with an efficient reactive backend, implementing live queries becomes fairly straightforward. As @taion mentioned, though, some folks might have "many microservices on [the] backend." Some might have tens or even hundreds of different DBs and services backing the data in their GraphQL schema. The general problem of moving all of the backing data for a GraphQL schema to a reactive backend is quite challenging.

However, I think we're getting off-topic by focusing on question (1). The more relevant question for this RFC is (2). Based on my experience working with a bunch of Facebook product teams building real-time features and rolling out GraphQL Subscriptions at scale over the past two years, I believe that the answer to question (2) is yes. We've seen cases where product folks explicitly design their real-time experience around events. They need control over things like which specific events get priority when the rate is too high to deliver all updates. @paralin said previously that "Live queries are infinitely more powerful than subscriptions." I'm not sure if I agree with this, and I'm also not sure that it's useful to debate the meaning of "powerful" (super relevant talk: https://www.youtube.com/watch?v=mVVNJKv9esE) but one thing I will say about subscriptions is that they put more control into the hands of the product developers over which updates they'll receive.
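One way to picture that kind of control over which events get priority is a delivery step that keeps only the highest-priority events when more arrive than can be sent. This is a sketch under assumed names (`PrioritizedEvent`, `selectForDelivery`, a per-window `budget`), not a description of Facebook's system.

```typescript
// Hypothetical event shape: a higher priority number means more important.
interface PrioritizedEvent {
  type: string;
  priority: number;
}

// Keep at most `budget` events, preferring higher priority,
// and deliver the survivors in their original arrival order.
function selectForDelivery(events: PrioritizedEvent[], budget: number): PrioritizedEvent[] {
  return events
    .map((event, index) => ({ event, index }))
    .sort((a, b) => b.event.priority - a.event.priority || a.index - b.index)
    .slice(0, budget)
    .sort((a, b) => a.index - b.index)
    .map(({ event }) => event);
}
```

Because each update is a named event, the product developer can assign these priorities explicitly; with a live query the server only knows that "the data changed".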

We have also seen examples that lend themselves nicely to live queries, and some people in this thread have mentioned examples of that sort. Internally, we are still experimenting and working with product teams to arrive at a general understanding of which use cases are better served by subscriptions and which are better served by live queries, but we are confident that the former is not an empty set.

@paralin paralin commented Mar 7, 2017

@laneyk Agreed in full. I don't dispute that I've been overstating the worth of live queries a bit, primarily because I'm passionate about seeing them considered due to their value in my particular niche applications. I don't believe that live queries are the only way to do it, just that they are an effective mechanism in a lot of small to mid scale applications.

It makes sense that live data and events would have very different mechanisms.

@calebmer calebmer commented Mar 7, 2017

One thing I will say about subscriptions is that they put more control into the hands of the product developers over which updates they'll receive.

This is one thing that I see consistently in the design of GraphQL. Besides the debate between live queries vs. subscriptions it may be worth thinking about this client-developer-control as a key design point of GraphQL.

If you think about mutations, they require a lot of work on the client developer’s side to update the cache. This is a problem that Apollo Client, Relay, and any future GraphQL clients will struggle with. A lot of GraphQL beginners really want mutations to be “magical.” They want to send a mutation to submit a comment and have that comment be automagically inserted into their pre-existing list with zero boilerplate, but GraphQL wasn’t designed to be magical; it was designed to be practical.

In its practicality GraphQL tries to give both the server and the client developer as much freedom and flexibility as possible to work in and around the query language without over-prescribing. The server developer may require a token in an HTTP header, or return a JSON blob as a scalar field. The client developer may implement super custom updates to their data based on a mutation or subscription which takes into account variables only the client knows, like a local priority based on what screen the user is on. However, this practicality comes at the cost of some higher-level “magic” features that would make development much faster such as live queries or zero boilerplate mutations on the client.

I like that GraphQL has chosen to be practical. It’s the same choice React has made whereas Angular has chosen the “magic” route. If you want magic in the data API space I heavily encourage you to check out Falcor. Unlike GraphQL, Falcor’s design is optimized for some of these magic features like live queries and simple mutations that people would like to see (Albeit you probably won’t get any magic from Falcor in its current form, but I think the design is there. Also forget about the fact that Falcor doesn’t have a schema! You could easily write a version of Falcor with static types and get the same GraphiQL experience).

What do you think? Do you see the same consistent choice in design decisions? Do you agree that live queries are a “magical” feature?

My point isn’t so much to argue for-or-against live queries (or even for-or-against magic!), I just wanted to make an observation about the design of GraphQL that I’ve noticed from time to time 😊

(since it was mentioned this talk is amazing https://youtu.be/mVVNJKv9esE and its concepts apply to this observation as well)

@paralin paralin commented Mar 7, 2017

@calebmer You don't need to have a feature in the spec to build it. Projects like mine that add real-time to GraphQL operate with GraphQL in its current form, and declare their own rules as to how data is handled. Therefore they are derivative of GraphQL and perhaps compatible while not GraphQL in their own sense.

GraphQL definitely can support these types of things, and I believe it's productive to at least discuss inside the bounds of GraphQL without deferring to other products entirely.

Your point absolutely holds - GraphQL's spec doesn't really need to have real-time built in. It would be nice, but it would always be labeled as an optional feature anyway. Maybe it's best to leave these features to derivative projects to define, with loose guidelines in the spec? I believe subscriptions should be in the spec for sure, but real-time maybe not. That talk's really good and definitely applies here, thanks for the link!

@jamesgorman2 jamesgorman2 commented Mar 7, 2017

@smolinari smolinari commented May 4, 2017

Ahhhhh. Very interesting.

The logic to show a different value than what is persisted sort of eludes me. I guess I can't question FB's business decisions in the end. So, I won't. 😄

I also still say the event based solution is an implementation detail on how to make data change events reactive. But, I digress on that too.

Thank you so much for your patience. I'll be very much looking forward to the reference implementation and learning and hopefully also helping a whole lot more in the future.

Scott

@taion taion commented May 4, 2017

Maybe another way to think about it is that event-based subscriptions are one option for live query implementation, but in that context they're a transport-level concern. By contrast, for an event stream, the subscriptions actually do map to what's logically happening.

The "like count" thing is an interesting example, because visually it resembles live queries, so I'd argue that it's closer to a workaround over real reactivity there being really, really hard – but having tried to build conceptually similar things on our end with subscriptions, it's a very defensible one.

@laneyk laneyk commented May 4, 2017

@taion: I agree that likes is not the best example to talk about subscriptions in this particular discussion since it's something that is probably a better fit for a live query (assuming both options exist).

@taion taion commented May 4, 2017

One example where product requirements might specifically dictate events over live queries is something like Twitter's timeline, which shows a badge for new updates rather than immediately displaying new updates – if the user's about to interact with a timeline entry, you don't want to bump the timeline down in an unsolicited manner and make the poor user retweet the wrong thing, or something like that.
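A minimal sketch of that badge pattern on the client, assuming a subscription that delivers one event per new timeline entry. `Tweet`, `Timeline`, `onNewTweet`, and `showPending` are hypothetical names, not any real Twitter or GraphQL client API.

```typescript
interface Tweet {
  id: string;
  text: string;
}

class Timeline {
  private visible: Tweet[] = [];
  private pending: Tweet[] = [];

  // Called for each subscription event: buffer the entry instead of rendering it.
  onNewTweet(tweet: Tweet): void {
    this.pending.push(tweet);
  }

  // Number shown on the "new updates" badge.
  get badgeCount(): number {
    return this.pending.length;
  }

  // Called when the user taps the badge: only now does the list move.
  showPending(): Tweet[] {
    const newestFirst = [...this.pending].reverse();
    this.visible = [...newestFirst, ...this.visible];
    this.pending = [];
    return this.visible;
  }
}
```

The visible list never shifts under the user's finger; the event stream only changes the badge until the user asks for the new entries.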

@paralin paralin commented May 4, 2017

@taion live queries still apply there, you would just restrict the query to never add new entries without an explicit argument change.

@acjay acjay commented Nov 22, 2017

What's funny is that the same argument plays out over and over again.

For example, Redux, at its core, is event driven, although they call them actions. It gives you a structure for producing a live view of your state in the form of its reducers and selectors. MobX has you mutate your live model directly, and to the extent that events need to trigger processes, you need to handle that in your mutation logic.

There are strong reasons to build systems around the changing data itself. You don't have to worry about accounting for all the causes.

There are strong reasons to build systems around events. Sometimes, user experience does care about the causes.

Events can be depicted in a live query model by having a field that will be the most recent event or null. After all, the schema need not restrict itself to depicting only things that are literally persisted. Clients then would be responsible for queuing up any events that happen to appear. It would be awkward, but possible.

Likewise, subscriptions can support live queries by pushing the full state (or changes thereof) in every event. The event becomes "your data has changed". Also awkward to set up, use, and optimize.
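Both emulations can be sketched in a few lines. Everything here is hypothetical: the `lastEvent` field, the requirement that events carry an `id`, and the class names are assumptions made for illustration.

```typescript
// Direction 1: events surfaced through a live field holding the most recent event or null.
interface LiveSnapshot<E> {
  lastEvent: E | null;
}

class EventQueueFromLiveField<E extends { id: string }> {
  private lastSeenId: string | null = null;
  readonly queue: E[] = [];

  // Called on every live query update; enqueue the event only if it is new.
  onSnapshot(snapshot: LiveSnapshot<E>): void {
    const event = snapshot.lastEvent;
    if (event !== null && event.id !== this.lastSeenId) {
      this.lastSeenId = event.id;
      this.queue.push(event);
    }
  }
}

// Direction 2: a subscription whose only event is "your data has changed".
interface StateChanged<S> {
  newState: S;
}

class LiveStateFromEvents<S> {
  state: S;

  constructor(initial: S) {
    this.state = initial;
  }

  // Each event carries the full new state, which simply replaces the old one.
  onEvent(event: StateChanged<S>): void {
    this.state = event.newState;
  }
}
```

The awkwardness shows up quickly: in the first direction, two events that occur between snapshots collapse into one, and in the second, every change resends the whole state unless some change encoding is added.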

I think it's probably a good idea to have first-class approaches to both paradigms, even in the same application.

@taion taion commented Nov 22, 2017

@acjay

I think we're in general agreement there, and most of us are targeting live queries. The core issue is just that live queries, even at a schema level, require making more decisions – e.g. do you use something like JSON patch to communicate the updates? Or if not what do you use?
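To show what a JSON Patch style update could look like on the client, here is a small sketch that applies a subset of RFC 6902 operations (`add`, `replace`, `remove`) to a previous query result. It is an illustration only: `move`, `copy`, and `test` are omitted, patching the document root is not supported, and nothing in the GraphQL spec prescribes this format.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Subset of RFC 6902 operations handled by this sketch.
type PatchOp =
  | { op: "add"; path: string; value: Json }
  | { op: "replace"; path: string; value: Json }
  | { op: "remove"; path: string };

function applyPatch(doc: Json, ops: PatchOp[]): Json {
  // Work on a deep copy so the previous result stays intact.
  const copy: Json = JSON.parse(JSON.stringify(doc));
  for (const op of ops) {
    // Split the JSON Pointer into segments and undo its "~1" and "~0" escapes.
    const keys = op.path
      .split("/")
      .slice(1)
      .map((k) => k.replace(/~1/g, "/").replace(/~0/g, "~"));
    const last = keys.pop();
    if (last === undefined) throw new Error("patching the root is not supported here");
    // Walk down to the container that holds the target location.
    let parent: Json = copy;
    for (const key of keys) {
      const next: Json | undefined = Array.isArray(parent)
        ? parent[Number(key)]
        : parent !== null && typeof parent === "object"
          ? parent[key]
          : undefined;
      if (next === undefined) throw new Error(`path not found: ${op.path}`);
      parent = next;
    }
    if (Array.isArray(parent)) {
      // "-" addresses the position just past the end of the array.
      const index = last === "-" ? parent.length : Number(last);
      if (op.op === "add") parent.splice(index, 0, op.value);
      else if (op.op === "replace") parent[index] = op.value;
      else parent.splice(index, 1);
    } else if (parent !== null && typeof parent === "object") {
      if (op.op === "remove") delete parent[last];
      else parent[last] = op.value;
    } else {
      throw new Error(`cannot patch a scalar at: ${op.path}`);
    }
  }
  return copy;
}
```

With something like this, a server could send `[{ op: "replace", path: "/feed/0/likes", value: 4 }]` instead of the whole feed, which is the kind of decision a live query spec would have to make.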

Right now a number of implementations mock live queries with polling, but I think a general solution requires the kind of general consensus on how to push live query updates that does not yet exist.

@smolinari smolinari commented Dec 11, 2017

What events can happen that aren't persisted? If they aren't persisted, can they be? If they can be, and I know they can be, then those events can be "triggered" over live queries. Right?

Can every live query be modeled into an event system? Sure they can. But, then you'd be building another separate system. I've seen this done for MongoDB in many ways for example. So it is clear the want for live querying is relatively large. Why is that?

Obviously too, only databases that can send live query messages can be used in a proper live query system. Otherwise, you are back again to needing a messaging/ queuing events system/ bus, etc.

I can understand why FB went with events. AFAIK, they don't have databases that support live queries. But, maybe they should? If they did, I bet this whole discussion and any solutions would get a whole lot easier. 😉

Scott

@taion taion commented Dec 11, 2017

Any sort of stream data – trades, clickstream, &c. aren't nicely modeled by live queries and would have to be emulated there.

@acjay acjay commented Dec 11, 2017

@smolinari If you have 15 minutes, check out the other issue I made above your comment. I'm increasingly convinced that all the pieces needed for a live query system more or less already exist in today's subscriptions. Although, since it's so far just been a big thought experiment, some details might be missing.

Contributor Author

@robzhu robzhu commented Dec 11, 2017

There was a recent talk on Live Queries at GraphQL Summit by @rodmk, one of the engineers who works on the Live Query system at Facebook. I think it addresses several of the recurring questions in this thread. https://www.youtube.com/watch?v=BSw05rJaCpA.

@smolinari smolinari commented Dec 13, 2017

@acjay - Absolutely. I never doubted GraphQL's capabilities to accommodate Live Queries. My whole argumentation here was to say that the added event driven system to make subscriptions work is basically unnecessary for (proper) GraphQL, because it can and should support live queries and that is the better answer to subscriptions and state management. Maybe my thinking was a bit ahead of its time???? 😄

@robzhu - Hah! Wow! Excellent video! Rodrigo demonstrates everything I've been trying to get across here. I'm all giddy now. 😛 And no, I don't mean to say, "I told you so.". 😄 I do still get FB's need to not go straight away with a live query solution, because of FB's legacy systems, which Rodrigo also mentions. (i.e. you can't rewrite all of the PHP code.) It demonstrates how FB's own internal issues drive directions in its open source projects and that is all fine and dandy, as a lot of dev shops out there will have those same kinds of issues. But there are also those, who are starting anew and want the best they can get too and Live Queries are the better/ simpler solution, granted only with a true reactive data store.

I've enjoyed this whole discussion and I'd like to thank you all again for the opportunity.

Scott

@acjay acjay commented Dec 18, 2017

@taion I just re-read #284 (comment), and now I think I get exactly what you mean. And from my side thread, my conclusion to the title question, "Are Subscriptions and Live Queries the same thing?" is now "yes", qualified only by the need to answer the question of how to send updates.

In the best case scenario, those semantics can be defined at the spec level, leaving very little to be decided by library and application developers.

But, what if there's no natural one-size-fits-all solution to describing updates? Much as scalar leaves basically every aspect of implementation to the client and server, could something similar be done for the concept of updates? If so, I think there's one major advantage to implementing live queries within subscriptions: you can subscribe to both new events and the changing state.

@robzhu, since you closed this ticket with the opposite conclusion, namely that live queries should be something separate from subscriptions, I'm curious whether this would address your concerns.

@taion taion commented Dec 19, 2017

The spec thing sort of is the thing, though. We were more or less able to ship subscriptions as of v0.4.8 that added support at a parsing level. The v0.10.0 release that changed the API to add first-class support – that was very, very nice from an API perspective, but ultimately didn't amount to much more than a minor API refactor: https://github.com/edvinerikson/relay-subscriptions/pull/39/files

By contrast, contra @rodmk, I can't see how to nicely implement live queries in a way that lets me handle lists efficiently, without pushing down the entire list every time the query updates, without some additional spec-level support. A subscription is so similar to a mutation from the schema perspective. A live query isn't.

There is another distinction, too. Ultimately it's not that awkward to subscribe to add, delete, and change events. Doing something like Twitter's "new tweets" alert (instead of reactively showing new tweets) with live queries is... possible, but extremely annoying. And there are cases where you either want to or have to ship updates in that manner (e.g. we're doing HIPAA-related stuff, we may want to only indicate the availability of new data, rather than pushing down new private-ish data to the client... ).

@acjay acjay commented Dec 19, 2017

I can't see how to nicely implement live queries in a way that lets me handle lists efficiently, without pushing down the entire list every time the query updates, without some additional spec-level support.

I'm not sure if my point isn't clear, or if I'm missing something you're saying. I think we agree that lists would seem to be the trickiest data type for coming up with a globally acceptable scheme of representing updates.

But do you get my point in analogizing that with the scalar situation? The handling of custom scalars is one of the more interesting (and initially confusing) parts of GraphQL to me. The spec basically completely punts on anything having to do with how they're represented. They're just dumb leaf data. It's up to the client and server to determine the convention for their representation. This is great, because it avoids clogging up the spec with arbitrary choices for things like dates and times.

Can't the same approach be used for the representation of updates, since there are several reasonable approaches? On a really simplified level, the server needs to implement some function (lastState, newState) => changeRepresentation for each type, and the client needs a corresponding set of functions (lastState, changeRepresentation) => newState. For argument's sake, let's say the default that the reference server implementation provides for all types is to just send newState directly, ignoring lastState. Presumably, the reference server implementation would allow this default to be overridden by something more optimized.
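
To make the pair of functions above concrete, here is a minimal TypeScript sketch. Every name in it (`UpdateCodec`, `replaceCodec`, `idListCodec`, `ListChange`) is hypothetical and not part of the GraphQL spec or any library. It assumes a list of unique string IDs where newly added entries can go at the end; a real delta format would also need to deal with ordering and duplicates.

```typescript
// Hypothetical sketch: none of these names come from the GraphQL spec or a library.

// Pairs the server-side diff with the client-side patch for one type.
interface UpdateCodec<State, Change> {
  // Server side: (lastState, newState) => changeRepresentation
  diff(lastState: State, newState: State): Change;
  // Client side: (lastState, changeRepresentation) => newState
  apply(lastState: State, change: Change): State;
}

// Default codec: ignore lastState and send the whole new value.
function replaceCodec<State>(): UpdateCodec<State, State> {
  return {
    diff: (_lastState, newState) => newState,
    apply: (_lastState, change) => change,
  };
}

// Optimized override for a list of unique IDs: send only what was added or removed.
// Assumption: added IDs may be appended at the end; ordering is otherwise not tracked.
interface ListChange {
  added: string[];
  removed: string[];
}

const idListCodec: UpdateCodec<string[], ListChange> = {
  diff(lastState, newState) {
    return {
      added: newState.filter((id) => lastState.indexOf(id) === -1),
      removed: lastState.filter((id) => newState.indexOf(id) === -1),
    };
  },
  apply(lastState, change) {
    return lastState
      .filter((id) => change.removed.indexOf(id) === -1)
      .concat(change.added);
  },
};
```

With this shape, a server could fall back to `replaceCodec()` for any type that has no registered override, which matches the "just send newState" default described above.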

> There is another distinction, too. Ultimately it's not that awkward to subscribe to add, delete, and change events.

Yeah, I get that, but for reasons I think everyone agrees with, the event approach just isn't a great fit for every application. I'm just trying to say, I don't think it's actually that much more complex to do live queries using the exact same mechanism as has been built for events, with really just one additional concept of what I might call "modular update representation".

I hope this makes my point clearer, and sorry if I've misunderstood what you're trying to say.

@taion taion commented Dec 19, 2017

@acjay

What you're saying makes sense. The distinction I was drawing was that, with subscriptions, there was an "obvious" choice of the semantic GraphQL payload to send back to the client that exactly matches what things look like with a mutation.

The issue with live queries (esp. lists) is exactly as you say – the specific implementation needs to define its own format to use for encoding deltas, which is a problem that didn't arise with event subscriptions. It's just more stuff to decide for the app developer.

@smolinari smolinari commented Dec 21, 2017

Just to throw in what I've been understanding as a live query, which seems to be different to the discussion here and even a bit to what Rodrigo explained too, but I believe live queries shouldn't return whole datasets or deltas of the changed data, but rather only send a trigger to the client to re-request its "affected" query again. That way, the back-end can stay fairly dumb, because the client is the one asking for the new data through the particular query and only the updated data gets "pulled" back into the client.

Does that make any sense?

Scott

@paralin paralin commented Dec 21, 2017

That sucks because the server has to look everything up again and can't keep any context in memory.

You're getting caught up in the implementation details. There are a lot of ways of accomplishing this. Two-way socket, pub-sub change notification channel, long polling, Merkle tree data hash comparison and state sync, server-side in-memory Merkle tree result caches....

Contributor Author

@robzhu robzhu commented Dec 21, 2017

> @taion I just re-read #284 (comment), and now I think I get exactly what you mean. And from my side thread, my conclusion to the title question, "Are Subscriptions and Live Queries the same thing?" is now "yes", qualified only by the need to answer the question of how to send updates.

> since you closed this ticket with the opposite conclusion, namely that live queries should be something separate from subscriptions, I'm curious whether this would address your concerns.

Re-reading the thread now, I have not found compelling arguments for why the answer to this question is "yes". To quote from Rodrigo's presentation at GraphQL Summit, "Live Queries observe data, subscriptions observe events"

For example, suppose you had a server-side clock that tracks the current time. The current time has two interesting properties: the value itself, and when it "ticks".

If you want to observe the current time, use a Live Query.
If you want to observe the "tick" event, use a Subscription.

These are (awkwardly) isomorphic because you can always record the set of events in a list and observe that list. For example, you can use a CQRS-style log, but it seems silly to have a CQRS log for seconds in the day.

Another angle: a Live Query is essentially a Query. You can poll any Query to simulate its behavior as a Live Query. By contrast, polling a Subscription (where the subscription does not have a stateful channel between polls) doesn't make sense.
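
A small TypeScript sketch of the clock example may help show the difference. The names `createClock`, `currentTime`, `onTick`, and `pollOnce` are made up for illustration, and the clock is advanced by hand so no timers are needed. The assumption is that a live query can be approximated by reading a value and comparing it with the last one seen, while tick events only reach listeners that were registered when the tick happened.

```typescript
// Hypothetical sketch of the clock example; all names are illustrative.
interface ServerClock {
  currentTime(): number; // the data a Query (or Live Query) would observe
  onTick(listener: (time: number) => void): () => void; // the event a Subscription would observe
  tick(): void; // advances the clock by one second
}

function createClock(): ServerClock {
  let seconds = 0;
  let listeners: Array<(time: number) => void> = [];
  return {
    currentTime: () => seconds,
    onTick(listener) {
      listeners.push(listener);
      // Returns an unsubscribe function.
      return () => {
        listeners = listeners.filter((l) => l !== listener);
      };
    },
    tick() {
      seconds += 1;
      for (const listener of listeners) listener(seconds);
    },
  };
}

// Polling a query: read the value and report whether it differs from the last one seen.
function pollOnce<T>(read: () => T, lastSeen: T): { value: T; changed: boolean } {
  const value = read();
  return { value, changed: value !== lastSeen };
}
```

If the clock ticks twice between two polls, the poller sees one change (the latest value), whereas a tick listener receives two separate events. A poller that was not listening has no way to recover the missed ticks, which is one way to read the point that polling a Subscription does not make sense.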

Hope that communicates my current thinking. I'm not seeing the recent arguments cover new ground, so I'm inclined to keep the issue closed, but please let me know if I'm missing some context.

@paralin paralin commented Dec 21, 2017

@robzhu summarizes it nicely. It's easily possible to add an events (subscriptions) implementation on top of whatever live query system, and it's also probably possible to make a live query system using subscriptions as some kind of awkward transport.

At the end of the day data is data and the way you transfer it depends on what you want to do with it and how often it changes.

@acjay acjay commented Dec 21, 2017

@paralin But the point I'm trying to make is that if we can "forget" for a minute that subscription was created with an event paradigm in mind, it's actually very close to being suitable for live queries, as well. What seems to be missing is simply a concept of a difference between the initial response and the stream of updates and a (modular?) scheme for representing those updates. Not to minimize those issues, but it feels like a manageable hump. Which is also why I'm thrilled the answer has been revised to "yes" :D

@paralin paralin commented Dec 21, 2017

@acjay Those two things that you just described - including "modular scheme for representing those updates" - are a live queries system. There's no reason to use a subscription channel as your transport for a live queries system. It adds nothing over just a websocket transport. Therefore a subscription channel is not suitable for live queries, as well. It's suitable for the event-based paradigm, which was what it was designed for.

I built a prototype of an efficient live-queries system with magellan and it doesn't look anything similar to the subscriptions system - for performance I binary encode and batch changes to different parts of the result tree, which wouldn't be possible via a subscriptions channel anyway.

@acjay acjay commented Dec 21, 2017

@paralin Maybe I'm missing something, but if the assumption is that a web socket server could simply choose to interpret a vanilla query as being a subscription for live query updates, why wouldn't the exact same thing work for events? It's just a single query that's responded to multiple times, when the server deems it appropriate.

@taion taion commented Dec 21, 2017

@acjay I think that's exactly right. A minimum (not especially efficient) implementation could just hold onto the full query and re-run the entire thing and push the results down to the client every time it gets an update. That's in fact how I read the "call to make a prototype" bit at the end of @rodmk's talk.

@paralin paralin commented Dec 21, 2017

@taion @acjay I would struggle to call that a live query system at all. As we're discussing what a real implementation of something like that would look like, or in essence trying to figure out what the "best approach" would be, I'm not really considering hacks like sending the entire state over a subscription channel as a "live query system."

You can do the exact same thing with just a websocket and a server-side polling [run query, check if changes happened, wait 3 seconds] loop, and remove the entire graphql stack. In that way it's not useful to have the subscriptions stack in the mix at all for something like this. It is for this reason that I would say that the two things are entirely separate and should be treated as such.
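
For reference, one iteration of the [run query, check if changes happened, wait] loop described above could be sketched in TypeScript roughly as follows. `pollStep`, `PollState`, `runQuery`, and `send` are hypothetical names, change detection by comparing `JSON.stringify` output is an assumption made for brevity, and the wait between iterations is left to whatever timer the caller uses.

```typescript
// Hypothetical sketch of one polling iteration; names and change detection are assumptions.
interface PollState {
  lastSerialized: string | null; // serialized result from the previous iteration, if any
}

// Runs the query once and sends the full result only if it differs from the last one sent.
// Returns true when something was sent.
function pollStep<Result>(
  runQuery: () => Result,
  state: PollState,
  send: (result: Result) => void,
): boolean {
  const result = runQuery();
  const serialized = JSON.stringify(result);
  if (serialized === state.lastSerialized) {
    return false; // nothing changed since the last iteration
  }
  state.lastSerialized = serialized;
  send(result); // pushes the entire state, not a delta
  return true;
}
```

Note that this sends the whole result on every change, which is exactly the "entire state" approach being criticized here rather than an efficient delta scheme.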

I went and watched Rodrigo's talk and while I would argue that saying Subscriptions and Live Queries are interchangeable is misleading, he is right in that you can build almost any application with either approach. One approach will just be better for certain types of things than the other.

@taion taion commented Dec 22, 2017

@paralin Let's move this discussion to #386 instead of continuing to comment on a closed issue.

@smolinari smolinari commented Dec 23, 2017

> Live Queries observe data, subscriptions observe events

Live Queries observe "data store events", i.e. record creations, updates and deletes. Also, those data store events could be due to other events.

Scott
