
discussion: livequery integration #13

Closed · sebakerckhof opened this issue Oct 11, 2016 · 7 comments

@sebakerckhof commented Oct 11, 2016

Let's discuss what a possible Meteor livequery integration could look like.

Advantages
Livequery integration would allow the full expressiveness of the Mongo query syntax (within minimongo's limitations). The ability to sort and limit queries seems especially interesting to me: it makes subscriptions like "give me updates on the top 3 most-voted posts" possible, which would be hard to implement in a traditional pubsub system.

Disadvantages
Hard to extend to databases beyond Mongo (though Postgres and RethinkDB might work).
Only as scalable as livequery is (although no merge box is required?)

Possible approach with the current system
The current system doesn't seem particularly well suited, but one could imagine something like:

```js
const subscriptionManager = new SubscriptionManager({
  schema,
  pubsub,
  setupFunctions: {
    topComments: (options, args) => ({
      someName: {
        channelOptions: {
          cursor: Comments.find(
            { reponame: args.repoFullName },
            { sort: { votes: -1 }, limit: 3 }
          ),
          added: true,
          changed: true,
          removed: true,
        },
      },
    }),
  },
});
```

The pubsub system could then add an observer on the cursor and trigger the changes the user indicated it wants to listen to (added, changed, and/or removed).
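A minimal sketch of what that observer wiring could look like, assuming a Meteor cursor and a graphql-subscriptions-style pubsub with a publish(trigger, payload) method (the observeChannel name is just illustrative):

```js
// Hypothetical glue: observe a Meteor cursor and publish only the
// change types the channelOptions asked for.
function observeChannel(pubsub, trigger, { cursor, added, changed, removed }) {
  return cursor.observe({
    added: added
      ? doc => pubsub.publish(trigger, { event: 'added', doc })
      : undefined,
    changed: changed
      ? (newDoc, oldDoc) => pubsub.publish(trigger, { event: 'changed', doc: newDoc })
      : undefined,
    removed: removed
      ? doc => pubsub.publish(trigger, { event: 'removed', doc })
      : undefined,
  });
}
```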

Problems with this approach
It's obviously quite hacky; the current system doesn't seem well suited. E.g. the channel name (someName) has no meaning.

It's also not really a pubsub system: we're only subscribing, but we can't publish.

@davidyaha (Contributor)

Hey @sebakerckhof!

I actually like your API. The great thing, to my mind, about being able to create a one-to-many relation on the triggers is that you can then feed the topic with events from different collections and make use of GraphQL queries here.

If I try to add to your example:

```js
const subscriptionManager = new SubscriptionManager({
  schema,
  pubsub,
  setupFunctions: {
    topComments: (options, args) => ({
      comments: { // name of the collection queried
        channelOptions: {
          cursor: Comments.find(
            { reponame: args.repoFullName },
            { sort: { votes: -1 }, limit: 3 }
          ),

          // ...or take the query arguments, so you can build and maybe
          // cache and reuse queries:
          selector: { reponame: args.repoFullName },
          sort: { votes: -1 },
          limit: 3,
          fields: { text: 1, votes: 1, ownerId: 1 },

          added: true,
          changed: true,
          removed: true,
        },
      },
      // Create a dependent query. Each change triggers a function that
      // takes the results as arguments and creates a dependent live query.
      users: {
        dependentOn: 'comments',
        onChange: (comments, { fields }) =>
          Users.find({ _id: { $in: comments.map(c => c.ownerId) } }, { fields }),
        fields: { name: 1, points: 1 },
        changed: true,
      },
    }),
  },
});
```

A change on the parent query or the dependents will create a publish event. Also, you can still use the publish call and allow for further tuning of this reactive system. Say you realise some of your live queries are creating too much load on the system; you can easily revert to an event created from the mutation to reduce the load. All that without changing your client, because it still gets events from GraphQL subscriptions as it expects.
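As a sketch of that fallback, a mutation resolver could publish to the same trigger the client already subscribes to (the resolver, collection, and pubsub names here are illustrative):

```js
// Hypothetical mutation resolver: replaces the live query with an
// explicit event fed to the existing 'topComments' trigger.
const resolvers = {
  Mutation: {
    upvoteComment(root, { commentId }, ctx) {
      ctx.Comments.update(commentId, { $inc: { votes: 1 } });
      const comment = ctx.Comments.findOne(commentId);
      pubsub.publish('topComments', comment);
      return comment;
    },
  },
};
```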

Personally I see this as one missing link to making Meteor scalable. With Meteor as it is, once you get into a really intense publish (like one that has a lot of merges, or one that depends on an array with a growing number of members), you would basically need to build a whole other system to do those publishes for you. With this, you don't have to.

One last thing: this approach could definitely be ported to Postgres. There is actually a fork of my Redis subscriptions package that connects to Postgres's LISTEN/NOTIFY mechanism, and by using Postgres's TRIGGER system you could create basically the same behaviour as you suggest here, though it might be less configurable from your API at the moment.
Check it out here: https://github.com/jtmthf/graphql-postgres-subscriptions

cc @jtmthf for comments on this.

@sebakerckhof (Author) commented Oct 12, 2016

Btw, for those interested, I've created a graphql-mongo-subscriptions package: https://github.com/sebakerckhof/graphql-mongo-subscriptions
It also tails the oplog and filters documents with minimongo, but you can't do sorting etc.

@davidyaha (Contributor)

Super cool! I'll look into it more closely tomorrow!

@DxCx commented Oct 13, 2016

Hi,

I think this is not natural enough. IMO there is a much better solution: using @Urigo's meteor-rxjs together with my RxJS subscription approach.

So, eventually, a subscription would look like this:

```js
const resolvers = {
  query: { /* ... */ },
  subscription: {
    Tasks(root, args, ctx) {
      return ctx.Tasks.find(args);
    },
  },
};
// ...
```

where Tasks (and the rest of Meteor's collections) can live inside the context object.

This approach seems much more native to me. Also, the same resolver/Tasks object can be used for simple queries as well, so queries and subscriptions end up looking alike.
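A minimal sketch of how the collection could end up on the context, assuming @Urigo's meteor-rxjs MongoObservable wrapper (the context wiring itself is illustrative):

```js
import { MongoObservable } from 'meteor-rxjs';

// An observable collection: find() returns an RxJS observable cursor,
// so returning it from a resolver naturally behaves like a subscription.
const Tasks = new MongoObservable.Collection('tasks');

// Provide the collection to resolvers through the per-request context.
const context = { Tasks };
```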

@jtmthf commented Oct 13, 2016

@davidyaha continuing off what you had,

I think that an API like that could be implemented in pg. I can't think through the full implementation without some trial & error first, but I've detailed below the capabilities and limitations of pg that would define such an API.

What I have right now makes use of pg LISTEN/NOTIFY along with table triggers that are run on each insert, update, and delete. Whenever an operation mutates a table, that table's corresponding trigger function is called. That function bundles the old_val, new_val, and table name for the corresponding row; this is similar to how RethinkDB works. That is then JSON-stringified and sent by NOTIFY on a global graphql-subscription channel.

The JSON structure would look like this:

```
{
  "table": "table_name",
  "old_val": { ... },
  "new_val": { ... }
}
```

Inserts would only have a new_val, updates would have both old_val and new_val, and deletes would just have old_val. This setup is due to constraints of the pg LISTEN/NOTIFY architecture. While it is possible to build a pg live query for a particular query and listen only for updates to it, that doesn't scale: a pg connection can only listen on one channel at a time, and while doing so it can do nothing else. Unfortunately, a pg database runs into memory constraints once you have over 100 simultaneous connections. This explains the need for a single global channel name. With this approach, each application server needs only a single listener connection, which allows scaling to several dozen application servers.
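As a rough sketch, the single listener connection per application server could look like this with the node pg client (the channel name and the fan-out are assumptions, not the package's actual API):

```js
const { Client } = require('pg');

async function startListener(pubsub) {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  // One global channel for all row-change notifications.
  await client.query('LISTEN "graphql-subscription"');
  client.on('notification', msg => {
    const { table, old_val, new_val } = JSON.parse(msg.payload);
    // Fan out in-process; per-subscription filtering happens downstream.
    pubsub.publish(table, { old_val, new_val });
  });
  return client;
}
```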

The issue remaining here is that application servers receive all row updates from every single data mutation on the database. These updates need to be mapped back to the particular data set that is being subscribed to. Three approaches are possible (see the sketch after this list):

1. For every update, rerun the query for each subscriber. Easiest API, but this could trigger massive amounts of db overhead.
2. Determine whether updated data pertains to a subscription through a set of filter operations, then rerun the query. Because updated rows are sent one at a time, filter operations can't compare against other rows; rerunning the query handles operations like limit, offset, and order. This would limit the number of queries run, but could still be database intensive.
3. Run updated data through filters. If the data passes, push it to the client immediately. The client is then responsible for deciding whether to keep the data, e.g. checking whether it is in the top 3 and inserting it into the correct place. Hardest API to implement, but most performant.

It's too hard to tell which option would work best, but I'm leaning towards 2 right now. Testing through implementation will tell.
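A hedged sketch of option 2, with an equality-only selector match before rerunning the stored query (all names and the subscription shape here are hypothetical):

```js
function onRowChange({ table, old_val, new_val }, subscriptions, db) {
  for (const sub of subscriptions) {
    if (sub.table !== table) continue;
    const row = new_val || old_val;
    // Equality-only filter: a single row can't be compared against other rows.
    const matches = Object.entries(sub.selector).every(
      ([field, value]) => row[field] === value
    );
    if (matches) {
      // Rerun the full query so sort, limit, and offset are applied correctly.
      db.query(sub.sql, sub.params).then(res => sub.push(res.rows));
    }
  }
}
```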

@dandv (Contributor) commented Jul 9, 2018

> It also tails the oplog and filters documents with minimongo.

@sebakerckhof: that should be easier in MongoDB 3.6+ with change streams.
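For reference, a minimal sketch of the change-streams route with the node MongoDB driver (the pipeline, trigger name, and collection are illustrative):

```js
// watch() opens a change stream; updateLookup makes update events carry
// the full document so it can be published directly.
// (Delete events have no fullDocument and would need separate handling.)
const changeStream = db.collection('comments').watch(
  [{ $match: { 'fullDocument.reponame': repoFullName } }],
  { fullDocument: 'updateLookup' }
);
changeStream.on('change', change => {
  pubsub.publish('topComments', change.fullDocument);
});
```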

@grantwwu (Contributor) commented Oct 2, 2018

This seems out of date/old, plus I don't think GraphQL is going with Live Queries?

@grantwwu closed this as completed Oct 2, 2018