API and database access #1003

Closed
ttsirkia opened this issue Apr 7, 2019 · 19 comments

@ttsirkia

ttsirkia commented Apr 7, 2019

This is more of an ideological question than an issue in the current version.

What is the proposed way of creating custom API endpoints? Is it just like this: server.app.get('/api/doSomething', (req, res) => { for every endpoint that is required? Could there still be some server-side routing for such cases?

Is there a way to access Mongoose directly, as before? How would one make the more complicated database queries that custom APIs could require? I see that there are new adapters for different data sources, but does this prevent using Mongoose directly?

It would be very nice to see TypeORM as the ORM layer, but unfortunately it works best with TypeScript and might therefore be hard to integrate here. Writing your own adapters, on the other hand, can be tricky and require a lot of work.

These may have been answered already, but they were my first questions after looking at this new version, which otherwise looks very promising.

@jesstelford
Contributor

jesstelford commented Apr 7, 2019

It sounds like there are three distinct features you're asking about here:

  1. Adding custom routes/URIs to the API
  2. Custom Mutations
  3. Programmatic access to running queries/mutations from within a Keystone App

Adding custom routes/URIs to the API

Is it just like this server.app.get('/api/doSomething', (req, res) => { for every endpoint that is required?

That's right! We expose the express instance so you can add whatever you like.
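
A minimal sketch of that pattern, assuming `server.app` is an ordinary Express application as described above. The `registerCustomRoutes` helper and the response shape are hypothetical and only there for illustration:

```javascript
// Hypothetical helper: attaches custom endpoints to an Express-style app.
// `app` is assumed to be the Express instance exposed as `server.app`.
function registerCustomRoutes(app) {
  app.get('/api/doSomething', (req, res) => {
    // Server-side logic lives here; the client only sees the JSON result.
    res.json({ ok: true, query: req.query });
  });
}
```

It would be called once during startup, for example `registerCustomRoutes(server.app);`, so the individual endpoints can live in their own module instead of being listed inline.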

Could there still be some server-side routing for such cases?

We may be even less opinionated about this going forward, see https://github.com/keystonejs/keystone-5/pull/960#issuecomment-480101846

Custom Mutations

This is currently undocumented, but we have the ability to add custom GraphQL mutations and their resolvers, which gives you a lot of power to directly extend the schema that is auto generated for you.

Programmatic access to running queries/mutations from within a Keystone App

We currently expose an API via your keystone instance which gives you access to the same functions which are executed internally by KS during queries and mutations. Normally the other features I mentioned above should be favoured, but this API exists for advanced use cases. It's also not yet documented.

@ttsirkia
Author

ttsirkia commented Apr 7, 2019

Thanks for the answer!

One of my concerns is whether everything must be done via GraphQL in Keystone 5, or whether it is still possible to expose APIs that run Mongoose queries on the server side without any relation to GraphQL.

I guess the relevant question is: can you still access the Mongoose models and Mongoose methods directly or not?

@jesstelford
Contributor

We're moving toward settling on GraphQL as an abstraction layer on top of your data.

While Keystone v5 does use mongoose, that's just an implementation detail of the MongoDB database adapter.

There is also the SQL adapter which happens to use Knex. And there will be many more adapters in the future which use various libraries.

While it would be technically possible to directly access the particular adapter in use, we want to encourage Keystone app developers to use the built in GraphQL engine which abstracts that detail away.
The goal is to make your code more portable (across Keystone apps) and easier to upgrade as new KS features are added, and to make potential future database ports easier for you.

Are you able to outline the specific use case you had in mind? It may already be solved in another manner (probably undocumented, sorry!), or it might be a new feature we can add so you don't have to drop down to the mongoose layer.

@jesstelford
Contributor

I should also note that when performing a programmatic GraphQL query/mutation via the Keystone instance, it does not go via the network; there is very little overhead in executing it.

Also, the adapter layer does not respect any access control or hooks logic which the GraphQL layer does.

@jesstelford
Contributor

@ttsirkia fyi; I've closed the issue as ✅ answered, just so we can keep our issue backlog clean.

I'm very much still interested to hear your use case for direct DB/Adapter access 👍

@ttsirkia
Author

ttsirkia commented Apr 8, 2019

Thanks again! I think my concern is that this new model is perfect for applications in which all data is public and queries are rather simple.

Custom APIs

I see many use cases in which I don't expose the query on the client side. The query may contain some business logic that should not be visible to users. Therefore, I would like to see an easy way to make API calls and do the queries on the server side. This applies to both reading and writing. Server can still use GraphQL, Mongoose queries, raw SQL or whatever.

Many applications also contain data that should be visible only to some users. Before running queries, there can be very fine-grained access control etc. and I don't see that a generic framework can provide all the needed control for this.
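
As a rough illustration of that kind of per-endpoint check, here is a sketch of an Express-style middleware. The `requireAccess` name and the `isAllowed` callback are hypothetical, and `req.user` is assumed to be populated by whatever authentication the app already uses:

```javascript
// Hypothetical Express-style middleware: rejects the request unless
// `isAllowed(req.user, req)` returns true.
function requireAccess(isAllowed) {
  return (req, res, next) => {
    if (!req.user || !isAllowed(req.user, req)) {
      // Stop here; the route handler (and its queries) never runs.
      res.status(403).json({ error: 'Forbidden' });
      return;
    }
    next();
  };
}
```

A route would then opt in with something like `server.app.get('/api/reports', requireAccess((user) => user.isAdmin), handler)`, where `isAdmin` and `handler` are placeholders.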

If there are large amounts of data, exposing a GraphQL endpoint may open a route for intentional or unintentional DoS attacks: for example, a query that fetches lots of data in a way that cannot use indexes.

Creating custom resolvers can resolve some of these issues, but that is closely linked to the topic below.

Raw database access

GraphQL works nicely for rather simple queries. But more complex queries, especially containing Mongo aggregation pipeline, are not possible. Therefore, I see it as a mandatory feature that there is a way to access Mongoose (or other sources) directly in the server-side code.

@JedWatson
Member

@ttsirkia some really good points / questions here, going to see if I can help clear a couple of them up.

Internal APIs

The internal GraphQL APIs that @jesstelford is talking about are best thought of as an ORM layer. They ideally replace the usual things you'd get from mongoose like find() and update(), with the benefit of being built on top of Keystone's internal abstractions (including field type logic and rules)

There's no strong link between supporting things at the schema level (including custom mutations) and making them publicly available, although we're still working through some of the tasks required to make it possible to create additional limited GraphQL endpoints (at the moment, there is only one, this will change soon)

But more complex queries, especially containing Mongo aggregation pipeline, are not possible.

I'm not sure that this is true, it's what I'd expect custom queries and mutations to enable.

Raw database access

To answer your original question, yes the underlying database communication layer will always be exposed by the list, as was the case in Keystone 4. We're hoping to make it less important, but will never stand in the way - especially because, as you noted, people need access to advanced features like aggregations etc.

Server can still use GraphQL, Mongoose queries, raw SQL or whatever.

This is the idea.

It'll also be important for people migrating projects from older versions of keystone, which relied on a lot of mongoose usage.

Access Control

Before running queries, there can be very fine-grained access control etc. and I don't see that a generic framework can provide all the needed control for this.

Have you had a look at our access control spec? It's quite deep, and I'd be interested in collecting use-cases that it doesn't cover because we're investing quite a lot in this part of Keystone 5.

One that came up recently was context-aware permissions (i.e. a user can access certain fields on a model through some interfaces, like a query of related items, but not others) which is pretty interesting, but I'm hoping we can solve that by making custom queries and mutations easy to implement at the schema level.

@ttsirkia
Author

ttsirkia commented Apr 8, 2019

Thanks @JedWatson! I must say that the Access Control part of the documentation is quite impressive and I didn't read it so well earlier.

The aggregation pipeline can do pretty amazing tricks and if there is only a limited wrapper around the DB layer, I think there are many cases in which the raw access is at least useful if not mandatory. Simple finds and updates are probably trivial.

One reason for the custom APIs is that a single request from the client side may require multiple queries on the server side, with some logic between them that determines the following queries.
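
To make that concrete, here is a sketch of a handler that runs two queries with a decision in between. Everything in it is hypothetical: `findStation` and `findTrains` stand in for whatever data access the server uses (GraphQL, Mongoose, raw SQL), and the `busy` flag is an invented field:

```javascript
// Hypothetical handler factory for an Express-style route.
// The data-access functions are injected so the handler does not care
// which layer actually runs the queries.
function makeStationBoardHandler({ findStation, findTrains }) {
  return async (req, res) => {
    // First query: look up the station named in the URL.
    const station = await findStation(req.params.code);
    if (!station) {
      res.status(404).json({ error: 'Unknown station' });
      return;
    }
    // Logic between the queries: the first result shapes the second query.
    const windowHours = station.busy ? 2 : 4;
    const trains = await findTrains({ stationId: station.id, windowHours });
    res.json({ station: station.name, trains });
  };
}
```

The client only ever sees one endpoint and one response; the intermediate result and the branching stay on the server.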

@JedWatson
Member

Yeah, the aggregation pipeline is powerful. We don't want to get in the way of that 🙂

To try and close this out, the idea is that you'd set up custom queries or mutations that wrapped your aggregations. This isn't to say it's the only way of doing it, or even the best because every use-case is unique, but just to make sure the idea is clear:

queryOrMutation: (args) => adapter.api(...)

where adapter.api is mongoose's aggregation api, or multiple queries, etc.

So we're giving you a specific part of the system to implement whatever complex query logic you want to, using either the internal APIs that keystone provides (if sufficient) or the underlying connection (if you need raw features).

The equivalent in a relational database scenario is that we're never going to stop you writing raw SQL against the database in an action or resolver if you need to.
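
A sketch of what such a wrapper could look like, assuming `model` is a Mongoose model (or anything else with an `aggregate(pipeline)` method that can be awaited). The `makeTopTrainsResolver` name, the field names and the argument shape are illustrative, and how the resulting function gets registered as a custom query is left out because that part of the API is not documented in this thread:

```javascript
// Builds a resolver-style function that wraps a raw aggregation, in the
// spirit of `queryOrMutation: (args) => adapter.api(...)`.
function makeTopTrainsResolver(model) {
  return async ({ userId, limit = 5 }) => {
    const rows = await model.aggregate([
      { $match: { user: userId } },
      { $group: { _id: '$trainNumber', number: { $sum: 1 } } },
      { $sort: { number: -1 } },
      { $limit: limit },
    ]);
    // Reshape the raw aggregation rows into plain, named fields.
    return rows.map((row) => ({ trainNumber: row._id, trips: row.number }));
  };
}
```

Keeping the pipeline behind a function like this means callers depend on the returned shape rather than on the aggregation itself.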

@ttsirkia
Author

ttsirkia commented Apr 8, 2019

I think the best way is to start building some example cases with the project when it is a bit more mature and the documentation covers more of it. I'm really interested in the new concept behind Keystone 5.

@ttsirkia
Author

ttsirkia commented Apr 8, 2019

The current blog example project might be a bit too simple to give a good overview of the more advanced ideas behind the framework, especially on the server side. It is an excellent example for getting familiar with the new concept, but it raised many questions. 🙂

@jesstelford
Contributor

@ttsirkia I'm still interested to hear the specific usecase you have in mind - what kind of aggregation queries are you making? What levels of access control do you need which aren't already possible?

Ideally we'd be able to provide all your needs via the Keystone APIs, and avoid you having to drop down to the database level, because at that point we can no longer make guarantees about the structure of the data or any pre/post processing which might normally happen when going via the Keystone APIs.

@jesstelford
Contributor

jesstelford commented Apr 8, 2019

queries are rather simple

I encourage you to play around with the GraphQL API currently implemented a bit more - it is exceptionally powerful and has provided everything we need for multiple quite complex applications with lots of complicated data relationships.

Again, if you've got a specific example that you're able to outline, it'd be great to make sure we can cover those cases too!

@ttsirkia
Author

ttsirkia commented Apr 9, 2019

Here are some real examples:

var query = {
  station: selectedStation._id,
  $and: [
    { timestamp: { $gt: last12h } },
    { timestamp: { $lt: next12h } }
  ],
  $or: [
    { $and: [{ arrivalTime: { $lte: next4h } }, { arrivalTime: { $gt: last2h } }, { arrivalTimeActual: null }] },
    { $and: [{ departureTime: { $lte: next4h } }, { departureTime: { $gt: last2h } }, { departureTimeActual: null }] },
    { $and: [{ arrivalTimeExpected: { $lte: next4h } }, { arrivalTimeExpected: { $gt: last2h } }, { arrivalTimeActual: null }] },
    { $and: [{ departureTimeExpected: { $lte: next4h } }, { departureTimeExpected: { $gt: last2h } }, { departureTimeActual: null }] }
  ]
};

Trainspot.model.aggregate([
  { $match: { user: req.user._id } },
  { $group: { _id: '$trainNumber', number: { $sum: 1 } } },
  { $sort: { number: -1 } },
  { $limit: 5 }
], ...

Train.model.aggregate([{
  $match: {
    started: true,
    running: false,
    minutesLate: { $gte: limit },
    departureTime: { $gte: startDate.toDate(), $lt: endDate.toDate() }
  }
}, {
  $group: { _id: { n: '$number', t: '$type' }, avg: { $avg: '$minutesLate' }, count: { $sum: 1 } }
}, {
  $match: { count: { $gte: 3 } }
}, {
  $sort: { count: -1, avg: -1 }
}, {
  $group: { _id: '$_id.t', n: { $push: { number: '$_id.n', count: '$count', avg: '$avg' } } }
}, {
  $project: { data: { $slice: ['$n', 10] } }
}, {
  $sort: { _id: 1 }
}], ...

I'm not an expert in GraphQL, but my assumption is that even logical operators (AND and OR) are difficult.

@jesstelford
Contributor

Wonderful, thank you for those examples!

@jesstelford
Contributor

jesstelford commented Apr 9, 2019

For your first one, I'm not 100% sure on the order of operations there, but am going to assume that the station/$and/$or are combined as an implicit $and.

Here's how you can write it as a GraphQL query in Keystone v5 today:

query getTrains(
  $stationId: ID!,
  $last12h: DateTime!,
  $next12h: DateTime!,
  $next4h: DateTime!,
  $last2h: DateTime!
) {
  allTrains(where: {
    station: { id: $stationId },
    timestamp_gt: $last12h,
    timestamp_lt: $next12h,
    OR: [
      { AND: [{ arrivalTime_lte: $next4h }, { arrivalTime_gt: $last2h }, { arrivalTimeActual: null } ]},
      { AND: [{ departureTime_lte: $next4h }, { departureTime_gt: $last2h }, { departureTimeActual: null } ]},
      { AND: [{ arrivalTimeExpected_lte: $next4h }, { arrivalTimeExpected_gt: $last2h }, { arrivalTimeActual: null } ]},
      { AND: [{ departureTimeExpected_lte: $next4h }, { departureTimeExpected_gt: $last2h }, { departureTimeActual: null } ]},
    ]
  }) {
    id
  }
}

@jesstelford
Contributor

If I'm understanding the second example correctly, the English version of this is:

For User X, get their 5 most frequently ridden trains, sorted by number of trips descending

The functionality to group & aggregate isn't yet exposed on our API, so I've created an issue to add it: #1016

@jesstelford
Contributor

I don't quite understand the query in the third (my Mongo-fu isn't strong enough!). Can you explain what the end result is expected to be based on some input data?

@ttsirkia
Author

ttsirkia commented Apr 9, 2019

The third one has the following meaning:

  • Find the trains that have actually run but are not running at the moment, that were late by at least the given limit, and whose departure time is between the given dates
  • Group them by train number and type, and calculate the average delay and the number of days on which the limit was reached
  • Keep only the groups that have at least three entries
  • Sort them by count and then by average, in descending order
  • Group them again so that the train type is the key, and add the individual trains to a list belonging to that train type
  • Limit each group to its top ten trains
  • Sort the groups by train type
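
The same steps can be sketched over an in-memory array, which may make the expected result easier to follow. This is only an illustration of what the pipeline computes, not a replacement for running it in MongoDB; the `lateTrainsByType` name and the option names are made up, and each train is assumed to look like `{ number, type, started, running, minutesLate, departureTime }` with `departureTime` as a `Date`:

```javascript
function lateTrainsByType(trains, { limit, startDate, endDate, minCount = 3, topN = 10 }) {
  // Step 1: trains that ran, are not running now, were late enough, in range.
  const matched = trains.filter((t) =>
    t.started === true && t.running === false &&
    t.minutesLate >= limit &&
    t.departureTime >= startDate && t.departureTime < endDate);

  // Step 2: group by (type, number), collecting total delay and count.
  const groups = new Map();
  for (const t of matched) {
    const key = `${t.type}\u0000${t.number}`;
    const g = groups.get(key) || { number: t.number, type: t.type, total: 0, count: 0 };
    g.total += t.minutesLate;
    g.count += 1;
    groups.set(key, g);
  }

  // Steps 3 and 4: keep groups with enough entries, sort by count, then average.
  const ranked = [...groups.values()]
    .map((g) => ({ number: g.number, type: g.type, count: g.count, avg: g.total / g.count }))
    .filter((g) => g.count >= minCount)
    .sort((a, b) => b.count - a.count || b.avg - a.avg);

  // Steps 5 and 6: regroup by type in ranked order, keeping the top N per type.
  const byType = new Map();
  for (const { type, number, count, avg } of ranked) {
    if (!byType.has(type)) byType.set(type, []);
    const list = byType.get(type);
    if (list.length < topN) list.push({ number, count, avg });
  }

  // Step 7: sort the groups by train type.
  return [...byType.entries()]
    .map(([type, data]) => ({ _id: type, data }))
    .sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0));
}
```

The result has one entry per train type, each with a `data` list of `{ number, count, avg }` objects, mirroring the shape produced by the final `$project` and `$sort` stages.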

Here is a picture:

[image]

Boxes are the train types, in each box there is a train number and how many times it has been late over the limit and what is the average delay.

There are so many aggregation pipeline stages and pipeline operators that I don't see it as very beneficial to implement all of them in GraphQL. Therefore, I was asking whether there is still a possibility to write these queries using Mongoose directly.

I'm not that familiar with GraphQL, but I'm also wondering how easily you can make dynamic queries, that is, queries produced at runtime instead of being written out in the code. As a GraphQL query is a string instead of nested objects (as in Mongoose), it requires at least a library to do that. Manipulating objects and arrays to construct the query is very simple, and no injections can occur.
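
For what it's worth, GraphQL variables cover part of this: the query text can stay constant while the filter is built at runtime as a plain object and sent alongside it. The sketch below assumes the filter naming used in the `allTrains` query earlier in the thread; `TrainWhereInput` is an assumed name for the filter input type, and the helper names and parameters are hypothetical:

```javascript
// The query text is a constant; only the variables change at runtime.
const TRAINS_QUERY = `
  query getTrains($where: TrainWhereInput) {
    allTrains(where: $where) { id }
  }
`;

// Builds the `where` filter from plain values, with no string concatenation.
function buildTrainWhere({ stationId, windowStart, windowEnd, fields = [] }) {
  const where = {
    station: { id: stationId },
    timestamp_gt: windowStart,
    timestamp_lt: windowEnd,
  };
  if (fields.length > 0) {
    // One AND group per field, combined with OR.
    where.OR = fields.map((field) => ({
      AND: [{ [`${field}_lte`]: windowEnd }, { [`${field}_gt`]: windowStart }],
    }));
  }
  return where;
}

// JSON body as it would be POSTed to a GraphQL endpoint.
function buildRequestBody(options) {
  return JSON.stringify({
    query: TRAINS_QUERY,
    variables: { where: buildTrainWhere(options) },
  });
}
```

Because the values travel as JSON variables rather than being spliced into the query string, the injection concern is handled in much the same way as with nested objects in Mongoose.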
