[RFC] [PROPOSAL] API version 2 #2686

Closed
Kami opened this issue May 17, 2016 · 7 comments

Comments

@Kami
Member

Kami commented May 17, 2016

Background

From the end-user perspective, API v1 is not too bad (there are definitely worse RESTful APIs out there), but from a developer and maintainability perspective, API v1 is quite horrible (it has been causing us a lot of pain).

A lot of that is related to our (ab)use of Pecan. We use Pecan in ways it was never really meant to be used. This results in a lot of hacks (jsexpose is a terrible monstrosity) and hard-to-maintain controller code. In addition to that, using Pecan ties us to a particular not-too-RESTful URL scheme (in theory, we could make it work with any kind of URL scheme we want, but this would require horrible hacks and the code would be even harder to follow and maintain).

Goals / requirements

New API should address the following things:

  • Nicer and more "RESTful" paths (e.g. /v1/packs/views/files/<pack name> vs /v2/packs/<pack name>/files, /v1/config_schema/<pack name> vs /v2/packs/<pack name>/config_schema, etc.)
  • Controller-related code should be easy to understand and maintain
  • It should be possible and easy to auto-generate API documentation from the code and specification
  • It should be possible and easy to auto-generate low-level API clients for multiple programming languages. We might still want to write more idiomatic and higher-level client libraries ourselves, but auto-generated ones should be a good start.

Open questions and Implementation details

There are some open questions and implementation details we need to decide about:

  1. Should we switch away from Pecan (we also need to check the latest version since some things have been improved)? What are the good alternatives for light-weight API services?

I've used Django, Flask and Tornado for such purposes in the past. Django is obviously not a good fit and is overkill, and Tornado requires Tornado-specific async code, which is a no-go for us since we are already bought into the eventlet ecosystem.

Flask might be an OK middle ground, but I still think it's slight overkill and too bloated for simple API services.

  2. Should we use some kind of framework (e.g. Swagger or grpc) which allows us to describe API schema and then auto generates client libraries and docs?

I do think we should consider using something like Swagger or grpc. No matter what kind of approach we go with, having some kind of description / "schema" for the API is a good thing. Manually writing low-level API clients and API docs is error-prone and a waste of time (it's too easy for things to get out of sync, hard to maintain, etc.).

Using something like Swagger / Thrift / grpc would potentially also allow us to version models and API responses, but I think that's out of scope of this change (doing all of that mentioned above is already quite a lot of work and we can't do and fix everything as part of this change).

I think all of us still need to do some more research, but let's get the debate going. Feedback and comments are welcome.

@manasdk
Contributor

manasdk commented May 17, 2016

I like Swagger; it has a good community around it. I would also encourage us to look at RAML.

@enykeev
Member

enykeev commented May 25, 2016

Should we switch away from Pecan (we also need to check the latest version since some things have been improved)? What are the good alternatives for light-weight API services?

I wonder if we need a REST framework at all. At its core, an API service is very simple: each request goes through multiple levels of middleware that translate incoming data into a user-friendly API (check cache headers and return the proper code if the data hasn't changed, validate the request body and respond with an error if it's malformed, parse cookies into a list), then gets routed to a particular controller, is formed into a response, and gets sent back through another set of middleware (return the proper error code if the controller throws an error, serialize the controller's return object into JSON, add proper headers, cache the response).
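
To make that pipeline concrete, here is a minimal sketch using plain WSGI and only the standard library. It is not how st2api is written; the controller, the `/v2/packs` data it returns, and the `error` key in the error body are all made up for illustration.

```python
import json


def controller(environ):
    # Hypothetical controller: returns a plain Python object and knows
    # nothing about status codes, headers, or serialization.
    if environ.get("PATH_INFO") != "/v2/packs":
        raise LookupError("no such resource")
    return [{"name": "core"}, {"name": "linux"}]


def json_response(handler):
    # Outbound middleware: serialize the controller's return value to JSON
    # and add the proper headers.
    def app(environ, start_response):
        body = json.dumps(handler(environ)).encode("utf-8")
        start_response("200 OK", [("Content-Type", "application/json"),
                                  ("Content-Length", str(len(body)))])
        return [body]
    return app


def error_handler(app):
    # Outbound middleware: turn an exception raised by the controller
    # into a proper error code instead of a stack trace.
    def wrapped(environ, start_response):
        try:
            return app(environ, start_response)
        except LookupError as exc:
            body = json.dumps({"error": str(exc)}).encode("utf-8")
            start_response("404 Not Found", [("Content-Type", "application/json"),
                                             ("Content-Length", str(len(body)))])
            return [body]
    return wrapped


# The whole "framework" is function composition.
application = error_handler(json_response(controller))
```

Request-side steps (cache header checks, body validation, cookie parsing) would be more wrappers of the same shape, each one inspecting `environ` before calling the next layer.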

Pecan's complexity comes from the fact that it tries to make some routine actions easier by introducing magical constructs such as singleton Request\Response objects and heavy use of decorators. It makes sense for some simple applications built by people new to web development, but it creates additional obstacles when you know what you're doing. This opinionated design, along with poor modularity on the Pecan side and our initial copycat approach to st2api development, led us to where we are now.

Should we use some kind of framework (e.g. Swagger or grpc) which allows us to describe API schema and then auto generates client libraries and docs?

Swagger seems like a decent way of writing a REST spec, but the server code it generates for Python Flask is even more magical than what we have now (http://editor.swagger.io/, see for yourself), and for now I don't really see how it's going to handle code changes after we decide to alter the spec and regenerate it. We probably should not expect it to solve all of our problems out of the box and should be ready for the fact that some assembly might be required.

@enykeev
Member

enykeev commented May 25, 2016

we also need to check the latest version (of Pecan) since some things have been improved

I went ahead and compared the version we're using with current pecan master (StorminStanley/pecan@st2-patched...pecan:master). Only a couple of things have really changed:

  • pecan's own expose decorator now has a route argument that potentially allows building a custom routing scheme [1]. The docstring for the function paints a very specific use case though [2], so it might not be all that hugely beneficial for us.
  • it now attempts to replicate the wsexpose\jsexpose argument-guessing behavior out of the box with its own getargspec implementation [3] (previously it was part of the inspect module [4]). It makes no attempt to assert the type though (it treats all of them as strings then, I guess)

@enykeev
Member

enykeev commented May 26, 2016

What are the good alternatives for light-weight API services?

It seems to me that in the Python world, simplerouter is pretty close to a minimum viable implementation of an HTTP service. You can probably throw away WebOb too if you're fine working with environ directly, though it's likely a level of convenience we want to keep. Redirects may not be all that hugely beneficial to us; we may also want to convert our current hooks to work as WSGI middlewares or integrate them into the router pipeline somehow.
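
For a sense of how little code "working with environ directly" takes, here is a rough standard-library sketch of a router. This is not simplerouter's API (I have not reproduced it here); the `Router` class and the `list_pack_files` handler are hypothetical, and the path is the v2-style one from the proposal.

```python
import re


class Router:
    """A tiny WSGI router that reads the raw environ dict, no WebOb."""

    def __init__(self):
        self.routes = []

    def add(self, method, pattern, handler):
        # "<name>" path segments become named regex groups. The rest of the
        # pattern is used as-is, so it must not contain regex metacharacters.
        regex = re.sub(r"<(\w+)>", r"(?P<\1>[^/]+)", pattern)
        self.routes.append((method, re.compile("^" + regex + "$"), handler))

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        for method, regex, handler in self.routes:
            match = regex.match(path)
            if match and method == environ.get("REQUEST_METHOD"):
                # Captured path segments are passed as keyword arguments.
                body = handler(**match.groupdict()).encode("utf-8")
                start_response("200 OK", [("Content-Type", "text/plain")])
                return [body]
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]


def list_pack_files(pack_name):
    # Hypothetical handler; a real one would look the pack up.
    return "files of pack %s" % pack_name


router = Router()
router.add("GET", "/v2/packs/<pack_name>/files", list_pack_files)
```

Since the router is itself a WSGI callable, existing hooks converted to WSGI middlewares could wrap it without either side knowing about the other.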

@enykeev
Member

enykeev commented May 26, 2016

As far as Swagger-compatible frameworks go, connexion seems to be the only option. It pulls in Flask, although it mostly only uses Flask's Request\Response objects (the same thing WebOb provides). It also only works with gevent and not eventlet (as far as I can tell, they are mostly compatible, but the particular degree is something we would need to investigate more thoroughly).

@enykeev
Member

enykeev commented May 27, 2016

Since no one is biting, I'll continue in the form of a monologue =)

Thing is, for all the issues in this thread, only "Nicer and more "RESTful" paths" is a breaking change to the API, the rest is just refactoring that would be good to do either before or during v2 implementation. I'd like us to discuss other types of breaking changes that are desperately needed:

  • Separate data structure and API for execution results. Some executions produce such massive output that storing and transferring it alongside other execution data is increasingly problematic (especially in light of the synchronous and non-streaming nature of the JSON parser and code highlighter we're using and the specific performance expectations we have in relation to interface responsiveness). Other projects that have the same problem (think of CircleCI) store such data as static files, and it seems like a reasonable approach.
  • Static resource references. Since we initially designed API v1, we figured there are also some other types of "static" content we want to expose through the API, namely entrypoints and icons. Now we have to pipe them through st2api, which reads them from disk and outputs them as binary data (and to be honest, I'm not really sure if we're actually streaming them or if we have to load them into memory first). It seemed like we all agreed that st2api isn't really suitable for serving static content and there are better tools to do that (nginx or a CDN, depending). What we need is some sort of universal "reference" structure that would tell the client to go and download this content elsewhere (something like $ref in yaml).
  • Intermediate results. Talking about execution results, another frequent request we have is to allow actions to post intermediate updates on progress so the user would have an understanding of what is happening with some long-running process that is not a workflow. From an API perspective, we can simply start to post messages to the stream more often (not only on state changes as we do now), but if we are to implement a results API endpoint, it is reasonable to start streaming results for executions that are not finished instead of terminating the connection the moment we have sent everything we had up to that moment.
  • We need to expose more data in the stream, namely updates for entities other than executions (newly registered actions, updated rules and so on). Doing so would make UI more responsive.
  • At the same time, we need to start filtering stream based on rbac.
  • For the purpose of chatops rbac, we may need to add a notion of a user acting on behalf of other users. Hubot should be able to run an alias execution for the user and get the results of such an execution for every user it needs to. The user, at the same time, should only get executions they have access to.
  • Better pagination approach. Our current pagination for executions (which is based on the ordinal number of the document in the collection) only works for not very intensive environments. Having a new execution pop up every second makes it almost impossible to predict what the next page will look like. We've previously discussed switching to pagination based on timestamps, but with rbac in play it gets even hairier. Suddenly, you can't share the url of the page anymore since every user would have a different set of executions on this "page". We would need to come up with some way to link to a particular page so a user could say something like "hey, I'm pretty sure there should be an execution somewhere around that, but don't really see it. Take a look [link]". And we want to somehow keep the number of executions between 10 (minimum to have context) and 50 (maximum to keep the UI responsive) at all times.
  • And since thousands of actions on a single st2 is our new reality, we need pagination for the rest of the entities too. We would need to figure out whether we are OK with ordinal pagination in the cases of actions and rules.
  • More flexible filtering for pretty much all the entities. We've initially discussed having elastic index for executions. I'm not insisting on this particular solution, but we need an option to build a little bit more complicated requests than filters we have now (see StackStorm/discussions#172 as an example).
  • Settings API endpoint. ST2 should expose constants, schemas for entities, config parameters specific to this particular installation, and a list of features this version of st2 supports so we would not have to update st2web code on every change in the model. That's another part where a swagger spec might be hugely beneficial to us.
  • Pluggable authentication. The approach we took with st2auth is decent, but not always ideal. We've been asked in the past about how to integrate SSO or JWT into the system, and while to my knowledge we haven't received any requests from customers, eventually we will. Authentication in our case is just a single middleware (a hook in the case of st2api) that interrupts the flow and returns an error in case of an invalid token. In theory, plugging a JWT check in place of it doesn't look hard at all.
  • User meta. At some point in time, when st2 rbac gets more traction or when we implement TZ shift, we would need to expose some additional data about the user to the UI. Another endpoint that would be useful to have in v2.
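
On the pagination point above, a rough sketch of what timestamp-based pages could look like, as opposed to ordinal offsets. Everything here is an assumption for illustration: the `page_before` helper, the `start_timestamp` and `id` field names, and the in-memory list standing in for the database. It does not address the rbac problem, only the "new executions shift the page" one.

```python
from datetime import datetime, timedelta


def page_before(executions, cursor, limit=50):
    """Return up to `limit` executions strictly older than `cursor`, newest first.

    `cursor` is a (start_timestamp, id) tuple; the id breaks ties between
    executions that share a timestamp. Also returns the cursor for the next
    (older) page, or None when nothing is left.
    """
    def key(execution):
        return (execution["start_timestamp"], execution["id"])

    older = sorted((e for e in executions if key(e) < cursor), key=key, reverse=True)
    page = older[:limit]
    next_cursor = key(page[-1]) if page else None
    return page, next_cursor


# 100 fake executions, one per second.
t0 = datetime(2016, 5, 27, 12, 0, 0)
executions = [{"id": "e%03d" % i, "start_timestamp": t0 + timedelta(seconds=i)}
              for i in range(100)]

# First page: everything older than "now".
page1, cursor = page_before(executions, (t0 + timedelta(seconds=100), ""), limit=10)

# A new execution arrives while the user is still looking at page 1...
executions.append({"id": "e100", "start_timestamp": t0 + timedelta(seconds=100)})

# ...but the second page is anchored to the cursor, so it does not shift.
page2, _ = page_before(executions, cursor, limit=10)
```

With ordinal offsets, the new execution would have pushed `e090` onto the second page as a duplicate; with a cursor, page 2 still starts at `e089`. The cursor is also something that could be put into a shareable link, though what each user sees behind it would still depend on rbac.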

When we're talking about API v2, these are the things I think about in the first place, not whether we're going to use swagger or grpc, Pecan or Flask. Those can be done for v1, provided we have enough motivation...

@Kami
Member Author

Kami commented Jun 20, 2017

I'll close this since a lot of it has been resolved in #2727 and more will be once we introduce new v2 API endpoints without the legacy pecan code and paths.

@Kami Kami closed this as completed Jun 20, 2017
3 participants