
Use Swagger #390

Closed
qstokkink opened this issue Dec 20, 2018 · 19 comments
Labels
priority: low Should be addressed at some point in the future

Comments

qstokkink (Collaborator) commented Dec 20, 2018

We should use Swagger for our REST endpoints. See below for the original exploration of this idea.

Old

Nobody wants to write documentation; we should generate it automatically instead. Nobody likes writing precondition and postcondition checks either. We can automate both as follows.

First we can scan all the distinct endpoints available by instantiating a RootEndpoint and inspecting its children. We can then automatically generate documentation from a .restapi variable tied to the REST API handler functions (render_POST, render_GET, etc.).

We can attach the .restapi variable to these functions statically by using annotations/wrapper functions. Specifically, we should expose:

  • RESTInput, which takes an option, a jsontype and a description argument.
  • RESTOutput, which takes a conditional_lambda, an output_format and a description argument.
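As a sketch of how such decorators could attach the .restapi metadata for later introspection (a hypothetical implementation, not the actual prototype linked below):

```python
# Hypothetical sketch: each decorator records its arguments in a .restapi
# attribute on the handler, which a documentation generator can later inspect.

def RESTInput(option, jsontype, description):
    def decorator(func):
        meta = getattr(func, "restapi", None) or {"inputs": [], "outputs": []}
        meta["inputs"].append((option, jsontype, description))
        func.restapi = meta
        return func
    return decorator

def RESTOutput(conditional_lambda, output_format, description=""):
    def decorator(func):
        meta = getattr(func, "restapi", None) or {"inputs": [], "outputs": []}
        meta["outputs"].append((conditional_lambda, output_format, description))
        func.restapi = meta
        return func
    return decorator

# Toy handler carrying both kinds of metadata.
@RESTInput("all", "boolean", "Update applies to all overlays.")
@RESTOutput(lambda input: True, {"success": "boolean"}, "Whether the request succeeded.")
def render_POST(request):
    ...

print(render_POST.restapi["inputs"][0][0])   # -> all
```

A documentation generator would then walk the endpoint tree and read each handler's .restapi attribute.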

RESTInput

The first argument a RESTInput takes is the option, which specifies which option in the request URI is being described. The jsontype then specifies the JSON datatype, which can be a combination of:

  • number/boolean/null
  • string with specified encoding, for example STR_TYPE["UTF-8"] or STR_TYPE["ASCII"]
  • array/object
  • tuple of (jsontype, description) which adds additional documentation to an inner construction

Lastly, each input should have a description. This is of little use for runtime precondition checking; it is mainly there for humans reading the generated documentation.
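To illustrate how such a jsontype could drive a runtime check, here is a minimal sketch; the NUMBER_TYPE/STR_TYPE markers are hypothetical stand-ins for the proposed constants, and object keys are simplified to plain strings:

```python
# Hypothetical stand-ins for the proposed type markers.
NUMBER_TYPE = "number"
STR_TYPE = {"UTF-8": "string/utf-8", "ASCII": "string/ascii", "BASE64": "string/base64"}

def matches(value, jsontype):
    """Check a parsed JSON value against a jsontype specification."""
    if isinstance(jsontype, tuple):              # (jsontype, description): unwrap
        return matches(value, jsontype[0])
    if jsontype == NUMBER_TYPE:                  # number (bool is excluded explicitly)
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if jsontype in STR_TYPE.values():            # string with a given encoding
        return isinstance(value, str)
    if isinstance(jsontype, list):               # array: every element matches the inner type
        return isinstance(value, list) and all(matches(v, jsontype[0]) for v in value)
    if isinstance(jsontype, dict):               # object: every declared key must match
        return isinstance(value, dict) and all(
            k in value and matches(value[k], t) for k, t in jsontype.items())
    return False

print(matches({"port": 8090}, {"port": NUMBER_TYPE}))   # -> True
```

The same function can serve both precondition checks (on request options) and postcondition checks (on response bodies).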

RESTOutput

The RESTOutput takes a conditional_lambda: a lambda function that takes the input and returns True if this output format should be applied. I have already created a function for converting lambda bytecode to human-readable form, so this argument is multi-purpose (both for checking postconditions programmatically and for generating documentation). The output_format specifies what the output matches (or should match) and follows the same rules as the jsontype of RESTInput. The description, once again, explains in human terms what the output is used for.
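The author's bytecode-to-text converter is not shown in the issue; as a rough stand-in, the stdlib dis module can already produce a (much less polished) textual form of a conditional_lambda:

```python
import dis
import io

def lambda_to_text(fn):
    # Rough stand-in for the converter mentioned above: dump the lambda's
    # bytecode as text so it could be embedded in generated documentation.
    buf = io.StringIO()
    dis.dis(fn, file=buf)
    return buf.getvalue()

print(lambda_to_text(lambda input: input is None))
```

A real converter would decompile the bytecode back to an expression like "input is None" rather than printing raw opcodes.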

Example

import json

from twisted.web import resource

# RESTOutput, STR_TYPE and NUMBER_TYPE are the proposed constructs described above.

class NetworkEndpoint(resource.Resource):
    """
    This endpoint is responsible for handling all requests regarding the state of the network.
    """

    def __init__(self, session):
        resource.Resource.__init__(self)
        self.session = session

    # Takes no RESTInput
    @RESTOutput(lambda input: input is None,
                {
                    "peers": {
                        (STR_TYPE["BASE64"], "The sha1 of the peer's public key."): {
                            "ip": STR_TYPE["BASE64"],
                            "port": NUMBER_TYPE,
                            "public_key": STR_TYPE["BASE64"],
                            "services": [(STR_TYPE["BASE64"], "The sha1 of the Community's public key.")]
                        }
                    }
                },
                "All of the known peers and their services.")
    def render_GET(self, request):
        return json.dumps({"peers": self.render_peers()})

Would generate the following documentation:

NetworkEndpoint

This endpoint is responsible for handling all requests regarding the state of the network.

Input

(no inputs; the Option / Type / Description table is empty)

Output

Given: input is None

    {
        "peers": {
            string (base64): {
                "ip": string (base64),
                "port": number,
                "public_key": string (base64),
                "services": [
                    string (base64)
                ]
            }
        }
    }

Annotations: the string (base64) key is "The sha1 of the peer's public key."; each entry of "services" is "The sha1 of the Community's public key."


@qstokkink qstokkink added the priority: medium Enhancements, features or exotic bugs label Dec 20, 2018
ichorid (Contributor) commented Dec 20, 2018

Automating development procedures is always a great idea, especially if it reduces code complexity. However, I wonder how that would fit into our current development process. At this moment, we have REST endpoint documentation and examples embedded in the source files. If we opt to use this, we'll have to remove that material from the code. In addition, each pull request will have to get an additional hook for documentation generation. And the documentation will have to be kept out of the main source tree, lest we face the problem of patching pull requests 'on the go'.

How do you see the real place of this in our development stack?

qstokkink (Collaborator, Author) commented:

> At this moment, we have REST endpoints documentation and examples embedded in the source files. If we opt to use this, we'll have to remove the stuff from the code.

We can keep the documentation, but right now it is just for humans. With this new system I hope we can use the same specification both for (a) generating human-readable documentation and (b) detecting pre- and postcondition violations at runtime. The former gains little over the status quo for programmers (except in IPv8, which has virtually no documentation on its REST endpoints), though it does let us export the documentation to others (god forbid, like Javascript developers) without shipping the source code of the service. Most importantly, the latter, checking preconditions and postconditions at runtime, can be a great help in maturing our code. Right now, any violations of our own REST specification go unnoticed by the service itself and have to be caught by a higher layer.

> In addition, each pull request will have to get an additional hook for documentation generation.

Yep.

> How do you see the real place of this in our development stack?

Most of this is based on the theory of provable execution/security/software engineering. We know that precondition/postcondition specification programming is fallible. How this will work out in practice, I do not know. However, I'm bored, on vacation, and in the mood for something adventurous. So, I'll implement this and report back once I know more about the practical side of this and whether or not we should want it at all 😃.

ichorid (Contributor) commented Dec 21, 2018

Being a great fan of pre- and postcondition checking, I totally support this one. Though, we need @xoriole's opinion on how best to integrate it into our development process.

qstokkink (Collaborator, Author) commented:

After reflecting on my own answer a bit, I realized I basically want the REST API to guarantee the following two things at runtime:

  • (Any valid input) implies ((valid output) or (an error))
  • (Any invalid input) implies (an error)

Of course validity in this context only applies to the typechecking of the data.
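The two implications above can be sketched as a generic wrapper, assuming hypothetical validate_input/validate_output predicates derived from the RESTInput/RESTOutput specifications:

```python
# Hypothetical enforcement of the two runtime guarantees:
#   (any valid input)   -> (valid output) or (an error)
#   (any invalid input) -> (an error)
def checked(validate_input, validate_output, handler):
    def wrapper(request):
        if not validate_input(request):
            return {"error": "precondition violated: invalid input"}
        output = handler(request)
        if not validate_output(output):
            return {"error": "postcondition violated: invalid output"}
        return output
    return wrapper

# Toy usage: the handler echoes a number; inputs must be ints.
echo = checked(lambda r: isinstance(r, int),
               lambda o: isinstance(o, dict) and "value" in o,
               lambda r: {"value": r})
print(echo(42))        # -> {'value': 42}
print(echo("oops"))    # -> an error object, not an exception
```

Note that both failure modes surface as an error response rather than silent misbehaviour, which is exactly the property described above.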

qstokkink (Collaborator, Author) commented:

Well, the NetworkEndpoint prototype for runtime precondition and postcondition checking is working: qstokkink@d057bf4#diff-963a914f2490b6181d59564ab191dbc1

I still need to finish the documentation generator. Smooth sailing so far.

qstokkink (Collaborator, Author) commented:

I added a specification to the OverlayStatisticsEndpoint:

    @RESTOutput(lambda input: True,
                ({
                     "statistics": [
                         {
                             (STR_TYPE["ASCII"], "The Community name."): {
                                 (STR_TYPE["ASCII"], "The message name."): {
                                     "identifier": (NUMBER_TYPE, "The message number."),
                                     "num_up": NUMBER_TYPE,
                                     "num_down": NUMBER_TYPE,
                                     "bytes_up": NUMBER_TYPE,
                                     "bytes_down": NUMBER_TYPE,
                                     "first_measured_up": NUMBER_TYPE,
                                     "first_measured_down": NUMBER_TYPE,
                                     "last_measured_up": NUMBER_TYPE,
                                     "last_measured_down": NUMBER_TYPE
                                 }
                             }
                         }
                     ]
                 },
                 "The statistics per message per community, if available."))
    def render_GET(self, _):
        ...

    @RESTInput("enable", (BOOLEAN_TYPE, "Whether to enable or disable the statistics."))
    @RESTInput("overlay_name", (STR_TYPE["ASCII"], "Class name of the overlay."))
    @RESTInput("all", (BOOLEAN_TYPE, "Update applies to all overlays."))
    @RESTOutput(lambda input: True,
                {
                    "success": (BOOLEAN_TYPE, "Whether the request succeeded, see error."),
                    "error": (STR_TYPE["ASCII"], "The available information in case of no success.")
                })
    def render_POST(self, request):
        ...

Here's what the generated OverlayStatisticsEndpoint documentation looks like (generated from the webpage with wkhtmltopdf):

OverlayStatisticsEndpoint.pdf

@qstokkink qstokkink self-assigned this Dec 23, 2018

qstokkink (Collaborator, Author) commented:

And here is the complete, corrected and auto-generated IPv8 REST API specification:

IPv8 REST API.pdf

qstokkink (Collaborator, Author) commented Jan 20, 2019

Lessons learned:

  • Our current API is not beautiful enough.
  • If we rework it anyway and we no longer require custom REST API generator code to deal with our own interesting constructions, we might as well switch to a Python Swagger library.

We will have to check out the sizable list of Python Swagger libraries and see what works best for our Twisted-based stack (https://github.com/gangverk/flask-swagger seems like a good contender).

I'll close #397 in favor of this plan.

@qstokkink qstokkink removed their assignment Jan 20, 2019
@qstokkink qstokkink changed the title Automatically generate REST API specification Use Swagger Jan 21, 2019
DanGraur (Contributor) commented Jul 22, 2019

After taking a long look into the world of web frameworks for Python, I have reached a few conclusions. I've also written 4 prototypes to exemplify how some candidate frameworks would work in a real application: one each for Flask, Klein, Quart, and Quart-Trio (with Trio as the asynchronous framework). These prototypes are in a public repository.

Each prototype has a module called server.py, which holds the handlers for HTTP requests, implemented with the help of the aforementioned frameworks. Bar Flask (which is purely synchronous), all prototypes handle 5 types of requests: echo (both sync and async), a simple calculator (both sync and async), and a random sentence generator (async only). Flask implements the echo and calculator handlers, but only synchronously.

There is a README for each prototype detailing the dependencies and instructions for starting the demos. Each demo also has a caller.py module, which sends requests to the server programmatically (although it is equally possible to issue requests manually via a browser). Where possible, this also starts up the server itself and lets it run indefinitely; where this is not possible, another module, start.py, is employed. Regardless, the following are my conclusions from this survey:

  • Flask: a synchronous framework which can only be hooked up to Twisted via a WSGI bridge (more specifically, Twisted's WSGIResource). Since this framework is synchronous, its request handlers cannot make use of asynchronous code. Given the importance of asynchronicity for the Tribler ecosystem, I wouldn't recommend we go with this; it is unsuitable for our needs.

  • Klein: an asynchronous web framework built on top of Twisted. It is in fact maintained by the Twisted devs and located in the Twisted repo. Since there is a high degree of integration between the two frameworks, changing our current API to Klein would not require a large amount of effort. In fact, the endpoints can largely remain the same, although there will be no need for Twisted's Resource class, so the code will be simplified. However, the decision to use this framework hinges on one important factor: do we intend to keep using Twisted, or do we intend to switch to some other asynchronous framework? If we intend to keep Twisted for the foreseeable future, then Klein might be worthwhile. If we intend to change to something other than Twisted, then this would be wasted effort and using this framework would not be advisable.

  • Quart: an asynchronous web framework based on Asyncio (it uses the Asyncio event loop). Quart is essentially designed to be Flask for asynchronous handlers, so setting up handlers with it is a straightforward task. The asynchronous code inside the handlers is based on the async/await constructs (introduced in Python 3.5) and Asyncio (it makes no use of Twisted), which means all our current handlers would have to be switched to Asyncio. However, as I have seen on the Technology Roadmap ticket, there is an intention to do so quite soon. This suggests that Quart is the way to go if the decision to move to Asyncio is certain (Klein would not be viable anyway, since Twisted would be phased out).

  • Quart-Trio: I've seen a lot of interest in Trio both online and in some of the discussions here on the Tribler discussion boards on GitHub. As a short intro, Trio is an alternative to both Asyncio and Twisted. It has gained a large following in the Python community for the straightforward way it lets users implement certain types of asynchronous tasks. Since Trio is not really a web framework, I employed Quart-Trio, a small extension of Quart that uses the Trio event loop rather than the Asyncio event loop (as a side note, the two event loops are incompatible, so Trio and Asyncio cannot be used together). Quart-Trio is meant to work like Quart, and in fact still requires Quart to be installed, since it uses some of its components (such as Blueprints, which are required to enable multi-module applications). Besides some framework-specific constructs, such as nurseries in the case of Trio, there are no major differences between the handlers for Quart-Trio and Quart (in both cases async and await are used as one would expect). I also used the asks library here to send asynchronous requests, since it is implemented on top of the Trio event loop. I should mention, however, that Quart-Trio requires Python 3.7. This framework should be used if Trio is chosen as the next asynchronous engine of the Tribler ecosystem; otherwise I don't think it's a good time investment. Moreover, Quart-Trio is a relatively young project (7 months old) with little following currently, but it is maintained by the same dev that maintains regular Quart, so that's a plus.
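The sync-versus-async distinction driving this comparison can be shown framework-free with plain asyncio (hypothetical echo handlers, loosely mirroring the prototypes; each real framework adds its own routing syntax on top):

```python
import asyncio

# Flask-style: a synchronous handler blocks its caller until it is done.
def echo_sync(msg):
    return {"echo": msg}

# Klein/Quart/Quart-Trio-style: an asynchronous handler can await I/O
# without blocking the event loop.
async def echo_async(msg):
    await asyncio.sleep(0)  # stands in for real asynchronous I/O
    return {"echo": msg}

print(echo_sync("hi"))
print(asyncio.run(echo_async("hi")))
```

Both return the same payload; the difference is that only the async variant can interleave with other work while waiting on I/O, which is what makes the framework choice matter.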

I should also mention that the initial goal of this task was to look through the stack of Swagger integrations and identify those which support the current asynchronous nature of the Tribler ecosystem while seamlessly adding support for the OpenAPI Specification. Unfortunately, there are no such integrations available, so the Swagger part will have to be developed in-house.

As a conclusion, I think the decision of which web framework to use depends entirely on which asynchronous framework is to be used. If Twisted is here to stay, then Klein is the choice. If a move is made towards Asyncio, then Quart is probably our best way forward. Finally, if Trio is favored, then Quart-Trio is the way to go. I encourage those interested in getting a glimpse of how the endpoints and frameworks (both the web and the async ones) are used, and generally how these multi-module applications should be wired up for each of the discussed frameworks, to look into the prototype repository.

ichorid (Contributor) commented Jul 22, 2019

BTW, quart uses decorators for routes the same way klein and flask do. @DanGraur, I wonder why you didn't use decorators in the quart and quart-trio examples?

ichorid (Contributor) commented Jul 22, 2019

Indeed, googling for swagger quart netted some promising results! Apparently, we don't have to build the swagger+quart integration completely on our own.

DanGraur (Contributor) commented:

True, decorators are an option, but they are essentially shorthand for add_url_rule (see the Flask source code), so using either has the same effect.

I didn't use decorators in those cases because I instantiated the Blueprint objects as instance fields rather than static variables. Since they're fields, I need to access them with the self. prefix, which doesn't work with decorators. If static variables are used instead, then decorators are possible (since no self is required). I did this in the klein example, but not in the others (I think my intention was to have a bit of diversity in how these frameworks can be used, but I wrote that quite a while back, so I'm not sure if this was really my intention 🙂).

@qstokkink qstokkink added the priority: low Should be addressed at some point in the future label Jul 31, 2019
qstokkink (Collaborator, Author) commented:

This will coincide with the move to Python 3, so I will put it on hold.

ichorid (Contributor) commented Sep 9, 2019

At the moment, @egbertbouman has almost completed porting Tribler to asyncio. He used aiohttp to reimplement the endpoints because it requires almost no other dependencies.

qstokkink (Collaborator, Author) commented:

With the merge of #612, we now have Swagger docs.

@qstokkink qstokkink removed the priority: medium (Enhancements, features or exotic bugs) and on hold labels Nov 24, 2021