
A new way of writing WSGI middleware and apps, that's transparently interoperable with standard WSGI.


Creating Simpler Middleware with WSGI Lite

Wouldn't it be nice if writing correct WSGI middleware was this simple?

>>> from wsgi_lite import lite, lighten

>>> def latinator(app):
...     # Make sure that `app` can be invoked via the Lite protocol, even
...     # if it's a standard WSGI 1 app:
...     app = lighten(app)
...     @lite
...     def middleware(environ):
...         status, headers, body = app(environ)
...         for name, value in headers:
...             if name.lower()=='content-type' and value=='text/plain':
...                 break
...         else:
...             # Not text/plain, pass the request through unchanged
...             return status, headers, body
...         # Strip content-length if present, else it'll be wrong
...         headers = [
...             (name, value) for name, value in headers
...                 if name.lower() != 'content-length'
...         ]
...         return status, headers, (piglatin(data) for data in body)
...     return middleware

Using just two decorators, WSGI Lite lets you create correct and compliant middleware and applications, without needing to worry about start_response, write and close calls. And with those same two decorators, it also lets you manage resources to be released at the end of a request, and automatically pass in keyword arguments to your apps or middleware that are obtained from the WSGI environment (like WSGI server extensions or middleware-supplied parameters such as request or session objects).

For more details, check out the project's home page on BitBucket, and scroll down to the table of contents.

WSGI Lite is currently only available for Python 2.x (tested w/2.3 up to 2.7) but the source should be quite portable to 3.x, as its magic is limited to inspecting function argument names, and cloning functions using new.function().

If you've seen the Latinator example from the WSGI PEP, you may recall that it's about three times longer than the example shown above, and it needs two classes to do the same job. And, if you've ever tried to code a piece of middleware like that, you'll know just how hard it is to do it correctly.

(In fact, as the author of the WSGI PEPs, I have almost never seen a single piece of WSGI middleware that doesn't break the WSGI protocol in some way that I could find with a minute or two of code inspection!)

But the latinator middleware example shown above is actually a valid piece of WSGI 1.0 middleware, that can also be called with a simpler, Rack-like protocol. And all of the hard parts are abstracted away into two decorators: @lite and lighten().

The @lite decorator says, "this function is a WSGI application, but it expects to be called with an environ dictionary, and return a (status, headers, body) triplet. And it doesn't use start_response(), write(), or expect to have a close() method called."

The @lite decorator then wraps the function in such a way that if it's called by a WSGI 1 server or middleware, it will act like a WSGI 1 application. But if it's called with just an environ (i.e., without a start_response), it'll be just like you called the decorated function directly: that is, you'll get back a (status, headers, body) triplet (similar to the "Rack" protocol, aka Ruby's version of WSGI).
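To make the two calling conventions concrete, here is a minimal sketch of a dual-protocol wrapper in plain Python. It is not WSGI Lite's actual code: `lite_sketch` is a hypothetical stand-in that ignores bindings, the closing extension, and idempotence, and only shows how one function can answer both a WSGI 1 call and a bare `environ` call.

```python
def lite_sketch(func):
    """Hypothetical stand-in for @lite (an assumption, not the real decorator)."""
    def wrapper(environ, start_response=None):
        status, headers, body = func(environ)
        if start_response is None:
            # Lite-style call: hand the triple straight back.
            return status, headers, body
        # WSGI 1 call: report status and headers, then return the body iterable.
        start_response(status, headers)
        return body
    return wrapper


@lite_sketch
def hello(environ):
    return '200 OK', [('Content-Type', 'text/plain')], [b'Hello, world!']


# Called with just an environ, it returns the triple.
triple = hello({})

# Called like a WSGI 1 app, it uses start_response and returns only the body.
seen = []

def start_response(status, headers, exc_info=None):
    seen.append((status, headers))

wsgi_body = hello({}, start_response)
```

The real decorator does considerably more than this, so treat the sketch only as a picture of the calling convention.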

Pretty neat, eh? But the real magic comes in with the second decorator, lighten(). lighten() accepts either a @lite application or a WSGI 1 application, and returns a similarly flexible application object. Just like the output of the @lite decorator, the resulting app object can be called with or without a start_response, and the return protocol it follows will vary accordingly.

This means that you can either pass a @lite app or a standard WSGI app to our latinator() middleware, and it'll work either way. And, you can supply a @lite or lighten()-ed app to any standard WSGI server or middleware, and it'll Just Work.
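The opposite direction, calling a standard WSGI 1 app and getting a triple back, can be pictured roughly as follows. This is a simplified sketch under stated assumptions, not what lighten() really does: `call_wsgi1` is a hypothetical helper that refuses write(), ignores exc_info, and assumes the app calls start_response() before it returns (which a generator-based app would not).

```python
def call_wsgi1(app, environ):
    """Call a WSGI 1 `app` and return (status, headers, body).

    Hypothetical helper for illustration only; see the assumptions above.
    """
    captured = []

    def start_response(status, headers, exc_info=None):
        captured[:] = [status, headers]

        def write(data):
            raise NotImplementedError("write() is not handled in this sketch")
        return write

    body = app(environ, start_response)
    status, headers = captured
    return status, headers, body


def plain_wsgi_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hi']


status, headers, body = call_wsgi1(plain_wsgi_app, {})
```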

For efficiency, both @lite and lighten() are designed to be idempotent: calling them on already-converted applications returns the app you passed in, with no extra wrapping. And, if you call a wrapped application via its native protocol, no protocol conversion takes place - the original app just gets called without any conversion overhead. So, feel free to use both decorators early and often!

One of the subtler edge cases that can arise in writing correct middleware is that when you call another WSGI app, it's allowed to change the environ you pass in.

And what most people don't realize, is that this means it's not safe to pull things out of the environment after you call another WSGI app!

For example, take a look at this middleware example:

def middleware(environ, start_response):
    response = some_app(environ, start_response)
    if environ.get('PATH_INFO','').endswith('foo'):
        # ...  etc.

Think it'll work correctly? Think again. If some_app is a piece of routing middleware, it could already have changed PATH_INFO, or any other environment key. Likewise, if this middleware looks for server extensions like wsgi.file_wrapper or wsgiorg.routing_args, it might end up reading the child application's extensions, rather than those intended for the middleware itself.
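The effect is easy to reproduce with plain WSGI functions. In the sketch below, `routing_app` is a made-up router that shifts a path prefix from PATH_INFO to SCRIPT_NAME, and the two `check_*` functions are demonstration helpers (they return an extra flag, so they are not valid middleware themselves).

```python
def routing_app(environ, start_response):
    # A router typically moves the matched prefix from PATH_INFO to SCRIPT_NAME.
    environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + '/foo'
    environ['PATH_INFO'] = environ['PATH_INFO'][len('/foo'):]
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'routed']


def check_after_call(environ, start_response):
    response = routing_app(environ, start_response)
    # Too late: PATH_INFO has already been rewritten by the child app.
    return response, environ.get('PATH_INFO', '').endswith('foo')


def check_before_call(environ, start_response):
    path = environ.get('PATH_INFO', '')   # read before delegating
    response = routing_app(environ, start_response)
    return response, path.endswith('foo')


def start_response(status, headers, exc_info=None):
    pass


_, late_saw_foo = check_after_call({'PATH_INFO': '/foo'}, start_response)
_, early_saw_foo = check_before_call({'PATH_INFO': '/foo'}, start_response)
```

Here `late_saw_foo` ends up false even though the request path was `/foo`, while `early_saw_foo` is true.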

To help handle these cases, the @lite decorator can bind a function's keyword arguments to values based on the contents of the environ argument:

@lite(path='PATH_INFO', routing='wsgiorg.routing_args')
def middleware(environ, path='', routing=((),{})):
    response = some_app(environ)
    if path.endswith('foo'):
        # ...  etc.

When @lite is called with keyword arguments whose argument names match argument names on the decorated function, it wraps the function in such a way that the matching keys from the environ are passed in as keyword arguments. This automatically ensures that you aren't using possibly-corrupted keys from your child app(s), and lets you specify default values (via your function's argument defaults, as shown above).
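As a rough picture of what this kind of binding involves, here is a small stand-alone decorator that snapshots environ keys into keyword arguments before the function body runs. `bind_env` is a hypothetical name and handles only plain string keys; it is a sketch of the idea, not the library's implementation.

```python
import functools


def bind_env(**rules):
    """Hypothetical decorator: pass environ[key] as keyword `name` when present."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(environ):
            # Read the bound keys up front, before func can call any sub-app
            # that might modify the environ.
            kwargs = {
                name: environ[key]
                for name, key in rules.items()
                if key in environ
            }
            return func(environ, **kwargs)
        return wrapper
    return decorator


@bind_env(path='PATH_INFO', routing='wsgiorg.routing_args')
def show(environ, path='', routing=((), {})):
    return path, routing


with_key = show({'PATH_INFO': '/foo'})
without_key = show({})
```

Missing keys are simply not passed, so the function's own defaults apply.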

As a convenience for frequently used extensions or keys, you can save calls to lite() and give them names, for example:

>>> with_routing = lite(routing='wsgiorg.routing_args')

And the resulting decorator is precisely equivalent to invoking @lite() directly:

>>> @with_routing
... def middleware(environ, routing=((),{})):
...     """Some sort of middleware"""

You can even stack multiple @lite() calls (direct or saved), or give them names, docstrings, and specify what module you defined them in:

>>> with_path = lite(
...     'with_path', "Add a `path` arg for ``PATH_INFO``", "__main__",
...     path='PATH_INFO'
... )

>>> help(with_path)
Help on function with_path in module __main__:
    Add a `path` arg for ``PATH_INFO``

>>> @with_routing
... @with_path
... def middleware(environ, path='', routing=((),{})):
...     """Some combined middleware"""

By the way, the underlying decorator is smart enough to tell when it's being stacked, and automatically merges the wrappings so there's only one level of calling overhead added, no matter how many of them you stack. (As long as they're not intermingled with other decorators, of course!)

Sometimes, an extension may be known under more than one name - for example, an x-wsgiorg. extension vs. a wsgiorg. one, or a similar extension provided by different servers. You could of course bind them to different arguments, but it's generally simpler to just bind a single argument, using a tuple:

>>> @lite(routing=('wsgiorg.routing_args', 'x-wsgiorg.routing_args'))
... def middleware(environ, routing=((),{})):
...     """Some sort of middleware"""

This will check the environment for the named extensions in the order listed, and replace routing with the first one matched.

These argument specifications are called "binding rules", by the way. A rule is either a WSGI native string (i.e. of exactly type str), an object with a __wsgi_bind__ method, a callable object, or an iterable of rules (recursively). Strings are looked up in the environ, and iterables are tried in sequence until a lookup succeeds.

Rules with a __wsgi_bind__ method, on the other hand, are looked up by calling that method with a single positional argument: the environ dictionary. The method must return an iterable (or sequence) yielding zero or more items. (Which means it's usually simplest to implement as a generator.) Rules that don't have a __wsgi_bind__ method, but are callable themselves, are called in the same way. (Which means you don't need to write a class for each rule: functions and methods will also suffice.)
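The four kinds of rule can be summarized in a short resolver. This is a sketch based only on the description in this README; `resolve` and `lookup` are hypothetical names, and the real lookup code may differ in detail. An empty result stands for a failed lookup.

```python
def resolve(rule, environ):
    """Yield candidate values for `rule`; yielding nothing means lookup failed."""
    if type(rule) is str:
        # Native strings are looked up directly in the environ.
        if rule in environ:
            yield environ[rule]
    elif hasattr(rule, '__wsgi_bind__'):
        yield from rule.__wsgi_bind__(environ)
    elif callable(rule):
        yield from rule(environ)
    else:
        # An iterable of rules: try each one in order.
        for subrule in rule:
            yield from resolve(subrule, environ)


def lookup(rule, environ, default=None):
    """Return the first value produced by `rule`, else `default`."""
    return next(resolve(rule, environ), default)


class Request:
    def __init__(self, environ):
        self.environ = environ

    @classmethod
    def __wsgi_bind__(cls, environ):
        yield cls(environ)


def upper_method(environ):
    if 'REQUEST_METHOD' in environ:
        yield environ['REQUEST_METHOD'].upper()


env = {'x-wsgiorg.routing_args': ((), {'id': '7'}), 'REQUEST_METHOD': 'get'}
routing = lookup(('wsgiorg.routing_args', 'x-wsgiorg.routing_args'), env)
method = lookup(upper_method, env)
request = lookup(Request, env)
missing = lookup('no.such.key', env, default='fallback')
```

The `__wsgi_bind__` check comes before the `callable` check because a class is itself callable.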

Whether a rule has a __wsgi_bind__ method or is a callable in its own right, returning an empty sequence or yielding zero items means the lookup failed, and a default value should be used instead (or the next alternative binding rule provided for that keyword argument). Otherwise, the first item yielded is passed in as the matching keyword argument. Here's an example of using a __wsgi_bind__ classmethod, to turn a class into a binding rule:

>>> class MyRequest(object):
...     def __init__(self, environ):
...         self.environ = environ
...     @classmethod
...     def __wsgi_bind__(cls, environ):
...         yield cls(environ)

>>> with_request = lite(request=MyRequest)

Now, @with_request will create a MyRequest instance wrapping the environ of the decorated function, and provide it via the request keyword argument. Or, you can explicitly specify what argument to use, by passing it to @lite(). So, these two examples do the same thing, just using different argument names:

>>> @with_request
... def app1(environ, request):
...     """Just an example"""

>>> @lite(req=MyRequest)
... def app1(environ, req):
...     """Just an example"""

The same approach of creating environment-bound classes can also be used to do things like accessing environment-cached objects, such as sessions or users:

>>> class MySession(object):
...     def __init__(self, environ):
...         self.environ = environ
...     @classmethod
...     def __wsgi_bind__(cls, environ):
...         session = environ.get('myframework.MySession')
...         if session is None:
...             session = environ['myframework.MySession'] = cls(environ)
...         yield session

>>> with_session = lite(session=MySession)

The possibilities are pretty much endless -- and much more in keeping with my original vision for how WSGI was supposed to help dissolve web frameworks into web libraries. (That is, things you can easily mix and match without every piece of code you use having to come from the same place.)

Callables that you use as bindings don't even have to return something from the environment or wrap the environment, by the way - they can just be things that use something from the environment. For example, you could bind parameters to temporary files that will be automatically closed when the request is finished:

>>> def mktemp(environ):
...     closing = environ['wsgi_lite.closing']
...     yield closing(tempfile(etc[...]))

>>> @lite(tmp1=mktemp, tmp2=mktemp, session=MySession)
... def do_something(environ, tmp1, tmp2, session):
...     """Write stuff to tmp1 and tmp2"""

You can even use argument bindings in your binding functions, using the @bind decorator from the wsgi_bindings module:

>>> from wsgi_bindings import bind

>>> @bind(closing = 'wsgi_lite.closing')
... def mktemp(environ, closing):
...     yield closing(tempfile(etc[...]))

@bind() is just like @lite() with keyword arguments (including the ability to save and stack calls), except that it doesn't turn the decorated function into a WSGI-compatible app. (Which is a good thing, since a binding rule is not a WSGI app!)

Now, given the above examples, you might be wondering what all that wsgi_lite.closing stuff is about. Well, that's what we're going to talk about in the next two sections...

So, there's some good news and some bad news about close() and resource cleanups in WSGI Lite.

The good news is, @lite middleware is not required to call a body iterator's close() method. And if your app or middleware doesn't need to do any post-request resource cleanup, or if it just returns a body sequence instead of an iterator or generator, then you don't need to worry about resource cleanup at all. Just write the app or middleware and get on with your life. ;-)

Now, if you are yielding body chunks from your WSGI apps, you might want to consider just not doing that.

That's because, if you don't yield chunks, you can write normal, synchronous code that won't have any of the problems I'm about to introduce you to... problems that your existing WSGI apps already have, but you probably don't know about yet!

(People often object when I say that typical application code should never produce its output incrementally... but the hard problem of proper resource cleanup when doing so, is one of the reasons I'm always saying it.)

Anyway, if you must produce your response in chunks, and you need to release some resources as soon as the response is finished, you need to use the wsgi_lite.closing extension, e.g.:

@lite(closing='wsgi_lite.closing')
def my_app(environ, closing):

    def my_body():
        try:
            # allocate some resources
            yield chunk
        finally:
            # release the resources

    return status, headers, closing(my_body())

This protocol extension (accessed as closing() in the function body above) is used to register an iterator (or other resource) so that its close() method will be called at the end of the request, even if the browser disconnects or a piece of middleware throws away your iterator to use its own instead.

An important note: items registered with closing() are closed in reverse registration order. This means that if the my_body() iterator above is looping over a sub-app's response, then its finally block may be run before any similar finally block in the sub-app. Therefore, your finally block must not close any resources the sub-app might be using!

So, if you are passing any resources down to another WSGI application, be sure to call closing() on them before calling the other application, and then don't close them in your body iterator. Example:

@lite(closing='wsgi_lite.closing')
def my_app(environ, closing):
    environ['some.key'] = closing(some_resource())
    return subapp(environ)

In other words, you should only close resources in your iterator if that's where they were opened, or you are 100% positive they can't be accessed from a sub-app. Otherwise, just call closing() on them as soon as you allocate them.

Don't, however, call closing() on objects that don't belong to your function. If you didn't allocate it, closing it is somebody else's job. In particular, you don't need to call closing() on any WSGI or WSGI Lite response bodies, because lighten() takes care of that for you, and you'll end up double-closing things.

Okay, so that was the bad news. Not that bad, though, is it? You just need to add an extra argument to @lite, pay a little bit of attention to the order of resource closing, and register your own objects (but only your own objects) for closing. That's it!

Really, the rest of this section is all about what will happen if you don't use the extension, or if you try to do resource cleanup in a standard WSGI app without the benefit of WSGI Lite.

As long as you use the extension, your app's resource cleanup will work at least as well as -- and probably much better than! -- it would work under plain WSGI. (And you can make it work even better still if you wrap your entire WSGI stack with a lighten() call... but more on that will have to wait until the end of this section.)

So, just to be clear, the rest of this section is about flaws and weaknesses that exist in standard WSGI's resource management protocol, and what WSGI Lite is doing to work around them.

What flaws and weaknesses? Well, consider the example above. Why does it need the closing() extension? After all, doesn't Python guarantee that the finally block will be executed anyway?

Well, yes and no. First off, if the generator is called but never iterated over, the try block won't execute, and so neither will the finally. So, it depends on what the caller does with the generator. For example, if the browser disconnects before the body is fully generated, the server might just stop iterating over it.
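The first point is ordinary Python generator behavior and can be checked without any WSGI machinery. In this small demonstration (the `body` generator and `log` list are made up for illustration), a generator that is never started has nothing to clean up, while one that has been started runs its finally block when close() is called on it:

```python
log = []


def body(label):
    try:
        log.append(label + ': allocated')
        yield b'chunk'
    finally:
        log.append(label + ': released')


never_started = body('A')
never_started.close()   # the try block never ran, so neither does the finally

started = body('B')
next(started)           # runs up to the first yield
started.close()         # GeneratorExit is raised at the yield; finally runs
```

Afterwards `log` contains only the two entries for `B`. The catch is that somebody has to make that close() call.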

Okay, but won't garbage collection take care of it, then?

Well, yes and no. Eventually, it'll be garbage collected, but in the meantime, your app has a resource leak that might be exploitable to deny service to the app: just start up a resource-using request, then drop the connection over and over until the server runs out of memory or file handles or database cursors or whatever.

Now, under the WSGI standard, middleware and servers are supposed to call close() on a response iterator (if it has one), whenever they stop iterating -- regardless of whether the iteration finished normally, with an error, or due to a browser disconnect.
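For the simplest case, a body wrapper that honors this rule might look like the sketch below. `ClosingBody` and `TrackedBody` are hypothetical names, and the sketch covers only chunk-by-chunk transformation; it says nothing about start_response, write(), or error handling, which is where most of the real complexity lives.

```python
class ClosingBody:
    """Iterate over `inner`, transforming chunks, and forward close() to it."""

    def __init__(self, inner, transform):
        self._inner = inner
        self._iter = iter(inner)
        self._transform = transform

    def __iter__(self):
        return self

    def __next__(self):
        return self._transform(next(self._iter))

    def close(self):
        # Forward close() to the wrapped body, if it has one.
        close = getattr(self._inner, 'close', None)
        if close is not None:
            close()


class TrackedBody:
    """Stand-in response body that records whether close() was called."""

    def __init__(self):
        self.closed = False

    def __iter__(self):
        return iter([b'a', b'b'])

    def close(self):
        self.closed = True


inner = TrackedBody()
wrapped = ClosingBody(inner, bytes.upper)
first = next(wrapped)
wrapped.close()   # e.g. the server stops early after a browser disconnect
```

A class is used rather than a generator so that close() still reaches the inner body even if iteration never started.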

In practice, however, most WSGI middleware is broken and doesn't call close(), because 1) doing so usually makes your middleware code really really complicated, and 2) nobody understands why they need to call close(), because everything appears to work fine without it. (At least, until some black-hat finds your latent denial-of-service bug, anyway.)

So, WSGI Lite works around this by giving you a way to be sure that close() will be called, using a tiny extension of the WSGI protocol that I'll explain in the next section... but only if you care about the details.

Otherwise, just use the wsgi_lite.closing extension if you need resource cleanup in your body iterator, and be happy that you don't need to know anything more. ;-)

Well, actually, you do need to know ONE more thing... If your outermost @lite application is wrapped by any off-the-shelf WSGI middleware, you probably want to wrap the outermost piece of middleware with a lighten() call. This will let WSGI Lite make sure that your close() methods get called, even if the middleware that wraps you is broken.

(Technically speaking, of course, there's no way to be sure you're not being wrapped by middleware, so it's not really a cure-all unless your WSGI server natively supports the extension described in the next section. Hopefully, though, we'll put the extension into a PEP soon and all the popular servers will provide it in a reasonable time period.)

WSGI Lite uses a WSGI server extension called wsgi_lite.closing, that lives in the application's environ variable. The @lite and lighten() decorators automatically add this extension to the environment, if they're called from a WSGI 1 server or middleware, and the key doesn't already exist. (This is why you don't need a default value for the closing argument, by the way: the key will always be available to a @lite app or middleware component, or any sub-app or sub-middleware that inherits the same environment.)

The value for this key is a callback function that takes one argument: an object whose close() method is to be called at the end of the request. For convenience, the passed-in object is returned back to the caller, so you can use it in a way that's reminiscent of with closing(file('foo')) as f:.

Anyway, the idea here is that a server (or middleware component) accepts these registrations, and then closes all the resources (or generators) when the request is finished.

Objects are closed in the reverse order from which they're registered, so that inner apps' resources are released prior to middleware-provided resources being released. (In other words, if an app is using a resource that it received from middleware via its environ, that resource will still be usable during the app's close() processing or finally blocks.)

Objects registered with this extension must have close() methods, and the methods must be idempotent: that is, it must be safe to call them more than once. (That is, calling close() a second time must not raise an error.)

close() methods are explicitly allowed to register additional objects to be closed: such objects are effectively "pushed" onto the stack of objects to be closed, with the last added object being closed first. (Note that this implies that a close() method must not directly or indirectly re-register itself, as this would create an infinite loop of closing calls.)
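The registration rules above fit in a few lines of plain Python. The `Closer` class below is a hypothetical illustration of the protocol as described, not WSGI Lite's own implementation; in particular it simply lets exceptions from close() propagate, since error handling is left undefined.

```python
class Closer:
    """Hypothetical registry illustrating the closing protocol."""

    def __init__(self):
        self._stack = []

    def __call__(self, resource):
        self._stack.append(resource)
        return resource   # returned so the caller can use it inline

    def close(self):
        # Pop until empty, so objects registered from inside a close()
        # method are closed too, most recently registered first.
        while self._stack:
            self._stack.pop().close()


order = []


class Resource:
    def __init__(self, name, on_close=None):
        self.name = name
        self.on_close = on_close

    def close(self):
        order.append(self.name)
        if self.on_close is not None:
            self.on_close()


closing = Closer()
environ = {'wsgi_lite.closing': closing}

outer = closing(Resource('middleware resource'))
inner = closing(Resource('app body', on_close=lambda: closing(Resource('late'))))

closing.close()   # end of request
```

The app's body is closed first, then the object it registered while closing, and the middleware's resource last.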

Currently, the handling of errors raised by close() methods is undefined, in that WSGI Lite doesn't yet handle them. ;-) (When I have some idea of how best to handle this, I'll update this bit of the spec.)

I would like to encourage WSGI server developers to support this extension if they can. While WSGI Lite implements it via middleware (in both the @lite and lighten() decorators), it's best if the WSGI origin server does it, in order to bypass any broken middleware in between the server and the app. (And, if a @lite or lighten() app is invoked from a server or middleware that already implements this extension, it'll make use of the provided implementation, instead of adding its own.)

Now, if for some reason you want to use this extension directly in your code without using a @lite() binding, please remember that the WSGI spec allows called applications to modify the environ. This means that you must retrieve the extension before you pass the environ to another app. (That's why we have keyword binding in @lite(), remember?)

Technically, WSGI Lite is a protocol as well as an implementation. And there's still one more thing to cover (besides the Rack-style calling convention and closing extension) that distinguishes it from standard WSGI.

Applications supporting the "lite" invocation protocol (i.e. being called without a start_response and returning a status/header/body triplet), are identified by a __wsgi_lite__ attribute with a True value. (@lite and lighten() add this for you automatically.)

Any app without the attribute, however, is assumed to be a standard WSGI 1 application, and thus in need of being lighten()-ed before it can be called via the WSGI Lite protocol.

(If you want to check for this attribute, or add it to an object that natively supports WSGI Lite, you can use the wsgi_lite.is_lite() and wsgi_lite.mark_lite() APIs, respectively. But even if you want to, you probably don't need to, because if you call @lite or lighten() on an object that's already "lite", it's returned unchanged. So it's easier to just always call the appropriate decorator, rather than trying to figure out whether to call it. Idempotence == good!)
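The marker convention itself is tiny. The following sketch shows one plausible shape for it; the `*_sketch` names and `ensure_lite` are hypothetical, and the real is_lite() and mark_lite() may differ in detail. `convert` stands for whatever turns a WSGI 1 app into a lite one.

```python
def mark_lite_sketch(app):
    """Flag `app` as supporting the lite protocol and return it."""
    app.__wsgi_lite__ = True
    return app


def is_lite_sketch(app):
    """True if `app` carries a true __wsgi_lite__ attribute."""
    return bool(getattr(app, '__wsgi_lite__', False))


def ensure_lite(app, convert):
    """Idempotent conversion: already-lite apps are returned unchanged."""
    if is_lite_sketch(app):
        return app
    return mark_lite_sketch(convert(app))


def lite_app(environ):
    return '200 OK', [], [b'']

mark_lite_sketch(lite_app)


def wsgi_app(environ, start_response):
    start_response('200 OK', [])
    return [b'']


def convert(app):
    # Placeholder conversion for the demonstration only.
    def converted(environ):
        return '200 OK', [], [b'']
    return converted


same = ensure_lite(lite_app, convert)
wrapped = ensure_lite(wsgi_app, convert)
wrapped_again = ensure_lite(wrapped, convert)
```

Calling `ensure_lite` a second time returns the same object, which is the idempotence the paragraph above relies on.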

Anyway, the rest of the protocol is defined simply as a stripped down WSGI, minus start_response(), write(), and close(), but with the addition of the wsgi_lite.closing key. That's pretty much it.

You knew there had to be a catch, right?

Well, in this case, there are three.

First, if you lighten() a standard WSGI app that uses write() calls instead of using a response iterator, you must have the greenlet library installed, or you'll get an error when write() is called.

Why? Well, it's complicated. But the chances are pretty good that you don't have any code that uses write(), and if you do, well, greenlet works on lots of platforms and Python versions.

Anyway, that's the first limitation. The second limitation is that WSGI Lite cannot work around broken WSGI 1 middleware that lives above your application in the call stack! That is, if your code runs under a middleware component that alters your response, but forgets to make sure your app's response's close() method gets called, then none of the fancy resource closing features in WSGI Lite will work properly.

So, until standard WSGI servers support the wsgi_lite.closing extension, you can (and should) work around this by wrapping your entire WSGI stack with a lighten() call. This way, as long as your server isn't broken, it'll call WSGI Lite's closer, and all will be well with your resource closing.

Third and finally, the lighten() wrapper doesn't support broken WSGI apps that call write() from inside their returned iterators. While some servers allow it, the WSGI specification explicitly forbids it, and to support it in WSGI Lite would force all wrapped WSGI 1 apps to pay in the form of unnecessary greenlet context switches, even if they never used write() at all.

Since the current "word on the street" says that very few WSGI apps use write() at all, I figure it's okay to blow up on the even smaller number that are also spec violators, rather than burden all apps with extra overhead just to support the ill-behaved ones. However, if you feel otherwise, let me know about it via the Web-SIG. (Especially if you have a workable suggestion for how to work around it without making things slower for the apps that don't call write()!)

The @lite decorator supports other kinds of apps besides functions. You can use instance methods, classmethods, callable instances, and even classes as WSGI Lite apps.

For example, with this class:

>>> class Demo(object):
...     @lite
...     def an_app(self, environ):
...         return hello_world(environ)
...     @classmethod
...     @lite
...     def app_factory(cls, environ):
...         return cls().an_app(environ)

both Demo().an_app and Demo.app_factory are WSGI and WSGI Lite applications; either may be called with an environ and an optional start_response:

>>> from wsgi_lite import is_lite

>>> is_lite(Demo.app_factory)
True

>>> is_lite(Demo().an_app)
True

If you want to make a class whose instances are WSGI/Lite apps, however, you can just decorate your class's __call__ method:

>>> class MyInstancesAreApps:
...     @lite
...     def __call__(self, environ):
...         return hello_world(environ)

>>> app = MyInstancesAreApps()
>>> is_lite(app)
True

Note, however, that this makes instances of the class callable as apps. The class itself is not an app:

>>> is_lite(MyInstancesAreApps)
False

So, if you want to make a class that is itself a WSGI/Lite app, you must subclass lite instead, and define an app method:

>>> class ThisIsAnApp(lite):
...     def app(self, environ):
...         return hello_world(environ)

>>> is_lite(ThisIsAnApp)
True

When ThisIsAnApp is used as a WSGI or WSGI Lite app (i.e., when ThisIsAnApp(environ[, optional_start_response]) is called), an instance of the class will be created, and its app() method will be called, with the return value being interpreted as a status, headers, body sequence.

Your app method can optionally be wrapped with @lite to add bindings. And, if you want, you can override __init__(self, environ) to do some setup using the environment, before app is called. (You can even use @bind to add extra arguments to __init__, if you like.)

Earlier, we showed a latinator middleware function that could be used to wrap WSGI or WSGI Lite apps. However, the way that function was written, it would only have been usable with functions, not method definitions.

If you want to write a middleware function that's usable as a decorator with either regular functions or methods, use @lite.wraps as shown here:

>>> def require_authentication(app):
...     @lite.wraps(app, user=User)
...     def wrapper(app, environ, user=None):
...         if user is not None:
...             return app(environ)
...         else:
...             XXX # return a login form response
...     return wrapper

>>> class User(object):
...     @classmethod
...     def __wsgi_bind__(cls, environ):
...         if 'myapp.authenticated_user' in environ:
...             yield environ['myapp.authenticated_user']

>>> @require_authentication
... def my_app(environ):
...     """this code only runs if authenticated"""

The idea in this example is that the @require_authentication decorator can now be used to wrap a function or method definition, in such a way that the decorator doesn't need to know whether it's wrapping a standalone function or some kind of method.

Notice that the wrapper function takes an extra positional argument before the environ. As long as the wrapper uses this argument instead of the object that was passed into @lite.wraps(), then the resulting decorator will work equally well with methods, standalone functions, __call__ methods, etc. (Basically, @lite.wraps gives you access to the same transparent method vs. function support that @lite itself uses.)

@lite.wraps() takes exactly one positional argument: the function or method definition the enclosing decorator will be wrapping. As shown, it also accepts binding arguments as keywords, just like @lite. (This allows our example to ask for an optional User object, whose presence it then checks for.)

If you use any additional binding decorators with your wrapper (like our earlier @with_routing example), they must be placed after @lite.wraps() (i.e., be closer to your def, so that they are invoked before it). Otherwise, they will be applied to the decorated application instead of your middleware wrapper... which is probably not what you want!

By the way, even though the above example decorator wraps a function that obeys the Lite protocol, that is not required for the lite.wraps() decorator to work. Your decorator can pass different arguments into, or expect different results out of, the function it wraps. This can be used to implement wrappers that, say, apply a template to data returned from a function, or turn it into JSON or XML, or maybe any of the above depending upon the request.

(Note that this means the wrapped function is not automatically a WSGI 1 app, so don't pass it to anything that expects one unless you first wrap it with @lite, e.g. by doing @lite.wraps(lite(app)).)

The code in this repository is experi-mental, and possibly very-mental or just plain detri-mental. It has not been seriously used or battle-hardened as yet, even though test coverage is now at 100% (except for a few new and still-experimental features), and there are some fairly exhaustive WSGI compliance tests that exercise many obscure corners of the WSGI protocol.

Ironically enough, however, that may well mean that there is important "WSGI" code out there that won't work with this module yet, precisely because that other code is not compliant with the spec! So, while this project's code should work quite well for compliant code, this doesn't mean it will play well with all the code you're using in all your project(s). Exercise it carefully, and don't assume that because it works great for one of your apps or middleware components, it'll therefore work great with all of them!

In general, though, this is still alpha software, and things may change or break. It might even be that the whole thing was a really stupid idea that won't actually work in the real world for some reason.

So, I've really just thrown this out there for people to see and play with, so I can get some feedback on its actual usability. Feel free to drop me an email via the Web-SIG mailing list, to let me know what you think. Hopefully, we'll soon get any glitches sorted out, and nail this down to something that's less of a moving target, and maybe even turn it into a PEP and a stdlib contribution!

(Oh, and last, but not least... this package is under the Apache license, since that's what the PSF uses for software contributed to Python, and hopefully that's where this is headed, assuming we don't find some sort of glaring hole in the protocol or concept, of course, and it's in sufficiently high demand.)

