dial back the hype about middleware #20

Open
chadwhitacre opened this issue Jan 14, 2015 · 10 comments

Comments

@chadwhitacre

From PEP 3333:

If middleware can be both simple and robust, and WSGI is widely available in servers and frameworks, it allows for the possibility of an entirely new kind of Python web application framework: one consisting of loosely-coupled WSGI middleware components. Indeed, existing framework authors may even choose to refactor their frameworks' existing services to be provided in this way, becoming more like libraries used with WSGI, and less like monolithic frameworks. This would then allow application developers to choose "best-of-breed" components for specific functionality, rather than having to commit to all the pros and cons of a single framework.

I suggest that evolution has not favored this path, and we should no longer pretend that it will.

Pylons was the best attempt at realizing the vision of "an entirely new kind of Python web application framework: one consisting of loosely-coupled WSGI middleware components." Pylons is now in legacy status. I don't believe Pyramid (Pylons' spiritual heir) goes so far. Here's the closest I could find in their docs (cc: @bbangert @mcdonc @tseaver @whitmo):

Pyramid has a sort of internal WSGI-middleware-ish pipeline that can be hooked by arbitrary add-ons named "tweens". The debug toolbar is a "tween", and the pyramid_tm transaction manager is also. Tweens are more useful than WSGI middleware in some circumstances because they run in the context of Pyramid itself, meaning you have access to templates and other renderers, a "real" request object, and other niceties.
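A rough sketch of the tween idea quoted above, written without importing Pyramid so it runs on its own. The `(handler, registry)` factory shape follows Pyramid's documented tween convention; `FakeRequest`, `FakeResponse`, `view_handler`, and the `X-Elapsed` header are hypothetical stand-ins invented for illustration.

```python
import time


class FakeRequest:
    """Hypothetical stand-in for a framework request object."""

    def __init__(self, path):
        self.path = path


class FakeResponse:
    """Hypothetical stand-in for a framework response object."""

    def __init__(self, body):
        self.body = body
        self.headers = {}


def timing_tween_factory(handler, registry):
    """Same shape as a Pyramid tween factory: (handler, registry) -> tween."""

    def timing_tween(request):
        start = time.perf_counter()
        response = handler(request)  # call the next tween or the main handler
        elapsed = time.perf_counter() - start
        response.headers["X-Elapsed"] = "%.6f" % elapsed
        return response

    return timing_tween


def view_handler(request):
    # Plays the role of the framework's own request handling.
    return FakeResponse("hello from " + request.path)


tween = timing_tween_factory(view_handler, registry=None)
response = tween(FakeRequest("/home"))
```

Note that the tween receives a request object and returns a response object, rather than an `environ` and a `start_response` callable. In a real Pyramid application the factory would be registered by dotted name through `config.add_tween(...)` instead of being called by hand.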

Flask (Werkzeug) does make use of middleware, which sit between the server and Flask, and afaict are used for lower-level kludges and glorious hacks like the debugger (eh, @mitsuhiko?).

Sounds like gunicorn and gevent [ab]use middleware to implement websockets.

Django has its own, independent middleware implementation. Natch. ;-)

I started writing this ticket thinking that we should abolish the concept of middleware entirely. Upon reflection, it seems that middleware does have some uses, but afaict it's a minor, secondary part of the Python web development experience, far less important than framework APIs and ecosystems. The vision of "an entirely new kind of Python web application framework" turns out to have been hype, and we should dial it back.

Loosely reticketed from #13 (comment).

@sigmavirus24

So I'm against removing middleware altogether. If you want to reimagine how it works (although I'm not sure there's much that really needs reinvention) that's one thing. I'm currently planning a new project that will rely entirely on being able to hook into WSGI as middleware.

I don't think the argument that Pyramid, Django, etc. implement their own kinds of middleware holds water. WSGI middleware means that other authors (who don't care what application framework you use) can hook into any of those, since they all sit atop WSGI. WSGI middleware will be making my life easier, and that of anyone else like me. I wouldn't be surprised if @GrahamDumpleton has taken advantage of WSGI middleware for similar applications.
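To make the "hook into any of those" point concrete, here is a minimal sketch of one middleware wrapping two unrelated WSGI callables. `RequestCounter`, `ClassBasedApp`, and the `call` helper are hypothetical names made up for this example; only `wsgiref.util.setup_testing_defaults` comes from the standard library.

```python
from wsgiref.util import setup_testing_defaults


class RequestCounter:
    """Counts requests for any WSGI app, knowing nothing about its framework."""

    def __init__(self, app):
        self.app = app
        self.count = 0

    def __call__(self, environ, start_response):
        self.count += 1
        return self.app(environ, start_response)


def plain_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"plain"]


class ClassBasedApp:
    """Stand-in for a framework's WSGI entry point."""

    def __call__(self, environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"framework"]


def call(app):
    """Drive a WSGI callable once and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    seen = []

    def start_response(status, headers, exc_info=None):
        seen.append(status)

    body = b"".join(app(environ, start_response))
    return seen[0], body


wrapped_plain = RequestCounter(plain_app)
wrapped_framework = RequestCounter(ClassBasedApp())
result_plain = call(wrapped_plain)
result_framework = call(wrapped_framework)
```

The middleware depends only on the `(environ, start_response)` calling convention, which is why it works unchanged over both callables.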

@rbtcollins
Contributor

Certainly we can hype it less, but WSGI is all about middleware: it's the basic calling convention.


Robert Collins rbtcollins@hp.com
Distinguished Technologist
HP Converged Cloud

@gvanrossum

Sounds like most frameworks end up designing their own interface for middleware, because then the middleware can use framework facilities. What this loses is the ability to combine middleware that's not tied to a framework -- that seems to be the pipe dream that didn't happen, and the reason is that the environment (without any framework) is too impoverished to accomplish much. Realistically, most web apps may need a handful of things that can easily be turned on or off and those are all easily provided by the framework.

I think this outcome makes sense -- even though we draw architectural diagrams (of any type of system, not just web apps) as "layer cakes", the interfaces between the layers are all different. It may be turtles all the way down, but that doesn't mean the turtles are all the same. :-)

@chadwhitacre
Author

Far be it from me to not draw more layer cakes. :-)

WSGI is all about middleware: it's the basic calling convention.

Yes, it's true on its face that WSGI's calling convention gives rise to middleware, because WSGI is chainable: at least some of the layers of the cake can have the same icing between them. As the terms are used in 3333, a "server" is simply the name for one end of a WSGI chain, and the "framework" or "application" is the name for the other end of the chain. "Middleware" is any layer in between. WSGI experts (towards which the people on this repo skew) do take advantage of WSGI's chainability. For them, the opposite endpoint is properly an "application," and not a "framework." Here's a depiction of the mental model:

[image: chainable]
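The chain described above (server at one end, application at the other, middleware in between) can be sketched as nested callables. This is an illustration only; `make_layer` and the `calls` log are hypothetical helpers, and the lambda stands in for a server's `start_response`.

```python
from wsgiref.util import setup_testing_defaults

calls = []


def application(environ, start_response):
    calls.append("application")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def make_layer(name, app):
    """Return a WSGI callable that wraps `app`; this is all 'middleware' is."""

    def layer(environ, start_response):
        calls.append(name + " in")
        result = app(environ, start_response)
        calls.append(name + " out")
        return result

    return layer


# The outermost layer is the one the server end of the chain sees.
chain = make_layer("outer", make_layer("inner", application))

environ = {}
setup_testing_defaults(environ)
body = b"".join(chain(environ, lambda status, headers, exc_info=None: None))
```

With a real server the same `chain` object would be handed over as the application, for example `wsgiref.simple_server.make_server("", 8000, chain).serve_forever()`.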

However, non-experts—the vast majority of our users—don't have this experience of WSGI. For them, WSGI is some dimly cognized glue that means they can switch their Django app from mod_wsgi to gunicorn to uwsgi to whatever the latest hotness is. Maybe their framework uses WSGI middleware to implement some of its lower-level features (debug toolbar seems to be the best example). However, their primary developer experience is with the framework's own APIs and UIs, not with WSGI.

[image: my-app]

In fact, WSGI's chainability is parallel to HTTP's chainability, as 3333 suggests, so basically there are only three or four kinds of icing:

  • framework APIs
  • WSGI
  • possibly FCGI, Apache APIs, etc.
  • HTTP

"All the way down" is not actually that far.

What's the point?

  • The terms WSGI "server," "middleware," "application," and "framework" are still useful.
  • We should speak modestly of middleware.
  • We should let frameworks do their job.

@gvanrossum

Cool diagrams! I hadn't thought of the HTTP chainability. Which, BTW, has some of the same problems, both PR-wise (only a few people care) and in terms of usability for developers at both extreme ends (client devs and app devs). For example HTTP proxies are known to break(*) things like timeouts, long-polling, compression, caching.

(*) "Break" here typically means "has semantics a typical app/client developer can't distinguish from actual brokenness".


--Guido van Rossum (python.org/~guido)

@bbangert

On Wed, Jan 14, 2015 at 7:49 AM, Guido van Rossum notifications@github.com wrote:

Cool diagrams! I hadn't thought of the HTTP chainability. Which, BTW, has some of the same problems, both PR-wise (only a few people care) and in terms of usability for developers at both extreme ends (client devs and app devs). For example HTTP proxies are known to break(*) things like timeouts, long-polling, compression, caching.

(*) "Break" here typically means "has semantics a typical app/client developer can't distinguish from actual brokenness".

I would also add that both WSGI middleware and HTTP chains present another problem: a poor API, and the inefficiency of constantly parsing the request/response at each layer (and, for HTTP proxies, latency between machines).


Part of the reason for moving away from WSGI middleware, and towards internal 'middleware'-like layers inside the frameworks, was efficiency and API. Writing WSGI middleware that manipulates a response is pretty crappy: you gotta sub in replacement write_response functions and re-parse out headers, etc. And if the WSGI middleware decides in the end to do nothing, there's still all the overhead of parsing the response/request to the point that it could be determined that nothing was done.
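For readers who have not written response-rewriting middleware, a small sketch of what that involves. It assumes the replacement function meant here is the PEP 3333 `start_response` callable; `replace_server_header`, `fake_start_response`, and the header values are hypothetical names for this example.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Server", "inner")])
    return [b"hello"]


def replace_server_header(app, value):
    """Middleware that overrides the Server response header."""

    def middleware(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Headers arrive as a list of (name, value) tuples, so every
            # layer that cares has to scan and rebuild that list itself.
            kept = [(k, v) for k, v in headers if k.lower() != "server"]
            kept.append(("Server", value))
            return start_response(status, kept, exc_info)

        return app(environ, patched_start_response)

    return middleware


captured = {}


def fake_start_response(status, headers, exc_info=None):
    # Plays the server's role and records what reached it.
    captured["status"] = status
    captured["headers"] = headers
    return lambda data: None  # the legacy write() callable


environ = {}
setup_testing_defaults(environ)
wrapped = replace_server_header(application, "outer")
body = b"".join(wrapped(environ, fake_start_response))
```

Even this one-header change needs a nested closure and a pass over the header list, and the scan happens whether or not the header was present.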

Now, some of this can be alleviated. WebOb stores parsed request bits in the environ, so that later layers using WebOb don't repeat the parsing steps... but now there's the implied assumption that all layers use WebOb, which is more like a framework-specific request/response.
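The caching trick can be sketched in a few lines. This is not WebOb's code: `CACHE_KEY`, `get_query`, `require_name`, and the `stats` counter are hypothetical, and the point is only that both layers must agree on the same key and helper for the saving to apply.

```python
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults

CACHE_KEY = "example.parsed_query"  # hypothetical key, not a WebOb or WSGI name
stats = {"parses": 0}


def get_query(environ):
    """Parse QUERY_STRING once and stash the result in the environ."""
    if CACHE_KEY not in environ:
        stats["parses"] += 1
        environ[CACHE_KEY] = parse_qs(environ.get("QUERY_STRING", ""))
    return environ[CACHE_KEY]


def application(environ, start_response):
    name = get_query(environ).get("name", ["world"])[0]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [("hello " + name).encode("utf-8")]


def require_name(app):
    """Middleware that also needs the parsed query string."""

    def middleware(environ, start_response):
        if "name" not in get_query(environ):
            start_response("400 Bad Request", [("Content-Type", "text/plain")])
            return [b"missing name"]
        return app(environ, start_response)

    return middleware


environ = {"QUERY_STRING": "name=Ben"}
setup_testing_defaults(environ)
wrapped = require_name(application)
body = b"".join(wrapped(environ, lambda status, headers, exc_info=None: None))
```

A layer that parses the query string its own way gets no benefit from the cached value, which is the implied shared-library assumption mentioned above.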

I think WSGI middleware would've gained/kept its traction if a request/response object had been standardized, and we did try to come to an agreement on such a thing in the past between Pylons, Django, and Werkzeug... but it didn't work out.


However, non-experts (the vast majority of our users) don't have this experience of WSGI. For them, WSGI is some vaguely understood glue that means they can switch their Django app from mod_wsgi to gunicorn to uwsgi to whatever the latest hotness is. Maybe their framework uses WSGI middleware to implement some of its lower-level features (debug toolbar seems to be the best example). However, their primary developer experience is with the framework's own APIs, not with WSGI.

This was the real win for WSGI: switching the HTTP serving layer easily. The bits about middleware can probably be dropped, since that just didn't work out.


  • The terms "server," "framework," "middleware," and "application" are all useful in the context of WSGI.
  • We should speak modestly of middleware.

Yup, or maybe note that it had some promise but wasn't really usable without a standardized request/response object API (I believe Rack, Ruby's version of WSGI, does have a standard request/response object, so perhaps their middleware layers will see more use at some point).

  • We should let frameworks do their job.

Cheers,
Ben

@chadwhitacre
Author

WSGI is all about middleware: it's the basic calling convention.

Now that I think about it, isn't WSGI's chainability just a basic property of any function call stack? Wrapping calls to extend functionality isn't unique to WSGI, it's yer bog standard call chain:

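def wobble(bar):
    pass  # placeholder: pre-processing

def frobble(baz):
    pass  # placeholder: post-processing

def foo(bar):
    # The wrapped callable: does the real work.
    baz = "handled " + bar
    return baz

def my_foo(bar):
    # The wrapper: same signature, extra behavior before and after.
    wobble(bar)
    baz = foo(bar)
    frobble(baz)
    return baz

app = my_foo
# Pseudocode: hand the outermost callable to whatever server you use.
# make_server(app, etc).serve()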

If this is true, it means that if we do drop all mention of middleware from WSGI/2, that wouldn't hinder anyone who did want to continue writing functions designed to wrap other WSGI functions. Would it?

A "middleware" ecosystem could be done as a separate specification from WSGI/2 itself. It would be a minimal spec: I find only one hard constraint on middleware in 3333: that they don't block unnecessarily. The rest of the references either: introduce the concept of middleware; also apply to servers and applications; or have to do with complications introduced by middleware vis-a-vis server extensions.
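A sketch of what "don't block unnecessarily" looks like in practice, on the assumption that the constraint meant here is the PEP 3333 rule about passing each block through as it arrives rather than buffering. `streaming_app`, `upper_streaming`, and the `events` log are hypothetical names for this example.

```python
events = []


def streaming_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    for word in (b"one", b"two"):
        events.append("app yields " + word.decode())
        yield word


def upper_streaming(app):
    """Middleware that transforms the body one block at a time."""

    def middleware(environ, start_response):
        result = app(environ, start_response)
        try:
            for chunk in result:
                # Pass each block on as soon as it arrives; no buffering.
                yield chunk.upper()
        finally:
            # Give the wrapped iterable a chance to release resources.
            close = getattr(result, "close", None)
            if close is not None:
                close()

    return middleware


received = []
wrapped = upper_streaming(streaming_app)
for block in wrapped({}, lambda status, headers, exc_info=None: None):
    events.append("server got " + block.decode())
    received.append(block)
```

The `events` log interleaves producer and consumer, which is the behavior the constraint is trying to protect. A middleware that joined the whole body before returning would break it.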

However, what we've learned from over a decade of experience is that the "middleware" abstraction is nowhere near as fruitful as the "server" and "framework" abstractions. Therefore, we should optimize WSGI/2 for server and framework ecosystems, not a middleware ecosystem. Encouraged by @bbangert's input, I'm swinging back towards suggesting that we drop middleware from WSGI/2.

@rbtcollins @sigmavirus24 You're +1 on middleware. How would dropping middleware from WSGI/2 prevent you from still building a middleware ecosystem?

@rbtcollins
Contributor

So I don't think it's as simple as 'dropping' middleware. As you note there are few constraints on middleware today, but the parallel draft pep on escaping out of WSGI/1 (see the wsgi list archives) notes a bunch of additional stuff. And there are niggles like: given a chain A->B->C (all WSGI components), if B modifies a header in the request (e.g. normalising some header for some reason), should it modify the headers struct, or should it leave the original intact and pass a new one with the updated values?
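The A->B->C niggle can be shown with the two conventions side by side. This sketch uses the WSGI/1 `environ` (where request headers live as `HTTP_*` keys) as a stand-in for the "headers struct"; `normalise_in_place`, `normalise_on_copy`, and `run` are hypothetical names.

```python
def normalise_in_place(app):
    """Convention 1: B mutates the shared environ before calling C."""

    def middleware(environ, start_response):
        value = environ.get("HTTP_ACCEPT_LANGUAGE", "")
        environ["HTTP_ACCEPT_LANGUAGE"] = value.lower()
        return app(environ, start_response)

    return middleware


def normalise_on_copy(app):
    """Convention 2: B leaves A's environ intact and passes C a copy."""

    def middleware(environ, start_response):
        downstream = dict(environ)  # shallow copy
        value = environ.get("HTTP_ACCEPT_LANGUAGE", "")
        downstream["HTTP_ACCEPT_LANGUAGE"] = value.lower()
        return app(downstream, start_response)

    return middleware


def application(environ, start_response):
    # Plays the role of C: echoes the header it was given.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["HTTP_ACCEPT_LANGUAGE"].encode("ascii")]


def run(factory):
    """Play the role of A: return what C saw and what A sees afterwards."""
    environ = {"HTTP_ACCEPT_LANGUAGE": "EN-GB"}
    wrapped = factory(application)
    body = b"".join(wrapped(environ, lambda status, headers, exc_info=None: None))
    return body, environ["HTTP_ACCEPT_LANGUAGE"]


in_place = run(normalise_in_place)
on_copy = run(normalise_on_copy)
```

C sees the same value either way; the difference is whether A's view of the request changes behind its back, which is the kind of behavior a spec would need to pin down.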

In the absence of guidance, we can be sure that folk will do whatever makes sense to them at the time - and then we have 4 billion (exaggerated :)) different calling conventions.

I'm not sure that I agree that the middleware abstraction hasn't been useful - it has been - see e.g. Beaker and Werkzeug - but it's been problematic to work with, precisely because /only/ a protocol was supplied - the stdlib didn't come with any batteries for efficiently and easily writing middleware.

Further, the protocol was, as noted, inefficient. At the time, it was massively more efficient than CGI, which was the competitor, but today where we are assessing response times in ms (or less!) we have a very different willingness to tolerate such fat (even if most typical sites still wouldn't manage to show up middleware on a profiler, enough do that it becomes a key consideration).

IMO to grow an ecosystem one needs a certain amount of stability, clarity that the ecosystem is desired and will be supported, and a good place to glue it in. In the absence of all those things the barrier to entry will be too high, and the experience of folk that do attempt to work in the ecosystem too poor. Consider the nose ecosystem vs the unittest ecosystem. They are both roughly the same in terms of extensibility at a purely technical level, but the other factors overwhelm that.

So, the question for me is not 'what would prevent X', but 'do we want to encourage X still' ?

And - I think we do. Good clean middleware is much nicer to read than code embedded in big frameworks IME. Secondly, HTTP/2 is /much/ more extensible than HTTP/1 was, and it's my feeling that ensuring that extensibility is available in WSGI will permit implementations of [many of] those extensions behind the server, rather than forcing them to be in-server. And that matters because revving deployments of apache2 etc is substantially slower than deploying new app code.

So, I'd like to keep encouraging middleware - I don't think it should be hyped though. Further, I think we need to fix the efficiency issues and where possible make sure the calling conventions are such that there is nothing driving middleware into frameworks: if our expectation is that all middleware will end up in frameworks, there would be no point. My suggestion to use a dict for headers, for instance, is aimed at increasing efficiency for consumers of WSGI itself.

@chadwhitacre
Author

the parallel draft pep on escaping out of WSGI/1 (see the wsgi list archives)

Is this the right thread?

https://mail.python.org/pipermail/web-sig/2014-October/005320.html

@sigmavirus24

So, I'd like to keep encouraging middleware - I don't think it should be hyped though. Further, I think we need to fix the efficiency issues and where possible make sure the calling conventions are such that there is nothing driving middleware into frameworks: if our expectation is that all middleware will end up in frameworks, there would be no point.

I agree. And standardizing the middleware API and conventions would be a real improvement over the current state of affairs.

My suggestion to use a dict for headers, for instance, is aimed at increasing efficiency for consumers of WSGI itself.

Please let's stop using over-simplified abstractions that only serve to make this harder for consumers. Using dictionaries for request or response headers is unacceptable. While it's true that HTTP/1.1 allows you to join values for repeated header names with commas, you can't do that for cookies (specifically Set-Cookie), which cannot actually be joined like that and which typically need to return the same header name with different values more than once. We definitely need an efficient MultiDict implementation to stay true to HTTP and the RFCs. It may be simpler for users to just get a dictionary, but it's not accurate or exceedingly convenient for anything other than the simplest of cases.
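The Set-Cookie problem is easy to demonstrate with the standard library. This sketch uses `wsgiref.headers.Headers` as one existing multi-valued container; it is not the MultiDict proposed above, and the cookie values are made up for the example.

```python
from wsgiref.headers import Headers

cookies = [("Set-Cookie", "session=abc; Path=/"),
           ("Set-Cookie", "theme=dark; Path=/")]

# A plain dict keeps only the last value for a repeated name.
as_dict = dict(cookies)

# wsgiref's Headers wraps a list of (name, value) tuples, so repeats survive.
multi = Headers([])
for name, value in cookies:
    multi.add_header(name, value)

dict_values = [as_dict["Set-Cookie"]]
multi_values = multi.get_all("Set-Cookie")
```

With the dict, the session cookie is silently lost. Joining the two values with a comma would not help either, for the reason given above.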


5 participants