NAME

PSGI::FAQ - Frequently Asked Questions and answers

QUESTIONS

General

How do you pronounce PSGI?

We read it simply P-S-G-I, but you may be able to pronounce it "sky" :)

So what is this?

PSGI is a protocol interface between web servers and Perl-based web applications, akin to what CGI is for web servers and CGI scripts.

Why do we need this?

Perl has CGI as a core module that somewhat abstracts the differences between CGI, mod_perl and FastCGI. However, most web application framework developers (e.g. those of Catalyst and Jifty) avoid using it, both to maximize performance and to access low-level APIs. So they end up writing adapters for all of those different environments, some of which may be well tested while others are not.

PSGI allows web application framework developers to write an adapter only for PSGI, and end users can choose whichever backend supports the PSGI protocol.

You said PSGI is similar to CGI. How is PSGI protocol different from CGI?

The PSGI protocol is intentionally designed to be very similar to CGI, so that supporting PSGI in addition to CGI is extremely easy. Here are the key differences between CGI and PSGI:

  • In CGI, servers are the actual web servers, written in any language but mostly in C, and the script can be written in any language such as C, Perl, shell, Ruby or Python.

    In PSGI, servers are still web servers, but they are Perl processes that are usually embedded in the web server (like mod_perl), a Perl daemon process called by the web server (like FastCGI), or an entirely Perl-based web server.

  • In CGI, we use STDIN, STDERR and environment variables to read parameters and the HTTP request body, and to send errors from the application.

    In PSGI, we use the $env hash reference and the psgi.input and psgi.errors streams to pass those between servers and applications.

  • In CGI, applications are supposed to print HTTP headers and body to STDOUT to pass it back to the web server.

    In PSGI, applications are supposed to return the HTTP status code, headers and body (as an array ref or a filehandle-like object) to the server as an array reference.
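
The differences above can be sketched with a minimal application. This is an illustrative example rather than part of the specification: the $env values are made up, and a real server would supply many more keys.

```perl
use strict;
use warnings;

# A PSGI application is a code reference that takes the environment
# hash reference and returns [ status, headers, body ].
my $app = sub {
    my $env  = shift;
    my $body = "Hello from $env->{PATH_INFO}\n";
    return [
        200,
        [ 'Content-Type' => 'text/plain', 'Content-Length' => length $body ],
        [ $body ],
    ];
};

# Stand-in for what a server would do: build $env and call the application.
# These values are illustrative only.
my $env = {
    REQUEST_METHOD => 'GET',
    PATH_INFO      => '/hello',
    SERVER_NAME    => 'localhost',
    SERVER_PORT    => 5000,
};

my $res = $app->($env);
my ($status, $headers, $content) = @$res;
print "$status\n", join('', @$content);
```

Nothing is printed to STDOUT by the application itself; the caller decides what to do with the returned array reference.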

My framework already does CGI, FCGI and mod_perl. Why do I want to support PSGI?

If your web application framework already supports most server environments, and they perform really well and are well tested, there may be no direct benefit today in supporting PSGI immediately. But if a CGI environment is already supported, supporting PSGI in addition should be extremely easy, and the benefit to you and your framework's users is huge.

I'm writing a web application. What's the benefit of PSGI for me?

If the framework you're using supports PSGI, your application can run on any existing or future PSGI implementation. Also, if you use the Plack standard API, the end users of your application should be able to configure and run it in a number of different ways.

What should I do to support PSGI?

If you're a web server developer, write a PSGI implementation that calls a PSGI application. Or join the development on Plack, the reference implementation of PSGI, to add backends for more web servers.
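
As a rough idea of what the server side involves, here is a minimal sketch of a backend's core loop for one request. The helper name run_psgi_app and the tiny reason-phrase table are assumptions made for this example, and the sketch only handles array ref bodies; a real backend has to build a complete $env and cope with filehandle-like bodies too.

```perl
use strict;
use warnings;

# Hypothetical helper: call a PSGI application for one request and write
# an HTTP response to $out. Only array ref bodies are handled here.
my %reason = (200 => 'OK', 404 => 'Not Found', 500 => 'Internal Server Error');

sub run_psgi_app {
    my ($app, $env, $out) = @_;
    my ($status, $headers, $body) = @{ $app->($env) };

    my $phrase = exists $reason{$status} ? $reason{$status} : '';
    print {$out} "HTTP/1.0 $status $phrase\r\n";

    # Headers are a flat list of key/value pairs, so walk them two at a time.
    for (my $i = 0; $i < @$headers; $i += 2) {
        print {$out} "$headers->[$i]: $headers->[$i + 1]\r\n";
    }
    print {$out} "\r\n";
    print {$out} $_ for @$body;
}

# An empty in-memory request body stands in for the client's input.
open my $input, '<', \(my $empty = '') or die $!;
my $env = {
    REQUEST_METHOD    => 'GET',
    PATH_INFO         => '/',
    'psgi.input'      => $input,
    'psgi.errors'     => \*STDERR,
    'psgi.url_scheme' => 'http',
};

my $app = sub { [ 200, [ 'Content-Type' => 'text/plain' ], [ "ok\n" ] ] };

# Capture the response in a string instead of a socket.
my $response = '';
open my $out, '>', \$response or die $!;
run_psgi_app($app, $env, $out);
close $out;
print $response;
```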

If you're a web application framework developer, write an adapter for PSGI. Now you're freed from supporting all the different server environments.

If you're a web application developer (or a web application framework user), choose a framework that supports PSGI, or ask its author to support it. :)

See PSGI::Guide for how to write PSGI applications, backends and adapters.

Is PSGI faster than (my framework)?

PSGI is a specification, not an implementation, so it has no speed of its own. But there is potential for very fast PSGI implementations: a preforked standalone server that preloads everything and runs fully optimized code with XS parsers and the sendfile(2) kernel call; a tiny event-based web server written in C with embedded perl that supports PSGI; or a plain old CGI.pm-based backend that doesn't load any modules at all and runs pretty quickly without eating much memory under the CGI environment.

The reference implementation, Plack, already has very fast backends like Standalone::Prefork and Coro.

Users of your framework can choose whichever backend best fits their needs. You, as a web application framework developer, don't need to cater to many different user bases with different needs.

Plack

What is Plack? What is the difference between PSGI and Plack?

PSGI is a specification, so there is no software or module called PSGI. End users need to choose one of the PSGI implementations to run PSGI applications on. Plack is a reference PSGI implementation that supports environments like a prefork standalone server, CGI, FastCGI, mod_perl, AnyEvent and Coro.

Plack also has useful APIs and helpers on top of PSGI, such as Plack::Request, which provides a nice object-oriented API on request objects, and plackup, which lets you run a PSGI application from the command line and configure it using app.psgi (a la Rack's Rackup).

What kind of server backends would be available?

In Plack, we already support most web servers, like Apache2, as well as those that support standard CGI or FastCGI. We also want to support special web servers that can embed perl, like nginx. We think it would be really nice if the Apache module mod_perlite and Google AppEngine supported PSGI too, so you could run your PSGI/Plack-based Perl app in the cloud.

Ruby is Rack and JavaScript is Jack. Why is it not called Pack?

Well, Pack is indeed a cute name, but Perl has a built-in function called pack, so it would be a little confusing, especially in speech.

What namespaces should I use to implement PSGI support?

Do not use the PSGI:: namespace to implement PSGI backends or adapters.

The PSGI namespace is reserved for PSGI specifications and reference unit tests that implementors have to pass. It should not be used by particular implementations.

If you write a plugin or an extension to support PSGI for an (imaginary) web application framework called Camper, name the code something like Camper::Engine::PSGI. If you write a PSGI backend that runs on an imaginary pure perl web server called mightyd, name it something like Mightyd::Handler::PSGI, or consider contributing the code to the Plack project as Plack::Server::Mightyd (by doing so you don't need to worry about how to run/load PSGI applications).
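
To make the naming advice concrete, here is a sketch of what such an adapter might look like. Camper is the imaginary framework from the paragraph above, so its new and dispatch methods, the to_app method name, and the shape of the dispatch result are all invented for this example.

```perl
use strict;
use warnings;

# Stand-in for the imaginary Camper framework; its API here is invented.
package Camper;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub dispatch {
    my ($self, $method, $path) = @_;
    return { code => 200, type => 'text/plain', content => "$method $path\n" };
}

# The adapter lives in the framework's own namespace, not under PSGI::.
package Camper::Engine::PSGI;
sub to_app {
    my ($class, $camper) = @_;
    # Translate $env into a framework call, and the framework's result
    # back into a PSGI response.
    return sub {
        my $env = shift;
        my $r = $camper->dispatch($env->{REQUEST_METHOD}, $env->{PATH_INFO});
        return [ $r->{code}, [ 'Content-Type' => $r->{type} ], [ $r->{content} ] ];
    };
}

package main;
my $app = Camper::Engine::PSGI->to_app(Camper->new);
my $res = $app->({ REQUEST_METHOD => 'GET', PATH_INFO => '/tents' });
print $res->[2][0];
```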

HTTP::Engine

Why PSGI/Plack instead of HTTP::Engine?

HTTP::Engine was a great experiment, but it mixed the specification with implementations, and its monolithic class hierarchy and role-based interfaces make it really hard to write a new backend. We kept the existing HTTP::Engine and broke it into three parts: The Spec (PSGI), Reference implementations (Plack::Server) and Standard APIs and Tools (Plack).

Will HTTP::Engine be dead?

It won't be dead. HTTP::Engine will become a very thin API on top of Plack. Your application written in HTTP::Engine should continue to work using HTTP::Engine::Interface::PSGI.

HTTP::Engine would still be useful if you want to quickly write a micro web server application, instead of writing a web application framework.

API Design

Keep in mind that most design choices made in the PSGI spec are meant to minimize the requirements on backends so they can optimize things. Adding a fancy interface or allowing flexibility in the PSGI layer might sound appealing to end users, but it would just add things that backends have to support, which would end up getting in the way of optimizations or introducing more bugs. What provides a fancy API that attracts web application developers is your framework, not PSGI.

Why a big env hash instead of objects with APIs?

The simplicity of the protocol is the key that made WSGI and Rack successful. PSGI is a low-level protocol between backends and web application framework developers. If we define an API on what type of objects should be passed and which method they need to implement, there will be so much duplicated code in the backends, some of which may be buggy.

For instance, PSGI defines $env->{REMOTE_ADDR} as a string. What if the PSGI spec required it to be an instance of Net::IP? Backend code would have to depend on the Net::IP module, or have to write a mock object that implements ALL of Net::IP's methods. Backends depending on specific modules or having to reinvent lots of stuff is considered harmful and that's why the protocol is as minimal as possible.

Making a nice API for the end users is a job that web application frameworks (adapter developers) should do, not something PSGI needs to define.
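
As an illustration of that division of labor, a framework can wrap the plain $env hash in whatever object API it likes. The Camper::Request class below is hypothetical (it is not part of PSGI or Plack) and only shows that the nice API can live entirely on the framework side.

```perl
use strict;
use warnings;

# Hypothetical framework-side request class built on top of plain $env.
package Camper::Request;
sub new     { my ($class, $env) = @_; return bless { env => $env }, $class }
sub address { $_[0]{env}{REMOTE_ADDR} }      # still just a string
sub method  { $_[0]{env}{REQUEST_METHOD} }
sub is_post { $_[0]->method eq 'POST' }

package main;
# The backend only had to supply strings in a hash; no extra modules needed.
my $req = Camper::Request->new({
    REMOTE_ADDR    => '127.0.0.1',
    REQUEST_METHOD => 'POST',
});
print $req->address, "\n" if $req->is_post;
```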

Why is the application a code ref rather than an object with a ->call method?

Requiring an object in addition to a code ref would make ALL backends' code a few lines more tedious, while requiring an object instead of a code ref would make application developers write another class and instantiate an object.

Requiring that an application ALWAYS be an object instead of a code ref would also work, but then application developers would always have to define a class and instantiate the object. The benefit of requiring an object instead of a code ref would be that it gives middleware developers a chance to call methods on the application other than the standard call method. However, that can also be done by setting values or callbacks in the environment hash reference.

That still leaves the question of how the application should be initialized, and by whom.

This is a solved problem in Ruby's Rack and Python's WSGI, and we'll clone that: plackup takes an app.psgi file where users can preload their framework code, set up anything they need, add middleware, and then return a code reference that is a PSGI application.
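
If you do prefer an object, turning it into a PSGI application takes a one-line closure. The Greeter class below is a hypothetical example, not something defined by PSGI or Plack.

```perl
use strict;
use warnings;

package Greeter;   # hypothetical application class
sub new { my ($class, $greeting) = @_; return bless { greeting => $greeting }, $class }
sub call {
    my ($self, $env) = @_;
    return [ 200, [ 'Content-Type' => 'text/plain' ], [ "$self->{greeting}\n" ] ];
}

package main;
my $object = Greeter->new('Hello');

# An object-based application becomes a code ref with a small closure.
my $app = sub { $object->call(@_) };

# A plain code ref application needs no class at all.
my $plain = sub { [ 200, [ 'Content-Type' => 'text/plain' ], [ "Hello\n" ] ] };

# A backend calls both in exactly the same way.
my @bodies = map { my $res = $_->({}); $res->[2][0] } ($app, $plain);
print @bodies;
```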
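
A sketch of what such an app.psgi file could contain is shown below. It uses only core Perl; the middleware is a hand-rolled closure and the camper.user environment key is made up for the example, so treat this as an outline of the idea rather than the Plack way of doing it.

```perl
# app.psgi (sketch)
use strict;
use warnings;

# 1. Preload framework code and do any setup here.

# 2. Define the application.
my $app = sub {
    my $env = shift;
    return [ 200, [ 'Content-Type' => 'text/plain' ],
             [ "Hello, $env->{'camper.user'}\n" ] ];
};

# 3. A middleware is a function that takes an application and returns a
#    new one. This one passes a value along in the environment hash.
my $with_user = sub {
    my $inner = shift;
    return sub {
        my $env = shift;
        $env->{'camper.user'} = $env->{REMOTE_USER} || 'guest';
        return $inner->($env);
    };
};

# 4. The value of the last statement, the wrapped code reference, is what
#    the loader of this file receives.
$app = $with_user->($app);
```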

Why should the headers be returned as an array ref and not a hash ref?

Short: In order to support multiple headers (e.g. Set-Cookie).

Long: In Python WSGI, the response header is a list of (header_name, header_value) tuples i.e. type(response_headers) is ListType so there can be multiple entries for the same header key. In Rack and JSGI, header value is a String consisting of lines separated by "\n".

We liked Python's specification here, and since Perl hashes don't allow multiple entries with the same key (unless it's tied), using an array reference to store [ key => value, key => value ] is the simplest solution to keep both framework adapters and backends simple. Other options, like allowing an array ref in addition to a plain scalar, make either side of the code unnecessarily tedious.

Note that I'm talking about multiple header lines with the same key, and NOT about multiple header values (e.g. Accept: text/html,text/plain,*/*). Joining the header values with , is obviously the application's job. HTTP::Headers does exactly that when it's passed an array reference as a header value, for instance.

The other option is to always require the application to set the value as an array ref, even if there is only one entry: this would make backends' code less tedious, but, for the exact reason of multiple header values vs. multiple header lines with the same name mentioned in the previous paragraph, I think it's confusing.
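
The difference is easy to see in a few lines of Perl. The cookie values below are made up for illustration.

```perl
use strict;
use warnings;

# Headers as a flat array reference: the same key may appear twice.
my $headers = [
    'Content-Type' => 'text/html',
    'Set-Cookie'   => 'session=abc123',
    'Set-Cookie'   => 'theme=dark',
];

# A backend walks the list two items at a time and emits one line per pair.
my @lines;
for (my $i = 0; $i < @$headers; $i += 2) {
    push @lines, "$headers->[$i]: $headers->[$i + 1]";
}
print "$_\n" for @lines;

# Putting the same list into a hash silently drops the first Set-Cookie.
my %as_hash = @$headers;
print scalar(keys %as_hash), " keys survive in a hash\n";
```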

No iterator support in $body?

We learned that WSGI and Rack really benefit from the beauty of their host languages: iterable objects in Python and iterators in Ruby.

Rack, for instance, expects the body as an object that responds to the each method and then yields the buffer, so

body.each { |buf| request.write(buf) }

would just magically work whether body is an Array, a FileIO object or an object that implements iterators. Perl doesn't have such a beautiful thing in the language unless autobox is loaded, and PSGI should not make autobox a requirement, hence we only support a simple array ref or file handles.

Writing an IO::Handle-like object is pretty easy since it only needs getline and close. You can also use PerlIO to write such an object that behaves like a filehandle, even though it might be considered a little unstable.
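
Here is a minimal sketch of such an object, together with the loop a backend might use to consume either kind of body. The ChunkBody class and the write_body helper are hypothetical names used only for this example.

```perl
use strict;
use warnings;

# Hypothetical IO::Handle-like body: only getline and close are needed.
package ChunkBody;
sub new {
    my ($class, @chunks) = @_;
    return bless { chunks => [@chunks], closed => 0 }, $class;
}
sub getline { my $self = shift; return shift @{ $self->{chunks} } }  # undef when exhausted
sub close   { my $self = shift; $self->{closed} = 1; return 1 }

package main;

# Hypothetical backend helper: append the body to a string buffer,
# whether it is an array ref or a filehandle-like object.
sub write_body {
    my ($body, $buffer) = @_;
    if (ref $body eq 'ARRAY') {
        $$buffer .= $_ for @$body;
        return;
    }
    while (defined(my $chunk = $body->getline)) {
        $$buffer .= $chunk;
    }
    $body->close;
}

my $stream = ChunkBody->new("first\n", "second\n");
my ($from_array, $from_object) = ('', '');
write_body([ "first\n", "second\n" ], \$from_array);
write_body($stream, \$from_object);
print $from_object;
```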

What if I want to stream content or do a long-poll Comet?

For server push, your application can return an IO::Handle-like object as a content body that implements getline to return pushed content. Some implementations are also allowed to do an optimized non-blocking read from special filehandle-like objects like IO::Writer.

For long-poll Comet, you can also use coroutine-based implementations like Plack::Server::Coro and do your own threading.

Why CGI-style environment variables instead of HTTP headers as a hash?

Most existing web application frameworks already have code or a handler to run under the CGI environment. Using CGI-style hash keys instead of HTTP headers makes it easy for framework developers to implement an adapter to support PSGI. For instance, Catalyst::Engine::PSGI differs from Catalyst::Engine::CGI in only a few dozen lines and was written in less than an hour.
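
For reference, the CGI naming convention that backends follow when filling $env from request headers can be sketched as below. The function name header_to_env_key is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: map an HTTP request header name to its CGI-style
# environment key. Content-Type and Content-Length get no HTTP_ prefix.
sub header_to_env_key {
    my $name = shift;
    (my $key = uc $name) =~ tr/-/_/;
    return $key if $key eq 'CONTENT_TYPE' || $key eq 'CONTENT_LENGTH';
    return "HTTP_$key";
}

my %env;
my @request_headers = (
    'User-Agent'   => 'ExampleBrowser/1.0',
    'Content-Type' => 'text/plain',
);
while (my ($name, $value) = splice @request_headers, 0, 2) {
    $env{ header_to_env_key($name) } = $value;
}
print "$_ => $env{$_}\n" for sort keys %env;
```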

SEE ALSO

WSGI's FAQ clearly answers lots of questions about how some API design decisions were made, some of which can directly apply to PSGI.

http://www.python.org/dev/peps/pep-0333/#questions-and-answers

MORE QUESTIONS?

If you have a question that is not answered here, or there are things you totally disagree with, come join the IRC channel #http-engine on irc.perl.org or the mailing list at http://groups.google.com/group/psgi-plack. Be sure to clarify which hat you're wearing: application developer, server implementor or middleware developer. And don't criticize the spec just to criticize it: show your exact code that doesn't work or gets too messy because of spec restrictions etc. We'll ignore all nitpicks and bikeshed discussion.

AUTHOR

Tatsuhiko Miyagawa <miyagawa@bulknews.net>

COPYRIGHT AND LICENSE