-Passenger architecture overview
-===============================
+Passenger architectural overview
+================================
This document describes Passenger's architecture in a global way. The purpose of
this document is to lower the barrier to entry for new contributors, as well
@@ -9,6 +9,7 @@ as to explain (some of the) design choices that we have made.
About the involved technologies
-------------------------------
+[[typical_web_applications]]
=== Typical web applications ===
Before we describe Passenger, it is important to understand how typical web
@@ -39,7 +40,9 @@ Tomcat web server, behind the Apache web server.
2. The web application is contained in a web server. In this case, the web
server acts like an application server. This is the case for PHP applications,
-on Apache servers with 'mod_php'.
+on Apache servers with 'mod_php'. Note that this does not necessarily mean
+that the web application is run inside the same process as the web server:
+it just means that the web server manages applications.
3. The web application *is* a web server, and can accept HTTP requests
directly. This is the case for the Trac bug tracking system, running in its
@@ -87,7 +90,9 @@ with threads, because parts of Ruby on Rails are not thread-safe. *
A particularly interesting thing to note is that a lot of the memory
occupied by Ruby on Rails applications is spent on storing the program code
-(i.e. the Ruby AST) in memory. See Appendix A for details. TODO: write this
+(i.e. the Ruby AST) in memory. See Appendix A for details. (TODO: write this)
+Also, a lot of the startup time of a Ruby on Rails application is spent on
+bootstrapping the Rails framework.
@@ -96,11 +101,127 @@ The Apache web server has a pluggable I/O multiprocessing (the ability to
handle more than 1 concurrent HTTP client at the same time) architecture. An
Apache module which implements a particular multiprocessing strategy is called
a Multi-Processing Module (MPM). The 'prefork' MPM -- which also happens to be
-the default -- appears to be the most popular one. This MPM uses a 'control process'
+the default -- appears to be the most popular one. This MPM spawns multiple
+worker child processes. HTTP requests are first accepted by the control
+process, and then forwarded to one of the worker processes. The next section
+contains a diagram which shows the prefork MPM's architecture.
+
+Passenger's architecture is a lot like setup #2 described in
+<<typical_web_applications,Typical web applications>>. In other words,
+Passenger extends Apache and allows it to act like an application server.
+Passenger's architecture -- assuming Apache 2 with the prefork MPM is used --
+is shown in the following diagram:
+
+image:images/passenger_architecture.png[Passenger's architecture]
+
+Passenger consists of an Apache module, 'mod_passenger'. This is written in
+C++, and can be found in the directory 'ext/apache2'. The module is active in
+the Apache control process and in all the Apache worker processes. When an
+HTTP request comes in, 'mod_passenger' will check whether the request should
+be handled by a Ruby on Rails application. If so, then 'mod_passenger' will
+spawn the corresponding Rails application (if necessary) and forward the
+request to that application.
+
+It should be noted that the Ruby on Rails application does *not* run in the
+same address space as Apache. This differentiates Passenger from other
+application-server-inside-web-server software such as mod_php, mod_perl and
+mod_ruby. If the Rails application crashes or leaks memory, it will have no
+effect on Apache. In fact, stability is one of our highest goals. Passenger
+is carefully designed and implemented so that Apache shouldn't crash because
+of Passenger.
+
+=== Spawning and caching of code and applications ===
+A very naive implementation of Passenger would spawn a Ruby on Rails
+application every time an HTTP request is received, just like CGI would.
+However, spawning Ruby on Rails applications is expensive. It can take 1 or 2
+seconds on a modern PC, and possibly much longer on a heavily loaded server.
+This overhead is particularly unacceptable on shared hosts. A less naive
+implementation would keep spawned Ruby on Rails application instances alive.
+However, this still has several problems:
+
+1. The first request to a Rails website will be slow, and subsequent requests
+   will be fast. But the first request to a different Rails website -- on the
+   same web server -- will still be slow.
+2. As we've explained earlier in this article, a lot of memory in a Rails
+   application is spent on storing the AST of the Ruby on Rails framework and
+   the application. Especially on shared hosts and on memory-constrained
+   Virtual Private Servers (VPS), this can be a problem.
+
+Both of these problems are very much solvable, and we've chosen to do just
+that.
+
+The first problem can be solved by preloading Rails applications, i.e. by
+running the Rails application before a request is ever made to that website.
+This is the approach taken by most Rails hosts, for example in the form of a
+Mongrel cluster which is running all the time. However, this is unacceptable
+for a shared host: such an application would just sit there and waste memory
+even if it's not doing anything. Instead, we've chosen to take a different
+path, which solves both of the aforementioned problems.
+
+We spawn Rails applications via a 'spawn server'. The spawn server caches Ruby
+on Rails framework code and application code in memory. Spawning a Rails
+application for the first time will still be slow, but subsequent spawn
+attempts will be very fast. Furthermore, because the framework code is cached
+independently from the application code, spawning a different Rails application
+will also be very fast, as long as that application is using a Rails framework
+version that has already been cached.
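
The two-level caching described above can be sketched as follows. This is a toy model in Python, not the actual spawn server (which is written in Ruby); the class name `SpawnServer`, its methods, and the string stand-ins for loaded code are all hypothetical and exist only to show why a second application on an already-cached framework version avoids the expensive framework load.

```python
class SpawnServer:
    """Toy model: framework code is cached per version, app code per app root."""

    def __init__(self):
        self.framework_cache = {}  # framework version -> loaded framework (stand-in)
        self.app_cache = {}        # app root -> (framework, app code) (stand-in)
        self.loads = []            # log of the expensive loads that actually ran

    def _load_framework(self, version):
        # Only the first request for a given framework version pays the cost.
        if version not in self.framework_cache:
            self.loads.append(("framework", version))
            self.framework_cache[version] = f"rails-{version}"
        return self.framework_cache[version]

    def _load_app(self, app_root, version):
        # Application code is cached separately from the framework code.
        if app_root not in self.app_cache:
            framework = self._load_framework(version)
            self.loads.append(("app", app_root))
            self.app_cache[app_root] = (framework, f"code:{app_root}")
        return self.app_cache[app_root]

    def spawn(self, app_root, version):
        """Return a (hypothetical) handle to a new application instance."""
        framework, code = self._load_app(app_root, version)
        return {"app_root": app_root, "framework": framework, "code": code}


server = SpawnServer()
server.spawn("/srv/blog", "2.0")   # slow: loads framework and app
server.spawn("/srv/blog", "2.0")   # fast: everything is cached
server.spawn("/srv/shop", "2.0")   # loads only the app; framework is cached
```

After these three calls, `server.loads` records one framework load and two application loads, which mirrors the behaviour the paragraph describes.
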
+
+Another implication of the spawn server is that different Ruby on Rails
+applications will share memory with each other, thus solving problem #2. This
+is described in detail in <<spawn_server, the next section>>.
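
This document does not yet say how the sharing is achieved. One common mechanism on Unix-like systems, assumed here purely for illustration, is to load code once in a parent process and then `fork()` children, which initially share the parent's memory pages copy-on-write. The sketch below (Python, Unix only) uses hypothetical names `load_framework` and `spawn_worker`; it demonstrates that a forked child sees data loaded before the fork without loading it again. It does not measure memory usage.

```python
import os


def load_framework():
    """Stand-in for an expensive load; records which process performed it."""
    return {"name": "framework", "loaded_in_pid": os.getpid()}


def spawn_worker(preloaded):
    """Fork a child that reports which process loaded the data it sees."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the preloaded data is already present; nothing is reloaded.
        try:
            os.close(read_fd)
            os.write(write_fd, str(preloaded["loaded_in_pid"]).encode())
        finally:
            os._exit(0)
    # Parent: read the child's report and reap the child.
    os.close(write_fd)
    reported = int(os.read(read_fd, 64).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return pid, reported


preloaded = load_framework()
child_pid, loader_pid = spawn_worker(preloaded)
```

Here `loader_pid` equals the parent's PID, because the child inherited the loaded data rather than loading it itself.
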
+
+But despite the caching of framework code and application code, spawning is
+still expensive compared to an HTTP request. We want to avoid spawning whenever
+possible. This is why we've introduced the *application pool*. Spawned
+applications are kept alive, and their handles are stored in this pool,
+allowing them to be reused later. Thus, Passenger has very good average-case
+performance.
+
+Because the different worker processes cannot share memory with each other,
+either shared memory must be used for the application pool, or a client/server
+architecture must be implemented. We've chosen the latter because it is easier
+to implement. The Apache control process acts like a server for the application
+pool. The worker processes access the pool through the control process.
+However, this does not mean that all data goes through the control process.
+A worker process queries the pool for a connection session with a Rails
+application. Once this session has been obtained, the worker process will
+communicate directly with the Rails application.
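
The division of labour described above can be sketched with plain objects. All names here (`PoolServer`, `Worker`, `Session`, `RailsApp`) are hypothetical and the processes are modelled as in-memory objects rather than real Apache processes or sockets; the point is only that the pool is consulted once to obtain a session, after which request data flows directly between the worker and the application.

```python
class RailsApp:
    """Stand-in for a running Rails application instance."""

    def __init__(self, app_root):
        self.app_root = app_root

    def handle(self, request):
        return f"{self.app_root} handled {request}"


class Session:
    """Direct channel between one worker and one application instance."""

    def __init__(self, app):
        self._app = app

    def send_request(self, request):
        # Request data goes straight to the application, not via the pool.
        return self._app.handle(request)


class PoolServer:
    """Stand-in for the application pool served by the control process."""

    def __init__(self):
        self._apps = {}
        self.queries = 0  # how many times a worker asked for a session

    def get_session(self, app_root):
        self.queries += 1
        app = self._apps.get(app_root)
        if app is None:
            app = self._apps[app_root] = RailsApp(app_root)
        return Session(app)


class Worker:
    """Stand-in for an Apache worker process."""

    def __init__(self, pool):
        self._pool = pool

    def serve(self, app_root, requests):
        session = self._pool.get_session(app_root)  # one query to the pool
        return [session.send_request(r) for r in requests]


pool = PoolServer()
worker = Worker(pool)
replies = worker.serve("/srv/blog", ["GET /", "GET /posts"])
```

Two requests are served, but `pool.queries` is 1: the pool handed out the session and was not involved in the request traffic itself.
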
+
+The application pool is implemented inside 'mod_passenger'. One can find
+detailed documentation about it in
+link:cxxapi/index.html[+++the C++ API documentation+++],
+in particular the documentation about the `ApplicationPool`,
+`StandardApplicationPool` and `ApplicationPoolServer` classes.
+
+NOTE: At the moment, Passenger does not support Apache with the worker MPM
+(which uses threads instead of processes). But because the application pool is
+implemented in a modular way, supporting the worker MPM shouldn't take more
+
+The application pool is fully responsible for spawning applications, caching
+spawned applications' handles, and cleaning up applications which have been
+idle for an extended period of time.
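
These three responsibilities can be illustrated with a small model. This is a sketch in Python, not the C++ `ApplicationPool` classes named above; the constructor parameters, the `max_idle_seconds` default of 120 and the injectable `clock` are assumptions made for the example.

```python
import time


class ApplicationPool:
    """Toy pool: spawns on demand, caches handles, evicts idle applications."""

    def __init__(self, spawn, max_idle_seconds=120.0, clock=time.monotonic):
        self._spawn = spawn            # callable: app_root -> handle
        self._max_idle = max_idle_seconds
        self._clock = clock            # injectable so tests can control time
        self._handles = {}             # app_root -> (handle, last_used)

    def get(self, app_root):
        """Return a cached handle, spawning the application only if needed."""
        entry = self._handles.get(app_root)
        handle = entry[0] if entry else self._spawn(app_root)
        self._handles[app_root] = (handle, self._clock())
        return handle

    def cleanup(self):
        """Drop applications idle for longer than the limit; return their roots."""
        now = self._clock()
        stale = [root for root, (_, last_used) in self._handles.items()
                 if now - last_used > self._max_idle]
        for root in stale:
            del self._handles[root]
        return stale
```

A caller would invoke `get()` for each request and run `cleanup()` periodically; an application evicted by `cleanup()` is simply spawned again the next time it is requested.
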
+
+=== The spawn server ===
+The spawn server is written in Ruby, and its code can be found in the directory
+'lib/passenger'. The spawn server's main executable is
+'bin/passenger-spawn-server'.
+link:rdoc/index.html[The spawn server's RDoc documentation] documents the
+implementation in detail.
+
+TODO: write something about sharing memory, possibly referring to Appendix A
+
+Appendix A: Sharing of memory between applications on modern operating systems
+------------------------------------------------------------------------------