Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Browse files

Get rid of ApplicationPool algorithm.txt.

[ci:skip]
  • Loading branch information...
commit 92646a858a919dac166a19f4a21f9d8dc544dcd5 1 parent c0cd45a
@FooBarWidget FooBarWidget authored
Showing with 0 additions and 615 deletions.
  1. +0 −615 doc/ApplicationPool algorithm.txt
View
615 doc/ApplicationPool algorithm.txt
@@ -1,615 +0,0 @@
-= ApplicationPool algorithm
-
-
-== Introduction
-
-For efficiency reasons, Passenger keeps a pool spawned Rails/Ruby applications.
-Please read the C++ API documentation for the ApplicationPool class for a full
-introduction. This document describes an algorithm for managing the pool, in a
-high-level way.
-
-The algorithm should strive to keep spawning to a minimum.
-
-
-== Definitions
-
-=== Vocabulary
-
-- "Application root":
- The toplevel directory in which an application is contained. For Rails
- application, this is the same as RAILS_ROOT, i.e. the directory that contains
- "app/", "public/", etc. For a Rack application, this is the directory that
- contains "config.ru".
-
-- "Active application process":
- An application process that has more than 0 sessions.
-
-=== Types
-
-Most of the types that we use in this document are pretty standard. But we
-explicitly define some special types:
-
-- list<SomeType>
- A doubly linked list which contains elements of type SomeType. It supports
- all the usual list operations that one can expect from a linked list, like
- add_to_back(), etc.
-
- We assume that operations that insert an element into the list return an
- iterator object. An iterator object is an opaque object which represents a
- specific position in the list; it probably contains the links to the previous
- and the next iterator, as well as a reference to the actual list element,
- depending on the list implementation.
-
- The following operations deserve special mention:
- * remove(iterator)
- Removes the specified element from the list, as represented by the given
- iterator. This operation can be done in O(1) time.
-
- * move_to_front(iterator)
- Moves the specified element - as represented by the given iterator - to
- the front of the list. This operation can be done in O(1) time.
-
-- Group
- A compound type (class) which contains information about an application root,
- such as the application processes that have been spawned for this application
- root.
-
- A Group has the following members:
- * name (string):
- This group's key in the _groups_ map of the application pool.
-
- * app_root (string):
- This group's application root. Note that it is *not* guaranteed that all
- ProcessInfo objects in _processes_ have the same application root.
-
- * processes (list<ProcessInfo>):
- A list of ProcessInfo objects.
-
- Invariant:
- processes is non-empty.
- for all 0 <= i < processes.size() - 1:
- processes[i].group_name == name
- if processes[i].process is active:
- processes[i + 1].process is active
-
- * size (unsigned integer):
- The number of items in _processes_.
- Invariant:
- if !detached:
- size == processes.size()
-
- * max_requests (unsigned integer):
- The maximum number of requests that each application process in
- this group may process. After having processed this
- many requests, the application process will be shut down.
- A value of 0 indicates that there is no maximum.
-
- * min_processes (unsigned integer):
- The minimum number of processes that the cleaner thread should keep in
- this group. Defaults to 0.
-
- * spawning (boolean): Whether a background thread is currently spawning
- a new process for this group.
-
- * spawner_thread: A handle to the background thread that is currently
- spawning a new process. Only valid if _spawning_ is true.
-
- * detached (boolean): If true, then it indicates that this Group is
- no longer accessible via _groups_.
- Set to false by the constructor.
- Invariant:
- (detached) == (This Group is accessible via _groups_.)
- if detached:
- for all process_info objects p that have once been in this.processes:
- p.detached
-
- * environment (string): Does nothing. Data is stored in memory for analytics
- purposes.
-
-- ProcessInfo
- A compound type (class) which contains a reference to an application process
- object, as well as various metadata, such as iterators for various linked
- lists. These iterators make it possible to perform actions on the linked
- lists in O(1) time.
-
- A ProcessInfo has the following members:
- * process - A process object, representing an application process.
- * group_name (string) - The name of the group that this ProcessInfo belongs
- to.
- * identifier (string) - A key that uniquely identifies this ProcessInfo in
- this application pool. This key allows external processes to refer to a
- specific ProcessInfo object without knowing its memory pointer. It's set
- to a random string by ProcessInfo's constructor.
- * start_time (timestamp with milisecond resolution) - The time at which this
- application process was started. It's set to the current time by the
- constructor.
- * processed_requests (integer) - The number of requests processed by this
- application instance so far. Set to 0 by the constructor.
- * last_used (time) - The last time a session for this application process
- was opened or closed.
- * sessions (integer) - The number of open sessions for this application
- process. It's set to 0 by the constructor.
- Invariant:
- (sessions == 0 && !detached) == (This ProcessInfo is in inactive_apps.)
- * iterator - The iterator for this ProcessInfo in the linked list
- groups[process_info.group_name].processes
- * ia_iterator - The iterator for this ProcessInfo in the linked list
- inactive_apps. This iterator is only valid if this ProcessInfo really is
- in that list.
- * detached (boolean) - If true, then it indicates that this ProcessInfo is
- no longer accessible via _groups_; this implies that it's no longer
- contained in its associated Group's _processes_ member, a), and that
- _iterator_ and _ia_iterator_ are no longer valid.
- Set to false by the constructor.
- Invariant:
- (detached) == (This ProcessInfo is accessible via _groups_.)
-
-- PoolOptions
- A structure containing additional information used by the spawn manager's
- spawning process, as well as by the get() function.
-
- A PoolOptions has at least the following members:
- * app_group_name (string) - A name which is used to group application
- processes together.
- * max_requests (unsigned integer) - The maximum number of requests that the
- application process may process. After having processed this many requests,
- the application process will be shut down. A value of 0 indicates that there
- is no maximum.
- * min_processes (unsigned integer) - The minimum number of processes for the
- current group that the cleaner thread should keep around.
- * use_global_queue (boolean) - Whether to use a global queue for all
- application processes, or a queue that's private to the application process.
- The users guide explains this feature in more detail.
- * restart_dir (string) - The directory in which the algorithm should look for
- restart.txt and always_restart.txt. The existance and modification times of
- these files tell the algorithm whether an application should be restarted.
- * environment (string) - The environment (RAILS_ENV/RACK_ENV) in which the app
- should run.
-
-=== Special functions
-
-- spawn(app_root, options)
- Spawns a new application process at the given application root with the given
- spawn options. Throws an exception if something went wrong. This function is
- thread-safe. Note that application process initialization can take an arbitrary
- amount of time.
-
-=== Instance variables
-
-The algorithm requires the following instance variables for storing state
-information:
-
-- lock: mutex
- This lock is used for implementing thread-safetiness. We assume that it
- is non-recursive, i.e. if a thread locks a mutex that it has already locked,
- then it will result in a deadlock.
-
-- groups: map[string => Group]
- Maps an application root to its Group object. This map contains all
- application processes in the pool.
-
- Invariant:
- for all values g in groups:
- !g.detached
- g.size <= count
- for all i in g.processes:
- !i.detached
- (sum of all g.size in groups) == count
-
-- max: integer
- The maximum number of ProcessInfo objects that may exist in the pool.
-
-- max_per_app: integer
- The maximum number of ProcessInfo objects that may be simultaneously alive
- for a single Group.
-
-- count: integer
- The current number of ProcessInfo objects in the pool.
- Since 'max' can be set dynamically during the life time of an application
- pool, 'count > max' is possible.
-
-- active: integer
- The number of application processes in the pool that are active.
- Invariant:
- active <= count
-
-- inactive_apps: list<ProcessInfo>
- A linked list of ProcessInfo objects. All application processes in this list
- are inactive.
-
- Invariant:
- inactive_apps.size() == count - active
- for all x in inactive_apps:
- x can be accessed from _groups_.
- x.sessions == 0
-
-- waiting_on_global_queue: integer
- If global queuing mode is enabled, then when get() is waiting for a backend
- process to become idle, this variable will be incremented. When get() is done
- waiting, this variable will be decremented.
-
-
-== Class relations
-
-Here's an UML diagram in ASCII art:
-
-[ProcessInfo] 1..* --------+
- |
- |
-
- 1
-[ApplicationPool] [Group]
- 1 0..*
-
- | |
- +-------------------+
-
-
-== Algorithm in pseudo code
-
-# Thread-safetiness notes:
-# - All wait commands are to unlock the lock during waiting.
-
-
-# Connect to an existing application process, or spawn a new application process
-# and connect to that if necessary.
-# 'app_root' refers to an application root.
-# 'options' is an object of type 'PoolOptions', which contains additional
-# information which may be relevant for spawning.
-#
-# Returns a Session object, representing a single HTTP request/response pair.
-function get(app_root, options):
- MAX_ATTEMPTS = 10
- attempt = 0
- while (true):
- attempt++
- lock.synchronize:
- process_info, group = checkout_without_lock(app_root, options)
- try:
- return process_info.process.connect()
- on exception:
- # The app process seems to have crashed.
- # So we remove this process from our data
- # structures.
- lock.synchronize:
- detach_without_lock(process_info.identifier)
- process_info.sessions--
- if (attempt == MAX_ATTEMPTS):
- propagate exception
-
-
-# Detach the process with the given identifier from the pool's data structures.
-function detach(identifier):
- lock.synchronize:
- return detach_without_lock(identifier)
-
-
-# Checkout a process from the application pool and mark it as being used.
-# If there's no appropriate process in the pool, or if there are not
-# enough processes, then one will be spawned.
-#
-# Returns a pair of [ProcessInfo, Group].
-# All exceptions that occur are propagated.
-private function checkout_without_lock(app_root, options):
- group = groups[options.app_group_name]
-
- if needs_restart(app_root, options):
- Tell spawn server to reload code for options.app_group_name.
- if (group != null):
- detach_group_without_lock(group)
- group = null
-
- if (group != null):
- # There are existing processes for this app group.
- processes = group.processes
-
- if (processes.front.sessions == 0):
- # There is an inactive process, so we use it.
- process_info = processes.front
- processes.move_to_back(process_info.iterator)
- inactive_apps.remove(process_info.ia_iterator)
- mutate_max(active + 1)
- else:
- # All existing processes are active. We either use
- # one of them now or we wait until one of them becomes
- # available. And, if we're allowed to, we spawn an
- # extra process in the background.
- if spawning_allowed(group, options) and !group.spawning:
- spawn_in_background(group, options)
- process_info = select_process(processes, options)
- if (process_info == null):
- goto beginning of function
- else:
- # There are no processes for this app group.
- if (active >= max):
- # Looks like the pool is full and all processes are busy.
- # Wait until the pool appears to have changed in such a
- # way that we can spawn a new app group, and restart
- # this function.
- new_app_group_creatable.wait
- goto beginning of function
- elsif count >= max:
- # The pool is full, and not all processes are busy, but
- # we're in a though situation nevertheless: there are
- # several processes which are inactive, and none of them
- # belong to our current app group, so we must kill one
- # of them in order to free a spot in the pool. But which
- # one do we kill? We want to minimize spawning.
- #
- # It's probably a good idea to keep some kind of
- # statistics in order to decide this. We want the
- # application root that gets the least traffic to be
- # killed. But for now, we kill a random application
- # process.
- process_info = inactive_apps.pop_front
- process_info.detached = true
- group = groups[process_info.group_name]
- processes = group.processes
- processes.remove(process_info.iterator)
- if processes.empty():
- detach_group_without_lock(group)
- else:
- group.size--
- mutate_count(count - 1)
- process_info = new ProcessInfo
- process_info.process = spawn(app_root, options)
- process_info.group_name = options.app_group_name
- group = new Group
- group.name = options.app_group_name
- group.app_root = app_root
- group.size = 1
- groups[options.app_group_name] = group
- iterator = group.processes.add_to_back(process_info)
- process_info.iterator = iterator
- mutate_count(count + 1)
- mutate_active(active + 1)
- if (options.min_processes > 1) and spawning_allowed(group, options):
- spawn_in_background(group, options)
-
- group.max_requests = options.max_requests
- group.min_processes = options.min_processes
- group.environment = options.environment
-
- process_info.last_used = current_time()
- process_info.sessions++
-
- return [process_info, group]
-
-
-private function mutate_active(value):
- if (value < active):
- new_app_group_creatable.notify_all
- global_queue_position_became_available.notify_all
- active = value
-
-private function mutate_count(value):
- # No point in notifying new_app_group_creatable here;
- # if _count_ is being increased then that means the pool
- # isn't full, and nobody is waiting on
- # new_app_group_creatable.
- global_queue_position_became_available.notify_all
- count = value
-
-private function mutate_max(value):
- if (value > max):
- new_app_group_creatable.notify_all
- # We will want any code waiting on the global queue
- # to go ahead and spawn another process.
- global_queue_position_became_available.notify_all
- max = value
-
-
-private function needs_restart(app_root, options):
- if (options.restart_dir is not set):
- restart_dir = app_root + "/tmp"
- else if (options.restart_dir is an absolute path):
- restart_dir = options.restart_dir
- else:
- restart_dir = app_root + "/" + options.restart_dir
-
- return (file_exists("$restart_dir/always_restart.txt")) or
- (we haven't seen "$restart_dir/restart.txt" before) or
- ("$restart_dir/restart.txt" changed since the last time we checked)
-
-
-private function spawning_allowed(group, options):
- return ( count < max ) and
- ( (max_per_app == 0) or (group.size < max_per_app) )
-
-
-# Precondition: !group.detached
-private function detach_group_without_lock(group):
- for all process_info in group.processes:
- if (process_info.sessions == 0):
- inactive_apps.remove(process_info.ia_iterator)
- else:
- mutate_active(active - 1)
- group.processes.remove(process_info.iterator)
- process_info.detached = true
- mutate_count(count - 1)
- if (group.spawning):
- group.spawner_thread.interrupt_and_join
- group.spawner_thread = null
- group.spawning = false
- group.detached = true
- groups.remove(options.app_group_name)
-
-
-private function select_process(processes, options):
- if options.use_global_queue:
- # So we wait until _active_ has changed, then
- # we restart this function and try again.
- waiting_on_global_queue++
- global_queue_position_became_available.wait
- waiting_on_global_queue--
- return null
- else:
- # So we connect to an already active process.
- # This connection will be put into that
- # process's private queue.
- process_info = an element in _processes_ with the smallest _session_ value
- processes.move_to_back(process_info.iterator)
- return process_info
-
-
-# Preconditions:
-# !group.detached
-# !group.spawning
-private function spawn_in_background(group, options):
- group.spawning = true
- group.spawner_thread = new thread(spawner_thread_callback,
- with these arguments to the thread function:
- group, options)
-
-
-private function spawner_thread_callback(group, options):
- Ignore thread interruptions in this function
- while true:
- try:
- Allow thread interruptions in this block
- process = spawn(app_root, options)
- on thread interruption:
- lock.synchronize:
- group.spawning = false
- group.spawner_thread = null
- return
- on exception:
- lock.synchronize:
- if (!group.detached):
- group.spawning = false
- group.spawner_thread = null
- # We want to report the error to the browser
- # but there's no way to do this in this thread, so
- # we just remove the entire group and have the next
- # get() call spawn the process and display the
- # error.
- remove_group_without_lock(group)
- return
-
- lock.synchronize:
- if (group.detached):
- return
- else:
- process_info = new ProcessInfo
- process_info.process = process
- process_info.group_name = options.app_group_name
- process_info.iterator = group.processes.add_to_front(process_info)
- process_info.ia_iterator = inactive_apps.add_to_back(process_info)
- group.size++
- mutate_count(count + 1)
- if (group.size >= options.min_processes) or
- (!spawning_allowed(group, options)):
- group.spawning = false
- group.spawner_thread = null
- return
-
-
-private function detach_without_lock(identifier):
- for group in groups:
- processes = group.processes
- for process_info in processes:
- if process_info.identifier == identifier:
- # Found a matching process.
- process_info.detached = true
- processes.remove(process_info.iterator)
- group.size--
- if processes.empty():
- detach_group_without_lock(group)
- if process_info.sessions == 0:
- inactive_apps.remove(process_info.ia_iterator)
- else:
- mutate_active(active - 1)
- mutate_count(count - 1)
- return true
- return false
-
-
-# The following function is to be called when a session has been closed.
-# _process_info_ is a weak reference to the ProcessInfo that belongs to
-# the process whose session has been closed; it evaluates to NULL if the
-# ProcessInfo object that it belongs to has been destroyed.
-callback session_has_been_closed(session, process_info):
- Convert process_info into a normal reference.
-
- # We check process_info.detached without locking. This should be safe:
- # even if a boolean update isn't atomic on the current CPU, a non-zero
- # value evaluates to true. Once true, _detached_ will never become false,
- # so instruction reorderings by the compiler or CPU won't cause any
- # damage.
- if (process_info == null) or (process_info.detached):
- return
-
- lock.synchronize:
- if process_info.detached:
- return
-
- group = groups[process_info.group_name]
- processes = group.processes
- process_info.processed++
-
- if (group.max_requests > 0) and (process_info.processed >= group.max_requests):
- # The application process has processed its maximum allowed
- # number of requests, so we shut it down.
- process_info.detached = true
- processes.remove(process_info.iterator)
- group.size--
- if processes.empty():
- detach_group_without_lock(group)
- mutate_count(count - 1)
- if (process_info.sessions == 0):
- inactive_apps.remove(process_info.ia_iterator)
- else:
- mutate_active(active - 1)
- else:
- process_info.last_used = current_time()
- process_info.sessions--
- if (process_info.sessions == 0):
- processes.move_to_front(process_info.iterator)
- process_info.ia_iterator = inactive_apps.add_to_back(process_info)
- mutate_active(active - 1)
-
-
-# The following thread will be responsible for cleaning up idle application
-# process, i.e. processes that haven't been used for a while.
-# This can be disabled per app when setting it's maxIdleTime to 0.
-thread cleaner:
- lock.synchronize:
- while true:
- # If MAX_IDLE_TIME is 0 we don't clean up any processes,
- # giving us the option to persist the processes
- # forever unless it's killed in order to free up space
- # for another process.
- if (MAX_IDLE_TIME == 0):
- Wait until the thread has been signalled to quit
- or until MAX_IDLE_TIME changed.
- if thread has been signalled to quit:
- return
- else:
- restart loop
- else:
- Wait until MAX_IDLE_TIME seconds have passed,
- or until the thread has been signalled to quit,
- or until MAX_IDLE_TIME changed.
- if thread has been signalled to quit:
- return
- else if MAX_IDLE_TIME changed:
- restart loop
-
- # Invariant:
- # From this point on, MAX_IDLE_TIME > 0
-
- now = current_time()
- for all process_info in inactive_apps:
- if (now - process_info.last_used > MAX_IDLE_TIME):
- process = process_info.process
- group = groups[process_info.group_name]
- if (group.size > group.min_processes):
- processes = group.processes
- processes.remove(process_info.iterator)
- process_info.detached = true
- inactive_apps.remove(process_info.ia_iterator)
- group.size--
- mutate_count(count - 1)
- if processes.empty():
- detach_group_without_lock(group)
-
Please sign in to comment.
Something went wrong with that request. Please try again.