Add EEP 42: setrlimit(2) analogue for Erlang

erlang · Feb 7, 2013 · 08928e4 · 08928e4
1 parent 177d678
commit 08928e4
Showing 1 changed file with 200 additions and 0 deletions.
diff --git a/eeps/eep-0042.md b/eeps/eep-0042.md
@@ -0,0 +1,200 @@
+    Author: Richard A. O'Keefe <ok(at)cs(dot)otago(dot)ac(dot)nz>
+    Status: Draft
+    Type: Standards Track
+    Erlang-Version: R16A
+    Created: 07-Feb-2013
+    Post-History:
+****
+EEP 42: setrlimit(2) analogue for Erlang
+----
+
+Abstract
+========
+
+A new function `erlang:set_process_info_limit/2` is added,
+allowing a process to set limits on its own memory use.
+
+Specification
+=============
+
+    erlang:set_process_info_limit(Item, Limit :: integer()) ->
+        Old_Limit :: integer()
+
+An `Old_Limit` of 0 means that no limit has been set.
+A Limit of 0 means that no limit is to be set.
+
+The Items that can be set are
+
+- `memory`, the number of bytes that may be used for stack, heap, and
+internal structures.
+
+- `memory_words`, the same value as `memory`, but expressed in
+words to be consistent with `heap_size`, `total_heap_size`, and
+`stack_size` in `erlang:process_info/[1,2]`.  Those functions should
+also be revised to accept this Item.
+
+- `message_queue_len`, the number of unprocessed messages.
+(Aside.) The documentation refers to `message_queue_len`
+but the system generates and recognises `message_queue_len`.
+(End aside.)
+
+The values that are actually set may be less than Limit as
+long as any physically realisable system in the node as
+configured cannot exceed the value used.
+
+A non-zero memory limit is checked after each garbage
+collection or other memory restructuring; setting the memory
+limit to a non-zero value less than the current
+`process_info(self(), memory)` forces an immediate garbage
+collection.
+
+If the memory required by a process exceeds its limit, the
+process is exited with reason `memory`.
+
+A non-zero message queue length limit is checked whenever
+the message queue length is about to be incremented, or
+when such a limit is set.
+
+If the message queue length limit is exceeded, the process
+is exited with reason `message_queue_len`.
+
+
+Motivation
+==========
+
+It is currently possible for an Erlang process to grab
+memory without limit and eventually take down the whole
+node.  This problem is frequently reported in the Erlang
+mailing list.
+
+It is also possible for the message queue of an Erlang
+process to grow without limit.  This problem too is
+reported often enough to be recognised.
+
+This is an EEP rather than a library change request because
+it requires low level runtime system changes to support it.
+For example, the limits have to be _stored_ somewhere,
+suggesting a change in the data structure holding information
+about a process.  The limits have to be _checked_, meaning
+that changes to the garbage collector and the message sending
+core are required.  The limits have to be _enforced_,
+meaning that two new exit reasons have to be supported.  And
+the new exit reasons and the new function have to be
+_documented_, requiring changes to more than one document.
+
+The need is long-standing, and at least the presence of this
+EEP may provoke discussion leading to _some_ resolution.
+
+Rationale
+=========
+
+The function that sets a limit should have a name based
+in the name of the function that reports current use.
+That function is `process_info`, hence the name I've
+chosen here is `set_process_info_limit`.
+
+The `erlang:process_info/[1,2]` functions can report on
+any Erlang process.  But it is clearly dangerous to let
+a process set another process's limits.  All we actually
+_need_ is for a process to be able to set its own limits
+in a startup phase.  That could be done by adding
+`{memory,Size}` and `{message_queue_len,Count}` options
+to `spawn_opt/[2,3]`.  However,
+
+    spawn_opt(Fun, [{memory,128*1024}])
+
+can be mimicked by
+
+    spawn(fun () ->
+        erlang:set_process_info_limit(memory, 128*1024),
+        Fun()
+    end)
+
+and `set_process_info_limit/2` allows a process to set
+different limits at different times.  If you are trying
+to protect the system from _malice_, setting limits in
+`spawn_opt/[2,3]` is the way to go.  If you are trying to
+protect the system from _accident_, letting a process
+set its own limits is the way to go.
+
+The names for the Items to be set clearly must be identical
+to the names used for the same thing elsewhere.
+
+The `memory_words` item is introduced because in a world where
+some of my programs run 32-bit and some run 64-bit it is just
+too confusing to count bytes, especially as stack size and so
+on are measured in words, not bytes.
+
+I considered allowing `total_heap_size` and `stack_size` and so on
+to be given their own limits, but on finding that `total_heap_size`
+"currently includes the stack of the process", decided that it
+was "currently" a bad idea.
+
+I would have liked to allow setting a limit on the number
+of reductions, so that a process that doesn't intend to
+live forever can ensure its own eventual death.  However,
+the documentation warns that `erlang:bump_reductions/1`
+"might be removed", so presumably reduction counting
+_per se_ might well disappear.
+
+The point of letting the value actually set for a limit
+to be smaller is to allow an implementation to use only
+values that will fit in a `size_t`, so that low level
+code does not need to deal with bignums.  On a
+POSIX-compliant operating system, an Erlang implementation
+may use the `RLIMIT_DATA` value for the UNIX process
+if the memory limit is bigger, and may use `RLIMIT_DATA`
+divided by the minimum size of a message for the message
+queue length.  Or it might use other appropriate system
+limits.
+
+The biggest question is "what happens if a limit is exceeded?"
+
+For memory, we could exit with `system_limit` as reason, but
+that wouldn't make clear _what_ limit had been exceeded.  It
+seems advisable to introduce a new reason, and making the
+reason the same as the name of the limit seems the least
+confusing approach.
+
+For message queue length, I would prefer it if the process
+that _sends_ the message were the one to get a run time error.
+However, the Erlang documentation guarantees that sending a
+message to a pid always succeeds, whether the process is live
+or dead, and I don't want to change too much.  A message queue
+might build up because some process(es) is(are) sending junk;
+we would prefer exiting junk senders to exiting junk receivers.
+However, if the junk receiver isn't cleaning out the junk, that
+junk is _never_ going away, so it's _always_ going to be costing
+time to skip over as well as memory to store.  That argument
+applies to another option that I considered: just discarding
+messages that would make the queue too long.  The Erlang Way is
+to let subsystems that get into trouble crash.  So let it be.
+
+
+Backwards Compatibility
+=======================
+
+The new function is not exported by default
+so it cannot be called accidentally.
+
+
+Reference Implementation
+========================
+
+None in this draft.
+
+
+Copyright
+=========
+
+This document has been placed in the public domain.
+
+
+[EmacsVar]: <> "Local Variables:"
+[EmacsVar]: <> "mode: indented-text"
+[EmacsVar]: <> "indent-tabs-mode: nil"
+[EmacsVar]: <> "sentence-end-double-space: t"
+[EmacsVar]: <> "fill-column: 70"
+[EmacsVar]: <> "coding: utf-8"
+[EmacsVar]: <> "End:"
+[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "