+ <h1>Supervision in pure OpenRC using cgroup subsystem. <br /><small><strong>July 31, 2013</strong></small></h1>
+<h2 id="abstract">Abstract</h2>
+<p>This post describes how cgroup support in OpenRC can be extended with user hooks, and shows a way to build a basic supervision daemon on top of cgroups.</p>
+<p>This post describes OpenRC-0.11/0.12_beta, and some things may differ in later versions. If you find any differences, please let me know so that I can post updates here.</p>
+<h2 id="introduction">Introduction</h2>
+<h3 id="the-problem">The problem</h3>
+<p>In the general case there are many services that should be kept running and restarted if they fail. There are many related subproblems, such as deciding when a service should be restarted and when it should not. Many existing systems solve these issues, each with different trade-offs. In this post I’ll present a simple mechanism that allows building basic supervision and other nice things.</p>
+<h3 id="idea">Idea</h3>
+<p>The Linux kernel provides a mechanism to track groups of processes: <code>cgroups</code>. A child process starts in the same cgroups as its parent, and those groups are easy to track from user space. If you want to understand cgroups better, you may read the following docs: <a href="">cgroups</a>. Cgroups also provide a way of setting limits on and controlling groups, which is useful as well but out of scope for now.</p>
+<p>When the last process in a cgroup exits, the kernel runs the release agent (the program named in the hierarchy’s <code>release_agent</code> file, for groups that have <code>notify_on_release</code> enabled) and passes it the path of the cgroup. This can be used to remove empty cgroups and to perform additional actions.</p>
+<p>The idea is that the release agent can check the service state to decide whether the service needs to be restarted.</p>
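+<p>As a rough sketch of the kernel side (assuming a cgroup v1 hierarchy): notification is switched on by writing <code>1</code> to the group’s <code>notify_on_release</code> file, and the agent is named in the <code>release_agent</code> file at the hierarchy root. The function name and its arguments are made up for illustration and are not part of OpenRC; the hierarchy root is a parameter so that the function can be tried against a scratch directory first.</p>

```shell
#!/bin/sh
# Hypothetical helper, not part of OpenRC.
# $1 - root of a cgroup v1 hierarchy (e.g. /sys/fs/cgroup/openrc)
# $2 - absolute path to the release agent program
# $3 - name of the group below the root (e.g. a service name)
enable_release_notify() {
    root=$1
    agent=$2
    group=$3
    [ -d "$root" ] || return 1
    # The agent is configured once per hierarchy, in its root directory.
    printf '%s\n' "$agent" > "$root/release_agent" || return 1
    # On a real cgroup filesystem mkdir creates the group itself.
    mkdir -p "$root/$group" || return 1
    # Ask the kernel to run the agent when the group becomes empty.
    printf '1\n' > "$root/$group/notify_on_release"
}
```

+<p>A call could look like <code>enable_release_notify /sys/fs/cgroup/openrc /lib/rc/sh/cgroup-release-agent.sh sshd</code>; both paths and the service name are only examples.</p>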
+<h2 id="details">Details</h2>
+<h3 id="implementation">Implementation</h3>
+<p>Here are the improvements and files that should be added to OpenRC to provide the required functionality.</p>
+<h4 id="restart-daemon">Restart daemon</h4>
+<p>First we need a daemon for restarting services. We can’t start a service from the agent itself, because the agent has the <code>PF_NO_SETAFFINITY</code> flag and thus cgroups will not work for any of its children. So let’s have a very simple daemon; it will be extended in the next posts.</p>
+<pre class="sourceCode bash"><code class="sourceCode bash"><span class="co">#!/bin/sh</span>
+<span class="co"># Reads requests from a FIFO, one per line.</span>
+<span class="kw">if [</span> <span class="ot">$#</span> <span class="ot">-lt</span> 1<span class="kw"> ]</span> ; <span class="kw">then</span>
+    <span class="kw">echo</span> <span class="st">&quot;usage is </span><span class="ot">$0</span><span class="st"> &lt;path to fifo&gt;&quot;</span>
+    <span class="kw">exit</span> 1
+<span class="kw">fi</span>
+<span class="co"># Reopen the FIFO each time a writer closes it, for as long as it exists.</span>
+<span class="kw">while [</span> <span class="ot">-p</span> <span class="st">&quot;</span><span class="ot">$1</span><span class="st">&quot;</span><span class="kw"> ]</span> ; <span class="kw">do</span>
+    <span class="kw">while</span> <span class="kw">read</span> -r <span class="ot">line</span> ; <span class="kw">do</span>
+        <span class="co"># For now the command is only printed, not executed.</span>
+        <span class="kw">echo</span> <span class="st">&quot;rc-service </span><span class="ot">$line</span><span class="st">&quot;</span>
+    <span class="kw">done</span> <span class="kw">&lt;</span> <span class="st">&quot;</span><span class="ot">$1</span><span class="st">&quot;</span>
+<span class="kw">done</span></code></pre>
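+<p>The daemon above only reads lines from the FIFO. A hook could hand a request over to it roughly as follows. This is a sketch under assumptions: the function name, the line format (service name followed by an action) and any FIFO path are hypothetical, not something OpenRC defines.</p>

```shell
#!/bin/sh
# Hypothetical client side for the FIFO daemon.
# $1 - path to the FIFO
# $2 - service name
# $3 - action (defaults to "restart")
request_action() {
    fifo=$1
    service=$2
    action=${3:-restart}
    # Refuse to write if the FIFO does not exist: a regular file would
    # silently grow instead of reaching the daemon.
    [ -p "$fifo" ] || return 1
    # One request per line. Opening the FIFO for writing blocks until
    # a reader (the daemon) has opened the other end.
    printf '%s %s\n' "$service" "$action" > "$fifo"
}
```

+<p>For example, <code>request_action /run/openrc-restart.fifo sshd</code> would send the line <code>sshd restart</code>, assuming the daemon was started with that (made-up) FIFO path.</p>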
+<h4 id="release-notify-agent-improvement">Release notify agent improvement</h4>
+<p>The current release notify agent is very simple; the idea is to extend it to support user hooks. There are different ways to do this:</p>
+<ol style="list-style-type: decimal">
+<li>Add it to the service state. This requires a hook in the init script.</li>
+<li>Create a static structure in the filesystem.</li>
+</ol>
+<p>We will use option 2, as it’s simpler and doesn’t require hacking init scripts. We will have the following file structure:</p>
+<p>In /etc/conf.d/cgroups there will be a hook named ‘cgroup-release’ that serves as the default, and hooks named ‘service-name.cgroup-release’ that are specific to one service. Here is my example:</p>
+<pre><code>/etc/conf.d/cgroups
+|-- cgroup-release # default release hook
+|-- service1.cgroup-release -&gt; service-restart.cgroup-release # service release hook
+`-- service-restart.cgroup-release # example script</code></pre>
+<p>This approach doesn’t scale to multiple hooks per service, but it may be improved after discussion with upstream. A hook can exit with the $RC_CGROUP_CONTINUE code, in which case the cgroup is not deleted after the hook has run.</p>
+<p>Here is the script itself (a newer version can be found on <a href="">github</a>):</p>
+<pre class="sourceCode bash"><code class="sourceCode bash"><span class="ot">PATH=</span>/bin:/usr/bin:/sbin:/usr/sbin
+<span class="ot">cgroup=</span>/sys/fs/cgroup/openrc
+<span class="co"># Remove the cgroup unless a hook asks to keep it.</span>
+<span class="ot">cgroup_rmdir=</span>1
+<span class="ot">RC_SVCNAME=${1}</span>
+<span class="kw">if [</span> <span class="ot">-n</span> <span class="st">&quot;</span><span class="ot">${RC_SVCNAME}</span><span class="st">&quot;</span><span class="kw"> ]</span> ; <span class="kw">then</span>
+    <span class="co"># Prefer the service specific hook, fall back to the default one.</span>
+    <span class="ot">hook=</span>@SYSCONFDIR@/conf.d/cgroups/<span class="ot">${RC_SVCNAME}</span>.cgroup-release
+    <span class="kw">[</span> <span class="ot">-f</span> <span class="st">&quot;</span><span class="ot">$hook</span><span class="st">&quot;</span><span class="kw"> ]</span> <span class="kw">&amp;&amp;</span> <span class="kw">[</span> <span class="ot">-x</span> <span class="st">&quot;</span><span class="ot">$hook</span><span class="st">&quot;</span><span class="kw"> ]</span> <span class="kw">||</span> <span class="ot">hook=</span>@SYSCONFDIR@/conf.d/cgroups/cgroup-release
+    <span class="kw">if [</span> <span class="ot">-x</span> <span class="st">&quot;</span><span class="ot">$hook</span><span class="st">&quot;</span><span class="kw"> ]</span>; <span class="kw">then</span>
+        <span class="co"># RC_CGROUP_CONTINUE is not set here; it must come from the environment.</span>
+        <span class="st">&quot;</span><span class="ot">$hook</span><span class="st">&quot;</span> <span class="kw">cleanup</span> <span class="st">&quot;</span><span class="ot">$RC_SVCNAME</span><span class="st">&quot;</span> <span class="kw">||</span> <span class="kw">case</span> <span class="ot">$?</span><span class="kw"> in</span> <span class="st">&quot;</span><span class="ot">$RC_CGROUP_CONTINUE</span><span class="st">&quot;</span><span class="kw">)</span> <span class="ot">cgroup_rmdir=</span>0<span class="kw">;;</span> <span class="kw">esac</span>
+    <span class="kw">fi</span>
+<span class="kw">fi</span>
+<span class="kw">if [</span> <span class="st">&quot;</span><span class="ot">${cgroup_rmdir}</span><span class="st">&quot;</span> <span class="ot">-eq</span> 1<span class="kw"> ]</span> <span class="kw">&amp;&amp;</span> <span class="kw">[</span> <span class="ot">-d</span> <span class="st">&quot;</span><span class="ot">${cgroup}</span>/<span class="ot">$1</span><span class="st">&quot;</span><span class="kw"> ]</span>; <span class="kw">then</span>
+    <span class="kw">for</span> c <span class="kw">in</span> /sys/fs/cgroup/* <span class="kw">;</span> <span class="kw">do</span>
+        <span class="kw">rmdir</span> <span class="st">&quot;</span><span class="ot">${c}</span><span class="st">&quot;</span>/openrc_<span class="st">&quot;</span><span class="ot">$1</span><span class="st">&quot;</span>
+    <span class="kw">done</span>
+    <span class="kw">rmdir</span> <span class="st">&quot;</span><span class="ot">${cgroup}</span>/<span class="ot">${1}</span><span class="st">&quot;</span>
+<span class="kw">fi</span></code></pre>
+<p>Restart service script. This script simply checks the service state and, if the status code is 32 (service failed), starts a new instance and exits with <code>$RC_CGROUP_CONTINUE</code>:</p>
+<pre class="sourceCode bash"><code class="sourceCode bash"><span class="co">#!/bin/sh</span>
+<span class="co"># This script is run for a service that needs to be restarted</span>
+<span class="co"># when its last process leaves the cgroup.</span>
+<span class="ot">action=$1</span>
+<span class="ot">service=$2</span>
+<span class="kw">if [</span> <span class="st">&quot;</span><span class="ot">$action</span><span class="st">&quot;</span> <span class="ot">=</span> <span class="st">&quot;cleanup&quot;</span><span class="kw"> ]</span> ; <span class="kw">then</span>
+    <span class="kw">rc-service</span> <span class="st">&quot;</span><span class="ot">$service</span><span class="st">&quot;</span> status <span class="kw">&gt;</span> /dev/null
+    <span class="kw">case</span> <span class="ot">$?</span><span class="kw"> in</span>
+        32<span class="kw">)</span>
+            <span class="co"># The service is marked as failed: start it again and keep the cgroup.</span>
+            <span class="kw">/etc/init.d/</span><span class="st">&quot;</span><span class="ot">${service}</span><span class="st">&quot;</span> <span class="kw">-d</span> restart
+            <span class="kw">exit</span> <span class="ot">$RC_CGROUP_CONTINUE</span>
+            <span class="kw">;;</span>
+        *<span class="kw">)</span>
+            <span class="kw">exit</span> 0<span class="kw">;;</span>
+    <span class="kw">esac</span>
+<span class="kw">fi</span></code></pre>
+<h3 id="other-solutions">Other solutions</h3>
+<p>General supervision is quite a complicated problem, as there are many conditions under which a service can be considered failed, such as:</p>
+<ul>
+<li>the main process dies</li>
+<li>all of the service’s children die</li>
+<li>the service does not write logs for some time</li>
+<li>high resource (memory/CPU) consumption</li>
+<li>the service does not respond for some time</li>
+<li>the log fd is closed</li>
+</ul>
+<p>Some of these conditions can be translated into others: high resource consumption can be translated into process death by setting appropriate limits, and process death (in some cases even the death of children) can be tracked through the log fd (for a process running in the background).</p>
+<p>One more thing: you may need more complicated hooks that keep state in order to decide what to do with a failed service, for example not restarting it if it has failed many times within a short period.</p>
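+<p>A stateful hook of that kind could be sketched as below. Everything here is an assumption made for illustration: the function name, the state file format (one restart timestamp per line) and the limits are hypothetical and not part of OpenRC.</p>

```shell
#!/bin/sh
# Hypothetical restart limiter: allow at most $3 restarts within $4 seconds.
# $1 - state file holding one restart timestamp per line
# $2 - current time in seconds since the epoch
# $3 - maximum number of restarts inside the window
# $4 - window length in seconds
# Returns 0 if a restart is allowed (and records it), 1 otherwise.
restart_allowed() {
    state=$1
    now=$2
    max=$3
    window=$4
    recent=""
    count=0
    if [ -f "$state" ]; then
        while read -r ts; do
            [ -n "$ts" ] || continue
            # Keep only timestamps that are still inside the window.
            if [ $((now - ts)) -lt "$window" ]; then
                recent="$recent$ts
"
                count=$((count + 1))
            fi
        done < "$state"
    fi
    if [ "$count" -ge "$max" ]; then
        # Too many recent restarts: only drop the expired entries.
        printf '%s' "$recent" > "$state"
        return 1
    fi
    # Record this restart together with the entries that are still recent.
    printf '%s%s\n' "$recent" "$now" > "$state"
}
```

+<p>A hook could then guard the restart with something like <code>restart_allowed /run/sshd.restarts &quot;$(date +%s)&quot; 3 60 &amp;&amp; /etc/init.d/sshd -d restart</code>, where the state file path and the limits are again only examples.</p>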
+<p>A full-featured system would therefore be very complicated, so non-specialized subsystems address only a part of the problem domain. Here are some examples of other supervision systems:</p>
+<h2 id="related-work">Related work</h2>
+<ol style="list-style-type: decimal">
+<li>Work on including user hooks in the OpenRC release agent.</li>
+<li>Improve the restart script so that it tracks services that are really dead and can be restarted.</li>
+</ol>
+<h2 id="conclusions-and-futher-work">Conclusions and further work</h2>
+<p>It’s possible to create a very simple and extensible supervision system on top of OpenRC by extending its notification system. There are also more use cases for it, such as:</p>
+<ul>
+<li>adding a system-wide notification mechanism via dbus</li>
+<li>an additional logging system</li>
+</ul>
+<hr />
+ <em>Alexander Vershilov</em>
+<br class="clearfix" />
