<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Qnikst blog - Supervision in pure OpenRC using cgroup subsystem.</title>
<!-- Bootstrap -->
<link href="../css/bootstrap.min.css" rel="stylesheet" media="screen">
<style>
body {
padding-top: 60px; /* 60px to make the container go all the way to the bottom of the topbar */
}
</style>
<script src="http://code.jquery.com/jquery-latest.js"></script>
<script src="../js/bootstrap.min.js"></script>
</head>
<body>
<div class="navbar navbar-fixed-top navbar-inverse">
<div class="navbar-inner">
<a class="brand" href="../">Qnikst blog</a>
<ul class="nav ">
<li class="active"><a href="../">Home</a></li>
<li><a href="../posts.html">Blog</a></li>
<li><a href="../projects.html">Projects</a></li>
<li><a href="../contact.html">Contacts</a></li>
<li><a href="../rss/">RSS</a></li>
</ul>
</div>
</div>
<div class="container">
<div class="page-header">
<h1>Supervision in pure OpenRC using cgroup subsystem. <br /><small><strong>August 8, 2013</strong></small></h1>
</div>
<div style="float:right;width:200px;font-size:0.5em;">
Updates:
<ul>
<li>
2013.08.09 - small corrections, acknowledgement section added
</li>
</ul>
Versions:
<ul>
<li>
Kernel &gt;=2.6.24 &amp;&amp; &lt;=3.10
</li>
<li>
OpenRC 0.12
</li>
</ul>
</div>
<h2 id="abstract">Abstract</h2>
<p>This post describes how cgroup support in OpenRC can be extended with user hooks, and shows how to build a basic supervision daemon on top of cgroups.</p>
<p>This post describes OpenRC-0.11/0.12_beta, and some things may change in later versions. If you find such changes, please let me know so I can post updates here.</p>
<h2 id="introduction">Introduction</h2>
<h3 id="the-problem">The problem</h3>
<p>In the general case, there are many services that should be kept running and restarted when they fail. There are many subproblems too, such as deciding when a service should be restarted and when it should not. Many existing systems solve these issues, each with different trade-offs. In this post I’ll try to present a simple mechanism that allows creating basic supervision and other nice things.</p>
<h3 id="idea">Idea</h3>
<p>The Linux kernel provides a mechanism to track groups of processes - <code>Cgroups</code>. All children of a process are placed in the parent’s cgroup, and it’s easy to track cgroups from user space. If you want to understand cgroups better you may read the <a href="https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt">cgroups documentation</a>. Cgroups also provide a way to set limits on and control groups of processes; that is useful too, but out of scope for this post.</p>
<p>When all processes in a group die, the kernel runs the hierarchy’s release agent (the program named in the <code>release_agent</code> file, provided <code>notify_on_release</code> is enabled for the cgroup), passing it the path to the cgroup. This may be used to remove empty cgroups and take additional actions.</p>
<p>The idea is that, at this point, we can check the service state to decide whether we should restart it.</p>
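<p>As a rough illustration of the notification mechanism, the sketch below switches it on for a cgroup v1 hierarchy by writing to the two control files. The helper name <code>enable_release_notify</code> and its arguments are my own assumptions for this example and are not part of OpenRC.</p>

```shell
#!/bin/sh
# Sketch: enable release notification for a cgroup v1 hierarchy.
# enable_release_notify is a hypothetical helper, not part of OpenRC.
# $1 - hierarchy root (e.g. /sys/fs/cgroup/openrc), $2 - agent path.
enable_release_notify() {
    root=$1
    agent=$2
    if [ ! -d "$root" ]; then
        echo "no such hierarchy: $root"
        return 1
    fi
    # The kernel runs this program with the released cgroup's path
    # (relative to the hierarchy root) as its only argument.
    echo "$agent" > "$root/release_agent" || return 1
    # Child cgroups created later inherit this flag from their parent.
    echo 1 > "$root/notify_on_release"
}
```

<p>It could be called as <code>enable_release_notify /sys/fs/cgroup/openrc /path/to/cgroup-release-agent.sh</code>, where the agent path is a placeholder. Taking the hierarchy root as a parameter also makes it possible to try the function on a scratch directory first.</p>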
<h2 id="details">Details</h2>
<h3 id="implementation">Implementation</h3>
<p>Here are improvements and files that should be added to OpenRC to provide the required functionality.</p>
<h4 id="restart-daemon">Restart daemon</h4>
<p>First we need a daemon that restarts services: we can’t start a service from the agent itself, as the agent has the <code>PF_NO_SETAFFINITY</code> flag and thus cgroups will not work for any of its children. So let’s have a very simple daemon; it will be extended in the next posts.</p>
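<pre><code>#!/bin/sh
# Read requests from a FIFO, one per line.
if [ $# -lt 1 ] ; then
    echo "usage is $0 &lt;path to fifo&gt;"
    exit 1
fi
# Reopen the FIFO every time a writer closes it.
while [ -p "$1" ] ; do
    while read -r line ; do
        # Print the rc-service command for each request.
        echo "rc-service $line"
    done &lt;"$1"
done</code></pre>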
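<p>For completeness, here is a minimal sketch of the writing side, i.e. how a hook might hand a request to the daemon above. The function name <code>request_restart</code> and the "service action" line format are assumptions made for this example; the daemon only requires that each line is a valid argument list for <code>rc-service</code>.</p>

```shell
#!/bin/sh
# Sketch: queue a request for the restart daemon.
# request_restart is a hypothetical helper; each line it writes is
# what the daemon appends to "rc-service".
request_restart() {
    fifo=$1
    service=$2
    # Refuse to write when the FIFO is missing, otherwise the
    # redirection would silently create a regular file.
    if [ ! -p "$fifo" ]; then
        echo "not a fifo: $fifo"
        return 1
    fi
    # Blocks until the daemon has the FIFO open for reading.
    echo "$service restart" > "$fifo"
}
```

<p>The FIFO itself can be created with <code>mkfifo</code> before the daemon is started; its location is up to you.</p>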
<h4 id="release-notify-agent-improvement">Release notify agent improvement</h4>
<p>The current release notify agent is very simple, so we extend it to support user hooks. There are a couple of ways to do it:</p>
<ol style="list-style-type: decimal">
<li>Add it to the service state. (Requires hook in the init script)</li>
<li>Create static structure in a filesystem</li>
</ol>
<p>We will use option 2, as it’s simpler and doesn’t require hacking init scripts. We will have the following file structure:</p>
<p>The hooks live in /etc/conf.d/cgroups: ‘cgroup-release’ is the default hook and ‘service-name.cgroup-release’ is a service-specific one. Here is my example.</p>
<pre><code>/etc/conf.d/cgroups/
|-- cgroup-release # default release hook
|-- foo.cgroup-release -> service-restart.cgroup-release # service release hook
`-- service-restart.cgroup-release # example script
</code></pre>
<p>This approach doesn’t scale to multiple hooks per service, but it may be improved after discussion with upstream. Each script can exit with the $RC_CGROUP_CONTINUE code, in which case the cgroup will not be deleted after the hook runs.</p>
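<p>The lookup order implied by this layout (service-specific hook first, then the default one) can be sketched as a small function. <code>find_release_hook</code> is a hypothetical name used only for this illustration.</p>

```shell
#!/bin/sh
# Sketch: pick the release hook for a service.
# find_release_hook is a hypothetical name; the directory layout is
# the one shown in the tree above.
# $1 - hooks directory, $2 - service name.
find_release_hook() {
    dir=$1
    service=$2
    hook="$dir/$service.cgroup-release"
    # Fall back to the default hook when there is no executable
    # service-specific one.
    [ -x "$hook" ] || hook="$dir/cgroup-release"
    if [ -x "$hook" ]; then
        echo "$hook"
        return 0
    fi
    # Neither hook exists: nothing to run.
    return 1
}
```

<p>Because <code>-x</code> follows symlinks, a service hook that is a link to a shared script (like <code>foo.cgroup-release</code> in the tree) is resolved the same way as a regular file.</p>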
<p>Here is the release agent script itself (a newer version can be found on <a href="https://github.com/qnikst/openrc/blob/cgroups.release_notification/sh/cgroup-release-agent.sh.in">github</a>):</p>
<pre class="sourceCode bash"><code class="sourceCode bash"><span class="ot">PATH=</span>/bin:/usr/bin:/sbin:/usr/sbin
<span class="ot">cgroup=</span>/sys/fs/cgroup/openrc
<span class="ot">cgroup_rmdir=</span>1
<span class="ot">RC_SVCNAME=$1</span>
<span class="ot">RC_CGROUP_CONTINUE=</span>3;
<span class="kw">export</span> <span class="ot">RC_CGROUP_CONTINUE</span> <span class="ot">RC_SVCNAME</span> <span class="ot">PATH</span>;
<span class="kw">if [</span> <span class="ot">-n</span> <span class="st">"</span><span class="ot">${RC_SVCNAME}</span><span class="st">"</span><span class="kw"> ]</span> ; <span class="kw">then</span>
<span class="ot">hook=</span>@SYSCONFDIR@/conf.d/cgroups/<span class="ot">${RC_SVCNAME}</span>.cgroup-release
<span class="kw"> [</span> <span class="ot">-x</span> <span class="st">"</span><span class="ot">$hook</span><span class="st">"</span><span class="kw"> ]</span> <span class="kw">||</span> <span class="ot">hook=</span>@SYSCONFDIR@/conf.d/cgroups/cgroup-release;
<span class="kw">if [</span> <span class="ot">-x</span> <span class="st">"</span><span class="ot">$hook</span><span class="st">"</span><span class="kw"> ]</span>; <span class="kw">then</span>
<span class="st">"</span><span class="ot">$hook</span><span class="st">"</span> <span class="kw">cleanup</span> <span class="st">"</span><span class="ot">$RC_SVCNAME</span><span class="st">"</span> <span class="kw">||</span> <span class="kw">case</span> <span class="ot">$?</span><span class="kw"> in</span> <span class="ot">$RC_CGROUP_CONTINUE</span><span class="kw">)</span> <span class="ot">cgroup_rmdir=</span>0<span class="kw">;;</span> <span class="kw">esac</span> ;
<span class="kw">fi</span>
<span class="kw">fi</span>
<span class="kw">if [</span> <span class="ot">${cgroup_rmdir}</span> <span class="ot">-eq</span> 1<span class="kw"> ]</span> <span class="kw">&& [</span> <span class="ot">-d</span> <span class="st">"</span><span class="ot">${cgroup}</span><span class="st">/</span><span class="ot">$1</span><span class="st">"</span><span class="kw"> ]</span>; <span class="kw">then</span>
<span class="kw">for</span> <span class="kw">c</span> in /sys/fs/cgroup/*/<span class="st">"openrc_</span><span class="ot">$1</span><span class="st">"</span> <span class="kw">;</span> <span class="kw">do</span>
<span class="kw">rmdir</span> <span class="st">"</span><span class="ot">${c}</span><span class="st">"</span>
<span class="kw">done</span>;
<span class="kw">rmdir</span> <span class="st">"</span><span class="ot">$cgroup</span><span class="st">/</span><span class="ot">${1}</span><span class="st">"</span>
<span class="kw">fi</span></code></pre>
<p>Restart service script. This script simply checks the service state and, if it’s 32 (service failed), restarts the service and exits with <code>$RC_CGROUP_CONTINUE</code> so that the cgroup is kept:</p>
<pre class="sourceCode bash"><code class="sourceCode bash"><span class="co">#!/bin/sh</span>
<span class="co"># This script is run for a service that needs to be restarted</span>
<span class="co"># when its last process leaves the cgroup.</span>
<span class="ot">action=$1</span>
<span class="ot">service=$2</span>
<span class="kw">if [</span> cleanup <span class="ot">=</span> <span class="st">"</span><span class="ot">$action</span><span class="st">"</span><span class="kw"> ]</span> ; <span class="kw">then</span>
<span class="kw">rc-service</span> <span class="ot">$service</span> status <span class="kw">></span> /dev/null
<span class="kw">case</span> <span class="ot">$?</span><span class="kw"> in</span>
32<span class="kw">)</span>
<span class="kw">/etc/init.d/</span><span class="ot">${service}</span> <span class="kw">-d</span> restart
<span class="kw">exit</span> <span class="ot">$RC_CGROUP_CONTINUE</span>
<span class="kw">;;</span>
*<span class="kw">)</span>
<span class="kw">exit</span> 0<span class="kw">;;</span>
<span class="kw">esac</span>
<span class="kw">fi</span></code></pre>
<h3 id="other-solutions">Other solutions</h3>
<p>Generic supervision is quite a complicated problem, as there are many conditions under which we may consider a service to have failed, like:</p>
<ul>
<li>main process dies;</li>
<li>all service children die;</li>
<li>service does not write logs for some time;</li>
<li>excessive memory/CPU consumption;</li>
<li>service does not respond to control call;</li>
<li>log fd is closed.</li>
</ul>
<p>Some of these conditions can be translated into others: excessive resource consumption can be turned into process death by setting the right limits, and process death (in some cases even the death of children) can be tracked through the log fd (for a process running in the background).</p>
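<p>As an example of turning resource consumption into process death, the sketch below caps memory for a service’s cgroup using the cgroup v1 memory controller; once the limit is exceeded, the kernel’s OOM killer acts inside that cgroup. The helper name <code>set_memory_limit</code> is an assumption for this example, not an OpenRC function.</p>

```shell
#!/bin/sh
# Sketch: cap memory for a service's cgroup (cgroup v1 memory
# controller). set_memory_limit is a hypothetical helper.
# $1 - the service's directory in the memory hierarchy,
# $2 - limit in bytes.
set_memory_limit() {
    cg=$1
    bytes=$2
    # Accept only a plain, non-empty decimal number.
    case $bytes in
        ''|*[!0-9]*)
            echo "limit must be a number of bytes"
            return 1
            ;;
    esac
    [ -d "$cg" ] || return 1
    echo "$bytes" > "$cg/memory.limit_in_bytes"
}
```

<p>With the directory naming used by the release agent above, a call could look like <code>set_memory_limit /sys/fs/cgroup/memory/openrc_foo 268435456</code>, where <code>foo</code> is a placeholder service name.</p>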
<p>More complex hooks may also be needed to decide what to do with a failed service, e.g. not restarting it if it has failed many times within a short period.</p>
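<p>A restart limit of that kind could be sketched as follows. The function name <code>should_restart</code>, the state file (one epoch timestamp per line) and the optional fourth argument are all assumptions made for this illustration; <code>date +%s</code> is not strictly POSIX but is available in GNU coreutils and busybox.</p>

```shell
#!/bin/sh
# Sketch: allow at most $max restarts within $window seconds.
# should_restart and the state file format (one epoch timestamp per
# line) are hypothetical, not part of OpenRC.
# $1 - state file, $2 - max restarts, $3 - window in seconds,
# $4 - current time (optional, defaults to now).
should_restart() {
    state=$1
    max=$2
    window=$3
    now=${4:-$(date +%s)}
    recent=0
    kept=""
    if [ -f "$state" ]; then
        # Keep only the restarts that are still inside the window.
        for t in $(cat "$state"); do
            if [ $((now - t)) -lt "$window" ]; then
                recent=$((recent + 1))
                kept="$kept$t
"
            fi
        done
    fi
    if [ "$recent" -ge "$max" ]; then
        # Too many recent restarts: drop expired entries and refuse.
        printf '%s' "$kept" > "$state"
        return 1
    fi
    # Record this restart and allow it.
    printf '%s%s\n' "$kept" "$now" > "$state"
    return 0
}
```

<p>A restart hook could call it before restarting the service and let the cgroup be removed as usual when it returns non-zero. Passing the current time as an argument keeps the function easy to test.</p>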
<p>A system with all of these features would be very complicated, so non-specialized subsystems address only a part of the problem domain. Here are some other examples of supervision systems:</p>
<ul>
<li>monit (full featured)</li>
<li>s6 (pid, fd based)</li>
<li>daemontools</li>
<li>angel</li>
<li>systemd (pid, cgroups based)</li>
<li>upstart (pid based)</li>
</ul>
<h2 id="future-work">Future work</h2>
<ol style="list-style-type: decimal">
<li>Work on including user hooks in the OpenRC release agent.</li>
<li>Improve the restart script to detect services that are really dead and can be restarted.</li>
</ol>
<h2 id="conclusions-and-futher-work">Conclusions and further work</h2>
<p>It’s possible to create a very simple and extensible supervision system based on OpenRC by extending its notification system. There are also more use cases for it, like:</p>
<ul>
<li>adding system wide notification mechanism via dbus</li>
<li>additional logging system</li>
</ul>
<h2 id="acknowledgements">Acknowledgements</h2>
<p>I want to thank igli for code corrections and useful tips, and Kirill Zaborsky for correcting language mistakes.</p>
<hr />
<div class="pull-right">
<em>Alexander Vershilov</em>
<a href="http://creativecommons.org/licenses/by/3.0"><img src="http://i.creativecommons.org/l/by/3.0/88x31.png" /></a>
</div>
<br class="clearfix" />
<div id="disqus_thread"></div>
<script type="text/javascript">
/* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
var disqus_shortname = 'qnikst'; // required: replace example with your forum shortname
(function() {
var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
})();
</script>
<noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
<a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
<footer>
Site generated using <a href="http://jaspervdj.be/hakyll">Hakyll</a> using <a href="http://johnmacfarlane.net/pandoc/">pandoc</a>
</footer>
</div>
<script type="text/javascript">
// <noscript> I really want to count you, at least leave a comment, pleeease </noscript>
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-38941774-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</body>
</html>