<?xml version="1.0" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Monit - utility for monitoring services on a Unix system</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<link rev="made" href="mailto:martinpala@trilobite.local" />
</head>
<body style="background-color: white">
<!-- INDEX BEGIN -->
<div name="index">
<p><a name="__index__"></a></p>
<ul>
<li><a href="#name">NAME</a></li>
<li><a href="#synopsis">SYNOPSIS</a></li>
<li><a href="#description">DESCRIPTION</a></li>
<li><a href="#general_operation">GENERAL OPERATION</a></li>
<ul>
<li><a href="#general_options_and_arguments">General Options and Arguments</a></li>
</ul>
<li><a href="#what_to_monitor">WHAT TO MONITOR</a></li>
<li><a href="#how_to_monitor">HOW TO MONITOR</a></li>
<li><a href="#logging">LOGGING</a></li>
<li><a href="#daemon_mode">DAEMON MODE</a></li>
<li><a href="#init_support">INIT SUPPORT</a></li>
<li><a href="#include_files">INCLUDE FILES</a></li>
<li><a href="#group_support">GROUP SUPPORT</a></li>
<li><a href="#monitoring_mode">MONITORING MODE</a></li>
<li><a href="#alert_messages">ALERT MESSAGES</a></li>
<ul>
<li><a href="#setting_a_global_alert_statement">Setting a global alert statement</a></li>
<li><a href="#setting_a_local_alert_statement">Setting a local alert statement</a></li>
<li><a href="#alert_message_layout">Alert message layout</a></li>
<li><a href="#setting_a_global_mail_format">Setting a global mail format</a></li>
<li><a href="#setting_an_error_reminder">Setting an error reminder</a></li>
<li><a href="#setting_a_mail_server_for_alert_messages">Setting a mail server for alert messages</a></li>
<li><a href="#event_queue">Event queue</a></li>
</ul>
<li><a href="#service_timeout">SERVICE TIMEOUT</a></li>
<li><a href="#service_tests">SERVICE TESTS</a></li>
<ul>
<li><a href="#existence_testing">EXISTENCE TESTING</a></li>
<li><a href="#resource_testing">RESOURCE TESTING</a></li>
<li><a href="#file_checksum_testing">FILE CHECKSUM TESTING</a></li>
<li><a href="#timestamp_testing">TIMESTAMP TESTING</a></li>
<li><a href="#file_size_testing">FILE SIZE TESTING</a></li>
<li><a href="#file_content_testing">FILE CONTENT TESTING</a></li>
<li><a href="#filesystem_flags_testing">FILESYSTEM FLAGS TESTING</a></li>
<li><a href="#space_testing">SPACE TESTING</a></li>
<li><a href="#inode_testing">INODE TESTING</a></li>
<li><a href="#permission_testing">PERMISSION TESTING</a></li>
<li><a href="#uid_testing">UID TESTING</a></li>
<li><a href="#gid_testing">GID TESTING</a></li>
<li><a href="#pid_testing">PID TESTING</a></li>
<li><a href="#ppid_testing">PPID TESTING</a></li>
<li><a href="#connection_testing">CONNECTION TESTING</a></li>
<ul>
<ul>
<li><a href="#connection_testing_using_the_url_notation">Connection testing using the URL notation</a></li>
<li><a href="#remote_host_ping_test">Remote host ping test</a></li>
<li><a href="#examples">Examples</a></li>
<li><a href="#testing_the_sip_protocol">Testing the SIP protocol</a></li>
<li><a href="#testing_the_radius_protocol">Testing the RADIUS protocol</a></li>
</ul>
</ul>
</ul>
<li><a href="#service_poll_time">SERVICE POLL TIME</a></li>
<li><a href="#monit_httpd">MONIT HTTPD</a></li>
<ul>
<li><a href="#fips_support">FIPS support</a></li>
<li><a href="#monit_httpd_authentication">Monit HTTPD Authentication</a></li>
<ul>
<ul>
<li><a href="#host_and_network_allow_list">Host and network allow list</a></li>
<li><a href="#basic_authentication">Basic Authentication</a></li>
</ul>
</ul>
</ul>
<li><a href="#dependencies">DEPENDENCIES</a></li>
<li><a href="#the_run_control_file">THE RUN CONTROL FILE</a></li>
<ul>
<li><a href="#run_control_syntax">Run Control Syntax</a></li>
<li><a href="#configuration_examples">CONFIGURATION EXAMPLES</a></li>
</ul>
<li><a href="#files">FILES</a></li>
<li><a href="#environment">ENVIRONMENT</a></li>
<li><a href="#signals">SIGNALS</a></li>
<li><a href="#notes">NOTES</a></li>
<li><a href="#authors">AUTHORS</a></li>
<li><a href="#copyright">COPYRIGHT</a></li>
<li><a href="#see_also">SEE ALSO</a></li>
</ul>
<hr name="index" />
</div>
<!-- INDEX END -->
<p>
</p>
<h1><a name="name">NAME</a></h1>
<p>Monit - utility for monitoring services on a Unix system</p>
<p>
</p>
<hr />
<h1><a name="synopsis">SYNOPSIS</a></h1>
<p><strong>monit</strong> [options] {arguments}</p>
<p>
</p>
<hr />
<h1><a name="description">DESCRIPTION</a></h1>
<p><strong>monit</strong> is a utility for managing and monitoring processes,
files, directories and filesystems on a Unix system. Monit
conducts automatic maintenance and repair and can execute
meaningful causal actions in error situations. For example, Monit can
start a process if it is not running, restart a process if it does
not respond and stop a process if it uses too many resources. You
may use Monit to monitor files, directories and filesystems for
changes, such as timestamp changes, checksum changes or size
changes.</p>
<p>Monit is controlled via an easy-to-configure control file based
on a free-format, token-oriented syntax. Monit logs to syslog or
to its own log file and notifies you about error conditions via
customizable alert messages. Monit can perform various TCP/IP
network checks and protocol checks, and can use SSL for such
checks. Monit provides an HTTP(S) interface, so you may use a
browser to access the Monit program.</p>
<p>
</p>
<hr />
<h1><a name="general_operation">GENERAL OPERATION</a></h1>
<p>The behavior of Monit is controlled by command-line options
<em>and</em> a run control file, <em class="file">~/.monitrc</em>, the syntax of which we
describe in a later section. Command-line options override
<em class="file">.monitrc</em> declarations.</p>
<p>The following options are recognized by monit. However, it is
recommended that you set options (when applicable) directly in
the <em>.monitrc</em> control file.</p>
<p>
</p>
<h2><a name="general_options_and_arguments">General Options and Arguments</a></h2>
<p><strong>-c</strong> <em>file</em>
Use this control file</p>
<p><strong>-d</strong> <em>n</em>
Run as a daemon once per <em>n</em> seconds</p>
<p><strong>-g</strong>
Set group name for start, stop, restart, monitor and
unmonitor.</p>
<p><strong>-l</strong> <em>logfile</em>
Print log information to this file</p>
<p><strong>-p</strong> <em>pidfile</em>
Use this lock file in daemon mode</p>
<p><strong>-s</strong> <em>statefile</em>
Write state information to this file</p>
<p><strong>-I</strong>
Do not run in background (needed for run from init)</p>
<p><strong>-t</strong>
Run syntax check for the control file</p>
<p><strong>-v</strong>
Verbose mode, work noisy (diagnostic output)</p>
<p><strong>-H</strong> <em>[filename]</em>
Print MD5 and SHA1 hashes of the file or of stdin if the
filename is omitted; Monit will exit afterwards</p>
<p><strong>-V</strong>
Print version number and patch level</p>
<p><strong>-h</strong>
Print a help text</p>
<p>In addition to the options above, Monit can be started with one
of the following action arguments; Monit will then execute the
action and exit without transforming itself to a daemon.</p>
<p><strong>start all</strong>
Start all services listed in the control file and
enable monitoring for them. If the group option is
set, only start and enable monitoring of services in
the named group (no "all" verb is required in this
case).</p>
<p><strong>start name</strong>
Start the named service and enable monitoring for
it. The name is a service entry name from the
monitrc file.</p>
<p><strong>stop all</strong>
Stop all services listed in the control file and
disable their monitoring. If the group option is
set, only stop and disable monitoring of the services
in the named group (no "all" verb is required in this
case).</p>
<p><strong>stop name</strong>
Stop the named service and disable its monitoring.
The name is a service entry name from the monitrc
file.</p>
<p><strong>restart all</strong>
Stop and start <em>all</em> services. If the group option
is set, only restart the services in the named group
(no "all" verb is required in this case).</p>
<p><strong>restart name</strong>
Restart the named service. The name is a service entry
name from the monitrc file.</p>
<p><strong>monitor all</strong>
Enable monitoring of all services listed in the
control file. If the group option is set, only start
monitoring of services in the named group (no "all"
verb is required in this case).</p>
<p><strong>monitor name</strong>
Enable monitoring of the named service. The name is
a service entry name from the monitrc file. Monit will
also enable monitoring of all services this service
depends on.</p>
<p><strong>unmonitor all</strong>
Disable monitoring of all services listed in the
control file. If the group option is set, only disable
monitoring of services in the named group (no "all"
verb is required in this case).</p>
<p><strong>unmonitor name</strong>
Disable monitoring of the named service. The name is
a service entry name from the monitrc file. Monit
will also disable monitoring of all services that
depend on this service.</p>
<p><strong>status</strong>
Print full status information for each service.</p>
<p><strong>summary</strong>
Print short status information for each service.</p>
<p><strong>reload</strong>
Reinitialize a running Monit daemon. The daemon will
reread its configuration and close and reopen its log files.</p>
<p><strong>quit</strong>
Kill a Monit daemon process</p>
<p><strong>validate</strong>
Check all services listed in the control file. This
action is also the default behavior when Monit runs
in daemon mode.</p>
<p>
</p>
<hr />
<h1><a name="what_to_monitor">WHAT TO MONITOR</a></h1>
<p>You may use Monit to monitor daemon processes or similar programs
running on localhost. Monit is particularly useful for monitoring
daemon processes, such as those started at system boot time from
/etc/init.d/, for instance sendmail, sshd, apache and mysql. Unlike
many monitoring systems, Monit can act if an error
situation occurs. For example, if sendmail is not running, Monit
can start sendmail, or if apache is using too many resources (e.g.
if a DoS attack is in progress), Monit can stop or restart apache
and send you an alert message. Monit can also monitor process
characteristics, such as whether a process has become a zombie and
how much memory or how many CPU cycles a process is using.</p>
<p>You may also use Monit to monitor files, directories and
filesystems on localhost. Monit can monitor these items for
changes, such as timestamp changes, checksum changes or size
changes. This is also useful for security reasons: you can
monitor the MD5 checksum of files that should not change.</p>
<p>You may even use Monit to monitor remote hosts. First and
foremost Monit is a utility for monitoring and mending services
on localhost, but if a service depends on a remote service, e.g.
a database server or an application server, it might be useful to
be able to test a remote host as well.</p>
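<p>As a minimal sketch of such a remote test, a host entry could
look like the following. The service name <em>dbserver</em>, the
address and the port are assumptions made for illustration only; the
connection test itself is described under CONNECTION TESTING.</p>
<pre>
# Hypothetical remote database server; name, address and port are assumed
check host dbserver with address 192.168.1.10
  if failed port 3306 then alert</pre>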
<p>You may also monitor general system-wide resources such as CPU
usage, memory and load average.</p>
<p>
</p>
<hr />
<h1><a name="how_to_monitor">HOW TO MONITOR</a></h1>
<p>Monit is configured and controlled via a control file called
<strong>monitrc</strong>. The default location for this file is ~/.monitrc. If
this file does not exist, Monit will try /etc/monitrc, then
@sysconfdir@/monitrc and finally ./monitrc.</p>
<p>A Monit control file consists of a series of service entries and
global option statements in a free-format, token-oriented syntax.
Comments begin with a # and extend through the end of the line.
There are three kinds of tokens in the control file: grammar
keywords, numbers and strings.</p>
<p>On a semantic level, the control file consists of three types of
statements:</p>
<ol>
<li><strong><a name="global_set_statements" class="item">Global set-statements</a></strong>
<p>A global set-statement starts with the keyword <em>set</em> and the
item to configure.</p>
</li>
<li><strong><a name="global_include_statement" class="item">Global include-statement</a></strong>
<p>The include statement consists of the keyword <em>include</em> and
a glob string.</p>
</li>
<li><strong><a name="one_or_more_service_entry_statements" class="item">One or more service entry statements.</a></strong>
<p>A service entry starts with the keyword <em>check</em> followed by the
service type.</p>
</li>
</ol>
<p>A Monit control file example:</p>
<pre>
#
# Monit control file
#</pre>
<pre>
set daemon 120 # Poll at 2-minute intervals
set logfile syslog facility log_daemon
set alert foo@bar.baz
set httpd port 2812 and use address localhost
allow localhost # Allow localhost to connect
allow admin:Monit # Allow Basic Auth</pre>
<pre>
check system myhost.mydomain.tld
if loadavg (1min) > 4 then alert
if loadavg (5min) > 2 then alert
if memory usage > 75% then alert
if swap usage > 25% then alert
if cpu usage (user) > 70% then alert
if cpu usage (system) > 30% then alert
if cpu usage (wait) > 20% then alert</pre>
<pre>
check process apache
with pidfile "/usr/local/apache/logs/httpd.pid"
start program = "/etc/init.d/httpd start" with timeout 60 seconds
stop program = "/etc/init.d/httpd stop"
if 2 restarts within 3 cycles then timeout
if totalmem > 100 Mb then alert
if children > 255 for 5 cycles then stop
if cpu usage > 95% for 3 cycles then restart
if failed port 80 protocol http then restart
group server
depends on httpd.conf, httpd.bin</pre>
<pre>
check file httpd.conf
with path /usr/local/apache/conf/httpd.conf
# Reload apache if the httpd.conf file was changed
if changed checksum
then exec "/usr/local/apache/bin/apachectl graceful"</pre>
<pre>
check file httpd.bin
with path /usr/local/apache/bin/httpd
# Run /watch/dog in the case that the binary was changed
if failed checksum then exec "/watch/dog"</pre>
<pre>
include /etc/monit/mysql.monitrc
include /etc/monit/mail/*.monitrc</pre>
<p>The above example illustrates a service entry for monitoring the
apache web server process as well as related files. The meaning
of the various statements will be explained in the following
sections.</p>
<p>
</p>
<hr />
<h1><a name="logging">LOGGING</a></h1>
<p>Monit will log status and error messages to a log file. Use the
<em>set logfile</em> statement in the monitrc control file. To set up
Monit to log to its own logfile, use e.g. <em>set logfile
/var/log/monit.log</em>. If <strong>syslog</strong> is given as a value for the
<em>-l</em> command-line switch (or the statement <em>set logfile syslog</em>
is found in the control file), Monit will use the <strong>syslog</strong> system
daemon to log messages with a priority assigned to each message
based on the context. To turn off logging, simply do not set the
logfile in the control file (and, of course, do not use the -l
switch).</p>
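<p>As a short sketch, the two logging variants described above would
appear in the control file as follows. Only one of them should be
active at a time; the log file path is the example path used above,
not a required location.</p>
<pre>
# Variant 1: log to Monit's own log file
set logfile /var/log/monit.log</pre>
<pre>
# Variant 2: log through the syslog system daemon instead
# set logfile syslog</pre>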
<p>
</p>
<hr />
<h1><a name="daemon_mode">DAEMON MODE</a></h1>
<p>The <em>-d interval</em> command-line switch runs Monit in daemon
mode. You must specify a numeric argument which is a polling
interval in seconds.</p>
<p>In daemon mode, Monit detaches from the console, puts itself in
the background and runs continuously, monitoring each specified
service and then goes to sleep for the given poll interval.</p>
<p>Simply invoking</p>
<pre>
monit -d 300</pre>
<p>will poll all services described in your <em class="file">~/.monitrc</em> file every
5 minutes.</p>
<p>It is strongly recommended to set the poll interval in your
~/.monitrc file instead, by using <em>set daemon <strong>n</strong></em>, where <strong>n</strong>
is an integer number of seconds. If you do this, Monit will
always start in daemon mode (as long as no action arguments are
given). Example (check every 5 minutes):</p>
<pre>
set daemon 300</pre>
<p>If you need Monit to wait some time at startup before it starts
checking services, you can use the delay statement. Example (check
every 5 minutes, wait 1 minute on start before first monitoring
cycle):</p>
<pre>
set daemon 300 with start delay 60</pre>
<p>Monit makes a per-instance lock-file in daemon mode. If you need
more Monit instances, you will need more configuration files,
each pointing to its own lock-file.</p>
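<p>As a sketch, two instances could each be given their own control
file along the following lines. The file names and paths are
hypothetical, and the example assumes that the <em>set pidfile</em>
statement (the control file counterpart of the <em>-p</em> switch) is
available in your version of Monit. A separate state file per instance
is also assumed here so that the instances do not share state.</p>
<pre>
# First control file, e.g. /etc/monit/first.monitrc (hypothetical path)
set daemon 120
set pidfile /var/run/monit-first.pid
set statefile /var/run/monit-first.state</pre>
<pre>
# Second control file, e.g. /etc/monit/second.monitrc (hypothetical path)
set daemon 300
set pidfile /var/run/monit-second.pid
set statefile /var/run/monit-second.state</pre>
<p>Each instance would then be started with its own control file by
using the <em>-c</em> option.</p>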
<p>Calling <em>monit</em> with a Monit daemon running in the background
sends a wake-up signal to the daemon, forcing it to check
services immediately.</p>
<p>The <em>quit</em> argument will kill a running daemon process instead
of waking it up.</p>
<p>
</p>
<hr />
<h1><a name="init_support">INIT SUPPORT</a></h1>
<p>Monit can run and be controlled from <em>init</em>. If Monit should
crash, <em>init</em> will re-spawn a new Monit process. Using init to
start Monit is probably the best way to run Monit if you want to
be certain that you always have a running Monit daemon on your
system. (It is obvious, but nevertheless worth stressing: make
sure that the control file does not have any syntax errors before
you start Monit from init. Also, make sure that if you run Monit
from init, you do not start Monit from a startup script as
well.)</p>
<p>To set up Monit to run from init, you can either use the 'set
init' statement in Monit's control file or use the -I option on
the command line. Here is what you must add to /etc/inittab:</p>
<pre>
# Run Monit in standard run-levels
mo:2345:respawn:/usr/local/bin/monit -Ic /etc/monitrc</pre>
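<p>If you prefer the control file statement over the -I switch, the
relevant control file lines might look like the sketch below. The poll
interval is an arbitrary example value. With this in place, the -I
switch in the inittab line should no longer be needed.</p>
<pre>
# Do not go into the background; intended for running from init
set init
# Example poll interval (assumed value)
set daemon 120</pre>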
<p>After you have modified init's configuration file, you can run
the following command to re-examine /etc/inittab and start monit:</p>
<pre>
telinit q</pre>
<p>For systems without telinit:</p>
<pre>
kill -1 1</pre>
<p>If Monit is used to monitor services that are also started at
boot time (e.g. services started via SYSV init rc scripts or via
inittab) then, in some cases, a race condition could occur. That
is, if a service is slow to start, Monit can assume that the
service is not running and possibly try to start it and raise an
alert while, in fact, the service is already about to start or
already in its startup sequence. Please see the FAQ for solutions
to this problem.</p>
<p>
</p>
<hr />
<h1><a name="include_files">INCLUDE FILES</a></h1>
<p>The Monit control file, <em>monitrc</em>, can include additional
configuration files. This feature helps to maintain a certain
structure or to place repeating settings into one file. Include
statements can be placed at virtually any spot. The syntax is the
following:</p>
<pre>
INCLUDE globstring</pre>
<p>The globstring is any kind of string as defined in <code>glob(7)</code>.
Thus, you can refer to a single file or you can load several
files at once. If you want to use whitespace in your string,
the globstring needs to be enclosed in single quotes (') or double
quotes ("). For example,</p>
<pre>
INCLUDE "/etc/monit/Monit configuration files/printer.*.monitrc"</pre>
<p>loads any file matching the single globstring. If the globstring
matches a directory instead of a file, it is silently ignored.</p>
<p><em>INCLUDE</em> statements in included files are parsed as in the main
control file.</p>
<p>If the globstring matches several results, the files are included
in no particular order. If you need to rely on a certain order,
use a separate <em>include</em> statement for each file.</p>
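<p>For instance, to make sure that one file is read before the others,
each file can be named explicitly, as in this sketch. The file
<em>base.monitrc</em> is a hypothetical name used for illustration; the
other paths are taken from the control file example above.</p>
<pre>
# Files are read in the order in which the statements appear
include /etc/monit/base.monitrc
include /etc/monit/mysql.monitrc
# The order among files matched by this glob is not defined
include /etc/monit/mail/*.monitrc</pre>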
<p>
</p>
<hr />
<h1><a name="group_support">GROUP SUPPORT</a></h1>
<p>Service entries in the control file, <em>monitrc</em>, can be grouped
together by the <em>group</em> statement. The syntax is simply (keyword
in capitals):</p>
<pre>
GROUP groupname</pre>
<p>With this statement it is possible to group similar service
entries together and manage them as a whole. Monit provides
functions to start, stop, restart, monitor and unmonitor a
group of services, like so:</p>
<p>To start a group of services from the console:</p>
<pre>
monit -g &lt;groupname&gt; start</pre>
<p>To stop a group of services:</p>
<pre>
monit -g &lt;groupname&gt; stop</pre>
<p>To restart a group of services:</p>
<pre>
monit -g &lt;groupname&gt; restart</pre>
<p>Note:
the <em>status</em> and <em>summary</em> commands don't support the -g
option and will print the state of all services.</p>
<p>A service can be added to multiple groups by repeating the group
statement:</p>
<pre>
group www
group filesystem</pre>
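<p>Put together, a service entry that belongs to two groups might look
like the following sketch. The pidfile and program paths are borrowed
from the apache example earlier in this document and are assumptions
about the local installation; the group names are arbitrary.</p>
<pre>
check process apache
  with pidfile "/usr/local/apache/logs/httpd.pid"
  start program = "/etc/init.d/httpd start"
  stop program = "/etc/init.d/httpd stop"
  # The service is a member of both groups
  group www
  group server</pre>
<p>With such an entry, a group action on either <em>www</em> or
<em>server</em> would include the apache service.</p>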
<p>
</p>
<hr />
<h1><a name="monitoring_mode">MONITORING MODE</a></h1>
<p>Monit supports three monitoring modes per service: <em>active</em>,
<em>passive</em> and <em>manual</em>. See also the example section below for
usage of the mode statement.</p>
<p>In <em>active</em> mode, Monit will monitor a service and in case of
problems Monit will act and raise alerts, start, stop or restart
the service. Active mode is the default mode.</p>
<p>In <em>passive</em> mode, Monit will passively monitor a service and
specifically <strong>not</strong> try to fix a problem, but it will still raise
alerts in case of a problem.</p>
<p>For use in clustered environments there is also a <em>manual</em>
mode. In this mode, Monit will enter <em>active</em> mode <strong>only</strong> if a
service was brought under Monit's control, for example by
executing the following command in the console:</p>
<pre>
monit start sybase
(Monit will call sybase's start method and enable monitoring)</pre>
<p>If a service was not started by Monit, or was stopped or disabled,
for example by:</p>
<pre>
monit stop sybase
(Monit will call sybase's stop method and disable monitoring)</pre>
<p>Monit will then not monitor the service. This allows you to have
services configured in monitrc and to start them with Monit only if they
should run. This feature can be used to build a simple failsafe
cluster.</p>
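<p>A sketch of a service entry that uses the mode statement is shown
below. The pidfile path and the init script are hypothetical; only the
<em>mode</em> line is the point of the example.</p>
<pre>
# Hypothetical sybase entry; paths are assumed for illustration
check process sybase with pidfile /var/run/sybase.pid
  start program = "/etc/init.d/sybase start"
  stop program = "/etc/init.d/sybase stop"
  # Monitor only after the service was started via Monit
  mode manual</pre>
<p>Replacing <em>manual</em> with <em>passive</em> would instead make
Monit raise alerts for this service without trying to fix problems.</p>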
<p>A service's monitoring state is persistent across Monit restarts.
This means that you will probably want to make certain that
services in manual mode are stopped or in unmonitored mode at
server shutdown. For instance, do the following in a server
shutdown script:</p>
<pre>
monit stop sybase</pre>
<p>or</p>
<pre>
monit unmonitor sybase</pre>
<p>If you use Monit in an HA cluster, you should place the state file
in a temporary filesystem so that if the machine crashes and the
stand-by machine takes over services, any manual monitoring mode
services that were started on the crashed machine won't be
started on reboot. Use for example:</p>
<pre>
set statefile /tmp/monit.state</pre>
<p>
</p>
<hr />
<h1><a name="alert_messages">ALERT MESSAGES</a></h1>
<p>Monit will raise an email alert in the following situations:</p>
<pre>
o A service timed out
o A service does not exist
o A service related data access problem
o A service related program execution problem
o A service is of invalid object type
o An ICMP problem
o A port connection problem
o A resource statement match
o A file checksum problem
o A file size problem
o A file/directory timestamp problem
o A file/directory/filesystem permission problem
o A file/directory/filesystem uid problem
o A file/directory/filesystem gid problem
o An action is done per administrator's request</pre>
<p>Monit will also send an alert each time a monitored object changes.
This includes:</p>
<pre>
o Monit started, stopped or reloaded
o A file checksum changed
o A file size changed
o A file content match
o A file/directory timestamp changed
o Filesystem mount flags changed
o A process PID changed
o A process PPID changed</pre>
<p>You use the alert statement to notify Monit that you want alert
messages sent to an email address. If you do not specify an alert
statement, Monit will not send alert messages.</p>
<p>There are two forms of alert statement:</p>
<pre>
o Global - common for all services
o Local - per service</pre>
<p>In both cases you can use more than one alert statement. In other
words, you can send many different emails to many different
addresses.</p>
<p>Recipients in the global and in the local lists are alerted when
a service fails, recovers or changes. If the same email address
is in the global and in the local list, Monit will only send one
alert. Local (per service) defined alert email addresses override
global addresses in case of a conflict. Finally, you may choose
to only use a global alert list (recommended), a local per
service list or both.</p>
<p>It is also possible to disable the global alerts locally for
particular service(s) and recipients.</p>
<p>
</p>
<h2><a name="setting_a_global_alert_statement">Setting a global alert statement</a></h2>
<p>If a change occurs on a monitored service, Monit will send an
alert to all recipients in the global list who have registered
interest in the event type. Here is the syntax for the global
alert statement:</p>
<dl>
<dt><strong><a name="set_alert_mail_address_not_events_mail_format_mail_format_reminder_number" class="item">SET ALERT mail-address [ [NOT] {events}] [MAIL-FORMAT
{mail-format}] [REMINDER number]</a></strong></dt>
</dl>
<p>Simply using the following in the global section of monitrc:</p>
<pre>
set alert foo@bar</pre>
<p>will send a default email to the address foo@bar whenever an
event occurs on any service. Such an event may be that a
service timed out, a service doesn't exist and so on. If you want
to send alert messages to more email addresses, add a <em>set alert
'email'</em> statement for each address.</p>
<p>For explanations of the <em>events, MAIL-FORMAT and REMINDER</em>
keywords above, please see below.</p>
<p>You can also use the NOT option ahead of the events list, which
reverses the meaning of the list. That is, alerts are sent only
for events <em>not</em> in the list. This can shorten your
configuration if you are interested in most events except a
few.</p>
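<p>As an illustrative sketch, the global section below registers two
recipients. The address <em>ops@bar</em> is hypothetical, and
<em>on</em> is one of the optional noise keywords.</p>
<pre>
# Receives alerts for every event type
set alert foo@bar
# Receives alerts for every event type except Monit instance events
set alert ops@bar not on { instance }</pre>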
<p>
</p>
<h2><a name="setting_a_local_alert_statement">Setting a local alert statement</a></h2>
<p>Each service can also have its own recipient list.</p>
<dl>
<dt><strong><a name="alert_mail_address_not_events_mail_format_mail_format_reminder_number" class="item">ALERT mail-address [ [NOT] {events}] [MAIL-FORMAT
{mail-format}] [REMINDER number]</a></strong></dt>
</dl>
<p>or</p>
<dl>
<dt><strong><a name="noalert_mail_address" class="item">NOALERT mail-address</a></strong></dt>
</dl>
<p>If you only want an alert message sent for certain events and for
certain service(s), for example only for timeout events or only
if a service died, then append a filter block to the alert
statement:</p>
<pre>
check process myproc with pidfile /var/run/my.pid
alert foo@bar only on { timeout, nonexist }
...</pre>
<p>(<em>only</em> and <em>on</em> are noise keywords, ignored by Monit. As a
side note, noise keywords are used in the control file grammar to
make an entry resemble English and thus make it easier to read
(or so goes the philosophy). The full set of available noise
keywords is listed below in the Control File section.)</p>
<p>You can also send alerts for all events except some by
putting the word <em>not</em> ahead of the list. For example, if you
want to receive alerts for all events except Monit instance
events, you can write (note that the noise words 'but' and 'on'
are optional):</p>
<pre>
check system myserver
alert foo@bar but not on { instance }
...</pre>
<p>instead of:</p>
<pre>
alert foo@bar on { action
checksum
connection
content
data
exec
gid
icmp
invalid
fsflags
nonexist
permission
pid
ppid
resource
size
timeout
timestamp
uid }</pre>
<p>This will send alerts for all events to foo@bar, except Monit
instance events. An instance event, by the way, is an event fired
whenever the Monit program starts or stops.</p>
<p>Event filtering can be used to send an email to different email
addresses depending on the events that occurred. For instance:</p>
<pre>
alert foo@bar { nonexist, timeout, resource, icmp, connection }
alert security@bar on { checksum, permission, uid, gid }
alert manager@bar</pre>
<p>This will send an alert message to foo@bar whenever a nonexist,
timeout, resource, icmp or connection problem occurs and a message to
security@bar if a checksum, permission, uid or gid problem
occurs. Finally, a message is sent to manager@bar whenever any error
event occurs.</p>
<p>Here is the list of events you can use in a mail-filter: <em>uid,
gid, size, nonexist, data, icmp, instance, invalid, exec,
content, timeout, resource, checksum, fsflags, timestamp,
connection, permission, pid, ppid, action</em></p>
<p>You can also disable alerts locally using the NOALERT
statement. This is useful if you monitor many services
and use the global alert statement, but don't want to
receive alerts for some minor subset of services:</p>
<pre>
noalert appadmin@bar</pre>
<p>For example, if you stick the noalert statement in a 'check
system' entry, you won't receive system related alerts (such as
Monit instance started/stopped/reloaded alert, system overloaded
alert, etc.) but will receive alerts for all other monitored
services.</p>
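<p>A minimal sketch of this setup, reusing the hypothetical names from
the examples above (the recipient and service names are
illustrative only):</p>
<pre>
set alert appadmin@bar
check system myserver
  noalert appadmin@bar
check process myproc with pidfile /var/run/my.pid
  ...</pre>
<p>Here appadmin@bar still receives alerts for myproc through the
global alert statement, but none for the system service myserver.</p>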
<p>The following example will alert foo@bar on all events on all
services by default, except the service mybar, which will send an
alert only on timeout. The trick relies on the fact that a local
definition of the same recipient overrides the global setting
(including registered events and mail format):</p>
<pre>
set alert foo@bar
check process myfoo with pidfile /var/run/myfoo.pid
...
check process mybar with pidfile /var/run/mybar.pid
alert foo@bar only on { timeout }</pre>
<p>
</p>
<h2><a name="alert_message_layout">Alert message layout</a></h2>
<p>Monit provides a default mail message layout that is short and to
the point. Here's an example of a standard alert mail sent by
monit:</p>
<pre>
From: monit@tildeslash.com
Subject: Monit alert -- Does not exist apache
To: hauk@tildeslash.com
Date: Thu, 04 Sep 2003 02:33:03 +0200</pre>
<pre>
Does not exist Service apache</pre>
<pre>
Date: Thu, 04 Sep 2003 02:33:03 +0200
Action: restart
Host: www.tildeslash.com</pre>
<pre>
Your faithful employee,
monit</pre>
<p>If you want to, you can change the format of this message with
the optional <em>mail-format</em> statement. The syntax for this
statement is as follows:</p>
<pre>
mail-format {
from: monit@localhost
reply-to: support@domain.com
subject: $SERVICE $EVENT at $DATE
message: Monit $ACTION $SERVICE at $DATE on $HOST: $DESCRIPTION.
Yours sincerely,
monit
}</pre>
<p>The keyword <em>from:</em> is the email address Monit should
pretend it is sending from. It does not have to be a real mail
address, but it must be a properly formatted mail address of the
form name@domain. The <em>reply-to:</em> keyword can be used to set
the reply-to mail header. The keyword <em>subject:</em> is for the
email subject line. The subject must be on only <em>one</em> line. The
<em>message:</em> keyword denotes the mail body. If used, this keyword
should always be the last in a mail-format statement. The mail
body can be as long as you want, but must <strong>not</strong> contain the '}'
character.</p>
<p>All of these format keywords are optional, but if you use a
mail-format statement, you must provide at least one. Thus, if you
only want to change the from address Monit is using, you can do:</p>
<pre>
set alert foo@bar with mail-format { from: bofh@bar.baz }</pre>
<p>From the previous example you will notice that some special $XXX
variables were used. If used, they will be substituted and
expanded into the text with these values:</p>
<ul>
<li><strong><a name="_event" class="item"><em>$EVENT</em></a></strong>
<pre>
A string describing the event that occurred. The values are
fixed and are:</pre>
<pre>
Event:     | Failure state:            | Success state:
----------------------------------------------------------------------
ACTION     | "Action done"             | "Action done"
CHECKSUM   | "Checksum failed"         | "Checksum succeeded"
CONNECTION | "Connection failed"       | "Connection succeeded"
CONTENT    | "Content failed"          | "Content succeeded"
DATA       | "Data access error"       | "Data access succeeded"
EXEC       | "Execution failed"        | "Execution succeeded"
FSFLAG     | "Filesystem flags failed" | "Filesystem flags succeeded"
GID        | "GID failed"              | "GID succeeded"
ICMP       | "ICMP failed"             | "ICMP succeeded"
INSTANCE   | "Monit instance changed"  | "Monit instance changed not"
INVALID    | "Invalid type"            | "Type succeeded"
NONEXIST   | "Does not exist"          | "Exists"
PERMISSION | "Permission failed"       | "Permission succeeded"
PID        | "PID failed"              | "PID succeeded"
PPID       | "PPID failed"             | "PPID succeeded"
RESOURCE   | "Resource limit matched"  | "Resource limit succeeded"
SIZE       | "Size failed"             | "Size succeeded"
TIMEOUT    | "Timeout"                 | "Timeout recovery"
TIMESTAMP  | "Timestamp failed"        | "Timestamp succeeded"
UID        | "UID failed"              | "UID succeeded"</pre>
</li>
<li><strong><a name="_service" class="item"><em>$SERVICE</em></a></strong>
<pre>
The service entry name in monitrc</pre>
</li>
<li><strong><a name="_date" class="item"><em>$DATE</em></a></strong>
<pre>
The current time and date (RFC 822 date style).</pre>
</li>
<li><strong><a name="_host" class="item"><em>$HOST</em></a></strong>
<pre>
The name of the host Monit is running on</pre>
</li>
<li><strong><a name="_action" class="item"><em>$ACTION</em></a></strong>
<pre>
The name of the action which was done. Action names are fixed
and are:</pre>
<pre>
Action:   | Name:
-----------------------
ALERT     | "alert"
EXEC      | "exec"
MONITOR   | "monitor"
RESTART   | "restart"
START     | "start"
STOP      | "stop"
UNMONITOR | "unmonitor"</pre>
</li>
<li><strong><a name="_description" class="item"><em>$DESCRIPTION</em></a></strong>
<pre>
The description of the error condition</pre>
</li>
</ul>
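<p>As an illustration, assume the event behind the standard alert mail
shown earlier: service apache, event "Does not exist", action restart,
host www.tildeslash.com. A subject and a message line using these
variables would then expand roughly as follows (the message line is a
made-up example, and the exact date string depends on the system):</p>
<pre>
subject: $SERVICE $EVENT at $DATE
  -&gt; apache Does not exist at Thu, 04 Sep 2003 02:33:03 +0200

message: Monit $ACTION $SERVICE on $HOST
  -&gt; Monit restart apache on www.tildeslash.com</pre>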
<p>
</p>
<h2><a name="setting_a_global_mail_format">Setting a global mail format</a></h2>
<p>It is possible to set a standard mail format with the following
global set-statement (keywords are in capitals):</p>
<dl>
<dt><strong><a name="set_mail_format_mail_format" class="item">SET MAIL-FORMAT {mail-format}</a></strong></dt>
</dl>
<p>Format set with this statement will apply to every alert
statement that does <em>not</em> have its own specified mail-format.
This statement is most useful for setting a default from address
for messages sent by monit, like so:</p>
<pre>
set mail-format { from: monit@foo.bar.no }</pre>
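<p>As a sketch of how the global and local formats interact, assuming
the hypothetical addresses below, the global alert to foo@bar uses the
global from address, while the local alert to security@bar uses its
own mail-format:</p>
<pre>
set mail-format { from: monit@foo.bar.no }
set alert foo@bar
check process myproc with pidfile /var/run/my.pid
  alert security@bar with mail-format { from: bofh@bar.baz }</pre>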
<p>
</p>
<h2><a name="setting_an_error_reminder">Setting an error reminder</a></h2>
<p>By default, Monit sends just one error notification when a service
fails and another when it recovers. If you want to be notified
more than once while a service remains in a failed state, you can
use the reminder option to the alert statement (keywords are in
capitals):</p>
<dl>
<dt><strong><a name="alert_with_reminder_on_number_cycles" class="item">ALERT ... [WITH] REMINDER [ON] number [CYCLES]</a></strong></dt>
</dl>
<p>For example, if you want to be notified every tenth cycle while a
service remains in a failed state, you can use:</p>
<pre>
alert foo@bar with reminder on 10 cycles</pre>
<p>Likewise, if you want to be notified on each failed cycle, you can
use:</p>
<pre>
alert foo@bar with reminder on 1 cycle</pre>
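<p>The reminder interval in wall-clock time depends on the length of a
poll cycle. As a rough illustration, assuming Monit runs as a daemon
with a 30 second poll cycle (a value chosen only for this example), a
reminder on 10 cycles is sent about every 10 x 30 = 300 seconds while
the service stays in a failed state:</p>
<pre>
set daemon 30
check process myproc with pidfile /var/run/my.pid
  alert foo@bar with reminder on 10 cycles</pre>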
<p>
</p>
<h2><a name="setting_a_mail_server_for_alert_messages">Setting a mail server for alert messages</a></h2>
<p>The mail server Monit should use to send alert messages is
defined with a global set statement (keywords are in capitals and
optional statements in [brackets]):</p>
<pre>
SET MAILSERVER {hostname|ip-address [PORT port]
[USERNAME username] [PASSWORD password]
[using SSLV2|SSLV3|TLSV1] [CERTMD5 checksum]}+
[with TIMEOUT X SECONDS]
[using HOSTNAME hostname]</pre>
<p>The port statement allows you to use SMTP servers other than those
listening on port 25. If omitted, port 25 is used unless ssl or
tls is used, in which case port 465 is used by default.</p>
<p>Monit supports plain SMTP authentication: you can set a username
and a password using the USERNAME and PASSWORD options.</p>
<p>To use secure communication, use the SSLV2, SSLV3 or TLSV1
options. You can also specify the server certificate checksum
using the CERTMD5 option.</p>
<p>As you can see, it is possible to set several SMTP servers. If
Monit cannot connect to the first server in the list, it will try
the second server and so on. Monit has a default connection
timeout of 5 seconds, and if the SMTP server is slow, Monit could
time out when connecting to or reading from the server. If this is
the case, you can use the optional timeout statement to explicitly
set the timeout to a higher value. Here is an example
for setting several mail servers:</p>
<pre>
set mailserver mail.tildeslash.com, mail.foo.bar port 10025
username "Rabbi" password "Loewe" using tlsv1, localhost
with timeout 15 seconds</pre>
<p>Here Monit will first try to connect to the server
"mail.tildeslash.com". If this server is down, Monit will try
"mail.foo.bar" on port 10025 using the given credentials via tls
and finally "localhost". We also set an explicit connect and read
timeout: if Monit cannot connect to the first SMTP server in the
list within 15 seconds, it will try the next server and so on. The
<em>set mailserver ..</em> statement is optional and if not defined
Monit will not send email alerts. Not setting a mail server is
recommended only if alert notification is delegated to M/Monit.</p>
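<p>For comparison, here is a single-server sketch using authentication
and TLS. The host name and credentials are placeholders. Because no
port is given and tls is used, port 465 applies by default:</p>
<pre>
set mailserver smtp.example.org
  username "monit" password "secret" using tlsv1
  with timeout 30 seconds</pre>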
<p>Monit, by default, uses the local host name in SMTP HELO/EHLO and
in the Message-ID header. Some mail servers check this
information against DNS for spam protection and can reject the
email if the DNS and the hostname used in the transaction do
not match. If this is the case, you can override the default
local host name by using the HOSTNAME option:</p>
<pre>
set mailserver mail.tildeslash.com using hostname
"myhost.example.org"</pre>
<p>
</p>
<h2><a name="event_queue">Event queue</a></h2>
<p>If the MTA (mail server) for sending alerts is not available,
Monit <em>can</em> queue events on the local file-system until the MTA
recovers. Monit will then post queued events in order with their
original timestamp so the events are not lost. This feature is
most useful if Monit is used together with M/Monit and when event
history is important.</p>
<p>The event queue is persistent across Monit restarts and, provided
that the back-end filesystem is persistent too, across system
restarts as well.</p>
<p>By default, the queue is disabled and if the alert handler fails,
Monit will simply drop the alert message. To enable the event
queue, add the following statement to the Monit control file:</p>
<pre>
SET EVENTQUEUE BASEDIR &lt;path&gt; [SLOTS &lt;number&gt;]</pre>
<p>The &lt;path&gt; is the path to the directory where events will be
stored. Optionally, if you want to limit the queue size, use the
slots option to store at most <em>number</em> event messages. If the
slots option is not used, Monit will store as many events as the
backend filesystem allows.</p>
<p>Example:</p>
<pre>
set eventqueue
basedir /var/monit
slots 5000</pre>
<p>Events are stored in a binary format, with one file per event.
The file size is approximately 130 bytes or a bit more (depending on
the message length). The file name is composed of the Unix timestamp,
an underscore and the service name, for example:</p>
<pre>
/var/monit/1131269471_apache</pre>
<p>If you are running more than one Monit instance on the same
machine, you <strong>must</strong> use separate event queue directories to
avoid sending wrong alerts to the wrong addresses.</p>
<p>If you want to purge the queue by hand, that is, remove queued
event-files, Monit should be stopped before the removal.</p>
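<p>For instance, two Monit instances on one machine could each get
their own queue directory. The directory names below are
hypothetical, and each statement belongs in the control file of the
respective instance:</p>
<pre>
# control file of the first instance
set eventqueue
  basedir /var/monit/instance1
  slots 5000

# control file of the second instance
set eventqueue
  basedir /var/monit/instance2
  slots 5000</pre>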
<p>
</p>
<hr />
<h1><a name="service_timeout">SERVICE TIMEOUT</a></h1>
<p><strong>monit</strong> provides a service timeout mechanism for situations
where a service simply refuses to start or respond over a longer
period.</p>
<p>The timeout mechanism is based on the number of service restarts and
the number of poll-cycles. For example, if a service had <em>x</em>
restarts within <em>y</em> poll-cycles (where <em>x</em> &lt;= <em>y</em>), then Monit
will perform an action (for example, unmonitor the service). If a
timeout occurs, Monit will send an alert message if you have
registered interest in this event.</p>
<p>The syntax for the timeout statement is as follows (keywords are
in capitals):</p>
<dl>
<dt><strong><a name="cycle" class="item">IF &lt;number&gt; RESTART &lt;number&gt; CYCLE(S) THEN &lt;action&gt;</a></strong></dt>
</dl>
<p>Here is an example where Monit will unmonitor the service if it
was restarted 2 times within 3 cycles:</p>
<pre>
if 2 restarts within 3 cycles then unmonitor</pre>
<p>To have Monit check the service again after monitoring was
disabled, run 'monit monitor &lt;servicename&gt;' from the command
line.</p>
<p>Example for setting custom exec on timeout:</p>
<pre>
if 5 restarts within 5 cycles then exec "/foo/bar"</pre>
<p>Example for stopping the service:</p>
<pre>
if 7 restarts within 10 cycles then stop</pre>
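<p>Putting this together with alert filtering, the following sketch
(service name, pidfile and recipient are illustrative) unmonitors a
service after repeated restarts and mails foo@bar only when that
timeout event fires:</p>
<pre>
check process mybar with pidfile /var/run/mybar.pid
  alert foo@bar only on { timeout }
  if 2 restarts within 3 cycles then unmonitor</pre>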
<p>
</p>
<hr />
<h1><a name="service_tests">SERVICE TESTS</a></h1>
<p>Monit provides several tests you may utilize in a service entry
to test a service. There are two classes of tests: variable and
constant tests. That is, the condition we test is either constant