.TH ATOP 1 "July 2024" "Linux"
.SH NAME
.B atop
- Advanced System & Process Monitor
.SH SYNOPSIS
Live measurement in bar graph mode:
.PP
.TP 5
.B \ atop \-B[H] [-t [absdir]] [interval [samples]]
.PP
Live measurement cgroups in text mode:
.PP
.TP 5
.B \ atop \-G [-t [absdir]] [-2|-3|-4|-5|-6|-7|-8|-9] [\-a] [\-C|\-M|\-D|\-A] [interval [samples]]
.PP
Live measurement processes in text mode:
.PP
.TP 5
.B \ atop [-t [absdir]] [\-g|\-m|\-d|\-n|\-u|\-p|\-s|\-c|\-v|\-o|\-y|\-Y] [\-C|\-M|\-D|\-N|\-A] [\-fFX1xR] [interval [samples]]
.PP
Live generation of parsable output (white-space separated or JSON):
.PP
.TP 5
.B \ atop [\-Plabel[,label]... [-Z]] [\-Jlabel[,label]...] [interval [samples]]
.PP
Write raw log files:
.PP
.TP 5
.B \ atop \-w rawfile [\-a] [\-S] [interval [samples]]
.PP
Analyze raw log files in bar graph mode:
.PP
.TP 5
.B \ atop \-B[H] -r [rawfile|yyy...] [\-b [YYYYMMDD]hhmm[ss]] [\-e [YYYYMMDD]hhmm[ss]]
.PP
Analyze cgroups from raw log files in text mode:
.PP
.TP 5
.B \ atop \-G [-2|-3|-4|-5|-6|-7|-8|-9] [\-a] \-r [rawfile|yyy...] [\-b [YYYYMMDD]hhmm[ss]] [\-e [YYYYMMDD]hhmm[ss]]
.PP
Analyze processes from raw log files in text mode:
.PP
.TP 5
.B \ atop \-r [rawfile|yyy...] [\-b [YYYYMMDD]hhmm[ss]] [\-e [YYYYMMDD]hhmm[ss]] [\-g|\-m|\-d|\-n|\-u|\-p|\-s|\-c|\-v|\-o|\-y|\-Y] [\-C|\-M|\-D|\-N|\-A] [\-fFX1xR]
.PP
Generate parsable output from raw log files (white-space separated or JSON):
.PP
.TP 5
.B \ atop \-r [rawfile|yyy...] [\-b [YYYYMMDD]hhmm[ss]] [\-e [YYYYMMDD]hhmm[ss]] [\-Plabel[,label]... [-Z]] [\-Jlabel[,label]...]
.SH DESCRIPTION
The program
.I atop
is an interactive monitor to view the load on a Linux system.
Every
.I interval
seconds (default: 10 seconds) information is gathered about the
resource occupation on system level (CPUs, memory, disks and network
interfaces) and on cgroup level (version 2). Besides,
information is gathered about the processes and threads that
are responsible for the utilization of the CPUs, memory and disks.
Network load per process is shown only when the
.I netatop
kernel module or the
.I netatop-bpf
BPF module has been installed.
.SH TWIN MODE
With the
.I -t
flag you can run
.I atop
interactively in 'twin mode'.
This mode allows you to run a live measurement with the possibility to
review and analyze an earlier sample. Meanwhile, the live measurement
continues.
When started in twin mode,
.I atop
spawns a child process that gathers the counters and writes
them to a temporary raw file. The parent process reads the counters
from the temporary raw file and presents them to the user.
During live measurement, the parent process keeps pace with the
samples written by the child process.
While the child process continues gathering, key 'r',
key 'b' or key 'T' can be pressed to review earlier samples.
The parent process pauses the live measurement implicitly when
one of these keys is pressed, or explicitly when key 'z' (pause) is pressed.
After browsing through the earlier samples with the
keys 't' (next sample), 'T' (previous sample), 'r' (reset to
beginning of measurement), 'Z' (fast-forward to end of measurement)
and 'b' (branch to timestamp), the live measurement can be continued
by pressing key 'z' (resume after pause).
The temporary raw file will be written in the
.B /tmp
directory by default. Optionally, the absolute path name of an alternative
directory can be specified after the
.I -t
flag (e.g. when there is not enough space in the
.B /tmp
directory). In any case, the parent process will terminate the child process
when the measurement is finished and the temporary raw file will be
removed.
.SH BAR GRAPH MODE
When running
.I atop
you can choose to view the system load in bar graph mode or in text mode.
In bar graph mode the resource utilization of CPUs, memory, disks and network
interfaces is shown via (character-based) bar graphs, but only on system level.
When you want to view more detailed information on system level or when you
want to view the resource consumption on process or thread level, you can switch
to text mode by pressing the 'B' key. Conversely, you can press the 'B' key
(again) to switch from text mode back to bar graph mode.
.br
By default,
.I atop
starts in text mode unless the
.I -B
flag is used or unless 'B' has been configured as a default flag in the
.I .atoprc
file (for further information about default flags, refer to the
.B atoprc
man page).
.PP
In bar graph mode the terminal will be subdivided into four character-based
windows, i.e. one window for each hardware resource:
.PP
.TP 5
.B Processors
The first bar shows the average busy percentage of all CPUs with
the bar label 'Avg' (might be abbreviated to 'Av' or even just 'A').
The subsequent bars show the busy percentages of single CPUs.
.br
When there is not enough horizontal space to show all CPUs, only the
most busy CPUs per sample will be shown after the width of each bar
has been reduced to a minimum.
By default, the categories of CPU consumption are shown by different
colors in the bars, marked with a character 'S' (system mode), 'U'
(user mode), 'I' (interrupt handling), 's' (steal) and 'G'
(guest, i.e. consumed by virtual machines).
.br
The top of the bar might consist of an unmarked color representing
a 'neutral' category. Suppose that the scale unit is 5% per line
and the total busy percentage is 54% consisting of two categories of 27%.
The two categories will be rounded to 25% (5 lines of 5% each) but the
total busy percentage will be rounded to 55% (11 lines of 5%).
Then the top line will represent a 'neutral' category.
.br
By pressing the 'H' key or by starting
.I atop
with the '-H' flag, no categories are shown.
A red line is drawn in the bar graph as critical threshold.
By default this value is 90% and can be modified by the 'cpucritperc'
option in the configuration file (see separate
.B atoprc
man page). When this value is set to zero, no threshold line will be drawn.
.TP 5
.B Memory and swap space
Memory is presented as a column in which the
specific categories of memory consumption are shown. These categories
are (code, data and stack of) processes/kernel, slab caches
(i.e. dynamically allocated kernel memory), shared memory, tmpfs,
static huge pages, page cache and free memory.
.br
Swap space (if present) is also presented as a column in which the
categories processes/tmpfs, shared memory and free space are shown.
At the right side memory-related event counters are shown.
.br
The bottom three counters are colored green when there is no memory pressure.
When considerable activity is noticed, such a counter might be colored orange,
and with high activity red.
.br
When memory pressure starts, usually memory page scanning will be activated
first. When pressure increases, memory pages of processes might be swapped
out to swap space (if present).
.br
The 'oomkills' counter (Out Of Memory killing) is most serious:
it reflects the number of processes that are killed due to lack of memory
(and swap). Therefore this counter shows the absolute number (not per second)
of processes being killed during the last interval and will immediately
be colored red when it is 1 or more. Besides, after
.I atop
has noticed OOM killing the 'oomkills' counter remains orange for the next
15 minutes, in case you have missed the OOM killing event itself.
.br
When there is enough vertical space in the memory window, event counters
are shown about the number of memory pages being swapped in,
the number of memory pages paged out to block devices and
the number of memory pages paged in from block devices.
Memory and swap space consumption will preferably be shown in a
character-based window that vertically uses the entire screen for
optimal granularity. However, when there are a lot of disks and/or
network interfaces the memory and swap space consumption will be shown
in a character-based window that only uses the upper half of the screen.
.TP 5
.B Disks
For each disk the busy percentage is shown as a bar.
.br
When there is not enough horizontal space to show all disks, only the
most busy disks per sample will be shown.
By default, categories of disk consumption are shown by different colors
in the bars, marked with a character 'R' (read) and 'W'
(write).
.br
The top of the bar might consist of an unmarked color representing
a 'neutral' category. Suppose that the scale unit is 5% per line
and the total busy percentage is 54% consisting of two categories of 27%.
The two categories will be rounded to 25% (5 lines of 5% each) but the
total busy percentage will be rounded to 55% (11 lines of 5%).
Then the top line will represent a 'neutral' category.
.br
By pressing the 'H' key or by starting
.I atop
with the '-H' flag, no categories are shown.
A red line is drawn in the bar graph as critical threshold.
By default this value is 90% and can be modified by the 'dskcritperc'
option in the configuration file (see separate
.B atoprc
man page). When this value is set to zero, no threshold line will be drawn.
.TP 5
.B Interfaces
For each non-virtual network interface a double bar graph is shown with
a dedicated scale that reflects the traffic rate. One of the bars shows
the transmit rate ('TX') and the other bar the receive rate ('RX').
The traffic scale of each network interface remains at its highest level.
All interface scales can be reset during the measurement by pressing
the 'L' key.
Most often the real speed (maximum bandwidth) of network interfaces is
not known, e.g. in case of the network interfaces of virtual machines.
Therefore it is not possible to show the interface utilization as a
percentage. However, when the real speed of an interface is known it will
be shown underneath the corresponding bar graph.
When there is not enough horizontal space to show all network interfaces,
only the most busy interfaces per sample will be shown.
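.PP
The rounding example given above for the CPU and disk bars can be worked
through with a few lines of Python. This is only an illustrative sketch: the
function name is hypothetical, and rounding to the nearest whole line is an
assumption that merely reproduces the numbers of the example, not a
description of the actual drawing code.
.PP
.nf
```python
def bar_lines(categories, unit=5):
    """Round busy percentages to whole lines of `unit` percent each."""
    def to_lines(pct):
        # Round half up to the nearest whole number of lines (assumption).
        return int((pct + unit / 2) // unit)

    per_category = [to_lines(p) for p in categories]
    total = to_lines(sum(categories))
    # Lines of the total that no single category accounts for.
    neutral = max(total - sum(per_category), 0)
    return per_category, total, neutral


# Two categories of 27% each with a scale unit of 5% per line.
print(bar_lines([27, 27]))  # prints ([5, 5], 11, 1)
```
.fi
.PP
Each category occupies 5 lines, the bar as a whole occupies 11 lines, and the
one remaining line at the top is the 'neutral' line.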
.PP
Usually the bar graphs will not be sorted on busy percentage when there
is enough horizontal space. However, after switching from text mode to
bar graph mode the bar graphs might have been sorted because this was
needed for the presentation in text mode. The next interval in bar graph
mode shows the bars unsorted again unless the window width is insufficient
for all bars.
.PP
For the CPUs, disks and memory resources also a bar graph with two bars
is shown reflecting the PSI values 'some' (S) and 'full' (F).
The 'some' (S) percentage indicates the time in which at least some tasks
were stalled on a given resource.
The 'full' (F) percentage indicates the time in which all non-idle tasks were stalled
on this resource simultaneously, which has a severe impact on performance.
.br
For some Linux distributions PSI support might be disabled by default.
In that case you need to pass
.B psi=1
on the kernel command line during boot.
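.PP
On Linux the PSI values are exposed by the kernel in the files
/proc/pressure/cpu, /proc/pressure/memory and /proc/pressure/io, each
containing a 'some' line and usually a 'full' line.
The following Python sketch parses text in that format. The function name and
the sample numbers are made up for illustration, and the sketch makes no claim
about how
.I atop
itself derives the percentages that it shows.
.PP
.nf
```python
def parse_pressure(text):
    """Return {'some': {...}, 'full': {...}} from pressure-file text."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        # Every remaining field has the form key=value.
        values = dict(pair.split("=", 1) for pair in pairs)
        result[kind] = {key: float(val) for key, val in values.items()}
    return result


# Sample text in the kernel's format; the numbers are invented.
SAMPLE = """some avg10=1.50 avg60=0.75 avg300=0.20 total=123456
full avg10=0.50 avg60=0.25 avg300=0.05 total=65432
"""
psi = parse_pressure(SAMPLE)
print(psi["some"]["avg10"], psi["full"]["avg10"])
```
.fi
.PP
On a system with PSI enabled, the same function can be applied to the
contents of one of the files mentioned above.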
.PP
The remaining part of this manual page mainly describes the information
shown in text mode.
When certain descriptions also apply to bar graph mode it will be
mentioned explicitly.
.SH TEXT MODE IN GENERAL
With every interval information is shown about the resource occupation
on system level (CPU, memory, disks and network layers) in the upper part
of the screen. When a resource has been unused during the interval,
the line is suppressed unless the 'f' key is active.
In the bottom part of the screen cgroup level (key 'G') or process level
information is shown (keys 'g', 'c', 's', 'm', 'd', 'n', 'v', 'u' or 'p').
By default, only cgroups are shown that have processes assigned to the cgroup itself
or to the cgroups underneath, unless the 'a' key (all) is active.
By default, only processes are shown that were active during
the interval unless the 'a' key (all) is active.
.br
The intervals are repeated until the number of
.I samples
(specified as command argument) is reached, or until the key 'q' is pressed
in interactive mode.
.PP
When
.I atop
is started, it checks whether the standard output channel is connected to a
screen, or to a file/pipe. In the first case it produces screen control
codes (via the ncurses library) and behaves interactively. In the second case
it produces flat text output.
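.PP
The check on the standard output channel is the usual terminal test.
.I atop
itself is not written in Python; the following sketch only illustrates the
decision, and the function name and the mode names are hypothetical.
.PP
.nf
```python
import io
import sys


def output_mode(stream):
    """Return 'interactive' for a terminal, 'flat' for a file or pipe."""
    return "interactive" if stream.isatty() else "flat"


print(output_mode(sys.stdout))
print(output_mode(io.StringIO()))  # an in-memory buffer is not a terminal
```
.fi
.PP
When the script is run with its output redirected to a file or a pipe, both
calls report 'flat'.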
.PP
In interactive mode, the output of
.I atop
scales dynamically to the current dimensions of the screen/window.
.br
If the window is resized horizontally, columns will be added or removed
automatically. For this purpose, every column has a particular weight. The
columns with the highest weights that fit within the current width will
be shown.
.br
If the window is resized vertically, lines of the process/thread list
will be added or removed automatically.
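.PP
The selection of columns by weight after a horizontal resize can be pictured
with the following Python sketch. The weights, the column sizes and the greedy
strategy are assumptions chosen for illustration only; the real weights and
the exact selection rules are not described in this manual page.
.PP
.nf
```python
def select_columns(columns, width):
    """columns: list of (name, weight, size); returns the names to show."""
    chosen = set()
    used = 0
    # Consider the heaviest columns first and keep each one that still fits.
    for name, weight, size in sorted(columns, key=lambda c: c[1], reverse=True):
        needed = size + (1 if chosen else 0)  # one separator between columns
        if used + needed <= width:
            chosen.add(name)
            used += needed
    # Show the surviving columns in their original left-to-right order.
    return [name for name, _, _ in columns if name in chosen]


# Hypothetical weights and sizes for a few columns.
COLUMNS = [("PID", 10, 6), ("SYSCPU", 8, 6), ("VGROW", 3, 6),
           ("CPU", 9, 4), ("CMD", 10, 14)]
print(select_columns(COLUMNS, 40))
print(select_columns(COLUMNS, 30))
```
.fi
.PP
With a width of 40 positions all five columns fit; with 30 positions only the
columns PID, CPU and CMD remain.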
.PP
In interactive mode the output of
.I atop
can be controlled by pressing particular keys.
However, it is also possible to specify such a key as a
.B flag
on the command line. In that case
.I atop
switches to the indicated mode beforehand. This mode can
be modified again interactively. Specifying such a key as a flag
is especially useful when running
.I atop
with output to a pipe or file (non-interactively).
These flags are the same as the keys that can be pressed in interactive
mode (see section INTERACTIVE COMMANDS).
.br
Additional flags are available to support storage of atop-data in raw
format (see section RAW DATA STORAGE).
By default,
.I atop
produces its output full-screen unless a flag is passed to direct
the output to a raw log (-w) or to produce parsable (-P)
or JSON (-J) output. It is possible, however, to produce full-screen
output while the output is also directed in another way. In that
case a flag that relates to full-screen output (like -g) has to be
passed on the command line explicitly.
The status line in the initial screen shows if
.I atop
runs with restricted view (as unprivileged user) or unrestricted view
(as privileged user). In case of restricted view
.I atop
does not have the privileges (no root identity nor the necessary capabilities)
to retrieve all counter values on system level and on process level.
.SH PROCESS ACCOUNTING
With every interval,
.I atop
reads the kernel administration to obtain information about all
running processes.
However, it is likely that processes have terminated during the interval.
These processes might have consumed system resources during
this interval before they terminated.
Therefore,
.I atop
tries to read the process accounting records that contain the accounting
information of terminated processes and report these processes too.
Only when the process accounting mechanism in the kernel is activated,
the kernel writes such process accounting record to a file
for every process that terminates.
.PP
There are various ways for
.I atop
to get access to the process accounting records (tried in this order):
.PP
.TP 4
1.
When the environment variable ATOPACCT is set,
it specifies the name of the process accounting file.
In that case, process accounting for this file
should have been activated beforehand.
Before opening this file for reading,
.I atop
drops its root privileges (if any).
.br
When this environment variable is present but its
value is empty, process accounting will not be used at all.
.PP
.TP 4
2.
.B This is the preferred way of handling process accounting records!
.br
When the
.I atopacctd
daemon is active, it has activated the process accounting mechanism in
the kernel and transfers the original accounting records to shadow files.
In that case,
.I atop
drops its root privileges and opens the current shadow file for reading.
.br
This way is preferred, because the
.I atopacctd
daemon maintains full control of the size of the original process
accounting file written by the kernel and the shadow files read by the
.I atop
process(es).
The
.I atopacct
service will be activated before the
.I atop
service to enable
.I atop
to detect that process accounting is managed by the
.I atopacctd
daemon. As a forking service,
.I atopacctd
takes care that all directories and files are initialized before the
parent process dies. The child process continues as the daemon process.
For further information, refer to the
.B atopacctd
man page.
.PP
.TP 4
3.
When the
.I atopacctd
daemon is not active,
.I atop
verifies if the process accounting mechanism has been switched on
via the separate
.B psacct
or
.B acct
package (the package name depends on the Linux distro). In that case,
one of the files
.B /var/log/pacct,
.B /var/account/pacct
or
.B /var/log/account/pacct
is in use as process accounting file and
.I atop
opens this file for reading.
.PP
.TP 4
4.
As a last possibility,
.I atop
itself tries to activate the process accounting mechanism (requires root
privileges) using the file
.B /var/cache/atop.d/atop.acct
(to be written by the kernel, to be read by
.I atop
itself). Process accounting remains active as long as
at least one
.I atop
process is alive.
Whenever the last
.I atop
process stops (either by pressing 'q' or by 'kill \-15'), it deactivates the
process accounting mechanism again. Therefore you should never terminate
.I atop
by 'kill \-9', because then it has no chance to stop process accounting.
As a result, the accounting file may consume a lot of
disk space after a while.
.br
To avoid that the process accounting file consumes too much disk space,
.I atop
verifies at the end of every sample if the size of the process accounting
file exceeds 200 MiB and if this
.I atop
process is the only one that is currently using the file.
In that case the file is truncated to a size of zero.
Notice that root-privileges are required to switch on/off process accounting
in the kernel. You can start
.I atop
as a root user or specify setuid-root privileges to the executable file.
In the latter case,
.I atop
switches on process accounting and drops the root-privileges again.
.br
If
.I atop
does not run with root-privileges, it does not show information
about finished processes.
It indicates this situation with the
message 'no procacct' in the top-right corner (instead of the counter that
shows the number of exited processes).
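.PP
The order in which these four methods are tried can be summarized in a small
Python sketch. The function name, its parameters and the returned labels are
hypothetical. How
.I atop
detects an active daemon or a package-managed accounting file is not described
here, so both checks are passed in by the caller, and no path is given for the
shadow file because its location is not named above.
.PP
.nf
```python
import os

# Candidate files of the psacct/acct packages, as listed above.
PACKAGE_FILES = ["/var/log/pacct", "/var/account/pacct",
                 "/var/log/account/pacct"]
OWN_FILE = "/var/cache/atop.d/atop.acct"


def accounting_source(env, daemon_active, in_use, is_root):
    """Return (method, path); path is None when no file can be named."""
    if "ATOPACCT" in env:
        path = env["ATOPACCT"]
        # Present but empty: process accounting is not used at all.
        return ("environment", path) if path else ("disabled", None)
    if daemon_active:
        return ("atopacctd shadow file", None)
    for path in PACKAGE_FILES:
        if in_use(path):
            return ("package", path)
    if is_root:
        return ("atop itself", OWN_FILE)
    return ("no procacct", None)


print(accounting_source(os.environ, False, lambda path: False, False))
```
.fi
.PP
The last call represents an unprivileged run without daemon or accounting
package; unless ATOPACCT happens to be set in the environment, it ends at
'no procacct'.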
.PP
When during one interval a lot of processes have finished,
.I atop
might grow tremendously in memory when reading all process accounting
records at the end of the interval. To avoid such excessive growth
.I atop
will never read more than 50 MiB with process information from the
process accounting file per interval (approx. 54000 finished processes).
In interactive mode a warning is given whenever processes have been skipped
for this reason.
.PP
.SH COLORS
For the resource consumption on system level,
.I atop
uses colors in text mode to indicate that a critical occupation
percentage has been (almost) reached.
A critical occupation percentage means that it is likely that this load
causes a noticeable negative performance influence for applications using
this resource. The critical percentage depends on the type of resource:
e.g. the performance influence of a disk with a busy percentage of 80%
might be more noticeable for applications/users than a CPU with a busy
percentage of 90%.
.br
Currently
.I atop
uses the following default values to calculate a weighted percentage
per resource:
.PP
.TP 5
.B \ Processor
A busy percentage of 90% or higher is considered 'critical'
(also in bar graph mode).
.TP 5
.B \ Disk
A busy percentage of 90% or higher is considered 'critical'.
.TP 5
.B \ Network
A busy percentage of 90% or higher for the load of an interface is
considered 'critical'.
.TP 5
.B \ Memory
An occupation percentage of 90% is considered 'critical'.
Notice that this occupation percentage is the accumulated memory
consumption of the kernel (including slab) and all processes. The
memory for the page cache ('cache' and 'buff' in the MEM-line) and the
reclaimable part of the slab ('slrec') is not implied!
.br
If the number of pages swapped out ('swout' in the PAG-line) is larger
than 10 per second, the memory resource is considered 'critical'.
A value of at least 1 per second is considered 'almost critical'.
.br
If the committed virtual memory exceeds the limit ('vmcom' and 'vmlim'
in the SWP-line), the SWP-line is colored due to overcommitting the system.
.TP 5
.B \ Swap
An occupation percentage of 80% is considered 'critical'
because swap space might be completely exhausted in the near future.
It is not critical from a performance point-of-view.
.PP
These default values can be modified in the configuration file
(see separate
.B atoprc
man page).
.PP
When a resource exceeds its critical occupation percentage, the corresponding
values in the screen line are colored red by default.
.br
When a resource exceeds (by default) 80% of its critical percentage
(so it is almost critical), the corresponding values in the screen line
are colored cyan by default. This 'almost critical percentage' (one value
for all resources) can also be modified in the configuration file
(see separate
.B atoprc
man page).
.br
The default colors red and cyan can be modified in the configuration file
as well (see separate
.B atoprc
man page).
.PP
With the key 'x' (or flag \-x), the use of colors can be suppressed
in text mode. The use of colors is however mandatory in case of bar graph mode.
.SH NETATOP OR NETATOP-BPF MODULE
Per-process and per-thread network activity can be measured by the
.I netatop
kernel module or the
.I netatop-bpf
BPF module that can be separately installed.
.br
When
.I atop
gathers counters for a new interval, it verifies if the
.I netatop
or
.I netatop-bpf
module is currently active. If so,
.I atop
obtains the relevant network counters from this module and shows
the number of sent and received packets per process/thread in the generic
screen. Besides, detailed counters can be requested by
pressing the 'n' key.
.br
When the
.I netatopd
daemon is running in combination with the
.I netatop
module,
.I atop
also reads the network counters of exited processes that are logged
by this daemon (comparable with process accounting).
.PP
More information about the optional
.I netatop
kernel module and the
.I netatopd
daemon can be found in the corresponding man pages and on the website
mentioned at the end of this manual page.
.SH GPU STATISTICS GATHERING
GPU statistics can be gathered by
.I atopgpud
which is a separate data collection daemon process.
It gathers cumulative utilization counters of every Nvidia GPU
in the system, as well as utilization counters of every
process that uses a GPU.
When
.I atop
notices that the daemon is active, it reads these GPU utilization
counters with every interval.
The
.I atopgpud
daemon is written in Python, so a Python interpreter should be installed
on the target system.
For the gathering of the statistics, the
.I pynvml
module is used by the daemon. Be sure that this module is installed
on the target system before activating the daemon, by running the
command
.I pip
as root user:
.PP
.B \ pip install nvidia-ml-py
.PP
The
.I atopgpud
daemon is installed by default as part of the
.B atop
package, but it is
.I not
automatically enabled.
The daemon can be enabled and started by running the following commands
(as root):
.PP
.B \ systemctl enable atopgpu
.br
.B \ systemctl start atopgpu
.PP
A description of the utilization counters can be found in the section OUTPUT DESCRIPTION.
.SH INTERACTIVE COMMANDS
When running
.I atop
interactively (no output redirection), keys can be pressed to control the
output. In general, lower case keys can be used to show other information for
the active processes while certain upper case keys can be used to influence the
sort order of the active process/thread list. Some of these keys can also
be used to switch from bar graph mode to particular detailed process information
in text mode.
.PP
.TP 5
.B g
Show generic output (default).
Per process the following fields are shown in case of a window-width
of 80 positions:
process-id, CPU consumption during
the last interval in system and user mode, the virtual and resident
memory growth of the process.
.br
The data transfer per process for read/write on disk can only be shown
when
.I atop
runs with root privileges.
.br
When the optional module
.I netatop
or
.I netatop-bpf
is loaded, the data transfer for send/receive
of network packets is shown for each process.
.br
The last columns contain the state, the occupation percentage for the
chosen resource (default: CPU) and the process name.
When more than 80 positions are available, other information is added.
.PP
.TP 5
.B m
Show memory related output.
Per process the following fields are shown in case of a window width
of 80 positions:
process-id, minor and major
memory faults, size of virtual shared text, total virtual
process size, total resident process size, virtual and resident growth during
last interval, memory occupation percentage and process name.
When more than 80 positions are available, other information is added.
For memory consumption, always all processes are shown (also the processes
that were not active during the interval).
.PP
.TP 5
.B d
Show disk-related output.
When
.I atop
runs with root privileges, the following fields are shown:
process-id, amount of data read from disk, amount of data written to disk,
amount of data that was written but has been withdrawn again (WCANCL),
disk occupation percentage and process name.
.PP
.TP 5
.B n
Show network related output.
Per process the following fields are shown in case of a window width
of 80 positions:
process-id, thread-id,
total bandwidth for received packets,
total bandwidth for sent packets,
number of received TCP packets with the average size per packet (in bytes),
number of sent TCP packets with the average size per packet (in bytes),
number of received UDP packets with the average size per packet (in bytes),
number of sent UDP packets with the average size per packet (in bytes),
the network occupation percentage and process name.
.br
This information can only be shown when the optional module
.I netatop
or
.I netatop-bpf
is installed.
When more than 80 positions are available, other information is added.
.PP
.TP 5
.B s
Show scheduling characteristics.
Per process the following fields are shown in case of a window width
of 80 positions:
process-id,
number of threads in state 'running' (R),
number of threads in state 'interruptible sleeping' (S),
number of threads in state 'uninterruptible sleeping' (D),
number of threads in state 'idle' (I),
scheduling policy (normal timesharing, realtime round-robin, realtime fifo),
nice value, priority, realtime priority, current processor,
status, exit code, state, the occupation percentage for the chosen
resource and the process name.
When more than 80 positions are available, other information is added.
.PP
.TP 5
.B v
Show various process characteristics.
Per process the following fields are shown in case of a window width
of 80 positions:
process-id, user name and group,
start date and time, status (e.g. exit code if the process has finished),
state, the occupation percentage for the chosen resource and the process name.
When more than 80 positions are available, other information is added.
.PP
.TP 5
.B c
Show the command line of the process.
Per process the following fields are shown: process-id,
the occupation percentage for the chosen resource and the
command line including arguments.
.PP
.TP 5
.B G
Show cgroup v2 information.
Show a hierarchical structure of cgroups and related metrics.
Optionally, the processes assigned to each cgroup can be shown.
With the keys/flags '2' through '7' the level depth of the cgroups can
be chosen ('7' is default). With key/flag '8' the assigned active processes
are shown as well, except the kernel processes (usually in the root cgroup).
With key/flag '9' all active processes are shown, including the
kernel processes.
With key/flag 'a' (toggle) all cgroups and all processes are shown
instead of only the active cgroups and processes.
A cgroup is considered inactive when no processes are assigned to that cgroup
nor to the cgroups underneath.
With key/flag 'C' the output is sorted on CPU consumption, with
key/flag 'M' sorted on memory consumption and with
key/flag 'D' sorted on disk consumption.
.PP
.TP 5
.B e
Show GPU utilization.
Per process at least the following fields are shown:
process-id,
range of GPU numbers on which the process currently runs,
GPU busy percentage on all GPUs,
memory busy percentage (i.e. read and write accesses on memory) on all GPUs,
memory occupation at the moment of the sample,
average memory occupation during the sample, and
GPU percentage.
When the
.I atopgpud
daemon does not run with root privileges, the GPU busy percentage and
the memory busy percentage are not available on process level.
In that case, the GPU percentage on process level reflects the
GPU memory occupation instead of the preferred GPU busy percentage.
.PP
.TP 5
.B o
Show the user-defined line of the process.
In the configuration file the keyword
.I ownprocline
can be specified with the description of a user-defined output-line.
.br
Refer to the man-page of
.B atoprc
for a detailed description.
.PP
.TP 5
.B y
Show the individual threads within a process (toggle).
Single-threaded processes are still shown as one line.
.br
For multi-threaded processes, one line represents the process
while additional lines show the activity
per individual thread (in a different color). Depending on
the option 'a' (all or active toggle), all threads are shown
or only the threads that were active during the last interval.
Depending on the option 'Y' (sort threads), the threads per
process are sorted on the chosen sort criterion or not.
.br
Whether this key is active or not can be seen in the header line.
.PP
.TP 5
.B Y
Sort the threads per process when combined with option 'y' (toggle).
.PP
.TP 5
.B u
Show the process activity accumulated per user.
Per user the following fields are shown: number of processes active
or terminated during the last interval (or in total if combined with command 'a'),
accumulated CPU consumption during the last interval in system and user mode,
the current virtual and resident memory space consumed by active processes
(or all processes of the user if combined with command 'a').
.br
When
.I atop
runs with root privileges,
the accumulated read and write throughput on disk is shown.
When the optional module
.I netatop
or
.I netatop-bpf
has been installed,
the accumulated number of received and sent network packets is shown.
.br
The last columns contain the accumulated occupation percentage for the
chosen resource (default: CPU) and the user name.
.PP
.TP 5
.B p
Show the process activity accumulated per program (i.e. process name).
Per program the following fields are shown: number of processes active
or terminated during the last interval (or in total if combined with command 'a'),
accumulated CPU consumption during the last interval in system and user mode,
the current virtual and resident memory space consumed by active processes
(or all processes of the program if combined with command 'a').
.br
When
.I atop
runs with root privileges,
the accumulated read and write throughput on disk is shown.
When the optional module
.I netatop
or
.I netatop-bpf
has been installed,
the accumulated number of received and sent network packets is shown.
.br
The last columns contain the accumulated occupation percentage for the
chosen resource (default: CPU) and the program name.
.PP
.TP 5
.B j
Show the process activity accumulated per container/pod.
Per container (e.g. Docker/Podman) or pod (e.g. Kubernetes) the following fields
are shown: number of processes active
or terminated during the last interval (or in total if combined with command 'a'),
accumulated CPU consumption during the last interval in system and user mode,
the current virtual and resident memory space consumed by active processes
(or all processes of the container/pod if combined with command 'a').
.br
When
.I atop
runs with root privileges,
the accumulated read and write throughput on disk is shown.
When the optional module
.I netatop
or
.I netatop-bpf
has been installed,
the accumulated number of received and sent network packets is shown.
.br
The last columns contain the accumulated occupation percentage for the
chosen resource (default: CPU) and the container/pod name (CID/POD).
.PP
.TP 5
.B C
Sort the current list in the order of CPU consumption (default).
The one-but-last column changes to 'CPU'.
.PP
.TP 5
.B E
Sort the current list in the order of GPU utilization
(preferred, but only applicable when the
.I atopgpud
daemon runs with root privileges) or in the order of
GPU memory occupation.
The one-but-last column changes to 'GPU'.
.PP
.TP 5
.B M
Sort the current list in the order of resident memory consumption.
The one-but-last column changes to 'MEM'. In case of sorting on memory,
the full process list will be shown (not only the active processes).
.PP
.TP 5
.B D
Sort the current list in the order of disk accesses issued.
The one-but-last column changes to 'DSK'.
.PP
.TP 5
.B N
Sort the current list in the order of network bandwidth (received
and transmitted).
The one-but-last column changes to 'NET'.
.PP
.TP 5
.B A
Sort the current list automatically in the order of the busiest
system resource during this interval.
The one-but-last column shows either 'ACPU', 'AMEM', 'ADSK' or 'ANET'
(the leading 'A' indicates automatic sorting-order).
The busiest resource is determined by comparing the weighted
busy-percentages of the system resources, as described earlier in
the section COLORS.
.br
This option remains valid until
another sorting-order is explicitly selected.
.br
A sorting order for disk is only possible when
.I atop
runs with root privileges.
.br
A sorting order for network is only possible when the optional module
.I netatop
or
.I netatop-bpf
is loaded.
.PP
Miscellaneous interactive commands:
.PP
.TP 5
.B ?
Request help information (the key 'h' can be pressed as well).
.PP
.TP 5
.B V
Request version information (version number and date).
.PP
.TP 5
.B R
Gather and calculate the proportional set size of processes (toggle).
Gathering all values needed to calculate the PSIZE of a process
is very time-consuming, so this key should only be active when
analyzing the resident memory consumption of processes.
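.IP
On Linux the kernel reports the proportional set size per memory mapping
on the 'Pss:' lines of the file /proc/PID/smaps.
The Python sketch below sums those lines to show why gathering this value
is expensive: every mapping of every process has to be read and parsed.
Whether
.I atop
reads exactly this file is an assumption, and the function names are
hypothetical.
.IP
.nf
```python
def pss_kib(smaps_text: str) -> int:
    """Sum all 'Pss:' lines (in kB) of text in /proc/PID/smaps format."""
    total = 0
    for line in smaps_text.splitlines():
        fields = line.split()
        # Expected shape of a matching line: Pss: <number> kB
        # Lines such as Pss_Anon: or Pss_Dirty: are skipped on purpose.
        if len(fields) >= 2 and fields[0] == "Pss:":
            total += int(fields[1])
    return total


def read_pss_kib(pid: int) -> int:
    """Read the PSS of one process (may require root for other users)."""
    with open(f"/proc/{pid}/smaps") as smaps:
        return pss_kib(smaps.read())
```
.fi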
.PP
.TP 5
.B W
Get the WCHAN per thread (toggle).
Gathering of the WCHAN string per thread
is a relatively time-consuming task, so this key should only be made
active when analyzing the reason for threads to be in sleep state.
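.IP
On Linux the wait channel of a thread can be read from the file
/proc/PID/task/TID/wchan, which contains the name of the kernel function
in which the thread sleeps, or 0 when it is not waiting.
The Python sketch below collects this string for every thread of one process.
That
.I atop
uses this exact file is an assumption; the function name and the parameter
proc_root (added so that the code can be tried on a test directory)
are hypothetical.
.IP
.nf
```python
from pathlib import Path
from typing import Dict


def thread_wchans(pid: int, proc_root: str = "/proc") -> Dict[int, str]:
    """Map the thread id of each thread of pid to its wait channel."""
    result = {}
    task_dir = Path(proc_root) / str(pid) / "task"
    for tid_dir in task_dir.iterdir():
        if not tid_dir.name.isdigit():
            continue
        try:
            wchan = (tid_dir / "wchan").read_text().strip()
        except OSError:
            # The thread exited or access was denied while reading.
            continue
        # A value of 0 means the thread is not sleeping in the kernel.
        result[int(tid_dir.name)] = "" if wchan == "0" else wchan
    return result
```
.fi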
.PP
.TP 5
.B x
Suppress the colors that highlight critical resources (toggle).
.br
Whether this key is active or not can be seen in the header line.
.PP
.TP 5
.B z
The pause key can be used to freeze the current situation in order to
investigate the output on the screen. While
.I atop
is paused, the keys described above can be pressed to show other
information about the current list of processes.
Whenever the pause key is pressed again,
.I atop
continues with the next sample.