-
Notifications
You must be signed in to change notification settings - Fork 4
/
vpmem_hana_startup.py
1607 lines (1433 loc) · 65.2 KB
/
vpmem_hana_startup.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
#!/usr/bin/python3
'''
IBM "vpmem_hana_startup.py": A convenience script to recreate vPMEM or tmpfs
filesystems on boot if required.
This script
- Scans a supplied configuration file to determine:
- the parent UUID of the vPMEM volumes.
- the HANA sid.
- the parent directory under which to mount the vPMEM filesystems.
- Searches the device tree to locate the vPMEM devices associated with the UUID.
- For devices optimized for affinity (the expected case), multiple child
volumes may be discovered.
- Checkes if the devices have a valid file system on them.
- If no valid file systems found, format them with an xfs filesystem.
- Mounts each of the filesystems to a mount point representing their NUMA
associativity.
- Updates the HANA configuration file (global.ini) to reflect where the vPMEM
devices are mounted for each NUMA domain.
Assumptions:
- At least one vPMEM volume has been configured via the Power HMC.
- vPMEM usage is already activated in the HANA ini scripts.
- Default to xfs filesystems: -b 64k -s 512.
Dependencies:
- /sbin/lsprop
- /usr/bin/ndctl
- /usr/bin/numactl
Installation (sample files can be found under https://github.com/IBM/vpmem-hana-startup):
1. Choose a location for the script and config file
(referred to as /mountpoint/path/to below).
2. Create /mountpoint/path/to/vpmem_hana.cfg
[
{
"sid" : "<HANA instance name>"
,"nr" : "<HANA instance number>"
,"hostname" : "<HANA hostname>"
,"puuid" : "<parent vpmem volume uuid>"
,"mnt" : "<filesystem path to mount vpmem filesystems under>"
}
{
"sid" : "<HANA instance name>"
,"nr" : "<HANA instance number>"
,"hostname" : "<HANA hostname>"
,"mnt" : "<filesystem path to mount tmpfs filesystems under>"
}
]
3. Create /etc/systemd/system/vpmem_hana.service taking care of NOTEs below
[Unit]
Description=Virtual PMEM SAP HANA Startup Script
# NOTE: Ensure path to script is mounted. Replace /mountpoint as appropriate
RequiresMountsFor=/mountpoint
[Service]
Type=oneshot
# NOTE: Adjust the path to the startup script. Replace
# /mountpoint/path/to as appropriate
ExecStart=/bin/sh -c "/mountpoint/path/to/vpmem_hana_startup.py -c
/mountpoint/path/to/vpmem_hana.cfg -a -t
/mountpoint/path/to/vpmem_topology.sav"
[Install]
WantedBy=multi-user.target
4. Start service now and on reboot
systemctl start vpmem_hana.service
systemctl status vpmem_hana.service
systemctl enable vpmem_hana.service
'''
import getopt
import logging
import os
import re
import signal
import socket
import sys
import threading
import time
import json
import subprocess
import glob
from enum import IntEnum
import configparser
# Script usage:
USAGE = '''Usage: %s [-c <file>] [-t <file>] [-l <file>] [-s <size>] [-r] [-a] [-n] [-g] [-p] [-v] [-h]
OPTIONS
============ =================================================================
-c <file> Full path configuration file.
-t <file> Full path topology file.
-l <file> Full path log file.
-s <size> Total memory size for tmpfs filesystems in KB, MB, GB or TB,
e.g. 1024KB or 2048GB.
-r Recreate filesystem. This option forces recreation of the
filesystem(s) regardless of whether valid or not.
-a Activate vPMEM usage in HANA ini files. Default: updates only
mount point locations.
-n Filesystem numbering by index. Default is by NUMA node.
-g Record topology to file.
-p List volume parent UUIDs.
-v Print version.
-h Help.'''
# Constants used by the script:
VPMEM_SCRIPT_VERSION = "2.19"
PMEMSS_CMD_TIMEDOUT = "CMD_TIMEDOUT"
VPMEM_MAX_RETRIES = 3
# Absolute paths of tools needed by this script (verified during
# verifyDependencies()):
NDCTL_TOOL_PATH = "/usr/bin/ndctl"
BLKID_TOOL_PATH = "/sbin/blkid"
MKFS_TOOL_PATH = "/sbin/mkfs.xfs"
CAT_TOOL_PATH = "/usr/bin/cat"
LSPROP_TOOL_PATH = "/sbin/lsprop"
NUMACTL_TOOL_PATH = "/usr/bin/numactl"
UMOUNT_TOOL_PATH = "/usr/bin/umount"
DF_TOOL_PATH = "/usr/bin/df"
MKDIR_TOOL_PATH = "/usr/bin/mkdir"
MOUNT_TOOL_PATH = "/usr/bin/mount"
CHOWN_TOOL_PATH = "/usr/bin/chown"
CHMOD_TOOL_PATH = "/usr/bin/chmod"
MOUNTPOINT_TOOL_PATH = "/usr/bin/mountpoint"
RM_TOOL_PATH = "/usr/bin/rm"
# Logger instance this script is using:
vpmemLogger = None
class TL(IntEnum):
'''
Trace level enums to make the trace level during the trace() calls more
readable.
'''
ERROR = 0
WARNING = 1
INFO = 2
DEBUG = 3
CONSOLE = 2
LEVEL = 9
# Global variables used by this script:
# Name of this script.
vpmemScriptName = ''
# Name of the Linux distribution this script is running on.
vpmemDistro = ''
# Path of the log file. If the '-l' option is not used the default path is:
# /tmp/vpmem_hana_startup.<hostname>.log.
vpmemLogFilePath = ''
# Path of the configuration file specified by the '-c' option.
vpmemConfigFile = ''
# Path of the topology file specified by the '-t' option.
vpmemTopologyFile = ''
# Flag ('a' option) to control HANA usage of vPMEM via HANA ini files.
vpmemActivateUsage = False
# Flag ('r' option) to force recreation of the file systems.
vpmemRebuildFS = False
# Flag ('n' option) to numbering the file systems by index or by NUMA
# node ID (default).
vpmemFSSimpleNumbering = False
# Flag to store the topology to the file specified by the '-t' option.
vpmemRecordTopology = False
# Index used during numbering of the file systems when creating those and
# vpmemFSSimpleNumbering is True.
vpmemMntIndex = 0
# Dictionary used when creating the file systems containing the current index
# for file systems which do not belong to a specific NUMA node ("dot"-notation).
vpmemNUMANodeIdDict = {}
class PMEMSSDRFormatter(logging.Formatter):
'''
Derived Formatter to handle debug messages differently.
'''
def __init__(self, debug_fmt=None, fmt=None, datefmt=None):
self.fmt = fmt or "%(message)s"
self.dbg_fmt = debug_fmt or fmt
logging.Formatter.__init__(self, self.fmt, datefmt)
def format(self, record):
tzt = divmod(-(time.altzone if (time.daylight and
time.localtime().tm_isdst > 0) else time.timezone), 60)
timezone = str.format("{0:+03d}{1:02d}", tzt[0] // 60, tzt[1])
record.__dict__["timezone"] = timezone
if record.levelno in [logging.DEBUG, logging.ERROR]:
self._fmt = self.dbg_fmt
else:
self._fmt = self.fmt
return logging.Formatter.format(self, record)
def initLogger(logFilePath):
'''
Initializing the logger used for this script. Default log file path is
'/tmp/vpmem_hana_startup.<hostname>.log' otherwise specified by the user via
the '-l' option.
@param: logFilePath (str): Path to the log file.
@return: logger (logger instance): The logger instance this script is using.
'''
# Create the directory the log file lives in case it does not exist.
path, _ = os.path.split(logFilePath)
if not os.path.isdir(path):
os.mkdir(path)
# Set the time stamp and log line formats
isDSTActive = time.daylight and time.localtime().tm_isdst > 0
utcOffsetSec = -time.altzone if isDSTActive else -time.timezone
utcOffsetHours, utcOffsetMin = divmod(utcOffsetSec // 60, 60)
timezone = "{0:+03d}{1:02d}".format(utcOffsetHours, utcOffsetMin)
basicFormat = "%(asctime)s.%(msecs).3d%(timezone)5s: [%(levelname).1s] " \
"%(threadName)-15s %(message)-50s"
dateFormat = '%Y-%m-%d_%H:%M:%S'
formatter = PMEMSSDRFormatter(fmt=basicFormat, datefmt=dateFormat)
# Create the logger instance
logger = logging.getLogger(vpmemScriptName)
handler = logging.FileHandler(logFilePath)
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
# the default permission is '0o644'
os.chmod(logFilePath, 0o644)
return logger
def trace(level, fmt, *args):
'''
Trace function used troughout the script. Trace level 0 and 1 will go to
stdout and to the log file; all other trace levels only to the log file.
@param: level (TL int enum): Trace level.
@param: fmt (str): Format string.
@param: *args (list): Variable list of arguments.
@return: Nothing.
'''
if TL.CONSOLE >= level:
prefix = ''
if level == TL.ERROR:
prefix = ' ERROR:'
elif level == TL.WARNING:
prefix = ' WARNING:'
print('%s' % prefix, fmt % args)
sys.stdout.flush()
if TL.LEVEL >= level:
if level == TL.ERROR:
vpmemLogger.error(fmt, *args)
elif level == TL.WARNING:
vpmemLogger.warning(fmt, *args)
elif level == TL.INFO:
vpmemLogger.info(fmt, *args)
else: # TL.DEBUG
vpmemLogger.debug(fmt, *args)
def _stop_process(proc, logCmd, timeout):
'''
When a command runs into a timeout the expired timer calls this function to
print some meaningful message to the log file. In addition the function
kills the timed out command by sending a SIGTERM.
'''
try:
if proc.poll() is None:
trace(TL.DEBUG, "Command %s timed out after %s sec. Sending SIGTERM",
logCmd, timeout)
os.kill(proc.pid, signal.SIGTERM) # SIGKILL or SIGTERM
time.sleep(0.5)
if proc.poll() is None:
trace(TL.DEBUG, "Command %s timed out after %s sec. Sending SIGKILL",
logCmd, timeout)
os.kill(proc.pid, signal.SIGKILL)
except Exception as err:
vpmemLogger.exception(err)
def runCmd(args, timeout=42, sh=False, env=None, retry=0):
'''
Execute an external command, read the output and return it.
@param args (str|list of str): Command to be executed.
@param timeout (int): timeout in sec, after which the command is forcefully
terminated.
@param sh (bool): True if the command is to be run in a shell and False if
directly. If the command contains arguments which must be
interpreted by a shell, e.g. wildcards this parameter must
be set to True.
@param env (dict): Environment variables for the new process (instead of
inheriting from the current process).
@param retry (int): Number of retries on command timeout.
@return: (stdout, stderr, rc) (str, str, int): The output of the command.
'''
traceCmd = False
if isinstance(args, str):
logCmd = args
else:
logCmd = ' '.join(args)
try:
if env is not None:
fullenv = dict(os.environ)
fullenv.update(env)
env = fullenv
if sh:
cmd = ' '.join(args)
else:
cmd = args
# create the subprocess, ensuring a new process group is spawned
# so we can later kill the process and all its child processes
proc = subprocess.Popen(cmd, shell=sh,
stdout=subprocess.PIPE, stderr=subprocess.PIPE,
close_fds=False, env=env)
timer = threading.Timer(timeout, _stop_process, [proc, logCmd, timeout])
timer.start()
(sout, serr) = proc.communicate()
timer.cancel() # stop the timer when we got data from process
ret = proc.poll()
except OSError as err:
trace(TL.DEBUG, "%s", str(err))
sout = ""
serr = str(err)
ret = 127 if "No such file" in serr else 255
finally:
try:
proc.stdout.close()
proc.stderr.close()
except: #pylint: disable=bare-except
pass
timeout = ret in (-signal.SIGTERM, -signal.SIGKILL) # 143,137
if ret == -6 and retry >= 0 : # special handling for sigAbrt
trace(TL.DEBUG, "retry abrt %s", args)
(sout, serr, ret) = runCmd(args, timeout, sh, env, -1)
if timeout and retry > 0:
retry -= 1
trace(TL.DEBUG, "Retry command %s counter: %s", args, retry)
(sout, serr, ret) = runCmd(args, timeout, sh, env, retry)
elif timeout:
serr = PMEMSS_CMD_TIMEDOUT
trace(TL.DEBUG, "runCMD: '%s' Timeout:%d ret:%s", args, timeout, ret)
elif traceCmd:
trace(TL.DEBUG, "runCMD: '%s' Timeout:%d ret:%s \n%s \n%s", args, timeout,
ret, serr, sout)
if not ret:
trace(TL.DEBUG, "Run cmd '%s' succeed (ret 0): \n%s\n", args, sout)
else:
trace(TL.DEBUG, "Run cmd '%s' failed (ret %d): \n%s\n", args, ret, serr)
return (sout.decode(), serr.decode(), ret)
def runCmdExitOnError(args, timeout=42, sh=False, env=None, retry=0):
'''
Short form of runCmd(). It takes the same parameters as runCmd() but it
exits immediatly in case runCmd() returned with a error.
@param args (str|list of str): Command to be executed.
@param timeout (int): timeout in sec, after which the command is forcefully
terminated.
@param sh (bool): True if the command is to be run in a shell and False if
directly. If the command contains arguments which must be
interpreted by a shell, e.g. wildcards this parameter must
be set to True.
@param env (dict): Environment variables for the new process (instead of
inheriting from the current process).
@param retry (int): Number of retries on command timeout.
@return: Nothing.
'''
_, _, ret = runCmd(args, timeout, sh, env, retry)
if ret != 0:
errStr = "Command " + str(args) + " failed (rc " + str(ret) + ")."
epilogAndExit(errStr, ret)
def epilogAndExit(epilog, exitCode, printUsage=False):
'''
Helper function to print an epilog and exit.
@param epilog (str): String to be printed.
@param exitCode (int): Exit code this script ends with.
@param printUsage (bool): True: Print script usage to stderr. False: Print
epilog to stdout and to the log file with an additional
hint to the written log file in case the exit code is not
zero.
@return: exitCode (int): Exit code to the caller of this scriopt.
'''
if not printUsage:
if exitCode:
trace(TL.INFO, "%s See log file %s for more details.", epilog,
vpmemLogFilePath)
elif len(epilog) > 0:
trace(TL.INFO, "%s", epilog)
else:
sys.stderr.write(epilog + "\n")
sys.stderr.write(USAGE % sys.argv[0] + "\n")
sys.exit(exitCode)
def unmountCmd(path, loop=False, printErr=True):
'''
Unmount specified file system.
@param: path (str): The path of the file system to unmount.
@param: loop (bool): Flag, to decide if the unmount command should run in a
loop until VPMEM_MAX_RETRIES reached.
@param: printErr (bool): Suppress error message on stdout.
@return: (bool): True if the unmount succeeds otherwise False.
'''
if not os.path.exists(path):
trace(TL.ERROR, "Path %s does not exist", path)
return False
failed = 0
while True:
cmd = [UMOUNT_TOOL_PATH, '-f', path]
_, serr, ret = runCmd(cmd)
if ret == 0:
trace(TL.DEBUG, "Unmounted %s successfully", path)
return True
failed += 1
if printErr:
trace(TL.ERROR, "Unmounting %s failed (rc: %d serr: %s)",
path, ret, serr)
if loop:
time.sleep(1)
if not loop or failed == VPMEM_MAX_RETRIES:
break
return False
def naturalSort(listToSort):
'''
Natural sort function.
@param: listToSort (list): List to sort.
@return: (list): Natural sorted list.
'''
def convert(text):
# Convert text to lower case for further sort processing.
return int(text) if text.isdigit() else text.lower()
def getAlphanumKey(text):
return [convert(c) for c in re.split('([0-9]+)', text)]
if len(listToSort) == 0:
return listToSort
return sorted(listToSort, key=getAlphanumKey)
def getTotalMemory():
'''
Return the total amount of memory in kB. This is used when creating tmpfs
volumes to calculate the size of tmpfs volumes.
@return: memTotalkB (int): Total memory in kB.
'''
memTotalkB = 0
with open('/proc/meminfo', encoding="utf-8") as fh:
memInfo = fh.read()
match = re.search(r'^MemTotal:\s+(\d+)', memInfo)
if match:
memTotalkB = int(match.groups()[0])
trace(TL.DEBUG, "Total memory: %d kB", memTotalkB)
return memTotalkB
def verifyDependencies():
'''
Verify that the installed Python version matches (here: major version 3 or
newer) and that various utilities we need (lsprop, ndctl, ...) are available.
@param: None.
@return: bool: False if the verification steps fail otherwise True.
'''
trace(TL.DEBUG, "Used Python version: '%s'", sys.version)
if sys.version_info.major < 3:
trace(TL.ERROR, "Python version 3 or newer required, but found Python "
"version %d.%d.", sys.version_info.major, sys.version_info.minor)
return False
for toolPath in (NDCTL_TOOL_PATH, BLKID_TOOL_PATH, MKFS_TOOL_PATH,
CAT_TOOL_PATH, LSPROP_TOOL_PATH, NUMACTL_TOOL_PATH,
UMOUNT_TOOL_PATH, DF_TOOL_PATH, MKDIR_TOOL_PATH,
MOUNT_TOOL_PATH, CHOWN_TOOL_PATH, CHMOD_TOOL_PATH,
MOUNTPOINT_TOOL_PATH, RM_TOOL_PATH):
if not os.path.isfile(toolPath):
trace(TL.ERROR, "Required dependency %s not found", toolPath)
return False
return True
def verifyJSON(jsonFilePath):
'''
Verify if the specified file contains a valid JSON structure.
@param: jsonFilePath (str): Path to the JSON based file.
@return: bool: False if the verification step fail otherwise True.
'''
with open(jsonFilePath, encoding="utf-8") as fh:
try:
json.load(fh)
except ValueError as err:
trace(TL.ERROR, "%s is not a valid JSON file (err %s).",
jsonFilePath, err)
return False
return True
def verifyPermissions():
'''
Verify that this script runs as root.
@param: None.
@return: bool: False if the verification step fail otherwise True.
'''
if not os.geteuid() == 0:
trace(TL.ERROR, "This script must be run as root.")
return False
else:
return True
def getValueFromFileWithKeyAndDelimiter(filePath, delimiter, key):
'''
Extracts a value from a flat file of the structure <KEY><DELIMITER><VALUE>,
e.g. PRETTY_NAME="SUSE Linux Enterprise Server 15 SP4"
@param: filePath (str): Path to the file.
@param: delimiter (str): Delimiter between key and value.
@param: key (str): The key for which the value should be returned.
@return: Tuple of (bool, str): True: The returned string is the value for
the specified key. False: The key was not found in the file and the
string 'UNKNOWN' will be returned as value.
'''
myDict = {}
with open(filePath, encoding="utf-8") as file:
for line in file:
if not line.strip():
continue
k, v = line.rstrip().split(delimiter)
myDict[k] = v.strip('"')
if key not in myDict:
return (False, 'UNKNOWN')
else:
return (True, myDict[key])
def verifyAndGetCfgInfo(cfgDict, cfgFilePath):
'''
Verify the configuration values from the passed-in dictionary. The following
configuration values will be verified from the dictionary:
"sid" : "<HANA instance name>"
"nr" : "<HANA instance number>"
"mnt" : "<filesystem path to mount vpmem filesystems under>"
If one of them did not exist the script returns immediatly with an error
(False), which means the current configuration is invalid.
"puuid" : ""
If the puuid does not exist in the configuration file it is assumed that the
file system type if 'tmpfs' otherwise 'vpmem'.
"hostname" : "<hostname of the host HANA is installed on>"
If the hostname does not exist in the configuration file this function
extracts it from the OS via a socket system call.
@param: cfgDict (dict): Dictionary containing the configuration.
@param: cfgFilePath (str): Path to the configuration file.
@return: Tuple of (bool, dict): False, if an error occurred. In this case
the returned dictionary cannot be used. True, if the verification
succeed and the dictionary with the configuration values consisting
of key-value tuples.
'''
trace(TL.INFO, "Verifying configuration parameters in %s.", cfgFilePath)
if "sid" not in cfgDict:
trace(TL.ERROR, "SID not specified in configuration file. Keyword: "
"'sid'")
return (False, cfgDict)
if "nr" not in cfgDict:
trace(TL.ERROR, "Instance number not specified in configuration file. "
"Keyword: 'nr'")
return (False, cfgDict)
if "mnt" not in cfgDict:
trace(TL.ERROR, "Filesystem mountpoint not specified in configuration "
"file. Keyword: 'mnt'")
return (False, cfgDict)
# Get the sid and convert it to upper and lower case for later use
sid = cfgDict.get("sid")
cfgDict["siduc"] = sid.upper()
cfgDict["sidlc"] = sid.lower()
puuids = cfgDict.get("puuid")
if puuids is not None:
# Convert a single puuid string into a list of puuids for further
# processing.
if isinstance(puuids, str):
puuids = [puuids]
cfgDict["puuid"] = puuids
for puuid in puuids:
if (len(puuid) != 36 or
not re.match(r"[A-F0-9a-f]{8}-[A-F0-9a-f]{4}-[A-F0-9a-f]{4}-" \
"[A-F0-9a-f]{4}-[A-F0-9a-f]{12}\}?$", puuid)):
trace(TL.ERROR, "Invalid UUID specified: '%s'.", puuid)
return (False, cfgDict)
trace(TL.INFO, "Valid UUID found: '%s'.", puuid)
trace(TL.INFO, "Config parameter (puuid): UUID=%s", puuids)
cfgDict['type'] = "vpmem"
else:
cfgDict['type'] = "tmpfs"
# Accept 'host' as well as 'hostname' key for hostname configuration entry.
# In case both do not exist in the configuration file get it via the socket
# modul.
if cfgDict.get("host") is None:
if cfgDict.get("hostname") is None:
trace(TL.DEBUG, "Hostname not specified in script configuration "
"file. Keyword: 'host' or 'hostname'. Evaluating via OS.")
cfgDict['hostname'] = socket.gethostname().split('.')[0]
else:
cfgDict['hostname'] = cfgDict.pop('host') # Rename key to 'hostname'
trace(TL.INFO, "Config parameter (sid): HANA SID=%s", cfgDict.get("sid"))
trace(TL.INFO, "Config parameter (nr): HANA instance no=%s", cfgDict.get("nr"))
trace(TL.INFO, "Config parameter (mnt): Mount point=%s", cfgDict.get("mnt"))
trace(TL.INFO, "Config parameter (host OR hostname): Hostname=%s", cfgDict.get("hostname"))
trace(TL.INFO, "Config parameter (type): File system type=%s", cfgDict.get("type"))
return (True, cfgDict)
def convertMemorySizeToKB(memSize):
'''
Converts memory size into KB.
@param: memSize (str): Memory size to be converted.
@return: memSizeInKB (int): Converted memory size in KB.
'''
units = {"KB": 1, "MB": 2**10, "GB": 2**20, "TB": 2**30}
memSizeUC = memSize.strip().upper()
memSizeInKB = 0
if not re.match(r' ', memSizeUC):
memSizeUC = re.sub(r'([KMGT]?B)', r' \1', memSizeUC)
(number, unit) = [str.strip() for str in memSizeUC.split()]
memSizeInKB = int(float(number) * units[unit])
trace(TL.DEBUG, "Converted '%s' to %d kB.", memSizeUC, memSizeInKB)
return memSizeInKB
def calcTmpFSMemSizes(totalMemSizeReq, NUMANodeList):
'''
Verify the specfied total tmpfs memory size and calculate the portions of
tmpfs volume sizes based on the number of NUMA node and the total memory
size available for every single NUMA node.
@param: totalMemSizeReq (str): Total tmpfs memory size specified by the
caller of this script by using the '-s' option.
@param: NUMANodeList (list): List of NUMA node Ids used to evaluate the
total memory size for every single NUMA node.
@return: tuple of (valid, memSizePerNUMANodeDict) (bool, dict): valid is
True if the passed-in totalMemSize is valid and all sanity checks
succeeded otherwise False. In case valid is True
memSizePerNUMANodeDict contains for every single NUMA node its
memory size in KB otherwise an empty dictionary. The sum of all
memory sizes in this dictionary is equal totalMemSizeReq. The
partial memory sizes will be calculated based on the total memory
available for every single NUMA node.
'''
memSizePerNUMANodeDict = {}
# Empty string means the caller of this script did not specify the total
# tmpfs memory size; just return.
if not totalMemSizeReq:
return (True, memSizePerNUMANodeDict)
# Is the specified size starting with a digit?
if not totalMemSizeReq.strip()[0].isdigit():
trace(TL.ERROR, "Invalid format of memory size: '%s' expected: "
"<INTEGER>KB/MB/TB", totalMemSizeReq)
return (False, memSizePerNUMANodeDict)
totalMemSizekB = getTotalMemory()
totalMemSizeReqkB = convertMemorySizeToKB(totalMemSizeReq)
if totalMemSizeReqkB >= totalMemSizekB:
trace(TL.ERROR, "Total memory size requested (%d kB) for tmpfs "
"equal or greater total memory available (%d kB).",
totalMemSizeReqkB, totalMemSizekB)
return (False, memSizePerNUMANodeDict)
# Print a warning, if the requested memory size is greater/equal 90% of the
# total memory size configured for the LPAR to give the caller a hint that
# the requested tmpfs size is close to the total memory size.
if totalMemSizeReqkB >= round(0.9 * totalMemSizekB):
trace(TL.WARNING, "Total memory size requested (%d kB) for tmpfs "
"reaches 90%% of the total memory available (%d kB).",
totalMemSizeReqkB, totalMemSizekB)
trace(TL.DEBUG, "Parse and verify memory size: totalMemSize (kB): %d "
"totalMemSizeReq (kB): %d, NUMA node list: %s", totalMemSizekB,
totalMemSizeReqkB, NUMANodeList)
# Get the total memory size of every NUMA node.
cmd = [NUMACTL_TOOL_PATH, '-H']
sout, _, ret = runCmd(cmd)
if len(sout) == 0 or ret != 0:
trace(TL.ERROR, "%s failed (rc: %d).", cmd, ret)
return (False, memSizePerNUMANodeDict)
regex = r'node (\d+) size: (\d+) (\w+)'
matches = re.findall(regex, sout)
if not matches:
trace(TL.ERROR, "Unexpected output returned by %s.", cmd)
return (False, memSizePerNUMANodeDict)
totalMemDictPerNUMA = dict((nodeId, (totalMem, suffix))
for (nodeId, totalMem, suffix) in matches)
# Iterate over all NUMA nodes and calculate the memory size fraction for
# this node based on the total memory for this node.
sumMemSizeAllNUMANodes = 0
for nodeId in NUMANodeList:
if nodeId not in totalMemDictPerNUMA.keys():
trace(TL.ERROR, "NUMA node 'node%s' not in output of %s listed.",
nodeId, cmd)
return (False, memSizePerNUMANodeDict)
totalMemSizePerNUMAkB = convertMemorySizeToKB(totalMemDictPerNUMA[nodeId][0] +
totalMemDictPerNUMA[nodeId][1])
memSizeReqPerNUMAkB = round((float(totalMemSizePerNUMAkB / totalMemSizekB)) *
totalMemSizeReqkB)
trace(TL.DEBUG, "NUMA node: %s total memory size: %s kB size tmpfs "
"for this node: %d kB", nodeId, totalMemSizePerNUMAkB,
memSizeReqPerNUMAkB)
memSizePerNUMANodeDict[nodeId] = str(memSizeReqPerNUMAkB)
sumMemSizeAllNUMANodes += memSizeReqPerNUMAkB
trace(TL.DEBUG, "totalMemSizeReq (kB): %d, sumMemSizeAllNUMANodes (kB): %d "
"diff (kB): %d", totalMemSizeReqkB, sumMemSizeAllNUMANodes,
(totalMemSizeReqkB - sumMemSizeAllNUMANodes))
return (True, memSizePerNUMANodeDict)
def listFileSystemSummary(sid, fileSystemList, NUMANodeList):
'''
Print a summary for the vPMEM file systems by using the Linux df command.
@param: sid (str): The HANA instance ID the vPMEM volumes belong to.
@param: fileSystemList (list): List of file system volumes to print.
@param: NUMANodeList (list): List of NUMA node Ids.
@return: (bool): Always True
'''
trace(TL.DEBUG, "List vPMEM summary enter; SID: %s file system list: %s "
"NUMA node list: %s", sid, fileSystemList, NUMANodeList)
if len(fileSystemList) == 0:
trace(TL.INFO, "No file system volume(s) defined.")
return True
if len(fileSystemList) != len(NUMANodeList):
trace(TL.ERROR, "Error: NUMA node and file system list are of unequal "
"length. Dubious results.")
else:
trace(TL.INFO, "File system summary:\n")
# Print header
print("Instance: %3s" % sid)
print("%4s %9s %9s %9s %s" % ("Numa", "", "", "Percent", ""))
print("%4s %9s %9s %9s %s" % ("Node", "Available", "Used", "Used",
"Mountpoint"))
print("%4s %9s %9s %9s %s" % ("----", "---------", "---------",
"---------",
"------------------------------------"))
# Get and print file system disk space usage
for fs, nn in zip(fileSystemList, NUMANodeList):
cmd = [DF_TOOL_PATH, '-h', '--output=avail,used,pcent', fs]
sout, _, ret = runCmd(cmd)
if len(sout) > 0 and ret == 0:
# The df command returns something like:
# Avail Used Use%
# 498G 1015M 1%
# hence, skip the header of df output by getting just the 2nd
# line.
for line in sout.split('\n'):
if line.strip().lower().startswith("avail") or \
len(line.strip()) == 0:
continue
(avail, used, pcent) = line.strip().split()
print("%4s %9s %9s %9s %s" % (nn, avail, used, pcent, fs))
print("\n") # Nicer output
return True
def listPUUIDs():
'''
Print a list of PUUID's to stdout by using the lsprop command in combination
with a regular expression.
The regular expression extracts the region AND the PUUID from the output
returned by the lsprop command.
Example for the region substring: 'region123'
Example for the PUUID substring: '4d1c54f4-1a75-4e4c-817e-bdb65222c601'
'''
regex = r"region[0-9]+|[A-F0-9a-f]{8}-[A-F0-9a-f]{4}-[A-F0-9a-f]{4}-" \
"[A-F0-9a-f]{4}-[A-F0-9a-f]{12}"
# Command must run in a shell, because the command list contains wildcards
# which must be interpreted by a shell
cmd = [LSPROP_TOOL_PATH, '/sys/devices/ndbus*/region*/of_node/ibm,unit-parent-guid']
sout, _, ret = runCmd(cmd, sh=True)
if ret == 0:
if len(sout) > 0:
matches = re.findall(regex, sout)
# Print header
print("%10s %4s %13s %s" % ("vPMEM ", "Numa", "", ""))
print("%10s %4s %13s %s" % ("Region ", "Node", "Size", "Parent UUID"))
print("%10s %4s %13s %s" % ("----------", "----", "-------------",
"------------------------------------"))
# Convert list into dictionary for further processing
regionDict = dict(map(lambda i: (matches[i], matches[i+1]),
range(len(matches)-1)[::2]))
sortedKeys = naturalSort(regionDict.keys())
for region in sortedKeys:
puuid = regionDict[region]
cmd = [CAT_TOOL_PATH, '/sys/devices/ndbus*/' + region + '/size']
sout, _, ret = runCmd(cmd, sh=True)
if len(sout) > 0 and ret == 0:
size = sout.strip()
cmd = [CAT_TOOL_PATH, '/sys/devices/ndbus*/' + region + '/numa_node']
sout, _, ret = runCmd(cmd, sh=True)
if len(sout) > 0 and ret == 0:
numanode = sout.strip()
print("%10s %4s %13s %s" % (region, numanode, size, puuid))
else:
trace(TL.INFO, "No volume parent UUIDs found.")
return True
def getTmpFSMnts(mntParent, sid):
'''
Prints the tmp file system volumes to stdout by using the /proc/mounts
entries.
@param: mntParent (str): Base mount point.
@param: sid (str): HANA instance ID the tmpfs volumes belong to.
@return: myFileSystemList (list): List of tmpfs file systems identified by
this function.
'''
myFileSystemList = []
myNUMANodeList = []
trace(TL.DEBUG, "Get tmpfs mounts enter; SID: %s mount parent: %s", sid, mntParent)
printedHeader = False
mntParent = mntParent + '/' + sid.upper()
mounts = None
with open('/proc/mounts','r', encoding="utf-8") as file:
mounts = [line.split() for line in file.readlines()]
for mount in mounts:
if len(mount) < 6 or not mount[1].startswith(mntParent) or \
mount[2] != 'tmpfs' or sid not in mount[1]:
continue
if not printedHeader:
trace(TL.INFO, "Following existing tmpfs file systems have been found:\n")
print("Name Mountpoint "
"Type Options Node")
print("------------ ------------------------------------------------ "
"-------- --------------------------------------------------------- ----")
printedHeader = True
(mntname, mntpoint, mnttype, mntopts, _, _) = mount[:]
node = ''
if "prefer" in mntopts:
match = re.search(r'prefer:(\d+)', mntopts) # Extracting the node
if match:
node = match.group(1)
print(' '.join([mntname.rjust(12), mntpoint.rjust(48),
mnttype.rjust(8), mntopts.rjust(57), node.rjust(4)]))
myFileSystemList.append(mntpoint)
myNUMANodeList.append(node)
myFileSystemList = naturalSort(myFileSystemList)
if not myFileSystemList:
trace(TL.INFO, "No tmpfs file system found.")
else:
print("\n") # Nicer output
trace(TL.DEBUG, "Get tmpfs mounts exit; file system list: %s NUMA node list: %s",
myFileSystemList, myNUMANodeList)
return myFileSystemList
def getNUMANodes(recordTopology, topologyFilePath):
''' Extracts the NUMA node Id's from the sys file system and puts them into
a list for further processing. The functions returns two lists, current
found NUMA nodes and the previous list of NUMA nodes. If recordTopology
is True the extracted NUMA node Id's will be stored in the file
specified by topologyFilePath.
@param: recordTopology (bool): Flag to store the extracted NUMA topology
to a file.
@param: topologyFilePath (str): Path to file to which the NUMA topology
will be stored in case recordTopology is True. When
recordTopology is False prevNUMANodeIds will contain the
(previous) NUMA node IDs read from the this topolopy file.
@return: Tuple of curNUMANodeIds (list): List of current NUMA node IDs
and prevNUMANodeIds (list): List of previous NUMA node IDs in
case the topologyFilePath exists and recordTopology is False.
'''
curNUMANodeIds = []
prevNUMANodeIds = []
trace(TL.INFO, "Extracting NUMA nodes.")
SYS_PATH = "/sys/devices/system/node/node"
regEx = r"" + SYS_PATH + "(\d+)"
for nodepath in glob.glob(SYS_PATH + '*'):
match = re.findall(regEx, nodepath)
cmd = ['compgen', '-G', nodepath + '/memory*']
# Command must run in a shell, because compgen is a shell buildin cmd
_, _, ret = runCmd(cmd, sh=True)
if len(match) and ret == 0:
trace(TL.DEBUG, "Found NUMA node with id=%s (rc: %d)", match[0], ret)
curNUMANodeIds.append(match[0])
if len(curNUMANodeIds) > 0:
curNUMANodeIds = naturalSort(curNUMANodeIds)
trace(TL.INFO, "Found %d NUMA nodes.", len(curNUMANodeIds))
if recordTopology:
trace(TL.INFO, "Writing %d NUMA node IDs to %s.",
len(curNUMANodeIds), topologyFilePath)
trace(TL.DEBUG, "NUMA node IDs: %s", curNUMANodeIds)
with open(topologyFilePath, 'w', encoding="utf-8") as file:
file.write(' '.join(curNUMANodeIds) + '\n')
else:
if (len(topologyFilePath) and os.path.exists(topologyFilePath)):
with open(topologyFilePath, 'r', encoding="utf-8") as file:
lines = file.readlines()
for line in lines:
numaNodeList = line.rstrip().split(' ')
prevNUMANodeIds.extend(numaNodeList)
prevNUMANodeIds = naturalSort(prevNUMANodeIds)
trace(TL.DEBUG, "Previous NUMA node list: %s read from: %s",
prevNUMANodeIds, topologyFilePath)
trace(TL.DEBUG, "Found %d NUMA nodes with id(s): %s", len(curNUMANodeIds),
str(curNUMANodeIds))
return (curNUMANodeIds, prevNUMANodeIds)
def removeTmpfsAllMnts(fileSystemList):
'''
Remove all tmpfs file systems by using the Linux umount command.
@param: fileSystemList (list): List of tmpfs file systems to be removed.
@return: True
'''
trace(TL.DEBUG, "Remove tmpfs mounts enter; file system list: %s",
fileSystemList)
failed = 0
for fs in fileSystemList:
if not unmountCmd(fs):
failed += 1
trace(TL.DEBUG, "Unmounted tmpfs; succeeded: %d failed: %d",
(len(fileSystemList) - failed), failed)
trace(TL.INFO, "Removed %d tmpfs file systems.",
len(fileSystemList) - failed)
return True
def createTmpFSMounts(sid, mntparent, numaNodes, memSizePerNUMANode):
'''
Create tmpfs file systems by using the Linux commands:
-mkdir: to create the mount point the tmpfs volume lives
-mount: to create the tmpfs volume
-chown: to set the owner to the tmpfs volume
-chmod: to set the permissions for the tmpfs volume
In case one of these commands fail the script returns immediatly with an
error and exits (with on exit code) otherwise it adds the created tmpfs file
system to myFileSystemList which will be returned at the end.
@param: sid (str): HANA instance ID the tmpfs file systems created for.
@param: mntparent (str): Base path for the mount point.
@param: numaNodes (list): List of NUMA node IDs the tmpfs file system
volumes belong to.
@param: memSizePerNUMANode (dict): Memory size for tmpfs volumes of every
single NUMA node in kB.
@return: myFileSystemList (list): List of tmpfs file systems created by this
function.
'''
trace(TL.DEBUG, "Create tmpfs mounts SID: %s mount parent: %s NUMA nodes: %s",
sid, mntparent, numaNodes)
myFileSystemList = []
if not sid or not mntparent or not numaNodes:
trace(TL.ERROR, "HANA SID or mount point or NUMA node list empty.")
return myFileSystemList
siduc = sid.upper()
sidlc = sid.lower()
basePath = mntparent + '/' + siduc + '/' + "node"
# Split the default size of tmpfs (50%) accross all NUMA nodes to avoid
# overbooking the memory by tmpfs when not using the size option.
if not memSizePerNUMANode:
memTotalkB = getTotalMemory()
sizeOpt = str(round((memTotalkB/2)/len(numaNodes))) + 'k'
trace(TL.DEBUG, "Size for tmpfs volume(s): %sB", sizeOpt)
# ATTENTION: numaNodes must be an array of grouped digits
for nodeId in numaNodes:
fs = basePath + nodeId
if memSizePerNUMANode:
sizeOpt = memSizePerNUMANode[nodeId] + 'k'
trace(TL.DEBUG, "Size for tmpfs volume for NUMA node %s: %sB",
nodeId, sizeOpt)