HCQC is a tool for investigating the quality of the code that a compiler generates for the kernel part of an HPC application.
Many HPC applications have a few hot spots which are a very narrow range of code, consisting of one function or several consecutive loops. These hot spots occupy most of the program execution time. Therefore, the quality of the code of the hot spot is important for performance.
HCQC collects metric data about the quality of the hot spot code that compilers generate for registered test programs. Data from a single compiler says little on its own, but comparing the results of multiple compilers is meaningful. A typical comparison method is as follows.
- On Architecture A, Compiler X vs. Compiler Y
  This can evaluate the advantages and disadvantages of compilers X and Y.
- Compiler X version V vs. Compiler X version W
  This can check the effect of changes in compiler version.
- Compiler X on Architecture A vs. Architecture B
  This can confirm the lack of compiler X's features when the architecture changes.
- Compiler X on Architecture A vs. Compiler Y on Architecture B
  If the architecture A is new and the architecture B is mature, this comparison will provide important information on compiler X's enhancement.
HCQC is a tool to help improve the performance of hot spots.
HCQC currently deals mainly with GCC and Clang/LLVM on Linux for the 64-bit ARM architecture (AArch64). For other architectures or compilers, see How to Add New Architectures or Compilers.
In the following, ${INSTALL_DIRECTORY} shows the directory where hcqc exists.
To execute HCQC, it is necessary to define a compiler and command line options for the compiler to be investigated.
The definition of the investigation target is described in the configuration file of JSON format placed in directory ${INSTALL_DIRECTORY}/hcqc/config.
For example, to investigate the optimization level -O2 of GCC version 7.1.1 whose absolute path is /usr/bin/gcc, write the configuration file as follows:
{
"DISTRIBUTION" : "OpenSUSE Tumbleweed",
"ARCH" : "aarch64",
"CPU" : "AMD Opteron A1100 Cortex A57",
"LANGUAGE" : "C",
"COMPILER" : "GCC",
"COMMAND" : "/usr/bin/gcc",
"VERSION" : "7.1.1",
"OPT_FLAGS" : ["-O2"],
"ASM_FLAGS" : ["-S", "-fverbose-asm"],
"FLAG_DB" : [["?DEBUG_FLAG", "-g"],
["?C99_STANDARD", "-std=c99"]]
}
Explanation of each field of the configuration file is described in How to Create New Configuration Files.
In the following, the configuration file containing this definition is assumed to be named gcc-config.json.
To investigate the quality of the defined configuration, it is necessary to compile and execute test programs and collect data using the configuration file.
All test programs exist under directory ${INSTALL_DIRECTORY}/hcqc/test-program.
In the following description, the test program sample is used as an example.
To investigate the quality of the compiler, it is necessary to specify a metric, which indicates the type of data to collect.
Here, the metric kind is used as an example. It takes statistics on the kinds of mnemonics in the function that contains the hot spot.
To collect data for the metric kind from the test program sample using the configuration file gcc-config.json, execute the following command:
% cd ${INSTALL_DIRECTORY}/hcqc
% ./command/hcqc gcc-config sample kind
By executing this command, the following processing is executed.
(1) Using the compiler and compile options specified in the configuration file, compile and execute the test program sample and confirm that the execution result is correct.
All the files generated by this work are placed under the following directory:
${INSTALL_DIRECTORY}/hcqc/work/sample/gcc-config/kind
(2) From the assembly code generated by compiling the kernel part of the test program sample,
a control flow graph of the kernel part is created.
Based on the control flow graph, the result file gcc-config--sample--kind.json is created. It contains the mnemonic statistics for the metric kind.
The content of the result file gcc-config--sample--kind.json is, for example, as follows:
[
[ "TITLE", ["CFG", "SIZE", "DEPTH", "memory", "branch", "other"]],
[ "kernel cond .L1", [ "2", "0", "0", "1", "1"]],
[ " ", [ "7", "0", "0", "0", "7"]],
[ ".L5 ", [ "1", "1", "0", "0", "1"]],
[ ".L4 ", [ "4", "2", "1", "0", "3"]],
[ ".L3 cond .L3", [ "8", "3", "3", "1", "4"]],
[ " cond .L4", [ "3", "2", "0", "1", "2"]],
[ " cond .L5", [ "6", "1", "0", "1", "5"]],
[ ".L1 end", [ "1", "0", "0", "1", "0"]],
[ "*SUMMARY*", [ "32", "-", "4", "5", "23"]]]
All the result files generated using the test program sample are placed under the following directory:
${INSTALL_DIRECTORY}/hcqc/result/sample
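As a quick sanity check on a result file, the *SUMMARY* row can be recomputed from the per-block rows. The following is a minimal sketch, assuming the file has exactly the layout of the example result file shown above. The helper name recompute_summary is hypothetical and is not part of HCQC.

```python
import json

# Example result data copied from the result file shown above.
# In practice this text would be read from the .json result file.
RESULT_TEXT = """
[
 ["TITLE", ["CFG", "SIZE", "DEPTH", "memory", "branch", "other"]],
 ["kernel cond .L1", ["2", "0", "0", "1", "1"]],
 [" ", ["7", "0", "0", "0", "7"]],
 [".L5 ", ["1", "1", "0", "0", "1"]],
 [".L4 ", ["4", "2", "1", "0", "3"]],
 [".L3 cond .L3", ["8", "3", "3", "1", "4"]],
 [" cond .L4", ["3", "2", "0", "1", "2"]],
 [" cond .L5", ["6", "1", "0", "1", "5"]],
 [".L1 end", ["1", "0", "0", "1", "0"]],
 ["*SUMMARY*", ["32", "-", "4", "5", "23"]]
]
"""


def recompute_summary(table):
    """Sum each column over the basic-block rows.

    The first row (titles) and the last row (*SUMMARY*) are skipped.
    Columns whose summary is "-" (DEPTH in the example) are not summed.
    """
    summary = table[-1][1]
    block_rows = [values for _label, values in table[1:-1]]
    totals = []
    for index, expected in enumerate(summary):
        if expected == "-":
            totals.append("-")
        else:
            totals.append(str(sum(int(row[index]) for row in block_rows)))
    return totals


table = json.loads(RESULT_TEXT)
totals = recompute_summary(table)
print(totals, totals == table[-1][1])
```

Note that the title row lists six names because CFG names the label of each row, while every data row carries only the five values that follow the label.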
The execution result of command ./command/hcqc is data in JSON format.
To make them easier to see, you can use the command ./command/hcqc-report.
For example, the following command:
% ./command/hcqc-report gcc-config sample R0 kind
creates a report file gcc-config--sample--R0.csv with CSV format in the following directory:
${INSTALL_DIRECTORY}/hcqc/report/sample
The contents of the report file are as follows, for example:
CFG,SIZE,DEPTH,memory,branch,other
kernel cond .L1,2,0,0,1,1
,7,0,0,0,7
.L5 ,1,1,0,0,1
.L4 ,4,2,1,0,3
.L3 cond .L3,8,3,3,1,4
cond .L4,3,2,0,1,2
cond .L5,6,1,0,1,5
.L1 end,1,0,0,1,0
*SUMMARY*,32,-,4,5,23
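The relation between the JSON result and the CSV report can be sketched as follows. This is only an illustration of the conversion, assuming the table layout shown above. It is not the actual implementation of hcqc-report, and the function name result_table_to_csv is hypothetical.

```python
import csv
import io


def result_table_to_csv(table):
    """Convert a result table (list of [label, values] rows) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    _title, column_names = table[0]
    writer.writerow(column_names)  # CFG,SIZE,DEPTH,...
    for label, values in table[1:]:
        # The row label becomes the CFG column of the CSV row.
        writer.writerow([label] + list(values))
    return buffer.getvalue()


# Two rows taken from the example result file above.
table = [
    ["TITLE", ["CFG", "SIZE", "DEPTH", "memory", "branch", "other"]],
    ["kernel cond .L1", ["2", "0", "0", "1", "1"]],
    [".L1 end", ["1", "0", "0", "1", "0"]],
]
csv_text = result_table_to_csv(table)
print(csv_text, end="")
```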
The following command deletes all generated files under the work and report directories, but keeps the result data files under the result directory.
% cd ${INSTALL_DIRECTORY}/hcqc
% ./clean-all.sh
To delete all the generated files, including the result data files under the result directory,
execute the following command:
% cd ${INSTALL_DIRECTORY}/hcqc
% ./realclean-all.sh
HCQC acquires data on a test program for the specified configuration file by the following procedure. When using the configuration file
${INSTALL_DIRECTORY}/hcqc/config/CONFIG.json
the test program
${INSTALL_DIRECTORY}/hcqc/test-program/sample
and the metric program kind, HCQC is executed as follows:
% cd ${INSTALL_DIRECTORY}/hcqc
% ./command/hcqc CONFIG sample kind
The workflow executing this command is as follows:
First, HCQC opens the configuration file
${INSTALL_DIRECTORY}/hcqc/config/CONFIG.json
and reads each field of the JSON format file. The configuration file defines the compiler to be investigated and optimization options, and includes, for example, the following contents:
{
"DISTRIBUTION" : "OpenSUSE Tumbleweed",
"ARCH" : "aarch64",
"CPU" : "AMD Opteron A1100 Cortex A57",
"LANGUAGE" : "C",
"COMPILER" : "GCC",
"COMMAND" : "/usr/bin/gcc",
"VERSION" : "7.1.1",
"OPT_FLAGS" : ["-O2"],
"ASM_FLAGS" : ["-S", "-fverbose-asm"],
"FLAG_DB" : [["?DEBUG_FLAG", "-g"],
["?C99_STANDARD", "-std=c99"]]
}
The meaning of each field of the configuration file is as follows:
- DISTRIBUTION: distribution of OS (Currently unused)
- ARCH: the machine hardware name (Must match uname -m)
- CPU: the CPU name (Currently unused)
- LANGUAGE: target programming language (Must match LANGUAGE of test programs)
- COMPILER: the name of compiler
- COMMAND: full path name of the compiler command
- VERSION: the compiler version number (Must match --version result)
- OPT_FLAGS: compiler options to be investigated
- ASM_FLAGS: compiler options for generating assembly code
- FLAG_DB: definition of flag variables used in program information file
From the field information of FLAG_DB, HCQC creates a flag replacement map.
For example, the definition of the field of FLAG_DB:
"FLAG_DB" : [["?DEBUG_FLAG", "-g"],
["?C99_STANDARD", "-std=c99"]]
generates the map:
{ "?DEBUG_FLAG" : "-g",
"?C99_STANDARD" : "-std=c99" }
Next, the program information file
${INSTALL_DIRECTORY}/hcqc/test-program/sample/program-info.json
for the specified test program sample is opened and each field of the JSON format file is read.
The program information file includes information for compiling and executing the test program.
The contents are, for example, as follows:
{
"LANGUAGE" : "C",
"MAIN_FLAGS" : ["?DEBUG_FLAG", "?C99_STANDARD"],
"KERNEL_FLAGS" : ["-DFAST", "?C99_STANDARD"],
"LINK_FLAGS" : ["?C99_STANDARD"],
"LIB_LIST" : ["-lm"],
"MAIN_FILENAME" : "main.c",
"KERNEL_FILENAME" : "kernel.c",
"KERNEL_FUNCTION_NAME" : "kernel",
"INPUT" : [ "STDIN", "in.data" ],
"OUTPUT" : [ "STDOUT", "out.data" ]
}
The meaning of each field is as follows:
- LANGUAGE: programming language describing the test program
- MAIN_FLAGS: compile options for compiling files in the main part
- KERNEL_FLAGS: compile options for compiling files in the kernel part
- LINK_FLAGS: link options for generating executable files
- LIB_LIST: library specification options used to generate executable file
- MAIN_FILENAME: the file name of the main part
- KERNEL_FILENAME: the file name of the kernel part
- KERNEL_FUNCTION_NAME: the name of kernel function
- INPUT: specification of input data for executable file
- OUTPUT: specification of output data for executable file
Using these pieces of information, HCQC compiles and executes the test program, and verifies the result. First, it compiles the main file and generates the object file. At this time, HCQC uses the compiler which is specified in the COMMAND field of the configuration file. That is, the command to compile the file of the main part has the following form:
% COMMAND MAIN_FILENAME MAIN_FLAGS -c -o RESULT_FILENAME
Here, RESULT_FILENAME is the file name obtained by converting the suffix of MAIN_FILENAME into .o.
At this time, the flag variable in MAIN_FLAGS is replaced by using the FLAG_DB information in the configuration file.
The command actually executed is, for example, as follows:
% /usr/bin/gcc ${INSTALL_DIRECTORY}/hcqc/test-program/sample/main.c \
-g -std=c99 -c -o ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/main.o
Similarly, HCQC compiles the file of the kernel part with the following command:
% COMMAND KERNEL_FILENAME KERNEL_FLAGS OPT_FLAGS -c -o RESULT_FILENAME
At this time, the command actually executed is, for example, as follows:
% /usr/bin/gcc ${INSTALL_DIRECTORY}/hcqc/test-program/sample/kernel.c \
-DFAST -std=c99 -O2 -c -o ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.o
HCQC executes the following command to create an executable file from generated object files:
% COMMAND LINK_FLAGS RESULT_FILENAME_LIST -o EXEC_FILENAME LIB_LIST
At this time, the command actually executed is, for example, as follows:
% /usr/bin/gcc -std=c99 ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/main.o \
${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.o \
-o ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/a.out -lm
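The three command templates above, together with the flag replacement map, can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the HCQC implementation: the function names expand_flags and build_commands are hypothetical, and relative directory names stand in for the ${INSTALL_DIRECTORY} paths.

```python
import posixpath


def expand_flags(flags, flag_map):
    """Replace flag variables with their values; other options pass through."""
    return [flag_map.get(flag, flag) for flag in flags]


def build_commands(config, info, src_dir, work_dir):
    """Return the compile (main), compile (kernel), and link command lists."""
    flag_map = dict(config["FLAG_DB"])  # [[name, value], ...] -> {name: value}
    cc = config["COMMAND"]

    def source(name):
        return posixpath.join(src_dir, name)

    def obj(name):
        # RESULT_FILENAME: the source file name with its suffix changed to .o
        return posixpath.join(work_dir, posixpath.splitext(name)[0] + ".o")

    main_obj = obj(info["MAIN_FILENAME"])
    kernel_obj = obj(info["KERNEL_FILENAME"])
    compile_main = ([cc, source(info["MAIN_FILENAME"])]
                    + expand_flags(info["MAIN_FLAGS"], flag_map)
                    + ["-c", "-o", main_obj])
    compile_kernel = ([cc, source(info["KERNEL_FILENAME"])]
                      + expand_flags(info["KERNEL_FLAGS"], flag_map)
                      + config["OPT_FLAGS"]
                      + ["-c", "-o", kernel_obj])
    link = ([cc] + expand_flags(info["LINK_FLAGS"], flag_map)
            + [main_obj, kernel_obj]
            + ["-o", posixpath.join(work_dir, "a.out")]
            + info["LIB_LIST"])
    return compile_main, compile_kernel, link


config = {
    "COMMAND": "/usr/bin/gcc",
    "OPT_FLAGS": ["-O2"],
    "FLAG_DB": [["?DEBUG_FLAG", "-g"], ["?C99_STANDARD", "-std=c99"]],
}
info = {
    "MAIN_FLAGS": ["?DEBUG_FLAG", "?C99_STANDARD"],
    "KERNEL_FLAGS": ["-DFAST", "?C99_STANDARD"],
    "LINK_FLAGS": ["?C99_STANDARD"],
    "LIB_LIST": ["-lm"],
    "MAIN_FILENAME": "main.c",
    "KERNEL_FILENAME": "kernel.c",
}
commands = build_commands(config, info,
                          "test-program/sample", "work/sample/CONFIG/kind")
for command in commands:
    print(" ".join(command))
```

Building each command as a list of arguments rather than a single string avoids shell quoting problems if the list is later passed to a process-launching function.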
Next, HCQC executes the created executable file with the specified input data and verifies the result. In the sample program information file, the specification of the input data is as follows:
"INPUT" : [ "STDIN", "in.data" ]
This description means that HCQC executes the program by inputting the data of the file in.data to the standard input.
In the sample program information file, the specification of the output data is as follows:
"OUTPUT" : [ "STDOUT", "out.data" ]
This description means that HCQC verifies the execution result by comparing the result of the standard output with the content of the file out.data.
At this time, the command actually executed is, for example, as follows:
% ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/a.out \
< in.data \
> ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/out.data
% /usr/bin/diff \
${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/out.data \
${INSTALL_DIRECTORY}/hcqc/test-program/sample/out.data
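The run-and-compare step for the STDIN/STDOUT case can be sketched as follows. This is a minimal sketch, not the HCQC implementation: it compares the output in Python instead of calling /usr/bin/diff, it additionally assumes that a nonzero exit code counts as a failure, and it uses a small Python one-liner as a stand-in for a.out so that the example runs anywhere. The function name run_and_verify is hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_verify(exec_command, input_path, expected_path):
    """Run exec_command with input_path on stdin and compare its stdout
    with the content of expected_path."""
    with open(input_path, "rb") as stdin_file:
        completed = subprocess.run(exec_command, stdin=stdin_file,
                                   stdout=subprocess.PIPE, check=False)
    expected = Path(expected_path).read_bytes()
    # Assumption: a nonzero exit code is also treated as a failure.
    return completed.returncode == 0 and completed.stdout == expected


# Stand-in for a.out: a tiny program that upper-cases its standard input.
fake_exec = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]

with tempfile.TemporaryDirectory() as tmp:
    in_data = Path(tmp, "in.data")
    out_data = Path(tmp, "out.data")
    in_data.write_bytes(b"abc\n")
    out_data.write_bytes(b"ABC\n")
    ok = run_and_verify(fake_exec, in_data, out_data)
    out_data.write_bytes(b"xyz\n")  # deliberately wrong expected output
    mismatch = run_and_verify(fake_exec, in_data, out_data)

print(ok, mismatch)
```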
After confirming that the execution result is correct, HCQC executes the metric program and obtains data related to the test program. For this purpose, HCQC generates an assembly code file of the kernel part of the test program. To do this, HCQC executes the following command, which is a modified version of the command that compiles the kernel part:
% COMMAND KERNEL_FILENAME KERNEL_FLAGS OPT_FLAGS ASM_FLAGS -o ASM_FILENAME
At this time, the command actually executed is, for example, as follows:
% /usr/bin/gcc ${INSTALL_DIRECTORY}/hcqc/test-program/sample/kernel.c \
-DFAST -std=c99 -O2 -S -fverbose-asm \
-o ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.s
When execution of the test program is confirmed, HCQC executes the metric program. When the metric program name is M, HCQC executes the set of scripts
${INSTALL_DIRECTORY}/hcqc/command/metric/M/M000.py
${INSTALL_DIRECTORY}/hcqc/command/metric/M/M001.py
....
${INSTALL_DIRECTORY}/hcqc/command/metric/M/M999.py
under the directory
${INSTALL_DIRECTORY}/hcqc/command/metric/M
in order.
Using the field information read from the configuration file, HCQC evaluates the match_p method of each script in turn and executes the first script whose match_p returns True.
If match_p returns False, HCQC skips that script and tries the next one.
The method of acquiring metric information from the target test program differs depending on the target architecture and compiler. Therefore, it is necessary to select a script suitable for the specified configuration file and test program.
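The selection rule can be sketched as a first-match search over the scripts in name order. The following is a minimal sketch with stand-in classes. The Config class and its arch and compiler attributes are assumptions made for illustration and may differ from the real config.Config, and the three worker classes are hypothetical.

```python
class Config:
    """Stand-in for config.Config; only the fields used in this sketch."""

    def __init__(self, arch, compiler):
        self.arch = arch
        self.compiler = compiler


class LlvmWorker:
    def match_p(self, target_config, test_name):
        return target_config.compiler == "ClangLLVM"


class GccWorker:
    def match_p(self, target_config, test_name):
        return target_config.compiler == "GCC"


class GenericWorker:
    """Fallback that accepts every configuration (like a script numbered 999)."""

    def match_p(self, target_config, test_name):
        return True


def select_worker(workers, target_config, test_name):
    """Return the first worker whose match_p is True, or None."""
    for worker in workers:
        if worker.match_p(target_config, test_name):
            return worker
    return None


# Workers listed in script-name order, e.g. M000, M001, M999.
workers = [LlvmWorker(), GccWorker(), GenericWorker()]
selected = select_worker(workers, Config("aarch64", "GCC"), "sample")
print(type(selected).__name__)
```

Placing the most specific scripts at low numbers and a general fallback at 999 means a new architecture or compiler can be handled by adding one script without editing the existing ones.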
In the example command
% ./command/hcqc CONFIG sample kind
the metric program name is kind.
Therefore, HCQC executes the scripts that exist under the directory
${INSTALL_DIRECTORY}/hcqc/command/metric/kind
This directory contains only the following script file:
${INSTALL_DIRECTORY}/hcqc/command/metric/kind/kind999.py
so HCQC executes this script.
Generally, each metric program performs the following operations.
(1) Create a control flow graph from the generated assembly code file.
In this example, a control flow graph is created from the file
${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.s
and the file
${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.s.dot
which represents the control flow graph is created.
This file contains information for Graphviz dot command.
In this example, the control flow graph is as follows:
(Figure: control flow graph of the kernel function)
The information of this control flow graph corresponds to the following part of the result information file.
[ "TITLE", ["CFG", "SIZE", "DEPTH",
[ "kernel cond .L1", [ "2", "0",
[ " ", [ "7", "0",
[ ".L5 ", [ "1", "1",
[ ".L4 ", [ "4", "2",
[ ".L3 cond .L3", [ "8", "3",
[ " cond .L4", [ "3", "2",
[ " cond .L5", [ "6", "1",
[ ".L1 end", [ "1", "0",
[ "*SUMMARY*", [ "32", "-",
It corresponds to the first three columns (CFG, SIZE, and DEPTH) of the CSV result report file.
CFG,SIZE,DEPTH,
kernel cond .L1,2,0,
,7,0,
.L5 ,1,1,
.L4 ,4,2,
.L3 cond .L3,8,3,
cond .L4,3,2,
cond .L5,6,1,
.L1 end,1,0,
*SUMMARY*,32,-,
Here, the column of CFG represents the control flow graph and each row of the CFG column corresponds to each basic block of the control flow graph.
The column of SIZE represents the number of instructions in each basic block and the column of DEPTH represents the nesting depth of the loop that contains the basic block.
A depth of 0 means that the basic block is outside all loops, and a depth of 999 means that the basic block cannot reach the exit of the function because it enters an infinite loop.
In the CFG column, cond means that the basic block ends with a conditional branch instruction having a fallthrough, and goto means that the basic block ends with an unconditional branch instruction.
Also, end means the end of the function.
(2) Information unique to the metric program is collected and output as information of additional columns to the result file.
For example, the metric program kind classifies the mnemonics contained in each basic block of the control flow graph into memory access instructions (memory), branch instructions (branch), and others (other).
In this example, kind generates the following part of the result information file.
"memory", "branch", "other"]],
"0", "1", "1"]],
"0", "0", "7"]],
"0", "0", "1"]],
"1", "0", "3"]],
"3", "1", "4"]],
"0", "1", "2"]],
"0", "1", "5"]],
"0", "1", "0"]],
"4", "5", "23"]]]
This information corresponds to the following part of the result report file.
memory,branch,other
0,1,1
0,0,7
0,0,1
1,0,3
3,1,4
0,1,2
0,1,5
0,1,0
4,5,23
Information gathered by the metric program is saved in the result file under the directory
${INSTALL_DIRECTORY}/hcqc/result/sample
if the test program name is sample.
In that case, the file name becomes:
"configuration file name"--"test program name"--"metric program name".json
In this example, the file name is as follows:
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json
The command clean-all.sh does not delete the file under the directory result.
Therefore, when HCQC is executed for the second time or later with the same configuration file, test program, and metric program:
% ./command/hcqc CONFIG sample kind
the result is first written to a file with the suffix .new:
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json.new
HCQC then compares the contents of the following two files:
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json.new
If the contents of these two files are the same, HCQC deletes the new file with the suffix .new.
If they differ, HCQC renames the old file by appending the current time to its name, for example
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json--2017-10-12-13-26-13
and removes the suffix .new from the name of the new file.
As a result, the following two files remain:
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json--2017-10-12-13-26-13
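The handling of the .new file can be sketched as follows. This is a minimal sketch under the assumption that both the old result file and the .new file already exist. The function name commit_new_result is hypothetical, and the time is passed in explicitly so that the example is reproducible.

```python
import filecmp
import os
import tempfile
import time


def commit_new_result(result_path, now):
    """Compare result_path with result_path + '.new' and keep the right files."""
    new_path = result_path + ".new"
    if filecmp.cmp(result_path, new_path, shallow=False):
        os.remove(new_path)  # identical: discard the new file
        return "unchanged"
    stamp = time.strftime("%Y-%m-%d-%H-%M-%S", now)
    os.rename(result_path, result_path + "--" + stamp)  # keep the old data
    os.rename(new_path, result_path)                    # drop the .new suffix
    return "replaced"


def write(path, text):
    with open(path, "w") as stream:
        stream.write(text)


with tempfile.TemporaryDirectory() as tmp:
    result = os.path.join(tmp, "CONFIG--sample--kind.json")
    now = time.strptime("2017-10-12-13-26-13", "%Y-%m-%d-%H-%M-%S")

    write(result, "[1]")
    write(result + ".new", "[1]")
    first = commit_new_result(result, now)   # same content

    write(result + ".new", "[2]")
    second = commit_new_result(result, now)  # different content

    remaining = sorted(os.listdir(tmp))

print(first, second, remaining)
```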
All result files of the hcqc command are data in JSON format.
The command ./command/hcqc-report converts the data of these results into a table in CSV format.
For example, the command
% ./command/hcqc-report CONFIG TEST_PROGRAM_NAME REPORT_NAME METRIC_PROGRAM_NAME
finds the result data file
${INSTALL_DIRECTORY}/hcqc/result/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--METRIC_PROGRAM_NAME.json
and generates the report file
${INSTALL_DIRECTORY}/hcqc/report/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--REPORT_NAME.csv
The command ./command/hcqc-report can combine multiple tables.
For example, the command
% ./command/hcqc-report CONFIG TEST_PROGRAM_NAME REPORT_NAME M1 M2 ... Mn
can combine tables for the following data files in the order specified:
${INSTALL_DIRECTORY}/hcqc/result/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--M1.json
${INSTALL_DIRECTORY}/hcqc/result/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--M2.json
...
${INSTALL_DIRECTORY}/hcqc/result/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--Mn.json
This is possible because the CFG, SIZE, and DEPTH columns in the result table are identical
when the result information is generated from the same configuration file and test program.
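Combining tables that share the CFG, SIZE, and DEPTH columns can be sketched as follows. This is only an illustration, not the hcqc-report implementation. The function name combine_tables is hypothetical, and the second table (demo_table) contains made-up values that serve only to show the merge.

```python
def combine_tables(tables):
    """Append the metric-specific columns of each table to the first one."""
    combined = [[label, list(values)] for label, values in tables[0]]
    for table in tables[1:]:
        if len(table) != len(combined):
            raise ValueError("tables have a different number of rows")
        for index, (label, values) in enumerate(table):
            # The title row shares three names (CFG, SIZE, DEPTH). Data rows
            # share two values (SIZE, DEPTH) because the CFG label is stored
            # separately as the row label.
            shared = 3 if index == 0 else 2
            base_label, base_values = combined[index]
            if label != base_label or values[:shared] != base_values[:shared]:
                raise ValueError("CFG, SIZE, or DEPTH differ in row %d" % index)
            base_values.extend(values[shared:])
    return combined


kind_table = [
    ["TITLE", ["CFG", "SIZE", "DEPTH", "memory", "branch", "other"]],
    ["kernel cond .L1", ["2", "0", "0", "1", "1"]],
]
# Made-up second table, only to show the merge; not real HCQC output.
demo_table = [
    ["TITLE", ["CFG", "SIZE", "DEPTH", "demo"]],
    ["kernel cond .L1", ["2", "0", "9"]],
]
merged = combine_tables([kind_table, demo_table])
for row in merged:
    print(row)
```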
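The metric programs in HCQC are a set of programs that investigate the quality of code generated by the compiler according to various metrics.
The following table shows, for each architecture and compiler, whether each metric program is supported (Y) or not (N).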
| arch | compiler | op | kind | regalloc | height | ilp | swpl | vectorize |
|---|---|---|---|---|---|---|---|---|
| AArch64 | GCC | Y | Y | Y | Y | N | N | N |
| AArch64 | Clang/LLVM | Y | Y | Y | Y | N | N | N |
| x86_64 | GCC | Y | N | N | N | N | N | N |
| x86_64 | Clang/LLVM | Y | N | Y | N | N | N | N |
| x86_64 | ICC | Y | N | N | N | N | N | N |
The metric program op investigates the number of mnemonics of the instructions used in the assembly code of the kernel function.
The number of mnemonics is summarized for each basic block of the control flow graph.
The metric program kind investigates the kind of mnemonic of the instruction used in the assembly code of the kernel function.
The number of mnemonic kinds is summarized for each basic block of the control flow graph.
The kinds of mnemonics handled by the metric program kind are as follows:
- memory: memory access instructions
- branch: branch instructions and any control transfer instructions
- other: remaining instructions not included in the above two kinds
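The classification can be sketched as a lookup on the mnemonic. The following is a minimal sketch for AArch64. The two mnemonic sets are small illustrative subsets chosen for this example and are an assumption: the classification in the actual kind script may cover more instructions or draw the boundaries differently.

```python
# Illustrative subsets of AArch64 mnemonics (assumption, not HCQC's tables).
MEMORY_MNEMONICS = {"ldr", "ldrb", "ldrh", "ldp", "ldur",
                    "str", "strb", "strh", "stp", "stur"}
BRANCH_MNEMONICS = {"b", "bl", "br", "blr", "ret",
                    "cbz", "cbnz", "tbz", "tbnz"}


def classify(mnemonic):
    """Return 'memory', 'branch', or 'other' for one mnemonic."""
    name = mnemonic.lower()
    if name in MEMORY_MNEMONICS:
        return "memory"
    # Conditional branches are written as b.eq, b.ne, and so on.
    if name in BRANCH_MNEMONICS or name.startswith("b."):
        return "branch"
    return "other"


def count_kinds(mnemonics):
    """Return the [memory, branch, other] counts of one basic block as strings."""
    counts = {"memory": 0, "branch": 0, "other": 0}
    for mnemonic in mnemonics:
        counts[classify(mnemonic)] += 1
    return [str(counts[kind]) for kind in ("memory", "branch", "other")]


block = ["ldr", "add", "cmp", "b.ne"]
row = count_kinds(block)
print(row)
```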
The metric program regalloc examines the quality of register allocation by the compiler.
It counts the number of spill codes which exist in the assembly code of the kernel function.
The number of spill codes is summarized for each basic block of the control flow graph.
The metric program regalloc counts the spill code by dividing it into the following two types.
- spill out: a set of store instructions to save the value of the register to the spill area prepared by the compiler
- spill in: a set of load instructions to restore the value of the register from the spill area prepared by the compiler
The quality of register allocation is determined by the number of spill codes generated by the compiler.
However, the exact definition of spill code may differ depending on the compiler.
For example, some compilers may consider saving and restoring the values of registers generated before and after function calls as spill codes.
Therefore, when the compilers differ, the quality of register allocation cannot be judged simply by comparing the number of spill codes.
Note that the location of the spill code is also important.
Spill code in the innermost loop degrades performance more than spill code outside the loop.
The result of regalloc should be judged with these considerations taken into account.
The metric program regalloc detects spill out and spill in instructions from the resulting assembly code.
The details of this process vary depending on the compiler.
- LLVM
  If the compiler is Clang/LLVM, the instruction with the following comment in the assembly code file is the spill code.
      // ???-byte Spill
      // ???-byte Folded Spill
      // ???-byte Reload
      // ???-byte Folded Reload
  The following script
      ${INSTALL_DIRECTORY}/hcqc/command/metric/regalloc/regalloc000.py
  implements the process of detecting and counting them.
- GCC
  If the compiler is GCC, the instruction with the following comment in the assembly code file handles the address of the spill area prepared by the compiler.
      /// ... %sfp ...
  These comments can be generated by creating an assembly code file by attaching option -fverbose-asm to GCC. Therefore, if the compiler is GCC, you can regard memory access instructions with these comments as spill codes.
  The following script
      ${INSTALL_DIRECTORY}/hcqc/command/metric/regalloc/regalloc001.py
  implements the process of detecting and counting them.
(Note): LLVM considers load instructions or store instructions generated at the entry and the exit of a function as spill codes. However, GCC does not regard these instructions as spill codes. Therefore, when comparing the number of spill codes at the entry and the exit of a function, it is necessary to consider these differences.
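For the Clang/LLVM case, counting spill codes from the comments can be sketched with a regular expression. This is a minimal sketch, not the code of regalloc000.py. It assumes that ??? in the comments above stands for a decimal byte count, and the function name count_spills and the sample assembly lines are illustrative.

```python
import re

# Matches "// N-byte Spill", "// N-byte Folded Spill",
# "// N-byte Reload", and "// N-byte Folded Reload".
SPILL_COMMENT = re.compile(r"//\s*\d+-byte (?:Folded )?(Spill|Reload)\b")


def count_spills(asm_lines):
    """Return (spill_out, spill_in) for a list of assembly lines."""
    spill_out = 0
    spill_in = 0
    for line in asm_lines:
        match = SPILL_COMMENT.search(line)
        if match is None:
            continue
        if match.group(1) == "Spill":
            spill_out += 1  # store to the spill area
        else:
            spill_in += 1   # load from the spill area
    return spill_out, spill_in


# Illustrative lines in the style of Clang/LLVM AArch64 output.
asm_lines = [
    "\tstp\tx29, x30, [sp, #16]     // 16-byte Folded Spill",
    "\tstr\tx8, [sp, #8]            // 8-byte Folded Spill",
    "\tadd\tx0, x0, x1",
    "\tldr\tx8, [sp, #8]            // 8-byte Folded Reload",
]
spill_counts = count_spills(asm_lines)
print(spill_counts)
```

In the real metric program the counts are kept per basic block, so that spill code inside the innermost loop can be told apart from spill code at the function entry or exit.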
The metric program height examines the height of the data dependence graph of the instruction in each basic block.
By referring to these values, it is possible to detect cases where the register allocation pass reuses the same register within a narrow range and thereby lowers the instruction-level parallelism.
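The effect of register reuse on the height can be illustrated with a small sketch. It is not the implementation of the metric program height. It assumes unit latency for every instruction, considers register dependences only (read-after-write, write-after-read, and write-after-write), and represents each instruction as a hypothetical pair of defined and used register names.

```python
def dependence_height(instructions):
    """Return the length of the longest dependence chain in a basic block.

    instructions is a list of (defs, uses) pairs in program order, where
    defs and uses are lists of register names.
    """
    height = []  # height[i]: longest chain that ends at instruction i
    for i, (defs_i, uses_i) in enumerate(instructions):
        longest = 1
        for j in range(i):
            defs_j, uses_j = instructions[j]
            raw = set(defs_j) & set(uses_i)  # i reads what j wrote
            war = set(uses_j) & set(defs_i)  # i overwrites what j read
            waw = set(defs_j) & set(defs_i)  # i overwrites what j wrote
            if raw or war or waw:
                longest = max(longest, height[j] + 1)
        height.append(longest)
    return max(height, default=0)


# Two independent load/add pairs that use distinct registers.
distinct_registers = [
    (["x1"], ["x0"]),
    (["x2"], ["x1"]),
    (["x3"], ["x0"]),
    (["x4"], ["x3"]),
]
# The same computation after the second pair reuses x1 and x2.
reused_registers = [
    (["x1"], ["x0"]),
    (["x2"], ["x1"]),
    (["x1"], ["x0"]),
    (["x2"], ["x1"]),
]
height_distinct = dependence_height(distinct_registers)
height_reused = dependence_height(reused_registers)
print(height_distinct, height_reused)
```

With distinct registers the two pairs can execute in parallel, while reusing x1 and x2 chains all four instructions together, which is the kind of problem this metric is meant to reveal.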
The metric program ilp is for investigating the quality of instruction scheduling by the compiler. This program has not been implemented yet.
The metric program swpl is for investigating the quality of software pipelining by the compiler. This program has not been implemented yet.
The metric program vectorize is for investigating the quality of vectorization or SIMDization by the compiler. This program has not been implemented yet.
In the following, it is assumed that the name of the newly created configuration file is NEWCONFIG.
In this case, it is necessary to add a new file
${INSTALL_DIRECTORY}/hcqc/config/NEWCONFIG.json
under the directory
${INSTALL_DIRECTORY}/hcqc/config
The file NEWCONFIG.json needs to define the following fields:
- DISTRIBUTION
  This field defines the name of the OS distribution. This field is not currently used.
- ARCH
  This field defines the machine hardware name. HCQC checks whether this definition matches the result of uname -m.
- CPU
  This field defines the name of the CPU. This field is not currently used.
- LANGUAGE
  This field defines the programming language which is targeted by the compiler COMPILER. HCQC checks whether this definition matches the LANGUAGE field in the program-info.json for test programs.
- COMPILER
  This field defines the name of the compiler to be investigated. The string is used for selecting the Python class in hcqc/command/config.py. Currently, only GCC and ClangLLVM are supported. To introduce a new compiler name, you need to add definitions for it in the script file hcqc/command/config.py (see How to Add New Architectures or Compilers).
- COMMAND
  This field defines the full path name of the compiler COMPILER to be investigated.
- VERSION
  This field defines the version number of the compiler COMPILER to be investigated. HCQC checks whether this definition matches the output of COMMAND executed with the --version option.
- OPT_FLAGS
  This field defines the optimization options used with the compiler COMPILER to be investigated. HCQC regards a pair of a compiler and the optimization options used with that compiler as the identifier of the investigation target.
- ASM_FLAGS
  This field defines the options for generating assembly code by the compiler COMPILER to be investigated. Because HCQC uses assembly code with detailed information added, the option -S may not be enough.
- FLAG_DB
  This field defines the flag variables used in program information files. Each compiler often has different options for the same feature. Program information files use flag variables in their definitions, and HCQC replaces those flag variables using the definition of the field FLAG_DB before executing the compiler. If the target compiler has no option for a specific feature, you can specify an empty string for the flag variable like the following:
      "FLAG_DB" : [["?COMPILER_RARE_FLAG", ""], ...]
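Before running HCQC with a new configuration file, some of the rules above can be checked with a short script. The following is a minimal sketch and not part of HCQC. The function name check_config is hypothetical, the machine name is passed in as a parameter (for example, the output of uname -m), and the VERSION and LANGUAGE checks are left out because they need the compiler and a test program.

```python
REQUIRED_FIELDS = [
    "DISTRIBUTION", "ARCH", "CPU", "LANGUAGE", "COMPILER",
    "COMMAND", "VERSION", "OPT_FLAGS", "ASM_FLAGS", "FLAG_DB",
]


def check_config(config, machine):
    """Return a list of problems found in a configuration dictionary."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in config:
            problems.append("missing field: " + field)
    if "ARCH" in config and config["ARCH"] != machine:
        problems.append("ARCH %r does not match machine %r"
                        % (config["ARCH"], machine))
    for entry in config.get("FLAG_DB", []):
        if len(entry) != 2:
            problems.append("FLAG_DB entry is not a [name, value] pair: %r"
                            % (entry,))
    return problems


config = {
    "DISTRIBUTION": "OpenSUSE Tumbleweed",
    "ARCH": "aarch64",
    "CPU": "AMD Opteron A1100 Cortex A57",
    "LANGUAGE": "C",
    "COMPILER": "GCC",
    "COMMAND": "/usr/bin/gcc",
    "VERSION": "7.1.1",
    "OPT_FLAGS": ["-O2"],
    "ASM_FLAGS": ["-S", "-fverbose-asm"],
    "FLAG_DB": [["?DEBUG_FLAG", "-g"], ["?C99_STANDARD", "-std=c99"]],
}
problems_ok = check_config(config, "aarch64")
problems_bad = check_config({"ARCH": "aarch64"}, "x86_64")
print(problems_ok)
print(problems_bad)
```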
By executing HCQC with option --v, you can check what kind of commands are actually executed for the processing of the test program using the specific configuration file.
In the following, it is assumed that the name of the newly added test program is newtest.
In this case, it is necessary to create a new directory
${INSTALL_DIRECTORY}/hcqc/test-program/newtest
under the directory
${INSTALL_DIRECTORY}/hcqc/test-program
This new directory should have at least the following three files:
- program-info.json
  This file defines how to compile, execute, and check the result of the test program. See below for details of this definition.
- a source file of kernel part of the test program
  In order to prevent the function from disappearing due to interprocedural optimizations performed in the file, the kernel part and the main part of the test program should be divided into different files.
- a source file of main part of the test program
  The main part of the test program should execute the kernel function in the kernel file and verify the result. If there is no output result in the program, the main part should report the result status with an exit code.
Both the compiler command and the generated executable file are executed in the directory
${INSTALL_DIRECTORY}/hcqc/test-program/newtest
Therefore, header files and other files in this directory can be referenced from the test program by relative paths.
The file program-info.json needs to define the following fields:
- LANGUAGE
  This field defines the programming language in which the test program is written. HCQC checks whether this definition matches the LANGUAGE field in the configuration file.
- MAIN_FLAGS
  This field defines the options for compiling the main file by the compiler to be investigated.
- KERNEL_FLAGS
  This field defines the options for compiling the kernel file by the compiler to be investigated. This field should not contain optimization options which should be included in the configuration file.
- LINK_FLAGS
  This field defines the options for building the executable file by the compiler to be investigated.
- LIB_LIST
  This field defines the library options for building the executable file by the compiler to be investigated.
- MAIN_FILENAME
  This field defines the file name of the main part.
- KERNEL_FILENAME
  This field defines the file name of the kernel part.
- KERNEL_FUNCTION_NAME
  This field defines the kernel function name in the kernel file. This kernel function should be called from the main file. There is no problem if other functions in the kernel file are removed by some optimizations.
- INPUT
  This field defines how to handle the input data for the generated executable file.
  - [ "STDIN", INPUT_FILENAME ]
    This description means the generated executable file is executed by inputting the data of the file INPUT_FILENAME to the standard input.
  - [ "FILE", INPUT_FILENAME ]
    This description means the generated executable file is executed by specifying the input file name INPUT_FILENAME on the command line.
  - [ "NONE", "NONE" ]
    This description means the generated executable file doesn't use input data for its execution.
- OUTPUT
  This field defines how to handle the output data for the generated executable file.
  - [ "STDOUT", OUTPUT_FILENAME ]
    This description means the execution of the generated executable file outputs the result data into the standard output. HCQC verifies the output by comparing it with the content of the file OUTPUT_FILENAME.
  - [ "FILE", OUTPUT_FILENAME ]
    This description means the execution of the generated executable file outputs the result data into the output file OUTPUT_FILENAME. HCQC verifies the output by comparing it with the content of the answer file OUTPUT_FILENAME.
  - [ "NONE", "NONE" ]
    This description means the generated executable file doesn't generate output for its execution. The execution result is verified only with the exit code.
If the test program uses the INPUT_FILENAME file or the OUTPUT_FILENAME file, they should be placed under the test program directory, for example:
${INSTALL_DIRECTORY}/hcqc/test-program/newtest
When defining compiler options in the above fields, the following rules should be followed:
- Those fields should not contain optimization options which should be included in the configuration file.
- Those fields should contain only compiler independent options. If compiler-dependent options are required, flag variables should be used.
By executing HCQC with option --v, you can check what kind of commands are actually executed for the processing of the test program.
In the following, it is assumed that the name of the newly created metric program is NEWMETRIC.
In this case, it is necessary to create a new directory
${INSTALL_DIRECTORY}/hcqc/command/metric/NEWMETRIC
under the directory
${INSTALL_DIRECTORY}/hcqc/command/metric
A new Python script file
${INSTALL_DIRECTORY}/hcqc/command/metric/NEWMETRIC/NEWMETRIC000.py
needs to be added under
${INSTALL_DIRECTORY}/hcqc/command/metric/NEWMETRIC
The file NEWMETRIC000.py needs to define a new class which is a subclass of the class driver.MetricWorker.
The definition of the class driver.MetricWorker exists in the following file:
${INSTALL_DIRECTORY}/hcqc/command/driver.py
In the following, let XxxMetricWorker be the new class name.
This new class needs to have the definition of the following methods.
In the following methods, the argument target_config represents an instance of class config.Config which includes information from the configuration file which is the file specified on the command line of command ./command/hcqc.
- match_p(self, target_config, test_name)
  This method decides whether or not to use this script NEWMETRIC000.py for the specified configuration target_config or the test program name test_name. If the result of this method is True, HCQC uses this script file. If the result is False, HCQC tries to use another script file.
- set_up_before_getting_data(self, target_config, bb_list)
  This method defines the operation to be performed before the metric program returns the result data. bb_list represents a list of basic blocks of the control flow graph. Each element of bb_list is an instance of the class cfg.BasicBlock in the file
      ${INSTALL_DIRECTORY}/hcqc/command/cfg.py
  Since the object of each basic block holds the information of the kernel part of the test program, it can be used for data analysis and collection.
- get_column_name_list(self)
  This method returns a list of column names in the table of the metric program's result. The implementation of this method depends on the metric program. For example, this method of the metric program kind returns the following result.
      [ 'memory', 'branch', 'other']
  This method in the metric program op returns a list of the mnemonics of the instructions contained in the kernel function in the test program. Therefore, the metric program op needs to create this list. The method set_up_before_getting_data in the metric program op implements the work.
- get_data_list(self, target_config, bb)
  The row of the result table of the metric program corresponds to each basic block of the control flow graph. This method returns a list of data corresponding to the basic block bb. The length of the resulting data list must be the same as the length of the list of column names returned by the method get_column_name_list.
- get_summary_list(self, target_config)
  The last row of the result table of the metric program represents summary data for each column. This method returns a list of summary data in the last row. The length of the resulting data list must be the same as the length of the list of column names returned by the method get_column_name_list.
Each data element returned by the methods that return result data needs to be a string.
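The following is a minimal sketch of such a class, not code taken from HCQC. So that it runs on its own, it defines stand-ins for driver.MetricWorker and cfg.BasicBlock. The attribute ops on the basic block and the metric itself (an instruction count per basic block) are assumptions made only for illustration. The real cfg.BasicBlock may expose its contents differently.

```python
class MetricWorker:
    """Stand-in for driver.MetricWorker so that this sketch runs on its own."""


class BasicBlock:
    """Stand-in for cfg.BasicBlock; the attribute `ops` is hypothetical."""

    def __init__(self, ops):
        self.ops = ops  # mnemonics of the instructions in this block


class XxxMetricWorker(MetricWorker):
    """Illustrative metric: number of instructions per basic block."""

    def match_p(self, target_config, test_name):
        # This sketch accepts every configuration and test program.
        return True

    def set_up_before_getting_data(self, target_config, bb_list):
        # Precompute what the summary row needs.
        self.total = sum(len(bb.ops) for bb in bb_list)

    def get_column_name_list(self):
        return ['instructions']

    def get_data_list(self, target_config, bb):
        # One row per basic block; every element is a string.
        return [str(len(bb.ops))]

    def get_summary_list(self, target_config):
        # Last row of the table; same length as the column name list.
        return [str(self.total)]


worker = XxxMetricWorker()
blocks = [BasicBlock(['ldr', 'add', 'str']), BasicBlock(['cmp', 'b.ne'])]
if worker.match_p(None, 'example'):
    worker.set_up_before_getting_data(None, blocks)
    print(worker.get_column_name_list())
    for bb in blocks:
        print(worker.get_data_list(None, bb))
    print(worker.get_summary_list(None))
```

Note that get_data_list and get_summary_list both return lists with exactly as many elements as get_column_name_list, and that the counts are converted to strings before being returned.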
First, HCQC creates a control flow graph of the test program.
Then HCQC tries the scripts in the directory of the specified metric program, one by one, in order of file name.
If the method match_p of one script returns True, HCQC does not execute subsequent scripts.
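The selection rule above can be sketched as follows. This is an illustration under assumptions, not HCQC's implementation: the function select_worker, the two worker classes, the use of a plain dictionary for target_config, and the use of plain lexicographic sorting for "order of name" are all hypothetical. What HCQC does when no script matches is not described here, so the sketch simply returns None in that case.

```python
def select_worker(workers_by_script, target_config, test_name):
    """Return (script_name, worker) for the first worker whose match_p is True."""
    for script_name in sorted(workers_by_script):
        worker = workers_by_script[script_name]
        if worker.match_p(target_config, test_name):
            # Later scripts are not tried once one script matches.
            return script_name, worker
    return None


class OnlyGcc:
    """Hypothetical worker that handles GCC configurations only."""

    def match_p(self, target_config, test_name):
        return target_config.get('COMPILER') == 'GCC'


class Fallback:
    """Hypothetical worker that accepts anything."""

    def match_p(self, target_config, test_name):
        return True


workers = {'NEWMETRIC001.py': Fallback(), 'NEWMETRIC000.py': OnlyGcc()}
print(select_worker(workers, {'COMPILER': 'GCC'}, 'example')[0])
print(select_worker(workers, {'COMPILER': 'ClangLLVM'}, 'example')[0])
```

With this ordering, a specific script can be given a lower number than a general fallback script so that it is tried first.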
The Python script
${INSTALL_DIRECTORY}/hcqc/command/test-metric.py
is a program for testing the execution of any metric program.
For example, you can test NEWMETRIC000.py as follows:
% cd ${INSTALL_DIRECTORY}/hcqc/command
% python3 test-metric.py metric.NEWMETRIC.NEWMETRIC000 \
aarch64 ClangLLVM 4.0.1 /tmp/AsmByClangLLVM.s kernel_f /tmp/RESULT.json
At the moment, the metric program gets the necessary information from the resulting assembly code. If the assembly code alone is not sufficient for a new metric program, the compiler needs to be modified to output the necessary information.
To introduce a new architecture for investigating the quality of compilers, it is necessary to create a class representing the architecture in the Python script file
${INSTALL_DIRECTORY}/hcqc/command/config.py
In the following, it is assumed that the name of the newly added architecture is myarch.
This name is used in the ARCH field of the configuration file and must match the name returned by command uname -m.
This rule prevents a configuration file from being used in the wrong environment.
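A check of this kind can be sketched as follows. This is a minimal sketch, not HCQC's actual code: the function arch_matches_host and its host_machine parameter are hypothetical. It relies on platform.machine() from the Python standard library, which on Linux reports the same machine name as uname -m.

```python
import platform


def arch_matches_host(arch_field, host_machine=None):
    """Return True if the ARCH field equals the machine name of the host."""
    if host_machine is None:
        # On Linux this is the same string that `uname -m` prints.
        host_machine = platform.machine()
    return arch_field == host_machine


# A configuration written for aarch64 is accepted on an aarch64 host only.
print(arch_matches_host('aarch64', host_machine='aarch64'))
print(arch_matches_host('aarch64', host_machine='x86_64'))
```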
A class representing a new architecture needs to be created as a subclass of the Config class.
Also, the class name must be formed by adding C_ at the beginning of the architecture name and adding __ at the end.
For example, if the name of the newly added architecture is myarch, the following class definition is required.
class C_myarch__(Config):
With this rule, HCQC automatically detects its class definition from the information in the ARCH field in the configuration file.
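One way such a lookup could work is sketched below. How HCQC actually performs the detection is not shown in this document, so the function find_arch_class and the use of Config.__subclasses__() are assumptions. Config here is a stand-in for the class in config.py.

```python
class Config:
    """Stand-in for the Config class in config.py."""


class C_myarch__(Config):
    """Class for the hypothetical architecture 'myarch'."""


def find_arch_class(arch):
    """Find the direct subclass of Config named C_<arch>__."""
    class_name = 'C_' + arch + '__'
    for cls in Config.__subclasses__():
        if cls.__name__ == class_name:
            return cls
    raise LookupError('no architecture class named ' + class_name)


# The value of the ARCH field is enough to locate the class.
print(find_arch_class('myarch').__name__)
```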
The class Config defines the following methods.
- function_entry_p(self, name, line)

  This method determines whether line, a line of assembly code, is the entry of the kernel function. name is the name of the kernel function.

- function_exit_p(self, name, line)

  This method determines whether line, a line of assembly code, is the exit of the kernel function. name is the name of the kernel function.

- bb_label(self, line)

  This method determines whether line, a line of assembly code, represents the entry label of a basic block.
For these methods, if another definition is needed for the newly added architecture myarch,
the class C_myarch__ can override those definitions.
- get_asm_comment(self, line)

  If line, a line of assembly code, contains a comment, this method returns the string of the comment. Otherwise, it returns None.

- bb_branch(self, line)

  If line, a line of assembly code, contains a control transfer instruction, this method returns the mnemonic of the instruction and the label of the control transfer destination (if any) as a pair. Otherwise, it returns None.

- call_p(self, branch_op, branch_target)

  This method decides whether the mnemonic branch_op and the control transfer destination label branch_target (if any) represent a function call instruction.

- tail_call_p(self, branch_op, branch_target)

  This method decides whether the mnemonic branch_op and the control transfer destination label branch_target (if any) represent a tail function call instruction.

- branch_by_register_p(self, branch_op, branch_target)

  This method decides whether the mnemonic branch_op and the control transfer destination label branch_target (if any) represent a branch by the value of a register. An instruction which branches by the value of a register is either a [tail] call by a function pointer or a table branch.

- table_branch_p(self, branch_op, branch_target, table_branch_label, line_list)

  This method decides whether the mnemonic branch_op and the control transfer destination branch_target (if any) represent a table branch instruction. table_branch_label represents the label for the table branch (if any) included in the basic block currently being processed, and line_list represents the list of lines in the basic block.

- get_table_branch_prologue_number(self)

  This method returns the number of lines (or states) necessary for detecting tables for table branches from the assembly code.

- trace_table_branch_prologue(self, region_status, line)

  This method represents a process for detecting tables for table branches. It determines whether to transition to the next state of the process when it reaches the line line while the current state number is region_status. When transitioning to the next state, this method returns (True, label) as a result. Here, label represents the head label of the table for table branches that the line line contains. If the line line does not contain the label, label is None. If this method does not transition to the next state, it returns (False, None) as a result.

- get_table_branch_content(self, line)

  If line is an element of the table of a table branch, this method returns the label of the destination which line includes. Otherwise, this method returns None.

- fall_through_p(self, branch_op)

  This method decides whether the control transfer instruction with the mnemonic branch_op falls through to the next basic block.

- op(self, line)

  If line, a line of assembly code, is an instruction, then this method returns its mnemonic. Otherwise, it returns None.

- load_op_p(self, op)

  This method determines whether the mnemonic op is a memory read instruction.

- store_op_p(self, op)

  This method determines whether the mnemonic op is a memory write instruction.

- control_transfer_op_p(self, op)

  This method determines whether the mnemonic op is a control transfer instruction.
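To give a feel for the simpler methods in this list, here is a minimal sketch for the hypothetical architecture myarch. Everything about the assembly syntax is an assumption made for illustration: comments start with //, labels end with a colon, directives start with a dot, the mnemonic is the first token of a line, and the mnemonic sets are small invented samples. Config is a stand-in for the class in config.py, and the remaining methods of the list are omitted.

```python
class Config:
    """Stand-in for the Config class in config.py."""


class C_myarch__(Config):
    # Assumed mnemonic sets for the hypothetical architecture 'myarch'.
    LOAD_OPS = {'ldr', 'ldp'}
    STORE_OPS = {'str', 'stp'}
    CONTROL_TRANSFER_OPS = {'b', 'bl', 'br', 'ret'}

    def get_asm_comment(self, line):
        # Assumed: '//' starts a comment that runs to the end of the line.
        index = line.find('//')
        return line[index + 2:].strip() if index >= 0 else None

    def op(self, line):
        # Drop the comment, then skip empty lines, labels and directives.
        code = line.split('//', 1)[0].strip()
        if not code or code.endswith(':') or code.startswith('.'):
            return None
        return code.split()[0]

    def load_op_p(self, op):
        return op in self.LOAD_OPS

    def store_op_p(self, op):
        return op in self.STORE_OPS

    def control_transfer_op_p(self, op):
        return op in self.CONTROL_TRANSFER_OPS


arch = C_myarch__()
line = '\tldr\tx0, [x1] // load a[i]'
print(arch.op(line), arch.get_asm_comment(line), arch.load_op_p(arch.op(line)))
```

A real definition would need to cover the full instruction set and the exact syntax emitted by the compilers under investigation.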
Defining a new architecture alone is meaningless. To use that definition, you also need to define a compiler whose quality is to be investigated on the newly added architecture.
To introduce a new compiler whose quality is to be investigated, it is necessary to create a class representing the compiler in the Python script file
${INSTALL_DIRECTORY}/hcqc/command/config.py
HCQC treats the compiler and the architecture that runs the compiler as a pair.
Therefore, if there is no definition of the architecture to run the new compiler, it is necessary to define the architecture first.
In the following, it is assumed that the name of the newly added compiler is Foo and the name of the architecture that runs the compiler Foo is myarch.
A class representing a new compiler needs to be created as a subclass of the class representing the architecture to run the compiler.
Also, the class name must be created by appending the name of the compiler after the class name of the architecture.
For example, if the name of a newly added compiler is Foo, the following class definition is required.
class C_myarch__Foo(C_myarch__):
This rule is for automatically detecting the class definition from the information in the ARCH field and COMPILER field in the configuration file.
If the compiler Foo needs to change the behavior of the method defined in the class C_myarch__ or the class Config,
the class C_myarch__Foo can override those definitions.
Since the name of the compiler is used as a Python class name, the name needs to be created only from characters that can be used for Python identifiers.
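The naming rule and the override mechanism can be sketched together as follows. This is an illustration under assumptions: the helper compiler_class_name is hypothetical, the comment markers used by myarch and Foo are invented, and Config is a stand-in for the class in config.py. The sketch validates the combined class name with str.isidentifier(), which is one way to enforce the restriction on compiler names.

```python
class Config:
    """Stand-in for the Config class in config.py."""


class C_myarch__(Config):
    def get_asm_comment(self, line):
        # Assumed: on myarch a comment starts with '//'.
        index = line.find('//')
        return line[index + 2:].strip() if index >= 0 else None


class C_myarch__Foo(C_myarch__):
    def get_asm_comment(self, line):
        # Assumed: the compiler Foo emits '#' comments instead,
        # so it overrides the architecture-level definition.
        index = line.find('#')
        return line[index + 1:].strip() if index >= 0 else None


def compiler_class_name(arch, compiler):
    """Build the class name from the ARCH and COMPILER fields."""
    class_name = 'C_' + arch + '__' + compiler
    if not class_name.isidentifier():
        raise ValueError('not a valid Python class name: ' + class_name)
    return class_name


print(compiler_class_name('myarch', 'Foo'))
print(C_myarch__Foo().get_asm_comment('\tadd\tx0, x0, x1 # sum'))
```

Under this check, a compiler name such as Clang/LLVM would be rejected, while ClangLLVM is accepted.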
You can test the new definition of the compiler by creating a control flow graph using the assembly code which the compiler generated. The Python script
${INSTALL_DIRECTORY}/hcqc/command/test-cfg.py
is a program for testing the generation of control flow graphs. For example, you can generate a control flow graph for the assembly code as follows:
% cd ${INSTALL_DIRECTORY}/hcqc/command
% python3 test-cfg.py aarch64 ClangLLVM 4.0.1 /tmp/AsmByClangLLVM.s kernel_f
- Add support for the Scalable Vector Extension (SVE) of AArch64 if it becomes available for GCC or Clang/LLVM.
