Welcome to the verification repository!
This repository contains all what you need to test that the translation of ARMv7m instructions (done by translator) is correct by comparing the execution in analyzer with execution on a real Cortex-M3 processor.
├── ci folder with tests and results ├── build.sh script to build the tests ├── cf.py control flow instructions generation ├── ci.sh script to run all the tests of one type ├── config.json config for the analyzer ├── constant.py constants ├── generator.py script to generate the tests and collect the dumps from the real device ├── it.py IT instructions generation ├── ldrstr.py load/store instructions generation ├── LICENSE.TXT license ├── link.ld linker script ├── Makefile makefile to compile the tests ├── os_run.py running commands from python ├── README.md this readme ├── regression_results.md and example of results of run_all.sh ├── run_all.sh script to run all the tests ├── run_klee.sh script to run the test in analyzer ├── verify.py script to compare and verify the results └── versioning.sh script to save version information
We first have to generate many test sequences, and to record the effects of their execution on a real Cortex-M3. For most instructions we just look at the registers of the CPU. For load/store instructions we also look at memory (the stack as we perform all read/writes there). Each sequence is composed by an initialization of the registers (and stack) and by one (or a few) instruction under test. Instructions are generated by expanding a representation of the ARMv7m instruction into all possible instruction types, and by generating many different random operands (register names and values). The sequences are injected and executed by a Cortex-M3 using our custom debugger, which also collects the final register/stack state after execution. This step is time consuming and can be run only once when a new instruction is added to the ones supported by translator. Therefore, we have stored our results and we provide them in this repository. They are organized per type of instruction.
Once we have many tests and their "golden" results, we can use translator to transform them in llvm-ir that can run in analyzer. We can also collect the final register/stack state in analyzer and compare it with the "golden" results. This step has to be reapeated each time that the code of translator or analyzer is modified. We provide a regression script that runs all the tests that we provide and logs the results.
In this part we briefly explain how to use our code.
To run a regression containing all the main tests simply type the following.
You will not need any hardware. The process will take quite some time. The output file will contain information about the version of the components under test and a summarized result for each test suite (number of successful tests, etc.).
To run a test for a single instruction type, simply run the following.
./ci.sh -t $TEST_NAME -f LR,CPSR,.stack,SP,PC
You can find all available tests with:
The -f option is a filter that disables the checks for the registers that follow. For the tests to load/store instructions, remove SP from the list! This filtering is necessary because of the differences between the device and the analyzer. For example, the analyzer does not uses PC and LR for call/returns, and it dumps a .stack value which is not useful here. We prefer to dump all the register results from the real execution on the Cortex-M3 and to filter them, in case they will be useful in the future.
This part is the most complex, but fortunately you will most likely do not need it.
You will need to connect your laptop to our custom debugger, and our debugger to a board equipped with an AHB-AP debug port via JTAG and a Cortex-M3. Refer to the instructions in the debugger repository for this.
Test sequences are generated and run on the Cortex-M3 by the generator.py script. Have a look at it for the options. You can choose the seed, the output folder (put ci/tests/name), whether you want to exit on error or not, and the number of tests to generate for a given instruction.
There is a lot of code repetition and often you will need to comment unused sequences and uncomment the part you want. We apologies for the inconvenience, and we plan to engineer it into a more usable script.
An example could be:
python3.5 generator.py -s 0 -c False -o ci/tests/example -N 10
This repository is the result of several experimets and evolved over time. We have privileged stability and compatibilty with past collections of the results (time consuming process), rather than the development of a brand new a clean architecture. We are working on a clean and optimized version.