RIOT's new test system: large scale testbed based protocol tests
This document describes an approach to large-scale network protocol tests that run on IoT testbeds.
- IoT testbeds differ in their architecture and in the interfaces they offer to users. Nevertheless, protocol tests should be able to run on a wide range of testbeds.
- Depending on how a test is executed and how many motes are involved, huge amounts of log data may be generated. This data needs to be collected and processed in order to decide whether a test passed or failed.
- It is desirable to have a simple yet powerful protocol test API.
- It should be possible to utilize this API to test network protocols no matter on which layer of the network stack they are located.
Testbed abstraction layer
- The system must abstract from differences between testbeds. This means it must offer a unified way to do the following:
- data collection
- communications with motes
- flashing of motes
- addressing of motes (think DHCP for testbeds)
- It must be possible to specify the test itself in a concise way. A pattern where a protocol test consists of multiple phases or steps should be the preferred way to write such a test.
- setUp and tearDown hooks must be provided by the test system so that a test designer can execute code before and after the protocol test.
- Tests must return an exit status. They also must return a message indicating which (if any) test criterion wasn't met.
- The tests should be designed as black-box tests wherever possible. However, this might not be possible for all protocols, so we might need to find a generic way to expose certain internals of a protocol implementation to the test system.
- The system must be capable of executing performance tests.
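To make the requirements on test structure more concrete, here is a minimal Python sketch of a test that consists of ordered steps, runs setUp and tearDown hooks, and returns an exit status together with a message naming the unmet criterion. The class `ProtocolTest` and its methods are hypothetical names chosen for illustration; they are not part of RIOT or of any existing test system.

```python
class ProtocolTest:
    """Hypothetical container for a protocol test made of ordered steps."""

    def __init__(self, name, set_up=None, tear_down=None):
        self.name = name
        # Hooks default to no-ops so that a test designer may omit them.
        self.set_up = set_up or (lambda: None)
        self.tear_down = tear_down or (lambda: None)
        self.steps = []

    def add_step(self, step_name, step):
        """Register a step: a callable returning None on success,
        or a string describing the criterion that was not met."""
        self.steps.append((step_name, step))

    def run(self):
        """Run all steps in order and return (exit_status, message)."""
        self.set_up()
        try:
            for step_name, step in self.steps:
                failure = step()
                if failure is not None:
                    return 1, f"{self.name}: step '{step_name}' failed: {failure}"
            return 0, f"{self.name}: all criteria met"
        finally:
            # tear_down runs even if a step fails or raises.
            self.tear_down()


hooks_called = []
test = ProtocolTest(
    "routing demo",
    set_up=lambda: hooks_called.append("setUp"),
    tear_down=lambda: hooks_called.append("tearDown"),
)
test.add_step("start traffic", lambda: None)
test.add_step("check loss", lambda: "packet loss above threshold")
status, message = test.run()
print(status, message)
```

In this sketch the first failing step ends the test, which keeps the returned message unambiguous. A real system might instead collect all failures before reporting.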
Sketch of a large scale protocol test system
Architectural overview: The proposed system follows a layered approach. The lowest layer is a testbed on which the network tests are executed (such as DES-Testbed or FIT IoT-Lab). On top of that layer sits an abstraction layer which exposes commonly used testbed functions to higher layers through a standardized API (e.g. flashing of motes, communication with motes and log data collection). Tests are executed using the API provided by the abstraction layer. This way a test needs to be written only once and can be executed on different testbeds. The following sections describe each layer in more detail.
IoT testbeds come in a wide variety of shapes and sizes. It is not the goal of this project to support every single one of them. Instead, we will concentrate on testbeds which are of interest for testing network protocols. A testbed may be of interest for this purpose due to:
- Its size
- Measurement capabilities (GPS time synchronization, Out-of-band measurements etc.)
- Used hardware (special transceiver / Ethernet / microcontroller / sensors etc.)
For the first version of the test system we aim to support the following testbeds:
- DES-Testbed: simply because it is easily accessible for most of the RIOT team
- FIT IoT-Lab: due to its measurement capabilities, size and modern hardware
- Local testbeds: a miniature testbed which consists of multiple motes connected to a single computer
Abstraction Layer (TODO: find a catchy name for this)
The abstraction layer provides access to testbed functions through a standardized API. This means that the API has to be implemented for every supported testbed. This layer must provide the following functionality:
- Sending commands to motes through their serial line (to individual motes or groups of motes)
- Scheduling of events such as sending a command to all motes or a controlled shutdown of a test etc.
- Centralized logging capabilities which support automated log file evaluation by a test
- Network address assignment and flashing of motes
If out-of-band measurement methods or other special features are available in a testbed, then those functions should be exposed through this API as well. If a test requests a special feature from a testbed that does not support it, the API method should return an error code accompanied by a sensible error message.
Testbeds can be utilized to perform the following tests:
- Performance tests for protocols like load and stress tests
- Interoperability tests (e.g. does the protocol implementation of OS Y play nicely with the protocol implementation of OS Z?)
The focus of this project is on the former, i.e. performance tests. Hence, tests should be designed so that they can answer questions like:
- What is the mean/worst case packet loss rate for routing protocol x?
- What is the mean/worst case delay in the network for routing protocol y?
- What is the mean/worst case frame loss in a MAC-layer protocol?
It would then be possible to define threshold values for each of those protocol performance parameters which must or should be met. Thus, a test specification might look like this (in pseudo-code):
    var mean_pkt_loss, worst_pkt_loss
    // [...] log file parser assigns values to variables
    mean_pkt_loss.should_be_lower_than(0.2)
    worst_pkt_loss.must_be_lower_than(0.3)
This test would issue a warning if the mean packet loss rate is not below 20% and would exit with an error if the worst-case packet loss rate is not below 30%. Please note that this example test specification does not show the individual test steps which lead to a system state that can be tested against the specification. In order to reach that state, several steps need to be taken:
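The distinction between "should" (warning) and "must" (error) in the specification above could be implemented roughly as follows. This is a sketch under assumptions: `Report` and `Metric` are hypothetical helper classes, and the metric values are hard-coded here where a log file parser would normally supply them.

```python
class Report:
    """Collects warnings and errors and derives the exit status from them."""

    def __init__(self):
        self.warnings = []
        self.errors = []

    @property
    def exit_status(self):
        # Only violated "must" criteria make the test fail.
        return 1 if self.errors else 0


class Metric:
    """A named measurement that can be checked against thresholds."""

    def __init__(self, name, value, report):
        self.name = name
        self.value = value
        self.report = report

    def _violation(self, threshold):
        if self.value < threshold:
            return None
        return f"{self.name} = {self.value} is not lower than {threshold}"

    def should_be_lower_than(self, threshold):
        message = self._violation(threshold)
        if message:
            self.report.warnings.append(message)

    def must_be_lower_than(self, threshold):
        message = self._violation(threshold)
        if message:
            self.report.errors.append(message)


report = Report()
# Values are made up for illustration.
mean_pkt_loss = Metric("mean_pkt_loss", 0.25, report)
worst_pkt_loss = Metric("worst_pkt_loss", 0.28, report)
mean_pkt_loss.should_be_lower_than(0.2)
worst_pkt_loss.must_be_lower_than(0.3)
print(report.exit_status, report.warnings, report.errors)
```

With these example values the run produces one warning and no error, so the exit status stays 0.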
A RIOT test application needs to be written. This application must be configured to:
- use the protocol implementation under test
- output state log messages when certain code is executed (most likely functions of the protocol implementation)
- use appropriate shell handlers to modify the state of the mote according to the requirements of the test. This must include a command to start and end a test but may also include commands to change the state of the protocol implementation and/or mote.
A test scenario must be defined. Test scenarios can be defined as a sequence of steps which lead the system under test into a specific state. In the case of routing protocols this might mean the following (in pseudo-code):
    def start_scenario:
        execute_after(1min, testbed.force_virtual_line_topology())
        execute_after(2min, testbed.send_to_all("test start"))
        execute_after(5min, testbed.send_to_all("test stop"))
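The scenario above can be approximated in runnable Python with the standard library's `sched` module. The sketch below is an illustration under assumptions: `RecordingTestbed` is a hypothetical stand-in that records calls instead of controlling motes, and a virtual clock is used so that the example finishes immediately instead of waiting five minutes. Note that, unlike in the pseudo-code, the actions are passed as callables and are not invoked at scheduling time.

```python
import sched
import time

MINUTE = 60  # seconds


class VirtualClock:
    """Fake clock: 'sleeping' just advances the current time."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


class RecordingTestbed:
    """Hypothetical testbed stand-in that records when each call happened."""

    def __init__(self, timefunc):
        self.timefunc = timefunc
        self.calls = []

    def force_virtual_line_topology(self):
        self.calls.append((self.timefunc(), "force_virtual_line_topology"))

    def send_to_all(self, command):
        self.calls.append((self.timefunc(), f"send_to_all {command}"))


def start_scenario(testbed, timefunc=time.monotonic, delayfunc=time.sleep):
    """Schedule the three scenario steps and block until all have fired."""
    scheduler = sched.scheduler(timefunc, delayfunc)
    # Delays are relative to the moment of scheduling.
    scheduler.enter(1 * MINUTE, 1, testbed.force_virtual_line_topology)
    scheduler.enter(2 * MINUTE, 1, testbed.send_to_all, argument=("test start",))
    scheduler.enter(5 * MINUTE, 1, testbed.send_to_all, argument=("test stop",))
    scheduler.run()


clock = VirtualClock()
testbed = RecordingTestbed(clock.time)
start_scenario(testbed, clock.time, clock.sleep)
for timestamp, call in testbed.calls:
    print(timestamp, call)
```

Calling `start_scenario(testbed)` without the clock arguments would use real time and block for five minutes.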
The resulting log files must be parsed and processed in order to extract meaningful values which can be used by the test specification in order to determine whether a test passed or not. This is probably the most involved step. It is not clear whether a simple yet powerful abstraction can be found to simplify this task. Depending on the protocol and the collected data points even difficult stochastic / graph theoretical problems may need to be solved.
- One way to simplify at least the data collection and processing part would be to introduce some kind of standardized logging mechanism with a well-defined format into RIOT. This mechanism could be implemented by means of a special LOG macro akin to the DEBUG macro. It should be possible to enable only specific logging messages (for example, only the second logging message in a specific API function).
- It might also be advantageous to introduce scripts which create a code skeleton for protocol tests.
- It might be possible to use Robot Framework for the test specification.
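To illustrate how a standardized log format could simplify the log processing step, the following Python sketch derives the mean and worst-case packet loss rate from collected log lines. The log format `LOG;<mote>;<event>;<packet id>` with the events `TX` and `RX` is purely an assumption made for this example, as is the requirement that packet IDs are unique across all motes. RIOT does not define such a format.

```python
from collections import defaultdict
from statistics import mean


def packet_loss_per_sender(lines):
    """Return a dict mapping each sending mote to its packet loss rate.

    A packet counts as lost if its ID was logged as TX by the sender
    but never logged as RX by any mote.
    """
    sent = defaultdict(set)  # sender -> IDs of packets it transmitted
    received = set()         # IDs of all packets received anywhere
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) != 4 or fields[0] != "LOG":
            continue  # ignore unrelated output such as shell echo
        _, mote, event, packet_id = fields
        if event == "TX":
            sent[mote].add(packet_id)
        elif event == "RX":
            received.add(packet_id)
    return {mote: len(ids - received) / len(ids) for mote, ids in sent.items()}


# Made-up log lines for illustration.
log_lines = [
    "LOG;m1;TX;1",
    "LOG;m1;TX;2",
    "LOG;m2;TX;3",
    "LOG;m2;TX;4",
    "LOG;m3;RX;1",
    "LOG;m3;RX;3",
    "LOG;m3;RX;4",
    "> test stop",
]
loss = packet_loss_per_sender(log_lines)
mean_pkt_loss = mean(loss.values())
worst_pkt_loss = max(loss.values())
print(loss, mean_pkt_loss, worst_pkt_loss)
```

The resulting values for `mean_pkt_loss` and `worst_pkt_loss` are the kind of input a test specification would then compare against its thresholds. Metrics such as delay or route convergence would need timestamps and considerably more involved processing.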
Protocol tests best practices
This section explains how to choose meaningful metrics for protocols which are located at different layers of the network stack. It also briefly discusses possible test scenarios for each layer.