# Hardware description of Real Time Hardware Sorter (RTHS)



Author: Amin Norollah Last modified: 2019/10/09

## **Table of Contents**

| COPYRIGHT NOTICE                | 3 |
|---------------------------------|---|
|                                 |   |
| INTRODUCTION                    | 4 |
| BITONIC SORTING NETWORK DESIGN  |   |
| BITONIC SORTING NET WORK DESIGN | 3 |
| MAIN RTHS DESIGN                | Е |

# Copyright notice Copyright 2018 by Amin Norollah, amin<dot>norollah<at>gmail<dot>com. This project and its content is provided for academic use only under GPLv3 License. You can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. Some open source application is distributed in the hope that it will be useful, but WITHOUT ANY

WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR

PURPOSE. See the GNU General Public License for more details.

### Introduction

Sorting is one of the fundamental computation task in computer science, used in many applications. In the high performance applications like big data processing, it is important to sort the input data in the minimal time with high performance. A lot of software algorithms proposed for this matter, but by increasing the computation requirements and executing the programs in many-core platforms like Network on chip systems, we want new algorithms that let us to sort data in parallel platform.

The conventional sorting networks not suitable for this manner. Therefore, the new parallel algorithms based on networks were proposed to use the features of the high performance platforms and increase the performance, such as the bitonic and odd-even sorting networks. FPGA-based solutions help us to design specific hardware for sorter and let us design an efficient hardware that only has the task of sorting the data.

The main challenge in this hardware sorters is how to design a high performance sorter that be able to receive the maximum number of records and sort them in the minimal time. But it is big challenge, because increasing the number of input records in the sorter means increasing the amount of hardware resource required that is limited. This limitation caused by limited number of resource in FPGA chip and by this fact that the sorting module is one of the components of the system and we like to minimize this module.

We proposed a new hardware solution for sorting the data based on the novel multidimensional sorting algorithm. This projects is divided in 2 main parts:

- 1. Designing the scalable Bitonic sorting network,
- 2. Designing and implementing the Real-Time Hardware Sorter (RTHS).

We have tried to simplify the code of the RTHS design in this github project to make it easier to understand. For this reason, we described hardware sorter for sorting the 16 input records with 16 data width.

In the following of this documents, bitonic sorting network and the RTHS design will be described.

### **Bitonic sorting network Design**

We tried to find the similarities of the sorting network and program the related modules, which can be scalable in the different simulations without need to change the codes. Figure 1 shows the 8 input records Bitonic sorting network.



Figure 1 simple 8 input Bitonic sorting network.

We designed this network with 5 modules. "BitonicPreStage" modules describe the Iterative parts of the network that can be used in the next stages and "BitonicPosStageX" modules describe the other part of the network and in each stage we need to use the corresponding modules.

It can be imaged with increasing the input records in this network to 16 input records. We will be able to describe the 16 input records Bitonic network with duplicating the network of Figure 1 in parallel form and adding the new "BitonicPreStage" and "BitonicPosStage4" modules in series.



Figure 2 scalability.

### Main RTHS design

Main module is used the Bitonic sorting network as a basic block. In this sorting method, input records has entered into sorter in the form of one dimension array and after that is transformed to 2D matrix form. Each Bitonic sorting network received one column of the 2D matrix for partial sorting simultaneously. After that, column and row of the matrix is switched and 2D matrix is turned to other dimension (from I to J and vice versa). This process is one of the phases of the sorting. After 6 phase of the system, all the records of 2D matrix have sorted. Then 2D matrix is flattened by dimension order and sent to output.



Figure 3 schematic of proposed sorter.

Figure 3 shows the schematic of the basic idea of the proposed 2D sorter for sorting 16 input records. For getting the more information about this architecture, you can see the following paper: "RTHS: A Low-Cost High-Performance Real-Time Hardware Sorter, Using a Multidimensional Sorting Algorithm", doi: 10.1109/TVLSI.2019.2912554.

As mentioned above, this design consisted of 4 modules as follows:

| Num | Name of module           | Description                                                                                                                             |
|-----|--------------------------|-----------------------------------------------------------------------------------------------------------------------------------------|
| 1   | Bitonic sorting networks | We used 4 bitonic sorting network to sort 16 input records.                                                                             |
| 2   | 2D switch                | Also called it implicit switch. It has a task of turning the 2D matrix of records that is sorted in Bitonic sorting networks.           |
| 3   | temporary<br>registers   | They located in the main module and have task of storing the input record in the first phase and output of the 2D switch in the others. |
| 4   | control unit             | Set the Direction signals and Ready signal at the end of the process.                                                                   |