## An experimental analysis of loop pipelining techniques on SIMD-like architectures\*

Mehmet Ali Arslan, Flavius Gruian, Krzysztof Kuchcinski

Lund University, Computer Science

{mehmet\_ali.arslan, flavius.gruian, krzysztof.kuchcinski}@cs.lth.edu

## Abstract

## 1. Introduction

General intro

Background on constraint programming (CP) and modulo scheduling...

## 2. Related Work

## 3. Approach

### 3.1 Scheduling one iteration

### 3.2 Scheduling several iterations simultaneously

#### 3.2.1 Overlapping (Chenxin's way)

#### 3.2.2 Modulo scheduling

#### 3.2.3 Unrolling and modulo scheduling

## 4. Experiments and evaluation

comparisons... (of which measures?)

### 4.1 Average throughput

### 4.2 Code size

### 4.3 Burstiness

### 4.4 Reconfiguration

### 4.5 Scheduling time

## 5. Experimental analysis

## 6. Conclusions and future work



Figure 1: Burstiness

2014/12/5

<sup>\*</sup>This work has been supported by the Swedish Foundation for Strategic Research (SSF) as part of the High Performance Embedded Computing project (HiPEC).