In this assignment, your task is to optimize the problem of “Dilated Convolution (DC)” which involves an input matrix and a kernel matrix. The sample algorithm is depicted below using an animation.
A. Input Matrix of dimensions:
B. Kernel Matrix of dimensions:
An Output Matrix of dimensions:
Dilated Convolution (DL) is a variant of the convolution operation. Following is the explanation of the algorithm:
Let us assume for the explanation of the algorithm that the dimensions of the Input Matrix and Kernel Matrix are as follows:
A. Input Matrix
B. Kernel Matrix
So, dimensions of the Output Matrix
[
The Kernel Matrix
Contained are three folders:
- PartA: Contains setup for single-threaded and multi-threaded program.
- PartB: Contains setup for GPU program.
- test: Contains a test program to aid in understanding the algorithm.
Each of the folders PartA and PartB contain two sub-folders, a Makefile, and a main program.
- Makefile: Contains commands necessary to compile, generate inputs, and run the program.
- data folder: Contains program that generates input, and will contain input once generated.
- header folder: Files containing the function that performs the operation. Modify the files in this folder.
- main.cpp: Program that takes inputs and executes the functions. DO NOT MODIFY THIS.
Navigate to each folder to begin setting up the system. Inside each folder do the following:
Use the following command to compile the programs and generate required input:
make
You can use make to run the executable with the following command:
make run
Alternatively, you can manually run the program for the different input sets using the following commands:
./dilated_conv -i data/4096.in -k data/13.in
./dilated_conv -i data/8192.in -k data/64.in
./dilated_conv -i data/16384.in -k data/3.in
Note: the -i
flag is for the input matrix and -k
flag is for the kernel matrix.
To compile the code for use on native GPU use the following command:
make server
For use with GPGPU-Sim, additional flags are required during compilation, which can be done with the following command:
make sim
You can use make to run the executable with the following command for native execution:
make run_server
When running on GPGPU-Sim, use the following command instead:
make run_sim
This directory contains a simple program to help you understand how the algorithm works. You can play around with the sizes of the Input-Matrix