In this project, you will implement an image convolution routine using the 32-bit RISC-V (RV32I) assembly language. An image file in the BMP format will be given as an input. The goal of this project is to give you an opportunity to practice the RISC-V assembly programming. In addition, this project introduces various RISC-V tools that help you compile and run your RISC-V programs.
The RGB color model is one of the most common ways to encode color images in the digital world. The RGB color model is based on the theory that all visible colors can be created using the primary additive colors: red, green, and blue. When two or three of them are combined in different amounts, other colors are produced. The RGB color model is important to graphic design as it is used in computer monitors.
The BMP file format is an image file format used to store digital images, especially on Microsoft Windows operating systems. A BMP file contains a BMP file header, a Bitmap information header, an optional color palette, and an array of bytes that defines the bitmap data. Since the BMP file format has been extended several times, it supports different types of encoding modes. For example, image pixels can be stored with a color depth of 1 (black and white), 4, 8, 16, 24 (true color, 16.7 million colors) or 32 bits per pixel. Images of 8 bits and fewer can be either grayscale or indexed color mode. More details on the BMP file format can be found at http://en.wikipedia.org/wiki/BMP_file_format.
In this project, we will focus only on the 24-bit uncompressed RGB color mode with the "Windows V3" bitmap information header. Under this mode, our target image file has the following structure.
+-----------------------------------------+
| BMP file header (14 bytes) |
+-----------------------------------------+
| Bitmap information header (40 bytes) |
+-----------------------------------------+
imgptr -> | Bitmap data |
| |
| |
| |
+-----------------------------------------+
We will provide you with the skeleton code that has only the bitmap data. So you don't have to worry about these headers.
The bitmap data describes the image, pixel by pixel. Each pixel consists of an 8-bit blue (B) byte, a green (G) byte, and a red (R) byte in that order. Pixels are stored "upside-down" with respect to normal image raster scan order, starting in the lower left corner, going from left to right, and then row by row from the bottom to the top of the image. Note that the number of bytes occupied by each row should be a multiple of 4. If that's not the case, the remaining bytes are padded with zeroes. The following figure summarizes the structure of the bitmap data.
In image processing, a kernel is a small matrix used for blurring, sharpening, embossing, edge detection, and more. It is also used in machine learning for feature extraction, a technique for determining the most important portions of an image. This is accomplished by performing a convolution operation between the kernel and an image.
Convolution is the process of adding each element of the image to its local neighbors, weighted by the kernel. The following example shows the result where the 3x3 kernel K
is applied to the 5x5 source image S
.
S = [[ 1, 1, 1, 0, 0],
[ 0, 1, 1, 1, 1], K = [[1, 0, 1], [[4, 3, 4],
[ 0, 0, 1, 1, 1], * [0, 1, 0], = [2, 4, 3],
[ 0, 0, 1, 1, 0], [1, 0, 1]] [2, 3, 4]]
[ 0, 1, 1, 0, 0]]
Differently sized kernels containing different patterns of numbers produce different results under convolution. The size of a kernel is arbitrary but we will always use the 3x3 kernel.
There are several ways to handle the missing values outside of the image boundary. We assume that any pixel in the output image which would require values from beyond the edge is skipped. Hence, for the given S
= h
x w
image, our convolution operation always results in the D
= (h-2)
x (w-2)
output image.
Complete the file bmpconv.s
that implements the function bmpconv()
in the 32-bit RISC-V assembly language. The prototype of bmpconv()
is as follows:
void bmpconv (unsigned char *imgptr, int h, int w,
unsigned char *k, unsigned char *outptr);
The first argument, imgptr
points to the bitmap data that stores the actual image, pixel by pixel. The next two arguments, h
and w
, represent the height and the width of the given image (in pixels), respectively. The argument k
points to the address of the kernel. The last argument outptr
points to the address of the output image. Note that the pixel data for the input image (indicated by imgptr
) and the output image (indicated by outptr
) should follow the same bitmap data format used in the BMP image file. You can assume that a sufficient memory region has been already allocated to store the output image, whose start address is given by outptr
. You need to perform the convolution operation for each color separately.
In the assembly code, those arguments, imgptr
, h
, w
, k
, and outptr
, are available in the a0
, a1
, a2
, a3,
and a4
registers. Because we are using the 32-bit RISC-V simulator, all the registers are 32-bit wide.
The kernel is always a 3x3 matrix of unsigned characters stored in memory using the row-major ordering (as in the C language). That is, k == &k[0][0]
, k+1 == &k[0][1]
, k+2 == &k[0][2]
, ... and k+8 == &k[2][2]
. In this project, each element in the kernel is set to one of 0, 1, and -1, in order to avoid the use of the multiplication instruction (Note: The RV32I lacks multiplication/division instructions).
Let us assume that S
represents the input image and D
the output image. For the given h
x w
input image, the output image will have the dimension of (h-2)
x (w-2)
. In this case, each pixel value in D[i][j]
is given by:
D[i][j] = Sat(S[i][j]*K[0][0] + S[i][j+1]*K[0][1] + S[i][j+2]*K[0][2]
+ S[i+1][j]*K[1][0] + S[i+1][j+1]*K[1][1] + S[i+1][j+2]*K[1][2]
+ S[i+2][j]*K[2][0] + S[i+2][j+1]*K[2][1] + S[i+2][j+2]*K[2][2])
The function Sat(v)
is the saturation arithmetic operation. For the given signed integer v
, Sat(v)
returns an 8-bit unsigned integer, where
Sat(v) = 0 if v < 0,
Sat(v) = 255 if v > 255,
Sat(v) = v otherwise
-
You are allowed to use only the following registers in the
bmpconv.s
file:zero (x0)
,sp
,ra
,a0
~a4
,t0
~t4
. If you are running out of registers, use stack as temporary storage. -
The maximum amount of the space you can use in the stack is limited to 256 bytes. Let
A
be the address indicated by thesp
register at the beginning of the functionbmpconv()
. The valid stack area you can use is fromA - 256
toA - 1
. You should always access the stack area using thesp
register such assw a0, 16(sp)
. -
The
lb
andsb
instructions are not available in the simulator. Therefore, you need to uselw
andsw
instructions to access data in memory. -
The padding area in the output image should be set to the value 0, if any.
-
Your program should work for any
h
xw
input image whereh
> 3 andw
> 3. -
You are allowed to define any extra functions inside of the
bmpconv.s
file, if necessary. -
Your program should finish within a reasonable time. If your code does not finish within a predefined threshold, it will be terminated.
-
The top 15 fastest
bmpconv()
implementations will receive a 10% extra bonus. Also, the next 15 fastest implementations will receive a 5% bonus. The time will be measured in clock cycles which is identical to the total number of instructions executed.
In order to compile RISC-V assembly programs, you need to build a cross compiler, i.e. the compiler that generates the RISC-V binary code on the x86-64 machine. To build the RISC-V toolchain on your machine (on either Linux or MacOS), please take the following steps. These instructions are also available in the README.md file of the PyRISC toolset.
For Ubuntu, perform the following commands:
$ sudo apt-get install autoconf automake autotools-dev curl libmpc-dev
$ sudo apt-get install libmpfr-dev libgmp-dev gawk build-essential bison flex
$ sudo apt-get install texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev
If your machine runs MacOS, perform the following command (If you don't have the brew
utility, you have to install it first -- please refer to https://brew.sh):
$ brew install gawk gnu-sed gmp mpfr libmpc isl zlib expat
$ git clone --recursive https://github.com/riscv/riscv-gnu-toolchain
$ cd riscv-gnu-toolchain
$ mkdir build
$ cd build
$ ../configure --prefix=/opt/riscv --with-arch=rv32i
Note that they are installed in the path given as the prefix, i.e. /opt/riscv
in this example. (Warning: This step may take some time.)
$ sudo make
$ export PATH=/opt/riscv/bin:$PATH
PATH
is an environment variable on Linux, specifying a set of directories where executable programs are located. When a command name is specified by the user, the system searches through $PATH
, examining each directory from left to right in the list, looking for a filename that matches the command name.
If you don't want to type the above command every time you log in, put it in your ~/.bashrc
or ~/.bash_aliases
file.
We provide you with the skeleton code for this project. It can be downloaded from Github at https://github.com/snu-csl/ca-pa3/.
To download and build the skeleton code, please follow these steps:
$ git clone https://github.com/snu-csl/ca-pa3.git
$ cd ca-pa3
$ make
riscv32-unknown-elf-gcc -c -Og -march=rv32i -mabi=ilp32 -static bmpconv-main.s -o bmpconv-main.o
riscv32-unknown-elf-gcc -c -Og -march=rv32i -mabi=ilp32 -static bmpconv.s -o bmpconv.o
riscv32-unknown-elf-gcc -c -Og -march=rv32i -mabi=ilp32 -static bmpconv-test.s -o bmpconv-test.o
riscv32-unknown-elf-gcc -T./link.ld -nostdlib -nostartfiles -o bmpconv bmpconv-main.o bmpconv.o bmpconv-test.o
The executable file generated by riscv32-unknown-elf-gcc
should be run in the RISC-V machine. In this project, we provide you with a RISC-V instruction set simulator written in Python, called snurisc. It is available at the separate Github repository at https://github.com/snu-csl/pyrisc. You can install it by performing the following command.
$ git clone https://github.com/snu-csl/pyrisc
To run your RISC-V executable file, you need to modify the ./ca-pa3/Makefile
so that it can find the snurisc simulator. In the Makefile
, there is a variable called PYRISC
. By default, it was set to ../pyrisc/sim/snurisc.py
. For example, if you have downloaded PyRISC in /dir1/dir2/pyrisc
, set PYRISC
to /dir1/dir2/pyrisc/sim/snurisc.py
.
...
PREFIX = riscv32-unknown-elf-
CC = $(PREFIX)gcc
CXX = $(PREFIX)g++
AS = $(PREFIX)as
OBJDUMP = $(PREFIX)objdump
PYRISC = /dir1/dir2/pyrisc/sim/snurisc.py # <-- Change this line
PYRISCOPT = -l 1
INCDIR =
LIBDIR =
LIBS =
...
Now you can run bmpconv
, by performing make run
. The result of a sample run using the snurisc simulator looks like this:
$ make run
/dir1/dir2/pyrisc/sim/snurisc.py -l 1 bmpconv
Loading file bmpconv
Execution completed
Registers
=========
zero ($0): 0x00000000 ra ($1): 0x800001b8 sp ($2): 0x80017ffc gp ($3): 0x00000000
tp ($4): 0x00000000 t0 ($5): 0x000005a0 t1 ($6): 0x4c000000 t2 ($7): 0x00000000
s0 ($8): 0x00000000 s1 ($9): 0x00000003 a0 ($10): 0x80012d2c a1 ($11): 0x00000000
a2 ($12): 0x000000bc a3 ($13): 0x80017ff0 a4 ($14): 0x8001aa30 a5 ($15): 0x00000000
a6 ($16): 0x00000000 a7 ($17): 0x00000000 s2 ($18): 0x80010008 s3 ($19): 0x80010018
s4 ($20): 0x80018020 s5 ($21): 0x80012f24 s6 ($22): 0x80015934 s7 ($23): 0x80018000
s8 ($24): 0x00000000 s9 ($25): 0x00000000 s10 ($26): 0x00000000 s11 ($27): 0x00000000
t3 ($28): 0x00000000 t4 ($29): 0x00000000 t5 ($30): 0x80018020 t6 ($31): 0x00000003
1945220 instructions executed in 1945220 cycles. CPI = 1.000
Data transfer: 448930 instructions (23.08%)
ALU operation: 1180661 instructions (60.70%)
Control transfer: 315629 instructions (16.23%)
If the value of the t6
(or x31
) register is nonzero, it means that your program has failed to pass all test cases. If you failed a test case, it shall stop running and the index of the test case will be stored in t6
. The memory address of the first incorrect word will be stored in t5
. For example, if the value of t6
is equal to 0x00000003, it means that your program passed test 1 and 2, but didn't pass test 3. If the value of t5
is equal to 0x80018020, it means that your program has failed to match the value of the answer output at the memory address 0x80018020.
Please note that the simulator snurisc.py
has the ability to show various log information. For example, if you specify the log level 3 (-l 3
), you can see each instruction executed by the simulator as shown below.
/dir1/dir2/pyrisc/sim/snurisc.py -l 3 bmpconv
Loading file bmpconv
0 0x80000000: lui sp, 0x80018000
1 0x80000004: jal ra, 0x8000000c
2 0x8000000c: addi sp, sp, -4
3 0x80000010: sw ra, 0(sp)
4 0x80000014: addi t6, zero, 1
5 0x80000018: auipc s2, 0x10000
6 0x8000001c: addi s2, s2, -24
7 0x80000020: auipc s3, 0x10000
8 0x80000024: addi s3, s3, -16
9 0x80000028: lui a4, 0x80018000
10 0x8000002c: lw a3, 0(s2)
11 0x80000030: lw a2, 12(a3)
12 0x80000034: lw a1, 16(a3)
13 0x80000038: addi a0, a3, 20
14 0x8000003c: jal ra, 0x80000090
15 0x80000090: jalr zero, ra, 0
16 0x80000040: lui s4, 0x80018000
17 0x80000044: lw s5, 0(s3)
18 0x80000048: lw s6, 4(s3)
19 0x8000004c: lw s7, 0(s4)
20 0x80000050: lw s8, 0(s5)
21 0x80000054: bne s7, s8, 0x80000088
22 0x80000088: addi t5, s4, 0
23 0x8000008c: ebreak
Execution completed
...
-
Submit only the
bmpconv.s
file to the submission server. -
If your program contains any register names other than the allowed ones, it will be rejected by the server.
-
Your program will be rejected if it contains such keywords as
.octa
,.quad
,.long
,.int
,.word
,.short
,.hword
,.byte
,.double
,.single
,.float
, etc.
- You will work on this project alone.
- Only the upload submitted before the deadline will receive the full credit. 25% of the credit will be deducted for every single day delay.
- You can use up to 4 slip days during this semester. If your submission is delayed by 1 day and if you decided to use 1 slip day, there will be no penalty. In this case, you should explicitly declare the number of slip days you want to use in the QnA board of the submission server after each submission. Saving the slip days for later projects is highly recommended!
- Any attempt to copy others' work will result in heavy penalty (for both the copier and the originator). Don't take a risk.
Have fun!
Jin-Soo Kim
Systems Software and Architecture Laboratory
Dept. of Computer Science and Engineering
Seoul National University