Day 3 (190903)

Juhwi Eden Kim edited this page Oct 7, 2019 · 1 revision
  • Optimized the code by removing unnecessary calls and copies.

Optimization


According to the lecture, I can profile the program with gprof.
For now, let's just look at the lecture's results.

1. Reduce the time which is spent on calling line( )

%   cumulative   self              self     total 
time   seconds   seconds    calls  ms/call  ms/call  name 
69.16      2.95     2.95  3000000     0.00     0.00  line(int, int, int, int, TGAImage&, TGAColor) 
19.46      3.78     0.83 204000000     0.00     0.00  TGAImage::set(int, int, TGAColor) 
8.91      4.16     0.38 207000000     0.00     0.00  TGAColor::TGAColor(TGAColor const&) 
1.64      4.23     0.07        2    35.04    35.04  TGAColor::TGAColor(unsigned char, unsigned char, unsigned char, unsigned char) 
0.94      4.27     0.04                             TGAImage::get(int, int)

This is the gprof output from the lecture. About 70% of the time is spent inside line( ) itself.

int dx = x1-x0;
int dy = y1-y0;
float derror = std::abs(dy/float(dx));

float error = 0;
int y = y0;

for (int x=x0; x<=x1; x++) {
    if (steep) {
        image.set(y, x, color);
    } else {
        image.set(x, y, color);
    }
    error += derror;
    if (error>.5) {
        y += (y1>y0?1:-1);
        error -= 1.;
    }
}

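The division in the original loop (float t = (x-x0)/(float)(x1-x0); int y = y0*(1.-t) + y1*t;) always uses the same divisor, so it can be taken out of the loop. Now the loop contains only the setter calls and a cheap error update. It was hard to figure out what the algorithm means: derror is the slope, i.e. how far y drifts for each step in x, and error accumulates that drift. Once error exceeds 0.5, y moves by one pixel and error drops by 1.
[Image: optmz1]
The float numbers in the table are the values of error.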
But what if I take just the first calculation out of the loop? I wanted to try, but maybe later.
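
To see how error evolves, here is a minimal sketch that runs the same loop but records each step instead of drawing. The names Step and trace_line are hypothetical (not from the lecture), and the sketch assumes a non-steep line with x0 < x1.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One row of the trace: the pixel that would be set, and the value of
// error after the update for that step.
struct Step {
    int x;
    int y;
    float error;
};

// Hypothetical helper (not from the lecture): runs the loop above for a
// non-steep line with x0 < x1 and records each step instead of drawing.
std::vector<Step> trace_line(int x0, int y0, int x1, int y1) {
    assert(x0 < x1);  // the loop assumes left-to-right order
    int dx = x1 - x0;
    int dy = y1 - y0;
    float derror = std::abs(dy / float(dx));  // computed once, outside the loop
    float error = 0;
    int y = y0;
    std::vector<Step> steps;
    for (int x = x0; x <= x1; x++) {
        int drawn_y = y;  // image.set(x, y, color) would be called here
        error += derror;
        if (error > .5f) {
            y += (y1 > y0 ? 1 : -1);
            error -= 1.f;
        }
        steps.push_back({x, drawn_y, error});
    }
    return steps;
}
```

For example, trace_line(0, 0, 4, 2) has a slope of 0.5, so y steps up on every second pixel while error alternates between 0.5 and 0.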

2. Remove the floating point

To remove the floating point, replace error with an integer variable equal to the original error multiplied by 2 × dx.
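
The scaling can be written out as follows. This is a sketch of the reasoning, assuming dx > 0 (the line runs left to right); e and Δe stand for error and derror in the float version, and the primed symbols for their integer replacements.

```latex
\begin{aligned}
e' &= 2\,dx \cdot e \\
\Delta e' &= 2\,dx \cdot \left|\frac{dy}{dx}\right| = 2\,|dy| \\
e > 0.5 &\iff e' > dx \\
e \leftarrow e - 1 &\iff e' \leftarrow e' - 2\,dx
\end{aligned}
```

Every quantity on the right-hand side is an integer, so the loop no longer needs float.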

int dx = x1-x0;
int dy = y1-y0;
int derror = std::abs(dy)*2;

int error = 0;  // integer now: holds the original error scaled by 2*dx
int y = y0;

for (int x=x0; x<=x1; x++) {
    if (steep) {
        image.set(y, x, color);
    } else {
        image.set(x, y, color);
    }
    error += derror;
    if (error > dx) {
        y += (y1>y0?1:-1);
        error -= dx*2;
    }
}
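
As a sanity check, the float and integer loops can be compared side by side. This is a minimal sketch: ys_float and ys_int are hypothetical helpers (not from the lecture) that return the y chosen for each x instead of drawing, and both assume a non-steep line with x0 < x1. On exact ties the float version may round differently, so the two are only expected to agree when the float arithmetic is exact.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <vector>

// Float version: error is compared against 0.5 and reduced by 1.
std::vector<int> ys_float(int x0, int y0, int x1, int y1) {
    assert(x0 < x1);
    int dx = x1 - x0;
    int dy = y1 - y0;
    float derror = std::abs(dy / float(dx));
    float error = 0;
    int y = y0;
    std::vector<int> ys;
    for (int x = x0; x <= x1; x++) {
        ys.push_back(y);
        error += derror;
        if (error > .5f) {
            y += (y1 > y0 ? 1 : -1);
            error -= 1.f;
        }
    }
    return ys;
}

// Integer version: every quantity is the float one multiplied by 2*dx.
std::vector<int> ys_int(int x0, int y0, int x1, int y1) {
    assert(x0 < x1);
    int dx = x1 - x0;
    int dy = y1 - y0;
    int derror2 = std::abs(dy) * 2;  // derror * 2 * dx
    int error2 = 0;                  // error * 2 * dx
    int y = y0;
    std::vector<int> ys;
    for (int x = x0; x <= x1; x++) {
        ys.push_back(y);
        error2 += derror2;
        if (error2 > dx) {  // same test as error > 0.5
            y += (y1 > y0 ? 1 : -1);
            error2 -= dx * 2;  // same step as error -= 1
        }
    }
    return ys;
}
```

With dx = 8 the slope 3/8 is exactly representable as a float, so ys_float(0, 0, 8, 3) and ys_int(0, 0, 8, 3) pick the same y for every x.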

The result:

%   cumulative   self              self     total 
time   seconds   seconds    calls  ms/call  ms/call  name 
42.77      0.91     0.91 204000000     0.00     0.00  TGAImage::set(int, int, TGAColor) 
30.08      1.55     0.64  3000000     0.00     0.00  line(int, int, int, int, TGAImage&, TGAColor) 
21.62      2.01     0.46 204000000     0.00     0.00  TGAColor::TGAColor(int, int) 
1.88      2.05     0.04        2    20.02    20.02  TGAColor::TGAColor(unsigned char, unsigned char, unsigned char, unsigned char) 


It runs much faster now: the cumulative time in the lecture's profile drops from about 4.27 s to about 2.05 s.