OpenBLAS 'slower' than FOR LOOP for Matrix Multiplication ? #1636

@cefengxu

I am comparing the speed of a traditional nested-loop method with OpenBLAS for matrix multiplication; however, the results I get are rather confusing.

Test code below:

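    #include <stdio.h>
    #include <time.h>
    #include <cblas.h>

    int main(void)
    {
        clock_t start, finish;
        int NUM = 100;
        double a1[3*2] = {  1, 2,  /* 3x2, CblasRowMajor */
                            3, 4,
                            5, 6
                    };
        double b[2*3] = {  7, 8, 9,   /* 2x3, CblasRowMajor */
                           1, 2, 3
                    };

        double c[3*3] = {0,0,0,0,0,0,0,0,0};

        start = clock();

        for (int k = 0; k < NUM; k++)
        {
            for (int i = 0; i < 3; i++)
            {
                for (int j = 0; j < 3; j++) {
                    double sum = 0.0;  /* overwrite c, matching beta = 0.0 below */
                    for (int p = 0; p < 2; ++p)
                        sum += a1[i*2 + p] * b[p*3 + j];
                    c[i*3 + j] = sum;
                }
            }
        }

        finish = clock();
        /* clock() counts ticks; divide by CLOCKS_PER_SEC to get seconds */
        printf("FOR LOOP : %f seconds\n", (double)(finish - start) / CLOCKS_PER_SEC);

        start = clock();

        for (int k = 0; k < NUM; k++)
        {
            /* M = 3, N = 3, K = 2 (a1 is 3x2, b is 2x3); lda = 2, ldb = 3, ldc = 3 */
            cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 3, 3, 2, 1.0, a1, 2, b, 3, 0.0, c, 3);
        }
        finish = clock();
        printf("OpenBLAS : %f seconds\n", (double)(finish - start) / CLOCKS_PER_SEC);

        return 0;
    }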
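
Since the `cblas_dgemm` call takes the matrix sizes and leading dimensions as plain integers, it may help to sanity-check them before timing anything. For a row-major call with `CblasNoTrans` on both inputs, A is M x K, B is K x N and C is M x N, and the leading dimensions must satisfy `lda >= K`, `ldb >= N` and `ldc >= N`. The sketch below encodes just those rules; the helper name `dgemm_rowmajor_notrans_args_ok` is hypothetical and not part of OpenBLAS.

```c
#include <stdbool.h>

/* Hypothetical helper (not an OpenBLAS function): checks the dimension
 * rules for a row-major cblas_dgemm call with CblasNoTrans on A and B.
 * A is m x k, B is k x n, C is m x n. */
bool dgemm_rowmajor_notrans_args_ok(int m, int n, int k,
                                    int lda, int ldb, int ldc)
{
    if (m < 0 || n < 0 || k < 0)
        return false;

    /* Each leading dimension must cover one full row and be at least 1. */
    int min_lda = k > 1 ? k : 1;   /* rows of A have k entries */
    int min_ldb = n > 1 ? n : 1;   /* rows of B have n entries */
    int min_ldc = n > 1 ? n : 1;   /* rows of C have n entries */

    return lda >= min_lda && ldb >= min_ldb && ldc >= min_ldc;
}
```

With the 3x2 and 2x3 matrices used here, `dgemm_rowmajor_notrans_args_ok(3, 3, 2, 2, 3, 3)` holds, whereas passing `K = 5` together with `lda = 2` does not, because a row of A would then need five entries.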

I ran the code on a PC and on Android and got the following results:

// run on PC ( build with gcc )

FOR LOOP : 0.012000 seconds
OpenBLAS : 0.186000 seconds

FOR LOOP : 0.008000 seconds
OpenBLAS : 0.130000 seconds

// run on ANDROID ( build with clang )

FOR LOOP : 0.006000 sec
OpenBLAS : 0.303000 sec

FOR LOOP : 0.012000 sec
OpenBLAS : 0.214000 sec
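
How such numbers should be read depends on how `clock()` ticks are converted: the C standard defines elapsed CPU seconds as the tick difference divided by `CLOCKS_PER_SEC`, which is 1000 on some platforms and 1000000 on others. A minimal sketch of a portable timing helper for the loop version follows; the function names `matmul_3x2_2x3` and `time_matmul_seconds` are hypothetical, and the repetition count is left to the caller so the measured interval can be made much larger than the clock resolution.

```c
#include <time.h>

/* Naive row-major product of a 3x2 matrix a and a 2x3 matrix b.
 * Overwrites the 3x3 result c, which matches dgemm with beta = 0. */
void matmul_3x2_2x3(const double *a, const double *b, double *c)
{
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            double sum = 0.0;
            for (int p = 0; p < 2; p++)
                sum += a[i * 2 + p] * b[p * 3 + j];
            c[i * 3 + j] = sum;
        }
    }
}

/* Hypothetical helper: CPU time in seconds used by `reps` products. */
double time_matmul_seconds(int reps, const double *a, const double *b,
                           double *c)
{
    clock_t start = clock();
    for (int k = 0; k < reps; k++)
        matmul_3x2_2x3(a, b, c);
    clock_t finish = clock();

    /* Convert ticks to seconds with the platform's own tick rate. */
    return (double)(finish - start) / CLOCKS_PER_SEC;
}
```

Dividing the returned time by `reps` gives a per-call figure, which is easier to compare against a single `cblas_dgemm` call than the total for a small fixed loop count.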
