
dot product is slow! #131

Closed
ashmat98 opened this issue Aug 29, 2019 · 2 comments

@ashmat98

I'm trying to reach the same performance as NumPy. With the code below I evaluate the speed of the dot product using xtensor-blas:

benchmark.cpp

#include <chrono>
#include <iostream>
#include "xtensor/xtensor.hpp"
#include "xtensor/xrandom.hpp"
#include "xtensor-blas/xlinalg.hpp"


// Multiplies A and B with xt::linalg::dot and prints the elapsed time.
xt::xtensor<double, 2> my_dot(xt::xtensor<double, 2> &A, xt::xtensor<double, 2> &B){
    auto start = std::chrono::steady_clock::now();

    xt::xtensor<double, 2> C = xt::linalg::dot(A, B);

    auto end = std::chrono::steady_clock::now();
    std::cout << std::chrono::duration<double, std::milli>(end - start).count()
              << " ms" << std::endl;
    return C;
}


int main(int argc, char *argv[])
{
    for (int i = 0; i < 3; i++){
        xt::xtensor<double, 2> A = xt::random::randn({1000, 2000}, 0., 1.);
        xt::xtensor<double, 2> B = xt::random::randn({2000, 2000}, 0., 1.);

        xt::xtensor<double, 2> C = my_dot(A, B);
    }
    return 0;
}

I compile it with the following command:

$ g++ benchmark.cpp -O3 -mavx2 -ffast-math \
 -I/home/--user--/install_path/include \
 -lblas -llapack \
 -DHAVE_CBLAS=1 \
 -o a.out

and I get

$ ./a.out
2395.9 ms
2415.98 ms
2461.93 ms

(install_path/include contains all the necessary headers, i.e. xtensor, xtensor-blas...)
When I multiply arrays of the same sizes in NumPy I get

$ python benchmark.py
76.17 ms
76.05 ms
79.00 ms

So this is much slower!
I installed BLAS and LAPACK with sudo apt-get install libblas-dev liblapack-dev.
System:

Ubuntu: 18.04
gcc: 7.4.0
latest versions of xtensor libraries

Could you please help me with this?

@ashmat98 (Author)

I installed libopenblas-dev instead of libblas-dev and got much faster computation. I don't know why the reference BLAS was slower than OpenBLAS (the reference BLAS was using a single thread), but I got what I needed. I'm closing the issue.
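
For reference, a minimal way to confirm that threading is the difference (just a sketch, assuming the cblas.h shipped with OpenBLAS, since openblas_get_num_threads is an OpenBLAS-only extension and not part of the reference BLAS):

// Sketch: print the number of threads OpenBLAS will use for calls like dgemm.
// Requires OpenBLAS's cblas.h; build with: g++ check_threads.cpp -lopenblas
#include <cstdio>
#include <cblas.h>

int main()
{
    // The thread count can be overridden at runtime via the
    // OPENBLAS_NUM_THREADS environment variable.
    std::printf("OpenBLAS threads: %d\n", openblas_get_num_threads());
    return 0;
}

On a multi-core machine this should print more than 1, which would explain most of the gap against a single-threaded reference BLAS.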

@wolfv (Member)

wolfv commented Aug 31, 2019

thanks for figuring it out :)

NumPy and xtensor-blas both use BLAS, so the performance should depend on which BLAS library you use. OpenBLAS and MKL are probably the fastest.
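
To illustrate (a rough sketch, not code from this thread): timing the underlying cblas_dgemm call directly on the same 1000x2000 by 2000x2000 problem takes xtensor out of the picture. If its timing matches xt::linalg::dot, the wrapper is not the bottleneck and the chosen BLAS backend alone determines the speed.

// Sketch: time the raw GEMM that xt::linalg::dot ultimately dispatches to,
// using the same matrix sizes as the benchmark above.
#include <chrono>
#include <iostream>
#include <vector>
#include <cblas.h>

int main()
{
    const int M = 1000, K = 2000, N = 2000;
    std::vector<double> A(M * K, 1.0), B(K * N, 1.0), C(M * N, 0.0);

    auto start = std::chrono::steady_clock::now();
    // C = 1.0 * A * B + 0.0 * C, row-major, no transposition
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                M, N, K, 1.0, A.data(), K, B.data(), N, 0.0, C.data(), N);
    auto end = std::chrono::steady_clock::now();

    std::cout << std::chrono::duration<double, std::milli>(end - start).count()
              << " ms" << std::endl;
    return 0;
}

Linked against the reference BLAS (-lblas) versus OpenBLAS (-lopenblas), this should reproduce roughly the same gap seen above, independent of xtensor-blas.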
