Linalg benchmark #3247

Merged

OXPHOS (Member) commented Jun 6, 2016

  • compare CPU.dot and GPU.dot of current linalg and refactored linalg
  • ref Linalg Refactor - CPU dot #3230
  • cannot use global variable: sg_linalg (segmentation fault 11)

OXPHOS (Member, Author) commented Jun 6, 2016

The only build command that works for me is:
clang++ -O3 -std=c++11 ../benchmarks/linalg_refactor_benchmark.cpp -I/usr/include/eigen3 -lshogun -lhayai_main -framework OpenCL -o benchmark

Neither g++ nor -lOpenCL worked for me.

With -lOpenCL, the error was that OpenCL could not be found.
With g++ and -framework OpenCL, I got this error: https://gist.github.com/OXPHOS/dc59556267bc24614ebceb0af9ba8218

lambday (Member) commented Jun 6, 2016

Hi @OXPHOS

The point of the benchmark is to compare the performance of a direct library function call against a Shogun linalg method call, NOT the CPU backend against the GPU backend. The two benchmarks should look something like this:

  • Create an SGVector and a CPUVector from it. Then compare (a) creating an Eigen3 map directly and calling Eigen3 dot with (b) calling Shogun linalg dot on the CPUVector, which uses the Eigen3 backend.
  • Create an SGVector and a GPUVector from it. Then compare (a) calling ViennaCL inner_prod on the ViennaCL vector with (b) calling Shogun linalg dot on the GPUVector.

Run this experiment with vectors of size 1000, 10000, 100000 and so on. We want to check whether using our version of dot has any overhead, and if yes, how much. See the point?
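
The comparison described above can be sketched without either library. This is a minimal illustration, not Shogun's code: `BaseVector`, `CPUVector`, `direct_dot`, `wrapped_dot`, and `seconds_per_call` are hypothetical names, and `std::inner_product` stands in for the Eigen3 or ViennaCL call. It only shows the shape of the measurement: the same kernel timed once directly and once through a wrapper reached via base-class pointers.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-ins; these are NOT Shogun's actual classes.
template <typename T>
struct BaseVector
{
    virtual ~BaseVector() = default;
    virtual const T* data() const = 0;
    virtual std::size_t size() const = 0;
};

template <typename T>
struct CPUVector : BaseVector<T>
{
    // Non-owning view, so wrapping does not copy the underlying buffer.
    CPUVector(const T* ptr, std::size_t len) : m_ptr(ptr), m_len(len) {}
    const T* data() const override { return m_ptr; }
    std::size_t size() const override { return m_len; }

    const T* m_ptr;
    std::size_t m_len;
};

// (a) Direct call. std::inner_product stands in for the library's dot.
template <typename T>
T direct_dot(const T* a, const T* b, std::size_t n)
{
    return std::inner_product(a, a + n, b, T(0));
}

// (b) Wrapper call through base pointers (assumed shape of the interface).
template <typename T>
T wrapped_dot(const BaseVector<T>* a, const BaseVector<T>* b)
{
    assert(a->size() == b->size());
    return direct_dot(a->data(), b->data(), a->size());
}

// Average wall-clock seconds per call over `reps` repetitions.
template <typename F>
double seconds_per_call(F&& f, int reps)
{
    volatile double sink = 0.0; // keeps the calls from being optimized away
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i)
        sink = sink + f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count() / reps;
}
```

To follow the suggestion above, one would call `seconds_per_call` on both variants for each size (1000, 10000, 100000, and so on) and compare the two timings. A dedicated harness such as hayai handles the repetition and reporting more carefully than this hand-rolled timer.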

There might be an issue with your OpenCL installation. It should not be too hard to fix.

{
index_t n = num_rows;
std::vector<int32_t> mem(n);
std::iota(mem.data(), mem.data() + n, begin);
Review comment (Member):

Why create std::vector first?
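
One reading of this question is that the range could be written straight into the destination buffer, skipping the temporary. A minimal sketch under that assumption follows; `fill_range` is a hypothetical helper, and it assumes the destination exposes a raw pointer and a length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>

// Hypothetical helper: writes begin, begin + 1, ... into dst[0..n)
// directly, with no intermediate std::vector.
inline void fill_range(std::int32_t* dst, std::size_t n, std::int32_t begin)
{
    assert(dst != nullptr || n == 0);
    std::iota(dst, dst + n, begin);
}
```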

OXPHOS (Member, Author) commented Jun 6, 2016

(not sure where to put this)
graph:
https://drive.google.com/file/d/0B6YkOSoC_E5raktfWFRweXgwa1U/view
x-axis: vector size, ranging from 10 to 10^7.
Left panel: Eigen3 dot product, explicit call (blue) vs. CPUVector wrapper (black).
Right panel: ViennaCL dot product, explicit call (blue) vs. GPU_Vector wrapper (black).
GPU_Vector with GPUBackend is extremely slow.

lambday (Member) commented Jun 7, 2016

Hi @OXPHOS!

Great job with the benchmarks :) This is exactly what we were looking for. Interesting issues will be visible from these benchmarks.

Let me comment on the code first and then we'll discuss

Ac = init_c(A);
Bc = init_c(B);
Ag = init_g(A);
Bg = init_g(B);
Review comment (Member):

indentation :)

lambday (Member) commented Jun 7, 2016

Hey there! Please check the comments. We need to make sure that no extra copying is happening under the hood. The dot interface we designed is supposed to work with base vector pointers, so that is exactly what the test cases should deal with. Maybe changing those to pointers has an effect on the result?

Good job by the way! Also, please check the coding style for shogun. It is better to follow a uniform style everywhere.

Let me know when you fix the issues. It would be interesting to see how it performs on other machines.


SGVector<value_type> A;
SGVector<value_type> B;
CPUVector<value_type> Ac;
Review comment (Member):

Just have a BaseVector<value_type>* Ac here. Same for the others. They should be allocated on the heap.
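
A fixture along these lines might look like the sketch below. Everything in it is an assumption made for illustration: `BaseVector` and `CPUVector` are simplified stand-ins rather than Shogun's classes, `std::vector` stands in for `SGVector`, `DotFixture` is a hypothetical name, and `std::unique_ptr` is used in place of the raw `BaseVector<value_type>*` so that the heap allocation is released automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Simplified stand-ins; NOT Shogun's actual classes.
template <typename T>
struct BaseVector
{
    virtual ~BaseVector() = default;
    virtual const T* data() const = 0;
    virtual std::size_t size() const = 0;
};

template <typename T>
struct CPUVector : BaseVector<T>
{
    // Non-owning view over an existing buffer (no copy).
    CPUVector(const T* ptr, std::size_t len) : m_ptr(ptr), m_len(len) {}
    const T* data() const override { return m_ptr; }
    std::size_t size() const override { return m_len; }

    const T* m_ptr;
    std::size_t m_len;
};

// Hypothetical fixture: members are base pointers to heap-allocated
// wrappers, so the benchmark exercises the pointer-based interface.
struct DotFixture
{
    using value_type = double;

    void SetUp(std::size_t n)
    {
        A.assign(n, 1.0);
        B.assign(n, 2.0);
        Ac.reset(new CPUVector<value_type>(A.data(), A.size()));
        Bc.reset(new CPUVector<value_type>(B.data(), B.size()));
        // The wrappers must view the original storage, not a copy.
        assert(Ac->data() == A.data());
        assert(Bc->data() == B.data());
    }

    void TearDown()
    {
        Ac.reset();
        Bc.reset();
    }

    std::vector<value_type> A, B; // stand-in for SGVector<value_type>
    std::unique_ptr<BaseVector<value_type>> Ac, Bc;
};
```

The pointer comparison in `SetUp` is one cheap way to check that no extra copy happens when the wrapper is built.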

lambday (Member) commented Jun 7, 2016

benchmark code, travis check not needed. merging.

@lambday lambday merged commit ca09094 into shogun-toolbox:feature/linalg_refactor Jun 7, 2016