See https://gist.github.com/karlnapf/5722391 to reproduce
The statement `from numpy.random import randn,randint` in the gist is causing compile errors.
code fix for Issue #1158
According to gdb, the segfault occurs in shogun/src/shogun/regression/LeastAngleRegression.cpp, line 332:
float64_t diag_k = cblas_ddot(X.num_rows, X.get_column_vector(i_max_corr), 1,
This .cpp file is the implementation of LARS. The cblas_ddot function calculates the dot product of two vectors. Looking closely, the N parameter of cblas_ddot (which should be the number of elements in each vector) is passed X.num_rows (the number of rows of the X matrix), which is wrong. To calculate the dot product properly we need to pass the actual number of attributes (N), which in this case should be X.num_cols (the number of columns of the X matrix).
In this example the segfault occurs because X has more rows (the dim variable in the Python code) than columns (the n variable in the Python code). In that case, where dim > n (rows > columns), cblas_ddot will try to read through the vector pointers (double *X, *Y) beyond the allocated memory (N cannot be larger than the actual number of elements in each vector, which is the number of columns of the X matrix).
If you try the same Python code with values where dim < n (e.g. dim <= 250), the script will work. Even though it works with dim < n, I think it is not correct, because the dot product is not calculated properly.
I will make a pull request with the corresponding code fix. (Please correct me if I misunderstood something.)
So this line is basically computing the dot product of a column of the matrix with itself. I understand your explanation. However, since the number of elements in a column of a matrix is actually the number of rows, I think I am missing something here.
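To make the question concrete, here is a minimal sketch of a column dot product. It assumes the matrix is stored column-major, so that each column is a contiguous run of num_rows values. `ddot` and `column_offset` are hypothetical pure-Python stand-ins written for illustration; they are not the BLAS or Shogun API.

```python
def ddot(n, x, x_start, incx, y, y_start, incy):
    """Strided dot product over n elements (hypothetical stand-in for a BLAS ddot)."""
    return sum(x[x_start + i * incx] * y[y_start + i * incy] for i in range(n))


def column_offset(col, num_rows):
    # Assumption: column-major storage, so column `col` starts at col * num_rows.
    return col * num_rows


num_rows, num_cols = 3, 2
# Two columns, [1, 2, 3] and [4, 5, 6], stored back to back in one flat buffer.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

start = column_offset(1, num_rows)
# Dot product of column 1 with itself, reading num_rows elements.
diag_k = ddot(num_rows, data, start, 1, data, start, 1)
print(diag_k)  # 4^2 + 5^2 + 6^2 = 77.0
```

Under this assumption, passing num_cols as the length would read only the first two entries of the column (giving 41.0), so num_rows is the length that matches a column.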
Fixed in #1893
I also just tried the original Python code and there are no more segfaults. Nice!