LARS Segfault in train method #1158

Closed
karlnapf opened this Issue Jun 6, 2013 · 5 comments

5 participants

@lisitsyn lisitsyn was assigned Jun 6, 2013
@pranet
pranet commented Feb 15, 2014

The gist is missing the statement
from numpy.random import randn, randint
which causes errors when the script is run.

https://gist.github.com/pranet/9019380

@alamages

According to gdb, the segfault occurs in shogun/src/shogun/regression/LeastAngleRegression.cpp, line 332:
float64_t diag_k = cblas_ddot(X.num_rows, X.get_column_vector(i_max_corr), 1,
X.get_column_vector(i_max_corr), 1);

This .cpp file is the implementation of LARS. The cblas_ddot function computes the dot product of two vectors. Looking closely, the N parameter of cblas_ddot (which should be the number of elements in each vector) is given X.num_rows (the number of rows of the matrix X), which is wrong. To compute the dot product properly we need to pass the actual number of elements (N), which in this case should be X.num_cols (the number of columns of X).

In this example the segfault occurs because X has more rows (the dim variable in the Python code) than columns (the n variable in the Python code). When dim > n (rows > columns), cblas_ddot will try to read through the vector pointers (double *X, *Y) beyond the allocated memory, because N cannot be larger than the actual number of elements in the vectors, which is the number of columns of X.
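To make the role of N concrete, here is a minimal sketch of a strided dot product that takes its arguments in the same order as cblas_ddot(N, X, incX, Y, incY). The name ref_ddot is hypothetical and this is not the BLAS implementation; it handles only positive strides and is meant to show which memory locations get read.

```cpp
#include <cassert>

// Hypothetical reference routine (NOT the BLAS implementation): a strided dot
// product with the same argument order as cblas_ddot(N, X, incX, Y, incY).
// This sketch assumes positive strides only.
double ref_ddot(int n, const double* x, int incx, const double* y, int incy)
{
    assert(incx > 0 && incy > 0);
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
    {
        // Element i of each vector is read at offset i * stride, so the last
        // read happens at offset (n - 1) * stride from each pointer.
        sum += x[i * incx] * y[i * incy];
    }
    return sum;
}
```

With a stride of 1, the routine reads exactly n consecutive doubles from each pointer, so a read past the end of the buffer can only happen if n is larger than the number of elements actually stored behind that pointer.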

If you run the same Python code with values where dim < n (e.g. dim <= 250), the script works. Even so, I think the result is not correct, because the dot product is not computed properly.

I will make a pull request with the corresponding fix. (Please correct me if I have misunderstood something.)

@iglesias
Contributor

So this line is basically computing the dot product of the column of a matrix with itself. I understand your explanation. However, since the number of elements in the column of a matrix is actually the number of rows, I think I am missing something here.
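The point about column length can be sketched as follows. This is only an illustration: ColMajorMatrix and column_self_dot are hypothetical names, not Shogun types, and the sketch assumes the matrix is stored column by column (column-major), so that each column is a contiguous run of num_rows values.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a dense matrix; assumes column-major storage,
// i.e. data holds column 0, then column 1, and so on.
struct ColMajorMatrix
{
    int num_rows;
    int num_cols;
    std::vector<double> data; // expected size: num_rows * num_cols

    // Pointer to the first element of column j.
    const double* column(int j) const
    {
        assert(j >= 0 && j < num_cols);
        return data.data() + static_cast<std::size_t>(j) * num_rows;
    }
};

// Dot product of column j with itself. A column holds num_rows elements,
// so num_rows is the element count to iterate over.
double column_self_dot(const ColMajorMatrix& X, int j)
{
    const double* c = X.column(j);
    return std::inner_product(c, c + X.num_rows, c, 0.0);
}
```

For a 3x2 matrix with columns (1, 2, 3) and (4, 5, 6), column_self_dot gives 14 for column 0 and 77 for column 1, and every read stays inside the buffer as long as the column index is valid.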

@alamages

@iglesias Sorry, my bad. I was hasty and overlooked that it is the dot product of a column vector (not a row vector). I will look at it again.

@karlnapf
Member

Fixed in #1893
I also just tried the original python code and no more segfaults. Nice!

@karlnapf karlnapf closed this Feb 27, 2014