Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SGVector cleanup. #2582

Open
iglesias opened this issue Oct 27, 2014 · 23 comments
Open

SGVector cleanup. #2582

iglesias opened this issue Oct 27, 2014 · 23 comments

Comments

@iglesias
Copy link
Collaborator

Remove all the methods producing clutter in SGVector. This includes (although it is not limited to) math operations like linspace, dot, and other operations like find. The objective is to have an SGVector as lean as possible.

Systematically, the work to be done for each method reduced is:

  1. Move the method from SGVector to another class, like for instance to CMath or to whichever it makes sense the most.
  2. Refactor Shogun's internal code.
  3. Refactor examples (libshogun and interface examples, if any).
  4. In case the method already had unit tests in SGVector, then refactor them (and move them). Otherwise, write some good unit tests.

See this pull request to gain understanding of what this task consists of, #2579.

@iglesias
Copy link
Collaborator Author

I forgot to mention that it is important to do this separately, both to ease the review and the cleanup process. That means that the best practice would be to send small and individual pull requests for each method.

@iglesias
Copy link
Collaborator Author

See the wiki for more motivation why we want to do this.

@sanuj
Copy link
Contributor

sanuj commented Dec 8, 2014

@iglesias, I would like to work on this. Shall I start by moving dot to CMath? Should I do more changes in the first pull request?

@iglesias
Copy link
Collaborator Author

iglesias commented Dec 8, 2014

According to what I wrote above, moving dot to CMath sounds good :-)

I think that doing that one alone in a first pull request is good.

@sanuj
Copy link
Contributor

sanuj commented Dec 13, 2014

@iglesias I'll leave the functions related to linear algebra (we might want to move them to linalg). So I'll leave functions like add, sum, norm, add_scalar, product, vector_multiply etc for now while we are working on linalg NATIVE implementations. Should I meanwhile move functions like max, min, mean, arg_max, arg_min, max_abs to CMath? Don't know where to put linspace and sorting functions.

@iglesias
Copy link
Collaborator Author

Sure. In my opinion, max, min, argmax, argmin, and linspace make sense in CMath. Mean might make more sense in CStatistics. About sorting I am not sure, I leave it up to you :-) But there might be already some sorting implemented in CMath (iirc).

@sanuj
Copy link
Contributor

sanuj commented Dec 15, 2014

I am a bit confused. As far as I know, the declaration and definition of a template should be kept in one single file, but if we take the case of SGVector, then it has a header file and a cpp file. How is this working out?

@iglesias
Copy link
Collaborator Author

We do a small trick. Look at the end of SGVector.cpp. There, we have defined the classes for which SGVector can be templated (which are basically primitive data types).

@sanuj
Copy link
Contributor

sanuj commented Dec 17, 2014

There is SGVector::linspace_vec() which returns a vector, SGVector::linspace() which returns a float64_t * and depends on CMath::linspace which returns void (passes the output through an input parameter pointer). None of them are used anywhere apart from SGVector and CMath. What to do with these?

@iglesias
Copy link
Collaborator Author

If they are not used anywhere, I'd say we can drop them. We can for sure drop the ones in SGVector. The CMath one can stay.

@sanuj
Copy link
Contributor

sanuj commented Dec 22, 2014

There are three functions related to sorting in SGVector: qsort(), argsort() and is_sorted()
qsort() looks like this

template <class T>
void SGVector<T>::qsort()
{
    CMath::qsort<T>(vector, vlen);
}

So if we move qsort to CMath then we will have to pass the vector as an argument or we can remove SGVector<T>::qsort entirely and use the one mentioned in the above patch but in this case we'll need to pass the length as well.

Should I also move argsort? (moving is_sorted() doesn't make sense to me)

@iglesias
Copy link
Collaborator Author

In my opinion, void CMath::qsort(SGVector<T>), SGVector<index_t> CMath::argsort(SGVector), and bool CMath::is_sorted(SGVector<T>) make sense.

@curiousguy13
Copy link
Contributor

@iglesias would it be fine , if I moved range_fill to CMath ?

@iglesias
Copy link
Collaborator Author

@curiousguy13, it sounds good to me, both range_fill and range_fill_vector should go out of SGVector. Do something nice with them (perhaps, simplifying to only one method) and move to CMath. Keep in mind unit tests. Looking forward to see the pull request.

@lambday
Copy link
Member

lambday commented Feb 18, 2015

@iglesias I think range_fill, zeros, these are perfect to be shifted to linalg instead.

@iglesias
Copy link
Collaborator Author

It sounds good!
On 18 Feb 2015 10:26, "Soumyajit De" notifications@github.com wrote:

@iglesias https://github.com/iglesias I think range_fill, zeros, these
are perfect to be shifted to linalg instead.


Reply to this email directly or view it on GitHub
#2582 (comment)
.

@Hephaestus12
Copy link
Contributor

I would like to work on this. Shall I start by moving add to CMath?

@gf712
Copy link
Member

gf712 commented Mar 22, 2020

I would like to work on this. Shall I start by moving add to CMath?

You shouldn’t move to CMath but to linalg instead. However linalg already has add. But it would be good to remove add from SGVector. AFAIK CMath (now Math) should be dropped.

@Hephaestus12
Copy link
Contributor

This is for the method add() in SGVector.
I don't think simply removing it and replacing its calls with the add method in the Linalg framework will help. For the following reasons:

In file: src/shogun/mathematics/linalg/backend/eigen/BasicOps.cpp :

#define BACKEND_GENERIC_IN_PLACE_ADD(Type, Container)                          \
	void LinalgBackendEigen::add(                                              \
	    const Container<Type>& a, const Container<Type>& b, Type alpha,        \
	    Type beta, Container<Type>& result) const                              \
	{                                                                          \
		add_impl(a, b, alpha, beta, result);                                   \
	}

add() calls add_impl()

In file: src/shogun/mathematics/linalg/backend/eigen/BasicOps.cpp :

template <typename T>
void LinalgBackendEigen::add_impl(
    const SGVector<T>& a, const SGVector<T>& b, T alpha, T beta,
    SGVector<T>& result) const
{
	typename SGVector<T>::EigenVectorXtMap a_eig = a;
	typename SGVector<T>::EigenVectorXtMap b_eig = b;
	typename SGVector<T>::EigenVectorXtMap result_eig = result;

	result_eig = alpha * a_eig + beta * b_eig;
}

add_impl() uses the operator+

In file: src/shogun/lib/SGVector.cpp :

/** addition operator */
template<class T>
SGVector<T> SGVector<T>::operator+ (SGVector<T> x)
{
	assert_on_cpu();
	require(x.vector && vector, "Addition possible for only non-null vectors.");
	require(x.vlen == vlen, "Length of the two vectors to be added should be same. [V({}) + V({})]", vlen, x.vlen);

	SGVector<T> result=clone();
	result.add(x);
	return result;
}

The operator+ uses SGVector's add() method.

So indirectly, even linalg's add() method uses SGVector's add() method.
Will we need to entirely refactor the linalg framework code for this?

@gf712
Copy link
Member

gf712 commented Mar 23, 2020

hmm, I think you are confusing things. operator+ in linalg is from Eigen, which uses SIMD instructions where possible.
result_eig = alpha * a_eig + beta * b_eig; is written with Eigen types. See the definition of EigenVectorXtMap in SGVector

@Hephaestus12
Copy link
Contributor

Is there a method in linalg to add a dense vector to a sparse vector and/or add two sparse vectors?

@Hephaestus12
Copy link
Contributor

Is there a method in linalg to add a dense vector to a sparse vector and/or add two sparse vectors?
Or will we need to write one ourselves?
@gf712

@karlnapf
Copy link
Member

don't think there is atm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants