
Linalg - Vector Class #3277

Closed
wants to merge 8 commits

Conversation

OXPHOS
Member

@OXPHOS OXPHOS commented Jun 9, 2016

No description provided.

@@ -33,6 +33,8 @@

#include <shogun/mathematics/linalg/SGLinalg.h>

#include<iostream>
Member

This should not be required :)

@@ -54,7 +54,7 @@ struct BaseVector
}

/** Data Storage */
virtual bool onGPU() = 0 ;
virtual bool onGPU() const = 0 ;
Member

Forgive my ignorance, but if we have CPUVector and GPUVector, why do we need an onGPU function? The class itself could identify the storage, or?

Member

Well, this was decided because a dynamic cast is thought to be slower than one virtual call plus a static cast. But I see what you mean. We probably won't need two different classes; a single class should suffice.

Member

@OXPHOS what was the issue again with putting both these thing inside a single class? I think it is doable.

Member

Well, instead of dynamic casting, see the SGObject get_name virtual function. That's a more general way to identify the type of an object without dynamic casting.

Member

Okay, thanks. But maybe we won't have to do the casting at all, if we can do what you discussed earlier in the gist @OXPHOS made: keep both in the same class and call the onGPU method. I can't seem to remember what was stopping us from doing that.

Member

yeah that would be more ideal indeed.

@OXPHOS OXPHOS changed the title Linalg - Vector Convert Method Linalg - Vector Class Jun 13, 2016
@OXPHOS
Member Author

OXPHOS commented Jun 13, 2016

ref #3278
The vector class implementation, based on my understanding.

So the user will convert SGVector -> LinalgVector for both CPU and GPU vectors as the first step.
The vector can be transferred to GPU if the user explicitly calls LinalgVector.transfer_to_GPU().

I'll check the comments on #3278 for more details tomorrow, as it is kinda late now.
@karlnapf @lambday @vigsterkr


#ifdef HAVE_CXX11
#ifdef HAVE_VIENNACL
#include <viennacl/vector.hpp>
Member Author

@OXPHOS OXPHOS Jun 13, 2016

This class should be able to transfer vectors from CPU to GPU via multiple GPU libraries (and use whichever GPU library the user assigns) in the future.

private:
alignas(CPU_CACHE_LINE_SIZE) bool m_onGPU;
alignas(CPU_CACHE_LINE_SIZE) index_t m_len; // same logic
alignas(CPU_CACHE_LINE_SIZE) T* m_data; // non-owning ptr, referring to the SGVector.vector
Member

The comment // non-owning ptr, referring to the SGVector.vector is no longer valid, since this is now a deep copy. Can we please have a shared_ptr for m_data here instead?

@lambday
Member

lambday commented Jun 14, 2016

Hi @OXPHOS !

Looks good. Please check my comments. Also, please squash all commits together. We should be able to merge this soon!

@lambday
Member

lambday commented Jun 14, 2016

Also, please add unit tests for all the methods you added, especially transferToCPU, transferToGPU, the cast operators, etc. Make sure everything is clean under valgrind as well :)

@OXPHOS
Member Author

OXPHOS commented Jun 14, 2016

There's still a problem: say we have both CPU and GPU storage for the same vector. Every time we extract the Vector, we update it based on the GPU data. However, if someone transferred the vector to GPU but then had the elementwise.product done on CPU, the GPU copy is actually behind the CPU one. In that case, the current SGVector<T>() -> transfertoCPU() path doesn't make sense at all.

I can think of: 1) if onGPU() == true and the calculation is done on CPU, synchronize the GPU vector every time, which will be slow; or 2) ask the user to transfer/synchronize GPU/CPU data before using the Vector instance to initialize an SGVector.

@lambday
Member

lambday commented Jun 14, 2016

Aren't we performing the operations on GPU whenever onGPU returns true, by design? If you check the dot method you implemented (other methods will be implemented in a similar fashion), there won't be any case where GPU data is present but the operation is performed on CPU. Does that solve the issue?

Vector<T>::Vector(SGVector<T> const &vector)
{
init();
m_data = std::shared_ptr<T>(reinterpret_cast<T*>(SG_MALLOC(aligned_t, vector.vlen)), free);
Member

Please use SG_FREE instead of free, since you used SG_MALLOC :)

}

template <class T>
viennacl::const_entry_proxy<T> Vector<T>::GPUVectorImpl::operator[](index_t index) const
Member

This is a dangerous operation, as it is expensive. Remove it.
For unit tests, we can just transfer the full structure back to main memory.
There is no need to access single elements of GPU structures; just do linalg operations on them.

@OXPHOS OXPHOS closed this Jun 22, 2016

4 participants