New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/factory #4198
Feature/factory #4198
Conversation
e8ca7d1
to
822504d
Compare
@@ -11,7 +11,7 @@ | |||
"Comment": "//$comment\n", | |||
"Init": { | |||
"Construct": "auto $name = some<C$typeName>($arguments)$kwargs", | |||
"Copy": "auto $name = $expr$kwargs", | |||
"Copy": "auto $name = wrap($expr)$kwargs", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lisitsyn re-introduced wrap since we will get memory errors otherwise (things are not ref'ed and then deleted by objects)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit strange. Copy ctor doesn't ref?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point here is if you do
'labels = svm->precict'
then Shogun returns a raw pointer. Now if you pass this to another object that increases the ref-counter, and then decreases it, the object is destroyed. So calling a method on it subsequently crashes. This is why I re-added it, just like when you introduced it in the first place.
Or you mean something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I've got it now. I guess we change it once we get rid of raw pointers.
@@ -2,6 +2,7 @@ STRING(REGEX REPLACE "(.*)/narray.*$" "\\1" NARRAY_PATH ${NARRAY_LIB}) | |||
|
|||
LIST(APPEND EXCLUDED_RUBY_META_EXAMPLES | |||
base_api-put_get | |||
base_api-factory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vigsterkr same problem as for put: the check for SGMatrix/object bails
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(so I blacklisted it for now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but that should be fixed in develop, so rebase over develop, otherwise i dont understand what's the problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in develop? I don't remember us fixing that SWIG bug? Did I miss it?
#4177
File features_file = csv_file("../../data/classifier_binary_2d_nonlinear_features_train.dat") | ||
Features features_from_file = features(features_file, enum EPrimitiveType.PT_FLOAT64) | ||
|
||
File labels_file = csv_file("../../data/classifier_binary_2d_nonlinear_labels_train.dat") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
labels from file will be CDenseLabels
which are then converted in the classes (needs some refactor in algorithms, see below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you have just added the factory.sg could you change the relative path (../../
) to an env var? i'm happy to add the support for that for every language 💃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as if we install every these meta examples, the location of data should be somewhere very else :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep absoltely agree. But I think maybe let's do that for all meta examples in another PR?
I can do that next
if (!object) | ||
{ | ||
SG_SERROR("No such class %s", name); | ||
SG_SERROR( | ||
"Class %s with primitive type %s does not exist.\n", name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we need std::iostream for our logs, then we could just overload the << operator and dont have to introduce all these to_string methods ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
although it looks shitty but operator <<
will not solve your problem to convert some objects to string.. that only works with primitive types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on the other hand having these macros is nice in a long run as we could switch finally our logging backend. maybe spdlog? that allows us to redirect the logs into any sink
not only stdout/err
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know but I can overload the << thing, no? Then I dont need the function calls when I write error messages, but can collect them somewhere else?
Totally up for doing a new logging backend. GSoC project part?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new logging backend = just change the macro definitions.
yeah you can offload the operator << but that means that you force to use std::iostream
-like interface to be used, which is not the most liked IO back-end (for good reasons) in c++ community
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on the other hand you can do the same thing, i.e. implement to_string() for that object where you wanna have operator <<, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can introduce an std::cout-like object which supports <<. Internally it could use any logger we want.
src/shogun/classifier/svm/LibSVM.cpp
Outdated
@@ -48,7 +48,7 @@ bool CLibSVM::train_machine(CFeatures* data) | |||
struct svm_node* x_space; | |||
|
|||
ASSERT(m_labels && m_labels->get_num_labels()) | |||
ASSERT(m_labels->get_label_type() == LT_BINARY) | |||
auto binary_labels = m_labels->as_binary(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lisitsyn @vigsterkr this is how labels could be converted in algorithms, if they need an interface to CBinaryLabels for example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
template<class ST> CSparseFeatures<ST>::CSparseFeatures(CDenseFeatures<ST>* dense) | ||
template <class ST> | ||
CSparseFeatures<ST>::CSparseFeatures(CDenseFeatures<ST>* dense) | ||
: CDotFeatures(0), feature_cache(NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was causing a segfault in python legacy examples every now and then
src/shogun/labels/BinaryLabels.cpp
Outdated
|
||
using namespace shogun; | ||
|
||
CBinaryLabels::CBinaryLabels() : CDenseLabels() | ||
{ | ||
} | ||
|
||
CBinaryLabels::CBinaryLabels(const CBinaryLabels& orig) : CDenseLabels(orig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave labels copy constructors to the sake of converting them, see as_binary
below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need a copy? i mean it's over the same datastruct no? do we really want the as_binary
be a copy? dont we wanna use c++11's feature of std::move and rvalue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I thought about this. The reason why I made it a new instance was the fact that in the as_binary
below, all other cases return a new instance, so to be consistent.
And also then the as_binary
method can be const. On the other hand, as_binary
could return const pointer ...
If you think it is better to just cast, I can do that. Not sure I get what you mean with the move and rvalue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not so sure if i get what this has to do with constness...
i'm talking about the fact that instead of copying the label, what you want is the same memory/object & everything just as a different class, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so then use rvalue/move instead of this pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do you want to deal with converting say multiclass labels (that only contain two classes) to binary? For this, the actual vector needs to be changed, so a new instance is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For converting the same datastruct, I see the point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fine to copy BinaryLabels most of the time. What should be rather copy-on-write is the data beneath.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would actually prefer if this thing gave you a read-only labels object (we wanna have immutable labels anyways)
Note that memory is not copied here.
Also, after reading on it, I think the move constructor stuff doesnt really help here
@@ -77,15 +83,15 @@ void CDenseLabels::set_labels(SGVector<float64_t> v) | |||
m_labels = v; | |||
} | |||
|
|||
SGVector<float64_t> CDenseLabels::get_labels() | |||
SGVector<float64_t> CDenseLabels::get_labels() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made lots of getters const
|
||
virtual const char* get_name() const override | ||
{ | ||
return "DenseLabels"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lisitsyn made the class non-abstract, for the sake of loading from files (one doenst know what type of labels it is at that point, only when the algorithm tries to convert to say binary labels we know)
src/shogun/labels/Labels.cpp
Outdated
{ | ||
return m_current_values; | ||
} | ||
|
||
Some<CBinaryLabels> CLabels::as_binary() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mentioned conversion method to convert label types into each other
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above regarding move/rvalue
@@ -927,7 +927,7 @@ void SGVector<T>::scale(T alpha) | |||
|
|||
template<class T> void SGVector<T>::load(CFile* loader) | |||
{ | |||
REQUIRE(loader, "Require a valid 'c FILE pointer'\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great error message for users
@@ -66,7 +67,7 @@ struct CQuadraticTimeMMD::Self | |||
SGVector<float64_t> gamma_fit_null(); | |||
|
|||
CQuadraticTimeMMD& owner; | |||
unique_ptr<CMultiKernelQuadraticTimeMMD> multi_kernel; | |||
shared_ptr<CMultiKernelQuadraticTimeMMD> multi_kernel; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meta examples now use some
so cannot have a unique ptr inside shogun and expose the raw pointer via SWIG
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but this way you loose:
- std::move the thing around - maybe not so important
- enforce sole ownership of the resource.... @lambday was this intentional to have it as unique? i would presume yes....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed with him, he was fine with this change. Alternative would be to always return a new instance in the ::multiclass()
method (which currently returns the raw pointer)
return result; | ||
} | ||
|
||
CLabels* labels(CFile* file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vigsterkr @lisitsyn this is the stuff behind the factory API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the approach. It would become the main interface.
If u need to have new mem of course u need a new instance but thats not always the case. Hence have both of them and the compiler will figure it out for u
… On 7 Mar 2018, at 17:22, Heiko Strathmann ***@***.***> wrote:
@karlnapf commented on this pull request.
In src/shogun/labels/BinaryLabels.cpp:
>
using namespace shogun;
CBinaryLabels::CBinaryLabels() : CDenseLabels()
{
}
+CBinaryLabels::CBinaryLabels(const CBinaryLabels& orig) : CDenseLabels(orig)
For converting the same datastruct, I see the point
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Btw now thinking of this i’m not so sure we need this things explicit... as it makes the whiole thing less smooth.... afaik you could have the whole thing implicit with the right ctors
… On 7 Mar 2018, at 17:22, Heiko Strathmann ***@***.***> wrote:
@karlnapf commented on this pull request.
In src/shogun/labels/BinaryLabels.cpp:
>
using namespace shogun;
CBinaryLabels::CBinaryLabels() : CDenseLabels()
{
}
+CBinaryLabels::CBinaryLabels(const CBinaryLabels& orig) : CDenseLabels(orig)
For converting the same datastruct, I see the point
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
src/shogun/classifier/svm/LibSVM.cpp
Outdated
@@ -48,7 +48,7 @@ bool CLibSVM::train_machine(CFeatures* data) | |||
struct svm_node* x_space; | |||
|
|||
ASSERT(m_labels && m_labels->get_num_labels()) | |||
ASSERT(m_labels->get_label_type() == LT_BINARY) | |||
auto binary_labels = m_labels->as_binary(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
src/shogun/labels/BinaryLabels.cpp
Outdated
|
||
using namespace shogun; | ||
|
||
CBinaryLabels::CBinaryLabels() : CDenseLabels() | ||
{ | ||
} | ||
|
||
CBinaryLabels::CBinaryLabels(const CBinaryLabels& orig) : CDenseLabels(orig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fine to copy BinaryLabels most of the time. What should be rather copy-on-write is the data beneath.
src/shogun/lib/DataType.h
Outdated
* @param pt primitive type | ||
* @return string for primitive type | ||
*/ | ||
std::string ptype(EPrimitiveType pt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like primitive_name
a bit better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will update
return result; | ||
} | ||
|
||
CLabels* labels(CFile* file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the approach. It would become the main interface.
We could also have a constructor in CBinaryLabels that takes a CLabels pointer and then does the same thing as this introduced |
@lisitsyn see updated version |
a636b4f
to
faf7f6e
Compare
src/shogun/base/SGObject.h
Outdated
* @return The requested type | ||
*/ | ||
template <class T> | ||
const T* as() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i wouldn't copy-paste this... some parts of it at least can be merged
src/shogun/base/some.h
Outdated
@@ -153,6 +153,11 @@ namespace shogun | |||
return value; | |||
} | |||
|
|||
inline const char* wrap(const char* ptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for inline-ness plz use the macro: SG_FORCED_INLINE
which will make it sure that it's really inlined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just replace it with the given macro above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in case you insist on using inline... otherwise just drop the keyword inline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't insist, I am just following the rest of the file, also it doesnt have any shogun includes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but let me remove the keyword
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont get it why is it better to remove instead of using a macro that ensures the inline-ess... hence make the whole code faster... ¯_(ツ)_/¯
|
||
namespace shogun | ||
{ | ||
Some<CBinaryLabels> binary_from_binary(CBinaryLabels* orig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
que?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are helper methods, could also move the contents directly into the switch below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cpp only, nevermind
src/shogun/labels/BinaryLabels.cpp
Outdated
labels[i]) | ||
} | ||
} | ||
auto result = new CBinaryLabels(labels); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
afaik this copies the whole vvector itself.... do we really need to copy the data where it doesn't change at all....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it copies indeed, which I actually didnt realise. Why does it copy? Doesnt seem to make sense to me. And agree, it should not
return Some<CBinaryLabels>::from_raw(orig); | ||
} | ||
|
||
Some<CBinaryLabels> binary_from_dense(CDenseLabels* orig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
que?
do we need to spell this one out? does this need to be expose to anybody apart from ourself internally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah this is in .cpp. nevermind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no this is only internally, didnt put it in the header.
You think it should be just inside the switch
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see the 2nd comment... it's foobar, just ignore the first comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
browser didnt reload ;)
(*std::min_element(unique.begin(), unique.end())) == 0 && | ||
(*std::max_element(unique.begin(), unique.end())) == (index_t)unique.size()-1, | ||
"Multiclass labels must be contiguous integers in [0, ..., num_classes -1].\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we fixme/todo this... as this should not be a hard requirement, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i mean one would like to have something even like a: ['foo', 'bar', 'zelda'] multiclass labeling possible sooner than later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep!
tests/meta/tester.cpp
Outdated
@@ -62,6 +83,6 @@ int main(int argc, const char *argv[]) | |||
SG_UNREF(a_ref); | |||
|
|||
exit_shogun(); | |||
return equal != true; | |||
return (loaded_equals_ref && ref_equals_loaded) != true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could drop the != true
part and just have a negation, right?
ASSERT_TRUE(cast != nullptr); | ||
auto loaded_mat = cast->get_feature_matrix(); | ||
EXPECT_TRUE(loaded_mat.equals(mat)); | ||
delete obj; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete the file as well plz.
make some methods const, add copy constructors, add CBinaryLabels::from(...)
Populate for kernel, machine, features, labels
Includes LibSVM using new CLabels::as_binary()
due to typemap bug, shogun-toolbox#4177
-don't copy memory when converting from dense to binary labels -delete temporary files in unit tests -clean ups -remove unused const version of CSGObject::as
-add == operator for Some -clean-ups
9be4cfe
to
7b43b50
Compare
Alright, let's see what the buildbot says to this... |
No description provided.