New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up KNN #3608

Merged
merged 3 commits into from Feb 16, 2017

Conversation

Projects
None yet
3 participants
@MikeLing
Contributor

MikeLing commented Jan 27, 2017

No description provided.

@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Jan 27, 2017

Contributor

Hi @karlnapf , I had close #3598 and recreate a PR which first move code to new file without change anything.

Contributor

MikeLing commented Jan 27, 2017

Hi @karlnapf , I had close #3598 and recreate a PR which first move code to new file without change anything.

@karlnapf

This comment has been minimized.

Show comment
Hide comment
@karlnapf

karlnapf Jan 27, 2017

Member

Not sure I understand the purpose of this PR ... just moving the function implementations to a separate file? Sorry maybe I was not clear earlier, but I was talking about doing the full refactoring in a separate patch, including adding a new base class for the solver and then making CKNN a specialization

Shall we maybe draw a little class diagram to make clear which parts of the code will be solver, which of them will be in the base class etc.
Also, please check the existing PR on KNN refactorinbg.

Member

karlnapf commented Jan 27, 2017

Not sure I understand the purpose of this PR ... just moving the function implementations to a separate file? Sorry maybe I was not clear earlier, but I was talking about doing the full refactoring in a separate patch, including adding a new base class for the solver and then making CKNN a specialization

Shall we maybe draw a little class diagram to make clear which parts of the code will be solver, which of them will be in the base class etc.
Also, please check the existing PR on KNN refactorinbg.

@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Jan 28, 2017

Contributor

@karlnapf Oh,I misunderstood the meaning of " moving things to different files and without change anything else", sorry!

Shall we maybe draw a little class diagram to make clear which parts of the code will be solver, which of them will be in the base class etc.

Sure thing, so I would like to create a class named KNNsolver, and all the other solver (BruteSolver, KDTreeSolver, CoverTreeSolver and LSHSolver) inherit from KNNSolver.

The KNNSolver will have most part of variable(k, q, number of classes,etc) which will been used to other solver.

Furthermore, KNNSolver need to have nearest_neighbors(related BruteSolver and LSH), choose_class and choose_class_for_multiple_k(related to KDTreeSolver and CoverTreeSolver)

We need to define two virtual function in KNNSolver (named classify_objects and classify_objects_k) which will been used to point out each solver's classify implantation.

Is that looks good to you? Please tell me if it's not. And I promise I will be carful with the PR creation :) Thank you.

Contributor

MikeLing commented Jan 28, 2017

@karlnapf Oh,I misunderstood the meaning of " moving things to different files and without change anything else", sorry!

Shall we maybe draw a little class diagram to make clear which parts of the code will be solver, which of them will be in the base class etc.

Sure thing, so I would like to create a class named KNNsolver, and all the other solver (BruteSolver, KDTreeSolver, CoverTreeSolver and LSHSolver) inherit from KNNSolver.

The KNNSolver will have most part of variable(k, q, number of classes,etc) which will been used to other solver.

Furthermore, KNNSolver need to have nearest_neighbors(related BruteSolver and LSH), choose_class and choose_class_for_multiple_k(related to KDTreeSolver and CoverTreeSolver)

We need to define two virtual function in KNNSolver (named classify_objects and classify_objects_k) which will been used to point out each solver's classify implantation.

Is that looks good to you? Please tell me if it's not. And I promise I will be carful with the PR creation :) Thank you.

@MikeLing MikeLing changed the title from Move the solver related code to new file to Clean up KNN Feb 2, 2017

@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Feb 2, 2017

Contributor

@karlnapf there is no link error this time, but I have no idea about where the Segment Fault [1]from, could you give me some feedbacks about this PR? of-course, I will keep debug and improve it until all the things works :)

Furthermore, I don't know how to use lldb to debug knn unittest, you can see the log from travis that it looks like fail to clone the object to SGobject[2].

[1] https://pastebin.mozilla.org/8972402
[2]https://travis-ci.org/shogun-toolbox/shogun/jobs/197601158#L4013

Contributor

MikeLing commented Feb 2, 2017

@karlnapf there is no link error this time, but I have no idea about where the Segment Fault [1]from, could you give me some feedbacks about this PR? of-course, I will keep debug and improve it until all the things works :)

Furthermore, I don't know how to use lldb to debug knn unittest, you can see the log from travis that it looks like fail to clone the object to SGobject[2].

[1] https://pastebin.mozilla.org/8972402
[2]https://travis-ci.org/shogun-toolbox/shogun/jobs/197601158#L4013

@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Feb 2, 2017

Member

@MikeLing good idea to have 1 class per solver type... i would create 1 file per class defintion, that way it's much more clearer than this when you have defined several classes in one file

Member

vigsterkr commented Feb 2, 2017

@MikeLing good idea to have 1 class per solver type... i would create 1 file per class defintion, that way it's much more clearer than this when you have defined several classes in one file

@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Feb 3, 2017

Member

@MikeLing please check other implementation and you'll see that we always have an init() function of a class and that there's an SG_ADD(...), which adds the parameter(s) to the parameter framework... which actually is needed for SGObject.clone (and many other things).

Member

vigsterkr commented Feb 3, 2017

@MikeLing please check other implementation and you'll see that we always have an init() function of a class and that there's an SG_ADD(...), which adds the parameter(s) to the parameter framework... which actually is needed for SGObject.clone (and many other things).

@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Feb 3, 2017

Contributor

Hi @vigsterkr , I found not all the parameters will been inited and added by SG_ADD, some implementation like GMNPSVM in[1] don't pass any parameter by SG_ADD at all. So, how to determine what parameter should been added by SG_ADD?

Thank you

[1] https://github.com/shogun-toolbox/shogun/blob/develop/src/shogun/multiclass/GMNPSVM.cpp#L43

Contributor

MikeLing commented Feb 3, 2017

Hi @vigsterkr , I found not all the parameters will been inited and added by SG_ADD, some implementation like GMNPSVM in[1] don't pass any parameter by SG_ADD at all. So, how to determine what parameter should been added by SG_ADD?

Thank you

[1] https://github.com/shogun-toolbox/shogun/blob/develop/src/shogun/multiclass/GMNPSVM.cpp#L43

@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Feb 3, 2017

Member

you should SG_ADD params that you want to have serialized... and you should always init the params in the init to their default values...

Member

vigsterkr commented Feb 3, 2017

you should SG_ADD params that you want to have serialized... and you should always init the params in the init to their default values...

@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Feb 5, 2017

Contributor

Hi @vigsterkr, the remote build failed out even tests can passed on my local environment, and the log message looks weird to me. The error message on appveyor is link error and all ctest failed out on travis. Do I need to care about them? And please give me some feedback, I will add some more comment and write some more documents for these solvers if necessary :) Thank you!

// Result of bin/shogun-unit-test

[----------] Global test environment tear-down
[==========] 6856 tests from 271 test cases ran. (35110 ms total)
[ PASSED ] 6856 tests.

// Result of ctest

99% tests passed, 1 tests failed out of 454

Total Test time (real) = 106.85 sec

The following tests FAILED:
282 - integration_meta_cpp-distance-cosine (Failed) //This test always failed out even I don't change anyhting.
Errors while running CTest

Contributor

MikeLing commented Feb 5, 2017

Hi @vigsterkr, the remote build failed out even tests can passed on my local environment, and the log message looks weird to me. The error message on appveyor is link error and all ctest failed out on travis. Do I need to care about them? And please give me some feedback, I will add some more comment and write some more documents for these solvers if necessary :) Thank you!

// Result of bin/shogun-unit-test

[----------] Global test environment tear-down
[==========] 6856 tests from 271 test cases ran. (35110 ms total)
[ PASSED ] 6856 tests.

// Result of ctest

99% tests passed, 1 tests failed out of 454

Total Test time (real) = 106.85 sec

The following tests FAILED:
282 - integration_meta_cpp-distance-cosine (Failed) //This test always failed out even I don't change anyhting.
Errors while running CTest

@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Feb 7, 2017

Contributor

@karlnapf ask review for this PR :)

Contributor

MikeLing commented Feb 7, 2017

@karlnapf ask review for this PR :)

@vigsterkr

my initial review about this PR. the initial idea is good, but please try to use our internal data representations as well as some matrix algebra, instead of brute force solutions.

Show outdated Hide outdated src/shogun/multiclass/KNNSolver.h Outdated
Show outdated Hide outdated src/shogun/multiclass/KNNSolver.h Outdated
Show outdated Hide outdated src/shogun/multiclass/KNNSolver.h Outdated
Show outdated Hide outdated src/shogun/multiclass/KNNSolver.h Outdated
Show outdated Hide outdated src/shogun/multiclass/KNNSolver.h Outdated
Show outdated Hide outdated src/shogun/multiclass/COVERTree.cpp Outdated
Show outdated Hide outdated src/shogun/multiclass/COVERTree.cpp Outdated
Show outdated Hide outdated src/shogun/multiclass/COVERTree.cpp Outdated
Show outdated Hide outdated src/shogun/multiclass/COVERTree.cpp Outdated
Show outdated Hide outdated src/shogun/multiclass/COVERTree.cpp Outdated
@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Feb 7, 2017

Contributor

Hi @vigsterkr , thank you for your review and comment! I will use the shogun internal data like SGVector in separate PR.

Contributor

MikeLing commented Feb 7, 2017

Hi @vigsterkr , thank you for your review and comment! I will use the shogun internal data like SGVector in separate PR.

@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Feb 7, 2017

Member

ok, please address the rest at least...

Member

vigsterkr commented Feb 7, 2017

ok, please address the rest at least...

@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Feb 9, 2017

Member

still many places int32_t for indexing instead of index_t.... and the general design of KNNSolver needs a bit of more attention (see the const-ness of functions....) and dont swallow virtual functions....

Member

vigsterkr commented Feb 9, 2017

still many places int32_t for indexing instead of index_t.... and the general design of KNNSolver needs a bit of more attention (see the const-ness of functions....) and dont swallow virtual functions....

@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Feb 10, 2017

Contributor

ask review for this pr

Contributor

MikeLing commented Feb 10, 2017

ask review for this pr

@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Feb 10, 2017

Member

@MikeLing before asking for a review make sure that your code actually compiles and passes all the tests with the CI; in your case appveyor is obviously failing:

[00:17:35] c:\projects\shogun\src\shogun\multiclass\lshknnsolver.cpp(104): error C4716: 'shogun::CLSHKNNSolver::classify_objects_k': must return a value [C:\projects\shogun\build\src\shogun\libshogun.vcxproj]
Member

vigsterkr commented Feb 10, 2017

@MikeLing before asking for a review make sure that your code actually compiles and passes all the tests with the CI; in your case appveyor is obviously failing:

[00:17:35] c:\projects\shogun\src\shogun\multiclass\lshknnsolver.cpp(104): error C4716: 'shogun::CLSHKNNSolver::classify_objects_k': must return a value [C:\projects\shogun\build\src\shogun\libshogun.vcxproj]
@vigsterkr

getting there.

Show outdated Hide outdated src/shogun/multiclass/LSHKNNSolver.h Outdated
Show outdated Hide outdated src/shogun/multiclass/LSHKNNSolver.h Outdated
@@ -575,55 +315,32 @@ void CKNN::store_model_features()
SG_UNREF(d_rhs);
}
int32_t CKNN::choose_class(float64_t* classes, int32_t* train_lab)
void CKNN::init_solver(KNN_SOLVER knn_solver)

This comment has been minimized.

@vigsterkr

vigsterkr Feb 10, 2017

Member

the initialised solver is never released actually.... hence this will leak memory

@vigsterkr

vigsterkr Feb 10, 2017

Member

the initialised solver is never released actually.... hence this will leak memory

This comment has been minimized.

@MikeLing

MikeLing Feb 12, 2017

Contributor

Hi @vigsterkr , I got error while running Ctest[1] and it seems like there are a bunch of errors I haven't meet before[2]. BTW, I run all these test to virtualbox with ubuntu16.04 due to the valgrind doesn't full support OS X10.12.3[3], is that possible I got ViennaCL: FATAL ERROR is because I only give one kernel to virtual machine?

[1] https://pastebin.mozilla.org/8979031
[2] https://pastebin.mozilla.org/8979032
[3] http://valgrind.org/downloads/current.html

@MikeLing

MikeLing Feb 12, 2017

Contributor

Hi @vigsterkr , I got error while running Ctest[1] and it seems like there are a bunch of errors I haven't meet before[2]. BTW, I run all these test to virtualbox with ubuntu16.04 due to the valgrind doesn't full support OS X10.12.3[3], is that possible I got ViennaCL: FATAL ERROR is because I only give one kernel to virtual machine?

[1] https://pastebin.mozilla.org/8979031
[2] https://pastebin.mozilla.org/8979032
[3] http://valgrind.org/downloads/current.html

This comment has been minimized.

@vigsterkr

vigsterkr Feb 13, 2017

Member

mmmm i'm not so sure if i follow.
you can simply manually run valgrind on the knn examples/unit tests and you'll see how it is leaking.

@vigsterkr

vigsterkr Feb 13, 2017

Member

mmmm i'm not so sure if i follow.
you can simply manually run valgrind on the knn examples/unit tests and you'll see how it is leaking.

Show outdated Hide outdated src/shogun/multiclass/CoverTreeKNNSolver.cpp Outdated
#ifdef HAVE_CXX11
#include <shogun/lib/external/falconn/lsh_nn_table.h>
#endif
//#define DEBUG_KNN

This comment has been minimized.

@vigsterkr

vigsterkr Feb 10, 2017

Member

this can be removed....

@vigsterkr

vigsterkr Feb 10, 2017

Member

this can be removed....

Show outdated Hide outdated src/shogun/multiclass/KDTreeKNNsolver.cpp Outdated
Show outdated Hide outdated src/shogun/multiclass/KDTreeKNNsolver.cpp Outdated
Show outdated Hide outdated src/shogun/multiclass/LSHKNNSolver.cpp Outdated
Show outdated Hide outdated src/shogun/multiclass/LSHKNNSolver.h Outdated
Show outdated Hide outdated src/shogun/multiclass/LSHKNNSolver.h Outdated
Show outdated Hide outdated src/shogun/multiclass/LSHKNNSolver.h Outdated
Show outdated Hide outdated src/shogun/multiclass/KDTreeKNNSolver.h Outdated
Show outdated Hide outdated src/shogun/multiclass/CoverTreeKNNSolver.h Outdated
Show outdated Hide outdated src/shogun/multiclass/BruteKNNSolver.h Outdated
@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Feb 15, 2017

Member

@MikeLing ok cool! so let's see how the CI are acting, and once done we could merge! would you mind actually doing a squash of commits? :)

Member

vigsterkr commented Feb 15, 2017

@MikeLing ok cool! so let's see how the CI are acting, and once done we could merge! would you mind actually doing a squash of commits? :)

@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Feb 15, 2017

Contributor

@vigsterkr no problem, I will rebase them after ci works fine. Thank you for you patience and all these comments. I don't have too much experience about C++ coding before, all the things I know are learned from classes and books. It's great appreciate that you tell me all these stuff, thank you vey much, again. :)

Contributor

MikeLing commented Feb 15, 2017

@vigsterkr no problem, I will rebase them after ci works fine. Thank you for you patience and all these comments. I don't have too much experience about C++ coding before, all the things I know are learned from classes and books. It's great appreciate that you tell me all these stuff, thank you vey much, again. :)

@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Feb 16, 2017

Member

@MikeLing ok so i've just checked there are still 3 minor changes that i've requested but you havent changed them:

  • init function to be private
  • some of the functions should be private

could you send in a minipatch to address those? no need for squashing... thxn

Member

vigsterkr commented Feb 16, 2017

@MikeLing ok so i've just checked there are still 3 minor changes that i've requested but you havent changed them:

  • init function to be private
  • some of the functions should be private

could you send in a minipatch to address those? no need for squashing... thxn

@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Feb 16, 2017

Contributor

Hi @vigsterkr, I only change init() to private due to I haven't found other functions need to been change to private. Theget_name() will been called in SGObject_unittest to verify the class's name and the rest of functions need to been called in its derived classes.

Contributor

MikeLing commented Feb 16, 2017

Hi @vigsterkr, I only change init() to private due to I haven't found other functions need to been change to private. Theget_name() will been called in SGObject_unittest to verify the class's name and the rest of functions need to been called in its derived classes.

@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Feb 16, 2017

Member

ok cool lets see now the ci and then we merge....

Member

vigsterkr commented Feb 16, 2017

ok cool lets see now the ci and then we merge....

@vigsterkr vigsterkr merged commit 988fe1e into shogun-toolbox:develop Feb 16, 2017

2 of 3 checks passed

default DEV build done.
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@karlnapf

The class docs in this patch are very fluffy. See my comments. Please pay more attention to clean language.

API docs are one of the main points where users interact with Shogun. We want them to be as good as possible

It would be great if you could send a clean up patch for that

namespace shogun
{
/**
* The KNNSolver class is the virtual base class of BruteKNNSolcer, CoverTreeKNNSolver, KDTreeKNNsolver, LSHKNNSolver.

This comment has been minimized.

@karlnapf

karlnapf Feb 16, 2017

Member

Dont mention subclass names in the base class. Hard to maintain

@karlnapf

karlnapf Feb 16, 2017

Member

Dont mention subclass names in the base class. Hard to maintain

{
/**
* The KNNSolver class is the virtual base class of BruteKNNSolcer, CoverTreeKNNSolver, KDTreeKNNsolver, LSHKNNSolver.
* The KNNSolver class will include most of Variables which been used all the other Solver and methods been used to choose

This comment has been minimized.

@karlnapf

karlnapf Feb 16, 2017

Member

dont capitalize words that are not names (solver)

@karlnapf

karlnapf Feb 16, 2017

Member

dont capitalize words that are not names (solver)

This comment has been minimized.

@karlnapf

karlnapf Feb 16, 2017

Member

it is not "will include" but "includes". Please pay a bit attention to grammar.

@karlnapf

karlnapf Feb 16, 2017

Member

it is not "will include" but "includes". Please pay a bit attention to grammar.

This comment has been minimized.

@karlnapf

karlnapf Feb 16, 2017

Member

The KNNSolver class will include most of Variables which been used all the other Solver and methods been used to choose

In fact this sentence doesnt make any sense. Remove it and just say that this is the base class for all KNN solvers

@karlnapf

karlnapf Feb 16, 2017

Member

The KNNSolver class will include most of Variables which been used all the other Solver and methods been used to choose

In fact this sentence doesnt make any sense. Remove it and just say that this is the base class for all KNN solvers

This comment has been minimized.

@karlnapf

karlnapf Feb 16, 2017

Member

If you want to say something here, mention important methods (just say what they are intended to do). On the other hand, if the methods are well documented, no need to say a lot (apart from that this is the base class)

@karlnapf

karlnapf Feb 16, 2017

Member

If you want to say something here, mention important methods (just say what they are intended to do). On the other hand, if the methods are well documented, no need to say a lot (apart from that this is the base class)

/**
* The BruteKNNSolver class is a solver class which inherit from KNNSolver class. It will use brute way
* to classify the object which mean compare the object against all training objects for each prediction.
*/

This comment has been minimized.

@karlnapf

karlnapf Feb 16, 2017

Member

No need to talk about inheritance, this is clear for the API docs automatically.

@karlnapf

karlnapf Feb 16, 2017

Member

No need to talk about inheritance, this is clear for the API docs automatically.

This comment has been minimized.

@karlnapf

karlnapf Feb 16, 2017

Member

It will use -> It uses

@karlnapf

karlnapf Feb 16, 2017

Member

It will use -> It uses

This comment has been minimized.

@karlnapf

karlnapf Feb 16, 2017

Member

It will use brute way to classify the object which mean compare the object against all training objects for each prediction. -- should be

Test points are compared to all training data for each prediction.

@karlnapf

karlnapf Feb 16, 2017

Member

It will use brute way to classify the object which mean compare the object against all training objects for each prediction. -- should be

Test points are compared to all training data for each prediction.

{
/**
* The CoverTreeKNNSolver class is a solver class which inherit from KNNSolver class.

This comment has been minimized.

@karlnapf

karlnapf Feb 16, 2017

Member

Please adjust as in my other comment

@karlnapf

karlnapf Feb 16, 2017

Member

Please adjust as in my other comment

/**
* The CoverTreeKNNSolver class is a solver class which inherit from KNNSolver class.
* It use the cover tree to speed up the nearest neighbor search and you can find more
* detail information on https://en.wikipedia.org/wiki/Cover_tree

This comment has been minimized.

@karlnapf

karlnapf Feb 16, 2017

Member

typo: detailed

@karlnapf

karlnapf Feb 16, 2017

Member

typo: detailed

/**
* The CoverTreeKNNSolver class is a solver class which inherit from KNNSolver class.
* It use the cover tree to speed up the nearest neighbor search and you can find more

This comment has been minimized.

@karlnapf

karlnapf Feb 16, 2017

Member

It uses cover trees to speed up the nearest neighbour search (typo and grammar mistake).
Make it two sentences: For more information, see [link]

@karlnapf

karlnapf Feb 16, 2017

Member

It uses cover trees to speed up the nearest neighbour search (typo and grammar mistake).
Make it two sentences: For more information, see [link]

{
/**
* The KDTREEKNNSolver class is a solver class which inherit from KNNSolver class.

This comment has been minimized.

@karlnapf

karlnapf Feb 16, 2017

Member

same as above comments for class docs

@karlnapf

karlnapf Feb 16, 2017

Member

same as above comments for class docs

karasikov pushed a commit to karasikov/shogun that referenced this pull request Apr 15, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment