Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random Forest added #2282

Merged
merged 3 commits into from
Jun 8, 2014
Merged

Random Forest added #2282

merged 3 commits into from
Jun 8, 2014

Conversation

mazumdarparijat
Copy link
Contributor

  • Random Forest added

Reference

@mazumdarparijat
Copy link
Contributor Author

@iglesias I have added RF w/o unittests. I will add unittests tomorrow. But you may start reviewing.
I went with public inheritance only instead of private inheritance and friend class. Using the latter method would have made reusing the code of BaggingMachine very difficult. So I dropped the idea.


namespace shogun
{
class CRandomForest : public CBaggingMachine
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation is missing.

@iglesias
Copy link
Collaborator

iglesias commented Jun 6, 2014

@vigsterkr, maybe you can have a quick look and see if the changes in BaggingMachine are all right for you. I think they are.

@iglesias
Copy link
Collaborator

iglesias commented Jun 6, 2014

@mazumdarparijat, see the comments above. The most important is the large amount of code duplication in the train method of CRandomCARTree ;-)

Please, keep this pull request of this size. You may add unit tests, but nothing else before we merge it. So the plan is to address the comments, add unit tests and then it is ready to be merged.

Good job and thanks!

@mazumdarparijat
Copy link
Contributor Author

yes! I will just add unittest and include your comments. Thats it! :-)

@mazumdarparijat
Copy link
Contributor Author

@iglesias I have addressed your previous comments, added unittest as well. There were also quite a few errors and leaks in BaggingMachine which I fixed. Please have a look!

@@ -81,7 +81,7 @@ SGVector<float64_t> CBaggingMachine::apply_get_outputs(CFeatures* data)
{
CMachine* m = dynamic_cast<CMachine*>(m_bags->get_element(i));
CLabels* l = m->apply(data);
SGVector<float64_t> lv = l->get_values();
SGVector<float64_t> lv = dynamic_cast<CDenseLabels*>(l)->get_labels();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For casting labels, use obtain_from_generic please. This way the actual cast is only written in one place and it is more practical if at some time we need to modify the way we are doing the casts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iglesias there is no obtain_from_generic method for CDenseLabels. I have added check for NULL pointer in my latest commit, that should solve the possible coverity defect.

@iglesias
Copy link
Collaborator

iglesias commented Jun 7, 2014

It seems that this has broken the unit test of Bagging machine. Can you fix that, @mazumdarparijat?

@mazumdarparijat
Copy link
Contributor Author

@iglesias I have fixed the existing unittest! I must say that though the fix is very simple, but I really got tired finding the problem. Phew! ;-)
The mock test was also a straw man, so I added another unittest to it. Please have a look!

@iglesias
Copy link
Collaborator

iglesias commented Jun 8, 2014

Right about obtain_from_generic in labels. We have that method for features, for labels we have the class LabelsFactory.

iglesias added a commit that referenced this pull request Jun 8, 2014
@iglesias iglesias merged commit b5d7c0d into shogun-toolbox:develop Jun 8, 2014
@jondo
Copy link
Contributor

jondo commented Sep 26, 2014

So shouldn't the shogun feature page be updated?

@vigsterkr
Copy link
Member

@jondo oh yeah, it should be. it can be updated via:
https://docs.google.com/spreadsheet/ccc?key=0Aunb9cCVAP6NdDVBMzY1TjdPcmx4ei1EeUZNNGtKUHc&hl=en#gid=0

would really appreciate the update.

@karlnapf
Copy link
Member

May I suggest to do a proper comparison with some other implementation, say scikit-learn?
Both in terms of speed and accuracy. @tklein23 and me recently had a discussion and think we should do this for any std method.

@iglesias
Copy link
Collaborator

I have updated now the sheet above with decision trees in Shogun. Didn't
see random forests in that list. Any idea how the information in the sheet
is rendered in the website?

Also, http://www.shogun-toolbox.org/page/features/ seems to be sort of
broken now.

On 26 September 2014 13:26, Heiko Strathmann notifications@github.com
wrote:

May I suggest to do a proper comparison with some other implementation,
say scikit-learn?
Both in terms of speed and accuracy. @tklein23
https://github.com/tklein23 and me recently had a discussion and think
we should do this for any std method.


Reply to this email directly or view it on GitHub
#2282 (comment)
.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants