
Adapted Classifiers #1678

Merged
merged 17 commits into mlpack:master from walragatver:devOne on Feb 15, 2019

3 participants
@walragatver (Contributor) commented Jan 19, 2019

Hi,
This is a pull request for issue #1483.
I have changed three classifiers (a rough sketch of the signature pattern is shown right after this list):

  1. adaboost - returning ztProduct as its upper bound on the training error.
  2. decision stump - returning the entropy after splitting.
  3. decision tree - returning the final entropy of the tree. Added some code to get the final entropy in the recursive decision tree.
    Let me know if I have returned something wrong.
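A minimal sketch of the pattern, based on the AdaBoost change in the diff further down (the parameter list is abbreviated and the body only shows the lines relevant to the returned value):

// Before: Train() returns nothing.
void Train(const MatType& data, ...);

// After: Train() returns the final objective, e.g. the Hamming loss bound for AdaBoost.
double Train(const MatType& data, ...)
{
  ...
  // Accumulate the value of zt for the Hamming loss bound.
  ztProduct *= zt;
  ...
  return ztProduct;
}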

Update One:
Changed two classifiers:

  1. HMM - returning the final log-likelihood obtained after training on the data sequences.
  2. Random Forest - returning the average entropy of all the decision trees trained in the forest
    (not sure of this).

Update Two:

  1. LARS : returning the final absolute maximum correlation after training.
  2. linear_regression : returning the least squares error after training.
  3. sparse_coding : returning the final minimized objective function value.
  4. ANN : returning the objective function value from GAN, FFN, RNN, and RBM.

Update Three:

  1. local_coordinate_coding : returning the final objective value.

Also, I went through some other classifiers and drew the following conclusions:

  1. Hoeffding Trees : here we are using the gain to find the best split, so returning the entropy is not a good fit.
  2. Naive Bayes Classifier : here we are only updating the class probabilities and distributions, so we don't need to return anything.
  3. Perceptron : here we are doing a simple weight update without any objective function, so there is no need to return anything.
  4. Softmax Regression : the Train() function is already modified.
  5. GMM : the Train() function is already modified.

I am quite new to these algorithms, so please correct me if I have made a mistake.

@walragatver (Contributor, Author) commented Jan 25, 2019

Hi @rcurtin, @zoq,
Sorry to disturb you, but it looks like I have covered all the classifiers. Are the conclusions I have drawn correct?

@rcurtin (Member) commented Jan 25, 2019

Hey @walragatver, sorry that I haven't had a chance to take a look yet. I'll review it when I'm able to (hopefully in the next few days). 👍

@walragatver changed the title from "Adapted adaboost decision stump and decision tree. Sub Part of #1483 issue" to "Adapted Classifiers" on Jan 31, 2019

@zoq (Member) commented Jan 31, 2019

Sorry for the slow response, will take a look at the PR over the weekend.

@rcurtin (Member) left a comment

Hey there @walragatver, thanks for the contribution! 👍 Overall it looks good to me, and this is definitely the right approach. I had a few comments about some of the specific methods and what is returned, so if you can address those I would appreciate it. There are also a couple of little style issues, but those should be easy.

It might be nice to also adapt the tests that call Train() to make sure a reasonable value is returned for the objective, but I am having trouble thinking of exactly how to structure those tests so that what's returned is robust. It might make sense just to ensure that what's returned is a finite number.
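For instance, a minimal sketch of such a check (the test name, the random data, and the DecisionTree setup here are illustrative, not taken from the existing tests):

BOOST_AUTO_TEST_CASE(TrainReturnsFiniteObjectiveTest)
{
  // Random data and binary labels; we only care that Train() returns a finite value.
  arma::mat data(10, 100, arma::fill::randu);
  arma::Row<size_t> labels = arma::randi<arma::Row<size_t>>(100, arma::distr_param(0, 1));

  DecisionTree<> tree;
  const double entropy = tree.Train(data, labels, 2);

  BOOST_REQUIRE(std::isfinite(entropy));
}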

Thanks again for the hard work with this. Let me know if I can clarify any of my comments. 👍

@@ -624,7 +624,7 @@ void DecisionTree<FitnessFunction,
}

// Figure out counts of children.
- arma::Row<size_t> childCounts(numClasses, arma::fill::zeros);
+ arma::Row<size_t> childCounts(numChildren, arma::fill::zeros);

@rcurtin (Member) commented Feb 4, 2019

Is this a bug you've found? I think you are right with this code, but I just want to check what you were thinking with the change here.

@walragatver (Contributor, Author) commented Feb 5, 2019

Yes, that was a bug. That childCounts was not used anywhere. If we look below it, it is computing the number of instances each child will have after the split, hence I thought it should be numChildren rather than the number of classes. I also cross-checked by looking at the other Train() function at line 780 in the same file; there it was numChildren, so I changed it here.

@rcurtin (Member) commented Feb 9, 2019

Yeah. I think we happen to be guaranteed that numClasses is always greater than or equal to numChildren in the code we have, so it was a dormant bug, but still a bug. Nice catch 😎

@@ -141,7 +141,7 @@ class AdaBoost
* @param labels Labels for each point in the dataset.
* @param learner Learner to use for training.
*/
- void Train(const MatType& data,
+ double Train(const MatType& data,

@rcurtin (Member) commented Feb 4, 2019

Can you also be sure to adapt the spacing of the lines following this please? (This applies across many of the changes.)

@walragatver (Contributor, Author) commented Feb 5, 2019

Okay, fixed it!

@@ -198,6 +198,7 @@ void AdaBoost<WeakLearnerType, MatType>::Train(
// Accumulate the value of zt for the Hamming loss bound.
ztProduct *= zt;
}
return ztProduct;

@rcurtin (Member) commented Feb 4, 2019

Actually we can make a few more changes now that we've done this, which is a nice simplification. The tests in src/mlpack/tests/adaboost_test.cpp use the member AdaBoost::ZtProduct() to access the loss for testing... but now since Train() returns ztProduct, there is no need for either ztProduct as a member of the AdaBoost class or the ZtProduct() method. So it can be removed and adaboost_test.cpp can be adapted slightly where ZtProduct() was being used before. 👍
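A rough sketch of that adaptation in adaboost_test.cpp (the training arguments and the bound checked here are placeholders, not copied from the existing tests):

// Before: train, then read the stored member.
a.Train(inputData, labels, perceptronLearner);
BOOST_REQUIRE_LT(a.ZtProduct(), 1.0);

// After: capture the Hamming loss bound directly from Train(); the member and
// the ZtProduct() accessor can then be removed.
const double ztProduct = a.Train(inputData, labels, perceptronLearner);
BOOST_REQUIRE_LT(ztProduct, 1.0);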

@walragatver (Contributor, Author) commented Feb 5, 2019

Yes, it looks like many of our tests can be modified after this change. I am thinking of working on that in a separate PR.

@rcurtin (Member) commented Feb 9, 2019

Ok, that's fine, but could you please open an issue so that we don't forget it? You can assign it to yourself if you like.

double newContribute = newChildGain * childCounts[i] / count;
newBestGain = newBestGain + newContribute;
bestGain = std::max(bestGain,newBestGain);
}

@rcurtin (Member) commented Feb 4, 2019

It looks like the strategy you're using here is to check if the entropy of the child is different than the entropy of the child set. That might happen if the child node makes any additional splits. But when we call child->Train(), it will return either exactly the value of prevChildGain, if no further splits were done (i.e. if that child was a leaf), or something better, if further splits were done. So there should be no need to compute or store prevChildGain.

Also, it looks like your strategy for computing bestGain is to subtract the contribution of the child points from the original bestGain (computed at the beginning of the method), and then add back in the contribution of the newly-built child node. I think we can ignore the original bestGain computation though and essentially do something like this:

// Outside of the loop, initialize bestGain.
bestGain = 0;

// This is the loop over the children.
for (size_t i = 0; i < numChildren; ++i)
{
  ...
  double childGain = child->Train(...);
  bestGain += double(childCounts[i]) / double(count) * childGain;
}

(The double() is to make sure we don't do integer division. I think it's necessary but I'm not 100% sure.)

Let me know if I can clarify this idea further.

@walragatver (Contributor, Author) commented Feb 5, 2019

Yes, I got it; thanks for the suggestion.

@@ -129,7 +129,7 @@ void LARS::Train(const arma::mat& matX,
{
lambdaPath[0] = lambda1;
Timer::Stop("lars_regression");
- return;
+ return maxCorr;

@rcurtin (Member) commented Feb 4, 2019

I'm not sure that maxCorr is the objective minimized by LARS. I think that objective is Equation 1.3 from https://projecteuclid.org/download/pdfview_1/euclid.aos/1083178935 . Maybe I overlooked how maxCorr is equivalent to that value?
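For reference, the penalized least-squares problem that the LARS/LASSO formulation (with an optional ridge term) solves is roughly:

\min_{\beta} \; \tfrac{1}{2} \lVert y - X\beta \rVert_2^2 + \lambda_1 \lVert \beta \rVert_1 + \tfrac{1}{2} \lambda_2 \lVert \beta \rVert_2^2

(This is stated here only as a rough reference point and is not necessarily the exact equation numbering used in the paper.)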

@walragatver (Contributor, Author) commented Feb 5, 2019

Actually, I was not sure of that.
I am returning it because in Section 2, Fig. 3 (right) of the paper they plot the absolute current correlations as a function of the LARS step, and there the maximum correlation decreases as training proceeds. So I thought it is quite similar to a decreasing error, and hence that returning it might be reasonable. Also, I went through the code and it trains based on maxCorr, so I thought returning maxCorr is correct. Additionally, in our code there is no function that computes that error.
If modifications are needed, let me know.

@rcurtin (Member) commented Feb 9, 2019

I think there is a relation between the two---we keep adding variables to the solution until maxCorr is sufficiently small, so in some sense maxCorr captures the amount of Y that is not captured by X*beta. There may be some way to convert to something much closer to the mean-squared-error of the regression, but if not I think maxCorr can work. If you'd like to take the time to look into it further, please feel free.

@@ -85,6 +85,7 @@ void LinearRegression::Train(const arma::mat& predictors,
lambda * arma::eye<arma::mat>(p.n_rows, p.n_rows);

parameters = arma::solve(cov, p * r.t());
return ComputeError(predictors,responses);

@rcurtin (Member) commented Feb 4, 2019

Tiny style issue---there should be a space between predictors, and responses.

@walragatver (Contributor, Author) commented Feb 5, 2019

Okay, fixed it!

Resolved review thread: src/mlpack/methods/random_forest/random_forest_impl.hpp
Resolved review thread: src/mlpack/methods/sparse_coding/sparse_coding_impl.hpp
@walragatver (Contributor, Author) commented Feb 5, 2019

Hi @rcurtin,
I have fixed and answered the things you mentioned. Just let me know if I have made a mistake.
Also, I have modified local_coordinate_coding.
Actually, I have not yet adapted searching algorithms like range_search and collaborative filtering. I will adapt them in two days and will ping you for the final review.

@rcurtin (Member) commented Feb 5, 2019

Range search doesn't have a reasonable objective so I am not sure what we would return there. If you can think of something reasonable, go ahead, but I'm not sure what. (Similar situation for collaborative filtering.)

@walragatver (Contributor, Author) commented Feb 5, 2019

> Range search doesn't have a reasonable objective so I am not sure what we would return there. If you can think of something reasonable, go ahead, but I'm not sure what. (Similar situation for collaborative filtering.)

Okay @rcurtin, I have a list; if none of these has anything reasonable to return, then you can review now as well.
The list goes as follows:

  1. fastmks
  2. lmnn
  3. kfn (approx_kfn)
  4. kde
  5. lsh
  6. neighbor_search
  7. range_search
  8. collaborative_filtering
  9. reinforcement learning

These all contain a Train() function which I have not checked yet. If there is something reasonable to return in any of them, let me know.
In collaborative filtering it looks like we are using SVD, which is a model-based approach, so I thought it might return something; I have not gone through the code yet, though, and will look at it.

@walragatver (Contributor, Author) commented Feb 7, 2019

Hi @rcurtin, @zoq,
I have checked all the above algorithms and it looks like we cannot return anything reasonable; it looks like most of them produce matrices as their output. Also, your thesis on dual-tree algorithms helped me a lot in clearing up the basics of dual- and single-tree algorithms.
So, this PR is now ready for its final review.
Sorry for the delay.

@rcurtin (Member) left a comment

@walragatver: thanks for the hard work on this. I left a few more comments there, but if you can handle those I don't have anything else and I think it's ready to merge, except for one thing---

Do you think you can add @return documentation to each of the modified Train() functions? I anticipate that people will wonder what the double returned is, and it's easier to write it in the comment than make them dig through the source to figure out what it is. :)
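For example, a sketch of what the added documentation could look like for AdaBoost (the surrounding parameters are abbreviated, and the exact wording should match what each Train() actually returns):

/**
 * ...
 * @param learner Learner to use for training.
 * @return The upper bound for the training error (ztProduct).
 */
double Train(const MatType& data,
             const arma::Row<size_t>& labels,
             const WeakLearnerType& learner);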



for (size_t i = begin; i < begin + count; ++i)
childCounts[childAssignments[i - begin]]++;

//Initialize bestGain if recursive split is allowed
if(!NoRecursion)

@rcurtin (Member) commented Feb 9, 2019

Just a few style comments---can you make sure that each comment is a fully punctuated sentence? (i.e. period at the end). Also we can put a space between // and Initialize, and a space between if and (!NoRecursion) to match the style guide. These comments apply throughout, by the way, even though I've only pointed them out here.

Resolved review thread: src/mlpack/methods/decision_tree/decision_tree_impl.hpp

Resolved review thread: src/mlpack/methods/random_forest/random_forest_impl.hpp
Resolved review thread: src/mlpack/methods/sparse_coding/sparse_coding_impl.hpp

@walragatver force-pushed the walragatver:devOne branch from c00e4d2 to 650b772 on Feb 10, 2019

@walragatver force-pushed the walragatver:devOne branch from 650b772 to 42aed43 on Feb 11, 2019

@walragatver (Contributor, Author) commented Feb 12, 2019

Hi @rcurtin, @zoq,
I have made the final changes and fixed some static-analysis errors. It looks like there is some glitch in the static check, because the error it points at is not a real error: there is no local activeSet in the Train() function of lars.cpp.
Can you look into it?

@rcurtin (Member) commented Feb 12, 2019

Can you try rebasing the master branch into your branch? #1713 had some fixes for the static code analysis checks that might help here.

@walragatver (Contributor, Author) commented Feb 12, 2019

@rcurtin: I have already rebased my branch, but it looks like a new error. I tried to clear it, but it looks like it's not a real error.

@rcurtin (Member) commented Feb 12, 2019

Ah, ok, sorry. Yeah, I took a look at the code and I have to agree, the static code analyzer must be wrong here. I can't see why it would think that activeSet is local. So let's just not worry about that. 👍

@walragatver (Contributor, Author) commented Feb 13, 2019

Hi @rcurtin, @zoq,
Can you review this PR today? I am thinking of moving on to writing and modifying test cases and finishing this as fast as possible.
Also, I went through this year's GSoC ideas; they are quite fascinating. Which one would you recommend for me?

@rcurtin (Member) commented Feb 13, 2019

It's in my queue to do soon, but this week is a paper deadline week so there's not too much time for me. Sorry about that...

@zoq (Member) commented Feb 13, 2019

@walragatver: regarding GSoC, it's difficult to recommend something at this point; personally, I would go with the idea I'm most passionate about. Also, you are free to propose something that's not listed.

@@ -84,12 +84,13 @@ void FFN<OutputLayerType, InitializationRuleType, CustomLayers...>::Train(

Log::Info << "FFN::FFN(): final objective of trained model is " << out
<< "." << std::endl;
return out;

@zoq (Member) commented Feb 13, 2019

Might be more intuitive to call this loss or objective; what do you think?

@walragatver (Contributor, Author) commented Feb 14, 2019

Yes, correct, but it was already declared and defined that way, so I thought it's better not to change its name. Also, in every optimizer function it was written like that, so I decided to keep it as it is. I thought it might have been done that way to keep things standardized.

@zoq (Member) commented Feb 14, 2019

I see, I don't mind leaving it like this.
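With that change in place, callers can read the objective straight from the return value; a minimal usage sketch (model, predictors, responses, and optimizer are placeholders for an already-constructed FFN and its training setup):

// The final objective reported by the optimizer is now also returned by Train().
const double objective = model.Train(predictors, responses, optimizer);
Log::Info << "Final objective of the trained model: " << objective << "." << std::endl;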

@@ -97,11 +97,12 @@ class FFN
* @param predictors Input training variables.
* @param responses Outputs results from input training variables.
* @param optimizer Instantiated optimizer used to train the model.
* @return final objective value.

@zoq (Member) commented Feb 13, 2019

Picky style comment, this should start with a capital letter.

@walragatver (Contributor, Author) commented Feb 14, 2019

Done

@walragatver (Contributor, Author) commented Feb 14, 2019

Hi @zoq,
I have made the required changes.

@zoq (Member) approved these changes on Feb 14, 2019 and left a comment

Just left another minor comment, the rest looks good to me.

@@ -50,6 +50,7 @@ RBM<InitializationRuleType, DataType, PolicyType>::RBM(
reset(false)
{
numFunctions = this->predictors.n_cols;
steps = 0;

@zoq (Member) commented Feb 14, 2019

Do you mind using the member initializer list for the steps parameter, for consistency?
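That would look roughly like this (only a sketch; the earlier initializers are abbreviated from the diff above):

RBM<InitializationRuleType, DataType, PolicyType>::RBM(...) :
    ...,
    reset(false),
    steps(0)   // Initialize steps in the member initializer list for consistency.
{
  numFunctions = this->predictors.n_cols;
}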

@rcurtin (Member) left a comment

Looks good to me---I've only got one tiny (really pedantic) comment, and if you want, I can handle that during the merge process. Another thing we should do (or that I can do during merge) is to add a line to HISTORY.md... maybe something like this:

 * Where relevant, all models with a `Train()` method now return a `double` value representing the goodness of fit (i.e. final objective value, error, etc.) (#1678).

Once the CI checks pass, I'm happy to merge this. I saw that the OS X build failed because Homebrew timed out (or something?), so I restarted it. Hopefully there is no issue this time.

@@ -87,11 +88,12 @@ class LinearRegression
* @param responses y, the responses to the data points.
* @param intercept Whether or not to fit an intercept term.
* @param weights Observation weights (for boosting).
* @return ordinary least squares error.

@rcurtin (Member) commented Feb 15, 2019

Tiny style issue; we should start the comment capitalized, i.e. Ordinary :)

@mlpack-bot commented Feb 15, 2019

Hello there! Thanks for your contribution. I see that this is your first contribution to mlpack. If you'd like to add your name to the list of contributors in src/mlpack/core.hpp and COPYRIGHT.txt and you haven't already, please feel free to push a change to this PR---or, if it gets merged before you can, feel free to open another PR.

In addition, if you'd like some stickers to put on your laptop, I'd be happy to help get them in the mail for you. Just send an email with your physical mailing address to stickers@mlpack.org, and then one of the mlpack maintainers will put some stickers in an envelope for you. It may take a few weeks to get them, depending on your location. 👍

1 similar comment from @mlpack-bot (Feb 15, 2019), identical to the comment above.

@rcurtin (Member) commented Feb 15, 2019

Hehe, always bugs in code... I wonder why mlpack-bot offered stickers twice...

@walragatver (Contributor, Author) commented Feb 15, 2019

Hi @rcurtin,
I have updated HISTORY.md. However, I was not able to preview the change, so if something goes wrong you can handle it during the merge. Sorry, I see now that it looks good.
Thanks for the approval 👍.

@rcurtin merged commit 3fded6e into mlpack:master on Feb 15, 2019

6 checks passed

  LaTeX Documentation Checks - Build finished.
  Memory Checks - Build finished.
  Static Code Analysis Checks - Build finished.
  Style Checks - Build finished.
  continuous-integration/appveyor/pr - AppVeyor build succeeded.
  continuous-integration/travis-ci/pr - The Travis CI build passed.
@rcurtin (Member) commented Feb 15, 2019

Thanks for the contribution! 👍
