New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor CF DecompositionPolicy and add BiasSVD, SVD++ #1458

Merged
merged 53 commits into from Sep 6, 2018

Conversation

Projects
None yet
4 participants
@Wenhao-H
Member

Wenhao-H commented Jul 2, 2018

This PR refactors CF DecompositionPolicy and add BiasSVDPolicy, SVDPlusPlusPolicy.

To do:

  • Refactor all existing decomposition policies.
  • Add BiasSVDPolicy, SVDPlusPlusPolicy.
  • Add tests and update comments.
Show outdated Hide outdated src/mlpack/methods/cf/cf_impl.hpp
Show outdated Hide outdated src/mlpack/methods/cf/cf_impl.hpp
Show outdated Hide outdated src/mlpack/methods/cf/cf_impl.hpp
Show outdated Hide outdated src/mlpack/methods/cf/cf_model.hpp
@@ -73,6 +69,89 @@ class BatchSVDPolicy
svdbatch.Apply(cleanedData, rank, w, h);
}
}
/**
* Return predicted rating given user ID and item ID.

This comment has been minimized.

@lozhnikov

lozhnikov Jul 6, 2018

Contributor

If I am not mistaken, this code duplicates six times. I think it is a good idea to add another abstraction layer. How do you think?

@lozhnikov

lozhnikov Jul 6, 2018

Contributor

If I am not mistaken, this code duplicates six times. I think it is a good idea to add another abstraction layer. How do you think?

This comment has been minimized.

@Wenhao-H

Wenhao-H Jul 7, 2018

Member

Hmm I understand what you mean but I am not sure how to add an abstraction layer. If we are going to add another layer (e.g. maybe a wrapper class), we also need to consider how to extend it to BiasSVD and SVD++ which will be implemented.

@Wenhao-H

Wenhao-H Jul 7, 2018

Member

Hmm I understand what you mean but I am not sure how to add an abstraction layer. If we are going to add another layer (e.g. maybe a wrapper class), we also need to consider how to extend it to BiasSVD and SVD++ which will be implemented.

This comment has been minimized.

@lozhnikov

lozhnikov Jul 7, 2018

Contributor

I think that should become clear as soon as BiasSVD and SVD++ are implemented.
Right now I've got the following idea:
Actually, we have 2 policies: CholevskyPolicy and SVDPolicy. Meanwhile, CholevskyPolicy depends on matrices W and H, so it has another template argument which can be equal to BatchSVDPolicy, NMFPolicy, RandomizedSVDPolicy and so on.
For example:

template<typename DecompositionPolicy>
class CholevskyNeighborhoodPolicy
{
  DecompositionPolicy p;

  void Apply(.......)
  {
    p.Apply(............);
  }

  double GetRating(..........);
  void GetNeighbourhood(.........);
};

Thus, let's wait for BiasSVD and SVD++, then maybe I'll suggest something else.

@lozhnikov

lozhnikov Jul 7, 2018

Contributor

I think that should become clear as soon as BiasSVD and SVD++ are implemented.
Right now I've got the following idea:
Actually, we have 2 policies: CholevskyPolicy and SVDPolicy. Meanwhile, CholevskyPolicy depends on matrices W and H, so it has another template argument which can be equal to BatchSVDPolicy, NMFPolicy, RandomizedSVDPolicy and so on.
For example:

template<typename DecompositionPolicy>
class CholevskyNeighborhoodPolicy
{
  DecompositionPolicy p;

  void Apply(.......)
  {
    p.Apply(............);
  }

  double GetRating(..........);
  void GetNeighbourhood(.........);
};

Thus, let's wait for BiasSVD and SVD++, then maybe I'll suggest something else.

This comment has been minimized.

@Wenhao-H

Wenhao-H Jul 7, 2018

Member

Okay! Many thanks!

@Wenhao-H

Wenhao-H Jul 7, 2018

Member

Okay! Many thanks!

@Wenhao-H

This comment has been minimized.

Show comment
Hide comment
@Wenhao-H

Wenhao-H Jul 8, 2018

Member

I added the implementation of BiasSVD and a few tests. The implementation of BiasSVD is modification on the implementation of RegularizedSVD.

Member

Wenhao-H commented Jul 8, 2018

I added the implementation of BiasSVD and a few tests. The implementation of BiasSVD is modification on the implementation of RegularizedSVD.

@lozhnikov

It's close to RegularizedSVD, it would be nice to merge them. However I am not sure how to do it neatly. Let me think about it.

@Wenhao-H

This comment has been minimized.

Show comment
Hide comment
@Wenhao-H

Wenhao-H Jul 26, 2018

Member

CFPredictSVDPPTest passed but RecommendationAccuracySVDPPTest failed. I think the hyperparameters of SVDPlusPlus need to be tuned. I will try to find better default hyperparameters for SVDPlusPlus.

Member

Wenhao-H commented Jul 26, 2018

CFPredictSVDPPTest passed but RecommendationAccuracySVDPPTest failed. I think the hyperparameters of SVDPlusPlus need to be tuned. I will try to find better default hyperparameters for SVDPlusPlus.

Wenhao-H added some commits Aug 5, 2018

CFType(const MatType& data,
DecompositionPolicy& decomposition = DecompositionPolicy(),
const DecompositionPolicy& decomposition = DecompositionPolicy(),

This comment has been minimized.

@lozhnikov

lozhnikov Aug 7, 2018

Contributor

Could you remove @tparam DecompositionPolicy from the comment above?

@lozhnikov

lozhnikov Aug 7, 2018

Contributor

Could you remove @tparam DecompositionPolicy from the comment above?

void Train(const arma::mat& data,
DecompositionPolicy& decomposition,
const DecompositionPolicy& decomposition,

This comment has been minimized.

@lozhnikov

lozhnikov Aug 7, 2018

Contributor

same comment.

@lozhnikov

lozhnikov Aug 7, 2018

Contributor

same comment.

void Train(const arma::sp_mat& data,
DecompositionPolicy& decomposition,
const DecompositionPolicy& decomposition,

This comment has been minimized.

@lozhnikov

lozhnikov Aug 7, 2018

Contributor

Same comment.

@lozhnikov

lozhnikov Aug 7, 2018

Contributor

Same comment.

Show outdated Hide outdated src/mlpack/methods/cf/cf_model.hpp
itemMean = arma::vec(arma::mean(cleanedData, 1));
// Calculate itemMean.
itemMean = arma::vec(cleanedData.n_rows, arma::fill::zeros);

This comment has been minimized.

@lozhnikov

lozhnikov Aug 11, 2018

Contributor

Nice catch! I didn't notice.

@lozhnikov

lozhnikov Aug 11, 2018

Contributor

Nice catch! I didn't notice.

@Wenhao-H

This comment has been minimized.

Show comment
Hide comment
@Wenhao-H

Wenhao-H Aug 22, 2018

Member

I am not able to reduce the number of failures in RecommendationAccuracySVDPPTest to below 17 either. I've tried different combinations of normalization methods, neighbor search methods and interpolation methods and tuned the hyperparameters. The lowest number of failures was around 25. @lozhnikov Do you think we can remove this test? Because we have CFPredictSVDPPTest to test the rsme performance of svd++.

Member

Wenhao-H commented Aug 22, 2018

I am not able to reduce the number of failures in RecommendationAccuracySVDPPTest to below 17 either. I've tried different combinations of normalization methods, neighbor search methods and interpolation methods and tuned the hyperparameters. The lowest number of failures was around 25. @lozhnikov Do you think we can remove this test? Because we have CFPredictSVDPPTest to test the rsme performance of svd++.

@lozhnikov

This comment has been minimized.

Show comment
Hide comment
@lozhnikov

lozhnikov Aug 24, 2018

Contributor

@Wenhao-H I looked cautiously through the SVD++ implementation and I think it's correct. So, I suggest to comment out the test and open an issue (and point out the issue in the comment before the test) since it could be interesting to investigate why the test fails.
Actually, the PR looks good to me. I'd like to look through it again a couple of times to be sure. I hope I'll do that by weekend.

There are two problems:

  1. The decomposition policies contain a lot of duplicated code. I suggest to split DecompositionPolicy into 2 classes: DecompositionPolicy and NeighborhoodPolicy in such a way DecompositionPolicy decomposes a matrix and NeighborhoodPolicy finds the neighbors. However I'd like to move it to another PR since I don't want to overcomplicate this PR. On the other side these changes vary API a bit so they should be done before the release. @rcurtin @zoq What do you think?
  2. Currently RegularizedSVD, SVD++ and BiasSVD are implemented in similar way. However I don't have any clear understanding how to implement it in a general way without code duplication.
    Perhaps it is a good idea to open an issue.
Contributor

lozhnikov commented Aug 24, 2018

@Wenhao-H I looked cautiously through the SVD++ implementation and I think it's correct. So, I suggest to comment out the test and open an issue (and point out the issue in the comment before the test) since it could be interesting to investigate why the test fails.
Actually, the PR looks good to me. I'd like to look through it again a couple of times to be sure. I hope I'll do that by weekend.

There are two problems:

  1. The decomposition policies contain a lot of duplicated code. I suggest to split DecompositionPolicy into 2 classes: DecompositionPolicy and NeighborhoodPolicy in such a way DecompositionPolicy decomposes a matrix and NeighborhoodPolicy finds the neighbors. However I'd like to move it to another PR since I don't want to overcomplicate this PR. On the other side these changes vary API a bit so they should be done before the release. @rcurtin @zoq What do you think?
  2. Currently RegularizedSVD, SVD++ and BiasSVD are implemented in similar way. However I don't have any clear understanding how to implement it in a general way without code duplication.
    Perhaps it is a good idea to open an issue.
@Wenhao-H

This comment has been minimized.

Show comment
Hide comment
@Wenhao-H

Wenhao-H Aug 24, 2018

Member

@lozhnikov I have opened an issue #1501 for the failing RecommendationAccuracySVDPPTest.
For the code duplication in RegularizedSVD, SVD++ and BiasSVD, I will open an issue after this PR is merged if we cannot think of a way to solve this.

Member

Wenhao-H commented Aug 24, 2018

@lozhnikov I have opened an issue #1501 for the failing RecommendationAccuracySVDPPTest.
For the code duplication in RegularizedSVD, SVD++ and BiasSVD, I will open an issue after this PR is merged if we cannot think of a way to solve this.

@lozhnikov

This comment has been minimized.

Show comment
Hide comment
@lozhnikov

lozhnikov Sep 2, 2018

Contributor

Finally, I think the PR is ready to merge. However I'd like to propose some fixes
various-fixes.zip in order to reduce the number of warnings. Also I fixed some bounds in the SVD++ test, now they correspond to RegularizedSVDTest.

If the PR doesn't block anything I suggest to wait 2--3 or 5 days in case if somebody wants to look through it and then merge. Actually, you can merge the PR by yourself since you are a member of the Contributors team:)

Thanks for the contribution!

Contributor

lozhnikov commented Sep 2, 2018

Finally, I think the PR is ready to merge. However I'd like to propose some fixes
various-fixes.zip in order to reduce the number of warnings. Also I fixed some bounds in the SVD++ test, now they correspond to RegularizedSVDTest.

If the PR doesn't block anything I suggest to wait 2--3 or 5 days in case if somebody wants to look through it and then merge. Actually, you can merge the PR by yourself since you are a member of the Contributors team:)

Thanks for the contribution!

@Wenhao-H

This comment has been minimized.

Show comment
Hide comment
@Wenhao-H

Wenhao-H Sep 3, 2018

Member

@lozhnikov Thanks a lot! I will make the changes you proposed probably in the next two days. This is a large PR so let's wait for 5 days before merge.
There is one more thing. The travis build failure is caused by LMNN test. Can we ignore this failure and go ahead to merge this PR?

Member

Wenhao-H commented Sep 3, 2018

@lozhnikov Thanks a lot! I will make the changes you proposed probably in the next two days. This is a large PR so let's wait for 5 days before merge.
There is one more thing. The travis build failure is caused by LMNN test. Can we ignore this failure and go ahead to merge this PR?

@lozhnikov

This comment has been minimized.

Show comment
Hide comment
@lozhnikov

lozhnikov Sep 3, 2018

Contributor

@Wenhao-H The test isn't connected with the PR. Perhaps the failure is fixed in the master branch.

Contributor

lozhnikov commented Sep 3, 2018

@Wenhao-H The test isn't connected with the PR. Perhaps the failure is fixed in the master branch.

@rcurtin

This comment has been minimized.

Show comment
Hide comment
@rcurtin

rcurtin Sep 3, 2018

Member

Yes, the LMNNTest failure was fixed in #1502, so don't worry about it here.

Member

rcurtin commented Sep 3, 2018

Yes, the LMNNTest failure was fixed in #1502, so don't worry about it here.

@Wenhao-H

This comment has been minimized.

Show comment
Hide comment
@Wenhao-H

Wenhao-H Sep 4, 2018

Member

Hmm I'm not sure I understand the AppVeyor build failure. Can someone give me a hand?

Member

Wenhao-H commented Sep 4, 2018

Hmm I'm not sure I understand the AppVeyor build failure. Can someone give me a hand?

@lozhnikov

Why didn't you apply the patch as is? It also fixes some bounds in the SVD++ test. They were incredibly high.

Show outdated Hide outdated src/mlpack/methods/cf/cf_model_impl.hpp
@Wenhao-H

This comment has been minimized.

Show comment
Hide comment
@Wenhao-H

Wenhao-H Sep 4, 2018

Member

Oops I didn't know I can directly apply the patch in git. Will do that shortly.

Member

Wenhao-H commented Sep 4, 2018

Oops I didn't know I can directly apply the patch in git. Will do that shortly.

@lozhnikov

This comment has been minimized.

Show comment
Hide comment
@lozhnikov

lozhnikov Sep 4, 2018

Contributor

@Wenhao-H Yeah, you can apply the patch using the following command:

patch -p1 < path/to/patch

(the working dir should be the root of the repo)

Contributor

lozhnikov commented Sep 4, 2018

@Wenhao-H Yeah, you can apply the patch using the following command:

patch -p1 < path/to/patch

(the working dir should be the root of the repo)

@Wenhao-H

This comment has been minimized.

Show comment
Hide comment
@Wenhao-H

Wenhao-H Sep 6, 2018

Member

I'll go ahead and merge it now. Thanks for reviewing! @lozhnikov

Member

Wenhao-H commented Sep 6, 2018

I'll go ahead and merge it now. Thanks for reviewing! @lozhnikov

@Wenhao-H Wenhao-H merged commit bcaf2d0 into mlpack:master Sep 6, 2018

6 checks passed

LaTeX Documentation Checks Build finished.
Details
Memory Checks Build finished.
Details
Static Code Analysis Checks Build finished.
Details
Style Checks Build finished.
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment