Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement wrapper for diagonally-constrained GMM HMMs. #1666

Merged
merged 52 commits into from Apr 8, 2019

Conversation

Projects
None yet
4 participants
@KimSangYeon-DGU
Copy link
Contributor

commented Jan 15, 2019

Implement wrapper for diagonally-constrained GMM HMMs related to #1658.

TODO

DONE
[GMM part]

  • Make gmm_diag.hpp.
  • Make gmm_diag_impl.hpp.
  • Make gmm_diag.cpp.
  • Make a test case for DiagonalGMM.
  • Add some test cases for DiagonalGMM in gmm_test.cpp.

[HMM part]

  • Modify hmm_model.hpp to support DiagonalGMM.
  • Adapt hmm_train_main.cpp, hmm_viterbi_main.cpp, hmm_loglik_main.cpp, and hmm_generate_main.cpp to support DiagonalGMM.
  • Adapt a test case in hmm_generate_test.cpp, hmm_loglik_test.cpp, hmm_train_test.cpp and hmm_viterbi_test.cpp to support DiagonalGMM.
  • Make some test cases for DiagonalGMM HMMs in hmm_test.cpp.
  • Compare a training performance between diagonal and regular GMM HMMs.
  • Add the test cases DiagonalGMMHMMOneGaussianLabeledTrainingTest, and DiagonalGMMHMMMultipleGaussiansUnlabeledTrainingTest.

[Editing & Refactoring]

  • Use InitialClusteringType instead of FittingType in gmm_diag_impl.hpp.
  • Change r * r.t() to r at line 282 in hmm_train_main.cpp.
  • Create DiagCovGaussianDistribution class that internally can hold arma::vec for the mean, an arma::mat for the diagonal covariance entries, and the inverse of each covariance entry.
  • Add test cases for DiagCovGaussianDistribution class.
  • Edit DiagonalGMM and EMFit classes to support DiagCovGaussianDistribution.
  • Edit and refactor the codes according to the reviews.
  • Add the comment about DiagonalConstraint will be deprecated in HISTORY.md.
  • Change the class name DiagCovGaussianDistribution to DiagonalGaussianDistribution.
@rcurtin
Copy link
Member

left a comment

Hi there @KimSangYeon-DGU, this is looking great. 👍 I looked through most of the code at a high level and it is looking good to me. I expect when you compare this with regular GMMs it will perform quite favorably. I only have a few comments that I've left, so I hope they are helpful. I haven't given the code a full detailed review yet, but when I do that I would expect any comments to be really minor (style, comments, test strategy, etc.).

I'm looking forward to seeing how well it performs! :)

Show resolved Hide resolved src/mlpack/methods/gmm/gmm_diag_impl.hpp Outdated
Show resolved Hide resolved src/mlpack/methods/hmm/hmm_train_main.cpp Outdated
Show resolved Hide resolved src/mlpack/methods/gmm/gmm_diag.hpp Outdated

@KimSangYeon-DGU KimSangYeon-DGU force-pushed the KimSangYeon-DGU:develop-gmm branch 3 times, most recently from 74b845f to 58bde0b Feb 5, 2019

@KimSangYeon-DGU

This comment has been minimized.

Copy link
Contributor Author

commented Feb 7, 2019

Hey @rcurtin. This is a list of updates. :)

[distribution_test.cpp]

  • In distribution_test.cpp, I calculated the ground truth values by using R libraries, dmvnorm, and cov.wt.

[diag_cov_gaussian_distribution.cpp]

  • For the unbiased estimator in the weighted sample case, I applied a formula to Train() that has a probabilities parameter. Below is the formula. And this is implemented in R library as well.

  • each weight
    each row vector of observation
    weighted mean
    sum of each weight squared
    For more details, please refer to Wikipedia

[em_fit.hpp/cpp]

  • I added a template paramter DistributionType to the EMFit class to support two types of distribution. And I set the default value to distribution::GaussianDistribution for the backward compatibility.

TODO

  • Now, although I make some test cases in main_tests/hmm_*_test.cpp, I noticed I should make some test cases in hmm_test.cpp and then I'll try to compare the performance between regular and digonal GMM HMMs.

  • Change DistributionType in em_fit.hpp/cpp to Distribution for unity with hmm.hpp/cpp

If I make some test cases in hmm_test.cpp, I think it is ready for a review. :)
Please, let me know if I should clarify anything. :)

@KimSangYeon-DGU

This comment has been minimized.

Copy link
Contributor Author

commented Feb 9, 2019

@mlpack-jenkins test this please

@KimSangYeon-DGU KimSangYeon-DGU force-pushed the KimSangYeon-DGU:develop-gmm branch 5 times, most recently from 312369b to 8ed72af Feb 9, 2019

@rcurtin
Copy link
Member

left a comment

Hi @KimSangYeon-DGU, thanks for the clear updates.

In distribution_test.cpp, I calculated the ground truth values by using R libraries, dmvnorm, and cov.wt.

Can you add a comment to the tests pointing out that that's where the values came from?

Everything looks pretty good to me overall, although I've left a handful of comments about possible changes (some semi-major, mostly minor). Looking through the tests I think there are a few we could add---

  • Train a diagonal GMM with one Gaussian but with probabilities
  • Train a diagonal GMM with multiple Gaussians
  • Train a diagonal GMM with multiple Gaussians and probabilities
  • The same for HMMs with diagonal GMMs

We should also add a note to HISTORY.md to point out the change. And, we can deprecate DiagonalConstraint, since a user should just use DiagonalGMM instead now.

Thanks so much for the hard work with this. This is definitely not a trivial task, and when we have it done it should significantly improve the speed of the HMM implementation for diagonal GMMs. 👍

Show resolved Hide resolved src/mlpack/core/dists/diag_cov_gaussian_distribution.cpp Outdated
Show resolved Hide resolved src/mlpack/core/dists/diag_cov_gaussian_distribution.cpp Outdated
Show resolved Hide resolved src/mlpack/core/dists/diag_cov_gaussian_distribution.hpp Outdated
Show resolved Hide resolved src/mlpack/core/dists/diag_cov_gaussian_distribution.hpp Outdated
Show resolved Hide resolved src/mlpack/core/dists/diag_cov_gaussian_distribution.hpp Outdated
Show resolved Hide resolved src/mlpack/methods/hmm/hmm_model.hpp Outdated
Show resolved Hide resolved src/mlpack/tests/gmm_test.cpp Outdated
Show resolved Hide resolved src/mlpack/tests/main_tests/hmm_generate_test.cpp Outdated
Show resolved Hide resolved src/mlpack/tests/main_tests/hmm_train_test.cpp Outdated
Show resolved Hide resolved src/mlpack/methods/hmm/hmm_train_main.cpp
@rcurtin

This comment has been minimized.

Copy link
Member

commented Mar 16, 2019

No problem, I'm happy to wait to merge until you guys think it's good. 👍

Show resolved Hide resolved src/mlpack/tests/main_tests/hmm_viterbi_test.cpp Outdated
Show resolved Hide resolved src/mlpack/methods/hmm/hmm_model.hpp Outdated
Show resolved Hide resolved src/mlpack/methods/gmm/em_fit_impl.hpp
Show resolved Hide resolved src/mlpack/tests/hmm_test.cpp

@KimSangYeon-DGU KimSangYeon-DGU force-pushed the KimSangYeon-DGU:develop-gmm branch from cc36895 to b7f9cd0 Mar 16, 2019

@KimSangYeon-DGU KimSangYeon-DGU force-pushed the KimSangYeon-DGU:develop-gmm branch from b7f9cd0 to bfba93b Mar 16, 2019

@KimSangYeon-DGU

This comment has been minimized.

Copy link
Contributor Author

commented Mar 18, 2019

@zoq, @ShikharJ. Thanks Marcus, Shikhar! :)

@zoq

zoq approved these changes Mar 18, 2019

Copy link
Member

left a comment

No further comments from my side, thanks for the great contribution.

@KimSangYeon-DGU

This comment has been minimized.

Copy link
Contributor Author

commented Mar 19, 2019

@ShikharJ. Can you please review this? Actually, after this PR gets merged, I'll work on conversion between DiagonalGMM and GMM when training using mlpack_gmm_train.

@ShikharJ
Copy link
Member

left a comment

Some minor comments. This PR is huge, and as such it will take me some time to review it in full and be confident about the implementation. @KimSangYeon-DGU If you require the code that is present in this PR, please add it to your other branch and then open a PR. I don't wish to rush a review, since this is substantial.

Show resolved Hide resolved src/mlpack/methods/gmm/em_fit.hpp Outdated
Show resolved Hide resolved src/mlpack/methods/gmm/em_fit_impl.hpp Outdated
Show resolved Hide resolved src/mlpack/core/dists/diagonal_gaussian_distribution.hpp Outdated
Show resolved Hide resolved src/mlpack/core/dists/diagonal_gaussian_distribution.hpp Outdated
Show resolved Hide resolved src/mlpack/core/dists/diagonal_gaussian_distribution.hpp Outdated
Show resolved Hide resolved src/mlpack/methods/gmm/diagonal_gmm.cpp
@KimSangYeon-DGU

This comment has been minimized.

Copy link
Contributor Author

commented Mar 19, 2019

Thanks for your sincere advice and comments @ShikharJ. I definitely agree with you.

@KimSangYeon-DGU KimSangYeon-DGU force-pushed the KimSangYeon-DGU:develop-gmm branch 2 times, most recently from c7e9d1d to 7441182 Mar 23, 2019

@KimSangYeon-DGU

This comment has been minimized.

Copy link
Contributor Author

commented Mar 23, 2019

Hi everyone, according to the review, I edited comments except for two spaces parts. And I noticed the conversation, so the inline method LogProbability() was moved to class itself in the DiagonalGaussianDistribution.

@KimSangYeon-DGU KimSangYeon-DGU force-pushed the KimSangYeon-DGU:develop-gmm branch 2 times, most recently from 4ecc266 to ca207b5 Mar 23, 2019

@KimSangYeon-DGU KimSangYeon-DGU force-pushed the KimSangYeon-DGU:develop-gmm branch from ca207b5 to db5dc64 Mar 23, 2019

@ShikharJ
Copy link
Member

left a comment

Minor issues.

Show resolved Hide resolved src/mlpack/methods/gmm/positive_definite_constraint.hpp
Show resolved Hide resolved src/mlpack/core/dists/diagonal_gaussian_distribution.cpp Outdated
Show resolved Hide resolved src/mlpack/core/dists/diagonal_gaussian_distribution.hpp Outdated
Show resolved Hide resolved src/mlpack/core/dists/diagonal_gaussian_distribution.hpp Outdated
Show resolved Hide resolved src/mlpack/methods/gmm/diagonal_gmm.hpp Outdated
Show resolved Hide resolved src/mlpack/tests/hmm_test.cpp Outdated
Show resolved Hide resolved src/mlpack/tests/hmm_test.cpp Outdated
Show resolved Hide resolved src/mlpack/tests/hmm_test.cpp Outdated
Show resolved Hide resolved src/mlpack/tests/hmm_test.cpp Outdated
Show resolved Hide resolved src/mlpack/tests/main_tests/hmm_train_test.cpp Outdated
@ShikharJ

This comment has been minimized.

Copy link
Member

commented Mar 30, 2019

Holy cow! It took me more than three hours just to read and understand the code, let alone reason its correctness. I'll need some more time on this.

@KimSangYeon-DGU

This comment has been minimized.

Copy link
Contributor Author

commented Apr 4, 2019

@ShikharJ Sorry for late response. I edited this PR according to the reviews. :) Thanks!!

@ShikharJ
Copy link
Member

left a comment

Just a minor comment, rest all looks good. I think I am reasonably convinced of the correctness of the code. Can't suggest any optimizations now, but we can work on them later on. Next time, please try breaking your PRs into smaller ones, so that it is easier to provide reviews.

@KimSangYeon-DGU

This comment has been minimized.

Copy link
Contributor Author

commented Apr 7, 2019

@ShikharJ Thanks for reviewing this. :)

Can't suggest any optimizations now, but we can work on them later on.

Sure :)

Next time, please try breaking your PRs into smaller ones, so that it is easier to provide reviews.

Agreed, I'll do that next time. Thanks for your hard work again.

@rcurtin rcurtin merged commit 3ad649c into mlpack:master Apr 8, 2019

6 checks passed

LaTeX Documentation Checks Build finished.
Details
Memory Checks Build finished.
Details
Static Code Analysis Checks Build finished.
Details
Style Checks Build finished.
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@rcurtin

This comment has been minimized.

Copy link
Member

commented Apr 8, 2019

Really happy to see this merged. 👍 Thanks again for the hard work on this!

@KimSangYeon-DGU KimSangYeon-DGU deleted the KimSangYeon-DGU:develop-gmm branch Apr 8, 2019

@KimSangYeon-DGU

This comment has been minimized.

Copy link
Contributor Author

commented Apr 8, 2019

Thanks, @rcurtin @zoq @ShikharJ :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.