[SYSTEMDS-200] Gaussian Mixture Model builtin#976
private void runGMMTest(int G_mixtures, String model, String init_param, int iter, double reg, boolean rewrite,
	LopProperties.ExecType instType) {
	Types.ExecMode platformOld = setExecMode(instType);
	OptimizerUtils.ALLOW_ALGEBRAIC_SIMPLIFICATION = rewrite;
Spark test passed after setting OptimizerUtils.ALLOW_ALGEBRAIC_SIMPLIFICATION to false.
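Since the test flips a global optimizer flag, one common pattern is to save the old value and restore it in a `finally` block so later tests are unaffected. The following is only a minimal, self-contained sketch of that pattern: the class `RewriteFlagSketch`, its static field, and the helper `runWithRewrite` are hypothetical stand-ins, not the actual SystemDS test code.

```java
public class RewriteFlagSketch {
	// Hypothetical stand-in for OptimizerUtils.ALLOW_ALGEBRAIC_SIMPLIFICATION.
	static boolean ALLOW_ALGEBRAIC_SIMPLIFICATION = true;

	// Runs the body with the flag set to `rewrite`, then restores the
	// previous value even if the body throws.
	static void runWithRewrite(boolean rewrite, Runnable body) {
		boolean old = ALLOW_ALGEBRAIC_SIMPLIFICATION;
		ALLOW_ALGEBRAIC_SIMPLIFICATION = rewrite;
		try {
			body.run();
		}
		finally {
			ALLOW_ALGEBRAIC_SIMPLIFICATION = old;
		}
	}
}
```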
Baunsgaard left a comment:
Hi @Shafaq-Siddiqi,
I think this looks awesome, 🥇
Unfortunately, I'm not that well versed in the internals of GMM, so I hope the math is in order, but looking at the internal operations, I'm really looking forward to trying it out on the compressed representation!
I have some minor comments and questions that it would be great if you could answer!
On the less important side, the script gmm.dml as it is now does not follow what I thought to be the coding guidelines:
- Input variable names should be CamelCase.
- Indentation should use tabs, not double spaces.
I'm fine if you don't want to change it, but I would really value it if we could aim for some consistency between our scripts. I can also change my formatting to the settings you use.
I also can't help but notice that no doc entry has been added for this new operation; adding one would also be great!
{
  resp = Rand(rows = nrow(X), cols=n_components)
  resp = resp/rowSums(resp)
}
What other initialization methods could be applied, and with what benefits?
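For context, the random initialization in the snippet fills an `nrow(X)` by `n_components` matrix with random values and divides each row by its sum, so every row is a valid set of soft assignments. A rough Java sketch of the same idea follows; the class and method names are hypothetical, and uniform draws in [0, 1) are an assumption about the `Rand` defaults. The PR description also mentions kmeans as an alternative initialization.

```java
import java.util.Random;

public class RandomRespSketch {
	// Draws an n x k matrix of uniform random values and divides each row by
	// its sum, mirroring resp = Rand(...); resp = resp / rowSums(resp).
	// Assumption: draws are uniform in [0, 1); a zero row sum is not handled.
	static double[][] randomResponsibilities(int n, int k, long seed) {
		Random rng = new Random(seed);
		double[][] resp = new double[n][k];
		for (int i = 0; i < n; i++) {
			double rowSum = 0;
			for (int j = 0; j < k; j++) {
				resp[i][j] = rng.nextDouble();
				rowSum += resp[i][j];
			}
			// Normalize so the responsibilities of row i sum to 1.
			for (int j = 0; j < k; j++)
				resp[i][j] /= rowSum;
		}
		return resp;
	}
}
```

Each row of the result is non-negative and sums to 1, which is all the first M-step needs from the initial responsibilities.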
else if (model == "VII")
  cov_param = n_components
else
  stop("invalid model expecting any of [VVV,EEE,VVI,VII], found "+model)
Wouldn't the stop in the previous validation step already catch this?
# NAME          TYPE     DEFAULT  MEANING
# ---------------------------------------------------------------------------------------------
# X             Double   ---      Matrix X
# n_components  Integer  3        Number of n_components in the Gaussian mixture model
The default value of n_components is 1 on line 55.
Merged into Master.
New builtin for Gaussian Mixture Models with four covariance types, namely VVV, EEE, VVI (diagonal), and VII (spherical); please see the script header for more details.
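To make the difference between the four covariance types concrete, the sketch below counts the free covariance parameters for `k` components and `d` features using the textbook formulas (full, tied, diagonal, spherical). The class `CovParamSketch` and the method `covParams` are hypothetical, and this is not necessarily how gmm.dml computes `cov_param`; only the VII case (`n_components`) appears in the reviewed snippet.

```java
public class CovParamSketch {
	// Number of free covariance parameters for k components and d features.
	static long covParams(String model, int k, int d) {
		switch (model) {
			case "VVV": // one full symmetric d x d matrix per component
				return (long) k * d * (d + 1) / 2;
			case "EEE": // a single full symmetric matrix shared by all components
				return (long) d * (d + 1) / 2;
			case "VVI": // one diagonal (d variances) per component
				return (long) k * d;
			case "VII": // one variance per component
				return k;
			default:
				throw new IllegalArgumentException(
					"invalid model expecting any of [VVV,EEE,VVI,VII], found " + model);
		}
	}
}
```

For example, with 3 components and 4 features, the full model needs 30 covariance parameters while the spherical one needs only 3.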
The Spark tests that use kmeans as the initialization function are commented out due to a runtime exception in the kmeans execution on the following instruction:
SPARK°rexpand°cast=true°max=_Var872°ignore=false°dir=cols°target=_mVar884°_mVar885·MATRIX·FP64