New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML-313] Added testing suite for Classification, Regression and Clustering #258

Merged
merged 6 commits into from Nov 28, 2016

Conversation

Projects
None yet
3 participants
@vivekaxl
Contributor

vivekaxl commented Sep 30, 2016

This Testing Suite would run multiple runs of algorithms and compare results to benchmark with scikit-learn (python implementation). The comparison is based on a Student T-Test.

@vivekaxl vivekaxl changed the title from Added testing suite for Classification, Regression and Clustering to [ML-313] Added testing suite for Classification, Regression and Clustering Sep 30, 2016

@richardkchapman

This comment has been minimized.

Member

richardkchapman commented Oct 3, 2016

@johnholt Please review

@johnholt

TestingSuite/Utils/ReturnSet is just a "todo" comment, it should be removed. TestingSuite/Clustering/TestKMeans.ecl.bak should be removed from the commit.

There looked like there were several cases with tabs instead of blanks. Please make sure that you have the whitespace errors turned on so the commit tells you about bad whitespace.

IMPORT TestingSuite.Classification as Classification;
IMPORT TestingSuite.Clustering as Clustering;
IMPORT TestingSuite.Regression as Regression;
IMPORT TestingSuite.TimeSeries as TimeSeries;

This comment has been minimized.

@johnholt

johnholt Nov 15, 2016

Collaborator

Import is in error. The attribute was not in your commit.

#END
%source_code%;
o := output(results, named('results'));
RETURN when(results, o);

This comment has been minimized.

@johnholt

johnholt Nov 15, 2016

Collaborator

This looks like some debugging code that was left in error. Please either remove or comment.

@johnholt

I think that the Logistic Regression Tests were missed in the commit.

@@ -0,0 +1,65 @@
//RandomForest.ecl

This comment has been minimized.

@johnholt

johnholt Nov 15, 2016

Collaborator

Bad comment. I could not find any case where this attribute was actually used. Has it been tested?

IMPORT ML.Tests.Explanatory as TE;
IMPORT * FROM ML.Types;
IMPORT * FROM TestingSuite.Utils;
IMPORT TestingSuite.Classification as Classification;

This comment has been minimized.

@johnholt

johnholt Nov 15, 2016

Collaborator

Please use explicit imports instead of IMPORT * so that these don't have any conflicts if legacy import rules are enabled.

RETURN (REAL)AVE(results, results.result);
ENDMACRO;
//OUTPUT(TestRandomForestClassification(Classification.Datasets.ecoliDS.content, 3));

This comment has been minimized.

@johnholt

johnholt Nov 15, 2016

Collaborator

Strange comment. I expected to see the logistic regression test.

@johnholt

Looks good. Thanks.

@vivekaxl

This comment has been minimized.

Contributor

vivekaxl commented Nov 22, 2016

@johnholt: Is this PR ready to be merged?

@johnholt

This comment has been minimized.

Collaborator

johnholt commented Nov 25, 2016

@richardkchapman I have reviewed the changes and they are good to merge.

@richardkchapman

This comment has been minimized.

Member

richardkchapman commented Nov 28, 2016

@johnholt What release should this be targetting? 6.2.0? 6.4.0? 7.0?

@johnholt

This comment has been minimized.

Collaborator

johnholt commented Nov 28, 2016

It is not related to any specific release.

This is the old ecl-ml library, not the new one. I think we would want to keep this library flexible until we officially obsolete it.

@richardkchapman richardkchapman merged commit 1f81c0c into hpcc-systems:master Nov 28, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment