[FEATURE][ML] A measure of strength of relationship between RVs plus better seeding of feature sample probabilities for boosted tree #488

Merged


@tveasey tveasey commented May 29, 2019

This implements the (refined) "maximal information coefficient" measure of the strength of the relationship between two variables and uses it to initialise the feature sample probabilities for the boosted tree. It also puts in place a mechanism which, if there are insufficient training data, restricts the features used to those variables with the strongest relationship with the dependent variable.
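
To illustrate the idea only, here is a minimal sketch of how per-feature relationship scores could seed the sample probabilities and restrict the feature set. It is not the code in this PR: the function name `initialFeatureSampleProbabilities`, the `rowsPerFeature` threshold and the proportional normalisation are all assumptions made for the example, and the scores are taken as given rather than computed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not the ml-cpp API). Given one score per feature
// measuring the strength of its relationship with the dependent variable
// (for example a MIC value in [0, 1]), return one sample probability per
// feature. Features which are dropped get probability zero.
std::vector<double> initialFeatureSampleProbabilities(const std::vector<double>& scores,
                                                      std::size_t numberRows,
                                                      std::size_t rowsPerFeature) {
    assert(rowsPerFeature > 0);

    // Assumed rule: allow one feature per rowsPerFeature training rows,
    // but always keep at least one feature.
    std::size_t maxFeatures = std::max<std::size_t>(numberRows / rowsPerFeature, 1);

    // Order the feature indices by decreasing score and keep the strongest.
    std::vector<std::size_t> selected(scores.size());
    std::iota(selected.begin(), selected.end(), 0);
    std::stable_sort(selected.begin(), selected.end(),
                     [&scores](std::size_t lhs, std::size_t rhs) {
                         return scores[lhs] > scores[rhs];
                     });
    selected.resize(std::min(maxFeatures, selected.size()));

    double total = 0.0;
    for (auto i : selected) {
        total += scores[i];
    }

    std::vector<double> probabilities(scores.size(), 0.0);
    for (auto i : selected) {
        // If no selected feature carries any signal fall back to uniform.
        probabilities[i] = total > 0.0 ? scores[i] / total : 1.0 / selected.size();
    }
    return probabilities;
}
```

For example, with scores `{0.6, 0.1, 0.3, 0.0}`, 200 rows and an assumed 100 rows per feature, only the first and third features are kept and they are sampled with probabilities of roughly 2/3 and 1/3.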

@tveasey tveasey requested a review from valeriy42 July 2, 2019 15:28

@valeriy42 valeriy42 left a comment


Good job on implementing MICe! 👏 I have a number of comments which aim to improve readability.

include/maths/CMic.h (outdated)
lib/maths/CDataFrameUtils.cc
lib/maths/CDataFrameUtils.cc (outdated)
lib/maths/CDataFrameUtils.cc (outdated)
lib/maths/CMic.cc
lib/maths/CMic.cc (outdated)
lib/maths/CMic.cc
lib/maths/unittest/CDataFrameUtilsTest.cc
lib/maths/unittest/CMicTest.cc (outdated)
lib/maths/unittest/CMicTest.cc

tveasey commented Jul 5, 2019

Thanks for the review @valeriy42! I think I've addressed all your comments. Could you take another look?


@valeriy42 valeriy42 left a comment


LGTM. Good job!

@tveasey tveasey merged commit 8bfe832 into elastic:feature/regression Jul 5, 2019
@tveasey tveasey deleted the regression-tune-feature-weights branch July 5, 2019 12:06