maximal information coefficient for feature selection #25771
Comments
I want to look into doing this, but can you tell me where in the repo it should be built? I have not contributed to this repo before, and a stronger description would be helpful. Thank you!
@leeauk21 Do you have more details on why this metric is useful? It seems that using mutual information directly is enough and that MIC does not provide anything more. If you have additional thoughts and background, they would help us evaluate whether or not we should integrate this feature.
I think mutual information values can vary depending on how you bin continuous data, whereas in theory MIC values won't. But if MI works better, then there is no reason to implement it. I found the above reasoning in Machine Learning: A Probabilistic Perspective.
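To make the binning point concrete, here is a minimal sketch of a plug-in MI estimate computed from an equal-width 2D histogram at several resolutions. The helper name `binned_mutual_info`, the synthetic data, and the bin counts are illustrative assumptions, not anything from scikit-learn or from the book.

```python
import numpy as np

def binned_mutual_info(x, y, bins):
    """Plug-in mutual information estimate (in nats) from an equal-width 2D histogram.

    Hypothetical helper for illustration only; this is not a scikit-learn function.
    """
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = counts / counts.sum()              # empirical joint distribution
    px = pxy.sum(axis=1, keepdims=True)      # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)      # marginal of y, shape (1, bins)
    mask = pxy > 0                           # skip empty cells to avoid log(0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

# Synthetic, assumed data: a noisy parabola.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=500)
y = x**2 + rng.normal(scale=0.1, size=500)

# Same data, different bin counts: the estimate changes with the grid.
mi_by_bins = {b: binned_mutual_info(x, y, b) for b in (4, 8, 16, 32)}
for b, mi in mi_by_bins.items():
    print(f"bins={b:2d}  MI estimate={mi:.3f} nats")
```

Because each finer grid here refines the coarser one, the plug-in estimate can only stay the same or grow as the number of bins increases, so the reported value depends on the grid as well as on the data. Note that scikit-learn's own `mutual_info_regression` uses a nearest-neighbor estimator rather than a histogram, so this sketch illustrates the general concern rather than that function's behavior.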
If someone can craft a use case where MIC stably returns good values while MI can catastrophically fail depending on the binning, then why not. Otherwise it sounds like a YAGNI.
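As a starting point for crafting such a use case, here is a rough sketch of an MIC-style score: the maximum, over a family of grids, of the binned MI divided by the log of the smaller grid dimension. It is only an approximation under stated assumptions. It searches equal-frequency grids only, whereas MIC as usually defined also optimizes the bin boundaries, so this is not the full algorithm. The names `quantile_codes`, `grid_mutual_info`, and `mic_like_score` are hypothetical, and the grid-size budget `n ** 0.6` is assumed here as the commonly cited default.

```python
import numpy as np

def quantile_codes(v, k):
    """Assign each value to one of k roughly equal-frequency bins (codes 0..k-1)."""
    edges = np.quantile(v, np.linspace(0, 1, k + 1)[1:-1])
    return np.searchsorted(edges, v, side="right")

def grid_mutual_info(cx, cy, kx, ky):
    """Plug-in mutual information (nats) of two integer-coded variables."""
    counts = np.zeros((kx, ky))
    np.add.at(counts, (cx, cy), 1)           # build the contingency table
    pxy = counts / counts.sum()
    outer = pxy.sum(axis=1, keepdims=True) @ pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / outer[mask])))

def mic_like_score(x, y, alpha=0.6):
    """Max normalized MI over equal-frequency grids with kx * ky <= n ** alpha.

    Hypothetical simplification: a true MIC search would also move the bin edges.
    """
    budget = len(x) ** alpha
    best = 0.0
    kx = 2
    while kx * 2 <= budget:
        ky = 2
        while kx * ky <= budget:
            mi = grid_mutual_info(quantile_codes(x, kx), quantile_codes(y, ky), kx, ky)
            best = max(best, mi / np.log(min(kx, ky)))
            ky += 1
        kx += 1
    return best

# Synthetic, assumed data: two dependent relationships and one independent pair.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=500)
scores = {
    "linear": mic_like_score(x, x + rng.normal(scale=0.1, size=500)),
    "parabola": mic_like_score(x, x**2 + rng.normal(scale=0.1, size=500)),
    "independent": mic_like_score(x, rng.normal(size=500)),
}
for name, s in scores.items():
    print(f"{name:12s} score={s:.3f}")
```

A convincing use case would compare a score like this against an MI estimate across several relationship types and noise levels, ideally with a reference MIC implementation rather than this simplified grid search.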
@ogrisel Hi, would this paper suffice to show that MIC is useful? The point here is that, in most cases, MIC gives similar scores to equally noisy relationships of different types, but MI does not. Though MIC is computationally more complex, there are effective approximations that can be evaluated more easily. The paper also gives practical suggestions on when and how to use MIC (starting from page 22 in the paper). It says:
Therefore, I do think adding the MIC metric as an option would be useful in specific cases. That being said, it is up to you and other maintainers to decide whether it is a useful feature for
Describe the workflow you want to enable
maximal information coefficient
Describe your proposed solution
maximal information coefficient
Describe alternatives you've considered, if relevant
No response
Additional context
No response