Did you plan to drop duplicated correlation coefficients? #60

Closed
such3r opened this issue Dec 29, 2022 · 3 comments


such3r commented Dec 29, 2022

correlation_dataframe = df.corr().abs().unstack().sort_values().drop_duplicates()

Consider changing this to the following:
correlation_dataframe = df.corr().abs().unstack().sort_values().round(7).drop_duplicates()

Note that .round(7) is added before .drop_duplicates(), since several values can be close to, but not exactly equal to, one or zero. For some problems the number of decimals to round to can be even smaller than 7.
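
A minimal sketch of why the rounding step matters. The values below are made up to stand in for correlation coefficients (they are not featurewiz output); they only show that floats differing in the last bit are not treated as duplicates until they are rounded.

```python
import pandas as pd

# Hypothetical values standing in for correlation coefficients.
# 0.9999999999999999 and 0.1 + 0.2 differ from 1.0 and 0.3 only by float error.
coeffs = pd.Series([1.0, 0.9999999999999999, 0.3, 0.1 + 0.2])

# Exact comparison: every value is distinct, so nothing is dropped.
exact = coeffs.drop_duplicates()

# Rounding first collapses the near-equal values into true duplicates.
rounded = coeffs.round(7).drop_duplicates()

print(len(exact), len(rounded))
```

The number of decimals is a judgment call: 7 is the value proposed in this issue, and a smaller number merges values more aggressively.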

@AutoViML
Owner

Hi @such3r
Thanks for testing out featurewiz 👍

I liked your suggestion and changed the code. Please upgrade to the latest version and try it again.
Thanks
AutoVimal

@Harry040

Why use drop_duplicates? That will drop features with 0 correlation.

@AutoViML
Owner

Hi @Harry040
Thanks for picking up on this thread. I tested it on multiple dataframes and have not seen this happen yet. Can you provide a code sample where it would have the impact you are suggesting?
Thanks
Auto Vimal
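
A minimal sketch of the kind of sample being asked for, using a small made-up DataFrame whose three columns are mutually uncorrelated by construction. It assumes only the expression quoted in this issue. How featurewiz uses the result afterwards is not shown in this thread, so whether the collapsed entries matter downstream is left open.

```python
import pandas as pd

# Hypothetical data: columns a, b and c are pairwise uncorrelated by construction.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [1.0, -1.0, -1.0, 1.0],
    "c": [1.0, -3.0, 3.0, -1.0],
})

# The expression from this issue, split before the final drop_duplicates().
pairs = df.corr().abs().unstack().sort_values().round(7)
deduped = pairs.drop_duplicates()

# Off-diagonal entries are the (feature, feature) pairs with distinct features.
off_diag = pairs.index.get_level_values(0) != pairs.index.get_level_values(1)
zero_pairs_before = int((pairs[off_diag] == 0).sum())
zero_pairs_after = int((deduped == 0).sum())

print(zero_pairs_before, zero_pairs_after)
print(deduped)
```

drop_duplicates() compares values only, not index labels, so every distinct pair of features that shares the same coefficient is reduced to a single row.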

AutoViML reopened this Dec 17, 2023