plot(df) and plot_correlation(df) fail when data has 'list' columns #48
Comments
@jnwang I think the reason is that plot_correlation() only supports numerical data. A correlation value cannot be computed for list or categorical columns. I believe we can solve this problem in the next version. Thanks!
Can you please check how df.corr() handles this? It would be better if we could make plot_correlation() consistent with df.corr().
@jnwang @dovahcrow df.corr() does not handle this either. It produces no output when called on this data.
It looks like df.corr() automatically ignores non-numerical columns. Can we do the same rather than raise an error? https://www.geeksforgeeks.org/python-pandas-dataframe-corr/amp/
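To make the "ignore non-numerical columns" behaviour concrete, here is a minimal sketch. It is not the library's actual fix: `numeric_corr` is a hypothetical helper and the sample data is invented for illustration. It filters columns explicitly with `select_dtypes`, because whether `df.corr()` silently drops non-numerical columns depends on the pandas version (recent releases changed the default of `numeric_only`).

```python
import pandas as pd


def numeric_corr(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: correlate only the numerical columns of df."""
    # Keep int/float columns; list, string, and other object columns are dropped.
    numeric = df.select_dtypes(include="number")
    if numeric.shape[1] == 0:
        # Nothing to correlate: return an empty result instead of raising.
        return pd.DataFrame()
    return numeric.corr()


# Invented sample data: two numerical columns plus a column of lists.
df = pd.DataFrame({
    "year": [2001, 2005, 2010, 2020],
    "citations": [10, 25, 31, 60],
    "author": [["a"], ["b", "c"], ["d"], ["e"]],
})
corr = numeric_corr(df)

# A frame with only a list column yields an empty result.
only_lists = pd.DataFrame({"author": [["a"], ["b"]]})
empty = numeric_corr(only_lists)
```

Here `corr` covers only `year` and `citations`, and `empty` has no rows or columns, which a plotting function could render as an empty plot.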
@jnwang @dovahcrow @jinglinpeng I have fixed this bug. plot_correlation() only supports numerical data. It now ignores non-numerical columns and outputs nothing when the input contains no numerical data. If you have any questions, please let me know as soon as possible. Thanks! #57
@jnwang why is an empty output good? I feel that a hard error is more explicit than an implicit empty return, and explicit is better than implicit.
We should output an empty plot if all cols are non-numerical. I think we did it for plot_correlation(k).
@jnwang @dovahcrow I agree with young. However, an empty plot is also a good option.
Let's go with empty for now; we can make it an error later if users complain.
Okay, I have fixed this bug. Thanks! #57
Running plot(df) and plot_correlation(df) on the following dataframe fails for both functions, because the author column contains lists.
For plot(), the reported error is
TypeError: unhashable type: 'list'
For plot_correlation(df), the reported error is
AssertionError: No numerical columns found
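The TypeError reported for plot() is the kind of error pandas raises whenever list values have to be hashed, for example when counting distinct values. The sketch below reproduces that class of error on an invented dataframe and shows one way to detect such columns up front. `unhashable_columns` is a hypothetical helper, and the exact call inside plot() that triggers the error is not confirmed here.

```python
import pandas as pd


def unhashable_columns(df: pd.DataFrame) -> list:
    """Hypothetical helper: names of columns with at least one list, set, or dict value."""
    bad = []
    for name in df.columns:
        # Check each cell; these container types cannot be hashed.
        if df[name].map(lambda v: isinstance(v, (list, set, dict))).any():
            bad.append(name)
    return bad


# Invented sample data with a list-valued "author" column.
df = pd.DataFrame({"title": ["x", "y"], "author": [["a", "b"], ["c"]]})

# Counting distinct values requires hashing each cell, which fails for lists.
try:
    df["author"].nunique()
    error = None
except TypeError as exc:
    error = str(exc)

flagged = unhashable_columns(df)
```

A caller could skip or stringify the columns in `flagged` before computing per-column statistics.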