Suggestion: mention most frequent value #47
Comments
Isn't that what it does already? The most frequent occurrences are listed along with their frequencies. Am I misunderstanding your suggestion?
Let me try to explain this with an example (code below), using a sample data set of salaries. I have found that a few salaries often appear more frequently than others. In my example these are 1594.20 (the minimum monthly wage in the Netherlands) and 2000. This creates the following data frame summary. For the second column, it would be handy if the summary mentioned that the values 1594.20 and 2000 appear most frequently. The first you can see in the graph, but the value 2000 would be overlooked.
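To make the request concrete, here is a minimal sketch in Python rather than the package's own language. The salary values are made up for illustration and `most_frequent` is a hypothetical helper, not part of any summary library. It only shows the kind of information the suggestion asks for: the top few values with their counts.

```python
from collections import Counter

def most_frequent(values, n=2):
    """Return the n most common values with their counts (hypothetical helper)."""
    return Counter(values).most_common(n)

# Made-up sample: 1594.20 occurs four times, 2000 three times, the rest once.
salaries = [1594.20, 2000, 1850.50, 1594.20, 2000,
            2310, 1594.20, 2000, 2725.75, 1594.20]

print(most_frequent(salaries))
```

A histogram makes the spike at the lowest value easy to spot, while a spike in the middle of the range, such as 2000 here, only shows up in a count like this.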
Ah ok, I see now, thanks for clarifying... Deserves some thinking for sure!
After giving it some thought, I think this is a bit of overkill. The mode will be shown for binary variables (issue #48), but otherwise I think there's already a lot of info for numerical variables. I'll reopen if the feature is requested again in the future. Thx.
In the data frame summary, if a column contains 115 distinct values (such as countries) and 99% of the values are one specific country, it is very useful to mention what the most frequent country is. In general, I believe it is useful to display the most frequent values.
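As a rough illustration of this categorical case, the sketch below (Python, with made-up country data and a hypothetical `top_value_share` helper) reports the single most frequent value together with the share of rows it accounts for. It assumes a non-empty column.

```python
from collections import Counter

def top_value_share(values):
    """Return (most frequent value, its share of all values).

    Hypothetical helper; assumes `values` is non-empty.
    """
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)

# Made-up column: one country dominates the data.
countries = ["Netherlands"] * 99 + ["Belgium"]

value, share = top_value_share(countries)
print(f"{value}: {share:.0%}")
```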