missing values can be hidden in the presence of large (enough) N #18
Sometimes if there is only one cell missing in a large dataset of a few thousand, you cannot see the missing cell.
So I think that a little message for
There are X number of missing values in dataset
this could just be
And perhaps if there are ZERO missing values, it could state that "No missing values found".
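To make the message idea concrete, here is a minimal sketch in plain Python. It is not the package's code: the function names `is_missing` and `missing_message` are hypothetical, and it assumes the data arrives as a list of rows where a missing cell is `None` or a float NaN.

```python
import math


def is_missing(value):
    """Assumption: a cell is missing if it is None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_message(rows):
    """Return a one-line summary of how many cells are missing."""
    n_missing = sum(is_missing(value) for row in rows for value in row)
    if n_missing == 0:
        return "No missing values found"
    # Keep the wording grammatical when exactly one cell is missing.
    verb = "is" if n_missing == 1 else "are"
    noun = "value" if n_missing == 1 else "values"
    return f"There {verb} {n_missing} missing {noun} in dataset"


rows = [[1, 2.0, "a"], [None, float("nan"), "b"]]
print(missing_message(rows))
```

Because the count is reported as text, a single missing cell stays visible no matter how many rows the plot has to squeeze in.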
I've also had the complementary issue, where almost all the values in a column are missing, but a few present values are too small to be seen on the plot.
I've tried using the alpha levels to indicate when
I think I see where you are going with the histogram idea - but could you end up with the same problem, where a lot of missing values in one column end up obscuring the one missing value in another column because they would expand the scale of the histogram too much?
One other possibility is using
Yeah you are absolutely right, we could run into the same problem.
I was thinking that some sort of a bar could be placed above the columns to indicate whether there are any missings in that column,
Another option would be to include both the
My only concern is that adding these features will make the graph more "noisy" and harder to explain.
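One way to keep the per-column indicator from adding much noise is to make it a yes/no flag rather than a count, so a column with thousands of missings cannot drown out a column with one. A rough sketch in plain Python, assuming the data is a dict of column name to list of values and that missing means `None` or NaN (the names `columns_with_missing` and `indicator_row` are made up for illustration):

```python
import math


def is_missing(value):
    """Assumption: a cell is missing if it is None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def columns_with_missing(columns):
    """Map each column name to True if it holds at least one missing value."""
    return {
        name: any(is_missing(value) for value in values)
        for name, values in columns.items()
    }


def indicator_row(columns, mark="!", blank="."):
    """Build a one-line text strip: one symbol per column, in column order."""
    flags = columns_with_missing(columns)
    return " ".join(mark if flags[name] else blank for name in columns)


data = {
    "age": [31, None, 45],               # a single missing value
    "height": [1.7, 1.8, 1.6],           # complete
    "weight": [None] * 2999 + [70.0],    # almost entirely missing
}
print(indicator_row(data))
```

The flag for `age` is the same as the flag for `weight`, which is the point: it only answers "are there any missings here?" and leaves the plot itself to show how many.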