Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return plots as ggplot objects #78

Closed
AndreaPi opened this issue Aug 23, 2018 · 5 comments
Closed

Return plots as ggplot objects #78

AndreaPi opened this issue Aug 23, 2018 · 5 comments

Comments

@AndreaPi
Copy link

@AndreaPi AndreaPi commented Aug 23, 2018

Are the DataExplorer plots ggplot objects? I would like to use patchwork to patch together two plot_missing plots for two different datasets (actually, the same data set, but split by a binary variable). If the plot generated by plot_missing is a ggplot object, then returning it would allow me to patch two (or more) of them together using patchwork.

PS excellent package, thanks for putting it on CRAN!

@boxuancui
Copy link
Owner

@boxuancui boxuancui commented Aug 23, 2018

Yes it is a ggplot object. However, the function is already returning something.

Two options IMO:

  1. Wrap the ggplot object in a list and return along with other things.
  2. The missing value profile is already returned and it goes directly into the plot function from line 57 - 66. You can always create your own ggplot object by reusing/modifying the code.

I am more inclined to option 2, since you already get what you need from the returned object, and it is cleaner. Option 1 seems a little redundant, and might conflict with future development of this package. Again, I do not have clear vision about what that is, but would like to keep it clean and simple for now.

Open for suggestions.

And, thanks for using DataExplorer!

@AndreaPi
Copy link
Author

@AndreaPi AndreaPi commented Aug 23, 2018

Thanks for the explanation! So, if I understand correctly, plot_missing already creates a ggplot object (called output in your code). However, it doesn't return it, because it already returns (invisibly) missing_value, the dataframe containing the missing value profile.
Option 2 would work, but I don't like it because it's basically duplicating work: you suggest that I assign missing_value to a variable, e.g.

missing_value_profile <- plot_missing(my_dataframe)

and then create a plot copying the ggplot code in plot_missing. But you already create and plot output in your code, so I'd generate the plot twice.

Instead, I suggest adding an optional parameter defaulting to FALSE:

plot_missing <- function(data, title = NULL, ggtheme = theme_gray(), theme_config = list("legend.position" = c("bottom")), return_plot_object = FALSE)

and a simple if at the end of plot_missing:

  ## Set return object
 if (return_plot_object)
  return(invisible(list(output, missing_value)))
 else 
  return(invisible(missing_value))
}
@boxuancui boxuancui self-assigned this Aug 23, 2018
@boxuancui boxuancui added this to the v0.7.0 milestone Aug 23, 2018
@boxuancui
Copy link
Owner

@boxuancui boxuancui commented Aug 23, 2018

It will be released in v0.7. Please use either workarounds in the mean time.

@AndreaPi
Copy link
Author

@AndreaPi AndreaPi commented Aug 23, 2018

Absolutely! Thanks a lot :-D BTW, another possible solution:

  • expose a function which creates the missing value profile, e.g., missing_values_profile()
  • have plot_missing() call it
  • at this point, plot_missing() can simply always return the ggplot object (output). If the user wants to get the dataframe containing the missing value profile, instead, s/he can directly call missing_values_profile().

This way, plot_missing() would always return the same object, no matter what its inputs are: I don't think this is something indispensable, but maybe it could simplify your life as a mantainer. At the same time, you would still maintain only one function to generate the missing value profile (missing_values_profile()), because plot_missing() would internally call missing_values_profile().

Personally I think that the first solution (adding the optional argument return_plot_object to plot_missing()) should be fine, and it wouldn't break old code, but of course the decision is yours. Thanks again!!

boxuancui added a commit that referenced this issue Oct 12, 2018
@boxuancui boxuancui closed this Oct 12, 2018
@boxuancui
Copy link
Owner

@boxuancui boxuancui commented Oct 12, 2018

Added profile_missing(), and plot_missing() calls that directly.
All plot_* functions return the ggplot objects now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.