Best practice for ggplot2 wrapper functions #6

Open
dannyparsons opened this issue Jan 17, 2022 · 4 comments
Comments

@dannyparsons
Contributor

I've been looking into best practice for writing functions that produce a ggplot2 graph:

https://rpubs.com/hadley/97970 - suggestions on functions and parameters
https://fishandwhistle.net/slides/rstudioconf2020/#1 - how to call things correctly to pass R package checks
https://icydk.com/how-to-write-functions-to-make-plots-with-ggplot2-in-r/ - idea of using glue package

I haven't quite found what I want, which is how to make a function as flexible as possible without it having hundreds of parameters. The first link suggests optional parameters, each a list of arguments to be passed to a different part of the plot, e.g. bar.params = list(), errorbar.params = list().

I haven't yet found a suggestion on how to do this in general to cover all aspects of the plot.
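A minimal sketch of the "one optional list per plot component" idea from the first link, written in Python since the package is called through Python wrappers. The function names, component names and default values here are all hypothetical; the only point is that each component gets its own dictionary of overrides, merged over sensible defaults.

```python
def merge_params(defaults, overrides=None):
    """Return a copy of defaults updated with any caller-supplied overrides."""
    merged = dict(defaults)
    merged.update(overrides or {})
    return merged


def bar_plot_spec(bar_params=None, errorbar_params=None):
    """Hypothetical wrapper: one optional dict per plot component.

    The defaults below are illustrative assumptions, not values taken
    from ggplot2 or from this package.
    """
    return {
        "bar": merge_params({"fill": "grey50", "width": 0.9}, bar_params),
        "errorbar": merge_params({"width": 0.2}, errorbar_params),
    }


# Only the bar fill is overridden; everything else keeps its default.
spec = bar_plot_spec(bar_params={"fill": "steelblue"})
```

This keeps the function signature short, but it still needs one parameter per component, which is the limitation described above.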

@isedwards

isedwards commented Jan 17, 2022

I'm not sure whether this is applicable to R, but the way this is usually approached in object-orientated languages is to contain all of the parameterisation in a configuration object.

An object has data and behaviours. When you create or modify the configuration object, it can perform additional behaviours such as checking that the parameterisation you are storing is valid. If, for example, you stored image_size, then the object would check that the sizes were positive (in addition to being of integer type).

> my_config = Config(image_size=[800, 600])   # some settings can be added when the object is instantiated
> my_config.image_size = [-800, -600]   # or added/changed later
Error - image dimensions must be positive   # the object takes care of validation

Within your R code you can then use additional methods that the class defines for all config objects, e.g. you could store image_size in cm but have a method that returns the size in inches. This avoids repeating the conversion logic in multiple functions, or leaving it sitting in a stand-alone function somewhere.

Given my_config and the need to parameterise ggplot2 you could have a method like .ggplot2_parameters that returns the parameters required by ggplot2.
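The configuration object described above could be sketched in Python roughly as follows. This is an illustration under assumptions, not an agreed design: the class name, the choice to store sizes in whole centimetres, and the shape of the dictionary returned by ggplot2_parameters (width, height and units, which is what a call such as ggsave accepts) are all hypothetical.

```python
CM_PER_INCH = 2.54


class Config:
    """Hypothetical configuration object that validates its own settings."""

    def __init__(self, image_size=(20, 15)):
        # Assignment goes through the validating setter below.
        self.image_size = image_size

    @property
    def image_size(self):
        """Image size as (width, height) in centimetres."""
        return self._image_size

    @image_size.setter
    def image_size(self, value):
        width, height = value
        for dim in (width, height):
            if isinstance(dim, bool) or not isinstance(dim, int):
                raise TypeError("image dimensions must be integers")
            if dim <= 0:
                raise ValueError("image dimensions must be positive")
        self._image_size = (width, height)

    def image_size_inches(self):
        """Return (width, height) converted from centimetres to inches."""
        width, height = self._image_size
        return (width / CM_PER_INCH, height / CM_PER_INCH)

    def ggplot2_parameters(self):
        """Return the settings in the form assumed to be needed on the R side."""
        width, height = self.image_size_inches()
        return {"width": width, "height": height, "units": "in"}


my_config = Config(image_size=(20, 15))  # settings given at instantiation
try:
    my_config.image_size = (-20, -15)  # or changed later
except ValueError as err:
    print(f"Error - {err}")  # the object takes care of validation
```

Because a failed assignment raises before anything is stored, the previous valid size is kept.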

@dannyparsons
Contributor Author

Thanks Ian, this has helped with our thinking on this.

This kind of approach makes a lot of sense in line with how ggplot2 works where individual components can be defined, stored and added on. I had initially thought we needed a different approach since we are going through Python wrappers, but I think we can maintain this separation still.

This idea is particularly relevant for themes, which control the look of the non-data components of the graph, e.g.

# base graph
p1 <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  labs(title = "Fuel economy declines as weight increases")
p1
# add theme options to base graph
p1 + theme(plot.title = element_text(size = 10))

So it makes sense not to implement themes within our own functions, both for the reasons you mentioned and because they may be used across functions.

The functions in this package can therefore stay relatively simple e.g.:

g <- inventory_plot(data = dodoma, date = "date", elements = "rain", year = "year", doy = "doy_366")

Theme options could be specified in the Python wrapper, with a Python implementation of the theme options (which is essentially just a list of lists). Then the Python wrapper can make an R call such as:

g <- inventory_plot(data = dodoma, date = "date", elements = "rain", year = "year", doy = "doy_366")
g + theme(legend.text = element_text(size = 8, colour = "red"))
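As a rough sketch of how the Python side might hold theme options as nested dictionaries and turn them into the R call above: the helper names and the dictionary layout are assumptions made for illustration, and only simple scalar arguments (numbers, strings, booleans) are handled.

```python
def r_value(value):
    """Format a Python scalar as an R literal."""
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    return str(value)


def element_call(spec):
    """Render {"element": ..., "args": {...}} as an R element call."""
    args = ", ".join(
        f"{name} = {r_value(value)}" for name, value in spec["args"].items()
    )
    return f"{spec['element']}({args})"


def theme_call(theme):
    """Render a whole theme dictionary as a single R theme(...) call."""
    parts = ", ".join(
        f"{name} = {element_call(spec)}" for name, spec in theme.items()
    )
    return f"theme({parts})"


# Hypothetical Python representation of the theme options.
theme = {
    "legend.text": {
        "element": "element_text",
        "args": {"size": 8, "colour": "red"},
    }
}

r_code = "g + " + theme_call(theme)
print(r_code)
# g + theme(legend.text = element_text(size = 8, colour = "red"))
```

Building R code as a string is only one option; the same dictionary could instead be passed to R as a list and applied there.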

@lloyddewit
Collaborator

@dannyparsons
For the example above, I assume that:

  • the web component shall pass a list of themes to the Python wrapper function
  • the theme may be encapsulated in an object
    • the theme object may contain validity checks
  • the wrapper function shall apply each theme to the ggplot object in order
  • the wrapper function shall generate R calls similar to:
g <- inventory_plot(data = dodoma, date = "date", elements = "rain", year = "year", doy = "doy_366")
g + theme(legend.text = element_text(size = 8, colour = "red"))
g + theme(somethingElse.text = element_text(size = 10, colour = "blue"))
 :

Did I understand correctly?
Thanks

@dannyparsons
Contributor Author

Thanks Stephen. Yes, this is broadly in line with my thinking. The slight difference is that I expect there to be a single theme object in Python which will be applied as follows:

g <- inventory_plot(data = dodoma, date = "date", elements = "rain", year = "year", doy = "doy_366")
g + theme(legend.text = element_text(size = 8, colour = "red"),
          somethingElse.text = element_text(size = 10, colour = "blue"))

All theme options can be set in a single theme() call in R, so the general outline you give is the same, just with a different R notation.
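If the web component did send several sets of theme options, the Python wrapper could fold them into the single theme object before making one R call. The sketch below assumes a hypothetical layout in which each theme maps an element name to a dictionary of arguments; later themes override earlier ones argument by argument, which is intended to resemble adding themes one after another in ggplot2.

```python
def merge_themes(themes):
    """Combine theme dictionaries in order into one; later settings win."""
    merged = {}
    for theme in themes:
        for element, args in theme.items():
            # Update per argument so unrelated settings on the same element survive.
            merged.setdefault(element, {}).update(args)
    return merged


# Hypothetical theme options, in the order they should be applied.
themes = [
    {"legend.text": {"size": 8, "colour": "red"}},
    {"axis.text": {"size": 10, "colour": "blue"}},
    {"legend.text": {"size": 9}},
]

single_theme = merge_themes(themes)
```

Here single_theme holds one entry per element, ready to be passed to a single theme() call.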
