Wishlist: Support factors as x or y #2

Closed
2 of 4 tasks
zeileis opened this issue Apr 7, 2023 · 6 comments · Fixed by #233
Labels
wishlist Features we'd like to support

Comments

@zeileis
Collaborator

zeileis commented Apr 7, 2023

Thanks:

Grant @grantmcdermott, the project looks really nice and useful, thanks!

Wishlist:

I just wanted to put on the wishlist that in addition to numeric x and y, it would be great if the factor-based plot() flavors were also supported in plot2(). Especially the plots where y is a factor are in my opinion underappreciated in base R. (It's not surprising that I would say that because I implemented these...)

Plots:

In plot(x, y, ...) and plot(y ~ x, ...) we have the following:

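|            | x: Numeric                                               | x: Factor                          |
|------------|----------------------------------------------------------|------------------------------------|
| y: Numeric | Scatterplot                                              | Parallel boxplots                  |
| y: Factor  | Spinogram (Spineplot with histogram-style breaks in x)   | Spineplot (Flavor of mosaic plot)  |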

Examples:

data("SwissLabor", package = "AER")
plot(income ~ age, data = SwissLabor)            ## Numeric ~ Numeric
plot(income ~ foreign, data = SwissLabor)        ## Numeric ~ Factor
plot(participation ~ age, data = SwissLabor)     ## Factor ~ Numeric
plot(participation ~ foreign, data = SwissLabor) ## Factor ~ Factor

It would be great if the plots above would work with plot2() as well - and if the coloring and grouping etc. would also be supported.
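For readers without the AER package, the same four combinations can be sketched with built-in data by calling the base graphics functions directly. This is only an illustration: recoding `am` and `cyl` from `mtcars` as factors is an assumption made here to get factor variables, and `cdplot()` is shown as the alternative display for a factor y against a numeric x.

```r
## Illustrative data (assumption): recode two mtcars columns as factors
d <- transform(mtcars,
  am  = factor(am, labels = c("automatic", "manual")),
  cyl = factor(cyl)
)

pdf(NULL)                        ## null device so the sketch also runs non-interactively
plot(mpg ~ wt, data = d)         ## Numeric ~ Numeric: scatterplot
boxplot(mpg ~ cyl, data = d)     ## Numeric ~ Factor: parallel boxplots
spineplot(am ~ wt, data = d)     ## Factor ~ Numeric: spinogram
spineplot(am ~ cyl, data = d)    ## Factor ~ Factor: spineplot
cdplot(am ~ wt, data = d)        ## Factor ~ Numeric: conditional density plot
dev.off()
```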

GM edit: adding checklist to track

@grantmcdermott
Owner

Thanks Achim.

I can't promise I'll get to this soon, but I just invited you to the project. No pressure (I'm well aware of how many other plates you have in the air...), but think of it as an invitation to take a stab at these if you get a chance ;-)

@zeileis
Collaborator Author

zeileis commented Apr 8, 2023

Thanks! I'll try...

@grantmcdermott grantmcdermott added the wishlist Features we'd like to support label Apr 15, 2023
@zeileis
Collaborator Author

zeileis commented Apr 17, 2023

Idea:

I've taken another look at the current state of the code and my idea to tackle support of factor variables and also of facets is the following:

  • We set up internal functions that draw some type/class of x against some type/class of y, potentially also supporting a by grouping variable.
  • Multiple functions might be available for the same combination of x and y, e.g., cdplot vs. spineplot in case of a factor y and a numeric x.
  • The y variable might also be missing so that we can get density or histogram functions.
  • These functions can be called just once, corresponding to what is currently done in plot2.default, or several times in case of mfrow/facets.
  • The functions must have an argument axes = TRUE that can also be set to FALSE so that either axes are drawn or suppressed. But additionally specifications like axes = c(1, 2) or axes = 2 etc. should be possible so that only certain axes are drawn. The latter is needed if we don't want to repeat certain axes in facet displays.
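One way the `axes` argument described in the last bullet could be handled is sketched below. Both `resolve_axes()` and `scatter_sketch()` are hypothetical names introduced here for illustration, not functions from the package; the sketch assumes that `axes = TRUE` means "bottom and left" (sides 1 and 2, as in `axis()`).

```r
## Hypothetical helper: translate 'axes' into the sides that should be drawn
## (1 = bottom, 2 = left, 3 = top, 4 = right, as in axis())
resolve_axes <- function(axes = TRUE, default = 1:2) {
  if (is.logical(axes)) {
    if (isTRUE(axes)) default else integer(0)
  } else {
    sides <- as.integer(axes)
    stopifnot(all(sides %in% 1:4))
    sides
  }
}

## Hypothetical inner function: suppress the default axes, then add only
## the requested ones
scatter_sketch <- function(x, y, axes = TRUE, ...) {
  sides <- resolve_axes(axes)
  plot(x, y, axes = FALSE, ...)
  for (s in sides) axis(s)
  box()
  invisible(sides)
}

pdf(NULL)  ## null device so the sketch also runs non-interactively
drawn <- scatter_sketch(mtcars$wt, mtcars$mpg, axes = 2)  ## left axis only
dev.off()
```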

Examples:

Functions would include:

plot2_numeric_numeric_scatter(x, y, by = NULL, axes = TRUE, ...) ## current core of plot2.default
plot2_factor_numeric_boxplot(x, y, axes = TRUE, ...) ## no support for 'by' within the plot (only via facets)
plot2_factor_factor_spineplot(x, y, axes = TRUE, ...)
plot2_numeric_factor_spineplot(x, y, axes = TRUE, ...)
plot2_numeric_factor_cdplot(x, y, axes = TRUE, ...)

But also functions for a single numeric x variable (plus optional grouping):

plot2_numeric_none_histogram(x, y = NULL, by = NULL, axes = TRUE, ...) ## check that y is actually empty
plot2_numeric_none_density(x, y = NULL, by = NULL, axes = TRUE, ...)
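How a function name could be chosen from the types of x and y is sketched below. The lookup table and the helpers `type_of()` and `inner_name()` are hypothetical; the sketch only builds the name following the `plot2_<x>_<y>_<type>` pattern listed above and assumes that the first candidate is the default when several plot types exist for the same combination.

```r
## Hypothetical helper: classify a variable as "numeric", "factor", or "none"
type_of <- function(v) {
  if (is.null(v)) "none"
  else if (is.factor(v)) "factor"
  else if (is.numeric(v)) "numeric"
  else stop("unsupported type: ", class(v)[1L])
}

## Candidate plot types per x_y combination (first entry = assumed default)
candidates <- list(
  numeric_numeric = "scatter",
  factor_numeric  = "boxplot",
  factor_factor   = "spineplot",
  numeric_factor  = c("spineplot", "cdplot"),
  numeric_none    = c("histogram", "density")
)

## Hypothetical helper: build the name of the inner function to call
inner_name <- function(x, y = NULL, type = NULL) {
  key <- paste(type_of(x), type_of(y), sep = "_")
  types <- candidates[[key]]
  if (is.null(types)) stop("no plot available for ", key)
  if (is.null(type)) type <- types[1L]
  type <- match.arg(type, types)
  paste("plot2", key, type, sep = "_")
}

inner_name(mtcars$wt, factor(mtcars$am), type = "cdplot")
```

A real implementation would presumably dispatch with `do.call()` or `switch()` on this name; that step is left out here because the inner functions do not exist yet.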

Outer vs. inner function:

The plot2.default() function would then provide the "outer" skeleton which decides whether facets are needed/desired or not and then calls the plot2_x_y_type() functions as appropriate. It needs to know where the axes go and how the margins need to be set (which probably depends on the type of plot).
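A rough sketch of this outer/inner split for the facet case follows. `inner_scatter()` and `outer_sketch()` are hypothetical stand-ins, not the package's code; the margin values and the rule "left axis only in the first column, bottom axis only in the last panel of each column" are assumptions chosen to show why the inner function needs a flexible `axes` argument.

```r
## Hypothetical inner function: scatterplot that draws only the requested sides
inner_scatter <- function(x, y, axes = c(1, 2), xlim, ylim, ...) {
  plot(x, y, axes = FALSE, xlim = xlim, ylim = ylim, xlab = "", ylab = "", ...)
  for (s in axes) axis(s)
  box()
}

## Hypothetical outer function: one panel per facet level with shared limits
outer_sketch <- function(x, y, facet = NULL, ...) {
  if (is.null(facet)) facet <- factor(rep("all", length(x)))
  lev <- levels(facet)
  nc <- ceiling(sqrt(length(lev)))
  nr <- ceiling(length(lev) / nc)
  op <- par(mfrow = c(nr, nc), mar = c(1, 1, 2, 1), oma = c(4, 4, 1, 1))
  on.exit(par(op))
  drawn <- vector("list", length(lev))
  names(drawn) <- lev
  for (i in seq_along(lev)) {
    idx <- facet == lev[i]
    col <- (i - 1) %% nc + 1            ## column of panel i (filled row-wise)
    last_in_col <- i + nc > length(lev) ## no panel below this one
    axes <- as.numeric(c(if (last_in_col) 1, if (col == 1) 2))
    inner_scatter(x[idx], y[idx], axes = axes,
                  xlim = range(x), ylim = range(y), main = lev[i], ...)
    drawn[[i]] <- axes                  ## record which sides were drawn
  }
  invisible(drawn)
}

pdf(NULL)  ## null device so the sketch also runs non-interactively
res <- outer_sketch(iris$Sepal.Length, iris$Sepal.Width, facet = iris$Species)
dev.off()
```

In this sketch the outer function owns `par()` and the shared axis limits, while the inner function only draws into the current panel; where the legend belongs is left open.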

Open questions:

  • Who draws the legend? The "outer" plot2.default() function or the "inner" plot2_x_y_type() function?
  • Are there further common arguments to the "inner" functions?
  • Does the "outer" plot2.default() function know about other arguments or does it just pass these on to the "inner" functions?

@grantmcdermott
Owner

This sounds great @zeileis.

(As an aside, I've thought before that we may need to offer more flexible axes control... Although, at least in the non-faceted cases that I had in mind, this could be done through some global par(las=<value>, col.axis=<value>, ...) options.)

Let me finish up a PR or two that address #19. Hopefully I'll have time after work today. Once those are merged, I'll hold off on making any other changes to the codebase until we have resolved the internal changes you have proposed above.

@zeileis
Collaborator Author

zeileis commented Apr 17, 2023

OK, thanks, sounds good!

@grantmcdermott

This comment was marked as duplicate.
