draw() error when trying to plot 2d smooths with bs='sz' or 'fs' #249
Comments
Thanks for the report and the easy-to-run examples to reproduce the issue. I think the fundamental issue is that I didn't consider 2d TPR or Duchon splines when I wrote the interval plotters for the sz and fs basis. Would you mind adding your session info output? I'll confirm what's going on later this week – both models should be expected to work with gratia – and see what needs to be changed to make them work. |
Thanks for the lightning-fast response Gavin. Sorry, I actually had the sessioninfo in my code but forgot to paste it!
|
Thanks for the report. As I suspected, the different errors were ultimately due to the underlying code not knowing how to handle multivariate base smoothers.
|
That's funny - I normally put the factor last in the 'fs' smooths but I
vaguely remember there was a requirement somewhere along the way that I
needed to put it first to plot 'sz' smooths? I guess that backfired in the
'fs' case...
For the 'sz' case, I was thinking along the lines that you suggest above,
plotting the difference surface for each level of species. While I
conceptually like the idea of being able to see the differences
overlaid/plotted together (as it works so nicely in the 1-D 'sz' case),
I'm afraid that this could get messy in the 2D case (particularly when
there are many levels). So, from the perspective of
legibility/interpretability, it might make sense to keep these plots
independent (per-level), basically mimicking what you do in the 'by'
smooth. In my particular situation, I don't have a ton of levels, so it's
not an issue, but understanding that your code needs to be
generalizable, maybe adopt an approach like "if nlevels > X, then draw a
random sample of X levels", where X defaults to some sane value (8?). And
if people want to plot them all or look at specific levels, that can be
user-selectable, either by setting X higher, or requesting specific levels
via a vector?
I'd probably adopt a similar approach for 'fs', just mimicking the 'by'
smooth.
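The level-selection rule suggested above could be sketched roughly as follows. This is only an illustration: select_levels(), n_max and keep are hypothetical names and are not part of gratia.

```r
# Hypothetical helper (not a gratia function): decide which factor levels
# to draw for a factor-smooth interaction.
select_levels <- function(f, n_max = 8, keep = NULL) {
  all_levels <- levels(factor(f))
  if (!is.null(keep)) {
    # The user asked for specific levels: validate and return them as given
    unknown <- setdiff(keep, all_levels)
    if (length(unknown) > 0) {
      stop("Unknown levels: ", paste(unknown, collapse = ", "))
    }
    return(keep)
  }
  if (length(all_levels) <= n_max) {
    return(all_levels) # few enough levels: plot them all
  }
  # Too many levels: random sample of n_max, kept in original level order
  all_levels[sort(sample.int(length(all_levels), n_max))]
}

f <- factor(rep(letters, 4))              # 26 levels
select_levels(iris$Species)               # all 3 levels, nothing to sample
select_levels(f)                          # 8 randomly chosen levels
select_levels(f, keep = c("a", "z"))      # exactly the requested levels
```

Setting n_max higher, or passing keep, gives the "plot them all or look at specific levels" behaviour.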
That's my 2 cents anyway - hope it's useful.
Thanks so much once again for all your help!
…On Fri, Feb 2, 2024 at 6:37 AM Gavin Simpson ***@***.***> wrote:
Thanks for the report. As I suspected, the different errors were
ultimately due to the underlying code not knowing how to handle multivariate
base smoothers.
"fs" smooths
The "fs" smooth issue raised a different error because I was not careful
enough to handle users specifying the factor anywhere in the smooth
definition. I was assuming users would use s(x1, x2, f) so factor last. I
have fixed this particular issue such that if you use s(f, x1, x2) or s(x1,
f, x2) smooth_estimates() will now work correctly.
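To illustrate the idea of not relying on where the factor sits in the smooth definition, here is a minimal sketch that finds the factor by its class rather than by position. split_smooth_terms() is a hypothetical helper written for this illustration; it is not the code used in gratia.

```r
# Hypothetical helper (not gratia's implementation): split the variables of
# a smooth into the factor term(s) and the continuous covariates by class.
split_smooth_terms <- function(terms, data) {
  is_fac <- vapply(data[terms], is.factor, logical(1))
  list(factor = terms[is_fac], covariates = terms[!is_fac])
}

# The factor is found whether it is given first, in the middle, or last
split_smooth_terms(c("Species", "Sepal.Length", "Sepal.Width"), iris)
split_smooth_terms(c("Sepal.Length", "Species", "Sepal.Width"), iris)
split_smooth_terms(c("Sepal.Length", "Sepal.Width", "Species"), iris)
```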
"sz" smooths
The error with that model was purely due to the code not anticipating a
multivariate base smoother.
I now catch this and do something rather than let the obscure error pass
through.
… however
None of this actually helps you as in both cases {gratia} can't currently
do anything useful from the plot side of things.
Right now I am just emitting messages for both your model examples:
r$> draw(i_fs)
Can't currently plot multivariate 'fs' smooths.
Skipping: s(Sepal.Length,Sepal.Width,Species)
r$> draw(i_sz)
Can't currently plot multivariate 'sz' smooths.
Skipping: s(Species,Sepal.Length,Sepal.Width)
but it will plot any other smooths it can plot.
It's not clear to me what to plot in the fs case. I could plot a few of
the smooth surfaces, randomly selecting which to plot? For the sz case, I
could plot the smooth "difference" surfaces, but how to actually do this in
the draw() output? Should I treat it like a by smooth and draw a separate
plot for each surface, or should I group them so that I only generate a
single plot but use facetting to show the difference-from-reference
surfaces for each level of the factor(s)?
A similar issue crops up with HGAMs with random tensor product smooths: t2(x1,
x2, f, bs = c("cr", "cr", "re"), full = TRUE). Which of these should I
plot and how?
So, back to you @chrishaak <https://github.com/chrishaak>. I assume you
weren't modelling the iris data in reality, so what would you like to see
plotted in both your model specifications?
I'll push fixed code shortly; you'll need to install from github or
r-universe (if it builds; there's an issue with the dependency on the
Matrix pkg which I need to figure out there) when I have pushed. Check that
you get version >= 0.8.2.58.
|
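I also thought this was the case, but it turns out you can put it anywhere. I'll see what I can do. Right now, by smooths work easily because from mgcv's point of view these are really just entirely separate smooths: they are a separate entry in the $smooth list that is returned by gam(), bam() etc. "fs" and "sz" smooths are different; there is just one element in $smooth for the entire "fs" or "sz" smooth. This is fine with univariate base smooths as I can produce one plot showing all levels (even if it gets messy, with the sz basis in particular), but that won't work for surfaces as I can't plot them on top of one another. The solution I have used with trivariate and quadvariate smooths is to use facetting; the one "plot" would then have multiple facets, one per level of the factor. Otherwise I'll need to look at how to return multiple ggplot objects for a single smooth if I am to follow the by smooth convention... |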
Gotcha... seems like facetting is the way to go then, if it is relatively
straightforward to implement.
While we're on the topic (feel free to let me know if you'd rather I start
a new issue here)...
I'm fitting to scaled covariate data (using 'scale'), but I'd like to see
the "real" (back-transformed) predictor values in my visualizations. Short
of going into each of the smooth estimates and manually replacing the x
values, do you see any intelligent way to do this "across the board" using
draw(), given the list of scaling info output by 'scale'? (I dug around a
little and didn't see anything, but may have overlooked something)?
…On Fri, Feb 2, 2024 at 10:50 AM Gavin Simpson ***@***.***> wrote:
...I vaguely remember there was a requirement somewhere along the way that
I needed to put it first to plot 'sz' smooths?
I also thought this was the case, but it turns out you can put it anywhere.
I'll see what I can do. Right now, by smooths work easily because from
mgcv's point of view these are really just entirely separate smooths - they
are a separate entry in the $smooth list that is returned by gam(), bam()
etc. "fs" and "sz" smooths are different; there is just one element in
$smooth for the entire "fs" or "sz" smooth. This is fine with univariate
base smooths as I can produce one plot showing all levels (even if it gets
messy, with the sz basis in particular), but that won't work for surfaces
as I can't plot them on top of one another.
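The difference in how mgcv stores the two kinds of term can be checked on a small simulated example. The data and model formulas here are made up for illustration and use a univariate base smooth to keep things quick.

```r
library("mgcv")

# Simulated data: one covariate and a three-level factor (illustrative only)
set.seed(1)
n <- 300
df <- data.frame(
  x = runif(n),
  f = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
df$y <- sin(2 * pi * df$x) + as.integer(df$f) * df$x + rnorm(n, sd = 0.3)

# Factor by smooth: mgcv creates one smooth object per level of f
m_by <- gam(y ~ f + s(x, by = f), data = df, method = "REML")

# "fs" smooth: a single smooth object covers all levels of f
m_fs <- gam(y ~ s(x, f, bs = "fs"), data = df, method = "REML")

length(m_by$smooth) # one entry per level
length(m_fs$smooth) # a single entry
```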
The solution I have used with trivariate and quadvariate smooths is to use
facetting; the one "plot" would then have multiple facets, one per level of
the factor.
Otherwise I'll need to look at how to return multiple ggplot objects for a
single smooth if I am to follow the by smooth convention...
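As a rough picture of the facetting option, the sketch below draws one panel per factor level from a data frame of surface values. The grid data frame and its estimate column are synthetic stand-ins for evaluated smooths, not real output from smooth_estimates(), and the plotting code is plain ggplot2 rather than anything gratia does internally.

```r
library("ggplot2")

# Synthetic stand-in for surfaces evaluated on a grid, one per factor level
grid <- expand.grid(
  x1 = seq(0, 1, length.out = 25),
  x2 = seq(0, 1, length.out = 25),
  f  = factor(c("a", "b", "c"))
)
grid$estimate <- with(grid, sin(2 * pi * x1) * cos(2 * pi * x2) * as.integer(f))

# A single ggplot object, with one facet per level of the factor
p <- ggplot(grid, aes(x = x1, y = x2, fill = estimate)) +
  geom_raster() +
  facet_wrap(~ f) +
  labs(fill = "Estimate")
p
```

Because the result is still one ggplot object per smooth, this avoids having to return several plots for a single term.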
|
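I assume you mean that you have y ~ s(x) where x is the result of x <- scale(x_orig)[,1]? There isn't a good solution to that currently. Right now I would suggest that you just evaluate the smooths with smooth_estimates(), and then build your own plots. Until then, you could modify the output from smooth_estimates() and then use its draw() method.
You don't get things like partial residuals or a rug plot, but it is pretty close to what draw() will produce (as draw.gam() calls the draw.smooth_estimates() for each smooth internally). |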
Yep that's correct. Thanks for the suggestion - that is more or less what
I've been doing (albeit less efficiently than your approach!).
Wonder if it would be feasible to allow one to supply x-scaling attributes
somehow in a call to draw(), so it could apply them "on the fly"?
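One way this could look, purely as a sketch: collect the "scaled:center" and "scaled:scale" attributes that scale() attaches to its result, and apply them to the matching columns of a table of estimates before plotting. unscale_columns() and the scaling list layout are hypothetical; draw() has no such argument.

```r
# Hypothetical helper: back-transform any columns named in `scaling`
unscale_columns <- function(est, scaling) {
  for (nm in intersect(names(scaling), names(est))) {
    est[[nm]] <- est[[nm]] * scaling[[nm]]$scale + scaling[[nm]]$center
  }
  est
}

x_orig <- c(2, 4, 6, 8, 10)
x_scl <- scale(x_orig)

# Scaling info as stored by scale() in the attributes of its result
scaling <- list(
  x_scl = list(
    center = attr(x_scl, "scaled:center"),
    scale  = attr(x_scl, "scaled:scale")
  )
)

# Stand-in for a table of smooth estimates evaluated on the scaled covariate
est <- data.frame(x_scl = as.numeric(x_scl), estimate = 0)
back <- unscale_columns(est, scaling)
all.equal(back$x_scl, x_orig)
```

Columns not named in scaling are left untouched, so the same list could be applied across all smooths in one pass.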
…On Fri, Feb 2, 2024 at 11:22 AM Gavin Simpson ***@***.***> wrote:
I'm fitting to scaled covariate data (using 'scale'), but I'd like to see
the "real" (back-transformed) predictor values in my visualizations
I assume you mean that you have y ~ s(x) where x is the result of x <-
scale(x_orig)[,1]? There isn't a good solution to that currently. Right
now I would suggest that you just evaluate the smooths with
smooth_estimates(), and then build your own plots. I'm looking at ways to
make building your own plots that look like draw()'s plots even easier.
Until then, you could modify the output from smooth_estimates() and then
use its draw() method:
library("dplyr")
library("mgcv")
library("gratia")
df <- data_sim("eg1") |>
  mutate(x1_scl = (x1 - mean(x1)) / sd(x1))
m <- gam(y ~ s(x0) + s(x1_scl) + s(x2) + s(x3), data = df, method = "REML")
sm <- smooth_estimates(m)
# unscale x1_scl
x1_mean <- with(df, mean(x1))
x1_sd <- with(df, sd(x1))
sm <- sm |>
  mutate(x1_scl = (x1_scl * x1_sd) + x1_mean)
draw(sm)
You don't get things like partial residuals or a rug plot, but it is
pretty close to what draw() will produce (as draw.gam() calls the
draw.smooth_estimates() for each smooth internally).
|
I'm fitting an HGAM with a global 2d smooth and a factor smooth interaction of the 2d smooth (via bs='sz' or 'fs'). For example:
When attempting to plot these with draw(), I receive (altogether different) errors for both. For the 'sz' smooth, I am getting:
while for the 'fs' smooth I am getting:
If I omit the second (factor-smooth interaction) term via select=c(-2) in the call to draw(), the first term plots fine for both of the fits.
Finally, using the 'by' argument (instead of 'fs' or 'sz'), both terms plot fine. For example:
...draws both terms as expected. So it seems to be an issue with the 'fs' and 'sz' in particular...
Apologies if I am missing something here?
Thanks!
Chris