The aim of the LMMsolver
package is to provide an efficient and
flexible system to estimate variance components using restricted maximum
likelihood or REML (Patterson and Thompson 1971), for models where the
mixed model equations are sparse (Boer 2023). An example of an
application is using splines to model spatial (Rodríguez-Álvarez et al.
2018; Boer, Piepho, and Williams 2020) or temporal (Bustos-Korts et al.
2019) trends. Another example is mixed model Quantitative Trait Locus
(QTL) analysis for multiparental populations, allowing for heterogeneous
residual variance and design matrices with Identity-By-Descent (IBD)
probabilities (Li et al. 2021).
- Install from CRAN:
install.packages("LMMsolver")
- Install latest development version from GitHub (requires remotes package):
remotes::install_github("Biometris/LMMsolver", ref = "develop", dependencies = TRUE)
As an example of the functionality of the package we use a model defined
in Rodríguez-Álvarez et al. (2015). It uses the USprecip
data set in
the spam
package (Furrer and Sain 2010).
library(LMMsolver)
library(ggplot2)
## Get precipitation data from spam
data(USprecip, package = "spam")
## Only use observed data.
USprecip <- as.data.frame(USprecip)
USprecip <- USprecip[USprecip$infill == 1, ]
A two-dimensional P-spline can be defined with the spl2D()
function,
with longitude and latitude as covariates. The number of segments chosen
here is equal to the number of segments used in Rodríguez-Álvarez et al.
(2015).
obj1 <- LMMsolve(fixed = anomaly ~ 1,
spline = ~spl2D(x1 = lon, x2 = lat, nseg = c(41, 41)),
data = USprecip)
The summary function gives a table with the effective dimensions and the penalty parameters:
summary(obj1)
#> Table with effective dimensions and penalties:
#>
#> Term Effective Model Nominal Ratio Penalty
#> (Intercept) 1.00 1 1 1.00 0.00
#> lin(lon, lat) 3.00 3 3 1.00 0.00
#> s(lon) 302.60 1936 1932 0.16 0.26
#> s(lat) 409.09 1936 1932 0.21 0.08
#> residual 5190.31 5906 5902 0.88 13.53
#>
#> Total Effective Dimension: 5906
The spatial trend for the precipitation can now be plotted on the map of the USA.
plotDat <- obtainSmoothTrend(obj1, grid = c(200, 300), includeIntercept = TRUE)
plotDat <- sf::st_as_sf(plotDat, coords = c("lon", "lat"))
usa <- sf::st_as_sf(maps::map("usa", regions = "main", plot = FALSE))
sf::st_crs(usa) <- sf::st_crs(plotDat)
intersection <- sf::st_intersects(plotDat, usa)
plotDat <- plotDat[!is.na(as.numeric(intersection)), ]
ggplot(usa) +
geom_sf(color = NA) +
geom_tile(data = plotDat,
mapping = aes(geometry = geometry, fill = ypred),
linewidth = 0,
stat = "sf_coordinates") +
scale_fill_gradientn(colors = topo.colors(100))+
labs(title = "Precipitation (anomaly)",
x = "Longitude", y = "Latitude") +
coord_sf() +
theme(panel.grid = element_blank())
Further examples can be found in the vignette.
vignette("Solving_Linear_Mixed_Models")
Boer, Martin P. 2023. “Tensor Product P-Splines Using a Sparse Mixed Model Formulation.” Statistical Modelling 23 (5-6): 465–79. https://doi.org/https://doi.org/10.1177/1471082X231178591.
Boer, Martin P., Hans-Peter Piepho, and Emlyn R. Williams. 2020. “Linear Variance, P-splines and Neighbour Differences for Spatial Adjustment in Field Trials: How are they Related?” J. Agric. Biol. Environ. Stat. 25 (4): 676–98. https://doi.org/10.1007/S13253-020-00412-4.
Bustos-Korts, Daniela, Martin P. Boer, Marcos Malosetti, Scott Chapman, Karine Chenu, Bangyou Zheng, and Fred A. van Eeuwijk. 2019. “Combining Crop Growth Modeling and Statistical Genetic Modeling to Evaluate Phenotyping Strategies.” Front. Plant Sci. 10 (November). https://doi.org/10.3389/fpls.2019.01491.
Furrer, R, and SR Sain. 2010. “spam: A sparse matrix R package with emphasis on MCMC methods for Gaussian Markov random fields.” J. Stat. Softw. https://core.ac.uk/download/pdf/6340272.pdf.
Li, Wenhao, Martin P. Boer, Chaozhi Zheng, Ronny V. L. Joosen, and Fred A. van Eeuwijk. 2021. “An IBD-based mixed model approach for QTL mapping in multiparental populations.” Theor. Appl. Genet. 2021 1 (August): 1–18. https://doi.org/10.1007/S00122-021-03919-7.
Patterson, HD, and R Thompson. 1971. “Recovery of inter-block information when block sizes are unequal.” Biometrika. https://doi.org/10.1093/biomet/58.3.545.
Rodríguez-Álvarez, María Xosé, Martin P. Boer, Fred A. van Eeuwijk, and Paul H. C. Eilers. 2018. “Correcting for spatial heterogeneity in plant breeding experiments with P-splines.” Spat. Stat. 23 (March): 52–71. https://doi.org/10.1016/J.SPASTA.2017.10.003.
Rodríguez-Álvarez, María Xosé, Dae Jin Lee, Thomas Kneib, María Durbán, and Paul Eilers. 2015. “Fast smoothing parameter separation in multidimensional generalized P-splines: the SAP algorithm.” Stat. Comput. 25 (5): 941–57. https://doi.org/10.1007/S11222-014-9464-2.