"Aggregation function missing: defaulting to length" error #16

Closed
mchimenti opened this issue Sep 28, 2018 · 4 comments

Comments

@mchimenti

mchimenti commented Sep 28, 2018

Hello,
I am trying to run TcGSA.LR on normalized gene expression time course data, four replicates ("patients") and two conditions (disease/normal). I have invoked the command as:

res_lin <- TcGSA::TcGSA.LR(expr = mat_mat,
                           gmt = TCGSA_kegg_entrez,
                           design = exp_design,
                           subject_name = "Patient_ID",
                           time_name = "timepoint",
                           time_func = "linear",
                           group_name = "condition")

My design looks like (screenshot, full table not shown):

[Image: screen shot 2018-09-28 at 1 51 55 pm]

I keep getting this "Aggregation function missing: defaulting to length" warning even though the method DOES run and completes.

What does it mean and does it affect the results? How can I fix it?

thanks,
Michael

sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] bindrcpp_0.2.2 clusterProfiler_3.6.0 DOSE_3.4.0 TcGSA_0.11.1
[5] gage_2.28.2 forcats_0.3.0 stringr_1.3.0 dplyr_0.7.6
[9] purrr_0.2.5 readr_1.1.1 tidyr_0.8.1 tibble_1.4.2
[13] ggplot2_3.0.0 tidyverse_1.2.1 org.Hs.eg.db_3.5.0 AnnotationDbi_1.40.0
[17] biomaRt_2.34.2 DESeq2_1.18.1 SummarizedExperiment_1.8.1 DelayedArray_0.4.1
[21] matrixStats_0.53.1 Biobase_2.38.0 GenomicRanges_1.30.3 GenomeInfoDb_1.14.0
[25] IRanges_2.12.0 S4Vectors_0.16.0 BiocGenerics_0.24.0

loaded via a namespace (and not attached):
[1] minqa_1.2.4 fgsea_1.4.1 colorspace_1.3-2 qvalue_2.10.0 htmlTable_1.11.2
[6] XVector_0.18.0 base64enc_0.1-3 rstudioapi_0.7 bit64_0.9-7 lubridate_1.7.4
[11] xml2_1.2.0 splines_3.4.3 mnormt_1.5-5 GOSemSim_2.4.1 geneplotter_1.56.0
[16] knitr_1.20 Formula_1.2-2 jsonlite_1.5 nloptr_1.0.4 broom_0.4.4
[21] annotate_1.56.2 cluster_2.0.7-1 GO.db_3.5.0 png_0.1-7 graph_1.56.0
[26] compiler_3.4.3 httr_1.3.1 rvcheck_0.1.0 backports_1.1.2 assertthat_0.2.0
[31] Matrix_1.2-14 lazyeval_0.2.1 cli_1.0.0 acepack_1.4.1 htmltools_0.3.6
[36] prettyunits_1.0.2 tools_3.4.3 igraph_1.2.1 gtable_0.2.0 glue_1.2.0
[41] GenomeInfoDbData_1.0.0 reshape2_1.4.3 DO.db_2.9 fastmatch_1.1-0 Rcpp_0.12.17
[46] cellranger_1.1.0 Biostrings_2.46.0 multtest_2.34.0 gdata_2.18.0 nlme_3.1-137
[51] psych_1.8.3.3 lme4_1.1-17 rvest_0.3.2 gtools_3.5.0 XML_3.98-1.11
[56] MASS_7.3-49 zlibbioc_1.24.0 scales_0.5.0 hms_0.4.2 RColorBrewer_1.1-2
[61] yaml_2.1.18 memoise_1.1.0 gridExtra_2.3 rpart_4.1-13 latticeExtra_0.6-28
[66] stringi_1.2.3 RSQLite_2.1.0 genefilter_1.60.0 checkmate_1.8.5 caTools_1.17.1
[71] BiocParallel_1.12.0 GSA_1.03 rlang_0.2.1 pkgconfig_2.0.1 bitops_1.0-6
[76] lattice_0.20-35 bindr_0.1.1 htmlwidgets_1.2 cowplot_0.9.2 bit_1.1-12
[81] tidyselect_0.2.4 plyr_1.8.4 magrittr_1.5 R6_2.2.2 gplots_3.0.1
[86] Hmisc_4.1-1 DBI_1.0.0 withr_2.1.2 pillar_1.2.2 haven_1.1.1
[91] foreign_0.8-69 KEGGREST_1.18.1 survival_2.42-3 RCurl_1.95-4.10 nnet_7.3-12
[96] modelr_0.1.1 crayon_1.3.4 utf8_1.1.3 KernSmooth_2.23-15 progress_1.1.2
[101] locfit_1.5-9.1 grid_3.4.3 readxl_1.1.0 data.table_1.11.4 blob_1.1.1
[106] digest_0.6.15 xtable_1.8-2 munsell_0.4.3

@mchimenti
Author

OK, I just saw this from another issue thread:

"Those warnings likely come from lme4, the package TcGSA relies upon to fit linear mixed effect models. The warnings you get are probably because you have so few replicates (only 2) to estimate the random effects, so lme4 cannot achieve convergence when fitting the mixed models. The results you get are therefore probably untrustworthy. I would advise against using TcGSA with so few replicates, because the mixed models' normality assumption on the random effects is then very questionable. You could look into CAMERA or globalANCOVA that could be better suited to analyze your data"

So, with only four replicates, am I going to get this warning no matter what? Are four replicates too few to use the TcGSA model? The results I'm getting make absolute sense and confirm results from IPA analysis and other orthogonal methods.

thanks,
Michael

@borishejblum
Collaborator

borishejblum commented Oct 1, 2018

Hello Michael,

First, thank you for your interest in TcGSA and our work.

This message most likely originates from the acast function of the reshape2 package (which TcGSA relies on for some data management tasks). More specifically, it is probably emitted when the TcGSA.LR function assembles the Estimations element of its output. Although it does not directly affect the test results you are getting, it should not happen, and it indicates that something is probably wrong in the way the data are fed to the TcGSA.LR function...
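As a rough illustration of when reshape2 prints this message: acast falls back to counting rows whenever the variables in its formula do not identify each row uniquely and no fun.aggregate is supplied. The sketch below uses a made-up table named estims whose column names mirror the TcGSA.LR call in this thread; the values are invented, and the reshape2 step is skipped if that package is not installed.

```r
# Toy stand-in for the long-format table reshaped inside TcGSA.LR.
# Column names follow the call in this thread; the values are made up.
estims <- data.frame(
  probe      = c("g1", "g1", "g1", "g1", "g1"),
  Patient_ID = c("P1", "P1", "P2", "P2", "P1"),
  t1         = c(0, 1, 0, 1, 1),
  fitted     = c(0.10, 0.20, 0.30, 0.40, 0.25)
)

# Flag rows whose (probe, subject, time) key has already appeared.
key_cols <- c("probe", "Patient_ID", "t1")
dup_rows <- duplicated(estims[, key_cols])
n_dup <- sum(dup_rows)
print(estims[dup_rows, ])

if (requireNamespace("reshape2", quietly = TRUE)) {
  # With a duplicated key and no fun.aggregate, acast emits
  # "Aggregation function missing: defaulting to length" and fills
  # each cell with a row count instead of the fitted value.
  counts <- reshape2::acast(estims, probe ~ Patient_ID ~ t1,
                            value.var = "fitted")
  print(counts["g1", "P1", "1"])
}
```

If the same thing happens inside TcGSA.LR, the affected cells of the reshaped array would hold counts rather than fitted values, which is worth checking before using the Estimations element for plots.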

Unfortunately, without a reproducible example of the error for me to run, it is hard to be of further help. If you could provide a reproducible script with some data to reproduce the error, I could look into it more.

Thank you,
Boris

@mchimenti
Author

mchimenti commented Oct 2, 2018

If the issue is with the way outputs from the lme calculation are handled, why would that have anything to do with the way my data are input?

Here is the relevant line (247 in tcGSA.LR.R):
estims_tab <- acast(data=estims, formula = stats::as.formula(paste("probe", subject_name, "t1", sep="~")), value.var="fitted")
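The formula in that line reshapes by probe, subject, and time only, so each subject/time pair must occur exactly once per probe. One way this could fail, which nothing in this thread confirms, is a design where the same Patient_ID labels are reused in both conditions. The sketch below builds a hypothetical exp_design (the real one is only visible as a screenshot) and uses base R to count samples per subject/time cell; the timepoints and labels are assumptions.

```r
# Hypothetical design: 4 patient labels reused in both conditions,
# 3 timepoints. None of these values come from the real exp_design.
exp_design <- expand.grid(
  Patient_ID = paste0("P", 1:4),
  timepoint  = c(0, 6, 24),
  condition  = c("disease", "normal"),
  stringsAsFactors = FALSE
)

# Samples per (subject, time) cell; a count above 1 means the pair
# does not identify a single sample.
cell_counts <- table(exp_design$Patient_ID, exp_design$timepoint)
max_before <- max(cell_counts)
print(cell_counts)

# Assumed remedy for this toy case only: make the subject labels
# unique across conditions, then recount.
exp_design$Patient_ID <- paste(exp_design$condition,
                               exp_design$Patient_ID, sep = "_")
max_after <- max(table(exp_design$Patient_ID, exp_design$timepoint))
```

Whether relabelling is appropriate depends on whether the same individuals were actually measured under both conditions, since it changes which samples share a random effect; that is a modelling question this thread does not settle.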

@borishejblum
Collaborator

borishejblum commented Oct 2, 2018

Dear Michael,

Again, it is hard to pinpoint the cause of the error without being able to reproduce it.
However, when running TcGSA.LR on my data or on examples from the package itself, this error is not showing up. This indicates that your data are probably not handled properly by TcGSA.LR.

But without being able to generate the error myself, I am unable to accurately diagnose its cause, and cannot provide a fix/failsafe to prevent it from happening again.

Boris
