FindMotifs gives error if run with only one input feature #732

Closed
liz-is opened this issue Jul 22, 2021 · 2 comments
Labels
bug Something isn't working

liz-is commented Jul 22, 2021

Problem: FindMotifs gives a (slightly cryptic) error message when run with only one feature

library("Signac")
library("JASPAR2020")
library("TFBSTools")
library("BSgenome.Hsapiens.UCSC.hg19")
pfm <- getMatrixSet(x = JASPAR2020, 
                    opts = list(species = 9606, all_versions = FALSE))

# add motif information
atac_small <- AddMotifs(object = atac_small, 
                 genome = BSgenome.Hsapiens.UCSC.hg19,
                 pfm = pfm)

motifs <- FindMotifs(object = atac_small, features = "chr1-9064752-9065614")
# Selecting background regions to match input sequence characteristics
# Matching GC.percent distribution
# Testing motif enrichment in 1 regions
# Error in base::colSums(x, na.rm = na.rm, dims = dims, ...) : 
#   'x' must be an array of at least two dimensions

Expected result: that it would be possible (even if it's probably not biologically meaningful!) to run FindMotifs on any number of features, or that it would fail with an error message that makes the problem clear without reading the code

Proposed solution: If you add drop=FALSE to the matrix subsetting statement in this line and the one below, the objects returned will be matrices even if they only have one row, so colSums will work.
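
To illustrate the behaviour described above, here is a minimal base R sketch. The matrix `m` and its row and column names are made up for illustration and are not taken from Signac's code; the real object is a region-by-motif matrix, but the subsetting rule is the same.

```r
# Toy stand-in for a region-by-motif matrix (hypothetical data, not from Signac)
m <- matrix(1:6, nrow = 3,
            dimnames = list(c("region1", "region2", "region3"),
                            c("motifA", "motifB")))

# Default subsetting of a single row drops the result to a named vector
v <- m["region1", ]
is.matrix(v)  # FALSE

# colSums() then fails; capture the error message instead of stopping
res <- tryCatch(colSums(v), error = function(e) conditionMessage(e))

# With drop = FALSE the result stays a 1 x 2 matrix, so colSums() works
k <- m["region1", , drop = FALSE]
dim(k)        # 1 2
colSums(k)    # motifA = 1, motifB = 4
```

With a single input feature, the default subsetting produces the vector case, which is what triggers the colSums error shown in the output above.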

sessionInfo:
sessionInfo()
# R version 4.1.0 (2021-05-18)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 20.04.2 LTS
# 
# Matrix products: default
# BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so
# 
# locale:
#   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
# [6] LC_MESSAGES=C              LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
# [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
# 
# attached base packages:
#   [1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
#   [1] BSgenome.Hsapiens.UCSC.hg19_1.4.3 BSgenome_1.60.0                   rtracklayer_1.52.0                Biostrings_2.60.1                
# [5] XVector_0.32.0                    GenomicRanges_1.44.0              GenomeInfoDb_1.28.1               IRanges_2.26.0                   
# [9] S4Vectors_0.30.0                  BiocGenerics_0.38.0               TFBSTools_1.30.0                  JASPAR2020_0.99.10               
# [13] Signac_1.3.0                     
# 
# loaded via a namespace (and not attached):
#   [1] utf8_1.2.1                  reticulate_1.20             R.utils_2.10.1              tidyselect_1.1.1            poweRlaw_0.70.6            
# [6] RSQLite_2.2.7               AnnotationDbi_1.54.1        htmlwidgets_1.5.3           grid_4.1.0                  docopt_0.7.1               
# [11] BiocParallel_1.26.1         Rtsne_0.15                  munsell_0.5.0               codetools_0.2-18            ica_1.0-2                  
# [16] future_1.21.0               miniUI_0.1.1.1              colorspace_2.0-2            Biobase_2.52.0              knitr_1.33                 
# [21] rstudioapi_0.13             Seurat_4.0.3                ROCR_1.0-11                 tensor_1.5                  listenv_0.8.0              
# [26] MatrixGenerics_1.4.0        slam_0.1-48                 GenomeInfoDbData_1.2.6      polyclip_1.10-0             bit64_4.0.5                
# [31] farver_2.1.0                parallelly_1.26.1           vctrs_0.3.8                 generics_0.1.0              xfun_0.24                  
# [36] lsa_0.73.2                  ggseqlogo_0.1               R6_2.5.0                    bitops_1.0-7                spatstat.utils_2.2-0       
# [41] cachem_1.0.5                DelayedArray_0.18.0         assertthat_0.2.1            promises_1.2.0.1            BiocIO_1.2.0               
# [46] scales_1.1.1                gtable_0.3.0                globals_0.14.0              goftest_1.2-2               seqLogo_1.58.0             
# [51] rlang_0.4.11                RcppRoll_0.3.0              splines_4.1.0               lazyeval_0.2.2              spatstat.geom_2.2-2        
# [56] BiocManager_1.30.16         yaml_2.2.1                  reshape2_1.4.4              abind_1.4-5                 httpuv_1.6.1               
# [61] tools_4.1.0                 ggplot2_3.3.5               ellipsis_0.3.2              spatstat.core_2.2-0         RColorBrewer_1.1-2         
# [66] ggridges_0.5.3              Rcpp_1.0.7                  plyr_1.8.6                  zlibbioc_1.38.0             purrr_0.3.4                
# [71] RCurl_1.98-1.3              rpart_4.1-15                deldir_0.2-10               pbapply_1.4-3               cowplot_1.1.1              
# [76] zoo_1.8-9                   SeuratObject_4.0.2          SummarizedExperiment_1.22.0 ggrepel_0.9.1               cluster_2.1.2              
# [81] motifmatchr_1.14.0          magrittr_2.0.1              data.table_1.14.0           scattermore_0.7             lmtest_0.9-38              
# [86] RANN_2.6.1                  SnowballC_0.7.0             fitdistrplus_1.1-5          matrixStats_0.59.0          hms_1.1.0                  
# [91] patchwork_1.1.1             mime_0.11                   evaluate_0.14               xtable_1.8-4                XML_3.99-0.6               
# [96] sparsesvd_0.2               gridExtra_2.3               compiler_4.1.0              tibble_3.1.2                KernSmooth_2.23-20         
# [101] crayon_1.4.1                R.oo_1.24.0                 htmltools_0.5.1.1           mgcv_1.8-36                 later_1.2.0                
# [106] tidyr_1.1.3                 DBI_1.1.1                   tweenr_1.0.2                MASS_7.3-54                 Matrix_1.3-4               
# [111] readr_1.4.0                 R.methodsS3_1.8.1           igraph_1.2.6                pkgconfig_2.0.3             GenomicAlignments_1.28.0   
# [116] TFMPvalue_0.0.8             plotly_4.9.4.1              spatstat.sparse_2.0-0       annotate_1.70.0             DirichletMultinomial_1.34.0
# [121] stringr_1.4.0               digest_0.6.27               sctransform_0.3.2           RcppAnnoy_0.0.18            pracma_2.3.3               
# [126] CNEr_1.28.0                 spatstat.data_2.1-0         rmarkdown_2.9               leiden_0.3.8                fastmatch_1.1-0            
# [131] uwot_0.1.10                 restfulr_0.0.13             shiny_1.6.0                 Rsamtools_2.8.0             gtools_3.9.2               
# [136] rjson_0.2.20                lifecycle_1.0.0             nlme_3.1-152                jsonlite_1.7.2              viridisLite_0.4.0          
# [141] fansi_0.5.0                 pillar_1.6.1                lattice_0.20-44             KEGGREST_1.32.0             fastmap_1.1.0              
# [146] httr_1.4.2                  survival_3.2-11             GO.db_3.13.0                glue_1.4.2                  qlcMatrix_0.9.7            
# [151] png_0.1-7                   bit_4.0.4                   ggforce_0.3.3               stringi_1.7.2               blob_1.2.1                 
# [156] caTools_1.18.2              memoise_2.0.0               dplyr_1.0.7                 irlba_2.3.3                 future.apply_1.7.0         

timoast (Collaborator) commented Jul 23, 2021

Thanks for the bug report. I've now fixed this on the develop branch, although it's really not recommended to do motif enrichment for only one region. I've also added a warning that is shown when this function is run with fewer than 10 regions as input.
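
For readers who want a similar guard in their own wrapper code, here is a rough sketch of such a check. The helper name `warn_if_few_regions`, its arguments, and the warning text are all hypothetical; this is not the code that was added to Signac, only one way a check on the number of input regions could look.

```r
# Hypothetical helper (not Signac's implementation): warn when the number of
# input regions is below a threshold. The default of 10 follows the comment above.
warn_if_few_regions <- function(features, min_regions = 10) {
  n <- length(features)
  if (n < min_regions) {
    warning("Testing motif enrichment with only ", n,
            " region(s); results may not be meaningful", call. = FALSE)
  }
  invisible(n)
}

# A single region triggers the warning; tryCatch is used here only to show it
res <- tryCatch(
  warn_if_few_regions("chr1-9064752-9065614"),
  warning = function(w) "warned"
)
```

Using `warning()` rather than `stop()` lets the call proceed, which matches the expectation in the report that the function should still run on any number of features.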

timoast closed this as completed Jul 23, 2021

RaghadShu commented

Hello,
A side question: why isn't it recommended to run FindMotifs with one region? For example, I found an interesting open chromatin region that has a high link score with a DEG, and that peak is differentially accessible between my clusters. I am really interested in finding possible motifs/TF binding sites in this region, and possibly inferring regulation of that DEG. If I can't use FindMotifs on this one region, is there an alternative way?

Best,
Raghad
