Cells argument in FeatureMatrix #803

Closed
plbaldoni opened this issue Sep 23, 2021 · 2 comments
Labels
bug Something isn't working

Comments

@plbaldoni

Hi there,

What exactly is the format of the cells argument in the function FeatureMatrix? I am passing the barcode names from CountFragments, and FeatureMatrix returns NULL.

In the example below, all the barcodes I want to keep (target_cells) are present as column names of the full matrix out_all. But whenever I pass cells = target_cells to FeatureMatrix, I get NULL as the output.

> library(Signac)
> library(GenomicRanges)
> frag_path <- '~/fragments.tsv.gz'
> peak_path <- '~/peaks.bed'
> frag_object <- CreateFragmentObject(frag_path,verbose = TRUE)
Computing hash
> 
> frag_counts <- CountFragments(fragments = frag_path)
> 
> target_cells <- frag_counts[frag_counts$frequency_count > 1000, "CB"]
> 
> head(target_cells)
[1] "CTAGGATTCTTGTGCC-1" "CAGGATTGTTACGGAG-1" "GCGCCAAGTCACAGTT-1" "GCACGCAAGTACGCGA-1" "TGCTATTCATGGCCCA-1"
[6] "CCCTGATCATCATCGA-1"
> length(target_cells)
[1] 57
> 
> feat <- GenomicRanges::makeGRangesFromDataFrame(df = read.table(peak_path),
+                                                 seqnames.field = 'V1',
+                                                 start.field = 'V2',
+                                                 end.field = 'V3')
> feat
GRanges object with 8024 ranges and 0 metadata columns:
           seqnames            ranges strand
              <Rle>         <IRanges>  <Rle>
     [1]       chr1   4807482-4808370      *
     [2]       chr1   4857304-4858176      *
     [3]       chr1   5022457-5023326      *
     [4]       chr1   7088433-7089350      *
     [5]       chr1   7397617-7398457      *
     ...        ...               ...    ...
  [8020]       chrY 90808457-90809289      *
  [8021]       chrY 90812575-90813471      *
  [8022] JH584304.1       59182-60057      *
  [8023] GL456216.1       15533-16423      *
  [8024] GL456216.1       16735-17567      *
  -------
  seqinfo: 23 sequences from an unspecified genome; no seqlengths
> 
> out_all <- FeatureMatrix(fragments = frag_object,
+                          features = feat,
+                          verbose = TRUE)
Extracting reads overlapping genomic regions
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=11s  
> 
> all(target_cells %in% colnames(out_all))
[1] TRUE
> 
> out_cells <- FeatureMatrix(fragments = frag_object,
+                            features = feat,
+                            verbose = TRUE,
+                            cells = target_cells)
> 
> out_cells
NULL
> 
> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /stornext/System/data/apps/R/R-4.1.1/lib64/R/lib/libRblas.so
LAPACK: /stornext/System/data/apps/R/R-4.1.1/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] GenomicRanges_1.44.0 GenomeInfoDb_1.28.4  IRanges_2.26.0       S4Vectors_0.30.0     BiocGenerics_0.38.0 
[6] Signac_1.4.0         devtools_2.4.2       usethis_2.0.1       

loaded via a namespace (and not attached):
  [1] fastmatch_1.1-3        plyr_1.8.6             igraph_1.2.6           lazyeval_0.2.2        
  [5] splines_4.1.1          BiocParallel_1.26.2    listenv_0.8.0          SnowballC_0.7.0       
  [9] scattermore_0.7        ggplot2_3.3.5          digest_0.6.27          htmltools_0.5.2       
 [13] fansi_0.5.0            magrittr_2.0.1         memoise_2.0.0          tensor_1.5            
 [17] cluster_2.1.2          ROCR_1.0-11            remotes_2.4.0          globals_0.14.0        
 [21] Biostrings_2.60.2      matrixStats_0.61.0     docopt_0.7.1           spatstat.sparse_2.0-0 
 [25] prettyunits_1.1.1      colorspace_2.0-2       ggrepel_0.9.1          dplyr_1.0.7           
 [29] sparsesvd_0.2          callr_3.7.0            crayon_1.4.1           RCurl_1.98-1.5        
 [33] jsonlite_1.7.2         spatstat.data_2.1-0    survival_3.2-13        zoo_1.8-9             
 [37] glue_1.4.2             polyclip_1.10-0        gtable_0.3.0           zlibbioc_1.38.0       
 [41] XVector_0.32.0         leiden_0.3.9           pkgbuild_1.2.0         future.apply_1.8.1    
 [45] abind_1.4-5            scales_1.1.1           DBI_1.1.1              miniUI_0.1.1.1        
 [49] Rcpp_1.0.7             viridisLite_0.4.0      xtable_1.8-4           reticulate_1.22       
 [53] spatstat.core_2.3-0    htmlwidgets_1.5.4      httr_1.4.2             RColorBrewer_1.1-2    
 [57] ellipsis_0.3.2         Seurat_4.0.4           ica_1.0-2              farver_2.1.0          
 [61] pkgconfig_2.0.3        ggseqlogo_0.1          uwot_0.1.10            deldir_0.2-10         
 [65] utf8_1.2.2             tidyselect_1.1.1       rlang_0.4.11           reshape2_1.4.4        
 [69] later_1.3.0            munsell_0.5.0          tools_4.1.1            cachem_1.0.6          
 [73] cli_3.0.1              generics_0.1.0         ggridges_0.5.3         stringr_1.4.0         
 [77] fastmap_1.1.0          goftest_1.2-2          processx_3.5.2         fs_1.5.0              
 [81] fitdistrplus_1.1-5     purrr_0.3.4            RANN_2.6.1             pbapply_1.5-0         
 [85] future_1.22.1          nlme_3.1-153           mime_0.11              slam_0.1-48           
 [89] RcppRoll_0.3.0         compiler_4.1.1         rstudioapi_0.13        plotly_4.9.4.1        
 [93] png_0.1-7              testthat_3.0.4         spatstat.utils_2.2-0   tibble_3.1.4          
 [97] tweenr_1.0.2           stringi_1.7.4          ps_1.6.0               desc_1.3.0            
[101] lattice_0.20-44        Matrix_1.3-4           vctrs_0.3.8            pillar_1.6.2          
[105] lifecycle_1.0.0        spatstat.geom_2.2-2    lmtest_0.9-38          RcppAnnoy_0.0.19      
[109] data.table_1.14.0      cowplot_1.1.1          bitops_1.0-7           irlba_2.3.3           
[113] httpuv_1.6.3           patchwork_1.1.1        R6_2.5.1               promises_1.2.0.1      
[117] lsa_0.73.2             KernSmooth_2.23-20     gridExtra_2.3          parallelly_1.28.1     
[121] sessioninfo_1.1.1      codetools_0.2-18       MASS_7.3-54            assertthat_0.2.1      
[125] pkgload_1.2.2          rprojroot_2.0.2        withr_2.4.2            SeuratObject_4.0.2    
[129] qlcMatrix_0.9.7        sctransform_0.3.2      Rsamtools_2.8.0        GenomeInfoDbData_1.2.6
[133] mgcv_1.8-36            grid_4.1.1             rpart_4.1-15           tidyr_1.1.3           
[137] Rtsne_0.15             ggforce_0.3.3          shiny_1.7.0 

export.zip

@plbaldoni plbaldoni added the bug Something isn't working label Sep 23, 2021
@timoast
Collaborator

timoast commented Sep 23, 2021

cells is a vector of cell names

The issue here is that you provided a Fragment object with 0 cells in it, because the cells parameter was not set in CreateFragmentObject(). When a Fragment object is supplied to FeatureMatrix, it checks whether any of the requested cells are in the object before running. Setting the cells information in the Fragment object will solve this issue (see https://satijalab.org/signac/articles/data_structures.html#getting-and-setting-fragment-data).
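
As a rough sketch of that fix (not tested against the data in this issue), the barcodes can be passed when the Fragment object is created. It assumes frag_path, target_cells and feat are defined exactly as in the original post; only the cells argument of CreateFragmentObject() is new.

```r
library(Signac)

# Assumption: frag_path, target_cells and feat exist as in the original post.
# Register the barcodes of interest when creating the Fragment object,
# so that the object no longer contains 0 cells.
frag_object <- CreateFragmentObject(
  path = frag_path,
  cells = target_cells,
  verbose = TRUE
)

# The same barcodes can then be requested from FeatureMatrix.
out_cells <- FeatureMatrix(
  fragments = frag_object,
  features = feat,
  cells = target_cells,
  verbose = TRUE
)
```

With the cells stored in the Fragment object, the check described above should find the requested barcodes instead of stopping early.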

I'll update the documentation to make this clearer. We could also update the function so that, if the cells information in the Fragment object is NULL, it assumes the cells have not been set and still checks the fragment file.

@timoast
Collaborator

timoast commented Sep 23, 2021

Should be fixed now on the develop branch

@timoast timoast closed this as completed Sep 23, 2021