CreateChromatinAssay() error with cellranger-atac v2 fragment file #609

Closed
rk5400 opened this issue May 5, 2021 · 4 comments
Labels
enhancement New feature or request

Comments

rk5400 commented May 5, 2021

Hello, 10x Genomics recently released a new version of Cell Ranger ATAC (v2, May 3, 2021) with improved features. I wanted to re-run previous samples to test whether my data would look different in Seurat/Signac. I found that CreateChromatinAssay() crashes with the error "caught segfault", cause 'memory not mapped'. The same function works with the fragment file previously generated by Cell Ranger ATAC v1.2. I tried on both Linux and macOS, with the same result. It is unclear whether the issue comes from R or from the fragment file generated by Cell Ranger ATAC v2. I also tried a 10x dataset generated with Cell Ranger ATAC v2, and CreateChromatinAssay() crashed there too.

Here is the code I am using:

library(Seurat)  # provides Read10X_h5()
library(Signac)  # provides CreateChromatinAssay()

counts <- Read10X_h5(filename = "/path_to/filtered_peak_bc_matrix.h5")
metadata <- read.csv(
  file = "/path_to/singlecell.csv",
  header = TRUE,
  row.names = 1
)
chrom_assay <- CreateChromatinAssay(
  counts = counts,
  sep = c(":", "-"),
  genome = 'hg38',
  fragments = '/path_to/fragments.tsv.gz',
  min.cells = 10,
  min.features = 200
)
#Computing hash
Checking for 4006 cell barcodes
*** caught segfault ***
address (nil), cause 'memory not mapped'
Traceback:
1: validateCells(fragments = filepath, cells = cells, find_n = find_n, max_lines = max.lines, verbose = verbose)
2: ValidateCells(object = frags, verbose = verbose, ...)
3: CreateFragmentObject(path = fragments, cells = cells, validate.fragments = validate.fragments, verbose = verbose, ...)
4: CreateChromatinAssay(counts = counts, sep = c(":", "-"), genome = "hg38", fragments = "/path_to/fragments.tsv.gz", min.cells = 10, min.features = 200)

sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
[4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
[7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] patchwork_1.1.1 ggplot2_3.3.3
[3] EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.14.1
[5] AnnotationFilter_1.14.0 GenomicFeatures_1.42.3
[7] AnnotationDbi_1.52.0 Biobase_2.50.0
[9] GenomicRanges_1.42.0 GenomeInfoDb_1.26.7
[11] IRanges_2.24.1 S4Vectors_0.28.1
[13] BiocGenerics_0.36.1 SeuratObject_4.0.0
[15] Seurat_4.0.1 Signac_1.2.0

loaded via a namespace (and not attached):
[1] fastmatch_1.1-0 BiocFileCache_1.14.0
[3] plyr_1.8.6 igraph_1.2.6
[5] lazyeval_0.2.2 splines_4.0.5
[7] BiocParallel_1.24.1 listenv_0.8.0
[9] scattermore_0.7 SnowballC_0.7.0
[11] digest_0.6.27 htmltools_0.5.1.1
[13] fansi_0.4.2 magrittr_2.0.1
[15] memoise_2.0.0 tensor_1.5
[17] cluster_2.1.1 ROCR_1.0-11
[19] globals_0.14.0 Biostrings_2.58.0
[21] matrixStats_0.58.0 docopt_0.7.1
[23] askpass_1.1 spatstat.sparse_2.0-0
[25] prettyunits_1.1.1 colorspace_2.0-1
[27] rappdirs_0.3.3 blob_1.2.1
[29] ggrepel_0.9.1 dplyr_1.0.6
[31] sparsesvd_0.2 crayon_1.4.1
[33] RCurl_1.98-1.3 jsonlite_1.7.2
[35] spatstat.data_2.1-0 survival_3.2-10
[37] zoo_1.8-9 glue_1.4.2
[39] polyclip_1.10-0 gtable_0.3.0
[41] zlibbioc_1.36.0 XVector_0.30.0
[43] leiden_0.3.7 DelayedArray_0.16.3
[45] future.apply_1.7.0 abind_1.4-5
[47] scales_1.1.1 DBI_1.1.1
[49] miniUI_0.1.1.1 Rcpp_1.0.6
[51] progress_1.2.2 viridisLite_0.4.0
[53] xtable_1.8-4 reticulate_1.20
[55] spatstat.core_2.1-2 bit_4.0.4
[57] htmlwidgets_1.5.3 httr_1.4.2
[59] RColorBrewer_1.1-2 ellipsis_0.3.2
[61] ica_1.0-2 XML_3.99-0.6
[63] pkgconfig_2.0.3 farver_2.1.0
[65] dbplyr_2.1.1 ggseqlogo_0.1
[67] uwot_0.1.10 deldir_0.2-10
[69] utf8_1.2.1 tidyselect_1.1.1
[71] rlang_0.4.11 reshape2_1.4.4
[73] later_1.2.0 munsell_0.5.0
[75] tools_4.0.5 cachem_1.0.4
[77] generics_0.1.0 RSQLite_2.2.7
[79] ggridges_0.5.3 stringr_1.4.0
[81] fastmap_1.1.0 goftest_1.2-2
[83] bit64_4.0.5 fitdistrplus_1.1-3
[85] purrr_0.3.4 RANN_2.6.1
[87] pbapply_1.4-3 future_1.21.0
[89] nlme_3.1-152 mime_0.10
[91] slam_0.1-48 RcppRoll_0.3.0
[93] xml2_1.3.2 biomaRt_2.46.3
[95] compiler_4.0.5 rstudioapi_0.13
[97] curl_4.3.1 plotly_4.9.3
[99] png_0.1-7 spatstat.utils_2.1-0
[101] tibble_3.1.1 tweenr_1.0.2
[103] stringi_1.5.3 lattice_0.20-41
[105] ProtGenerics_1.22.0 Matrix_1.3-2
[107] vctrs_0.3.8 pillar_1.6.0
[109] lifecycle_1.0.0 spatstat.geom_2.1-0
[111] lmtest_0.9-38 RcppAnnoy_0.0.18
[113] data.table_1.14.0 cowplot_1.1.1
[115] bitops_1.0-7 irlba_2.3.3
[117] rtracklayer_1.50.0 httpuv_1.6.0
[119] R6_2.5.0 promises_1.2.0.1
[121] KernSmooth_2.23-18 gridExtra_2.3
[123] lsa_0.73.2 parallelly_1.25.0
[125] codetools_0.2-18 MASS_7.3-53.1
[127] assertthat_0.2.1 SummarizedExperiment_1.20.0
[129] openssl_1.4.4 withr_2.4.2
[131] GenomicAlignments_1.26.0 qlcMatrix_0.9.7
[133] sctransform_0.3.2 Rsamtools_2.6.0
[135] GenomeInfoDbData_1.2.4 hms_1.0.0
[137] mgcv_1.8-33 grid_4.0.5
[139] rpart_4.1-15 tidyr_1.1.3
[141] MatrixGenerics_1.2.1 Rtsne_0.15
[143] ggforce_0.3.3 shiny_1.6.0

@rk5400 rk5400 added the bug Something isn't working label May 5, 2021
Onero23 commented May 6, 2021

Hello,
I confirm the same issue using the same code.
Thanks!

Code:
counts1 <- Read10X_h5(filename = "/path_to/filtered_peak_bc_matrix.h5")
metadata <- read.csv(
  file = "/path_to/singlecell.csv",
  header = TRUE,
  row.names = 1
)
chrom_assay <- CreateChromatinAssay(
  counts = counts1,
  sep = c(":", "-"),
  genome = 'hg38',
  fragments = '/path_to/fragments.tsv.gz',
  min.cells = 10,
  min.features = 200
)

Error:
[63386:63386:20210506,113844,78921:ERROR elf_dynamic_array_reader.h:61] tag not found
[63386:63387:20210506,113844,807371:ERROR directory_reader_posix.cc:42] opendir: No such file or directory (2)

sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
[1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8 LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] ggplot2_3.3.3 BSgenome.Hsapiens.UCSC.hg38_1.4.3 BSgenome_1.58.0 rtracklayer_1.50.0 Biostrings_2.58.0
[6] XVector_0.30.0 EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.14.0 AnnotationFilter_1.14.0 GenomicFeatures_1.42.2
[11] AnnotationDbi_1.52.0 Biobase_2.50.0 GenomicRanges_1.42.0 GenomeInfoDb_1.26.4 IRanges_2.24.1
[16] S4Vectors_0.28.1 BiocGenerics_0.36.0 Signac_1.2.0 SeuratObject_4.0.0 Seurat_4.0.1

loaded via a namespace (and not attached):
[1] fastmatch_1.1-0 BiocFileCache_1.14.0 plyr_1.8.6 igraph_1.2.6 lazyeval_0.2.2 splines_4.0.5
[7] BiocParallel_1.24.1 listenv_0.8.0 scattermore_0.7 SnowballC_0.7.0 digest_0.6.27 htmltools_0.5.1.1
[13] fansi_0.4.2 magrittr_2.0.1 memoise_2.0.0 tensor_1.5 cluster_2.1.2 ROCR_1.0-11
[19] globals_0.14.0 matrixStats_0.58.0 docopt_0.7.1 askpass_1.1 spatstat.sparse_2.0-0 prettyunits_1.1.1
[25] colorspace_2.0-1 rappdirs_0.3.3 blob_1.2.1 ggrepel_0.9.1 xfun_0.22 dplyr_1.0.6
[31] sparsesvd_0.2 crayon_1.4.1 RCurl_1.98-1.3 jsonlite_1.7.2 spatstat.data_2.1-0 survival_3.2-10
[37] zoo_1.8-9 glue_1.4.2 polyclip_1.10-0 gtable_0.3.0 zlibbioc_1.36.0 leiden_0.3.7
[43] DelayedArray_0.16.2 future.apply_1.7.0 abind_1.4-5 scales_1.1.1 DBI_1.1.1 miniUI_0.1.1.1
[49] Rcpp_1.0.6 progress_1.2.2 viridisLite_0.4.0 xtable_1.8-4 reticulate_1.20 spatstat.core_2.1-2
[55] bit_4.0.4 htmlwidgets_1.5.3 httr_1.4.2 RColorBrewer_1.1-2 ellipsis_0.3.2 ica_1.0-2
[61] XML_3.99-0.6 pkgconfig_2.0.3 farver_2.1.0 dbplyr_2.1.1 ggseqlogo_0.1 uwot_0.1.10
[67] deldir_0.2-10 utf8_1.2.1 tidyselect_1.1.1 rlang_0.4.11 reshape2_1.4.4 later_1.2.0
[73] munsell_0.5.0 tools_4.0.5 cachem_1.0.4 generics_0.1.0 RSQLite_2.2.7 ggridges_0.5.3
[79] stringr_1.4.0 fastmap_1.1.0 goftest_1.2-2 bit64_4.0.5 fitdistrplus_1.1-3 purrr_0.3.4
[85] RANN_2.6.1 pbapply_1.4-3 future_1.21.0 nlme_3.1-152 mime_0.10 slam_0.1-48
[91] RcppRoll_0.3.0 xml2_1.3.2 biomaRt_2.46.3 compiler_4.0.5 rstudioapi_0.13 curl_4.3.1
[97] plotly_4.9.3 png_0.1-7 spatstat.utils_2.1-0 tibble_3.1.1 tweenr_1.0.2 stringi_1.5.3
[103] lattice_0.20-41 ProtGenerics_1.22.0 Matrix_1.3-2 vctrs_0.3.8 pillar_1.6.0 lifecycle_1.0.0
[109] spatstat.geom_2.1-0 lmtest_0.9-38 RcppAnnoy_0.0.18 data.table_1.14.0 cowplot_1.1.1 bitops_1.0-7
[115] irlba_2.3.3 httpuv_1.6.0 patchwork_1.1.1 R6_2.5.0 promises_1.2.0.1 KernSmooth_2.23-18
[121] gridExtra_2.3 lsa_0.73.2 parallelly_1.25.0 codetools_0.2-18 MASS_7.3-54 assertthat_0.2.1
[127] SummarizedExperiment_1.20.0 openssl_1.4.4 withr_2.4.2 GenomicAlignments_1.26.0 qlcMatrix_0.9.7 sctransform_0.3.2
[133] Rsamtools_2.6.0 GenomeInfoDbData_1.2.4 hms_1.0.0 mgcv_1.8-35 grid_4.0.5 rpart_4.1-15
[139] tidyr_1.1.3 MatrixGenerics_1.2.1 Rtsne_0.15 ggforce_0.3.3 shiny_1.6.0 tinytex_0.31

Collaborator

timoast commented May 6, 2021

Thanks for pointing this out. It looks like the fragment file format has changed in the new cellranger-atac version to include a header. This is not currently supported in Signac, but we will work on adding support as soon as we can. As a workaround, you can either use the previous cellranger version, or remove the header lines from the fragment file produced by cellranger-atac v2.
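
For anyone who wants to try the header-removal workaround from within R, here is a minimal sketch. It assumes the header lines are the ones starting with "#" and that everything else is a regular fragment record. strip_fragment_header() is a hypothetical helper written for this example, not a Signac function, and the file names in the usage comments are placeholders.

```r
# Hypothetical helper (not part of Signac): copy a gzipped fragment file to a
# plain-text file, dropping every line that starts with "#".
# Assumption: header lines are exactly the lines beginning with "#".
strip_fragment_header <- function(infile, outfile, chunk_size = 100000L) {
  con_in <- gzfile(infile, open = "rt")
  on.exit(close(con_in), add = TRUE)
  con_out <- file(outfile, open = "wt")
  on.exit(close(con_out), add = TRUE)

  n_dropped <- 0L
  repeat {
    # Read in chunks so large fragment files do not have to fit in memory
    lines <- readLines(con_in, n = chunk_size)
    if (length(lines) == 0L) break
    is_header <- startsWith(lines, "#")
    n_dropped <- n_dropped + sum(is_header)
    writeLines(lines[!is_header], con_out)
  }
  # Return the number of removed header lines invisibly
  invisible(n_dropped)
}

# Usage sketch (placeholder paths; assumes Rsamtools is installed).
# The stripped file has to be bgzip-compressed and tabix-indexed again:
# strip_fragment_header("fragments.tsv.gz", "fragments.noheader.tsv")
# bgz <- Rsamtools::bgzip("fragments.noheader.tsv",
#                         dest = "fragments.noheader.tsv.gz",
#                         overwrite = TRUE)
# Rsamtools::indexTabix(bgz, format = "bed")
```

The re-compressed file and its .tbi index can then be passed to the fragments argument of CreateChromatinAssay() in place of the original file.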

@timoast timoast added enhancement New feature or request and removed bug Something isn't working labels May 6, 2021
Collaborator

timoast commented May 10, 2021

Support for cellranger-atac v2 fragment files has now been added to the develop branch. See installation instructions here: https://satijalab.org/signac/articles/install.html#development-version-1

@1061047021lql-AI

Thanks for pointing this out, it looks like they have changed the fragment file format in the new cellranger-atac version to include a header. This is not currently supported in Signac, but we will work on adding support as soon as we can. As a workaround, you can either use the previous cellranger version, or remove the header lines from the fragment file produced by cellranger-atac v2.

Hi, I ran into a problem when running CreateChromatinAssay() while analyzing scATAC-seq data. The error is shown below.
Thank you in advance.

Error in download.file(url, destfile, quiet = TRUE) :
cannot open URL 'http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/chromInfo.txt.gz'
Calls: CreateChromatinAssay ... fetch_table_from_UCSC_database -> fetch_table_from_url
Execution halted
