Error while running PCA on 1.3 Million Brain Cells from E18 Mice #1649
Hello @BSharmi,
I wonder whether the problem is that each worker is requesting too much memory (the effective size of the dataset will "increase" with each core used). Have you tried running it single-threaded to see whether that is the cause? Also, since the error is a refused connection on port 80, have you checked that your firewall or equivalent is not blocking the outgoing connection to the workers?
Best,
Sebastian
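One way to test the connection suggestion above from inside R is to try opening a TCP socket to the host named in the error message. This is only a minimal sketch: `can_connect` is a hypothetical helper written for illustration, not a function from Seurat, restfulSE, or any other package mentioned in this thread, and it uses base R only.

```r
# Hypothetical helper (not part of any package in this thread): returns TRUE
# if a TCP connection to host:port can be opened within `timeout` seconds,
# and FALSE if the attempt fails (refused, timed out, or host not resolved).
can_connect <- function(host, port = 80L, timeout = 5) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    ),
    error = function(e) NULL
  )
  if (is.null(con)) return(FALSE)
  close(con)
  TRUE
}

# Host name and port taken from the error message reported in this issue.
can_connect("hsdshdflab.hdfgroup.org", 80L)
```

A `FALSE` here would suggest the refusal happens at the network level, independent of Seurat or the PCA step; a `TRUE` would point back toward something in the R session itself.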
I have checked the firewall, and it does not seem to be blocking the connection. Did you mean to say the error is not reproducible?
Thank you.
Sharmi
Unfortunately, this does not appear to be related to Seurat, as we can certainly handle datasets of 30k cells. Note that you receive an identical error when trying to use runPCA on the SCE object, so this appears to be specific to your computational setup rather than to the Seurat converter (apologies).
Thank you for closing the issue.
While I could be wrong, and Seurat may well be able to handle huge datasets, I have encountered problems using the ‘Read10X_h5’ function on the 1.3 million cell dataset (open issue #1644).
Since it appears the problem is related to neither Seurat nor 10X Genomics, I guess I will have to look into alternative ways of loading large matrices.
Best,
Sharmi
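As one possible direction for the "alternative options" mentioned above: 10x HDF5 matrices are stored in compressed sparse column form (`data`, `indices`, `indptr`), so a subset of cells can in principle be assembled from just the slices that belong to those columns. The sketch below shows only the indexing logic on toy in-memory vectors. The helper name `subset_csc_columns` is hypothetical, and reading the slices from the actual file (for example with hdf5r) is assumed rather than shown.

```r
library(Matrix)

# Toy CSC arrays with 0-based indices, mimicking the data/indices/indptr
# layout. They encode this 3 genes x 4 cells matrix:
#      c1 c2 c3 c4
# g1    1  0  0  4
# g2    0  2  0  0
# g3    0  3  0  5
data    <- c(1, 2, 3, 4, 5)
indices <- c(0L, 1L, 2L, 0L, 2L)   # row of each nonzero value
indptr  <- c(0L, 1L, 3L, 3L, 5L)   # column start offsets, length n_cells + 1
n_genes <- 3L

# Hypothetical helper: build a sparse matrix holding only the columns `cols`
# (1-based cell positions), touching only the nonzeros of those columns.
subset_csc_columns <- function(data, indices, indptr, n_genes, cols) {
  starts <- indptr[cols] + 1L        # 1-based position of first nonzero
  ends   <- indptr[cols + 1L]        # 1-based position of last nonzero
  lens   <- ends - starts + 1L       # nonzeros per selected column
  keep   <- unlist(Map(function(s, n) seq_len(n) + s - 1L, starts, lens))
  sparseMatrix(i = indices[keep], p = c(0L, cumsum(lens)), x = data[keep],
               dims = c(n_genes, length(cols)), index1 = FALSE)
}

sub <- subset_csc_columns(data, indices, indptr, n_genes, cols = c(2L, 4L))
sub
```

With a real file, only `indptr` would need to be read in full; the `data` and `indices` ranges for the chosen cells could then be read piecewise, which keeps memory proportional to the subset rather than to all 1.3 million cells.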
@paupuigdevall @BSharmi were you able to find any workaround to this problem?
Hello,
I was wondering whether anyone has encountered this error with Seurat and the 10X 1.3 million cell data.
I am trying to analyze the 1.3 Million Brain Cells from E18 Mice dataset from 10X using R (https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.3.0/1M_neurons). Because of the data size, I used the graph-clustering results, which contain 60 clusters, and tested just one cluster. I am getting an error while running PCA on the SingleCellExperiment object.
Please find below my code:
I get an error at the PCA step:
Error in curl::curl_fetch_memory(url, handle = handle) : Failed to connect to hsdshdflab.hdfgroup.org port 80: Connection refused
I get the following error if I try to create a Seurat object bypassing the PCA step:
Error in curl::curl_fetch_memory(url, handle = handle) : Failed to connect to hsdshdflab.hdfgroup.org port 80: Connection refused
Calls: as.Seurat ... request_fetch -> request_fetch.write_memory ->
Execution halted
This cluster is not very big (27998 genes and 18919 cells), so I am wondering why it is failing. If I use the randomly sampled 20k cells generated by 10X, I do not have any problem creating the Seurat object. Can someone please let me know how to solve this problem?
Thank you very much
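For a rough sense of scale, the dimensions quoted above can be turned into a memory estimate. This is only back-of-the-envelope arithmetic: it assumes the cluster is held as a dense matrix of 8-byte doubles, which is a worst case (a sparse representation would be much smaller), and `n_workers` is a hypothetical value chosen for illustration, not one taken from this issue.

```r
n_genes <- 27998
n_cells <- 18919
bytes_per_double <- 8

# Size of one dense double-precision copy of the cluster, in GiB.
dense_gib <- n_genes * n_cells * bytes_per_double / 1024^3
round(dense_gib, 2)   # 3.95

# Assumption: each parallel worker ends up holding its own dense copy.
n_workers <- 4        # hypothetical worker count
round(dense_gib * n_workers, 2)   # 15.79
```

Roughly 4 GiB per dense copy is modest on its own, but it multiplies quickly if several workers each materialize the full matrix.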
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /apps/easybuild/software/pegasus-sandybridge/OpenBLAS/0.3.1-GCC-7.3.0-2.30/lib/libopenblas_sandybridgep-r0.3.1.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] restfulSEData_1.4.0 ExperimentHub_1.8.0
[3] AnnotationHub_2.14.5 loomR_0.2.1.9000
[5] hdf5r_1.2.0 R6_2.4.0
[7] scater_1.10.1 dplyr_0.8.1
[9] zinbwave_1.4.2 biomaRt_2.38.0
[11] ggplot2_3.1.1 magrittr_1.5
[13] scRNAseq_1.8.0 Seurat_3.0.1
[15] TENxGenomics_0.0.27 Matrix_1.2-14
[17] BiocFileCache_1.6.0 dbplyr_1.2.2
[19] SingleCellExperiment_1.4.1 restfulSE_1.4.1
[21] SummarizedExperiment_1.12.0 DelayedArray_0.8.0
[23] BiocParallel_1.16.6 matrixStats_0.54.0
[25] Biobase_2.42.0 GenomicRanges_1.34.0
[27] GenomeInfoDb_1.18.2 IRanges_2.16.0
[29] S4Vectors_0.20.1 BiocGenerics_0.28.0
loaded via a namespace (and not attached):
[1] copula_0.999-19.1 bigrquery_1.1.1
[3] plyr_1.8.4 igraph_1.2.4.1
[5] lazyeval_0.2.2 splines_3.5.1
[7] pspline_1.0-18 listenv_0.7.0
[9] digest_0.6.19 foreach_1.4.4
[11] htmltools_0.3.6 viridis_0.5.1
[13] GO.db_3.7.0 gdata_2.18.0
[15] memoise_1.1.0 cluster_2.0.7-1
[17] ROCR_1.0-7 limma_3.38.3
[19] annotate_1.60.1 globals_0.12.4
[21] stabledist_0.7-1 R.utils_2.8.0
[23] prettyunits_1.0.2 colorspace_1.3-2
[25] blob_1.1.1 rappdirs_0.3.1
[27] ggrepel_0.8.1 crayon_1.3.4
[29] RCurl_1.95-4.11 jsonlite_1.6
[31] genefilter_1.64.0 iterators_1.0.10
[33] survival_2.44-1.1 zoo_1.8-6
[35] ape_5.3 glue_1.3.1
[37] gtable_0.3.0 zlibbioc_1.28.0
[39] XVector_0.22.0 Rhdf5lib_1.4.3
[41] future.apply_1.2.0 HDF5Array_1.10.1
[43] scales_1.0.0 mvtnorm_1.0-10
[45] edgeR_3.24.3 DBI_1.0.0
[47] bibtex_0.4.2 Rcpp_1.0.1
[49] metap_1.1 viridisLite_0.3.0
[51] xtable_1.8-2 progress_1.2.0
[53] reticulate_1.12 bit_1.1-14
[55] rsvd_1.0.1 SDMTools_1.1-221.1
[57] rhdf5client_1.4.1 tsne_0.1-3
[59] glmnet_2.0-16 htmlwidgets_1.3
[61] httr_1.4.0 gplots_3.0.1.1
[63] RColorBrewer_1.1-2 ica_1.0-2
[65] pkgconfig_2.0.2 XML_3.98-1.15
[67] R.methodsS3_1.7.1 locfit_1.5-9.1
[69] softImpute_1.4 tidyselect_0.2.5
[71] rlang_0.3.4 reshape2_1.4.3
[73] later_0.7.4 AnnotationDbi_1.44.0
[75] munsell_0.5.0 tools_3.5.1
[77] RSQLite_2.1.1 ggridges_0.5.1
[79] stringr_1.4.0 yaml_2.2.0
[81] npsurv_0.4-0 bit64_0.9-7
[83] fitdistrplus_1.0-14 caTools_1.17.1.2
[85] purrr_0.3.2 RANN_2.6.1
[87] pbapply_1.4-0 future_1.13.0
[89] nlme_3.1-137 mime_0.6
[91] R.oo_1.22.0 compiler_3.5.1
[93] beeswarm_0.2.3 plotly_4.9.0
[95] curl_3.3 png_0.1-7
[97] interactiveDisplayBase_1.20.0 lsei_1.2-0
[99] tibble_2.1.2 pcaPP_1.9-73
[101] stringi_1.4.3 gsl_2.1-6
[103] lattice_0.20-35 pillar_1.4.1
[105] ADGofTest_0.3 BiocManager_1.30.4
[107] Rdpack_0.11-0 lmtest_0.9-37
[109] data.table_1.12.2 cowplot_0.9.4
[111] bitops_1.0-6 irlba_2.3.3
[113] gbRd_0.4-11 httpuv_1.4.5
[115] promises_1.0.1 KernSmooth_2.23-15
[117] gridExtra_2.3 vipor_0.4.5
[119] codetools_0.2-15 MASS_7.3-50
[121] gtools_3.8.1 assertthat_0.2.1
[123] rhdf5_2.26.2 rjson_0.2.20
[125] withr_2.1.2 sctransform_0.2.0
[127] GenomeInfoDbData_1.2.0 hms_0.4.2
[129] grid_3.5.1 tidyr_0.8.3
[131] DelayedMatrixStats_1.4.0 Rtsne_0.15
[133] numDeriv_2016.8-1 shiny_1.1.0
[135] ggbeeswarm_0.6.0