Loading 10x Genomics Data: Error in step_subset #214

Open
mdozmorov opened this issue Feb 13, 2022 · 8 comments

@mdozmorov

Hello. I started with the Loading 10x Genomics Data tutorial, downloaded the CSV files from the 10x website, and ran immdata_10x <- repLoad(file_path). It results in an error, which is also reproducible with the data I actually want to analyze:

== Step 1/3: loading repertoire files... ==

Processing "/Users/mdozmorov/Documents/Data/VCU_work/Sawalha/2021-06.scRNA_scATAC/test_immunarch/data" ...
  -- [1/5] Parsing "/Users/mdozmorov/Documents/Data/VCU_work/Sawalha/2021-06.scRNA_scATAC/test_immunarch/data/vdj_v1_mm_c57bl6_pbmc_t_all_contig_annotations.csv" -- 10x (filt.contigs)
  [!] Removed 2917 clonotypes with no nucleotide and amino acid CDR3 sequence.                                                             
Error in step_subset(parent, vars = vars, groups = groups, arrange = arrange,  : 
  is.null(j) || is_expression(j) is not TRUE
In addition: Warning message:
The following named parsers don't match the column names: barcode,is_cell,contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,cdr3,cdr3_nt,reads,umis,raw_clonotype_id,raw_consensus_id 

The files I downloaded and put in a separate file_path folder are:

vdj_v1_mm_c57bl6_pbmc_t_all_contig_annotations.csv
vdj_v1_mm_c57bl6_pbmc_t_clonotypes.csv
vdj_v1_mm_c57bl6_pbmc_t_consensus_annotations.csv
vdj_v1_mm_c57bl6_pbmc_t_filtered_contig_annotations.csv
vdj_v1_mm_c57bl6_pbmc_t_metrics_summary.csv

I'm using Immunarch v.0.6.7 on a Mac. What may be wrong?
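
The warning in the log above lists every expected 10x column as unmatched, so one quick diagnostic is to compare a file's header row against those names. The following is a minimal sketch in base R, not a fix; the helper name missing_columns and the shortened list of expected columns are assumptions made for illustration and are not part of immunarch.

```r
# Hypothetical helper (not part of immunarch): report which expected
# column names are absent from a CSV file's header row.
missing_columns <- function(csv_path, expected) {
  # Read the header plus at most one data row; check.names = FALSE keeps
  # the column names exactly as they appear in the file.
  header <- names(read.csv(csv_path, nrows = 1, check.names = FALSE))
  setdiff(expected, header)
}

# A subset of the column names quoted in the warning message.
expected_10x <- c("barcode", "is_cell", "contig_id", "chain",
                  "v_gene", "j_gene", "cdr3", "cdr3_nt", "raw_clonotype_id")

# Example call; replace the path with your own file:
# missing_columns("path/to/all_contig_annotations.csv", expected_10x)
```

An empty result would mean the header contains all of the listed names, which would point away from the file itself and toward how it is being parsed.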

@MVolobueva
Collaborator

MVolobueva commented Mar 28, 2022

Hi, @mdozmorov!
My name is Maria Volobueva, I am a developer of the Immunarch package.

We have managed to reproduce your issue. Now we are working on fixing it.

I will get back to you with any updates.

Thank you so much for drawing our attention to this.

Good luck,
Maria Volobueva

@MVolobueva
Collaborator

Hello, @mdozmorov

I've figured out what the bug was. We have already fixed it in the dev-branch of Immunarch.

To install this branch you can utilize the following commands:

install.packages(c("devtools", "pkgload"))
devtools::install_github("immunomind/immunarch", ref="dev")
devtools::reload(pkgload::inst("immunarch"))

If you are working in RStudio and the bug appears again, go to Tools -> Project Options, set "Restore .RData into workspace at startup" to No, and then start a new project.

Do not hesitate to contact us with any questions further along.

Good luck,
Maria Volobueva

@mdozmorov
Author

Thanks, Maria, I followed your instructions verbatim, but the problem still persists. I did reinstall immunarch from the dev branch, updated all packages, ensured global and local workspace restoring is disabled. I'm copy-pasting the code, the error is identical.

# 1.1) Load the package into R:
# devtools::install_github("immunomind/immunarch", ref="dev")
library(immunarch)

# 1.2) Replace with the path to your processed 10x data or to the clonotypes file
file_path = "/Users/mdozmorov/Documents/Data/VCU_work/test_immunarch/data"

# 1.3) Load 10x data with repLoad
immdata_10x <- repLoad(file_path)

@Alexander230
Collaborator

Hi, @mdozmorov!

My name is Aleksandr Popov, I am a developer of the Immunarch package.

When I tried to reproduce this bug, I noticed that it appears only when there are remains of an old version of Immunarch, or when there are function name conflicts in the R environment. Please try running R from the terminal with the R --vanilla command (to start it with an empty environment) and run these commands:

install.packages(c("devtools", "pkgload"))
devtools::install_github("immunomind/immunarch", ref="dev")
devtools::reload(pkgload::inst("immunarch"))
file_path = "/Users/mdozmorov/Documents/Data/VCU_work/test_immunarch/data"
immdata_10x <- repLoad(file_path)

I hope this will help to load the data correctly.

Best regards,
Aleksandr
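
To check for the function name conflicts mentioned above, base R can report where a name is defined on the search path. This is a minimal sketch; where_defined is a hypothetical helper name, while utils::find() and base::conflicts() are standard R functions.

```r
# Hypothetical helper: return every attached environment that defines a
# function called `fn_name` (an empty vector if none does).
where_defined <- function(fn_name) {
  utils::find(fn_name, mode = "function")
}

# conflicts() lists names that are defined in more than one attached
# environment; detail = TRUE groups them by environment.
masked <- conflicts(detail = TRUE)

# Example: run after library(immunarch). More than one entry in the result
# would indicate that repLoad is defined in several places.
# where_defined("repLoad")
```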

@mdozmorov
Author

It didn't help. The R --vanilla session still detects the existing installation and skips it:

Skipping install of 'immunarch' from a github remote, the SHA1 (37d06bef) has not changed since last install.
  Use `force = TRUE` to force installation

Manually removing it

rm -r /Users/mdozmorov/Library/R/x86_64/4.1/library/immunarch

and reinstalling still results in the same error.

@MVolobueva
Collaborator

Hello, @mdozmorov!

I suppose the error persists because you are trying to load all the files in your folder into Immunarch, but Immunarch can load only files in a supported format.
Files whose names end with contig_annotations.csv should load correctly.

Please try to replace the file_path variable in your script:

file_path = "/Users/mdozmorov/Documents/Data/VCU_work/test_immunarch/data/vdj_v1_mm_c57bl6_pbmc_t_all_contig_annotations.csv"

Do not hesitate to contact us with any questions further along.

Good luck,
Maria
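
As a small illustration of the suggestion above, the files whose names end with contig_annotations.csv can be picked out of a folder with base R. This is a sketch only; find_contig_files is a hypothetical helper name and not part of immunarch.

```r
# Hypothetical helper (not part of immunarch): list only the files in a
# folder whose names end with "contig_annotations.csv".
find_contig_files <- function(dir_path) {
  list.files(dir_path,
             pattern = "contig_annotations\\.csv$",
             full.names = TRUE)
}

# Example call; replace the path with your own data folder:
# contig_files <- find_contig_files("path/to/data")
```

For the five files listed in the original report, this would keep the all_contig_annotations and filtered_contig_annotations files and skip the clonotypes, consensus_annotations, and metrics_summary files. Each returned path can then be passed to repLoad on its own, as in the reply above.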

@shanshenbing

shanshenbing commented May 5, 2022

Hello, I get the same error when loading my 10x Genomics results. I installed the dev version of the package, restarted RStudio, and still get the same error:
immdata <- repLoad(.path = './BM01/tcr/run_count/outs/all_contig_annotations.csv')

My error looks like this:

== Step 1/3: loading repertoire files... ==

Processing "" ...
-- [1/1] Parsing "/BM01/tcr/run_count/outs/all_contig_annotations.csv" -- 10x (filt.contigs)
[!] Removed 1415 clonotypes with no nucleotide and amino acid CDR3 sequence.
Error in step_subset(parent, vars = vars, groups = groups, arrange = arrange, :
is.null(j) || is_expression(j) is not TRUE
In addition: Warning message:
The following named parsers don't match the column names: barcode,is_cell,contig_id,high_confidence,length,chain,v_gene,d_gene,j_gene,c_gene,full_length,productive,fwr1,fwr1_nt,cdr1,cdr1_nt,fwr2,fwr2_nt,cdr2,cdr2_nt,fwr3,fwr3_nt,cdr3,cdr3_nt,fwr4,fwr4_nt,reads,umis,raw_clonotype_id,raw_consensus_id,exact_subclonotype_id

#version
#you can see I am using the latest version.
packageVersion('immunarch')
[1] ‘0.6.8’

session info
R version 4.0.5 (2021-03-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] immunarch_0.6.8 patchwork_1.1.1 data.table_1.14.0 dtplyr_1.1.0
[5] dplyr_1.0.8 ggplot2_3.3.3

loaded via a namespace (and not attached):
[1] rappdirs_0.3.3 prabclus_2.3-2
[3] R.methodsS3_1.8.1 tidyr_1.1.3
[5] bit64_4.0.5 knitr_1.33
[7] DelayedArray_0.16.3 R.utils_2.10.1
[9] RCurl_1.98-1.5 doParallel_1.0.16
[11] generics_0.1.0 BiocGenerics_0.36.1
[13] callr_3.7.0 usethis_2.0.1
[15] RSQLite_2.2.7 shadowtext_0.0.8
[17] rlist_0.4.6.2 tzdb_0.2.0
[19] bit_4.0.4 enrichplot_1.15.3
[21] xml2_1.3.2 httpuv_1.6.1
[23] SummarizedExperiment_1.20.0 assertthat_0.2.1
[25] viridis_0.6.2 xfun_0.23
[27] hms_1.0.0 celldex_1.0.0
[29] babelgene_21.4 evaluate_0.14
[31] promises_1.2.0.1 DEoptimR_1.0-8
[33] fansi_0.4.2 dbplyr_2.1.1
[35] readxl_1.3.1 igraph_1.2.11
[37] DBI_1.1.1 geneplotter_1.68.0
[39] htmlwidgets_1.5.3 stringdist_0.9.6.3
[41] stats4_4.0.5 purrr_0.3.4
[43] ellipsis_0.3.2 ggpubr_0.4.0
[45] backports_1.2.1 annotate_1.68.0
[47] sparseMatrixStats_1.2.1 MatrixGenerics_1.2.1
[49] ggalluvial_0.12.3 vctrs_0.3.8
[51] Biobase_2.50.0 remotes_2.3.0
[53] Cairo_1.5-12.2 abind_1.4-5
[55] cachem_1.0.5 withr_2.4.2
[57] ggforce_0.3.3 robustbase_0.93-7
[59] vroom_1.5.6 treeio_1.14.4
[61] prettyunits_1.1.1 mclust_5.4.9
[63] cluster_2.1.2 DOSE_3.16.0
[65] ExperimentHub_1.16.1 ape_5.5
[67] lazyeval_0.2.2 crayon_1.4.1
[69] genefilter_1.72.1 pkgconfig_2.0.3
[71] tweenr_1.0.2 GenomeInfoDb_1.26.7
[73] nlme_3.1-152 pkgload_1.2.4
[75] nnet_7.3-16 devtools_2.4.3
[77] diptest_0.76-0 rlang_1.0.1
[79] lifecycle_1.0.1 downloader_0.4
[81] BiocFileCache_1.14.0 AnnotationHub_2.22.1
[83] cellranger_1.1.0 rprojroot_2.0.2
[85] polyclip_1.10-0 matrixStats_0.61.0
[87] flextable_0.6.5 phangorn_2.7.1
[89] ggseqlogo_0.1 Matrix_1.3-3
[91] aplot_0.0.6 carData_3.0-4
[93] base64enc_0.1-3 GlobalOptions_0.1.2
[95] processx_3.5.2 pheatmap_1.0.12
[97] png_0.1-7 viridisLite_0.4.0
[99] rjson_0.2.20 bitops_1.0-7
[101] R.oo_1.24.0 blob_1.2.1
[103] DelayedMatrixStats_1.12.3 shape_1.4.6
[105] stringr_1.4.0 qvalue_2.22.0
[107] readr_2.1.2 rstatix_0.7.0
[109] gridGraphics_0.5-1 ggsignif_0.6.1
[111] S4Vectors_0.28.1 scales_1.1.1
[113] memoise_2.0.0 magrittr_2.0.1
[115] plyr_1.8.6 zlibbioc_1.36.0
[117] compiler_4.0.5 scatterpie_0.1.6
[119] factoextra_1.0.7 RColorBrewer_1.1-2
[121] clue_0.3-60 DESeq2_1.30.1
[123] cli_3.2.0 XVector_0.30.0
[125] ps_1.6.0 MASS_7.3-54
[127] tidyselect_1.1.1 forcats_0.5.1
[129] stringi_1.7.6 yaml_2.2.1
[131] GOSemSim_2.16.1 locfit_1.5-9.4
[133] ggrepel_0.9.1 grid_4.0.5
[135] fastmatch_1.1-0 tools_4.0.5
[137] rio_0.5.26 parallel_4.0.5
[139] rvg_0.2.5 circlize_0.4.13
[141] rstudioapi_0.13 uuid_0.1-4
[143] foreign_0.8-81 foreach_1.5.1
[145] gridExtra_2.3 devEMF_4.0-2
[147] farver_2.1.0 ggraph_2.0.5
[149] digest_0.6.27 rvcheck_0.1.8
[151] BiocManager_1.30.16 shiny_1.6.0
[153] quadprog_1.5-8 fpc_2.2-9
[155] Rcpp_1.0.6 car_3.0-10
[157] GenomicRanges_1.42.0 broom_0.7.6
[159] BiocVersion_3.12.0 R.devices_2.17.0
[161] later_1.2.0 httr_1.4.2
[163] gdtools_0.2.3 AnnotationDbi_1.52.0
[165] ComplexHeatmap_2.6.2 kernlab_0.9-29
[167] colorspace_2.0-1 job_0.3.0
[169] XML_3.99-0.9 fs_1.5.0
[171] IRanges_2.24.1 splines_4.0.5
[173] yulab.utils_0.0.4 tidytree_0.3.4
[175] graphlayouts_0.7.1 shinythemes_1.2.0
[177] flexmix_2.3-17 ggplotify_0.0.7
[179] plotly_4.9.3 sessioninfo_1.1.1
[181] systemfonts_1.0.2 xtable_1.8-4
[183] jsonlite_1.7.2 ggtree_2.4.2
[185] tidygraph_1.2.0 UpSetR_1.4.0
[187] modeltools_0.2-23 testthat_3.0.2
[189] R6_2.5.0 pillar_1.6.1
[191] htmltools_0.5.2 mime_0.10
[193] glue_1.6.0 fastmap_1.1.0
[195] clusterProfiler_3.18.1 BiocParallel_1.24.1
[197] class_7.3-19 interactiveDisplayBase_1.28.0
[199] codetools_0.2-18 fgsea_1.16.0
[201] pkgbuild_1.2.0 utf8_1.2.1
[203] lattice_0.20-44 tibble_3.1.2
[205] curl_4.3.1 officer_0.3.18
[207] magick_2.7.2 openxlsx_4.2.3
[209] zip_2.1.1 GO.db_3.12.1
[211] survival_3.2-11 rmarkdown_2.8
[213] desc_1.3.0 munsell_0.5.0
[215] DO.db_2.9 GetoptLong_1.0.5
[217] GenomeInfoDbData_1.2.4 iterators_1.0.13
[219] haven_2.4.1 reshape2_1.4.4
[221] gtable_0.3.0 msigdbr_7.4.1
[223] eoffice_0.2.1

I then tested several versions of the package, and only versions before 0.6.5 load 10x data correctly. That is, 0.6.4 can load 10x Genomics data, but 0.6.5 and 0.6.7 cannot.

Hope you can give me some suggestions. Thank you!

@MVolobueva
Collaborator

Hi, @shanshenbing!

Thank you for contacting us. I suppose the error persists because of a package version conflict in RStudio. To test this, run the following on the command line:

R --vanilla

Then install the proper version of immunarch again and repeat your command (also on the command line). If everything works, just update your RStudio projects; otherwise, let us know.

Do not hesitate to contact us with any questions further along.

Good luck,
Maria Samokhina
