flowjo_to_gatingset does not accept data.frame as path #151

Closed
nranthony opened this issue Apr 29, 2023 · 5 comments
Comments

@nranthony

Passing a data.frame to the path argument of flowjo_to_gatingset, as detailed in the docs (pasted below), returns the error: Error in path.expand(path) : invalid 'path' argument. This appears to be thrown on line 77 (as seen in the debugger; I am not sure of the line in the actual source) while creating the args list: path = suppressWarnings(normalizePath(path)). That call expects a single path or a character vector of paths, not a data.frame.

Description of path argument in docs:
either a character scalar or data.frame. When character, it is a path to the fcs files that are to be imported. The code will search recursively, so you can point it to a location above the files. When it is a data.frame, it is expected to contain two columns:'sampleID' and 'file', which is used as the mapping between 'sampleID' and FCS file (absolute) path. When such mapping is provided, the file system searching is avoided.
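
For reference, the error can be reproduced with base R alone, without CytoML. This is a minimal sketch: the mapping table and its file paths are hypothetical, built only to match the two-column shape described above.

```r
# Hypothetical mapping table in the shape the old docs described.
mapping <- data.frame(
  sampleID = c(1, 2),
  file = c("/data/run1/a.fcs", "/data/run2/a.fcs"),
  stringsAsFactors = FALSE
)

# normalizePath() only accepts character vectors, so a data.frame fails
# inside path.expand(), the call named in the reported error.
err <- tryCatch(
  suppressWarnings(normalizePath(mapping)),
  error = function(e) conditionMessage(e)
)
print(err)

# The same call is fine on the character column holding the paths.
normalized <- suppressWarnings(normalizePath(mapping$file, mustWork = FALSE))
print(normalized)
```

This is consistent with the report: the data.frame reaches normalizePath() unchanged and is rejected there.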

SessionInfo:
R version 4.3.0 (2023-04-21 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] tools stats graphics grDevices datasets utils methods base

other attached packages:
[1] CytoML_2.12.0 R6_2.5.1 tictoc_1.2 SamSPECTRAL_1.54.0
[5] ggpubr_0.6.0 gtools_3.9.4 PeacoQC_1.10.0 matrixStats_0.63.0
[9] flowSpecs_1.14.0 flowWorkspace_4.12.0 xml2_1.3.3 ggridges_0.5.4
[13] reshape2_1.4.4 lubridate_1.9.2 forcats_1.0.0 stringr_1.5.0
[17] dplyr_1.1.2 purrr_1.0.1 readr_2.1.4 tidyr_1.3.0
[21] tibble_3.2.1 ggplot2_3.4.2 tidyverse_2.0.0 flowCore_2.12.0

loaded via a namespace (and not attached):
[1] tidyselect_1.2.0 XML_3.99-0.14 digest_0.6.31 timechange_0.2.0
[5] lifecycle_1.0.3 cluster_2.1.4 magrittr_2.0.3 compiler_4.3.0
[9] rlang_1.1.0 utf8_1.2.3 yaml_2.3.7 data.table_1.14.8
[13] ggsignif_0.6.4 plyr_1.8.8 RColorBrewer_1.1-3 abind_1.4-5
[17] BiocParallel_1.34.0 withr_2.5.0 RProtoBufLib_2.12.0 BiocGenerics_0.46.0
[21] grid_4.3.0 stats4_4.3.0 fansi_1.0.4 colorspace_2.1-0
[25] scales_1.2.1 iterators_1.0.14 cli_3.6.1 crayon_1.5.2
[29] ncdfFlow_2.46.0 generics_0.1.3 rstudioapi_0.14 tzdb_0.3.0
[33] rjson_0.2.21 zlibbioc_1.46.0 parallel_4.3.0 BiocManager_1.30.20
[37] vctrs_0.6.2 jsonlite_1.8.4 carData_3.0-5 cytolib_2.12.0
[41] car_3.1-2 IRanges_2.34.0 hms_1.1.3 GetoptLong_1.0.5
[45] S4Vectors_0.38.0 RBGL_1.76.0 rstatix_0.7.2 clue_0.3-64
[49] Rgraphviz_2.44.0 foreach_1.5.2 hexbin_1.28.3 glue_1.6.2
[53] codetools_0.2-19 stringi_1.7.12 gtable_0.3.3 shape_1.4.6
[57] ggcyto_1.28.0 ComplexHeatmap_2.16.0 munsell_0.5.0 pillar_1.9.0
[61] graph_1.78.0 circlize_0.4.15 doParallel_1.0.17 Biobase_2.60.0
[65] lattice_0.21-8 png_0.1-8 backports_1.4.1 broom_1.0.4
[69] renv_0.17.3 Rcpp_1.0.10 gridExtra_2.3 zoo_1.8-12
[73] pkgconfig_2.0.3 GlobalOptions_0.1.2

mikejiang pushed a commit that referenced this issue May 1, 2023
@mikejiang
Member

Right, the feature of passing path as a data.frame was deprecated when we refactored the entire parsing code into C++. It wasn't widely used, and we didn't think it was worthwhile to port it.
I have updated the documentation to reflect the current state of this parameter.

@nranthony
Author

Wonderful, thanks for the info and the update.

I have a situation where I need to explicitly define the FCS files, as the same filename appears in two different folders in the folder structure from the wsp.
Without this functionality, what is the best way to open the wsp? Can I iterate over the FCS files one by one, or would you suggest something else?

@Close-your-eyes

For my purposes, I iterate over FCS file paths and import them separately as gating sets with CytoML::flowjo_to_gatingset.

When your files (or now gatingsets) are consistent and you want to continue with the gatingset format, you may use flowWorkspace::merge_list_to_gs() to merge a list of gatingsets into one gatingset object.

Is it only the filenames that are duplicated, or also the metadata of the respective FCS files? If only the filenames are duplicated and the metadata are unique to each file, you may use the "subset" argument of CytoML::flowjo_to_gatingset to explicitly direct the function to the desired FCS file. The "path" argument may then be the root folder of your FCS files.

There may be a few more details to consider and it may still be a bit tricky though ...
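
The suggestions above could be sketched roughly as follows. This is an untested outline under stated assumptions, not the maintainers' recommended method: find_duplicate_fcs and import_per_folder are hypothetical helper names, and the group index 1 is a placeholder. The thread does not say how samples missing from a given folder, or clashing sample names on merge, are handled, so that should be checked against real data.

```r
# Base R helper (hypothetical name): list FCS files whose basename occurs
# more than once under a root folder, grouped by that basename.
find_duplicate_fcs <- function(root) {
  files <- list.files(root, pattern = "\\.fcs$", recursive = TRUE,
                      full.names = TRUE, ignore.case = TRUE)
  names_only <- basename(files)
  is_dup <- names_only %in% names_only[duplicated(names_only)]
  split(files[is_dup], names_only[is_dup])
}

# Hypothetical wrapper: parse the workspace once per FCS folder so that each
# import only sees one copy of a duplicated filename, then merge the results.
# Requires CytoML and flowWorkspace; they are only needed when this is called.
import_per_folder <- function(wsp_file, fcs_dirs, group = 1) {
  ws <- CytoML::open_flowjo_xml(wsp_file)
  gs_list <- lapply(fcs_dirs, function(dir) {
    CytoML::flowjo_to_gatingset(ws, name = group, path = dir)
  })
  flowWorkspace::merge_list_to_gs(gs_list)
}
```

Running find_duplicate_fcs on the FCS root first shows which folders actually need to be imported separately.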

@nranthony
Author

Thanks for the input, much appreciated. I'll run with your suggestions and figure it out.
As such, no longer an issue. Closing.

@mikejiang
Member

Type ?flowjo_to_gatingset and look for the section on the additional.keys parameter, which can be used to address the FCS file matching problem.
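
A minimal sketch of what that might look like, assuming the duplicated files differ in some keyword. The wrapper name import_with_keys, the group index 1, and the choice of "$DATE" as the extra keyword are all assumptions made for illustration; which keyword actually separates the duplicates depends on the data.

```r
# Hypothetical wrapper around flowjo_to_gatingset. additional.keys lists the
# keywords combined with the FCS filename to identify a sample; "$TOT" is the
# documented default, "$DATE" is only an example of an extra keyword.
# Requires CytoML; it is only needed when this function is called.
import_with_keys <- function(wsp_file, fcs_root,
                             keys = c("$TOT", "$DATE"), group = 1) {
  ws <- CytoML::open_flowjo_xml(wsp_file)
  CytoML::flowjo_to_gatingset(
    ws,
    name = group,
    path = fcs_root,
    additional.keys = keys
  )
}
```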
