reading in many (old) files after copying them over to my linux drive fails #110
Hmm when I read the file that's giving errors separately, I also get a bunch of errors:
here's the file: |
The above errors have made me look into the raw files themselves. Currently trying to do an |
Yep, seems like this was an issue with half-copied files, since it does work now. I still get this Column `path` has different attributes on LHS and RHS of join warning though |
Thanks for testing so carefully. Could you send me a small example file and
code to reproduce the warning? Some of it might be recent changes in dplyr
(version 1.0 coming up fast which will likely break a few things).
On Thu, Apr 2, 2020 at 6:49 AM Ilja Kocken wrote:
Yep, seems like this was an issue with half-copied files, since it does
work now.
I still get this Column `path` has different attributes on LHS and RHS of
join warning though
|
unzip that file to wherever, then |
I get the following output without any other warnings. Could you share your
|
log of running it on one file, quietly, with caching
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(isoreader)
#>
#> Attaching package: 'isoreader'
#> The following object is masked from 'package:stats':
#>
#> filter
setwd("~/Downloads")
cafs <- iso_read_dual_inlet("170126_170124_Sibren_run29-1426.caf",
cache = TRUE,
quiet = TRUE,
discard_duplicates = FALSE,
parallel = TRUE)
#> Warning: Column `path` has different attributes on LHS and RHS of join
iso_get_problems(cafs)
#> # A tibble: 1 x 4
#> file_id type func details
#> <chr> <chr> <chr> <chr>
#> 1 170126_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value` …
sessionInfo()
#> R version 3.6.3 (2020-02-29)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Arch Linux
#>
#> Matrix products: default
#> BLAS: /usr/lib/libopenblasp-r0.3.9.so
#> LAPACK: /usr/lib/liblapack.so.3.9.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] isoreader_1.1.4 dplyr_0.8.5
#>
#> loaded via a namespace (and not attached):
#> [1] zip_2.0.4 Rcpp_1.0.4 pillar_1.4.3 compiler_3.6.3
#> [5] highr_0.8 prettyunits_1.1.1 progress_1.2.2 R.methodsS3_1.8.0
#> [9] R.utils_2.9.2 base64enc_0.1-3 tools_3.6.3 digest_0.6.25
#> [13] rhdf5_2.30.1 lubridate_1.7.4 evaluate_0.14 lifecycle_0.2.0
#> [17] tibble_3.0.0 pkgconfig_2.0.3 rlang_0.4.5 openxlsx_4.1.4
#> [21] cli_2.0.2 yaml_2.2.1 parallel_3.6.3 xfun_0.12
#> [25] xml2_1.2.5 stringr_1.4.0 knitr_1.28 vctrs_0.2.4
#> [29] globals_0.12.5 hms_0.5.3 tidyselect_1.0.0 glue_1.3.2
#> [33] listenv_0.8.0 R6_2.4.1 fansi_0.4.1 rmarkdown_2.1
#> [37] tidyr_1.0.2 Rhdf5lib_1.8.0 readr_1.3.1 purrr_0.3.3
#> [41] magrittr_1.5 feather_0.3.5 codetools_0.2-16 ellipsis_0.3.0
#> [45] htmltools_0.4.0 assertthat_0.2.1 future_1.16.0 UNF_2.0.6
#> [49] utf8_1.1.4 stringi_1.4.6 crayon_1.3.4 R.oo_1.23.0
Created on 2020-04-03 by the reprex package (v0.3.0) |
Looks like there are some issues with the old caf files now, which is very unfortunate. They are suddenly ALL marked as problematic files. When I run
|
Ok I think I'm being stupid. I had this issue for my newest files first, then it was fixed after I rsync'd without the |
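One way to rule out half-copied files before re-reading them is to compare checksums between the original folder and the copy. The sketch below is only an illustration in base R: `compare_copies()`, `src_dir` and `dst_dir` are hypothetical names that do not appear in this thread, and it assumes both folders are reachable from the same R session.

```r
# Compare MD5 checksums of files that share a relative path in two folders.
# `src_dir` and `dst_dir` are hypothetical paths, not ones from this thread.
compare_copies <- function(src_dir, dst_dir) {
  # Relative paths of all files below the source folder
  rel <- list.files(src_dir, recursive = TRUE)
  # tools::md5sum() returns NA for files that are missing or unreadable
  src_md5 <- unname(tools::md5sum(file.path(src_dir, rel)))
  dst_md5 <- unname(tools::md5sum(file.path(dst_dir, rel)))
  # A copy counts as ok only if it exists and its checksum matches the source
  ok <- !is.na(dst_md5) & src_md5 == dst_md5
  data.frame(file = rel, ok = ok, stringsAsFactors = FALSE)
}

# Example call (paths are placeholders):
# res <- compare_copies("/path/to/original", "/path/to/copy")
# res$file[!res$ok]  # files that are missing or differ in the copy
```

Any file listed as not ok would be worth copying again before attempting to read it.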
Sounds good. I do think there might be some dplyr issues with 0.8.5 (and the upcoming 1.0) that we need to address. The newest dplyr implements |
Aww unfortunately that was not the problem. All my old caf files don't work anymore, even after double-checking that they were copied over correctly. So log of reading in the combined big cafs file with all 4928 caf files, resulting in 6300 errors
library(isoreader)
#>
#> Attaching package: 'isoreader'
#> The following object is masked from 'package:stats':
#>
#> filter
setwd("~/SurfDrive/PhD/programming/dataprocessing")
cafs <- iso_read_dual_inlet("out/cafs.di.rds")
#> Info: preparing to read 1 data files (all will be cached)...
#> Info: reading file 'out/cafs.di.rds' with '.di.rds' reader
#> Info: loaded data for 4928 data files from R Data Storage - checking loaded...
#> Info: finished reading 1 files in 13.08 secs
#> Warning: Column `path` has different attributes on LHS and RHS of join
#> Info: encountered 6300 problems in total
#> # A tibble: 6,300 x 4
#> file_id type func details
#> <chr> <chr> <chr> <chr>
#> 1 170126_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value`…
#> 2 170126_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value`…
#> 3 170126_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value`…
#> 4 170126_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value`…
#> 5 170126_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value`…
#> 6 170126_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value`…
#> 7 170127_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value`…
#> 8 170127_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value`…
#> 9 170127_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value`…
#> 10 170127_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value`…
#> # … with 6,290 more rows
iso_get_file_info(cafs)
#> Info: aggregating file info from 4928 data file(s)
#> Error: No common type for `170126_170124_Sibren_run29-1426.caf$file_datetime` <datetime<Europe/Amsterdam>> and `170621_170522_Guido_Magda_ETH-1-0000.caf$file_datetime` <integer>.
iso_get_raw_data(cafs)
#> Info: aggregating raw data from 4928 data file(s)
#> # A tibble: 74,674 x 9
#> file_id type cycle v44.mV v45.mV v46.mV v47.mV v48.mV v49.mV
#> <chr> <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 170126_170124_Sibren_… stand… 0 13091. 15594. 18622. 2078. 216. -0.528
#> 2 170126_170124_Sibren_… stand… 1 12177. 14506. 17323. 1933. 201. -0.503
#> 3 170126_170124_Sibren_… stand… 2 11329. 13497. 16119. 1799. 187. -0.456
#> 4 170126_170124_Sibren_… stand… 3 10556. 12576. 15019. 1677. 174. -0.431
#> 5 170126_170124_Sibren_… stand… 4 9845. 11729. 14008. 1564. 163. -0.397
#> 6 170126_170124_Sibren_… stand… 5 9192. 10952. 13080. 1461. 152. -0.363
#> 7 170126_170124_Sibren_… stand… 6 8591. 10236. 12224. 1366. 142. -0.339
#> 8 170126_170124_Sibren_… stand… 7 8034. 9572. 11431. 1278. 133. -0.308
#> 9 170126_170124_Sibren_… stand… 8 7521. 8961. 10702. 1197. 124. -0.282
#> 10 170126_170124_Sibren_… sample 1 12661. 14953. 17854. 1974. 206. -0.509
#> # … with 74,664 more rows
Created on 2020-04-03 by the reprex package (v0.3.0) |
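With thousands of problem rows, as in the log above, a tally per reporting function and a count of affected files can be easier to scan than the printed tibble. The following is a rough base R sketch on a toy data frame that only mimics the `file_id`, `type`, `func` and `details` columns shown in the output; the values in it are invented for illustration.

```r
# Toy stand-in for the table returned by iso_get_problems(); values are invented.
problems <- data.frame(
  file_id = c("a.caf", "a.caf", "b.caf", "c.caf"),
  type = "error",
  func = c("extract_x", "extract_y", "extract_x", "extract_x"),
  details = c("msg 1", "msg 2", "msg 1", "msg 1"),
  stringsAsFactors = FALSE
)

# Number of problem rows per reporting function, most frequent first
per_func <- sort(table(problems$func), decreasing = TRUE)

# Number of distinct files with at least one problem
n_files <- length(unique(problems$file_id))

print(per_func)
print(n_files)
```

If one function accounts for nearly all rows, that points to a single shared cause rather than many unrelated broken files.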
Also, just using |
I cannot reproduce your error even with your versions of dplyr and vctrs. Could this be an issue with the cached files? Can you run an example with |
Hmm that's very weird. I've just updated my system and new log running it without caching for one file
library(isoreader)
#>
#> Attaching package: 'isoreader'
#> The following object is masked from 'package:stats':
#>
#> filter
cafs <- iso_read_dual_inlet("~/Downloads/170126_170124_Sibren_run29-1426.caf",
cache = FALSE,
read_cache = FALSE,
quiet = FALSE,
discard_duplicates = FALSE,
parallel = FALSE)
#> Info: preparing to read 1 data files...
#> Info: reading file '170126_170124_Sibren_run29-1426.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: finished reading 1 files in 4.06 secs
#> Warning: Column `path` has different attributes on LHS and RHS of join
#> Info: encountered 1 problems in total
#> # A tibble: 1 x 4
#> file_id type func details
#> <chr> <chr> <chr> <chr>
#> 1 170126_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value` …
iso_get_problems(cafs)
#> # A tibble: 1 x 4
#> file_id type func details
#> <chr> <chr> <chr> <chr>
#> 1 170126_170124_Sib… error extract_isodat_ol… "Assigned data `file_info$value` …
iso_get_file_info(cafs)
#> Info: aggregating file info from 1 data file(s)
#> # A tibble: 1 x 7
#> file_id file_root file_path file_subpath file_datetime file_size
#> <chr> <chr> <chr> <chr> <dttm> <int>
#> 1 170126… /home/ja… 170126_1… <NA> 2017-01-26 20:29:47 651810
#> # … with 1 more variable: MS_integration_time.s <dbl>
sessionInfo()
#> R version 3.6.3 (2020-02-29)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Arch Linux
#>
#> Matrix products: default
#> BLAS: /usr/lib/libopenblasp-r0.3.9.so
#> LAPACK: /usr/lib/liblapack.so.3.9.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] isoreader_1.1.4
#>
#> loaded via a namespace (and not attached):
#> [1] zip_2.0.4 Rcpp_1.0.4 pillar_1.4.3 compiler_3.6.3
#> [5] highr_0.8 prettyunits_1.1.1 progress_1.2.2 R.methodsS3_1.8.0
#> [9] R.utils_2.9.2 base64enc_0.1-3 tools_3.6.3 digest_0.6.25
#> [13] rhdf5_2.30.1 lubridate_1.7.8 evaluate_0.14 lifecycle_0.2.0
#> [17] tibble_3.0.0 pkgconfig_2.0.3 rlang_0.4.5 openxlsx_4.1.4
#> [21] cli_2.0.2 yaml_2.2.1 parallel_3.6.3 xfun_0.12
#> [25] xml2_1.3.0 dplyr_0.8.5 stringr_1.4.0 knitr_1.28
#> [29] generics_0.0.2 vctrs_0.2.4 globals_0.12.5 hms_0.5.3
#> [33] tidyselect_1.0.0 glue_1.4.0 listenv_0.8.0 R6_2.4.1
#> [37] fansi_0.4.1 rmarkdown_2.1 tidyr_1.0.2 Rhdf5lib_1.8.0
#> [41] readr_1.3.1 purrr_0.3.3 magrittr_1.5 feather_0.3.5
#> [45] codetools_0.2-16 ellipsis_0.3.0 htmltools_0.4.0 assertthat_0.2.1
#> [49] future_1.16.0 UNF_2.0.6 utf8_1.1.4 stringi_1.4.6
#> [53] crayon_1.3.4 R.oo_1.23.0
Created on 2020-04-07 by the reprex package (v0.3.0) |
Another long log running it for 22 caf files, resulting in 24 errors
library(isoreader)
#>
#> Attaching package: 'isoreader'
#> The following object is masked from 'package:stats':
#>
#> filter
dir.create("/tmp/rtmp")
setwd("/tmp/rtmp")
cafs <- iso_read_dual_inlet("~/Documents/archive/pacman/cafs/180522_Stds/",
cache = FALSE,
read_cache = FALSE,
quiet = FALSE,
discard_duplicates = FALSE,
parallel = FALSE)
#> Info: preparing to read 22 data files...
#> Info: reading file '180522_Std_ETH-1_1.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180522_Std_ETH-1_2.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180522_Std_ETH-1_7.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180522_Std_ETH-1_8.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180522_Std_ETH-2_10.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180522_Std_ETH-2_3.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180522_Std_ETH-2_4.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180522_Std_ETH-2_9.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180522_Std_ETH-3_11.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180522_Std_ETH-3_12.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180522_Std_ETH-3_5.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180522_Std_ETH-3_6.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180523_Std_ETH-1_13.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180523_Std_ETH-1_14.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180523_Std_ETH-1_19.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180523_Std_ETH-1_20.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180523_Std_ETH-2_15.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180523_Std_ETH-2_16.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180523_Std_ETH-2_21.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180523_Std_ETH-2_22.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Warning: caught error - cannot identify measured masses - block 'CResultDat...
#> Warning: caught error - cannot process vendor data table - block 'CResultDa...
#> Info: reading file '180523_Std_ETH-3_17.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: reading file '180523_Std_ETH-3_18.caf' with '.caf' reader
#> Warning: caught error - Assigned data `file_info$value` must be compatible ...
#> Info: finished reading 22 files in 1.03 mins
#> Warning: Column `path` has different attributes on LHS and RHS of join
#> Info: encountered 24 problems in total
#> # A tibble: 24 x 4
#> file_id type func details
#> <chr> <chr> <chr> <chr>
#> 1 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 2 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 3 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 4 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 5 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 6 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 7 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 8 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 9 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 10 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> # … with 14 more rows
iso_get_problems(cafs)
#> # A tibble: 24 x 4
#> file_id type func details
#> <chr> <chr> <chr> <chr>
#> 1 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 2 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 3 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 4 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 5 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 6 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 7 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 8 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 9 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> 10 180522_Std_ET… error extract_isodat_old… "Assigned data `file_info$value` mu…
#> # … with 14 more rows
iso_get_file_info(cafs)
#> Info: aggregating file info from 22 data file(s)
#> # A tibble: 22 x 7
#> file_id file_root file_path file_subpath file_datetime file_size
#> <chr> <chr> <chr> <chr> <dttm> <int>
#> 1 180522… /home/ja… 180522_S… <NA> 2018-05-22 17:24:52 651970
#> 2 180522… /home/ja… 180522_S… <NA> 2018-05-22 18:03:27 668650
#> 3 180522… /home/ja… 180522_S… <NA> 2018-05-22 21:14:55 668682
#> 4 180522… /home/ja… 180522_S… <NA> 2018-05-22 21:54:16 668678
#> 5 180522… /home/ja… 180522_S… <NA> 2018-05-22 23:11:42 669030
#> 6 180522… /home/ja… 180522_S… <NA> 2018-05-22 18:42:25 668992
#> 7 180522… /home/ja… 180522_S… <NA> 2018-05-22 19:21:29 669014
#> 8 180522… /home/ja… 180522_S… <NA> 2018-05-22 22:33:02 668970
#> 9 180522… /home/ja… 180522_S… <NA> 2018-05-22 23:47:26 652032
#> 10 180522… /home/ja… 180522_S… <NA> 2018-05-23 00:26:31 668710
#> # … with 12 more rows, and 1 more variable: MS_integration_time.s <dbl>
sessionInfo()
#> R version 3.6.3 (2020-02-29)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Arch Linux
#>
#> Matrix products: default
#> BLAS: /usr/lib/libopenblasp-r0.3.9.so
#> LAPACK: /usr/lib/liblapack.so.3.9.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] isoreader_1.1.4
#>
#> loaded via a namespace (and not attached):
#> [1] zip_2.0.4 Rcpp_1.0.4 pillar_1.4.3 compiler_3.6.3
#> [5] highr_0.8 prettyunits_1.1.1 progress_1.2.2 R.methodsS3_1.8.0
#> [9] R.utils_2.9.2 base64enc_0.1-3 tools_3.6.3 digest_0.6.25
#> [13] rhdf5_2.30.1 lubridate_1.7.8 evaluate_0.14 lifecycle_0.2.0
#> [17] tibble_3.0.0 pkgconfig_2.0.3 rlang_0.4.5 openxlsx_4.1.4
#> [21] cli_2.0.2 yaml_2.2.1 parallel_3.6.3 xfun_0.12
#> [25] xml2_1.3.0 dplyr_0.8.5 stringr_1.4.0 knitr_1.28
#> [29] generics_0.0.2 vctrs_0.2.4 globals_0.12.5 hms_0.5.3
#> [33] tidyselect_1.0.0 glue_1.4.0 listenv_0.8.0 R6_2.4.1
#> [37] fansi_0.4.1 rmarkdown_2.1 tidyr_1.0.2 Rhdf5lib_1.8.0
#> [41] readr_1.3.1 purrr_0.3.3 magrittr_1.5 feather_0.3.5
#> [45] codetools_0.2-16 ellipsis_0.3.0 htmltools_0.4.0 assertthat_0.2.1
#> [49] future_1.16.0 UNF_2.0.6 utf8_1.1.4 stringi_1.4.6
#> [53] crayon_1.3.4 R.oo_1.23.0
Created on 2020-04-07 by the reprex package (v0.3.0)
also: I edited all above posts to use the |
Maybe it's because you ran it on the |
found it, it's tibble 3.0!! |
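Since the breakage turned out to track the installed tibble version, a quick check of installed package versions can help when two setups behave differently. This is a minimal base R sketch; `is_at_least()` is a hypothetical helper and not part of isoreader or tibble.

```r
# Return TRUE/FALSE depending on whether the installed version of `pkg`
# is at least `version`, or NA if the package is not installed.
# `is_at_least` is a hypothetical helper name used for illustration only.
is_at_least <- function(pkg, version) {
  # system.file() returns "" when the package cannot be found
  if (!nzchar(system.file(package = pkg))) {
    return(NA)
  }
  utils::packageVersion(pkg) >= package_version(version)
}

# Example: is the installed tibble at or above the 3.0.0 release?
is_at_least("tibble", "3.0.0")
```

Comparing this result between a machine that shows the errors and one that does not is a cheap first step before digging into the reader code.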
Hi @japhir , can you try if |
That's great! Thanks for implementing a fix so soon. I've updated to the re-import of one folder with caf files
library(isoreader)
#>
#> Attaching package: 'isoreader'
#> The following object is masked from 'package:stats':
#>
#> filter
dir.create("/tmp/rtmp")
setwd("/tmp/rtmp")
cafs <- iso_read_dual_inlet("~/Documents/archive/pacman/cafs/180522_Stds/",
cache = FALSE,
read_cache = FALSE,
quiet = FALSE,
discard_duplicates = FALSE,
parallel = FALSE)
#> Info: preparing to read 22 data files...
#> Info: reading file '180522_Std_ETH-1_1.caf' with '.caf' reader
#> Info: reading file '180522_Std_ETH-1_2.caf' with '.caf' reader
#> Info: reading file '180522_Std_ETH-1_7.caf' with '.caf' reader
#> Info: reading file '180522_Std_ETH-1_8.caf' with '.caf' reader
#> Info: reading file '180522_Std_ETH-2_10.caf' with '.caf' reader
#> Info: reading file '180522_Std_ETH-2_3.caf' with '.caf' reader
#> Info: reading file '180522_Std_ETH-2_4.caf' with '.caf' reader
#> Info: reading file '180522_Std_ETH-2_9.caf' with '.caf' reader
#> Info: reading file '180522_Std_ETH-3_11.caf' with '.caf' reader
#> Info: reading file '180522_Std_ETH-3_12.caf' with '.caf' reader
#> Info: reading file '180522_Std_ETH-3_5.caf' with '.caf' reader
#> Info: reading file '180522_Std_ETH-3_6.caf' with '.caf' reader
#> Info: reading file '180523_Std_ETH-1_13.caf' with '.caf' reader
#> Info: reading file '180523_Std_ETH-1_14.caf' with '.caf' reader
#> Info: reading file '180523_Std_ETH-1_19.caf' with '.caf' reader
#> Info: reading file '180523_Std_ETH-1_20.caf' with '.caf' reader
#> Info: reading file '180523_Std_ETH-2_15.caf' with '.caf' reader
#> Info: reading file '180523_Std_ETH-2_16.caf' with '.caf' reader
#> Info: reading file '180523_Std_ETH-2_21.caf' with '.caf' reader
#> Info: reading file '180523_Std_ETH-2_22.caf' with '.caf' reader
#> Warning: caught error - cannot identify measured masses - block 'CResultDat...
#> Warning: caught error - cannot process vendor data table - block 'CResultDa...
#> Info: reading file '180523_Std_ETH-3_17.caf' with '.caf' reader
#> Info: reading file '180523_Std_ETH-3_18.caf' with '.caf' reader
#> Info: finished reading 22 files in 57.35 secs
#> Info: encountered 2 problems in total
#> # A tibble: 2 x 4
#> file_id type func details
#> <chr> <chr> <chr> <chr>
#> 1 180523_Std_ETH… error extract_caf_raw… cannot identify measured masses - bloc…
#> 2 180523_Std_ETH… error extract_caf_ven… cannot process vendor data table - blo…
iso_get_problems(cafs)
#> # A tibble: 2 x 4
#> file_id type func details
#> <chr> <chr> <chr> <chr>
#> 1 180523_Std_ETH… error extract_caf_raw… cannot identify measured masses - bloc…
#> 2 180523_Std_ETH… error extract_caf_ven… cannot process vendor data table - blo…
iso_get_file_info(cafs)
#> Info: aggregating file info from 22 data file(s)
#> # A tibble: 22 x 22
#> file_id file_root file_path file_subpath file_datetime file_size Line
#> <chr> <chr> <chr> <chr> <dttm> <int> <chr>
#> 1 180522… /home/ja… 180522_S… <NA> 2018-05-22 17:24:52 651970 1
#> 2 180522… /home/ja… 180522_S… <NA> 2018-05-22 18:03:27 668650 2
#> 3 180522… /home/ja… 180522_S… <NA> 2018-05-22 21:14:55 668682 1
#> 4 180522… /home/ja… 180522_S… <NA> 2018-05-22 21:54:16 668678 2
#> 5 180522… /home/ja… 180522_S… <NA> 2018-05-22 23:11:42 669030 2
#> 6 180522… /home/ja… 180522_S… <NA> 2018-05-22 18:42:25 668992 1
#> 7 180522… /home/ja… 180522_S… <NA> 2018-05-22 19:21:29 669014 2
#> 8 180522… /home/ja… 180522_S… <NA> 2018-05-22 22:33:02 668970 1
#> 9 180522… /home/ja… 180522_S… <NA> 2018-05-22 23:47:26 652032 1
#> 10 180522… /home/ja… 180522_S… <NA> 2018-05-23 00:26:31 668710 2
#> # … with 12 more rows, and 15 more variables: `Peak Center` <chr>,
#> # Pressadjust <chr>, Background <chr>, `Reference Refill` <chr>, `Weight
#> # [mg]` <chr>, Sample <chr>, `Identifier 1` <chr>, `Identifier 2` <chr>,
#> # Analysis <chr>, Comment <chr>, Preparation <chr>, `Pre Script` <chr>, `Post
#> # Script` <chr>, Method <chr>, MS_integration_time.s <dbl>
sessionInfo()
#> R version 3.6.3 (2020-02-29)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Arch Linux
#>
#> Matrix products: default
#> BLAS: /usr/lib/libopenblasp-r0.3.9.so
#> LAPACK: /usr/lib/liblapack.so.3.9.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] isoreader_1.1.5
#>
#> loaded via a namespace (and not attached):
#> [1] zip_2.0.4 Rcpp_1.0.4 pillar_1.4.3 compiler_3.6.3
#> [5] highr_0.8 prettyunits_1.1.1 progress_1.2.2 R.methodsS3_1.8.0
#> [9] R.utils_2.9.2 base64enc_0.1-3 tools_3.6.3 digest_0.6.25
#> [13] rhdf5_2.30.1 lubridate_1.7.8 evaluate_0.14 lifecycle_0.2.0
#> [17] tibble_3.0.0 pkgconfig_2.0.3 rlang_0.4.5 openxlsx_4.1.4
#> [21] cli_2.0.2 yaml_2.2.1 parallel_3.6.3 xfun_0.12
#> [25] xml2_1.3.0 dplyr_0.8.5 stringr_1.4.0 knitr_1.28
#> [29] generics_0.0.2 vctrs_0.2.4 globals_0.12.5 hms_0.5.3
#> [33] tidyselect_1.0.0 glue_1.4.0 listenv_0.8.0 R6_2.4.1
#> [37] fansi_0.4.1 rmarkdown_2.1 tidyr_1.0.2 Rhdf5lib_1.8.0
#> [41] readr_1.3.1 purrr_0.3.3 magrittr_1.5 feather_0.3.5
#> [45] codetools_0.2-16 ellipsis_0.3.0 htmltools_0.4.0 assertthat_0.2.1
#> [49] future_1.16.0 UNF_2.0.6 utf8_1.1.4 stringi_1.4.6
#> [53] crayon_1.3.4 R.oo_1.23.0
Created on 2020-04-08 by the reprex package (v0.3.0) |
It's just finished re-reading the 4928 caf files! It has now found 1376 problems, a lot of which come from duplicate files. I got the warning below when I saved the aggregate file to rds with
Reading in the newly created summary rds file is still slow (20.43 secs! vs
logs of importing the summarized caf di file
library(isoreader)
#>
#> Attaching package: 'isoreader'
#> The following object is masked from 'package:stats':
#>
#> filter
setwd("~/SurfDrive/PhD/programming/dataprocessing")
cafs <- iso_read_dual_inlet("out/cafs.di.rds")
#> Info: preparing to read 1 data files (all will be cached)...
#> Info: reading file 'out/cafs.di.rds' with '.di.rds' reader
#> Info: loaded data for 4928 data files from R Data Storage - checking loaded...
#> Info: finished reading 1 files in 19.39 secs
#> Info: encountered 1376 problems in total
#> # A tibble: 1,376 x 4
#> file_id type func details
#> <chr> <chr> <chr> <chr>
#> 1 170127_170124_Sibren_… error extract_caf_r… cannot identify measured masses …
#> 2 170127_170124_Sibren_… error extract_caf_v… cannot process vendor data table…
#> 3 170127_170124_Sibren_… error extract_caf_r… cannot identify measured masses …
#> 4 170127_170124_Sibren_… error extract_caf_v… cannot process vendor data table…
#> 5 170127_170124_Sibren_… error extract_caf_r… cannot identify measured masses …
#> 6 170127_170124_Sibren_… error extract_caf_v… cannot process vendor data table…
#> 7 170127_170124_Sibren_… error extract_caf_r… cannot identify measured masses …
#> 8 170127_170124_Sibren_… error extract_caf_v… cannot process vendor data table…
#> 9 170127_170127_170124_… error extract_caf_r… cannot identify measured masses …
#> 10 170127_170127_170124_… error extract_caf_v… cannot process vendor data table…
#> # … with 1,366 more rows
iso_get_problems(cafs)
#> # A tibble: 1,376 x 4
#> file_id type func details
#> <chr> <chr> <chr> <chr>
#> 1 170127_170124_Sibren_… error extract_caf_r… cannot identify measured masses …
#> 2 170127_170124_Sibren_… error extract_caf_v… cannot process vendor data table…
#> 3 170127_170124_Sibren_… error extract_caf_r… cannot identify measured masses …
#> 4 170127_170124_Sibren_… error extract_caf_v… cannot process vendor data table…
#> 5 170127_170124_Sibren_… error extract_caf_r… cannot identify measured masses …
#> 6 170127_170124_Sibren_… error extract_caf_v… cannot process vendor data table…
#> 7 170127_170124_Sibren_… error extract_caf_r… cannot identify measured masses …
#> 8 170127_170124_Sibren_… error extract_caf_v… cannot process vendor data table…
#> 9 170127_170127_170124_… error extract_caf_r… cannot identify measured masses …
#> 10 170127_170127_170124_… error extract_caf_v… cannot process vendor data table…
#> # … with 1,366 more rows
iso_get_file_info(cafs)
#> Info: aggregating file info from 4928 data file(s)
#> Error: No common type for `170126_170124_Sibren_run29-1426.caf$file_datetime` <datetime<Europe/Amsterdam>> and `170621_170522_Guido_Magda_ETH-1-0000.caf$file_datetime` <integer>.
sessionInfo()
#> R version 3.6.3 (2020-02-29)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Arch Linux
#>
#> Matrix products: default
#> BLAS: /usr/lib/libopenblasp-r0.3.9.so
#> LAPACK: /usr/lib/liblapack.so.3.9.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] isoreader_1.1.5
#>
#> loaded via a namespace (and not attached):
#> [1] zip_2.0.4 Rcpp_1.0.4 pillar_1.4.3 compiler_3.6.3
#> [5] highr_0.8 prettyunits_1.1.1 progress_1.2.2 R.methodsS3_1.8.0
#> [9] R.utils_2.9.2 base64enc_0.1-3 tools_3.6.3 digest_0.6.25
#> [13] rhdf5_2.30.1 lubridate_1.7.8 evaluate_0.14 lifecycle_0.2.0
#> [17] tibble_3.0.0 pkgconfig_2.0.3 rlang_0.4.5 openxlsx_4.1.4
#> [21] cli_2.0.2 yaml_2.2.1 parallel_3.6.3 xfun_0.12
#> [25] xml2_1.3.0 dplyr_0.8.5 stringr_1.4.0 knitr_1.28
#> [29] generics_0.0.2 vctrs_0.2.4 globals_0.12.5 hms_0.5.3
#> [33] tidyselect_1.0.0 glue_1.4.0 listenv_0.8.0 R6_2.4.1
#> [37] fansi_0.4.1 rmarkdown_2.1 tidyr_1.0.2 Rhdf5lib_1.8.0
#> [41] readr_1.3.1 purrr_0.3.3 magrittr_1.5 feather_0.3.5
#> [45] codetools_0.2-16 ellipsis_0.3.0 htmltools_0.4.0 assertthat_0.2.1
#> [49] future_1.16.0 UNF_2.0.6 utf8_1.1.4 stringi_1.4.6
#> [53] crayon_1.3.4 R.oo_1.23.0
Created on 2020-04-08 by the reprex package (v0.3.0) |
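The `No common type` error in the log above names only the first pair of files whose `file_datetime` values clash. To list every file with a non-datetime value, one could check the class of that field per file. The sketch below uses a toy named list as a stand-in; how isoreader actually stores file info internally is an assumption here, and the file names and values are invented.

```r
# Toy stand-in for per-file info; the real isoreader structure may differ.
file_info <- list(
  "a.caf" = list(file_datetime = as.POSIXct("2017-01-26 20:29:47", tz = "UTC")),
  "b.caf" = list(file_datetime = NA_integer_),
  "c.caf" = list(file_datetime = as.POSIXct("2018-05-22 17:24:52", tz = "UTC"))
)

# TRUE where file_datetime is a proper date-time (POSIXct) value
is_datetime <- vapply(
  file_info,
  function(x) inherits(x$file_datetime, "POSIXct"),
  logical(1)
)

# Names of the files whose file_datetime would not combine with the others
bad_files <- names(is_datetime)[!is_datetime]
print(bad_files)
```

Re-reading only the files flagged this way keeps the reproduction small, instead of re-importing the whole archive.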
Of course I should have just limited it to the two files that are actually named in the error message. That would have saved me 2 hours of unnecessary computation ;-). Anyway, here they are included
reprex run on only those two files
library(isoreader)
#>
#> Attaching package: 'isoreader'
#> The following object is masked from 'package:stats':
#>
#> filter
setwd("~/Downloads")
cafs <- iso_read_dual_inlet("problematic_2_files",
cache = FALSE,
read_cache = FALSE,
quiet = FALSE,
discard_duplicates = FALSE,
parallel = FALSE)
#> Info: preparing to read 2 data files...
#> Info: reading file 'problematic_2_files/170126_170124_Sibren_run29-1426.caf...
#> Info: reading file 'problematic_2_files/170621_170522_Guido_Magda_ETH-1-000...
#> Warning: caught error - no C_blocks available
#> Warning: caught error - no C_blocks available
#> Warning: Unknown or uninitialised column: `block`.
#> Warning: caught error - no C_blocks available
#> Warning: caught error - no C_blocks available
#> Warning: caught error - no C_blocks available
#> Warning: caught error - no C_blocks available
#> Warning: caught error - no C_blocks available
#> Info: finished reading 2 files in 3.38 secs
#> Info: encountered 7 problems in total
#> # A tibble: 7 x 4
#> file_id type func details
#> <chr> <chr> <chr> <chr>
#> 1 170621_170522_Guido_Magda_ET… error extract_isodat_old_seque… no C_blocks ava…
#> 2 170621_170522_Guido_Magda_ET… error extract_isodat_datetime no C_blocks ava…
#> 3 170621_170522_Guido_Magda_ET… error extract_MS_integration_t… no C_blocks ava…
#> 4 170621_170522_Guido_Magda_ET… error extract_caf_raw_voltage_… no C_blocks ava…
#> 5 170621_170522_Guido_Magda_ET… error extract_isodat_reference… no C_blocks ava…
#> 6 170621_170522_Guido_Magda_ET… error extract_isodat_resistors no C_blocks ava…
#> 7 170621_170522_Guido_Magda_ET… error extract_caf_vendor_data_… no C_blocks ava…
iso_get_problems(cafs)
#> # A tibble: 7 x 4
#> file_id type func details
#> <chr> <chr> <chr> <chr>
#> 1 170621_170522_Guido_Magda_ET… error extract_isodat_old_seque… no C_blocks ava…
#> 2 170621_170522_Guido_Magda_ET… error extract_isodat_datetime no C_blocks ava…
#> 3 170621_170522_Guido_Magda_ET… error extract_MS_integration_t… no C_blocks ava…
#> 4 170621_170522_Guido_Magda_ET… error extract_caf_raw_voltage_… no C_blocks ava…
#> 5 170621_170522_Guido_Magda_ET… error extract_isodat_reference… no C_blocks ava…
#> 6 170621_170522_Guido_Magda_ET… error extract_isodat_resistors no C_blocks ava…
#> 7 170621_170522_Guido_Magda_ET… error extract_caf_vendor_data_… no C_blocks ava…
iso_get_file_info(cafs)
#> Info: aggregating file info from 2 data file(s)
#> Error: No common type for `170126_170124_Sibren_run29-1426.caf$file_datetime` <datetime<Europe/Amsterdam>> and `170621_170522_Guido_Magda_ETH-1-0000.caf$file_datetime` <integer>.
sessionInfo()
#> R version 3.6.3 (2020-02-29)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Arch Linux
#>
#> Matrix products: default
#> BLAS: /usr/lib/libopenblasp-r0.3.9.so
#> LAPACK: /usr/lib/liblapack.so.3.9.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] isoreader_1.1.5
#>
#> loaded via a namespace (and not attached):
#> [1] zip_2.0.4 Rcpp_1.0.4 pillar_1.4.3 compiler_3.6.3
#> [5] highr_0.8 prettyunits_1.1.1 progress_1.2.2 R.methodsS3_1.8.0
#> [9] R.utils_2.9.2 base64enc_0.1-3 tools_3.6.3 digest_0.6.25
#> [13] rhdf5_2.30.1 lubridate_1.7.8 evaluate_0.14 lifecycle_0.2.0
#> [17] tibble_3.0.0 pkgconfig_2.0.3 rlang_0.4.5 openxlsx_4.1.4
#> [21] cli_2.0.2 yaml_2.2.1 parallel_3.6.3 xfun_0.12
#> [25] xml2_1.3.0 dplyr_0.8.5 stringr_1.4.0 knitr_1.28
#> [29] generics_0.0.2 vctrs_0.2.4 globals_0.12.5 hms_0.5.3
#> [33] tidyselect_1.0.0 glue_1.4.0 listenv_0.8.0 R6_2.4.1
#> [37] fansi_0.4.1 rmarkdown_2.1 tidyr_1.0.2 Rhdf5lib_1.8.0
#> [41] readr_1.3.1 purrr_0.3.3 magrittr_1.5 feather_0.3.5
#> [45] codetools_0.2-16 ellipsis_0.3.0 htmltools_0.4.0 assertthat_0.2.1
#> [49] future_1.16.0 UNF_2.0.6 utf8_1.1.4 stringi_1.4.6
#> [53] crayon_1.3.4 R.oo_1.23.0
Created on 2020-04-08 by the reprex package (v0.3.0) |
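A note on the last error in the log above: the file that could not be read reports `file_datetime` as `<integer>`, while the intact file reports a datetime, so the two columns cannot be combined. The following is only a minimal base-R sketch of one way to bring such values to a common type before aggregating. `harmonize_datetime()` is a hypothetical helper and not part of isoreader, and treating the integer as a missing-value placeholder is an assumption.

```r
# Hypothetical helper (not part of isoreader): coerce a per-file datetime
# value to POSIXct so that values from all files share one type.
harmonize_datetime <- function(x, tz = "Europe/Amsterdam") {
  if (inherits(x, "POSIXct")) {
    return(x)
  }
  # Assumption: anything else (e.g. an integer NA left behind by a failed
  # read) is interpreted as seconds since the Unix epoch.
  as.POSIXct(as.numeric(x), origin = "1970-01-01", tz = tz)
}

good <- as.POSIXct("2017-01-26 12:00:00", tz = "Europe/Amsterdam")
bad <- NA_integer_  # stand-in for the file whose datetime could not be extracted

# After harmonizing, both values are POSIXct and can be combined.
combined <- do.call(c, lapply(list(good, bad), harmonize_datetime))
print(class(combined))
```

The unreadable file ends up with a missing datetime rather than a value of a different type, which keeps the column usable for the files that did read correctly.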
This seems resolved now, also in the master branch! Read/save speeds of the raw files are back to before and I don't get errors on saving the rds! Running |
Hi @japhir. The whole cache file system was actually revamped in yesterday's release (1.2.0), so cache files can be copied, have more useful file names to tell which is which, and allow skipping the data integrity checks for files that are up to date (should make reading
By the way, notifications about isoverse are now in a repo for this purpose, take a look: isoverse/news#2 |
Hi @sebkopf, thanks for the notice. I've just updated to R 4.0 and the newest isoreader, but I think something must have gone wrong somewhere… re-reading the whole database took about twice as long as last time (with very few new files, as you can imagine) and while How can I help debug this? |
Hi @japhir, can you send a small excerpt of your entire collection? Nothing has changed in |
Just finished reading in everything. Didn't get any particular warnings on the newer did files, but got these on the caf files: (again, much slower than before).
All of the previously shared files in this thread should be good, the raw data haven't changed. How big of a subset were you thinking? I was hesitant to share many earlier, but just asked my supervisor and he says it shouldn't be a problem to share some files. |
that's great! I was actually thinking not the raw files since they don't cause trouble for me but just parts of the isofile collection, so something like this:
as for that
|
Ok @sebkopf, here's the test file with 100 standards! I generated them like this:
seb_sub <- dids %>%
  iso_filter_files(Comment == "STD") # for standard
# evenly spaced throughout the record, not sure if it's sorted by file_datetime though,
# so could still be random.
seb_sub <- seb_sub[(floor(seq(1, length(seb_sub), length.out = 100)))] %>%
  iso_save("out/for_seb.di.rds")
I tried to have a look at where it's getting slow with profvis, but I don't really understand the graph so I'll leave that up to you ;-)
library(profvis)
library(isoreader)
profvis({
dids <- iso_read_dual_inlet("out/for_seb.di.rds")
didinfo <- dids %>%
iso_get_file_info()
rawdata <- dids %>%
iso_get_raw_data()
})
output on my machine
Attaching package: ‘isoreader’
The following object is masked from ‘package:stats’:
filter
Progress: [-----------------------------------------------------------------------------------------] 0/1 ( 0%) 0s
Info: preparing to read 1 data files (all will be cached)...
Progress: [-----------------------------------------------------------------------------------------] 0/1 ( 0%) 0s
Info: reading file 'out/for_seb.di.rds' with '.di.rds' reader...
Progress: [-----------------------------------------------------------------------------------------] 0/1 ( 0%) 0s
Info: loaded 100 data files from R Data Storage
Progress: [-----------------------------------------------------------------------------------------] 0/1 ( 0%) 0s
Progress: [=========================================================================================] 1/1 (100%) 0s
Info: finished reading 1 files in 0.19 secs
Info: encountered 19 problems in total
# A tibble: 19 x 4
file_id type func details
<chr> <chr> <chr> <chr>
 1 180223_1_Kiel Std tes… error extract_did_raw_v… cannot locate voltage data - block 'CTwoDoublesArrayData' not fo…
 2 180223_1_Kiel Std tes… error extract_did_vendo… cannot process vendor computed data table - block 'CDualInletEva…
 3 180517_29_RobinV_5_ET… warning iso_as_file_list duplicate files kept but with recoded file IDs: 180517_29_RobinV…
 4 180621_47_Chris_14_ET… error extract_did_raw_v… cannot locate voltage data - block 'CTwoDoublesArrayData' not fo…
 5 180621_47_Chris_14_ET… error extract_did_vendo… cannot process vendor computed data table - block 'CDualInletEva…
 6 180621_47_Chris_14_ET… warning iso_as_file_list duplicate files kept but with recoded file IDs: 180621_47_Chris_…
 7 180903_83_Cas_19_ETH-… warning iso_as_file_list duplicate files kept but with recoded file IDs: 180903_83_Cas_19…
 8 180915_88_WuyunCas_39… warning iso_as_file_list duplicate files kept but with recoded file IDs: 180915_88_WuyunC…
 9 180929_94_Ilja_37_ETH… warning iso_as_file_list duplicate files kept but with recoded file IDs: 180929_94_Ilja_3…
10 190514_195_NdW_25_ETH… warning iso_as_file_list duplicate files kept but with recoded file IDs: 190514_195_NdW_2…
11 190805_237_RvdP_5_ETH… error extract_did_raw_v… cannot locate voltage data - block 'CTwoDoublesArrayData' not fo…
12 190805_237_RvdP_5_ETH… error extract_did_vendo… cannot process vendor computed data table - block 'CDualInletEva…
13 191125_295_MM_16_ETH-… error extract_did_raw_v… cannot locate voltage data - block 'CTwoDoublesArrayData' not fo…
14 191125_295_MM_16_ETH-… error extract_did_vendo… cannot process vendor computed data table - block 'CDualInletEva…
15 200110_311_NdW_43_ETH… warning iso_as_file_list duplicate files kept but with recoded file IDs: 200110_311_NdW_4…
16 180316_4_Std test_6_E… warning iso_as_file_list duplicate files kept but with recoded file IDs: 180316_4_Std tes…
17 180831_83_Cas_9_ETH-3… error extract_did_raw_v… cannot locate voltage data - block 'CTwoDoublesArrayData' not fo…
18 180831_83_Cas_9_ETH-3… error extract_did_vendo… cannot process vendor computed data table - block 'CDualInletEva…
19 180831_83_Cas_9_ETH-3… warning iso_as_file_list duplicate files kept but with recoded file IDs: 180831_83_Cas_9_…
Info: aggregating file info from 100 data file(s)
Info: aggregating raw data from 100 data file(s) |
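The subset in this comment was picked with `floor(seq(1, length(x), length.out = 100))`, with the caveat that the collection might not be sorted by `file_datetime`. As a rough base-R sketch of the same idea with the ordering made explicit: the `datetimes` vector below is simulated stand-in data, and `evenly_spaced_index()` is a hypothetical helper, not an isoreader function.

```r
# Pick up to n evenly spaced positions from a sequence of length len.
# unique() guards against repeated positions when len is smaller than n.
evenly_spaced_index <- function(len, n = 100) {
  if (len < 1) {
    return(integer(0))
  }
  unique(floor(seq(1, len, length.out = n)))
}

# Simulated, unsorted datetimes standing in for the per-file datetimes.
set.seed(1)
datetimes <- as.POSIXct("2018-01-01", tz = "UTC") + sample(0:1e6, 1500)

# Order by datetime first, so that "evenly spaced" refers to the time axis
# rather than to whatever order the files happen to be stored in.
ord <- order(datetimes)
picked <- ord[evenly_spaced_index(length(datetimes), n = 100)]
print(range(datetimes[picked]))
```

`picked` holds positions into the original, unsorted vector, so it could be used to index a collection stored in the same order.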
regarding the debugging request: this doesn't work because of the duplicated files
options(warn = 2)
isoreader:::iso_turn_debug_on(catch_errors = FALSE)
setwd("~/Documents/archive/")
isoreader::iso_read_dual_inlet("~/Documents/archive/pacman/cafs",
  discard_duplicates = FALSE)
output
Info: debug mode turned on, error catching turned off, caching turned off
Error: (converted from warning) some files from different folders have identical file names:
~/Documents/archive/pacman/cafs/170402_Sibren_8.2 event/170402_Sibren_8(1).caf
~/Documents/archive/pacman/cafs/170402_Sibren_8.2 event/170402_Sibren_8(2).caf
~/Documents/archive/pacman/cafs/170402_Sibren_8.2 event/170402_Sibren_8(3).caf
~/Documents/archive/pacman/cafs/170402_Sibren_8.2 event/170402_Sibren_8(4).caf
~/Documents/archive/pacman/cafs/170402_Sibren_8.2 event/170402_Sibren_8(5).caf
~/Documents/archive/pacman/cafs/170402_Sibren_8.2 event/170402_Sibren_8(6).caf
~/Documents/archive/pacman/cafs/170402_Sibren_8.2 event/170402_Sibren_8(7).caf
~/Documents/archive/pacman/cafs/170402_Sibren_8.2 event/170402_Sibren_8(8).caf
~/Documents/archive/pacman/cafs/170402_Sibren_8.2 event/170402_Sibren_8(9).caf
~/Documents/archive/pacman/cafs/170402_Sibren_8.2 event/deel2/170402_Sibren_8(1).caf
~/Documents/archive/pacman/cafs/170402_Sibren_8.2 event/deel2/170402_Sibren_8(2).caf
~/Documents/a |
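With `options(warn = 2)` the duplicate-name warning above is promoted to an error, so the read stops before any file is parsed. To see in advance which files share a name across folders, something along these lines might help. It is a base-R sketch only: `find_duplicate_names()` is a hypothetical helper, and the demo paths are made up.

```r
# List files below `root` whose base name occurs more than once,
# grouped by the shared name.
find_duplicate_names <- function(root, pattern = "\\.caf$") {
  paths <- list.files(root, pattern = pattern, recursive = TRUE, full.names = TRUE)
  file_names <- basename(paths)
  is_dup <- file_names %in% file_names[duplicated(file_names)]
  split(paths[is_dup], file_names[is_dup])
}

# Self-contained demo in a temporary directory.
root <- file.path(tempdir(), "dup_demo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "run", "deel2"), recursive = TRUE)
invisible(file.create(file.path(root, "run", c("a(1).caf", "b.caf"))))
invisible(file.create(file.path(root, "run", "deel2", "a(1).caf")))

dups <- find_duplicate_names(root)
print(dups)
```

Each list element is named after the shared file name and holds the full paths that collide.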
Having to work from home got me quite frustrated with the extremely slow vpn connection I have to the rawdata samba drive, so I copied everything over with some nice rsync scripts. I used the -t flag in rsync, which is supposed to preserve modification times. This seems to have gone wrong however:
Now I did manage to read in all the data, but when I try to iso_get_file_info() for all ~15k files, it results in the below errors:
Running any of the other isoreader functions is also extremely slow: just reading in the 104MB rds file with dids <- iso_read_dual_inlet("out/dids.di.rds") takes ~2.11 minutes, probably because it's performing some checks? read_rds("out/dids.di.rds") takes approximately 7 seconds. iso_filter_files() is also non-functional on the whole dataset.
Any ideas on how to fix this?
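Since the copy step is the suspect here, one way to check it from R is to compare file size and modification time between the source and the copied tree. This is a minimal base-R sketch under the assumption that both trees are reachable as local paths. `compare_copies()` is a hypothetical helper, and the demo directories are temporary stand-ins for the real drives.

```r
# Compare size and modification time of every file below `src`
# with the file at the same relative path below `dst`.
compare_copies <- function(src, dst) {
  rel <- list.files(src, recursive = TRUE)
  src_info <- file.info(file.path(src, rel))
  dst_info <- file.info(file.path(dst, rel))
  present <- !is.na(dst_info$size)
  mtime_gap <- abs(as.numeric(difftime(dst_info$mtime, src_info$mtime, units = "secs")))
  data.frame(
    file = rel,
    missing = !present,
    size_differs = present & dst_info$size != src_info$size,
    # Allow 1 second of slack, since file systems store mtimes at
    # different resolutions.
    mtime_differs = present & mtime_gap > 1,
    stringsAsFactors = FALSE
  )
}

# Self-contained demo with one complete, one truncated and one missing copy.
src <- file.path(tempdir(), "copy_src")
dst <- file.path(tempdir(), "copy_dst")
unlink(c(src, dst), recursive = TRUE)
dir.create(src)
dir.create(dst)
for (f in c("absent.caf", "complete.caf", "partial.caf")) {
  writeLines(rep("data", 100), file.path(src, f))
}
invisible(file.copy(file.path(src, "complete.caf"), dst, copy.date = TRUE))
writeLines(rep("data", 10), file.path(dst, "partial.caf"))  # simulated half-copied file

report <- compare_copies(src, dst)
print(report[report$missing | report$size_differs | report$mtime_differs, ])
```

A size mismatch points at a truncated copy, while a pure mtime mismatch suggests the timestamps were not preserved even though the content may be intact.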