radiator::filter_rad generating individual stats error #58

SarahEmel · 2019-08-20T04:23:49Z

Hi Thierry,

I am attempting to use radiator::filter_rad to filter a populations.snps.vcf file from Stacks, but I am consistently getting an error in ''generating individual stats" and then I do not get the output files I've specified. Based on testing with different arguments and versions of the input file (including after filtering out some missing data in vcftools) and on reading other issues here, I suspect there might be a problem with my vcf header that I'm unaware of.

I appreciate any help you can offer. Thanks for making this package and I hope to be able to make it part of my workflow!

-Sarah

the exact command (function, arguments, values) used

All of these lead to the same error:
myfiltereddata <- filter_rad(
data = "populations.snps.vcf",
filter.short.ld = "random",
filter.long.ld = 0.5,
filter.hwe = TRUE,
strata = "strata_locyear.txt",
output = c("vcf","genind"),
filename = "filter_rad_output")

myfiltereddata <- filter_rad(
data = "populations.snps.vcf",
filter.short.ld = "random",
filter.long.ld = 0.5,
filter.hwe = TRUE,
strata = NULL,
output = c("vcf","genind"),
filename = "filter_rad_output")

myfiltereddata <- filter_rad(
data = "populations.snps.vcf",
strata = NULL,
interactive.filter = TRUE)

myfiltereddata <- filter_rad(
data = "populations.snps.vcftools.recode",
strata = NULL,
interactive.filter = TRUE)

myfiltereddata <- filter_rad(
data = "RH_subset.vcf",
strata = NULL,
interactive.filter = TRUE)

the complete error message you're getting

myfiltereddata <- filter_rad(

data = "RH_subset.vcf",
strata = NULL,
interactive.filter = TRUE)
################################################################################
############################# radiator::filter_rad #############################
################################################################################
The function arguments names have changed: please read documentation

Execution date@time: 20190820@0008
Folder created: filter_rad_20190820@0008
Function call and arguments stored in: radiator_filter_rad_args_20190820@0008.tsv
File written: random.seed (563656)
Filters parameters file generated: filters_parameters_20190820@0008.tsv

Reading VCF
Data summary:
number of samples: 94
number of markers: 185
done! timing: 0 sec

Generating individual stats...
Error in if (stats::sd(id.info$COVERAGE_MEAN) != 0) { :
missing value where TRUE/FALSE needed

Computation time, overall: 1 sec
############################# completed filter_rad #############################
RH_subset.vcf.zip

the output of devtools::session_info()

############################# completed filter_rad #############################

devtools::session_info()
─ Session info ──────────────────────────────────────────────────────────────────────────────────
setting value
version R version 3.6.0 (2019-04-26)
os macOS High Sierra 10.13.6
system x86_64, darwin15.6.0
ui RStudio
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/New_York
date 2019-08-20

─ Packages ──────────────────────────────────────────────────────────────────────────────────────
package * version date lib source
assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.0)
backports 1.1.4 2019-04-10 [1] CRAN (R 3.6.0)
Biobase 2.44.0 2019-05-02 [1] Bioconductor
BiocGenerics 0.30.0 2019-05-02 [1] Bioconductor
Biostrings 2.52.0 2019-05-02 [1] Bioconductor
bitops 1.0-6 2013-08-17 [1] CRAN (R 3.6.0)
boot 1.3-23 2019-07-05 [1] CRAN (R 3.6.0)
broom 0.5.2 2019-04-07 [1] CRAN (R 3.6.0)
callr 3.3.1 2019-07-18 [1] CRAN (R 3.6.0)
cli 1.1.0 2019-03-19 [1] CRAN (R 3.6.0)
crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.0)
data.table 1.12.2 2019-04-07 [1] CRAN (R 3.6.0)
desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.0)
devtools 2.1.0 2019-07-06 [1] CRAN (R 3.6.0)
digest 0.6.20 2019-07-04 [1] CRAN (R 3.6.0)
dplyr 0.8.3 2019-07-04 [1] CRAN (R 3.6.0)
fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.0)
gdsfmt 1.20.0 2019-05-02 [1] Bioconductor
generics 0.0.2 2018-11-29 [1] CRAN (R 3.6.0)
GenomeInfoDb 1.20.0 2019-05-02 [1] Bioconductor
GenomeInfoDbData 1.2.1 2019-08-02 [1] Bioconductor
GenomicRanges 1.36.0 2019-05-02 [1] Bioconductor
glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.0)
GWASExactHW 1.01 2013-01-05 [1] CRAN (R 3.6.0)
hms 0.5.0 2019-07-09 [1] CRAN (R 3.6.0)
IRanges 2.18.1 2019-05-31 [1] Bioconductor
jomo 2.6-9 2019-07-29 [1] CRAN (R 3.6.0)
lattice 0.20-38 2018-11-04 [1] CRAN (R 3.6.0)
lme4 1.1-21 2019-03-05 [1] CRAN (R 3.6.0)
logistf 1.23 2018-07-19 [1] CRAN (R 3.6.0)
magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.0)
MASS 7.3-51.4 2019-03-31 [1] CRAN (R 3.6.0)
Matrix 1.2-17 2019-03-22 [1] CRAN (R 3.6.0)
memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.0)
mgcv 1.8-28 2019-03-21 [1] CRAN (R 3.6.0)
mice 3.6.0 2019-07-10 [1] CRAN (R 3.6.0)
minqa 1.2.4 2014-10-09 [1] CRAN (R 3.6.0)
mitml 0.3-7 2019-01-07 [1] CRAN (R 3.6.0)
nlme 3.1-140 2019-05-12 [1] CRAN (R 3.6.0)
nloptr 1.2.1 2018-10-03 [1] CRAN (R 3.6.0)
nnet 7.3-12 2016-02-02 [1] CRAN (R 3.6.0)
pan 1.6 2018-06-29 [1] CRAN (R 3.6.0)
pillar 1.4.2 2019-06-29 [1] CRAN (R 3.6.0)
pkgbuild 1.0.3 2019-03-20 [1] CRAN (R 3.6.0)
pkgconfig 2.0.2 2018-08-16 [1] CRAN (R 3.6.0)
pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.0)
prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.6.0)
processx 3.4.1 2019-07-18 [1] CRAN (R 3.6.0)
ps 1.3.0 2018-12-21 [1] CRAN (R 3.6.0)
purrr 0.3.2 2019-03-15 [1] CRAN (R 3.6.0)
R6 2.4.0 2019-02-14 [1] CRAN (R 3.6.0)
radiator * 1.1.2 2019-08-20 [1] Github (53a137d)
Rcpp 1.0.2 2019-07-25 [1] CRAN (R 3.6.0)
RCurl 1.95-4.12 2019-03-04 [1] CRAN (R 3.6.0)
readr 1.3.1 2018-12-21 [1] CRAN (R 3.6.0)
remotes 2.1.0 2019-06-24 [1] CRAN (R 3.6.0)
rlang 0.4.0 2019-06-25 [1] CRAN (R 3.6.0)
rpart 4.1-15 2019-04-12 [1] CRAN (R 3.6.0)
rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.0)
rstudioapi 0.10 2019-03-19 [1] CRAN (R 3.6.0)
S4Vectors 0.22.0 2019-05-02 [1] Bioconductor
SeqArray 1.24.2 2019-07-12 [1] Bioconductor
SeqVarTools 1.22.0 2019-05-02 [1] Bioconductor
sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.0)
stringi 1.4.3 2019-03-12 [1] CRAN (R 3.6.0)
survival 2.44-1.1 2019-04-01 [1] CRAN (R 3.6.0)
testthat 2.2.1 2019-07-25 [1] CRAN (R 3.6.0)
tibble 2.1.3 2019-06-06 [1] CRAN (R 3.6.0)
tidyr 0.8.3 2019-03-01 [1] CRAN (R 3.6.0)
tidyselect 0.2.5 2018-10-11 [1] CRAN (R 3.6.0)
usethis 1.5.1 2019-07-04 [1] CRAN (R 3.6.0)
vctrs 0.2.0 2019-07-05 [1] CRAN (R 3.6.0)
withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.0)
XVector 0.24.0 2019-05-02 [1] Bioconductor
yaml 2.2.0 2018-07-25 [1] CRAN (R 3.6.0)
zeallot 0.1.0 2018-01-28 [1] CRAN (R 3.6.0)
zlibbioc 1.30.0 2019-05-02 [1] Bioconductor

[1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library

complete data required to reproduce the problem, a subset of it. The data remains confidential.

See attached subset

The text was updated successfully, but these errors were encountered:

thierrygosselin · 2019-08-20T19:21:50Z

If your subset is a good representation of the original dataset, the problem is that your VCF as individual's that are missing all of their genotypes. My code was not prepared to handle this and the code throws an error when it tries to do a standard SD with base R code (it doesn't accept NA, NaN , etc).

The new commit should fix that.

I suggest looking at the file that start with individuals.qc.stats and highlight the bad samples.

SarahEmel · 2019-09-04T17:33:11Z

Thank you Thierry! It took me a while to get everything working with my dataset, but I am able to run filter_rad now without errors. We had a lot of variation in reads per individual and there was one bad sample that had to be removed in addition to other filtering.

thierrygosselin · 2019-09-04T18:30:28Z

ok glad I could help!
lots of variation in reads per individuals can be lowered by...

Prior to sequencing (the most important steps):

good DNA
a puch or a tool that takes similar amount of tissue
using a robot for DNA extractions
having experienced people behind the wet lab bench
having the same, experienced, lab tech behind the wet bench
not having a student behind the wet lab bench
you get the idea ;)

bioinformatically:

normalization...if the variation generates weird individual heterozygosity pattern (check the Manhattan plot in the folder detect_mixed_genomes

thierrygosselin closed this as completed Aug 20, 2019

greavess mentioned this issue May 31, 2024

Work around for .DynamicClusterCall error #189

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

radiator::filter_rad generating individual stats error #58

radiator::filter_rad generating individual stats error #58

SarahEmel commented Aug 20, 2019 •

edited

Loading

thierrygosselin commented Aug 20, 2019 •

edited

Loading

SarahEmel commented Sep 4, 2019

thierrygosselin commented Sep 4, 2019

radiator::filter_rad generating individual stats error #58

radiator::filter_rad generating individual stats error #58

Comments

SarahEmel commented Aug 20, 2019 • edited Loading

thierrygosselin commented Aug 20, 2019 • edited Loading

SarahEmel commented Sep 4, 2019

thierrygosselin commented Sep 4, 2019

SarahEmel commented Aug 20, 2019 •

edited

Loading

thierrygosselin commented Aug 20, 2019 •

edited

Loading