Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

updated file conversion? #17

Closed
devonorourke opened this issue Aug 27, 2018 · 9 comments
Closed

updated file conversion? #17

devonorourke opened this issue Aug 27, 2018 · 9 comments

Comments

@devonorourke
Copy link

devonorourke commented Aug 27, 2018

Hi guys,
In following your first tutorial it appears that newer versions of pcadapt require vcf --> bed format conversion for large files. I'm wondering where I'm going wrong. The vcf --> bed file conversion was straightforward in PLINK, and the read.pcadapt function appears to accept my modified file:

SEpre <- read.pcadapt(SEpreBedPath, type = "bed")

> class(SEpre)
[1] "pcadapt_bed"

> str(SEpre)
Class 'pcadapt_bed'  atomic [1:1] /mnt/lustre/macmaneslab/devon/exome/vcfs/pcadapt/SEpre.ped
  ..- attr(*, "n")= int 31
  ..- attr(*, "p")= int 31

However the pcadapt function appears to fail any time I load the object (SEpre) resulting form the read.pcadapt function:

> x.SEpre.K2 <- pcadapt(input = SEpre, K = 2)
Error in bedXPtr(input, n, p) : File is not a binary PED file.

Any idea where I'm going wrong?

Perhaps you can also update your documentation regarding these file conversions and note that for large files you are now required to convert the vcf to a PLINK-formatted .bed file? Your program provides error messages that are helpful at discerning this is the case, but these weren't posted in your tutorial/manual.
In addition it would be great if you can add to the documentation that you must specify the new file as type = "bed" when running the read.pcadapt function.

I've posted the R sessionInfo() below in case that's helpful.
Thanks for your help

R version 3.4.2 (2017-09-28)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.3 (Nitrogen)

Matrix products: default
BLAS: /mnt/lustre/software/linuxbrew/colsa-20171116/Cellar/r/3.4.2/lib/R/lib/libRblas.so
LAPACK: /mnt/lustre/software/linuxbrew/colsa-20171116/Cellar/r/3.4.2/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] plyr_1.8.4    pcadapt_4.0.3

loaded via a namespace (and not attached):
 [1] pinfsc50_1.1.0    Rcpp_0.12.17      pillar_1.2.1      compiler_3.4.2   
 [5] vcfR_1.8.0        bindr_0.1.1       tools_3.4.2       digest_0.6.15    
 [9] jsonlite_1.5      tibble_1.4.2      gtable_0.2.0      nlme_3.1-131     
[13] viridisLite_0.3.0 lattice_0.20-35   mgcv_1.8-23       pkgconfig_2.0.1  
[17] rlang_0.2.1       Matrix_1.2-12     parallel_3.4.2    bindrcpp_0.2.2   
[21] cluster_2.0.6     dplyr_0.7.6       httr_1.3.1        htmlwidgets_0.9  
[25] grid_3.4.2        tidyselect_0.2.4  glue_1.2.0        data.table_1.11.4
[29] R6_2.2.2          plotly_4.7.1      ggplot2_3.0.0     purrr_0.2.5      
[33] tidyr_0.7.2       magrittr_1.5      scales_0.5.0      htmltools_0.3.6  
[37] MASS_7.3-48       assertthat_0.2.0  permute_0.9-4     colorspace_1.3-2 
[41] ape_5.1           lazyeval_0.2.1    munsell_0.4.3     vegan_2.5-2  
@devonorourke devonorourke changed the title update file conversion updated file conversion? Aug 27, 2018
@privefl
Copy link
Member

privefl commented Aug 27, 2018

From the extension, it seems that you have a .ped (plain text file) rather than a .bed (binary file).

@devonorourke
Copy link
Author

Sorry for the delayed reply. It does indeed seem odd that the what I've uploaded is a plain text file instead of a bed, but I can assure you it's not a .ped file. That's what's confusing me! :)

I take a .vcf file (which was too big to load) and run this PLINK (PLINK v1.90b5.2) command:

plink 
  --allow-extra-chr
  --out SE
  --vcf SEcommon.vcf

This produces a trio of .fam, .bed, and .bim files, including that specious binary file...

To confirm this is indeed a binary, I ran another PLINK test:

plink --adjust --allow-extra-chr --allow-no-sex --assoc \
--bfile SE

This works great. However, if I swap out the last line to --file SE instead of --bfile SE the command fails. The file I'm using is a binary file, as best as I can tell.

I've attached this fairly small file to this email in case that helps your troubleshooting.

Thanks again for your help

SE.bed.zip

@privefl
Copy link
Member

privefl commented Aug 30, 2018

readBin("~/Téléchargements/SE.bed", what = 1L, n = 3, size = 1, signed = FALSE)
[1] 108  27   1

Seems OK.
Do you have the bim and fam files too?

@devonorourke
Copy link
Author

Sure thing (see below). Note I end up manually editing the .fam file because the info gets transposed incorrectly in that file.

Archive.zip

@privefl
Copy link
Member

privefl commented Aug 30, 2018

I'm able to run pcadapt() succesfully with the example you provided.

@devonorourke
Copy link
Author

devonorourke commented Aug 30, 2018 via email

@privefl
Copy link
Member

privefl commented Aug 30, 2018

bed <- "~/Téléchargements/SE.bed"
readBin(bed, what = 1L, n = 3, size = 1, signed = FALSE)

# devtools::install_github("bcm-uga/pcadapt")
library(pcadapt)
SEpre <- read.pcadapt(bed, type = "bed")
obj <- pcadapt(input = SEpre, K = 2)
plot(obj)

@devonorourke
Copy link
Author

thanks very much!
I'm starting to think there might have been a version history issue?
I've been testing this on our compute cluster (as that's where all the data lives) but when transferring this snippet of data to my laptop, which has the most recent CRAN version of pcadapt loaded, things worked without issue.
so strange.
thanks for the support!

@privefl
Copy link
Member

privefl commented Aug 30, 2018

Don't worry, thanks for reporting.
Don't hesitate if you have any further question.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants