The bold_seqspec function is returning a weird data frame with only 1 variable and 7 occurrences #66

Closed
tadeu95 opened this issue Jun 27, 2019 · 4 comments

@tadeu95

tadeu95 commented Jun 27, 2019

I can't get the function to work: it returns a data frame with only 7 observations and 1 variable. I've run this function time and time again before and it worked fine.
The function I used was this:

list_species<-function(groups){
  groups=c("Actinopterygii","Sarcopterygii","Elasmobranchii","Holocephali","Cyclostomata")
  fish<-bold_seqspec(taxon=groups, format = "tsv", marker="COI-5P")
  fish2<-fish[(!(is.na(fish$lat)) | fish$country!="") & (fish$species_name!=""),]
  fish2$number<-str_count(fish2$nucleotides, pattern="[A-Z]")
  fish3<-fish2[(fish2$number>499),]
  assign('fish3',fish3,envir=.GlobalEnv)
}
list_species(groups)

My session info:

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=Portuguese_Portugal.1252  LC_CTYPE=Portuguese_Portugal.1252    LC_MONETARY=Portuguese_Portugal.1252 LC_NUMERIC=C                        
[5] LC_TIME=Portuguese_Portugal.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggplot2_3.2.0     dplyr_0.8.1       fingerprint_3.5.7 readr_1.3.1       stringr_1.4.0     worms_0.2.2       plyr_1.8.4        httr_1.4.0       
 [9] rentrez_1.2.2     data.table_1.12.2 bold_0.8.6        seqRFLP_1.0.1    

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1       pillar_1.4.1     compiler_3.5.1   tools_3.5.1      jsonlite_1.6     tibble_2.1.3     gtable_0.3.0     pkgconfig_2.0.2 
 [9] rlang_0.3.4      rstudioapi_0.10  crul_0.7.4       curl_3.3         yaml_2.2.0       withr_2.1.2      xml2_1.2.0       hms_0.4.2       
[17] triebeard_0.3.0  grid_3.5.1       tidyselect_0.2.5 reshape_0.8.8    glue_1.3.1       httpcode_0.2.0   R6_2.4.0         XML_3.98-1.20   
[25] purrr_0.3.2      magrittr_1.5     urltools_1.7.3   scales_1.0.0     assertthat_0.2.1 colorspace_1.4-1 stringi_1.4.3    lazyeval_0.2.2  
[33] munsell_0.5.0    crayon_1.3.4  

Thank you so much in advance for any response

@sckott sckott added this to the v1.0 milestone Jun 27, 2019
sckott added a commit that referenced this issue Jun 27, 2019
@sckott
Contributor

sckott commented Jun 27, 2019

thanks for your question @tadeu95 !

First, what you're doing is not what you think you're doing.

There is no parameter fish in the function bold_seqspec(). So if we do

bold_seqspec(fish=groups, format = "tsv", marker="COI-5P", verbose = TRUE)

We can see that the actual request is

http://v4.boldsystems.org/index.php/API_Public/combined?marker=COI-5P&combined_download=tsv

Without any of your taxon names.

Second, if you do use taxon = groups, the query is too large and the BOLD server returns an error. The server doesn't report that error correctly, so the function can't pass the information on to the user. I know it's not ideal, but bold_specimens and/or bold_seq seem to work fine on most of your taxa. For the biggest one, Actinopterygii, you can break the query up into smaller taxonomic groups that will work. To get smaller groups you can try, e.g., downstream("Actinopterygii", db = "ncbi", downto = "class") from the taxize package, where downto is the rank you want to get down to.
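The "break it up into smaller groups" suggestion could be sketched roughly as below. This is only an illustration, not part of the bold package: fetch_each_taxon is a hypothetical helper name, and it assumes a failed request shows up either as an R error or as something that is not a data frame. It sends one request per taxon and keeps track of which taxa failed, so a single oversized group doesn't sink the whole download.

```r
# Hypothetical helper (not part of bold): request one taxon at a time.
# `fetch` is any function that takes a single taxon name and returns a data frame.
fetch_each_taxon <- function(taxa, fetch) {
  results <- lapply(taxa, function(taxon) {
    # Turn an R error into NULL so one bad request doesn't stop the loop
    out <- tryCatch(fetch(taxon), error = function(e) NULL)
    # Assumption: anything that is not a data frame counts as a failed request
    if (is.data.frame(out)) out else NULL
  })
  names(results) <- taxa
  ok <- !vapply(results, is.null, logical(1))
  list(
    records = results[ok],  # named list of data frames, one per successful taxon
    failed  = taxa[!ok]     # taxa to retry later or split into smaller groups
  )
}

# Usage (needs the bold package and network access, so it is left commented out):
# groups <- c("Sarcopterygii", "Elasmobranchii", "Holocephali", "Cyclostomata")
# res <- fetch_each_taxon(groups, function(taxon) {
#   bold::bold_seqspec(taxon = taxon, format = "tsv", marker = "COI-5P")
# })
# res$failed
```

The per-taxon tables in res$records can then be combined (e.g. with rbind) once you've checked that they share the same columns.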

@tadeu95
Author

tadeu95 commented Jun 27, 2019

I corrected the script. That was a mistake: it really is "taxon=groups". I had automatically replaced the word fish and forgot it was there. I didn't use the "verbose" parameter before and it worked well. What does verbose do?
The thing is, I've run this function many times with those taxonomic groups and it worked perfectly until yesterday, when it started giving me these problems. I've also tried it with smaller groups and it's still not working. I discovered that the BOLD API on the BOLD website isn't working well. Do you think that could be the reason?
Thank you for the answer

@sckott
Contributor

sckott commented Jun 27, 2019

verbose is one of many curl options you can pass in to the HTTP request call. See ?curl::curl_options

"I've run this function many times with those taxonomic groups and it worked perfectly, until yesterday when it started giving me these problems."

That probably means the BOLD website is having problems, and at some point it will be fine again. Of course, this is out of our control, unfortunately.
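Because a server-side problem like this surfaces as an odd-looking data frame rather than an R error, a small guard before the filtering step can make the failure obvious. The sketch below is only an illustration: check_bold_result is a hypothetical name, and the required columns are simply the ones the filtering code in this thread uses (species_name, lat, country, nucleotides).

```r
# Columns that the filtering code in this thread relies on
required_cols <- c("species_name", "lat", "country", "nucleotides")

# Hypothetical guard (not part of bold): stop early if the result
# doesn't look like a specimen/sequence table.
check_bold_result <- function(x, required = required_cols) {
  if (!is.data.frame(x)) {
    stop("Expected a data frame, got: ", class(x)[1])
  }
  absent <- setdiff(required, names(x))
  if (length(absent) > 0) {
    stop("Result has ", nrow(x), " rows and ", ncol(x),
         " columns; missing: ", paste(absent, collapse = ", "),
         ". The BOLD server may have returned an error page.")
  }
  invisible(x)  # return the input unchanged so the call can be chained
}

# Usage (needs the bold package and network access, so it is left commented out):
# fish <- check_bold_result(
#   bold::bold_seqspec(taxon = "Holocephali", format = "tsv", marker = "COI-5P")
# )
```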

@tadeu95
Author

tadeu95 commented Jun 27, 2019

Ok, thank you for the answers, I'm sure it will be up again soon

@sckott sckott closed this as completed Jan 17, 2020