Error (?) - strata_besthits #7

Closed
Piergiorge opened this issue Dec 24, 2018 · 2 comments

Piergiorge commented Dec 24, 2018

Hello!
I am trying to use phylostratr to analyze a subset of 2000 human proteins. I can load the BLAST results, but I get an error when I try to retrieve the best hits.

My code:

library(devtools)
source("https://bioconductor.org/biocLite.R")
library(phylostratr)
library(magrittr)
library(plotly)

weights=uniprot_weight_by_ref()
focal_taxid <- '9606'
strata <-
  uniprot_strata(focal_taxid, from=2L) %>%
  strata_apply(f=diverse_subtree, n=5L, weights=weights) %>%
  use_recommended_prokaryotes %>%
  uniprot_fill_strata
strata@data$faa[['9606']] <- "my_subset.faa"
strata_blast(strata, blast_args=list(nthreads=8)) %>%
  strata_besthits %>%
  merge_besthits

#Error in mutate_impl(.data, dots) : 
#Evaluation error: Column `staxid` not found in `.data`
#Call `rlang::last_error()` to see a backtrace.

rlang::last_error()
#<error>
#message: Column `staxid` not found in `.data`
#class:   `rlang_data_pronoun_not_found`
#backtrace:
#-phylostratr::strata_blast(strata, blast_args = list(nthreads = 4))
#-phylostratr::strata_besthits(.)
#Call `summary(rlang::last_error())` to see the full backtrace

summary(rlang::last_error())

#<error>
#message: Column `staxid` not found in `.data`
#class:   `rlang_data_pronoun_not_found`
#fields:  `message`, `trace` and `parent`
#backtrace:
#x
#+-`%>%`(...)
#| +-base::withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
#| \-base::eval(quote(`_fseq`(`_lhs`)), env, env)
#|   \-base::eval(quote(`_fseq`(`_lhs`)), env, env)
#|     \-global::`_fseq`(`_lhs`)
#|       \-magrittr::freduce(value, `_function_list`)
#|         \-function_list[[i]](value)
#|           \-phylostratr::strata_besthits(.)
#|             \-base::lapply(taxa, get_besthit, strata = strata)
#|               \-phylostratr:::FUN(X[[i]], ...)
#|                 \-`%>%`(...)
#|                   +-base::withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
#|                   \-base::eval(quote(`_fseq`(`_lhs`)), env, env)
#|                     \-base::eval(quote(`_fseq`(`_lhs`)), env, env)
#|                       \-phylostratr:::`_fseq`(`_lhs`)
#|                         \-magrittr::freduce(value, `_function_list`)
#|                           +-base::withVisible(function_list[[k]](value))
#|                           \-function_list[[k]](value)
#|                             \-phylostratr::get_max_hit(.)
#|                               \-`%>%`(...)
#|                                 +-base::withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
#|                                 \-base::eval(quote(`_fseq`(`_lhs`)), env, env)
#|                                   \-base::eval(quote(`_fseq`(`_lhs`)), env, env)
#|                                     \-phylostratr:::`_fseq`(`_lhs`)
#|                                       \-magrittr::freduce(value, `_function_list`)
#|                                         \-function_list[[i]](value)
#|                                           +-dplyr::group_by(., .data$qseqid, .data$staxid)
#|                                           \-dplyr:::group_by.data.frame(., .data$qseqid, .data$staxid)
#|                                             \-dplyr::group_by_prepare(.data, ..., add = add)
#|                                               \-dplyr:::add_computed_columns(.data, new_groups)
#|                                                 +-dplyr::mutate(.data, !!!mutate_vars)
#|                                                 \-dplyr:::mutate.tbl_df(.data, !!!mutate_vars)
#|                                                   \-dplyr:::mutate_impl(.data, dots)
#+-base::tryCatch(...)
#| \-base:::tryCatchList(expr, classes, parentenv, handlers)
#|   +-base:::tryCatchOne(...)
#|   | \-base:::doTryCatch(return(expr), name, parentenv, handler)
#|   \-base:::tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
#|     \-base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
#|       \-base:::doTryCatch(return(expr), name, parentenv, handler)
#+-base::evalq(.data$staxid, <environment>)
#| \-base::evalq(.data$staxid, <environment>)
#|   +-staxid
#|   \-rlang:::`$.rlang_data_pronoun`(.data, staxid)
#|     \-rlang:::data_pronoun_get(x, nm)
#\-rlang:::abort_data_pronoun(x)

Thanks,

Rafael

@arendsee
Owner

Thanks for the report, @Piergiorge!

This new commit should fix the problem. Previously, phylostratr required you to add a `staxid` column to your BLAST files. This was not really necessary, since phylostratr already knows the taxonomy IDs and should be able to generate the column for you. Now it can.
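For anyone stuck on an older version, here is a minimal sketch of what adding the column by hand might look like. It is not phylostratr's actual implementation: the `add_staxid` helper, the toy table, the `score` column name, and the taxonomy ID used are all assumptions made for illustration. The only things taken from the backtrace above are the `qseqid` and `staxid` column names that the best-hit step groups on.

``` R
# Toy BLAST result table for a single target species (made-up values).
hits <- data.frame(
  qseqid = c("P1", "P1", "P2"),
  sseqid = c("A", "B", "C"),
  score  = c(50, 120, 80),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a phylostratr function): every hit in a
# per-species result file shares the same subject taxonomy ID, so the
# column can be filled in from the known ID.
add_staxid <- function(hits, taxid) {
  hits$staxid <- rep(as.character(taxid), nrow(hits))
  hits
}

# "4932" is only a placeholder taxonomy ID for this example.
hits <- add_staxid(hits, "4932")

# With staxid present, the best score per query and subject species
# can be computed by grouping on both columns.
best <- aggregate(score ~ qseqid + staxid, data = hits, FUN = max)
print(best)
```

The grouping in the last step mirrors the `group_by(qseqid, staxid)` call where the original error was raised, which is why a missing `staxid` column stopped the pipeline.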

I've also added a bunch of new tests and runtime assertions, so you should get more intuitive error messages if the BLAST files are not formatted the way phylostratr wants them.
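As a rough idea of what such a runtime check can look like, here is a small sketch. The `check_blast_columns` helper and its list of required columns are hypothetical and are not taken from phylostratr's source; `score` in particular is an assumed column name.

``` R
# Hypothetical helper (not part of phylostratr): fail early with a clear
# message if a BLAST table lacks columns that later steps rely on.
check_blast_columns <- function(hits,
                                required = c("qseqid", "staxid", "score")) {
  missing <- setdiff(required, names(hits))
  if (length(missing) > 0) {
    stop("BLAST table is missing column(s): ",
         paste(missing, collapse = ", "))
  }
  invisible(hits)
}

# A table without staxid now produces a direct message instead of a
# deep dplyr backtrace.
bad <- data.frame(qseqid = "P1", score = 50)
msg <- tryCatch(
  check_blast_columns(bad),
  error = function(e) conditionMessage(e)
)
print(msg)
```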

If this problem isn't fixed, feel free to reopen the issue.

Hope this helps!

@arendsee
Owner

@Piergiorge btw, congrats on your first GitHub issue report! Just a little style note: you can paste syntax-highlighted code blocks like this:

``` R
foo <- function(x) x
```

which will render nicely, like so:

foo <- function(x) x
