dfm Error in qatd_cpp_tokens_replace #1765

Closed
almogsi opened this issue Oct 29, 2019 · 0 comments
almogsi commented Oct 29, 2019

Describe the bug

quanteda::dfm() gets stuck on certain tokenized tweets. I'm getting:

"Error in qatd_cpp_tokens_replace(x, type, ids_pat, ids_repl) :
Not compatible with requested type: [type=NULL; target=double]."

Reproducible code

library(quanteda)
stucked_tweet <- "@POTUS Lol too funny yes we're all tired of Jim #Acosta too President Trump Gets Angry At Jim Acosta of CNN… https://t.co/gzUw4PF4s8"
# Lowercase and tokenize, dropping numbers, punctuation, and URLs
t <- tokens(tolower(stucked_tweet), remove_numbers = TRUE, remove_punct = TRUE, remove_url = TRUE)
# Drop tokens that start with "@"
tt <- tokens_remove(t, pattern = "^@\\b", valuetype = "regex")
# This call raises the error
t.dfm <- dfm(tt, remove_twitter = TRUE)

System information


sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] quanteda_1.5.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.2         rstudioapi_0.10    magrittr_1.5       stopwords_1.0      tidyselect_0.2.5   munsell_0.5.0     
 [7] colorspace_1.4-1   lattice_0.20-38    R6_2.4.0           rlang_0.4.1        fastmatch_1.1-0    stringr_1.4.0     
[13] dplyr_0.8.3        tools_3.6.1        grid_3.6.1         data.table_1.12.6  gtable_0.3.0       spacyr_1.2        
[19] RcppParallel_4.4.4 lazyeval_0.2.2     assertthat_0.2.1   tibble_2.1.3       crayon_1.3.4       Matrix_1.2-17     
[25] purrr_0.3.3        ggplot2_3.2.1      rsconnect_0.8.15   glue_1.3.1         stringi_1.4.3      compiler_3.6.1    
[31] pillar_1.4.2       scales_1.0.0       lubridate_1.7.4    pkgconfig_2.0.3  

Any ideas? Thanks in advance!

kbenoit added this to the Last 1.5.x release milestone Oct 31, 2019
koheiw self-assigned this Nov 1, 2019
koheiw added the bug label Nov 1, 2019
koheiw added a commit that referenced this issue Nov 1, 2019
koheiw mentioned this issue Nov 1, 2019
kbenoit closed this as completed Nov 22, 2019