Wordscores predicted values become NA #1380

@koheiw

Describe the bug

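Wordscores fails to predict scores for some documents when reference scores are supplied for only a few of them: the fitted values are NaN for several documents, and every standard error is NaN.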

Reproducible code

ws <- textmodel_wordscores(data_dfm_lbgexample, c(NA, 1.5, 1.0, NA, NA, NA))
predict(ws, newdata = data_dfm_lbgexample, se.fit = TRUE)

Its output

$fit
      R1       R2       R3       R4       R5       V1 
     NaN 1.385582 1.114418      NaN      NaN 1.279662 

$se.fit
[1] NaN NaN NaN NaN NaN NaN

Expected behavior

Predict scores for all the documents

System information

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: KDE neon User Edition 5.12

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] testthat_2.0.0 quanteda_1.3.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.17       lubridate_1.7.4    lattice_0.20-35    R6_2.2.2           grid_3.4.4         plyr_1.8.4         gtable_0.2.0       magrittr_1.5       scales_0.5.0      
[10] RcppParallel_4.4.0 ggplot2_2.2.1      pillar_1.2.3       stringi_1.2.3      rlang_0.2.1        lazyeval_0.2.1     data.table_1.11.4  Matrix_1.2-14      stopwords_0.9.0   
[19] fastmatch_1.1-0    tools_3.4.4        stringr_1.3.1      munsell_0.5.0      yaml_2.1.19        spacyr_0.9.9       compiler_3.4.4     colorspace_1.3-2   tibble_1.4.2      

Additional info

NaN values in the word scores (ws$wordscores) seem to be the cause of the problem:

> ws$wordscores
       A        B        C        D        E        F        G        H        I        J        K        L        M        N        O        P        Q        R        S 
     NaN      NaN      NaN      NaN      NaN 1.500000 1.500000 1.500000 1.500000 1.500000 1.487500 1.487288 1.467949 1.438889 1.382199 1.297927 1.202073 1.117801 1.061111 
       T        U        V        W        X        Y        Z       ZA       ZB       ZC       ZD       ZE       ZF       ZG       ZH       ZI       ZJ       ZK 
1.032051 1.012712 1.012500 1.000000 1.000000 1.000000 1.000000 1.000000      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN 
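
To illustrate how undefined word scores could propagate into the predictions, here is a minimal base R sketch. It is not quanteda's implementation. It assumes that a document score is a frequency-weighted mean of word scores, and the feature names, counts, and scores are made-up toy values. In the output above, the NaN entries appear to belong to features that occur in neither scored reference document (R2 and R3), but that reading is an inference from the printed scores.

```r
# Toy word scores (hypothetical): feature "A" has no defined score,
# mirroring the NaN entries in ws$wordscores.
scores <- c(A = NaN, B = 1.5, C = 1.0)

# Toy feature counts for one document to be scored (hypothetical).
counts <- c(A = 2, B = 3, C = 1)

# Naive frequency-weighted mean: a single NaN score makes the sum NaN.
naive <- sum(counts * scores) / sum(counts)

# Same mean restricted to features that have a finite score.
scored <- is.finite(scores)
fixed <- sum(counts[scored] * scores[scored]) / sum(counts[scored])

print(naive)  # NaN
print(fixed)  # 1.375
```

Under this assumption, dropping or ignoring the unscored features before averaging would yield a finite score for every document that contains at least one scored feature. Whether that is the right fix for predict() is a separate question.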
