dfm_lookup error: intI(j, n = x@Dim[2], dn[[2]], give.dn = FALSE) : invalid character indexing #946

pbradl42 · 2017-09-06T16:40:17Z

I found something odd: dfm_lookup fails with the above error in certain cases when exclusive=FALSE. Here's a minimal working example. I don't know why "featured_story_content_h2" matters, but it appears to.

I'm using quanteda 0.99.

 library(quanteda)

 testCorpus <- corpus(c("featured_story_content_h2", "aaaaa", "bbbbb", "ccccc"))
 # testCorpus <- corpus(c("aaaaa", "bbbbb", "ccccc")) # Works
 # controlDict matches no feature in the corpus; testDict matches "aaaaa"
 controlDict <- dictionary(list(foo = c("xxxxx"), bar = c("yyyyy", "zzzzz")))
 testDict <- dictionary(list(foo = c("aaaaa"), bar = c("yyyyy", "zzzzz")))
 myDFM <- dfm(testCorpus, tolower = TRUE,
              remove_numbers = TRUE, remove_punct = TRUE, remove_separators = TRUE,
              remove_twitter = FALSE, stem = FALSE, ngrams = 1:2)

 dfm_lookup(myDFM, dictionary = controlDict, exclusive = FALSE) # Succeeds
 dfm_lookup(myDFM, dictionary = testDict, exclusive = TRUE) # Succeeds
 dfm_lookup(myDFM, dictionary = testDict, exclusive = FALSE) # Fails
 # Error in intI(j, n = x@Dim[2], dn[[2]], give.dn = FALSE) :
 #   invalid character indexing

Any ideas?

pbradl42 · 2017-09-06T19:01:17Z

Found another phrase that causes the same error in dfm_lookup: "features_archive20172016201520142013media" -- Could this be because of 'feature' in these phrases?

pbradl42 · 2017-09-06T21:05:32Z

Yup - if you run the DFM through

dfm_remove(myDFM, c('feature*')) -> myDFM

first, dfm_lookup will complete successfully.

kbenoit added a commit that referenced this issue Sep 7, 2017

Fix cbind.dfm problem caused by feature names starting with `quanteda_options("base_featname")` - Fixes #946 - affected `dfm_select()` - affected `dfm_lookup()`

77d4cd4

kbenoit mentioned this issue Sep 7, 2017

Fix cbind.dfm problem caused by feature names starting with "feat" #948

Merged

kbenoit closed this as completed in #948 Sep 7, 2017
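
For anyone stuck on an affected version, the workaround and the fix title above suggest a quick screening step. The following is only a sketch in base R, assuming the trigger is the prefix "feat" named in the title of #948. The feature names are hypothetical, modelled on the two phrases reported in this thread, and "feat_x" is an invented name added for illustration. It shows that the 'feature*' pattern used in the workaround is narrower than the "feat" prefix, so some names could slip past it.

```r
# Hypothetical feature names modelled on the examples in this thread; in a
# real session they would come from quanteda's featnames(myDFM).
feats <- c("featured_story_content_h2", "aaaaa", "bbbbb", "ccccc",
           "features_archive20172016201520142013media", "feat_x")

# Prefix named in the title of #948 (assumed here to be the trigger).
prefix <- "feat"

# Features whose names start with the prefix.
at_risk <- feats[startsWith(feats, prefix)]

# Features the 'feature*' workaround would remove (glob converted to a regex).
removed <- feats[grepl(utils::glob2rx("feature*"), feats)]

# Names that start with the prefix but slip past the narrower glob.
missed <- setdiff(at_risk, removed)
print(missed)
```

If that assumption holds, a pattern such as 'feat*' would be the safer choice for the dfm_remove workaround until the fix in #948 is installed.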
