textstat_collocations's **collocation** column is a factor rather than a character #736

@trinker

The collocation column returned by textstat_collocations is a factor rather than a character vector. I think users would expect character output, but I may be wrong about the intent here.

MWE:

library(quanteda)

txt <- c("This is software testing: looking for (word) pairs the dog! peas and carrots  
         This [is] a software testing again. For. the dog want the book peas and carrots ",
         "Here: this is more Software Testing, want the book looking again for word pairs. peas and carrots ")

toks <- quanteda::tokens(txt, remove_punct = TRUE)

## does 2 and 3 grams (keep or use old behavior)?
ngrams <- quanteda::textstat_collocations(toks, method = 'dice')

class(ngrams$collocation)
[1] "factor"
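
In the meantime, a possible workaround is to convert the column after the call. The sketch below uses a small hand-built data frame as a stand-in for the textstat_collocations() result (the rows and counts are made up for illustration), so it runs without quanteda:

```r
# Hypothetical stand-in for the textstat_collocations() output;
# the values here are invented and only mimic the factor column.
ngrams <- data.frame(
  collocation = c("software testing", "peas and carrots"),
  count = c(3L, 3L),
  stringsAsFactors = TRUE
)

# Convert the factor column to character in place.
ngrams$collocation <- as.character(ngrams$collocation)

class(ngrams$collocation)
```

The same as.character() call should apply to the real result, assuming it behaves like an ordinary data frame with a factor column.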
