Converting a dfm to a data.frame using 'convert' creates an issue if one of the features in your dfm is also called 'document'.
Perhaps rename the document column to something more likely to be unique, like "doc_id", in line with the rOpenSci Text Interchange Format.
Reproducible code
library(magrittr)
library(quanteda)
# convert dfm to data.frame
dfm_df <- dfm(c("this is a fine document")) %>% convert(to = "data.frame")
# fix
names(dfm_df)[1] <- "doc_id"
Expected behavior
If the resulting data.frame has two 'document' columns, R throws an rlang error as soon as you try to use that column, for example:
Call `rlang::last_error()` to see a backtrace.
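To make the collision concrete without depending on a particular quanteda version, here is a minimal base R sketch. It builds a data.frame by hand with two columns named `document`, which is an assumption standing in for the output of `convert()` on the dfm above. It only shows why the duplicate name is a problem; it does not reproduce the exact rlang error.

```r
# Hand-built stand-in for the converted dfm: the first "document" column
# plays the role of the document id, the last one is the feature count.
df <- data.frame(document = "text1", this = 1, document = 1,
                 check.names = FALSE, stringsAsFactors = FALSE)

names(df)
# [1] "document" "this"     "document"

# Index of the first duplicated name; non-zero means there is a collision.
dup_index <- anyDuplicated(names(df))

# `$` returns only the first matching column, so the feature count in the
# second "document" column cannot be reached by name.
first_match <- df$document
```

Base R tolerates the duplicate but hides one column; packages that require unique column names are where the error surfaces.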
## System information
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)
- Adds a `docid_field = "doc_id"` as the default to `convert(x, to = "data.frame")`
- Checks for collisions with the `docid_field` and a named feature
- Re-implements the deprecated `as.data.frame.dfm()` to use the same internal function as `convert(x, to = "data.frame")`
- Updates tests
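
The behaviour described in this list can be sketched roughly as follows. This is not quanteda's internal function: `to_data_frame_sketch()` is a hypothetical helper that works on a plain matrix with dimnames, and the error wording is made up for illustration. Only the `docid_field = "doc_id"` default and the collision check come from the list above.

```r
# Hypothetical helper, NOT the quanteda implementation: converts a
# document-by-feature matrix to a data.frame with an id column.
to_data_frame_sketch <- function(x, docid_field = "doc_id") {
  stopifnot(is.matrix(x), !is.null(rownames(x)), !is.null(colnames(x)))
  # Stop early if the id column name is already used by a feature.
  if (docid_field %in% colnames(x)) {
    stop("'", docid_field, "' is also a feature name; ",
         "choose a different docid_field")
  }
  out <- data.frame(rownames(x), x,
                    check.names = FALSE, stringsAsFactors = FALSE)
  names(out)[1] <- docid_field
  rownames(out) <- NULL
  out
}

# One document, with a feature that happens to be called "document".
counts <- matrix(c(1, 1, 1), nrow = 1,
                 dimnames = list("text1", c("this", "fine", "document")))

ok <- to_data_frame_sketch(counts)  # id column is "doc_id", no clash

# Asking for "document" as the id column now fails instead of silently
# producing duplicate column names.
clash <- tryCatch(to_data_frame_sketch(counts, docid_field = "document"),
                  error = function(e) e)
```

Failing at conversion time puts the error next to its cause, rather than in whatever downstream call first touches the duplicated column.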
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] quanteda_2.0.1 magrittr_1.5
loaded via a namespace (and not attached):
[1] Rcpp_1.0.3 rstudioapi_0.10 stopwords_1.0 tidyselect_0.2.5
[5] munsell_0.5.0 colorspace_1.4-1 lattice_0.20-38 R6_2.4.1
[9] rlang_0.4.1 fastmatch_1.1-0 dplyr_0.8.3 tools_3.6.1
[13] grid_3.6.1 data.table_1.12.8 gtable_0.3.0 lazyeval_0.2.2
[17] RcppParallel_5.0.0 assertthat_0.2.1 tibble_2.1.3 lifecycle_0.1.0
[21] crayon_1.3.4 Matrix_1.2-17 purrr_0.3.3 ggplot2_3.2.1
[25] glue_1.3.1 stringi_1.4.3 compiler_3.6.1 pillar_1.4.2
[29] scales_1.1.0 pkgconfig_2.0.3