Create standard texts and a dictionary for examples #592
Comments
Would be a nice addition to include a |
I agree. The
library(tidytext)
tidytext::sentiments
# > tidytext::sentiments
# # A tibble: 27,314 x 4
#    word        sentiment lexicon score
#    <chr>       <chr>     <chr>   <int>
#  1 abacus      trust     nrc        NA
#  2 abandon     fear      nrc        NA
#  3 abandon     negative  nrc        NA
#  4 abandon     sadness   nrc        NA
#  5 abandoned   anger     nrc        NA
#  6 abandoned   fear      nrc        NA
#  7 abandoned   negative  nrc        NA
#  8 abandoned   sadness   nrc        NA
#  9 abandonment anger     nrc        NA
# 10 abandonment fear      nrc        NA
# # ... with 27,304 more rows
table(sentiments$sentiment)
# > table(sentiments$sentiment)
#        anger anticipation constraining      disgust
#         1247          839          184         1058
#         fear          joy    litigious     negative
#         1476          689          903        10461
#     positive      sadness  superfluous     surprise
#         4672         1191           56          534
#        trust  uncertainty
#         1231          297 |
I have asked them, but have not received a reply yet. I will try again. |
Have Lexicoder now, just need some texts. Why not use the inaugural texts? |
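For illustration, a minimal sketch of what such an example could look like, assuming a recent quanteda release in which data_dictionary_LSD2015 and data_corpus_inaugural ship with the package. The object names toks, dict and dfmat are placeholders chosen here, not an agreed convention.

```r
library(quanteda)

# Tokenize the five most recent inaugural addresses bundled with quanteda
toks <- tokens(tail(data_corpus_inaugural, 5), remove_punct = TRUE)

# Keep only the first two Lexicoder keys (negative and positive)
dict <- data_dictionary_LSD2015[1:2]

# Replace tokens by their dictionary key and count matches per document
dfmat <- dfm(tokens_lookup(toks, dictionary = dict))
print(dfmat)
```

This adds no new data to the package, so it would not affect the package size on CRAN.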
Yes, that's probably the easiest way. We could also consider adding the positive and negative movie reviews from the readtext package. With this corpus we could show to what degree the estimated sentiment is in line with the movie evaluation. But adding more text would increase the size of the package, which might be problematic for CRAN. |
I thought a bit more about it, and some form of a sentiment corpus might be useful for the manuals of several functions:
We could add
|
That's a good idea, but we'd want to include the whole set of movie reviews. I thought of this before, but moved the n=2000 Pang and Lee set to quantedaData because it was too large to distribute with the package. Once we pare down the package for v1.0 we could revisit this, however.
|
Sounds good, @kbenoit – and yes, a large set of movie reviews would be even better if possible. Once |
We have the |
This is from dfm_select's example. Texts and dictionaries should be more meaningful in examples (these look almost like unit tests). I like the examples in dfm, but they are all about taxes. Better to have more place names.
Also, we should have a naming rule for examples: dfm, mydfm or mat? texts, mytexts, txts? Standard texts and consistent naming will make the manual easier to understand.
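As a rough sketch of what a more meaningful dfm_select example with place names and one consistent naming scheme might look like: the texts, the dictionary and the names txt, dict, toks and dfmat below are all hypothetical, and the pattern argument assumes a recent quanteda version.

```r
library(quanteda)

# Hypothetical place-name texts instead of tax-themed ones
txt <- c(d1 = "London and Paris are capital cities in Europe.",
         d2 = "Tokyo is the capital of Japan, and Kyoto was before it.")

# Hypothetical dictionary grouping the place names by region
dict <- dictionary(list(europe = c("london", "paris"),
                        asia = c("tokyo", "kyoto")))

# One naming scheme throughout: txt -> toks -> dfmat
toks <- tokens(txt, remove_punct = TRUE)
dfmat <- dfm(toks)

# Keep only the features that match the dictionary values
dfmat_places <- dfm_select(dfmat, pattern = dict)
print(featnames(dfmat_places))
```

Using the same short prefixes for the same object types in every example would let readers tell at a glance whether a variable is raw text, tokens or a dfm.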