Confusing colors on the textplot_keyness plot. #1233

Closed
brousseauj opened this issue Feb 13, 2018 · 4 comments


@brousseauj

When plotting keyness, the colors used are red and blue, with blue being the target. The target bars are blue, but their text labels are red. Why the difference? Also, can we pass ggplot arguments to textplot_keyness to alter the style?

[image: screenshot of the keyness plot]

@kbenoit
Collaborator

kbenoit commented Feb 13, 2018

Thanks for the feedback. @koheiw and I are in the process of fixing this plot, and will add the issue about the words to #1211. I think they should be black by default. Also, when there are just two classes, we will have the legend label read the name of the reference document, rather than the generic "Reference".

Yes, you should be able to change the plot using ggplot2 + additions.
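
A minimal sketch of what such additions might look like, assuming `textplot_keyness()` returns an ordinary ggplot object and a 2018-era quanteda release in which `dfm()` accepts a corpus directly (in later releases the keyness functions may live in separate packages, so treat the `library()` calls as an assumption). The title text and legend position are illustrative choices only, not package defaults.

```r
library(quanteda)  # assumption: 2018-era release where dfm() accepts a corpus
library(ggplot2)

# Build the keyness statistics for the two-president subset used in this thread
inaug <- corpus_subset(data_corpus_inaugural, President %in% c("Obama", "Trump"))
inaug_dfm <- dfm(inaug, remove = stopwords("english"), remove_punct = TRUE)
key <- textstat_keyness(inaug_dfm, target = "2017-Trump")

# Keep the plot object so ggplot2 layers can be added to it
p <- textplot_keyness(key)

# Illustrative additions: a title and a repositioned legend
p +
  labs(title = "Keyness: Trump (2017) vs. Obama") +
  theme(legend.position = "bottom")
```

Adding a second colour scale this way would replace the one the plot already defines, so restyling the bars is probably better left to the function's own arguments where they exist.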

@brousseauj
Author

Awesome! Off topic, thank you for building this package. It's truly powerful and has been the basis for building out my company's text analytics platform! I will be posting all the code on my GitHub!

@kbenoit
Collaborator

kbenoit commented Feb 13, 2018

Thanks! I look forward to seeing that. Feel free to leave a testimonial at #461.

@kbenoit
Collaborator

kbenoit commented Feb 21, 2018

Two issues we have now discovered, in the course of investigating this:

  1. Colors for text labels can be flipped, as in the OP example, or as in:
corpus_subset(data_corpus_inaugural, President %in% c("Obama", "Trump")) %>%
    dfm(remove = stopwords("english"), remove_punct = TRUE) %>%
    textstat_keyness(target = "2017-Trump") %>% 
    textplot_keyness() 
  2. For a dfm with just two documents, we should label the reference document as the doc name in the legend, not the more generic "Reference". (The previous code illustrates this problem.)

  3. In the legend, the target group should be first. It is currently alphabetical.

corpus_subset(data_corpus_inaugural, President %in% c("Obama", "Trump")) %>%
    dfm(remove = stopwords("english"), remove_punct = TRUE, groups = "President") %>%
    textstat_keyness(target = "Trump") %>% 
    textplot_keyness() 
