### `Exercise`
# Word clouds
Word clouds are often overused, but for good reason: they are a simple and intuitive way to visualize text data.

A common convention is to map the size of each word to how often it appears in the text. In sentiment analysis, you can also encode each word's sentiment with a diverging color palette (e.g. green for positive and red for negative).
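As a rough sketch of this convention, the snippet below builds the two encodings for a small made-up data frame. The `toy` data, its `word` and `score` columns, and the -5 to 5 score range are assumptions chosen for illustration. It also assumes the `RColorBrewer` package is installed.

```r
library(RColorBrewer)

# Hypothetical toy data: one row per word occurrence, with an AFINN-style score
toy <- data.frame(
  word  = c("love", "love", "love", "hate", "hate", "happy"),
  score = c(3, 3, 3, -3, -3, 2)
)

# Size encoding: how often each word appears (names sorted alphabetically)
freq <- table(toy$word)

# Color encoding: one score per word, in the same alphabetical order as freq
score <- tapply(toy$score, toy$word, mean)

# 11 diverging colors: index 1 is the strongest red, index 11 the strongest blue
rdbu <- brewer.pal(11, "RdBu")

# Shift scores from -5..5 to palette positions 1..11
word_summary <- data.frame(
  word  = names(freq),
  n     = as.integer(freq),
  color = rdbu[round(score) + 6]
)

print(word_summary)
```

Each row of `word_summary` now carries a size (`n`) and a color, which is the kind of per-word input a word cloud function needs.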

Not only do you see which words are the most prominent in a corpus of text, but you also get an idea of their sentiment.
In this exercise, we will create a word cloud using the `wordcloud` package.

The `songs_sentiment` dataframe is loaded into your workspace.

### `Instructions`

* Load the `wordcloud` package.
* Create a `wordcloud()` using the columns `word` and `n`.
* Set the maximum number of words to 500.

### `Workspace`

The workspace should contain a dataframe called `songs_sentiment`.
Students created this dataframe in chapter 3 of the course, so it should be familiar by now.

In [54]:
# Preloaded packages
library(tidyverse)
library(tidytext)

# This is what the dataset that students built in chapter 3 should look like.
df <- read_csv('datasets/songs_sentiment.csv')

# This is a file I made a long time ago for a pet project. 
# For this pet project, it was important to keep rows with and without sentiment.
# Also... I know about the spaces in the column names. I will change them to snake case.
songs_sentiment <- df %>%
    filter(!is.na(`sentiment afinn`))

glimpse(songs_sentiment)

Parsed with column specification:
cols(
  rank = col_integer(),
  song = col_character(),
  artist = col_character(),
  year = col_integer(),
  `song ID` = col_integer(),
  word = col_character(),
  `sentiment afinn` = col_integer(),
  `sentiment bing` = col_character(),
  `sentiment nrc` = col_character(),
  `sentiment loughran` = col_character()
)
1524 parsing failures.
# A tibble: 5 x 5
     row col     expected               actual file
   <int> <chr>   <chr>                  <chr>  <chr>
1 247715 song ID no trailing characters e3     'datasets/songs_sentiment.csv'
2 247716 song ID no trailing characters e3     'datasets/songs_sentiment.csv'
3 247717 song ID no trailing characters e3     'datasets/songs_sentiment.csv'
4 247718 song ID no trailing characters e3     'datasets/songs_sentiment.csv'
5 247719 song ID no trailing characters e3     'datasets/songs_sentiment.csv'
...

Observations: 277,218
Variables: 10
$ rank                 <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
$ song                 <chr> "wooly bully", "wooly bully", "wooly bully", "...
$ artist               <chr> "sam the sham and the pharaohs", "sam the sham...
$ year                 <int> 1965, 1965, 1965, 1965, 1965, 1965, 1965, 1965...
$ `song ID`            <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
$ word                 <chr> "bully", "bully", "bully", "bully", "bully", "...
$ `sentiment afinn`    <int> -2, -2, -2, -2, -2, -2, 1, -2, -2, -2, -2, -2,...
$ `sentiment bing`     <chr> "negative", "negative", "negative", "negative"...
$ `sentiment nrc`      <chr> "anger", "fear", "negative", "anger", "fear", ...
$ `sentiment loughran` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...


### `Coding Exercise`

This is the exercise the students have to do.

In [55]:
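library(...)

# Count each word, keeping its AFINN score next to the count
# (assumes one AFINN score per word, so each word stays on a single row)
word_counts <- songs_sentiment %>%
    count(word, `sentiment afinn`)

# Map each AFINN score (-5 to 5) to one of the 11 RdBu colors; RdBu is color-blind friendly
color_sentiment <- brewer.pal(11, "RdBu")[factor(word_counts$`sentiment afinn`, levels = -5:5)]

word_counts %>%
    # Use with() so wordcloud() can refer to the columns by name
    with(
      wordcloud(
          # Column containing the words
          ...
          # Size of the words
          , ...
          # Maximum number of words
          , max.words = ...
          # One color per word, in the same order as the words
          , colors = color_sentiment
          , ordered.colors = TRUE
      )
    )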

ERROR: Error in eval(expr, envir, enclos): '...' used in an incorrect context


### `Answer`

This is what the final code will look like.

In [None]:
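library(wordcloud)

# Count each word, keeping its AFINN score next to the count
# (assumes one AFINN score per word, so each word stays on a single row)
word_counts <- songs_sentiment %>%
    count(word, `sentiment afinn`)

# Map each AFINN score (-5 to 5) to one of the 11 RdBu colors; RdBu is color-blind friendly
color_sentiment <- brewer.pal(11, "RdBu")[factor(word_counts$`sentiment afinn`, levels = -5:5)]

word_counts %>%
    # Use with() so wordcloud() can refer to the columns by name
    with(
      wordcloud(
          # Column containing the words
          word
          # Size of the words
          , n
          # Maximum number of words
          , max.words = 500
          # One color per word, in the same order as the words
          , colors = color_sentiment
          , ordered.colors = TRUE
      )
    )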