- The tweet/text data is imported first.
- The imported data is cleaned with the R function gsub(), which replaces every match of a pattern in a string. If the pattern is not found, the string is returned unchanged.
- After cleaning, the sentiment of each tweet is calculated with the R function get_nrc_sentiment() from the syuzhet package, which scores the presence of eight emotions and their corresponding valence (positive or negative) in the tweet/text data. The scores are then attached to their corresponding tweets.
- Once the sentiments are calculated, I identified the most positive and the most negative tweets in the data set.
- A pie chart is generated to show the distribution and number of positive, negative, and neutral tweets in the data set.
- After creating the pie chart, I created a bar chart to visualize the eight emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and the two sentiments (negative and positive) present in the data set.
- After calculating the sentiments and emotions, I created a term-document matrix with the R function TermDocumentMatrix() to count the occurrences of each word and identify popular or trending topics. This step produces a table of word frequencies.
- Using the term-document matrix, I created a word cloud, one of the most popular ways to visualize and analyze qualitative data. It is an image composed of the keywords found in a body of text, where the size of each word indicates its frequency in that text. The word cloud is generated with the R function wordcloud().
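
The steps above can be sketched in R roughly as follows. This is a minimal sketch under stated assumptions, not the exact script used for this analysis. The sample tweets, the helper names clean_tweets() and polarity(), the specific gsub() patterns, and the rule for labelling a tweet positive, negative, or neutral are all illustrative assumptions. It also assumes that TermDocumentMatrix() comes from the tm package and wordcloud() from the wordcloud package.

```r
# Hypothetical sample standing in for the imported tweet/text data.
tweets <- c(
  "RT @user: I love this vaccine! https://example.com #grateful",
  "Terrible side effects, I am scared and angry",
  "The clinic opens at 9"
)

# Cleaning: each gsub() call replaces every match of its pattern and
# returns strings without a match unchanged. The patterns are assumptions.
clean_tweets <- function(x) {
  x <- gsub("http\\S+", "", x)        # drop URLs
  x <- gsub("^RT\\s+", "", x)         # drop a leading retweet marker
  x <- gsub("[@#]\\w+", "", x)        # drop mentions and hashtags
  x <- gsub("[^A-Za-z ]", " ", x)     # replace anything but letters and spaces
  x <- gsub("\\s+", " ", x)           # collapse repeated whitespace
  trimws(tolower(x))
}

# Assumed labelling rule: compare the NRC positive and negative counts.
polarity <- function(positive, negative) {
  ifelse(positive > negative, "positive",
         ifelse(negative > positive, "negative", "neutral"))
}

cleaned <- clean_tweets(tweets)

# The remaining steps need add-on packages, so they run only if installed.
pkgs <- c("syuzhet", "tm", "wordcloud")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  # Eight emotions plus negative/positive, one row per tweet.
  emotions <- syuzhet::get_nrc_sentiment(cleaned)
  scored <- data.frame(tweet = cleaned, emotions, stringsAsFactors = FALSE)
  scored$polarity <- polarity(scored$positive, scored$negative)

  # Most positive and most negative tweets by NRC count.
  most_positive <- scored$tweet[which.max(scored$positive)]
  most_negative <- scored$tweet[which.max(scored$negative)]

  # Pie chart of polarity labels, bar chart of emotion and sentiment totals.
  pie(table(scored$polarity), main = "Sentiment distribution")
  barplot(colSums(emotions), las = 2, main = "Emotions and sentiments")

  # Term-document matrix -> word frequencies -> word cloud.
  corpus <- tm::VCorpus(tm::VectorSource(cleaned))
  tdm <- tm::TermDocumentMatrix(corpus)
  freq <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)
  wordcloud::wordcloud(words = names(freq), freq = freq,
                       min.freq = 1, random.order = FALSE)
}
```

On real data, stop words would usually be removed before building the matrix so that the word cloud is not dominated by words such as "the" and "and".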
1.1 Sentiment Pie Chart:
1.2 Emotions Bar Chart:
1.3 Word Cloud:
2.1 Sentiment Pie Chart:
2.2 Emotions Bar Chart:
2.3 Word Cloud:
r/VaccineMyths subreddit posts and comments
3.1 Sentiment Pie Chart:
3.2 Emotions Bar Chart:
3.3 Word Cloud:
r/VaccineMyths subreddit posts and comments
4.1 Sentiment Pie Chart:
4.2 Word Cloud:
5.1 Sentiment Pie Chart:
5.2 Word Cloud:
6.1 Sentiment Pie Chart:
6.2 Word Cloud: