# a5 - Twitter Sentiment

In this assignment you will write a program to perform simple [sentiment analysis](https://en.wikipedia.org/wiki/Sentiment_analysis) of Twitter data&mdash;that is, determining the "attitude" or "emotion" (e.g., how "positive", "negative", "joyful", etc.) of tweets made by a particular Twitter user. Sentiment analysis is a fascinating field: researchers have shown that the "mood" of Twitter communication [reflects biological rhythms](http://www.nytimes.com/2011/09/30/science/30twitter.html) and can even be used to [predict the stock market](http://arxiv.org/pdf/1010.3003&embedded=true). The particular analysis you'll be performing is inspired by an investigation of [personal vs. organizational tweets](http://varianceexplained.org/r/trump-tweets/) (which has become less amusing over time).

You will be implementing a Python program that performs this analysis on **real data** taken directly from a Twitter user's timeline. In the end, your script will produce output similar to the following:

```
EMOTION       % WORDS  EXAMPLE WORDS                     HASHTAGS
positive      6.16%    learn, faculty, happy             #accesstoinfoday, #indigenouspeoplesday, #idealistfair
trust         3.08%    school, faculty, happy            #indigenouspeoplesday, #diversity
anticipation  2.53%    happy, top, ready                 #indigenouspeoplesday, #informatics, #info340
joy           1.76%    happy, peace, deal                #indigenouspeoplesday, #accesstoinfoday
surprise      0.99%    deal, award, surprised            #suzzallolibrary, #nobrainer
negative      0.88%    fall, rejection, outstanding        
sadness       0.55%    fall, rejection, problem            
disgust       0.44%    rejection, weird, finally           
fear          0.44%    rejection, surprise, problem        
anger         0.33%    rejection, disaster, involvement  #mlis
```

Fill in the code cells below as specified. Note that cells may use variables and functions defined in previous cells; we should be able to use the `Kernel > Restart & Clear All` menu item followed by `Cell > Run All` to execute your entire notebook and see the correct output.

## The Data
You'll be working with two different pieces of data for this assignment.

First, you'll be loading tweet data taken directly from [Twitter's API](https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline). You can find an example of this tweet data in the **`uw_ischool_sample.py`** file inside the `data/` folder. The below cell will import this data as a variable `SAMPLE_TWEETS` from the provided _module_ file:

In [1]:
# import from uw_ischool_sample file in the `data/` package (folder)
from data.uw_ischool_sample import SAMPLE_TWEETS

The data is represented as one giant **list of dictionaries**: the **list** contains a sequence of **dictionaries**, where each dictionary represents a tweet. Each dictionary contains many different _values_, some of which may themselves be dictionaries.

Print out the first three elements from the `SAMPLE_TWEETS` list to see what information can be found. The most relevant value is the `"text"` of the tweet.
- The Twitter API actually provides a lot more information about each tweet; I've stripped it down to only the most important properties for readability. Each dictionary is a proper subset of the full data you'd get from Twitter.
- Because of the source of the sentiment data, your analysis will be biased and only support English-language speakers. Nevertheless, Twitter is an international community so you may encounter non-English characters and words. You'll be working with real-world data and it will be messy!

In [2]:
# print out the first three elements from the SAMPLE_TWEETS list
print(SAMPLE_TWEETS[:3])

[{'created_at': 'Mon Oct 10 18:39:51 +0000 2016', 'retweet_count': 9, 'entities': {'hashtags': [{'indices': [20, 41], 'text': 'IndigenousPeoplesDay'}]}, 'user': {'screen_name': 'UW_iSchool'}, 'text': 'RT @UWAPress: Happy #IndigenousPeoplesDay https://t.co/YmU9e9lj7v'}, {'created_at': 'Mon Oct 10 18:00:00 +0000 2016', 'retweet_count': 0, 'entities': {'hashtags': [{'indices': [16, 29], 'text': 'IdealistFair'}]}, 'user': {'screen_name': 'UW_iSchool'}, 'text': "We'll be at the #IdealistFair this evening on the Seattle U. campus. Come and learn about our graduate programs: https://t.co/et1HrQshmr"}, {'created_at': 'Mon Oct 10 15:10:36 +0000 2016', 'retweet_count': 1, 'entities': {'hashtags': []}, 'user': {'screen_name': 'UW_iSchool'}, 'text': 'RT @iYouthUW: iYouth Tips for 1st\xa0Years https://t.co/K4SCIEhJ8k https://t.co/p4lbC6Jb5o'}]
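
The nesting shown in this output can be navigated with ordinary indexing. The following is a minimal sketch that uses one tweet copied from the sample output; the variable names (`tweet`, `text`, `screen_name`, `hashtags`) are illustrative only and not part of the assignment.

```python
# One tweet dictionary, copied from the sample output.
tweet = {
    'created_at': 'Mon Oct 10 18:39:51 +0000 2016',
    'retweet_count': 9,
    'entities': {'hashtags': [{'indices': [20, 41], 'text': 'IndigenousPeoplesDay'}]},
    'user': {'screen_name': 'UW_iSchool'},
    'text': 'RT @UWAPress: Happy #IndigenousPeoplesDay https://t.co/YmU9e9lj7v',
}

# Top-level values are read with a single key.
text = tweet['text']

# Nested values need one key per level of nesting.
screen_name = tweet['user']['screen_name']

# 'hashtags' is a list of dictionaries, so collect the 'text' of each one.
hashtags = [tag['text'] for tag in tweet['entities']['hashtags']]

print(screen_name)  # UW_iSchool
print(hashtags)     # ['IndigenousPeoplesDay']
```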


The second piece of data you'll be working with is a set of **word-sentiments**&mdash;a list of English-language words and what emotions (e.g., "joy", "anger") [are associated with them](http://saifmohammad.com/WebPages/NRC-Emotion-Lexicon.htm).

- The [`nltk`](https://github.com/nltk/nltk/wiki/Sentiment-Analysis) library you used in the last assignment does support sentiment analysis. However, for practice and extendability you'll be doing a more "manual" analysis using the provided data file for this assignment.

`import` the word sentiments as a variable **`SENTIMENTS`** from the **`data.sentiments_nrc`** module. You should also import the `EMOTIONS` variable provided by the same module: this is a _list_ of possible emotions. You can inspect the variables (e.g., print them out) to confirm that you have imported them.

In [3]:
# import SENTIMENTS and EMOTIONS from the provided module
from data.sentiments_nrc import SENTIMENTS, EMOTIONS

# print the variables in SENTIMENTS
print(SENTIMENTS)




In [4]:
# print the variables in EMOTIONS
print(EMOTIONS)

['positive', 'negative', 'anger', 'anticipation', 'disgust', 'fear', 'joy', 'sadness', 'surprise', 'trust']


## Text Sentiment
All of the sentiment analysis is based on the individual _words_ in the text. Thus you will need to determine which words in a tweet have which sentiments.

- Note that the assignment explicitly does _not_ tell you what to name functions, what arguments they should take, or what values they should return: your task is to determine appropriate functions and arguments from the (guided) requirements! Use multiple functions for clarity, give them all informative names, and include a **docstring** in each to explain what it does.

Define a function that takes a tweet's text (a string) and splits it up into a list of individual words.

- To support future assignments, you should **not** use the `nltk` library to tokenize words. Instead, your analysis should split up the text using the [regular expression](https://www.regular-expressions.info/) **`"\W+"`** as the separator (rather than just a blank space). You can do this by using the [re.split()](https://docs.python.org/3/library/re.html#re.split) function (from the `re` module). This separator will cause your splitting to exclude punctuation and provide a reasonable (but not perfect!) list of words to consider.

- All of the words in the sentiment dictionary are _lower case_, so you'll need to **map** your resulting words to be lower case. You will also need to **filter** out any words that have 1 letter or fewer. Use a **list comprehension** to do this.

The string `"Amazingly, I prefer a #rainy day to #sunshine."` should produce a list with 6 lower-case words in it.
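
To see why the filtering step matters, here is a small sketch of what `re.split()` returns for the example string before any lower-casing or filtering is applied. It uses only the standard `re` module; the variable names are illustrative.

```python
import re

text = "Amazingly, I prefer a #rainy day to #sunshine."

# Split on runs of non-word characters (anything other than letters,
# digits, and underscores). A raw string keeps the backslash literal.
pieces = re.split(r"\W+", text)

print(pieces)
# ['Amazingly', 'I', 'prefer', 'a', 'rainy', 'day', 'to', 'sunshine', '']
```

The trailing empty string appears because the text ends with a separator (the period). Dropping every piece with one character or fewer removes that empty string along with `'I'` and `'a'`, which leaves the 6 words expected above.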

In [5]:
# import 're' module
import re

# function that takes as input the tweet's text data (which is a string) and split it up into a list of individual words
def extractwords(text):
    """This function accepts a tweet's text (str) and splits it up into a list of individual words (str)."""
    
    # split the text into individual words (raw string so '\W' is not treated as a string escape)
    individualwords = re.split(r'\W+', text)
    
    # convert the words to lowercase using 'map'
    lowercasewords = list(map(lambda x: x.lower(), individualwords))
    
    # filter out words with length equal to one or lower using list comprehension
    finalwords = [word for word in lowercasewords if len(word) > 1]
    
    # return statement
    return(finalwords)

# test the 'extractwords' function
extractwords('Amazingly, I prefer a #rainy day to #sunshine.')

['amazingly', 'prefer', 'rainy', 'day', 'to', 'sunshine']

Define a function that **filters** a list of words to get only those words that have a specific emotion. Use a **list comprehension** to do this.
- You can determine whether a word has a particular emotion by looking it up in the imported `SENTIMENTS` variable. Use the word as the "key" to find the dictionary of emotions.
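
As a rough illustration of that lookup, the sketch below uses a tiny stand-in lexicon. It assumes that each word maps to a dictionary whose keys are the emotions associated with that word; `TINY_LEXICON`, `has_emotion`, and the stored value `1` are hypothetical and only mimic the shape of `SENTIMENTS`.

```python
# Hypothetical stand-in for SENTIMENTS; the real lexicon is much larger,
# and the value 1 stored for each emotion is an assumption.
TINY_LEXICON = {
    'amazingly': {'joy': 1, 'positive': 1, 'surprise': 1},
    'rainy': {'sadness': 1},
}

def has_emotion(word, emotion, lexicon=TINY_LEXICON):
    """Return True if `word` is in the lexicon and tagged with `emotion`."""
    # .get(word, {}) yields an empty dictionary for unknown words,
    # so the membership test is safe without a separate check.
    return emotion in lexicon.get(word, {})

print(has_emotion('amazingly', 'joy'))  # True
print(has_emotion('rainy', 'joy'))      # False
print(has_emotion('day', 'joy'))        # False
```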

In [6]:
# A function that filters a list of words to get only those words that contain a specific emotion
def emotionfilter(wordlist, emotion):
    """This function accepts a list of the words (str) and an emotion (str) and,
    returns the list of the words to get only those words that contain that specific emotion."""
    
    # list comprehension to filter the list of the words
    finalwords = [word for word in wordlist
                  if SENTIMENTS.get(word, {}).get(emotion) is not None]
    
    # return statement
    return(finalwords)

# test the 'emotionfilter' function
emotionfilter(extractwords('Amazingly, I prefer a #rainy day to #sunshine.'), "positive")

['amazingly', 'prefer', 'sunshine']

Define a function that determines which words from a list have _each_ emotion (i.e., the "emotional" words). For example, the words extracted from `"Amazingly, I prefer a #rainy day to #sunshine."` should produce a dictionary that looks like:

```
{
 'anger': [],
 'anticipation': [],
 'disgust': [],
 'fear': [],
 'joy': ['amazingly', 'sunshine'],
 'negative': [],
 'positive': ['amazingly', 'prefer', 'sunshine'],
 'sadness': ['rainy'],
 'surprise': ['amazingly'],
 'trust': ['prefer']
}
```
    
(Note the empty lists for emotions that have no matching words).
    
- You can use the imported `EMOTIONS` variable to have a list of emotions to iterate through.
- Use the function you defined in the previous step to help you out!
- Using a [dictionary comprehension](https://www.smallsurething.com/list-dict-and-set-comprehensions-by-example/) is a nice way to do this, but is not required.

In [7]:
# A function that determines which words from a list have each emotion (i.e., the "emotional" words)
def emotionwords(wordlist):
    """This function accepts a list of the words (str) and, returns a dictionary with the list of words for each emotion."""
    
    # dictionary comprehension to filter for each emotion    
    finaldict = {k: emotionfilter(wordlist, k) for k in EMOTIONS}
    
    # return statement
    return(finaldict)

# test the 'emotionwords' function
emotionwords(extractwords('Amazingly, I prefer a #rainy day to #sunshine.'))

{'positive': ['amazingly', 'prefer', 'sunshine'],
 'negative': [],
 'anger': [],
 'anticipation': [],
 'disgust': [],
 'fear': [],
 'joy': ['amazingly', 'sunshine'],
 'sadness': ['rainy'],
 'surprise': ['amazingly'],
 'trust': ['prefer']}

Define a function that gets a list of the "most common" words in a list: that is, a new list containing each distinct word in the original list, in descending order by how many times that word appears in the original list.

- You can determine the frequency (number of occurrences) of a word with a similar process to what you did with digits in the last assignment.
- You should use the `sorted()` function to [sort](https://wiki.python.org/moin/HowTo/Sorting) the individual words. This function takes a **`key`** argument which should be passed a [_callback function_](https://wiki.python.org/moin/HowTo/Sorting#Key_Functions) that can return a "transformed" value that you wish to sort by (e.g., which element in a tuple). An anonymous lambda function works well for this.

You can test this function with any list of "words" with repeated entries: `['a','b','c','c','c','a']` for example.
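
As an illustration of the `key` argument, the sketch below sorts `(word, count)` tuples by their count. It builds the counts with `collections.Counter` from the standard library, which is a swapped-in convenience for this example rather than the counting approach the assignment asks you to practice.

```python
from collections import Counter

words = ['a', 'b', 'c', 'c', 'c', 'a']

# Counter maps each distinct word to its number of occurrences.
counts = Counter(words)

# items() gives (word, count) tuples; the lambda picks out the count
# (index 1) as the value to sort by. reverse=True puts the largest first.
pairs = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)

print(pairs)                        # [('c', 3), ('a', 2), ('b', 1)]
print([word for word, _ in pairs])  # ['c', 'a', 'b']
```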


In [8]:
# A function that gets a list of the "most common" words in a list
def listmostcommon(wordlist):
    """This function accepts a list of the words (str) and, returns the list containing each word in the original list, 
    in descending order by how many times that word appears in the original list"""
    
    # count how many times each word occurs, in a single pass over the list
    wordcount = {}
    for word in wordlist:
        wordcount[word] = wordcount.get(word, 0) + 1
    
    # sort the words according to its count in a descending order
    sortedwords = sorted(wordcount.keys(), key = lambda x: wordcount[x], reverse = True)
    
    # return statement 
    return(sortedwords)
    
# test the function
listmostcommon(['a','b','c','c','c','a'])

['c', 'a', 'b']

## Tweet Statistics
Once you are able to determine the sentiment of an individual string of text (e.g., a single tweet's content), you can analyze an entire set of tweets from the user's timeline.

Define a function (e.g., `analyze_tweets()`) that takes as an argument a **list** of tweet data (with the same structure as the imported `SAMPLE_TWEETS` variable), and _returns_ the data of interest to display in a table like the one at the very top of the notebook. In particular, you'll need to produce the following information **for each emotion**:

1. The percentage of words _across all tweets_ that have that emotion
2. The most common words _across all tweets_ that have that emotion (in order!)
3. The most common **hashtags** _across all tweets_ associated with that emotion (see below)

(Think carefully: should this data be stored in a _list_ or a _dictionary_?)

Some tips for this task:

- You can optionally create some "helper" functions to break up this task even further; define those functions in the same notebook cell or add additional cells.

- You'll need to use your previous functions to get the _list of words_ and _dictionary of emotional words_ for each tweet. I recommend you assign the results of these methods as **new keys** of the respective tweet dictionary (so your tweet would gain a `words` key, for example).

- In order to get the percentage of emotional words, divide the number of words that have that emotion by the total number of words _across all the tweets_. Counting how many total words are in the tweet set is a **reducing** operation: you should use the `reduce()` function for this.

- For each emotion, you'll need to get a list of the words _across all the tweets_ that have that emotion (in order to determine how many there are for the percentage, as well as which are most common). This is another **reducing** operation; you should use the `reduce()` function to _add up_ all of these words (alternatively, the `sum()` function can be used here).

- For each emotion, you will also need to calculate the most common [hashtags](https://en.wikipedia.org/wiki/Hashtag) for tweets that have _at least one word with that emotion_.

    The Twitter data for each tweet includes a _list_ containing the hashtags found in that tweet&mdash;you should **NOT** try and search the tweet text for `#` symbols. These hashtags can be found in the `['entities']['hashtags'][i]['text']` element of each tweet&mdash;that is, the `'text'` key from _each_ element in the _list_ of the `'hashtags'` key in the `'entities'` dictionary of the tweet. See the `uw_school.json` example file to see this structure more clearly.

    (You might use a _list comprehension_ to "flatten" this complex nesting structure into just a list of hashtag words).

    Since hashtags are just words, you can use your function for finding the most common words to find the most common hashtags!

You can test your function by passing in the `SAMPLE_TWEETS` variable as an argument and checking if your returned data has the same numbers as in the table at the top of the page. Note that only the first 3 most common words are listed (and may be in a different order in the case of ties).
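
The reducing and flattening steps from the tips can be sketched on a small scale as follows. The `tweets` list here is hypothetical: it assumes each tweet has already gained a `words` key, and it keeps only the nested hashtag structure described above.

```python
from functools import reduce

# Hypothetical, already-processed tweets: each has a 'words' list
# (as suggested in the tips) and Twitter's nested hashtag structure.
tweets = [
    {'words': ['happy', 'day'],
     'entities': {'hashtags': [{'text': 'Info340'}]}},
    {'words': ['come', 'and', 'learn'],
     'entities': {'hashtags': []}},
    {'words': ['happy', 'faculty'],
     'entities': {'hashtags': [{'text': 'Diversity'}, {'text': 'Info340'}]}},
]

# Reduce to a single number: start at 0 and add each tweet's word count.
total_words = reduce(lambda total, tweet: total + len(tweet['words']), tweets, 0)

# Reduce to a single list: start empty and concatenate each tweet's words.
all_words = reduce(lambda words, tweet: words + tweet['words'], tweets, [])

# Flatten the nested hashtag structure with a two-level list comprehension.
all_hashtags = [tag['text'].lower()
                for tweet in tweets
                for tag in tweet['entities']['hashtags']]

print(total_words)   # 7
print(all_hashtags)  # ['info340', 'diversity', 'info340']
```

Lower-casing the hashtags is a choice made here so that they match the lower-case hashtags in the example table; it is not something the Twitter data does for you.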

In [9]:
from functools import reduce

# A function that takes as an argument a list of tweet data and returns the data of interest to display in a table
def analyze_tweets(tweetdata):
    """This function accepts the tweets in the 'SAMPLE_TWEETS' format and returns the dictionary with the sentiment analysis.""" 
    
    # loop through each tweet
    for i in tweetdata:
        
        # extract and classify words as per each sentiment
        i['extractwords'] = extractwords(i['text'])
       
        i['emotionwords'] = emotionwords(i['extractwords'])
                    
    # Store the number of words
    numberofwords = reduce(lambda x, y: x + len(y['extractwords']), tweetdata, 0)
    
    # create final return dictionary
    finaldict = {emotion: (reduce(lambda x, y: x + len(y['emotionwords'][emotion]), tweetdata, 0) / numberofwords,
                           listmostcommon(reduce(lambda x, y: x + y['emotionwords'][emotion], tweetdata, [])),
                           listmostcommon(reduce(lambda x, y: x + ([element['text'].lower() for element in y['entities']['hashtags']] 
                                              if len(y['emotionwords'][emotion]) > 0
                                              else []), tweetdata, []))
                           ) for emotion in EMOTIONS}
    
    # return statement
    return(finaldict)
        
# test the function
analyze_tweets(SAMPLE_TWEETS)

{'positive': (0.061606160616061605,
  ['learn',
   'faculty',
   'happy',
   'information',
   'top',
   'ethics',
   'join',
   'visit',
   'study',
   'explain',
   'peace',
   'deal',
   'professor',
   'outstanding',
   'award',
   'invitation',
   'jam',
   'excited',
   'launch',
   'equity',
   'major',
   'scholarship',
   'finally',
   'prime',
   'surprise',
   'job',
   'wonderful',
   'technology',
   'good',
   'fun',
   'retention',
   'merry',
   'don',
   'investigate',
   'public',
   'share',
   'cool',
   'knowledge',
   'expert',
   'fair',
   'degree',
   'community',
   'opportunity'],
  ['accesstoinfoday',
   'indigenouspeoplesday',
   'idealistfair',
   'geekgirlcon',
   'diversity',
   'mlis']),
 'negative': (0.0088008800880088,
  ['fall',
   'rejection',
   'outstanding',
   'boring',
   'weird',
   'problem',
   'disaster'],
  []),
 'anger': (0.0033003300330033004,
  ['rejection', 'disaster', 'involvement'],
  ['mlis']),
 'anticipation': (0.025302530253025302

Once you've analyzed the tweets, you will need to _display_ that information as a printed table (as in the example at the top of the page).

Define another function to display this table (your function should take as an argument the data structure returned from your "analysis" function).

This function will need to print out the table. Using the [string formatting](https://docs.python.org/3/library/string.html#format-examples) language (via the **`.format()`** string method) makes it possible to have equally sized "columns" of data. For more examples, [this tutorial](https://www.digitalocean.com/community/tutorials/how-to-use-string-formatters-in-python-3) is pretty good (check out the "Padding Variable Substitutions" section).


A few notes about formatting this output:

- For your reference, the example table at the top of the page uses `14` characters for the first column, `11` characters for the second,  `35` for the third, and the "remainder" for the fourth. You are not required to match these numbers.

- The percentage should be formatted with two decimals of precision (e.g., `1.23%`).

- Both the example sentiment words and the hashtags should be outputted as a _comma-separated list_ with spaces between them (and no square brackets). The `join()` string method is good for converting lists to formatted strings. Both lists should also be limited to the 3 most common items.

- Make sure to include `#` in front of the hashtags!
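
The formatting notes above can be combined into a single row as in the sketch below. The values are hypothetical and the column widths are the ones quoted in the notes (`14`, `11`, `35`); treat it as one possible layout rather than the required one.

```python
# Hypothetical values for one row of the table.
emotion = 'joy'
fraction = 0.0176
words = ['happy', 'peace', 'deal', 'award']
hashtags = ['indigenouspeoplesday', 'accesstoinfoday']

# Format the percentage first so the '%' sign is padded with the number.
percent = "{:.2f}%".format(fraction * 100)

# join() builds the comma-separated lists; slicing keeps the top three.
word_list = ", ".join(words[:3])
tag_list = ", ".join("#" + tag for tag in hashtags[:3])

# '<' left-aligns each value within the given column width.
row = "{:<14}{:<11}{:<35}{}".format(emotion, percent, word_list, tag_list)
print(row)
```

Formatting the percentage into its own string before padding keeps the column aligned even when the number of digits changes (for example, `10.00%` versus `1.76%`).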

In [10]:
# A function to display the table (dictionary) returned by 'analyze_tweets'
def printedtable(finaldict):
    """This function accepts the dictionary returned by 'analyze_tweets' function and displays it in the form of a table."""
    
    # print the header row
    print("{0:<15} {1:<10} {2:<35} {3:<35}"
          .format("EMOTION", "% WORDS", "EXAMPLE WORDS", "HASHTAGS"))
    
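    # print the table, sorted by percentage in descending order
    for k, v in sorted(finaldict.items(), key = lambda x: x[1], reverse = True):
        # format the percentage first so the column is padded as a whole
        percent = "{:.2f}%".format(v[0] * 100)
        print("{0:<15} {1:<10} {2:<35} {3:<35}"
              .format(k, percent, ", ".join(v[1][:3]), ", ".join("#" + str(e) for e in v[2][:3])))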

# test the function    
printedtable(analyze_tweets(SAMPLE_TWEETS))

EMOTION         % WORDS    EXAMPLE WORDS                       HASHTAGS                           
positive        6.16%      learn, faculty, happy               #accesstoinfoday, #indigenouspeoplesday, #idealistfair
trust           3.08%      school, faculty, happy              #indigenouspeoplesday, #diversity  
anticipation    2.53%      happy, top, ready                   #indigenouspeoplesday, #informatics, #info340
joy             1.76%      happy, peace, deal                  #indigenouspeoplesday, #accesstoinfoday
surprise        0.99%      deal, award, surprised              #suzzallolibrary, #nobrainer       
negative        0.88%      fall, rejection, outstanding                                           
sadness         0.55%      fall, rejection, problem                                               
disgust         0.44%      rejection, weird, finally                                              
fear            0.44%      rejection, surprise, problem                     

## Getting Live Data
This is all well and good, but the real payoff would be to be able to see the sentiments of tweets taken directly from the Twitter feed of real users!

Define _another_ function that takes in a Twitter username as an argument and then returns a list of dictionaries representing the tweets made by that user.

Normally you would fetch this data by sending a request directly to the web service's API (e.g., to the [statuses/user_timeline](https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline) endpoint provided by the Twitter API at `https://api.twitter.com/1.1/statuses/user_timeline`). However, Twitter includes access controls so that only registered developers are allowed to send requests. While it is possible to register as a developer and access Twitter [directly through Python](https://python-twitter.readthedocs.io/en/latest/), this adds an extra level of complexity to the assignment.

Instead, I've set up a [proxy](https://en.wikipedia.org/wiki/Proxy) that has all the access keys specified which you can use to search Twitter. This proxy is available at:

**<https://faculty.washington.edu/joelross/proxy/twitter/timeline/>**

Send a request to _that_ url instead of `https://api.twitter.com/1.1/statuses/user_timeline`, and it will redirect your request with the proper authentication to Twitter, and then give you back whatever JSON Twitter's API responded with.

- You specify the same request parameters as you would when accessing Twitter directly. The request takes a `screen_name` request parameter which you can assign the given username. You can also specify the `count` parameter if you want to get more results back (up to 200); see the [documentation](https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline) for details and other options you are welcome to use (just document them in your function's **docstring**).

- **WARNING:** The proxy I have set up is **rate-limited** so that it can only accept 900 requests every 15 minutes. If all 40 students are working rapidly on the assignment at the same time, you may find yourself needing to wait a few minutes and try again. You are alternatively welcome to set up your own developer account and API keys; just make sure you don't put the keys under version control and upload them to GitHub!

You can download the timeline data from Twitter using the [requests](http://docs.python-requests.org/en/master/user/quickstart/) module discussed in class: send a `GET` request to the [statuses/user_timeline](https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline) endpoint provided by the Twitter API, and then use the `.json()` method to extract the JSON response as a Python _list_ or _dictionary_ value you can work with.
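
Request parameters end up as a query string appended to the URL. The sketch below builds that URL with the standard library's `urllib.parse.urlencode` so it can be inspected without sending anything over the network; passing the same dictionary as the `params` argument of `requests.get()` should produce an equivalent query string. The screen name used here is only an example.

```python
from urllib.parse import urlencode

PROXY_URL = "https://faculty.washington.edu/joelross/proxy/twitter/timeline/"

# Request parameters, named as in the Twitter API documentation.
query_params = {"screen_name": "UW_iSchool", "count": 200}

# urlencode() turns the dictionary into 'key=value' pairs joined by '&'.
full_url = PROXY_URL + "?" + urlencode(query_params)

print(full_url)
# https://faculty.washington.edu/joelross/proxy/twitter/timeline/?screen_name=UW_iSchool&count=200
```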

In [11]:
import requests

# A function that takes in a Twitter username and returns a list of dictionaries representing the tweets made by that user
def gettweets(username):
    """This function accepts a Twitter username and returns a list of dictionaries representing that user's
    most recent tweets (up to 200, as set by the `count` request parameter)."""
    
    # define query parameters
    queryparams = {"screen_name": username, "count": 200}
    
    # make the get request
    response = requests.get("https://faculty.washington.edu/joelross/proxy/twitter/timeline/", params = queryparams)
    
    # return statement
    return(response.json())

# test the 'gettweets' function
gettweets("geekwire")

[{'created_at': 'Sat Jun 15 00:13:58 +0000 2019',
  'id': 1139687456268095488,
  'id_str': '1139687456268095488',
  'text': '$400 million? Paul Allen’s Stratolaunch space venture is up for sale, sources say https://t.co/8zjbf0ywqN',
  'truncated': False,
  'entities': {'hashtags': [],
   'symbols': [],
   'user_mentions': [],
   'urls': [{'url': 'https://t.co/8zjbf0ywqN',
     'expanded_url': 'https://www.geekwire.com/2019/paul-allen-stratolaunch-sale/',
     'display_url': 'geekwire.com/2019/paul-alle…',
     'indices': [82, 105]}]},
  'source': '<a href="http://www.geekwire.com" rel="nofollow">GeekWire</a>',
  'in_reply_to_status_id': None,
  'in_reply_to_status_id_str': None,
  'in_reply_to_user_id': None,
  'in_reply_to_user_id_str': None,
  'in_reply_to_screen_name': None,
  'user': {'id': 255784266,
   'id_str': '255784266',
   'name': 'GeekWire',
   'screen_name': 'geekwire',
   'location': 'Seattle',
   'description': 'Tech news, commentary and other nerdiness covering Microsof

Define one last "main" function that will [prompt the user](https://docs.python.org/3/library/functions.html#input) for a Twitter username. The function should then call your "download" function to fetch the tweets, and pass the returned tweet data into your "analyze" and "show" functions in order to display your sentiment analysis of the user's timeline. 

**ADDITIONALLY**, `if` the user specifies `SAMPLE` (all caps) as the username, the function should instead show the analysis for the `SAMPLE_TWEETS` (this will help us out with grading).

In [12]:
# The 'main' function
def main():
    """This function prompts the user for a Twitter username and displays their corresponding sentiment analysis."""
    
    # username prompt
    username = input()
    
    # check to see if 'SAMPLE' or not to perform the sentiment analysis accordingly
    if username == 'SAMPLE':
        printedtable(analyze_tweets(SAMPLE_TWEETS))
    else:
        printedtable(analyze_tweets(gettweets(username)))
    
# test 'main' function
main()

geekwire
EMOTION         % WORDS    EXAMPLE WORDS                       HASHTAGS                           
positive        5.62%      deal, job, opportunity              #gwcloud, #blockparty, #gwsummit   
trust           3.36%      deal, don, top                      #gwcloud, #devops, #blockparty     
anticipation    2.34%      deal, opportunity, top              #gwcloud, #gwsponsor, #starwars    
joy             1.63%      deal, clean, cash                   #gwcloud, #blockparty, #devops     
negative        1.35%      drone, forget, small                #gwsponsor, #gwcloud, #earthranger 
fear            1.10%      giant, watch, cash                  #starwars, #earthranger            
surprise        0.99%      deal, rapid, raffle                 #gwsponsor, #gwcloud, #gwsummit    
anger           0.50%      cash, cutting, fight                #earthranger                       
sadness         0.41%      cutting, lawsuit, case              #gwcloud                           
d

Use your main function to try analyzing the timelines of different users and comparing their results. Are the current sentiments of the [iSchool](https://twitter.com/uw_ischool) and [CSE](https://twitter.com/uwcse) different in interesting ways?

In [13]:
# testing with the ischool feed --> Type 'UW_iSchool' when prompted for user name
main()

UW_iSchool
EMOTION         % WORDS    EXAMPLE WORDS                       HASHTAGS                           
positive        6.07%      award, join, technology             #ischoolcapstone, #informatics, #chi2019
trust           3.56%      team, award, professor              #ischoolcapstone, #informatics, #da2i
anticipation    2.43%      award, watch, good                  #ischoolcapstone, #informatics, #accesstoinformation
joy             2.13%      award, good, excited                #ischoolcapstone, #als, #informatics
surprise        1.05%      award, good, excited                #ischoolcapstone, #da2i, #datascience
negative        1.03%      evil, hungry, wait                  #wikipedia, #vr, #facebook         
fear            0.77%      watch, evil, problem                #ischoolcapstone, #accesstoinformation, #facebook
sadness         0.69%      evil, music, interested             #vr, #als, #ischoolcapstone        
disgust         0.49%      evil, interested, winning     

In [14]:
# testing with the CSE feed --> type 'uwcse' when prompted
main()

uwcse
EMOTION         % WORDS    EXAMPLE WORDS                       HASHTAGS                           
positive        6.63%      center, building, professor         #uwallen, #uwgatescenter, #ai      
trust           3.84%      center, team, professor             #uwallen, #ai, #uwgatescenter      
anticipation    2.35%      award, celebration, university      #uwallen, #acmprize, #uwgatescenter
joy             2.02%      award, celebration, graduation      #uwallen, #ai, #uwgatescenter      
surprise        1.06%      award, celebration, chance          #uwallen, #huskygivingday, #cg     
negative        0.78%      late, challenge, outstanding        #uwallen, #ai, #scamalert          
fear            0.48%      graduation, challenge, fight        #uwallen, #uwinnovates, #realitylablecture
sadness         0.38%      late, art, apologize                #uwallen, #scamalert, #ai          
anger           0.30%      challenge, fight, disinformation    #realitylablecture, #ar, #vr     

# iSchool and CSE - Sentiment Analysis:

Observations:

- The tweets from 'UW iSchool' contain fewer positive words (6.07%) than those from UW CSE (6.63%).
- On the other hand, UW CSE tweets less negatively, with 0.78% negative words compared with 1.03% for 'UW iSchool'.
- The #uwallen hashtag is the most frequent for tweets from 'uwcse' for all sentiments except anger and disgust.
- For the UW iSchool, the most frequent hashtag is #ischoolcapstone for most emotions (positive, trust, anticipation, joy, anger, and surprise), which tells a good story about how the iSchool capstone enthralls students' minds.
- The least used sentiments for both the UW iSchool and UW CSE are anger and disgust, which paints a positive picture of student life here and suggests that students are generally proud of and happy with their schools.