# Project: Part 2

Today we will continue this semester's project. To download last week's notebook, click [here](https://drive.google.com/open?id=0B3D_PdrFcBfRaG5zcXQyYW1QR1k).

### Reference Bank

Other links referenced today:
* [538 - Political statistics](http://fivethirtyeight.com/)
* [How to apply for Twitter API key](https://apps.twitter.com/)
* [Twitter advanced search engine](https://twitter.com/search-advanced?lang=en)
* [Tweepy documentation](http://tweepy.readthedocs.io/en/v3.5.0/getting_started.html#api)
* [Twitter API documentation](https://dev.twitter.com/rest/reference)

**Our Twitter key: Q8kC59z8t8T7CCtIErEGFzAce**

Today:

* Make an API call to gather ____
* Review the format of the text, and make a plan to parse it
* Organize it into a dictionary

Weeks to come:

* Review the data collected
* Write the dictionary into a CSV file
* Plot some significant information using matplotlib


## Review

**Review of function definitions**

In [None]:
def function_name(function_parameter1, function_parameter2):
    # Function body
    return_value = function_parameter1 * function_parameter2
    return return_value

# After defining, run the function
function_name(2, 3)

**Review of loop structure**

In [None]:
# Iterable types include lists, strings and dictionaries
iterable = [1, 2, 3]

# Sum list items (named "total" so we don't shadow Python's built-in sum function)
total = 0
for item in iterable:
    # Loop body
    total += item
print("Sum of list is " + str(total))

# Increment list items
for index in range(len(iterable)):
    iterable[index] += 1
print("Incremented list is " + str(iterable))

# Reverse a string
string = ""
for character in "Hello world":
    string = character + string
print(string)

**Review of string manipulation**

In [None]:
# Get the middle of a string (everything except the first and last characters)
def return_middle(string):
    return string[1:-1]
print(return_middle("abcde"))

# Get all but the last character of a string
def all_but_last(string):
    return string[:-1]
print(all_but_last("The last character of this string should be a 'w'. wr"))

# Combine all of the strings in a list
def make_sentence(list_of_words):
    sentence = ""
    for word in list_of_words:
        sentence = sentence + word + " "
    return sentence[:-1]
print(make_sentence(["this", "is", "a", "sentence"]))

## New String Methods

We know how to do basic string indexing at this point, but Python also has many built-in string methods that make handling text much easier. Here are some important methods that will be useful as we parse text, with examples.

String methods summary from [Google](https://developers.google.com/edu/python/strings) (where s is a string):

* *s.lower(), s.upper()*: returns the lowercase or uppercase version of the string
* *s.strip()*: returns a string with whitespace removed from the start and end
* *s.isalpha()/s.isdigit()/s.isspace()...*: tests if all the string chars are in the various character classes
* *s.startswith('other'), s.endswith('other')*: tests if the string starts or ends with the given other string
* *s.find('other')*: searches for the given other string (not a regular expression) within s, and returns the first index where it begins or -1 if not found
* *s.replace('old', 'new')*: returns a string where all occurrences of 'old' have been replaced by 'new'
* *s.split('delim')*: returns a list of substrings separated by the given delimiter. The delimiter is not a regular expression, it's just text. 'aaa,bbb,ccc'.split(',') -> ['aaa', 'bbb', 'ccc']. As a convenient special case s.split() (with no arguments) splits on all whitespace chars.
* *s.join(list)*: opposite of split(), joins the elements in the given list together using the string as the delimiter. e.g. '---'.join(['aaa', 'bbb', 'ccc']) -> aaa---bbb---ccc

In [None]:
# Lower/upper, split example
def make_name(string):
    # Split the string into separate words, with space as delimiter
    words = string.split(' ')
    # Make dummy string to be returned
    to_return = ""
    for word in words:
        # Add the uppercase first letter of each word
        to_return += word[0].upper()
        # Add rest of word
        to_return += word[1:]
        # Add spaces between words
        to_return += " "
    # Return string, with last space omitted
    return to_return[:-1]
        
make_name("megan elizabeth carey")
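One caveat with `split(' ')`: two spaces in a row produce an empty string in the list, and `word[0]` raises an `IndexError` on an empty string. A minimal variant, assuming we want to collapse repeated whitespace, uses `split()` with no arguments together with `join()`. The name `make_name_safe` is made up for this sketch.

```python
def make_name_safe(string):
    # split() with no arguments splits on any run of whitespace,
    # so it never produces empty words
    words = string.split()
    capitalized = []
    for word in words:
        # Uppercase the first letter and keep the rest of the word unchanged
        capitalized.append(word[0].upper() + word[1:])
    # join() places a single space between the words
    return " ".join(capitalized)

print(make_name_safe("megan   elizabeth carey"))
```

Because `join()` only puts the delimiter between items, there is no trailing space to slice off at the end.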

In [None]:
# Strip example
text = "        nonsense at beginning and end should be trimmed        "
print(len(text.strip()))
print(len(text))

In [None]:
# Startswith/endswith example
def check_start_or_end(string, substring):
    if string.startswith(substring):
        return True
    elif string.endswith(substring):
        return True
    else:
        return False
    
print(check_start_or_end("megan carey", "me"))
print(check_start_or_end("megan carey", "rey"))
print(check_start_or_end("megan carey", "hi"))

In [None]:
# Replace example
def find_and_swap(string1, string2, string3):
    # Find the index just past the first occurrence of the second input string
    # (the + 1 assumes string2 is a single character)
    first_end = string1.find(string2) + 1
    # Make one substring up to that point
    substring1 = string1[:first_end]
    # Make another substring after that point
    substring2 = string1[first_end:]
    # Replace the second input string with the third
    substring1 = substring1.replace(string2, string3)
    # Replace the third input string with the second
    substring2 = substring2.replace(string3, string2)
    # Concatenate the strings
    return substring1 + substring2

print(find_and_swap("Hello there! How's it going?", "!", "?"))
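The examples above do not exercise `find()`, `join()`, or the character-class tests on their own, so here is a short sketch of those three. The helper name `count_numeric_words` and the sample strings are invented for illustration.

```python
# find() returns the index where the first match begins, or -1 if there is none
text = "Hello there! How's it going?"
print(text.find("there"))
print(text.find("xyz"))

# join() is the opposite of split(): it glues list items together
words = "this is a sentence".split()
print("-".join(words))

# isdigit() is True only if every character in the string is a digit
def count_numeric_words(string):
    count = 0
    for word in string.split():
        if word.isdigit():
            count += 1
    return count

print(count_numeric_words("I have 2 cats and 10 fish"))
```

Checking the result of `find()` against -1 before slicing is a good habit, since -1 is also a valid (negative) index in Python.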

## Making an API Call

Now we'll make an API call using Tweepy. First, we create an *api* object using Tweepy:

In [None]:
import tweepy
import json

## Our consumer key
consumer_key = 'Q8kC59z8t8T7CCtIErEGFzAce'
## Our signature, also given upon app creation
consumer_secret = '24bbPpWfjjDKpp0DpIhsBj4q8tUhPQ3DoAf2UWFoN4NxIJ19Ja'
## Our access token, generated upon request
access_token = '719722984693448704-lGVe8IEmjzpd8RZrCBoYSMug5uoqUkP'
## Our secret access token, also generated upon request
access_token_secret = 'LrdtfdFSKc3gbRFiFNJ1wZXQNYEVlOobsEGffRECWpLNG'

## Tweepy authorization commands
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

Now let's create a search query. The Tweepy API object has a *search* method that takes a parameter *q*, which is our search query. The search query is a string that specifies what kind of tweets we want to search for. The documentation for query operators, which define what you'd like to search for, can be found [here](https://dev.twitter.com/rest/public/search). The Twitter [advanced search engine](https://twitter.com/search-advanced?lang=en) also provides an easy way to build complex queries.
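Queries copied from the address bar of the advanced search page are percent-encoded: `%20` stands for a space and `%23` stands for `#`. The sketch below uses Python's standard `urllib.parse` module (this is not part of Tweepy, and it assumes Python 3) to translate between the encoded and plain-text forms. The sample strings are only illustrations.

```python
from urllib.parse import quote, unquote

# Decode a percent-encoded query to see what it says in plain text
encoded = "%20hillary%23imwithher"
decoded = unquote(encoded)
print(repr(decoded))

# Encode a plain-text query the way it would appear in a URL
plain = "hillary #imwithher"
print(quote(plain))
```

Decoding a query this way is a quick check that it says what you intended before you send it.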

For example, if we want to search for tweets about Hillary with the hashtag "Imwithher":

In [None]:
# We first need to create our search query string, using either the query operator documentation,
# or the Twitter advanced search engine:
query = "%20hillary%23imwithher"

# We now use the api object's search method to find the tweets that match the query:
results = api.search(query)

# Now, let's see the results. The results behave like a list of Status objects. Let's look at the first result in the list:
print(results[0])

That's a lot of information, but notice that the result looks very similar to a Python dictionary. It is actually a Tweepy Status object, which works like any other object we've encountered: each Status object has attributes that describe the tweet. Unfortunately, the Tweepy documentation doesn't explain this very well. For a list of available attributes, [click here](http://tkang.blogspot.com/2011/01/tweepy-twitter-api-status-object.html). Try extracting some values from the result:

In [None]:
# For example, if we wanted to see the text of the tweet, we would look at the text attribute:
print(results[0].text)

# Try looking at some of the other attributes. 
# Your code here!
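Looking ahead to today's last goal, organizing the results into a dictionary, here is a minimal sketch of one possible layout. It does not call Twitter: `SimpleNamespace` objects stand in for Status objects, and the sample values are invented. It assumes each status exposes `id`, `text`, and `retweet_count` attributes, and the helper name `tweets_to_dict` is hypothetical.

```python
from types import SimpleNamespace

# Stand-ins for Status objects (invented values, not real tweets)
fake_results = [
    SimpleNamespace(id=1, text="First example tweet #imwithher", retweet_count=3),
    SimpleNamespace(id=2, text="Second example tweet", retweet_count=0),
]

def tweets_to_dict(results):
    # Map each tweet id to a smaller dictionary holding the fields we care about
    tweets = {}
    for status in results:
        tweets[status.id] = {
            "text": status.text,
            "retweets": status.retweet_count,
        }
    return tweets

organized = tweets_to_dict(fake_results)
print(organized[1]["text"])
```

Keying the dictionary by tweet id is just one choice; it makes lookups easy and avoids storing the same tweet twice.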
