## Homework 1: Advanced Track -- Harvest the Twitter API

**Objective:** Write a series of functions that allow you to dynamically harvest Twitter data.

**Estimated Time to Complete:** 4-12 hours

#### Sections

 - **Section 1:** Setting up your developer account, using OAuth1 authentication (approx 45-120 minutes)
 - **Section 2:** Navigating the API documentation, getting your first query string (approx 45-120 minutes)
 - **Section 3:** Writing your API calls (approx 90-360 minutes)
 
#### What You'll Turn In:  
 - A `.py` (not a Notebook!) file that contains the functions that you were prompted to create.  These should contain comments demonstrating why your code does what it does, and after it's run, the instructor should be able to make the appropriate function calls in Spyder or any other IDE.

## Section 1:  Setting Up Your Developer Account

Most APIs require a little pre-work before you can use them, so the first part of this homework assignment will be spent setting up your developer account so you have API access.

**Step 1:  Create a Twitter Developer Account**

 - Make sure you have a regular Twitter account before you do this
 - You can apply for a developer account here:  https://developer.twitter.com/en/apply-for-access
  - Choose either a student or hobbyist/personal account

**Step 2:  Create An App**

You don't have to intend to build an official software program to have an app; this is just a way for you to get authentication keys to use with the API.

 - Go to the menu in the upper right hand corner and click on **Your Name** > **Apps**
 - Choose **Create An App**
 - You'll be prompted to enter some information about your app.  Don't worry too much about this, it can say almost anything.  You'll be prompted to list websites where it will be hosted...this can be anything for now.  Use https://generalassemb.ly if you're undecided about what to put.

**Step 3: Create Your API Tokens**

Now that you have an app, you can use its API tokens to go ahead and make requests like we did in class 3.  Like a lot of APIs, the Twitter API uses something called OAuth authentication.  

If you didn't wait until the night before this assignment was due and have a spare 30 minutes, you can read a little about it here: https://oauth.net/

In any event, you need API tokens in order to make requests.  Do the following:

 - Go to the apps section of your developer portal
 - Click on the 'Details' button for the app that you just created
 - Click on the 'Keys & Tokens' tab
 - Generate your Access Token and Access Token Secret.  Write these down when you're done, along with the API Key and API Secret Key listed on the same tab

Now you're ready to make requests to the Twitter API.  Every time you make a request, you'll need to include all four credentials: the API Key and API Secret Key for your app, plus the Access Token and Access Token Secret you just generated.  (You can regenerate them at any time.)  

**Step 4:  Your First Request**

To make requests to the Twitter API you're going to need a module which is **not** pre-installed in Anaconda.  It's called `requests_oauthlib`, and you'll need to install it with pip, which is Python's package manager.  Open Anaconda Prompt or Terminal, type the command `pip install requests_oauthlib`, and you'll be finished.

The logistics of making an OAuth1 authenticated request are very similar to what was done in class 3, but with a few additional steps.  You can see how to do it here:  https://requests.readthedocs.io/en/master/user/authentication/#oauth-1-authentication.  The only thing you'll need to change is the info for your API tokens that are passed into the `OAuth1()` function.

Try making a request to the following URL to confirm that it works: 'https://api.twitter.com/1.1/account/verify_credentials.json'
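One way this request could look is sketched below, assuming `requests` and `requests_oauthlib` are installed.  The environment variable names and the helper names `load_tokens` and `verify_credentials` are hypothetical; keeping tokens in environment variables is just one way to avoid pasting them into your source file.

```python
import os

VERIFY_URL = "https://api.twitter.com/1.1/account/verify_credentials.json"

# Hypothetical environment variable names; use whatever names you like.
TOKEN_NAMES = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_tokens(env=None):
    """Return the four tokens as a tuple, in the order OAuth1() expects."""
    env = os.environ if env is None else env
    missing = [name for name in TOKEN_NAMES if name not in env]
    if missing:
        raise KeyError("missing tokens: " + ", ".join(missing))
    return tuple(env[name] for name in TOKEN_NAMES)


def verify_credentials(tokens):
    """Make one authenticated GET request and return the parsed JSON."""
    # Imported here so the file still loads before the packages are installed.
    import requests
    from requests_oauthlib import OAuth1

    # Order: API key, API secret key, access token, access token secret.
    auth = OAuth1(*tokens)
    response = requests.get(VERIFY_URL, auth=auth)
    response.raise_for_status()  # raise an error on a 4xx/5xx reply
    return response.json()
```

With valid tokens set, calling `verify_credentials(load_tokens())` should return a dictionary describing your own account.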

In [None]:
# your code here


If you get a JSON object back, then you're good to go.

## Section 2: Searching Tweets

Most websites you access will have a long string attached to the end of the URL that looks something like this:  `http://thewebsite.com/?year=2019&color=golden%20yellow&user_id=48549395959438`.

Most people have no reason to pay attention to any of this, but all the special symbols at the end are basically encoded commands that say 'return a website that displays x,y,z characteristics.'  

When accessing API data, it basically works the same way.
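To see what those symbols encode, here is a small sketch using only Python's standard library.  The URL is a made-up example, and the variable names are purely illustrative.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# A made-up URL with three parameters after the "?".
url = "http://thewebsite.com/?year=2019&color=golden%20yellow&user_id=48549395959438"

# Split off the part after the "?" and decode it into a dictionary.
# parse_qs returns each value as a list, because a key may repeat.
query = urlsplit(url).query
params = parse_qs(query)

# Going the other way: encode a dictionary back into a query string.
# quote_via=quote encodes a space as %20 (the default would use "+").
rebuilt = urlencode({"year": 2019, "color": "golden yellow"}, quote_via=quote)
```

Here `params["color"]` is `["golden yellow"]`, so `%20` is just an encoded space, and each `&` marks the start of a new `key=value` pair.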

**Step 1:  Set Up Your First Query String**

If you search for Data Science on Twitter's website, the URL in your browser should look like this:  `https://twitter.com/search?q=Data%20Science&src=typed_query`

If you'd like, you can drop the `&src=typed_query` and still get the same results.

Now, let's try and make a request for a search for `Data Science`.  

If you look at Twitter's docs, you'll see that the base URL is `https://api.twitter.com/1.1/search/tweets.json`

This means you have to add `?q=Whatever%20Word%20Goes%20Here` to the end to complete the search.

Enter your query string below:

In [32]:
# your answer here


For good measure, try doing a search for tweets relating to `#MeToo` as well.

In [None]:
# your answer here


**Step 2:  Adding Parameters to Your Query String**

Query strings basically have two parts:

 - What follows the `?` is the `q` parameter, which holds the actual text to search for, percent-encoded to account for special characters (a space becomes `%20`, for example).  This is required.
 - You can also add additional search parameters, each separated by `&`, which dictate what kinds of results are returned.  
  - For example, a parameter you can use in Twitter's search API is `count`, which tells the API how many results to return.  The default is 15, but you can return up to 100.  So if we wanted to search for tweets and return 50 results our query string would look like the following:
    `https://api.twitter.com/1.1/search/tweets.json?q=My%20Search%20String&count=50`
  - You can add as many of these parameters to your string as you'd like.  
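
Rather than typing the encoding by hand, you can let Python assemble the query string.  The sketch below uses only the standard library; `build_search_url` is a hypothetical helper name, not part of any Twitter library.

```python
from urllib.parse import quote, urlencode

SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(text, **params):
    """Return SEARCH_URL with `q` and any extra parameters percent-encoded."""
    query = {"q": text, **params}
    # quote_via=quote encodes a space as %20 and a "#" as %23.
    return SEARCH_URL + "?" + urlencode(query, quote_via=quote)


url = build_search_url("My Search String", count=50)
```

Here `url` ends in `?q=My%20Search%20String&count=50`.  The `requests` library can also build the query string for you if you pass a dictionary through its `params` argument.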
  
To get the hang of this, try searching for tweets that mention the hashtag `#DeepLearning`, and return 75 results.

Try adding a second parameter.  You can find the list here:  https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets

## Section 3: Searching Users

The last section of the API you'll need to get the hang of before you're let loose is the users API, which allows you to search for users and get their followers, friends, etc., as opposed to tweets that fit particular criteria.  This part is pretty similar to the advanced lab in class 3, so if you saw how that worked then you shouldn't need much instruction.  

But if you're seeing this with fresh eyes, you'll want to spend 15-20 minutes to make sure you understand this part.  

Official documentation can be found here:  https://developer.twitter.com/en/docs/accounts-and-users/follow-search-get-users/overview

So, as an example, if you want to get a list of someone's followers, you use the base URL `https://api.twitter.com/1.1/followers/list.json` and then add your query string to get a list of that person's followers.  

Documentation can be found here:  https://developer.twitter.com/en/docs/accounts-and-users/follow-search-get-users/api-reference/get-followers-list.

One possible parameter to use is `screen_name`, so if you wanted to get a list of someone's followers based on their screen name (the handle that begins with an @), then you would set up your API call to look something like:

`https://api.twitter.com/1.1/followers/list.json?screen_name=persons_screenname`

Note that you exclude the `@`.
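
A minimal sketch of building that URL while tolerating a leading `@` is shown below.  The helper name `followers_url` is hypothetical.

```python
from urllib.parse import urlencode

FOLLOWERS_URL = "https://api.twitter.com/1.1/followers/list.json"


def followers_url(screen_name):
    """Build the followers/list URL for a handle, with or without the '@'."""
    handle = screen_name.lstrip("@")  # the API expects the handle without "@"
    return FOLLOWERS_URL + "?" + urlencode({"screen_name": handle})
```

Both `followers_url("@persons_screenname")` and `followers_url("persons_screenname")` produce the same URL.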

**Your turn:** Pull in the list of General Assembly's followers.  General Assembly's handle is `@GA`

Note that this won't return the whole list of GA's followers.  If you want to do that you have to use cursoring:  https://developer.twitter.com/en/docs/basics/cursoring.  This is the topic of your bonus assignment.

In [69]:
# your answer here


## Section 4: Functions

This section details the functions you have to write and turn in as part of your homework assignment.  

Please read the requirements carefully.

**What you'll turn in:** A `.py` file with all of the functions written.  We should be able to load this into an IDE, run the file, and then call your functions to verify how and if they work. This file should also be properly commented so we can follow your line of reasoning.

The functions you'll be prompted to write will be defined in the following ways:

 - **name:** the name of the function
 - **returns:** what the function should return
 - **arguments:** arguments to include inside the function in order to specify how it should behave.
 
 **Note:** The free API has limitations built into it, which means that from time to time you'll only be able to return some of the results from the API.  This is fine.  It's understood and recognized that your functions won't be able to return an entire list of someone's followers or other such things, so as long as your work delivers the best it can under present circumstances you'll be in good shape.
 
 **Other Note:** Every aspect of the API that you need to use can be found on either of these pages.
 
 Search API:  https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets
 
 Users API: https://developer.twitter.com/en/docs/accounts-and-users/follow-search-get-users/api-reference/get-followers-list

##### Function 1 (Required)

**Name:** `find_user`

**Returns:** dictionary that represents user object returned by Twitter's API

**Arguments:**
 - `screen_name`: str, required; Twitter handle to search for
 - `keys`: list, optional; list that contains keys to return about user object.  If not specified, then function should return the entire user object.
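
The optional `keys` behavior could be handled with a small helper like the one below.  This is only a sketch: `pick_keys` is a hypothetical name, and silently skipping keys that are missing from the user object is an assumption about how you might want it to behave.

```python
def pick_keys(user, keys=None):
    """Return the whole user dict, or only the requested keys."""
    if keys is None:
        return user
    # Skip keys the API did not return rather than raising KeyError.
    return {key: user[key] for key in keys if key in user}


# A made-up, heavily trimmed user object for illustration.
sample_user = {"name": "Example", "screen_name": "example", "followers_count": 10}
subset = pick_keys(sample_user, keys=["name", "followers_count"])
```

Inside `find_user`, you would apply something like this to the dictionary that comes back from the API before returning it.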

##### Function 2 (Required)

**Name:** `find_hashtag`

**Returns:** list of data objects that contain information about each tweet that matches the hashtag provided as input.

**Arguments:**
 - `hashtag`: str, required; text to use as a hashtag search.  
 - `count`: int, optional; number of results to return
 - `search_type`: str, optional; type of results to return.  should accept 3 different values:
   - `mixed`:   return mix of most recent and most popular results
   - `recent`:  return most recent results
   - `popular`: return most popular results
   
**Note:** User should **not** have to actually use the `#` character for the `hashtag` argument.  The function should check to see if it's there, and if not, add it in for them.
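
The argument handling described above might be sketched as follows.  `hashtag_params` is a hypothetical helper name, the defaults of 15 and `mixed` are assumptions, and `result_type` is the name the search API docs use for this setting.

```python
VALID_SEARCH_TYPES = ("mixed", "recent", "popular")


def hashtag_params(hashtag, count=15, search_type="mixed"):
    """Return the query parameters for a hashtag search as a dictionary."""
    if search_type not in VALID_SEARCH_TYPES:
        raise ValueError("search_type must be one of " + ", ".join(VALID_SEARCH_TYPES))
    # Add the "#" only if the caller left it off.
    if not hashtag.startswith("#"):
        hashtag = "#" + hashtag
    return {"q": hashtag, "count": count, "result_type": search_type}
```

The resulting dictionary still needs to be percent-encoded into a query string (or passed to `requests` through `params`) before the request is made.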

##### Function 3 (Required)

**Name:** `get_followers`

**Returns:** list of data objects, one for each of the user's followers, returning values for the `name`, `followers_count`, `friends_count`, and `screen_name` keys for each follower.

**Arguments:** 

 - `screen_name`: str, required; Twitter handle to search for
 - `keys`: list, optional; keys to return for each follower.  default value: `['name', 'followers_count', 'friends_count', 'screen_name']`; if something else is listed, values for those keys should be returned instead
 - `to_df`: bool, optional; default value: False; if True, return results in a dataframe
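
One possible way to shape the results, including the `to_df` switch, is sketched below.  `shape_followers` is a hypothetical helper that assumes you already have the list of user objects from the API, and it assumes pandas is installed if `to_df=True` is used.

```python
DEFAULT_KEYS = ["name", "followers_count", "friends_count", "screen_name"]


def shape_followers(users, keys=None, to_df=False):
    """Trim each user object to `keys`; optionally return a DataFrame."""
    keys = DEFAULT_KEYS if keys is None else keys
    # user.get(key) yields None when a key is absent from a user object.
    rows = [{key: user.get(key) for key in keys} for user in users]
    if to_df:
        import pandas as pd  # imported lazily; only needed for DataFrame output

        return pd.DataFrame(rows, columns=keys)
    return rows
```

Using `None` as the default for `keys` avoids sharing one mutable list between calls.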

##### Function 4 (Optional)

**Name:** `friends_of_friends`

**Returns:** list of data objects for each user that two Twitter users have in common

**Arguments:**

 - `names`: list, required; list of two Twitter users to compare friends list with
 - `keys`: list, optional; list of keys to return for information about each user.  Default value should be to return the entire data object.
 - `to_df`: bool, optional; default value: False; if True, returns results in a dataframe.
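
Finding the users two friends lists have in common comes down to an intersection.  The sketch below assumes you already have both lists of user objects and that each object carries an `id` key; `common_users` is a hypothetical helper name.

```python
def common_users(first, second):
    """Return the user objects from `first` whose id also appears in `second`."""
    second_ids = {user["id"] for user in second}  # set gives fast membership tests
    return [user for user in first if user["id"] in second_ids]
```

Comparing on `id` rather than on whole dictionaries avoids false mismatches when other fields differ between the two responses.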

##### Function 5 (Optional)

Rewrite the `friends_of_friends` function, except this time include an argument called `full_search`, which accepts a boolean value.  If set to `True`, use cursoring to cycle through the complete set of users for the users provided.  

The Twitter API only returns a subset of users in each response to save bandwidth, so you have to cycle through multiple result sets to get all of the values.

You can read more about how this works here:  https://developer.twitter.com/en/docs/basics/cursoring

Basically you have to write a `while` loop that keeps making new requests, passing the value stored in the `next_cursor` key of each response as the `cursor` parameter of your next query string, until there's nothing left to fetch.
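
The loop described above could be sketched like this.  `fetch_page` is a hypothetical function you would write yourself: it takes a cursor value, makes one authenticated request, and returns the parsed JSON.  The `max_calls` cap is an assumed safeguard so the loop stops before it exhausts your allowance of API calls.

```python
def collect_users(fetch_page, max_calls=15):
    """Follow next_cursor until it is 0 or the call budget runs out."""
    users = []
    cursor = -1  # per Twitter's cursoring docs, -1 requests the first page
    calls = 0
    while cursor != 0 and calls < max_calls:
        page = fetch_page(cursor)
        users.extend(page["users"])
        cursor = page["next_cursor"]  # 0 means there are no more pages
        calls += 1
    return users
```

Passing the page-fetching function in as an argument keeps the loop easy to test with fake data before you spend real API calls on it.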

**Note:** We're using the free API, so we're operating under some limitations.  One of them is that you can only make 15 API calls in a 15-minute span.  You can also only return up to 200 results per cursor, so this means you won't be able to completely search for everyone even if you set this up correctly.

That's fine, just do what you can under the circumstances.