## Homework 1: Advanced Track -- Harvest the Twitter API

**Objective:** Write a series of functions that allow you to dynamically harvest Twitter data.

**Estimated Time to Complete:** 4-12 hours

#### Sections

 - **Section 1:** Setting up your developer account, using OAuth1 authentication (approx 45-120 minutes)
 - **Sections 2 & 3:** Navigating the API documentation, getting your first query strings (approx 45-120 minutes)
 - **Section 4:** Writing your API calls (approx 90-360 minutes)
 
#### What You'll Turn In:  
 - A `.py` (not a Notebook!) file that contains the functions that you were prompted to create.  These should contain comments demonstrating why your code does what it does, and after it's run, the instructor should be able to make the appropriate function calls in Spyder or any other IDE.

## Section 1:  Setting Up Your Developer Account

Most APIs require a little pre-work before you can use them, so the first part of this homework assignment will be spent setting up your developer account so that you have API access.

**Step 1:  Create a Twitter Developer Account**

 - Make sure you have a regular Twitter account before you do this
 - You can apply for a developer account here:  https://developer.twitter.com/en/apply-for-access
  - Choose either a student or hobbyist/personal account
  - **Note:** these typically get approved right away, but you might have to wait a little while.  If 15 minutes pass without approval, take a break and come back in an hour or so.

**Step 2:  Create An App**

You don't have to intend to build an official software program to have an app.  This is just a way for you to get authentication keys to use with the API.

 - Go to the menu in the upper right hand corner and click on **Your Name** > **Apps**
 - Choose **Create An App**
 - You'll be prompted to enter some information about your app.  Don't worry too much about this; it can say almost anything.  You'll also be prompted to list websites where it will be hosted, and this can be anything for now.  Use https://generalassemb.ly if you're undecided about what to put.

**Step 3: Create Your API Tokens**

Now that you have an app, you can use its API tokens to make requests like we did in class 3.  Like a lot of APIs, the Twitter API uses something called OAuth authentication.  

If you didn't wait until the night before this assignment was due and have a spare 30 minutes, you can read a little about it here: https://oauth.net/

In any event, you need API tokens in order to make requests.  Do the following:

 - Go to the **Apps** section of your developer portal
 - Click on the **Details** button for the app that you just created
 - Click on the **Keys & Tokens** tab:
   - Two keys should already be given to you:  **API Key** & **API Secret Key**
   - Two you have to generate:  **Access token** & **Access token secret**
 - Generate your Access Token and Access Token Secret keys.  You'll need to write these down when you're done -- you can only see them once.

Now you're ready to make requests to the Twitter API.  Every time you make a request, you'll need to include the 4 tokens you just created.  (You can always regenerate them if you need to.)  
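If you would rather not paste the four tokens directly into code that other people might see, one option is to read them from environment variables.  The sketch below is only an illustration and not a requirement of the assignment: the variable names and the `load_tokens` helper are hypothetical, and for the file you turn in you should still follow the instructions in Section 4.

```python
import os

# Hypothetical environment variable names; choose whatever names you like.
TOKEN_VARS = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET_KEY",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_tokens(env=None):
    """Return the four tokens as a tuple, in the order OAuth1() expects them."""
    env = os.environ if env is None else env
    # Collect the names of any variables that are unset or empty
    missing = [name for name in TOKEN_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in TOKEN_VARS)


# Usage, once requests_oauthlib is installed and the variables are set:
#   from requests_oauthlib import OAuth1
#   auth = OAuth1(*load_tokens())
```

Passing a plain dictionary as `env` makes the helper easy to try out without touching your real environment.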

**Step 4:  Your First Request**

To make requests to the Twitter API you're going to need a module which is **not** pre-installed in Anaconda.  It's called `requests_oauthlib`, and you'll need to install it via pip, which is Python's package manager.  Open Anaconda Prompt or Terminal and run the command `pip install requests_oauthlib`.

The logistics of making an OAuth1 authenticated request are very similar to what was done in class 3, but with a few additional steps.  You can see how to do it here:  https://requests.readthedocs.io/en/master/user/authentication/#oauth-1-authentication.  The only thing you'll need to change is the info for your API tokens that are passed into the `OAuth1()` function.

Try making a request to the following URL to confirm that you have things set up correctly: 'https://api.twitter.com/1.1/account/verify_credentials.json'

In [1]:
# your code here
import requests
from requests_oauthlib import OAuth1

# This request uses the search endpoint; any authenticated endpoint, including verify_credentials, confirms your setup
url = 'https://api.twitter.com/1.1/search/tweets.json?q=%23datascience&src=typed_query'
# Replace these placeholders with your own four tokens; avoid publishing real credentials
auth = OAuth1('API_KEY', 'API_SECRET_KEY', 'ACCESS_TOKEN', 'ACCESS_TOKEN_SECRET')

req = requests.get(url, auth=auth)

In [2]:
twitter_data = req.json()

In [3]:
type(twitter_data)

dict

In [4]:
twitter_data.keys()

dict_keys(['statuses', 'search_metadata'])

If you get your json object back, then you're good to go.

## Section 2: Searching Tweets

Most websites you access will have a long string attached to the end of their URL that looks something like this:  `http://thewebsite.com/?year=2019&color=golden%20yellow&user_id=48549395959438`.

Most people have no reason to pay attention to any of this, but all the special symbols at the end are basically encoded commands that say 'return a website that displays x,y,z characteristics.'  

When accessing API data, it works the same way.

**Step 1:  Set Up Your First Query String**

If you go into Twitter and search for the term `Data Science`, you should be brought to a url that looks like this:  `https://twitter.com/search?q=Data%20Science&src=typed_query`

If you'd like, you can drop the `&src=typed_query` from the url and still get the same results.

There are some important details to pay attention to:

 - As in class 3 when we worked with GitHub, there is a **base url**.  In this case it's `https://twitter.com/search`
 - Whenever you enter a search for something, the base url will be followed by something that looks like `?q=My%20Search%20Term`
  - The `?` marks the beginning of the query string.  This basically says 'initiate a request with whatever parameters follow'
  - The `q` is a **parameter**: a condition passed in the query string that determines which results are given back to you.  In this case, `q` carries the text you typed in, encoded so that the server can read it.
  
**Useful Thing To Do Right Now:** Go back to the Twitter search page, and just try searching for different things, and notice what shows up after the `q=`.  Here are some questions to ask yourself:

 - How are white spaces encoded?  Ie, if you search for `Jonathan Bechtel` in the search box, what shows up to account for the space between the two words?
 - What about hash symbols?  If you search for `#MeToo`, `#GirlsWhoCode` or `#DataScience`, what happens with that `#` symbol?
 - Once you get the hang of this, see if you can just re-create some searches yourself by creating the url directly, and bypassing the search box altogether.  Ie, be familiar enough with how searches are formatted that you know `https://twitter.com/search?q=%23DataScience` will take you to the same page as typing in `#DataScience` into the search box.
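To check your answers to the questions above, the encoding can be reproduced with Python's standard library.  This is a minimal sketch; `build_search_url` is a hypothetical helper name, and `urllib.parse.quote` is used here because it percent-encodes characters the same way the search URLs shown above do.

```python
from urllib.parse import quote


def build_search_url(term, base_url="https://twitter.com/search"):
    """Percent-encode a search term and attach it to the base url as the q parameter."""
    # safe='' asks quote() to encode every reserved character, including '#' and '/'
    return f"{base_url}?q={quote(term, safe='')}"


print(build_search_url("Data Science"))   # https://twitter.com/search?q=Data%20Science
print(build_search_url("#DataScience"))   # https://twitter.com/search?q=%23DataScience
```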

Now, let's try and make a request for a search for `Data Science`.  

If you look at Twitter's docs, you'll see that the base url for the search API is `https://api.twitter.com/1.1/search/tweets.json`

This means you have to add `?q=Whatever%20Word%20Goes%20Here` to the end to complete the search.

So go ahead, and see if you can create your API call for a search for the term `Data Science`.

If you did it correctly, you should have a dictionary with a key called `statuses`, and it'll be a list with all of the tweets returned by your search.  

In [7]:
# your answer here
url = 'https://api.twitter.com/1.1/search/tweets.json?q=%23datascience&src=typed_query'
req = requests.get(url, auth=auth)
ds_data = req.text
#ds_data = req.json()
ds_data

'{"statuses":[{"created_at":"Sat Nov 07 02:26:13 +0000 2020","id":1324900936767082496,"id_str":"1324900936767082496","text":"RT @5stocksinto: 5 Indonesian Stocks Into Software https:\\/\\/t.co\\/XB2NBi7gDp  #Indonesia #software #Stocks #technology #erp #digital #mobilea\\u2026","truncated":false,"entities":{"hashtags":[{"text":"Indonesia","indices":[76,86]},{"text":"software","indices":[87,96]},{"text":"Stocks","indices":[97,104]},{"text":"technology","indices":[105,116]},{"text":"erp","indices":[117,121]},{"text":"digital","indices":[122,130]}],"symbols":[],"user_mentions":[{"screen_name":"5stocksinto","name":"Andrew Duff","id":760674570458140673,"id_str":"760674570458140673","indices":[3,15]}],"urls":[{"url":"https:\\/\\/t.co\\/XB2NBi7gDp","expanded_url":"http:\\/\\/www.5stocksinto.com\\/5-indonesian-stocks-into-software\\/","display_url":"5stocksinto.com\\/5-indonesian-s\\u2026","indices":[51,74]}]},"metadata":{"iso_language_code":"it","result_type":"recent"},"source":"\\u003ca href

For good measure, try doing a search for tweets relating to `#MeToo` as well.

In [8]:
# your answer here
twitter_url = 'https://twitter.com/search?q=%23metoo&src=typed_query'
api_url ='https://api.twitter.com/1.1/search/tweets.json?q=%23metoo'
req = requests.get(api_url, auth=auth)
me_too_data = req.text
#me_too_data = req.json()
me_too_data



**Step 2:  Adding Parameters to Your Query String**

Query strings are built from a few pieces:

 - The `?` marks the beginning of the query string, and basically says 'everything that follows this will encode something about the information that's going to get returned to you'.
 - What follows is a series of names, each followed by an `=` sign and a value.  These are parameters.
 - So when you make an API call to `https://api.twitter.com/1.1/search/tweets.json?q=My%20Search%20Term`, the `q` is a parameter.  
 - You can add multiple parameters to a query string. They are separated by `&`. They dictate what kinds of results are returned.  
  - For example, a parameter you can use in Twitter's search API is `count`, which tells the API how many results to return.  The default is 15, but you can return up to 100.  So if we wanted to search for tweets and return 50 results our query string would look like the following:
    `https://api.twitter.com/1.1/search/tweets.json?q=My%20Search%20String&count=50`
  - You can add as many of these parameters to your string as you'd like.  So for example, if we wanted to include parameters for `count` and `result_type`, we could do the following: `https://api.twitter.com/1.1/search/tweets.json?q=My%20Search%20String&count=50&result_type=mixed`
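Rather than gluing parameters together by hand, you can let Python assemble the query string.  The sketch below uses the standard library's `urlencode`; `build_query_url` is a hypothetical helper name introduced only for this illustration.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_query_url(base_url, **params):
    """Join a base url and keyword parameters into a full url with a query string."""
    # quote_via=quote encodes spaces as %20, matching the examples above
    query = urlencode(params, quote_via=quote)
    return f"{base_url}?{query}"


url = build_query_url(BASE_URL, q="My Search String", count=50, result_type="mixed")
print(url)
# https://api.twitter.com/1.1/search/tweets.json?q=My%20Search%20String&count=50&result_type=mixed
```

`requests.get` also accepts a `params` dictionary and builds the query string for you, which is often the more convenient route once you are comfortable with how the encoding works.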
  
To get the hang of this, try searching for tweets that mention the hashtag `#DeepLearning`, and return 75 results.

In [14]:
api_url ='https://api.twitter.com/1.1/search/tweets.json?q=%23metoo&count=50'
me_too_fifty_req = requests.get(api_url, auth=auth)

In [16]:
me_too_fifty_req.json()

{'statuses': [{'created_at': 'Sun Nov 01 19:23:23 +0000 2020',
   'id': 1322982589058158595,
   'id_str': '1322982589058158595',
   'text': '@Kemasda23 #meToo',
   'truncated': False,
   'entities': {'hashtags': [{'text': 'meToo', 'indices': [11, 17]}],
    'symbols': [],
    'user_mentions': [{'screen_name': 'Kemasda23',
      'name': 'KEMASDA',
      'id': 965333013948182529,
      'id_str': '965333013948182529',
      'indices': [0, 10]}],
    'urls': []},
   'metadata': {'iso_language_code': 'und', 'result_type': 'recent'},
   'source': '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
   'in_reply_to_status_id': 1322911872945893376,
   'in_reply_to_status_id_str': '1322911872945893376',
   'in_reply_to_user_id': 965333013948182529,
   'in_reply_to_user_id_str': '965333013948182529',
   'in_reply_to_screen_name': 'Kemasda23',
   'user': {'id': 421229385,
    'id_str': '421229385',
    'name': 'Marian LB',
    'screen_name': 'marialilb',
    'loca

Try adding a second parameter.  You can find the list here:  https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets

In [9]:
api_url ='https://api.twitter.com/1.1/search/tweets.json?q=%23metoo&count=50&lang=en'
me_too_fifty_req = requests.get(api_url, auth=auth)
me_too_fifty_req.json()

{'statuses': [{'created_at': 'Sat Nov 07 02:29:11 +0000 2020',
   'id': 1324901681994199041,
   'id_str': '1324901681994199041',
   'text': '@lucasbet49 Sorry, not a gambler as such, HOWEVER l did put my money where my mouth was in suing someone for raping… https://t.co/zpASMvE3Qh',
   'truncated': True,
   'entities': {'hashtags': [],
    'symbols': [],
    'user_mentions': [{'screen_name': 'lucasbet49',
      'name': 'Lucasbet',
      'id': 1293727664033816576,
      'id_str': '1293727664033816576',
      'indices': [0, 11]}],
    'urls': [{'url': 'https://t.co/zpASMvE3Qh',
      'expanded_url': 'https://twitter.com/i/web/status/1324901681994199041',
      'display_url': 'twitter.com/i/web/status/1…',
      'indices': [117, 140]}]},
   'metadata': {'iso_language_code': 'en', 'result_type': 'recent'},
   'source': '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
   'in_reply_to_status_id': 1320505342166355969,
   'in_reply_to_status_id_str': '13205

## Section 3: Searching Users

The last section of the API you'll need to get the hang of before you're let loose is the users API, which allows you to search for users and get their followers, friends, etc., as opposed to tweets that fit particular criteria.  This part is pretty similar to the advanced lab in class 3, so if you saw how that worked then you shouldn't need much instruction.  

But if you're seeing this with fresh eyes, you'll want to spend 15-20 minutes to make sure you understand this part.  

Official documentation can be found here:  https://developer.twitter.com/en/docs/accounts-and-users/follow-search-get-users/overview

So, as an example, if you want to get a list of someone's followers, you use the base url `https://api.twitter.com/1.1/followers/list.json` and then enter your query string to get a list of that person's followers.  

List of parameters to use can be found here:  https://developer.twitter.com/en/docs/accounts-and-users/follow-search-get-users/api-reference/get-followers-list

One possible parameter to use is `screen_name`, so if you wanted to get a list of someone's followers based on their screen name (the handle that begins with an @), then you would set up your API call to look something like:

`https://api.twitter.com/1.1/followers/list.json?screen_name=persons_screenname`

Note that you exclude the `@`.
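As a small illustration of the url described above, the sketch below accepts a handle with or without the `@` and builds the followers url.  The helper name `followers_url` is hypothetical.

```python
from urllib.parse import urlencode

FOLLOWERS_URL = "https://api.twitter.com/1.1/followers/list.json"


def followers_url(handle):
    """Build the followers/list url for a handle given with or without a leading @."""
    # lstrip only removes the @ from the start of the string
    screen_name = handle.lstrip("@")
    return FOLLOWERS_URL + "?" + urlencode({"screen_name": screen_name})


print(followers_url("@persons_screenname"))
# https://api.twitter.com/1.1/followers/list.json?screen_name=persons_screenname
```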

**Your turn:** Pull in the list of General Assembly's followers.  General Assembly's handle is `@GA`.

Note that this won't return the whole list of GA's followers.  If you want to do that you have to use cursoring:  https://developer.twitter.com/en/docs/basics/cursoring.  This is the topic of your bonus assignment.
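To give a feel for how cursoring works, here is a rough sketch of the loop.  It assumes each page is a dictionary with a `users` list and a `next_cursor` value, that the first request uses a cursor of `-1`, and that a `next_cursor` of `0` means there are no more pages.  The names `collect_followers` and `fetch_page` are hypothetical; in real use `fetch_page` would make the authenticated request with the `cursor` parameter added to the query string.

```python
def collect_followers(fetch_page, max_pages=5):
    """Follow next_cursor values from page to page and gather every user seen."""
    users = []
    cursor = -1  # -1 asks for the first page
    for _ in range(max_pages):  # cap the number of requests to respect API limits
        page = fetch_page(cursor)
        users.extend(page.get("users", []))
        cursor = page.get("next_cursor", 0)
        if cursor == 0:  # 0 means there are no further pages
            break
    return users


# Stand-in for the API: two fake pages keyed by the cursor that requests them
fake_pages = {
    -1: {"users": [{"screen_name": "a"}], "next_cursor": 5},
    5: {"users": [{"screen_name": "b"}], "next_cursor": 0},
}
print(collect_followers(lambda cursor: fake_pages[cursor]))
```

Passing the page-fetching function in as an argument keeps the loop testable without spending any of your API quota.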

In [16]:
# your answer here
"""api_url ='https://api.twitter.com/1.1/followers/list.json?screen_name=ga'
ga_followers = requests.get(api_url, auth=auth)
ga_followers.json()"""

import pandas as pd
api_url ='https://api.twitter.com/1.1/users/lookup.json?screen_name=ga'
ga_followers = requests.get(api_url, auth=auth)
ga_followers.json()
#df = pd.DataFrame(ga_followers.json())
#keys = df.columns
#keys

[{'id': 170393291,
  'id_str': '170393291',
  'name': 'General Assembly',
  'screen_name': 'GA',
  'location': '',
  'description': 'We transform careers and teams — including more than one third of the Fortune 100 — through dynamic courses in coding, data, design, and business.',
  'url': 'https://t.co/YQeEXPxJ4H',
  'entities': {'url': {'urls': [{'url': 'https://t.co/YQeEXPxJ4H',
      'expanded_url': 'http://ga.co/Twitter',
      'display_url': 'ga.co/Twitter',
      'indices': [0, 23]}]},
   'description': {'urls': []}},
  'protected': False,
  'followers_count': 164654,
  'friends_count': 5393,
  'listed_count': 3248,
  'created_at': 'Sat Jul 24 18:19:59 +0000 2010',
  'favourites_count': 35115,
  'utc_offset': None,
  'time_zone': None,
  'geo_enabled': True,
  'verified': True,
  'statuses_count': 22807,
  'lang': None,
  'status': {'created_at': 'Fri Nov 06 23:26:00 +0000 2020',
   'id': 1324855585737744387,
   'id_str': '1324855585737744387',
   'text': 'Take the road less tra

## Section 4: Functions

This section details the functions you have to write and turn in as part of your homework assignment.  

Please read the requirements carefully.

**What you'll turn in:** A `.py` file with all of the functions written.  We should be able to load this into an IDE, run the file, and then call your functions to verify how and if they work. This file should also be properly commented so we can follow your line of reasoning.

The functions you'll be prompted to write will be defined in the following ways:

 - **name:** the name of the function
 - **returns:** what the function should return
 - **arguments:** arguments to include inside the function in order to specify how it should behave.
 
 **Note:** The free API has limitations built into it, so from time to time you'll only be able to return some of the results from the API.  This is fine.  It's understood and recognized that your functions won't be able to return an entire list of someone's followers or other such things, so as long as your work delivers the best it can under present circumstances you'll be in good shape.
 
 **Other Note:** Every aspect of the API that you need to use can be found on either of these pages.
 
 Search API:  https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets
 
 Users API: https://developer.twitter.com/en/docs/accounts-and-users/follow-search-get-users/api-reference
 
**Remarks About Your Final Work**

 - It's okay if you get stuck somewhere.  If there's one item that you can't figure out and it doesn't quite work right, it's probably best to move on and try other things.  Again, try and explain what you were looking to do.  You'll pass if you give an honest effort.
 - There's potentially a lot of error handling you could do to verify user input is correct, but you can leave that alone for now.  Just make sure the core purpose of the function works the way it's supposed to.
 - While you're working on this, it's possible you may bump into your API limits.  Keep this in mind if you have a function that's working, but 45 minutes later it doesn't and you haven't changed anything.  This usually means the data you're getting back from your API calls isn't what it's supposed to be because you've exhausted your limits. We won't hold you to double checking for all of this in your functions.
 - In the file you turn in, make sure your requests are referencing your API tokens, so that way we can run your file right away.  Ie, make sure somewhere in your script you have a variable at the top that reads `tokens = OAuth1('token1', 'token2', 'token3', 'token4')` so it can be used for your requests inside the file.
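If you suspect you've hit your API limits, the response itself can usually tell you.  The sketch below assumes the v1.1 API signals an exhausted limit with HTTP status 429 and reports limits in the `x-rate-limit-remaining` and `x-rate-limit-reset` headers (the latter as a Unix timestamp).  The helper name `describe_rate_limit` is hypothetical, and none of this is required in the functions you turn in.

```python
import time


def describe_rate_limit(status_code, headers, now=None):
    """Return a short note about the rate-limit state of a response."""
    now = time.time() if now is None else now
    remaining = headers.get("x-rate-limit-remaining")
    reset = headers.get("x-rate-limit-reset")
    if status_code == 429:
        if reset is None:
            return "Rate limit exceeded; try again later."
        # The reset header is a Unix timestamp, so subtract the current time
        wait = max(0, int(reset) - int(now))
        return f"Rate limit exceeded; try again in about {wait} seconds."
    if remaining is not None:
        return f"{remaining} requests left in the current window."
    return "No rate-limit information in this response."


# With a real response you would call:
#   describe_rate_limit(req.status_code, req.headers)
print(describe_rate_limit(429, {"x-rate-limit-reset": "1000"}, now=940))
```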

In [12]:
"""
step 1 - include default arguments to accept tokens
tokens = OAuth1('token1','token2','token3','token4')
function details:
    name
    returns
    arguments
"""#function details 
name = 'example_handle'  # hypothetical handle, used only so this scratch line runs
print(f"No user exists with the handle {name}")

##### Function 1 (Required)

**Name:** `find_user`

**Returns:** dictionary that represents user object returned by Twitter's API

**Arguments:**
 - `screen_name`: str, required; Twitter handle to search for.  **Can include the @ symbol.  The function should check for this and remove it if necessary.**
 - `keys`: list, optional; list that contains keys to return about user object.  If not specified, then function should return the entire user object.  **These only need to be outer keys.** If they are keys nested within another key, you don't have to account for this.
 
**To test:** We'll test your function in the following ways:

 - `find_user('@GA')`
 - `find_user('GA')`
 - `find_user('GA', keys=['name', 'screen_name', 'followers_count', 'friends_count'])`

In [32]:
# Replace the placeholders with your own API tokens; avoid publishing real credentials.
auth = OAuth1('API_KEY', 'API_SECRET_KEY', 'ACCESS_TOKEN', 'ACCESS_TOKEN_SECRET')

def find_user(screen_name, keys=None):
    """
    Look up a Twitter user by handle.

    screen_name: str, required; Twitter handle, with or without a leading @.
    keys: list of str, optional; outer keys of the user object to return.
          If omitted, the entire user object is returned.
    Returns a dictionary representing the user, or an empty dictionary if the lookup fails.
    """
    # Remove a leading @ if present (lstrip only touches the start of the string)
    screen_name = screen_name.lstrip('@')

    try:
        api_url = 'https://api.twitter.com/1.1/users/lookup.json'
        user_details = requests.get(api_url, params={'screen_name': screen_name}, auth=auth)
        data = user_details.json()
    except (requests.RequestException, ValueError):
        # Covers network problems and responses that are not valid JSON
        data = None

    # A successful lookup returns a non-empty list of user objects; anything else means no match
    if not isinstance(data, list) or not data:
        print("No user exists with the handle {}".format(screen_name))
        return {}

    user = data[0]
    if keys is None:
        return user
    # Keep only the requested outer keys that are present in the user object
    return {key: user[key] for key in keys if key in user}
        
example1 = find_user('@GA')
example2 = find_user('GA')
example3 = find_user('GA', keys=['name', 'screen_name', 'followers_count', 'friends_count'])

In [33]:
print(example1)

{'id': 170393291, 'id_str': '170393291', 'name': 'General Assembly', 'screen_name': 'GA', 'location': '', 'description': 'We transform careers and teams — including more than one third of the Fortune 100 — through dynamic courses in coding, data, design, and business.', 'url': 'https://t.co/YQeEXPxJ4H', 'entities': {'url': {'urls': [{'url': 'https://t.co/YQeEXPxJ4H', 'expanded_url': 'http://ga.co/Twitter', 'display_url': 'ga.co/Twitter', 'indices': [0, 23]}]}, 'description': {'urls': []}}, 'protected': False, 'followers_count': 164653, 'friends_count': 5393, 'listed_count': 3248, 'created_at': 'Sat Jul 24 18:19:59 +0000 2010', 'favourites_count': 35115, 'utc_offset': None, 'time_zone': None, 'geo_enabled': True, 'verified': True, 'statuses_count': 22807, 'lang': None, 'status': {'created_at': 'Fri Nov 06 23:26:00 +0000 2020', 'id': 1324855585737744387, 'id_str': '1324855585737744387', 'text': 'Take the road less travelled. 🚗 Map out a scenic, social distance friendly trip in your home 

In [34]:
print(example2)

{'id': 170393291, 'id_str': '170393291', 'name': 'General Assembly', 'screen_name': 'GA', 'location': '', 'description': 'We transform careers and teams — including more than one third of the Fortune 100 — through dynamic courses in coding, data, design, and business.', 'url': 'https://t.co/YQeEXPxJ4H', 'entities': {'url': {'urls': [{'url': 'https://t.co/YQeEXPxJ4H', 'expanded_url': 'http://ga.co/Twitter', 'display_url': 'ga.co/Twitter', 'indices': [0, 23]}]}, 'description': {'urls': []}}, 'protected': False, 'followers_count': 164653, 'friends_count': 5393, 'listed_count': 3248, 'created_at': 'Sat Jul 24 18:19:59 +0000 2010', 'favourites_count': 35115, 'utc_offset': None, 'time_zone': None, 'geo_enabled': True, 'verified': True, 'statuses_count': 22807, 'lang': None, 'status': {'created_at': 'Fri Nov 06 23:26:00 +0000 2020', 'id': 1324855585737744387, 'id_str': '1324855585737744387', 'text': 'Take the road less travelled. 🚗 Map out a scenic, social distance friendly trip in your home 

In [35]:
print(example3)

{'name': 'General Assembly', 'screen_name': 'GA', 'followers_count': 164653, 'friends_count': 5393}


##### Function 2 (Required)

**Name:** `find_hashtag`

**Returns:** list of data objects that contain information about each tweet that matches the hashtag provided as input.

**Arguments:**
 - `hashtag`: str, required; text to use as a hashtag search.  
 - `count`: int, optional; number of results to return
 - `search_type`: str, optional; type of results to return.  should accept 3 different values:
   - `mixed`:   return mix of most recent and most popular results
   - `recent`:  return most recent results
   - `popular`: return most popular results
   
**Note:** User should **not** have to actually use the `#` character for the `hashtag` argument.  The function should check to see if it's there, and if not, add it in for them.

**To Test:**  We'll check your function in the following ways:
 - `find_hashtag('DataScience')`
 - `find_hashtag('#DataScience')`
 - `find_hashtag('#DataScience', count=100)`, and double check the length of the `statuses` key to make sure it contains the right amount of results.  **Note:** Due to the version of the API we're using, the number of results returned will **not** necessarily match the value passed into the `count` parameter.  So if you specify 50 and it only returns 45, you are likely still doing it correctly.
 - `find_hashtag('#DataScience', search_type='recent/mixed/popular')`

In [79]:
# Replace the placeholders with your own API tokens; avoid publishing real credentials.
auth = OAuth1('API_KEY', 'API_SECRET_KEY', 'ACCESS_TOKEN', 'ACCESS_TOKEN_SECRET')

def find_hashtag(hashtag, count=20, search_type='mixed'):
    """
    Find tweets tagged with a specific hashtag.

    hashtag: str, required; the hashtag to search for, with or without the leading #.
    count: int, optional; number of tweets to request (the API returns at most 100).
    search_type: str, optional; one of 'mixed', 'recent', or 'popular'.
    Returns a list of tweet objects, or an empty list if the request fails.
    """
    # Remove any leading # and add exactly one back, so 'DataScience' and '#DataScience' both work
    query = '#' + hashtag.lstrip('#')

    try:
        api_url = 'https://api.twitter.com/1.1/search/tweets.json'
        params = {'q': query, 'count': count, 'result_type': search_type}
        # requests encodes the parameters, so the # is sent as %23
        selected_tweets = requests.get(api_url, params=params, auth=auth)
        # The 'statuses' key holds the list of tweet objects, which is what this function returns
        return selected_tweets.json()['statuses']
    except (requests.RequestException, ValueError, KeyError):
        # Covers network problems, non-JSON responses, and error payloads without 'statuses'
        print("There was an error associated with your input")
        return []
example1 = find_hashtag('DataScience')
example2 = find_hashtag('#DataScience')
example3 = find_hashtag('#DataScience', count=100)

In [80]:
example1

[{'created_at': 'Thu Nov 05 22:11:00 +0000 2020',
  'id': 1324474320643629059,
  'id_str': '1324474320643629059',
  'text': "10-page [PDF] reference covers a semester's worth of introductory probability — The best #Probability Cheat Sheet t… https://t.co/gmlAaTGpgN",
  'truncated': True,
  'entities': {'hashtags': [{'text': 'Probability', 'indices': [89, 101]}],
   'symbols': [],
   'user_mentions': [],
   'urls': [{'url': 'https://t.co/gmlAaTGpgN',
     'expanded_url': 'https://twitter.com/i/web/status/1324474320643629059',
     'display_url': 'twitter.com/i/web/status/1…',
     'indices': [117, 140]}]},
  'metadata': {'result_type': 'popular', 'iso_language_code': 'en'},
  'source': '<a href="https://about.twitter.com/products/tweetdeck" rel="nofollow">TweetDeck</a>',
  'in_reply_to_status_id': None,
  'in_reply_to_status_id_str': None,
  'in_reply_to_user_id': None,
  'in_reply_to_user_id_str': None,
  'in_reply_to_screen_name': None,
  'user': {'id': 534563976,
   'id_str': '534563

In [63]:
example2

[{'statuses': [{'created_at': 'Thu Nov 05 22:11:00 +0000 2020',
    'id': 1324474320643629059,
    'id_str': '1324474320643629059',
    'text': "10-page [PDF] reference covers a semester's worth of introductory probability — The best #Probability Cheat Sheet t… https://t.co/gmlAaTGpgN",
    'truncated': True,
    'entities': {'hashtags': [{'text': 'Probability', 'indices': [89, 101]}],
     'symbols': [],
     'user_mentions': [],
     'urls': [{'url': 'https://t.co/gmlAaTGpgN',
       'expanded_url': 'https://twitter.com/i/web/status/1324474320643629059',
       'display_url': 'twitter.com/i/web/status/1…',
       'indices': [117, 140]}]},
    'metadata': {'result_type': 'popular', 'iso_language_code': 'en'},
    'source': '<a href="https://about.twitter.com/products/tweetdeck" rel="nofollow">TweetDeck</a>',
    'in_reply_to_status_id': None,
    'in_reply_to_status_id_str': None,
    'in_reply_to_user_id': None,
    'in_reply_to_user_id_str': None,
    'in_reply_to_screen_name': None

In [67]:
example3

[{'statuses': [{'created_at': 'Thu Nov 05 22:11:00 +0000 2020',
    'id': 1324474320643629059,
    'id_str': '1324474320643629059',
    'text': "10-page [PDF] reference covers a semester's worth of introductory probability — The best #Probability Cheat Sheet t… https://t.co/gmlAaTGpgN",
    'truncated': True,
    'entities': {'hashtags': [{'text': 'Probability', 'indices': [89, 101]}],
     'symbols': [],
     'user_mentions': [],
     'urls': [{'url': 'https://t.co/gmlAaTGpgN',
       'expanded_url': 'https://twitter.com/i/web/status/1324474320643629059',
       'display_url': 'twitter.com/i/web/status/1…',
       'indices': [117, 140]}]},
    'metadata': {'result_type': 'popular', 'iso_language_code': 'en'},
    'source': '<a href="https://about.twitter.com/products/tweetdeck" rel="nofollow">TweetDeck</a>',
    'in_reply_to_status_id': None,
    'in_reply_to_status_id_str': None,
    'in_reply_to_user_id': None,
    'in_reply_to_user_id_str': None,
    'in_reply_to_screen_name': None

In [66]:
example4

[{'errors': [{'code': 32, 'message': 'Could not authenticate you.'}]}]

##### Function 3 (Required)

**Name:** `get_followers`

**Returns:** list of data objects, one for each of the user's followers, returning values for the `name`, `followers_count`, `friends_count`, and `screen_name` keys for each follower.

**Arguments:** 

 - `screen_name`: str, required; Twitter handle to search for.  **Results should not depend on user inputting the @ symbol.**
 - `keys`: list, optional;  keys to return for each user.  Default value: [`name`, `followers_count`, `friends_count`, `screen_name`]; if other keys are listed, values for those keys should be returned instead.
 - `to_df`: bool, optional; default value: False; if True, return results in a dataframe.  Each value provided in the `keys` argument should be its own column, with one row per user.
 
**To Test:** We'll test your functions in the following ways:

 - `get_followers('@GA')`
 - `get_followers('GA')`
 - `get_followers('GA', keys=['name', 'followers_count'])`
 - `get_followers('GA', keys=['name', 'followers_count'], to_df=True)`
 - `get_followers('GA', to_df=True)`

In [112]:
import pandas as pd

# Replace the placeholders with your own API tokens; avoid publishing real credentials.
auth = OAuth1('API_KEY', 'API_SECRET_KEY', 'ACCESS_TOKEN', 'ACCESS_TOKEN_SECRET')

def get_followers(screen_name, keys=None, to_df=False):
    """
    Return details about the followers of a Twitter user.

    screen_name: str, required; Twitter handle, with or without a leading @.
    keys: list of str, optional; keys to return for each follower.
          Defaults to name, followers_count, friends_count and screen_name.
    to_df: bool, optional; if True, return a DataFrame with one column per key.
    Returns a list of dictionaries (or a DataFrame), one entry per follower.
    """
    if keys is None:
        keys = ['name', 'followers_count', 'friends_count', 'screen_name']

    # Remove a leading @ if present
    screen_name = screen_name.lstrip('@')

    user_data = []
    try:
        api_url = 'https://api.twitter.com/1.1/followers/list.json'
        user_details = requests.get(api_url, params={'screen_name': screen_name}, auth=auth)
        # Only the first page of followers is returned; cursoring is needed for the rest
        for user in user_details.json()['users']:
            user_data.append({key: user[key] for key in keys})
    except (requests.RequestException, ValueError, KeyError):
        # Covers network problems, non-JSON responses, error payloads, and unknown keys
        print("There was an error associated with your input")

    # Return whatever was collected, in the requested format
    return pd.DataFrame(user_data) if to_df else user_data

example1 = get_followers('@GA')
example2 = get_followers('GA')
example3 = get_followers('GA', keys=['name', 'followers_count'])
example4 = get_followers('GA', keys=['name', 'followers_count'], to_df=True)
example5 = get_followers('GA', to_df=True)

In [113]:
example1

[{'name': 'Carl Koski',
  'followers_count': 14,
  'friends_count': 198,
  'screen_name': 'carl_koski'},
 {'name': 'John White',
  'followers_count': 266523,
  'friends_count': 175061,
  'screen_name': 'juanblanco76'},
 {'name': 'Hierarch Soft Tech',
  'followers_count': 384,
  'friends_count': 1394,
  'screen_name': 'HierarchTech'},
 {'name': 'Brandon R Myers',
  'followers_count': 2,
  'friends_count': 50,
  'screen_name': 'Brandon50352325'},
 {'name': 'Francisco',
  'followers_count': 7,
  'friends_count': 33,
  'screen_name': '_franciscomsosa'},
 {'name': 'Angela Crampton',
  'followers_count': 1932,
  'friends_count': 1024,
  'screen_name': 'angelatravels11'},
 {'name': 'Sumon Chandra',
  'followers_count': 186,
  'friends_count': 2314,
  'screen_name': 'SumonCh58302627'},
 {'name': 'Priscilla Solorzano',
  'followers_count': 105,
  'friends_count': 464,
  'screen_name': 'MsP_Solorzano'},
 {'name': 'Seo Young Oh',
  'followers_count': 11,
  'friends_count': 22,
  'screen_name': 's

In [114]:
example2

[{'name': 'Carl Koski',
  'followers_count': 14,
  'friends_count': 198,
  'screen_name': 'carl_koski'},
 {'name': 'John White',
  'followers_count': 266523,
  'friends_count': 175061,
  'screen_name': 'juanblanco76'},
 {'name': 'Hierarch Soft Tech',
  'followers_count': 384,
  'friends_count': 1394,
  'screen_name': 'HierarchTech'},
 {'name': 'Brandon R Myers',
  'followers_count': 2,
  'friends_count': 50,
  'screen_name': 'Brandon50352325'},
 {'name': 'Francisco',
  'followers_count': 7,
  'friends_count': 33,
  'screen_name': '_franciscomsosa'},
 {'name': 'Angela Crampton',
  'followers_count': 1932,
  'friends_count': 1024,
  'screen_name': 'angelatravels11'},
 {'name': 'Sumon Chandra',
  'followers_count': 186,
  'friends_count': 2314,
  'screen_name': 'SumonCh58302627'},
 {'name': 'Priscilla Solorzano',
  'followers_count': 105,
  'friends_count': 464,
  'screen_name': 'MsP_Solorzano'},
 {'name': 'Seo Young Oh',
  'followers_count': 11,
  'friends_count': 22,
  'screen_name': 's

In [115]:
example3

[{'name': 'Carl Koski', 'followers_count': 14},
 {'name': 'John White', 'followers_count': 266523},
 {'name': 'Hierarch Soft Tech', 'followers_count': 384},
 {'name': 'Brandon R Myers', 'followers_count': 2},
 {'name': 'Francisco', 'followers_count': 7},
 {'name': 'Angela Crampton', 'followers_count': 1932},
 {'name': 'Sumon Chandra', 'followers_count': 186},
 {'name': 'Priscilla Solorzano', 'followers_count': 105},
 {'name': 'Seo Young Oh', 'followers_count': 11},
 {'name': 'Ven2021', 'followers_count': 2},
 {'name': 'Elizabeth Kane (she/her)', 'followers_count': 1413},
 {'name': 'Andres', 'followers_count': 6},
 {'name': 'Citizen Consulting', 'followers_count': 4},
 {'name': 'مَنارْ | Manar', 'followers_count': 1011},
 {'name': 'Brad Steeves', 'followers_count': 6643},
 {'name': 'Funke Bakare', 'followers_count': 178},
 {'name': 'Ahlam | أحـلامـ 💫', 'followers_count': 125},
 {'name': 'Social Hoote', 'followers_count': 522},
 {'name': 'Podcasters Directory', 'followers_count': 477},
 

In [116]:
example4

|  | name | followers_count |
|---|---|---|
| 0 | Carl Koski | 14 |
| 1 | John White | 266523 |
| 2 | Hierarch Soft Tech | 384 |
| 3 | Brandon R Myers | 2 |
| 4 | Francisco | 7 |
| 5 | Angela Crampton | 1932 |
| 6 | Sumon Chandra | 186 |
| 7 | Priscilla Solorzano | 105 |
| 8 | Seo Young Oh | 11 |
| 9 | Ven2021 | 2 |


In [117]:
example5

|  | name | followers_count | friends_count | screen_name |
|---|---|---|---|---|
| 0 | Carl Koski | 14 | 198 | carl_koski |
| 1 | John White | 266523 | 175061 | juanblanco76 |
| 2 | Hierarch Soft Tech | 384 | 1394 | HierarchTech |
| 3 | Brandon R Myers | 2 | 50 | Brandon50352325 |
| 4 | Francisco | 7 | 33 | _franciscomsosa |
| 5 | Angela Crampton | 1932 | 1024 | angelatravels11 |
| 6 | Sumon Chandra | 186 | 2314 | SumonCh58302627 |
| 7 | Priscilla Solorzano | 105 | 464 | MsP_Solorzano |
| 8 | Seo Young Oh | 11 | 22 | syoh1010 |
| 9 | Ven2021 | 2 | 128 | yanglutc |


##### Function 4 (Optional)

**Name:** `friends_of_friends`

**Returns:** list of data objects, one for each friend that the two Twitter users have in common

**Arguments:**

 - `names`: list, required; list of two Twitter users whose friends lists should be compared
 - `keys`: list, optional; list of keys to return for information about each user.  Default value should be to return the entire data object.
 - `to_df`: bool, optional; default value: False; if True, returns results in a dataframe.
 
**To Test:** We'll test your function in the following ways:

 - `friends_of_friends(['Beyonce', 'MariahCarey'])`
 - `friends_of_friends(['@Beyonce', '@MariahCarey'], to_df=True)`
 - `friends_of_friends(['Beyonce', 'MariahCarey'], keys=['id', 'name'])`
 - `friends_of_friends(['Beyonce', 'MariahCarey'], keys=['id', 'name'], to_df=True)`
 
Each of these should return 3 results (assuming the two accounts haven't followed any more of the same people since this was written).

**Hint:** The `id` key is the unique identifier for someone, so if you want to check if two people are the same this is the best way to do it.
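As a rough illustration of the hint, the sketch below compares two friends lists by `id` using a set. The function name `shared_by_id` and the two sample lists are invented for the example; real data objects from the API hold many more keys.

```python
def shared_by_id(first_friends, second_friends):
    """Return the data objects from first_friends whose id also appears in second_friends."""
    # a set makes each membership check fast
    second_ids = {friend["id"] for friend in second_friends}
    return [friend for friend in first_friends if friend["id"] in second_ids]


# made-up friends lists for two users
first = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}, {"id": 3, "name": "C"}]
second = [{"id": 3, "name": "C"}, {"id": 4, "name": "D"}, {"id": 1, "name": "A"}]

shared = shared_by_id(first, second)
print([friend["id"] for friend in shared])
```

Comparing on `id` rather than on the whole dictionary avoids false mismatches when any other field differs between the two responses.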

In [228]:
auth= OAuth1('r8ZOzLMZoouq5u8iCdmNKiSMx', 'tIqgdrMHBrkMRXfBlY36cUpxDw34LygVL9hBg5A4Dl0yIQLOAb','1216043853284827136-XgLCLf15GtZZ0oIsgInUIjJT1lnPmT', 'ksKmplYg3WBqwThvTbhknJmhBYE7t43Bt0ZijmExr2Nq2')

def friends_of_friends(names, keys=None, to_df=False):
    """
    Find the friends (accounts followed) that two Twitter users have in common.

    names: list of two Twitter handles, with or without a leading @
    keys: list of str, the keys to return for each shared friend
          (default: None, which returns the entire data object)
    to_df: bool, if True return a DataFrame instead of a list of dictionaries

    Returns a list of dictionaries (or a DataFrame) with one entry per shared friend.
    """

    # remove a leading @ if present and lowercase each handle
    names = [name.lstrip("@").lower() for name in names]

    api_url = 'https://api.twitter.com/1.1/friends/list.json'

    # one dictionary per name, mapping each friend's id to that friend's data object
    all_user_data = []
    for name in names:
        user_details = requests.get(api_url, params={'screen_name': name}, auth=auth)
        friends_list = user_details.json()['users']
        all_user_data.append({friend['id']: friend for friend in friends_list})

    # a friend is shared if their id appears in both dictionaries;
    # keying on id also guarantees each shared friend is returned only once
    shared_friends = [x for x in all_user_data[1] if x in all_user_data[0]]

    # look up the data object for each shared friend
    output = [all_user_data[1][friend_id] for friend_id in shared_friends]

    # keep only the requested keys, if any were given
    if keys is not None:
        output = [{key: data[key] for key in keys} for data in output]

    if to_df:
        return pd.DataFrame(output)
    return output

In [205]:
test = user_details.json()['users']
len(shared_friends)

20

In [167]:
shared_friends = [x for x in all_user_data[0] if x in all_user_data[1]]

In [218]:
user_details

<Response [200]>

In [230]:
friends_of_friends(['@ifeladipo','@ladipoore'],keys=['id', 'name'], to_df=True)

|  | id | name |
|---|---|---|
| 0 | 72853440 | Moe |
| 1 | 35735895 | Editi Effiòng |


##### Function 5 (Optional)

Rewrite the `friends_of_friends` function, except this time include an argument called `full_search`, which accepts a boolean value.  If set to `True`, use cursoring to cycle through the complete set of users for the users provided.  

The Twitter API only returns a subset of users in each response to save bandwidth, so you have to cycle through multiple result sets to get all of the values.

You can read more about how this works here:  https://developer.twitter.com/en/docs/basics/cursoring

Basically, you need a `while` loop that keeps making new requests, passing the value stored in the `next_cursor` key as the `cursor` parameter of your next query string, until there's nothing left to search (`next_cursor` comes back as 0).

**Note:** We're using the free API, so we're operating under some limitations.  One of them is that you can only make 15 API calls in a 15-minute span to this portion of the API.  You can also only return up to 200 results per cursor, so you won't be able to completely search for everyone even if you set this up correctly.

That's fine, just do what you can under the circumstances.
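The loop described above could look roughly like the sketch below. To keep it runnable without credentials, `fetch_page` is a hypothetical stand-in for the real request and `FAKE_PAGES` is made-up data; only the `users` and `next_cursor` keys and the starting cursor of -1 come from the API's cursoring scheme. The `max_calls` budget is an assumption that mirrors the 15-call limit.

```python
# made-up pages keyed by cursor value; -1 is the cursor for the first page
FAKE_PAGES = {
    -1: {"users": [{"id": 1}, {"id": 2}], "next_cursor": 111},
    111: {"users": [{"id": 3}, {"id": 4}], "next_cursor": 222},
    222: {"users": [{"id": 5}], "next_cursor": 0},
}


def fetch_page(cursor):
    """Hypothetical stand-in for one API request; returns one page of results."""
    return FAKE_PAGES[cursor]


def collect_all_users(max_calls=15):
    """Follow next_cursor until it is 0 or the call budget is spent."""
    users = []
    cursor = -1
    calls = 0
    while cursor != 0 and calls < max_calls:
        page = fetch_page(cursor)
        calls += 1
        users.extend(page["users"])
        # a next_cursor of 0 means there are no more pages
        cursor = page["next_cursor"]
    return users


all_users = collect_all_users()
print(len(all_users))

# with a smaller budget the loop stops early and returns what it has so far
first_two_pages = collect_all_users(max_calls=2)
print(len(first_two_pages))
```

Stopping on a call budget rather than looping until the cursor is 0 means the function returns partial results instead of running into the rate limit.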

**To Test:** To test your function, we'll run the following function calls:

 - `friends_of_friends(['ezraklein', 'tylercowen'])` -- should return 4 results if you do an API call that returns 200 results
 - `friends_of_friends(['ezraklein', 'tylercowen'], full_search=True)` -- should return 54 results if you do an API call that returns 200 results
 
**Hint:** Chances are you will exhaust your API limits quite easily in this function depending on who you search for.  Depending on how you have things set up, this could cause error messages to arise when things are otherwise fine.  Remember in class 3 when we were getting those weird dictionaries back because our limits were used up?  We won't hold you accountable for handling this inside your function, although it could make some things easier for your own testing.
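
If you do want to guard against this in your own testing, one option is to check the parsed response before reading from it. The sketch below assumes the failure shows up as a dictionary with an `errors` key in place of `users`; `has_users` is a hypothetical helper name and both sample payloads are made up for the example.

```python
def has_users(payload):
    """Return True if the parsed response holds a 'users' list, False if it looks like an error."""
    return isinstance(payload, dict) and "users" in payload and "errors" not in payload


# made-up payloads: one normal page of results and one error response
good_payload = {"users": [{"id": 1, "name": "Example User"}], "next_cursor": 0}
error_payload = {"errors": [{"code": 32, "message": "Could not authenticate you."}]}

for payload in (good_payload, error_payload):
    if has_users(payload):
        print("found", len(payload["users"]), "user(s)")
    else:
        print("no users returned:", payload.get("errors"))
```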
       
Good luck!

In [239]:
auth= OAuth1('r8ZOzLMZoouq5u8iCdmNKiSMx', 'tIqgdrMHBrkMRXfBlY36cUpxDw34LygVL9hBg5A4Dl0yIQLOAb','1216043853284827136-XgLCLf15GtZZ0oIsgInUIjJT1lnPmT', 'ksKmplYg3WBqwThvTbhknJmhBYE7t43Bt0ZijmExr2Nq2')

def friends_of_friends(names, keys=None, to_df=False, full_search=False):
    """
    Find the friends (accounts followed) that two Twitter users have in common.

    names: list of two Twitter handles, with or without a leading @
    keys: list of str, the keys to return for each shared friend
          (default: None, which returns the entire data object)
    to_df: bool, if True return a DataFrame instead of a list of dictionaries
    full_search: bool, if True use cursoring to page through each user's complete friends list

    Returns a list of dictionaries (or a DataFrame) with one entry per shared friend.
    """

    # remove a leading @ if present and lowercase each handle
    names = [name.lstrip("@").lower() for name in names]

    api_url = 'https://api.twitter.com/1.1/friends/list.json'

    # one dictionary per name, mapping each friend's id to that friend's data object
    all_user_data = []
    for name in names:
        friends = {}
        # -1 asks for the first page of results; reset it for every name
        cursor = -1
        while cursor != 0:
            # count=200 is the most results the API returns per page
            params = {'screen_name': name, 'count': 200, 'cursor': cursor}
            user_details = requests.get(api_url, params=params, auth=auth)
            friends_list = user_details.json()
            # raises KeyError: 'users' if the rate limit has been used up,
            # because the API then sends back an error payload instead
            for friend in friends_list['users']:
                friends[friend['id']] = friend
            # next_cursor is 0 once the last page has been reached;
            # without full_search we stop after the first page
            cursor = friends_list['next_cursor'] if full_search else 0
        all_user_data.append(friends)

    # a friend is shared if their id appears in both dictionaries;
    # keying on id also guarantees each shared friend is returned only once
    shared_friends = [x for x in all_user_data[1] if x in all_user_data[0]]

    # look up the data object for each shared friend
    output = [all_user_data[1][friend_id] for friend_id in shared_friends]

    # keep only the requested keys, if any were given
    if keys is not None:
        output = [{key: data[key] for key in keys} for data in output]

    if to_df:
        return pd.DataFrame(output)
    return output

In [240]:
friends_of_friends(['@ifeladipo','@ladipoore'],keys=['id', 'name'], to_df=True, full_search = True)

KeyError: 'users'

In [234]:
friends_list

{'users': [{'id': 14634731,
   'id_str': '14634731',
   'name': 'Andy Hall',
   'screen_name': 'andrewbhall',
   'location': '',
   'description': "I'm a professor who runs a research group collecting data to understand how to build a better democracy @Stanford polisci.  Tweets self-destruct after 1 month.",
   'url': 'http://t.co/o9OesVT9h3',
   'entities': {'url': {'urls': [{'url': 'http://t.co/o9OesVT9h3',
       'expanded_url': 'http://www.andrewbenjaminhall.com',
       'display_url': 'andrewbenjaminhall.com',
       'indices': [0, 22]}]},
    'description': {'urls': []}},
   'protected': False,
   'followers_count': 5232,
   'friends_count': 1248,
   'listed_count': 148,
   'created_at': 'Sat May 03 04:10:02 +0000 2008',
   'favourites_count': 1476,
   'utc_offset': None,
   'time_zone': None,
   'geo_enabled': False,
   'verified': False,
   'statuses_count': 107,
   'lang': None,
   'status': {'created_at': 'Fri Nov 06 21:11:56 +0000 2020',
    'id': 1324821846257139714,
    'i

In [235]:
friends_list.keys()

dict_keys(['users', 'next_cursor', 'next_cursor_str', 'previous_cursor', 'previous_cursor_str', 'total_count'])

In [236]:
friends_list['next_cursor']

1668149150393811712