# Pulling Badgers Football Followers and Gophers Football Followers

For the Twitter API assignment, I am looking at the followers of Badgers Football (On Wisconsin!) and Gophers Football. First, I import the packages I need and my API keys so I can authenticate the Tweepy API. After checking how many followers each account has, I create a dictionary that stores a list of follower IDs for each Twitter account. The first main for loop pulls all follower IDs; the second for loop uses those IDs to look up each follower's profile information and writes it to a text file.

In [1]:
#Importing Packages
import datetime
import tweepy

# Using my own API Keys
from MD_API_Keys import api_key, api_key_secret, access_token, access_token_secret

In [2]:
# Authenticate the Tweepy API
auth = tweepy.OAuthHandler(api_key,api_key_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth,wait_on_rate_limit=True)

In [4]:
# Checking follower counts before grabbing follower IDs

team_handles = ['BadgerFootball','GopherFootball']


# This will iterate through each Twitter handle that we're collecting from
for screen_name in team_handles:
    
    # Look up the account we're collecting from
    # The line after that pulls the one field we need here: the number of followers
    user = api.get_user(screen_name=screen_name)
    followers_count = user.followers_count

    # Let's see roughly how long it will take to grab all the follower IDs.
    # At about 5,000 IDs per minute, the two figures printed below are the same estimate
    # in different units (total hours and total minutes), not hours plus minutes.
    print(f'''
    @{screen_name} has {followers_count} followers. 
    That will take roughly {followers_count/(5000*60):.0f} hours and {followers_count/(5000):.2f} minutes
    ''')
    


    @BadgerFootball has 332818 followers. 
    That will take roughly 1 hours and 66.56 minutes
    

    @GopherFootball has 129840 followers. 
    That will take roughly 0 hours and 25.97 minutes
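
The estimate above prints the same duration twice, once in hours and once in minutes, so "1 hours and 66.56 minutes" really means about 66.56 minutes in total. Here is a minimal sketch of splitting that into hours plus leftover minutes. It assumes the same pace of 5,000 IDs per minute used in the cell above, and `format_estimate` is a helper name I made up for illustration.

```python
def format_estimate(followers_count, ids_per_minute=5000):
    """Return the estimated pull time as hours plus leftover minutes."""
    # Assumption: IDs arrive at roughly ids_per_minute once rate limits are averaged in
    total_minutes = followers_count / ids_per_minute
    # divmod splits the total into whole hours and the remaining minutes
    hours, minutes = divmod(total_minutes, 60)
    return f"{int(hours)} h {minutes:.2f} min"


# Follower counts taken from the output above
print(format_estimate(332818))  # 1 h 6.56 min
print(format_estimate(129840))  # 0 h 25.97 min
```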
    


In [9]:
## Creating a dictionary that holds a list of follower IDs for each team's Twitter account
id_dict = {'BadgerFootball' : [],
           'GopherFootball' : []}

In [10]:
# Grabs the time when we start making requests to the API
start_time = datetime.datetime.now()

# .keys() allows us to iterate through each key in the dictionary
for handle in id_dict.keys():
    
    # Each page contains up to 5,000 IDs, and both accounts have far more than 5,000 followers,
    # so we must iterate through every page in order to get all follower IDs
    # To grab the follower IDs, we will be using followers_ids (the tweepy 3.x name for this method)
    for page in tweepy.Cursor(api.followers_ids,
                              # This is how we get around not being able to grab all IDs at once:
                              # once the rate limit is hit, we are notified that tweepy is waiting out
                              # the 15-minute (900-second) window before it carries on
                              wait_on_rate_limit=True, wait_on_rate_limit_notify=True, compression=True,
                              screen_name=handle).pages():

        # The page variable comes back as a list, so we have to use .extend rather than .append
        id_dict[handle].extend(page)
        

# Let's see how long it took to grab all follower IDs
end_time = datetime.datetime.now()
elapsed_time = end_time - start_time
print(elapsed_time)

Rate limit reached. Sleeping for: 894
Rate limit reached. Sleeping for: 893
Rate limit reached. Sleeping for: 893
Rate limit reached. Sleeping for: 891
Rate limit reached. Sleeping for: 891
Rate limit reached. Sleeping for: 891


1:30:38.286192
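
The six "Rate limit reached" messages and the roughly 90-minute runtime line up with a quick back-of-the-envelope count. The sketch below assumes the follower-ID endpoint allows 15 requests per 15-minute window (that figure is my assumption about the limit, not something printed in this notebook), 5,000 IDs per page, and the follower counts printed earlier. The helper names are made up for illustration.

```python
import math

IDS_PER_PAGE = 5000        # page size mentioned in the cell above
REQUESTS_PER_WINDOW = 15   # assumption: requests allowed per 15-minute window


def pages_needed(followers_count):
    """Number of pages (requests) needed to pull every follower ID."""
    return math.ceil(followers_count / IDS_PER_PAGE)


def expected_waits(total_pages):
    """Number of times a full window is used up while pages still remain."""
    return (total_pages - 1) // REQUESTS_PER_WINDOW


# Follower counts from the earlier output; both handles share one rate-limit window
followers = {'BadgerFootball': 332818, 'GopherFootball': 129840}
total_pages = sum(pages_needed(count) for count in followers.values())

print(total_pages, expected_waits(total_pages))  # 93 6
```

Six waits of about 15 minutes each comes to roughly 90 minutes, which is close to the elapsed time printed above.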


In [8]:
## Grabbing the listed fields for each follower and writing them to a text file
headers = ['screen_name','name', 'id', 'location', 'followers_count', 'friends_count', 'description']

for team in id_dict.keys():
    
    with open(f'{team}_followers.txt','w', encoding='utf-8') as out_file:
        out_file.write('\t'.join(headers) + '\n')

        for user_id in id_dict[team]:
            
            # Skip accounts that can't be looked up (for example, suspended or deleted accounts)
            try:
                user = api.get_user(user_id=user_id)
                # Tabs and newlines in the description would break the tab-separated layout
                description = str(user.description).replace('\t',' ').replace('\n',' ')
                # Write the cleaned description, not the raw user.description
                outline = [user.screen_name, user.name, user.id, user.location, user.followers_count, user.friends_count, description]
                
                out_file.write('\t'.join([str(item) for item in outline]) + '\n')
                
            # TweepError is the tweepy 3.x exception class; unlike a bare except,
            # it still lets Ctrl+C interrupt this long-running loop
            except tweepy.TweepError:
                continue
                
        
            

The above code produced a large text file for each team with one follower per line: screen name, name, ID number, location, follower count, friend count, and description. Each field, if provided, is separated by a tab. However, I made a mistake and had to re-run the follower pull for 'BadgerFootball': I realized that I had tacked an 's' onto the end of the handle, which pulled followers for an account other than the official University of Wisconsin account. To save time, I re-ran only 'BadgerFootball' below, so I didn't have to wait for 'GopherFootball' to finish again when I already had its correct info.

In [None]:
## Grabbing the listed fields for each BadgerFootball follower and writing them to a text file
headers = ['screen_name','name', 'id', 'location', 'followers_count', 'friends_count', 'description']

with open('BadgerFootball_followers.txt','w', encoding='utf-8') as out_file:
    out_file.write('\t'.join(headers) + '\n')

    for user_id in id_dict['BadgerFootball']:

        # Skip accounts that can't be looked up (for example, suspended or deleted accounts)
        try:
            user = api.get_user(user_id=user_id)
            # Tabs and newlines in the description would break the tab-separated layout
            description = str(user.description).replace('\t',' ').replace('\n',' ')
            # Write the cleaned description, not the raw user.description
            outline = [user.screen_name, user.name, user.id, user.location, user.followers_count, user.friends_count, description]

            out_file.write('\t'.join([str(item) for item in outline]) + '\n')

        # TweepError is the tweepy 3.x exception class; unlike a bare except,
        # it still lets Ctrl+C interrupt this long-running loop
        except tweepy.TweepError:
            continue
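
Names and locations can contain tabs or newlines just like descriptions can, so one way to keep the file at one follower per line is to clean every field before writing it. The sketch below does that and then reads the rows back with Python's `csv` module. It uses an in-memory buffer and a made-up sample record rather than real follower data, and `clean_field` and `write_rows` are hypothetical helper names, not part of the code above.

```python
import csv
import io

headers = ['screen_name', 'name', 'id', 'location',
           'followers_count', 'friends_count', 'description']


def clean_field(value):
    """Turn a value into text with tabs, newlines, and repeated spaces collapsed."""
    if value is None:
        return ''
    return ' '.join(str(value).split())


def write_rows(out_file, rows):
    """Write the header and the cleaned rows as tab-separated lines."""
    writer = csv.writer(out_file, delimiter='\t', lineterminator='\n')
    writer.writerow(headers)
    for row in rows:
        writer.writerow([clean_field(item) for item in row])


# Made-up record for illustration only
sample = [['example_fan', 'Example\tFan', 12345, 'Madison,\nWI',
           10, 20, 'Go\tBadgers!\nOn Wisconsin']]

buffer = io.StringIO()
write_rows(buffer, sample)

# Read the tab-separated text back into dictionaries keyed by header
buffer.seek(0)
records = list(csv.DictReader(buffer, delimiter='\t'))
print(records[0]['location'])     # Madison, WI
print(records[0]['description'])  # Go Badgers! On Wisconsin
```

To use this with the real files, the buffer would be swapped for the `open(...)` call used above, and the reader would be pointed at a file such as `BadgerFootball_followers.txt`.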