# EasyTwitterAPI 

This Jupyter notebook summarizes the main `EasyTwitterAPI` functionality for scraping and storing data from Twitter:
#### Profile Information
- General information: number of posts, followees, screen name...
- Followees (a.k.a., Friends) of a user
- Followers of a user

#### List information
- Lists for which a user is a member/creator/subscriber
- Members of a List

#### Activity
- Timeline (i.e., tweets, answers, retweets, qtweets) of a user
- Favourited tweets of a user
- Tweets by id

In [None]:
# %%
from EasyTwitterAPI.easy_twitter_api import EasyTwitterAPI

import json
import pickle

In order to use the `EasyTwitterAPI` package, you need i) MongoDB credentials and ii) Twitter API credentials.

The **MongoDB credentials** are stored in a file called `local/host_mongodb.txt`, which should contain the connection string for your MongoDB database. It should look something like this:

    mongodb+srv://<USERNAME>:<PASSWORD>@cluster[...]/myFirstDatabase?retryWrites=true&w=majority

Alternatively, you can install MongoDB locally, in which case you only need to set `host='localhost'`.
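
The two options above (a connection string in a file, or a local MongoDB instance) can be combined in a small helper. This is only a sketch: `load_mongo_host` is a hypothetical name that is not part of `EasyTwitterAPI`, and falling back to `'localhost'` when the file is missing or empty is an assumption about what you want.

```python
from pathlib import Path


def load_mongo_host(path="local/host_mongodb.txt", default="localhost"):
    """Return the connection string stored in `path`, or `default` if unavailable.

    Hypothetical helper, not part of EasyTwitterAPI.
    """
    host_file = Path(path)
    if not host_file.is_file():
        # No file found: assume a local MongoDB instance.
        return default
    # Only the first line is used; surrounding whitespace is stripped.
    lines = host_file.read_text(encoding="utf-8").splitlines()
    host = lines[0].strip() if lines else ""
    # An empty file also falls back to the default.
    return host or default
```

The returned value can then be passed as the `host` argument when creating the scraper.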

The **Twitter API credentials** are stored in a file called `local/credentials_api.json`. It should be a JSON file with the following five fields:

    {
    "ACCESS_KEY": "XXXXXX",
    "ACCESS_SECRET": "XXXXXX",
    "CONSUMER_KEY": "XXXXXX",
    "CONSUMER_SECRET": "XXXXXX",
    "BEARER_TOKEN":"XXXXXX"
    }
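
Before creating the scraper, it can help to check that the credentials file contains every field listed above. The sketch below uses only the standard library; `missing_credential_keys` is a hypothetical helper, and treating all five fields as mandatory is an assumption, since the notebook does not say which ones `EasyTwitterAPI` actually reads.

```python
import json

# Field names taken from the example credentials file above.
REQUIRED_KEYS = (
    "ACCESS_KEY",
    "ACCESS_SECRET",
    "CONSUMER_KEY",
    "CONSUMER_SECRET",
    "BEARER_TOKEN",
)


def missing_credential_keys(cred_file):
    """Return the required keys that are absent or empty in the JSON file.

    Hypothetical helper, not part of EasyTwitterAPI.
    """
    with open(cred_file, "r", encoding="utf-8") as f:
        creds = json.load(f)
    # A key counts as missing if it is not present or maps to an empty value.
    return [key for key in REQUIRED_KEYS if not creds.get(key)]
```

An empty list means that all fields are present, for example `missing_credential_keys('local/credentials_api.json') == []`.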

In [None]:
with open("local/host_mongodb.txt", "r") as f:
    host = f.readline().strip()
scraper = EasyTwitterAPI(cred_file='local/credentials_api.json',  
                         db_name='twitter_test',
                         sleep_secs=330,
                         host=host
                        )

In [None]:
for i, coll_name in enumerate(scraper.db.collection_names()):
    print(f"{i} {coll_name}")

In [None]:
scraper.db.create_indexes()

# User information

To scrape the profile information of a user:

In [None]:
scraper.activate_cache(True)
user = scraper.get_user(screen_name='Twitter')
user = scraper.get_user(screen_name='jack')

To load the profile information from the database:

In [None]:
df = scraper.db.load_users(filter_={'screen_name': {'$in':['Twitter', 'jack']}}, find_one=False, return_as='df')
df.head()

To scrape the profile information of multiple users at the same time:

In [None]:
df = scraper.get_many_users(screen_name=['Twitter', 'jack', 'TwitterAPI'])
df.head()

# Followees
To scrape the followees of a user:

In [None]:
followees = scraper.get_followees(screen_name='TwitterAPI')

# Followers
To scrape the followers of a user:

In [None]:
followers = scraper.get_followers(screen_name='jack', max_num=800)

# Lists of user
We can collect three types of lists for a given user:
 - membership (m)
 - owned (o)
 - subscriptions (s)
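
The single-letter codes above are what the `list_type` argument expects in the cells of this section. As a readability aid, the mapping can be written down explicitly. This is a sketch only: `LIST_TYPE_CODES` and `to_list_type` are hypothetical names and are not provided by `EasyTwitterAPI`.

```python
# Descriptive names mapped to the single-letter codes listed above.
LIST_TYPE_CODES = {"membership": "m", "owned": "o", "subscriptions": "s"}


def to_list_type(name):
    """Translate a descriptive name (or an existing code) into 'm', 'o' or 's'.

    Hypothetical helper, not part of EasyTwitterAPI.
    """
    key = name.strip().lower()
    if key in LIST_TYPE_CODES.values():
        # Already a valid single-letter code.
        return key
    if key in LIST_TYPE_CODES:
        return LIST_TYPE_CODES[key]
    raise ValueError(f"Unknown list type: {name!r}")
```

For example, `to_list_type('membership')` gives the value that can be passed as `list_type`.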

In [None]:
from datetime import datetime
since = datetime.strptime('2021-09-01', '%Y-%m-%d')
lists = scraper.update_lists_of_user(list_type='m',
                                     min_dt=datetime.now(),
                                     screen_name='jack',
                                     max_num=120)

print('\nThese are the first 4 Lists:')
for list_id in lists[:4]:
    print(list_id)

To scrape only the IDs of the Lists:

In [None]:
scraper.activate_cache(True)
lists_ids = scraper.get_lists_ids_of_user(list_type='m', screen_name='jack')

print('\nThese are the first 4 Lists:')
for list_id in lists_ids[:4]:
    print(list_id)


To scrape the full information of the Lists:

In [None]:

df_lists = scraper.get_lists_of_user_full(list_type='m', screen_name='jack', max_num=289248, force=True)
df_lists.head()

# Lists

In [None]:
list_ = scraper.get_list(list_id_str='1283064489957445633')
for key, value in list_.items():
    print(f"{key} : {value}")

# Get members of Lists

In [None]:
members_list = scraper.get_members_of_list(list_id_str='1283064489957445633', max_num=1000000)
print('\nThese are the first 4 members:')
for member_id in members_list[:4]:
    print(member_id)

# Timeline

In [None]:
scraper.activate_cache(True)
df = scraper.get_user_activity_limited(screen_name='jack', max_num=2000, update_many=True)
df.head()

In [None]:
df.groupby('type').count()[['id_str']]

In [None]:
df = scraper.get_user_activity_limited(screen_name='jack', max_num=5151517)
df.head()

# Favourites

In [None]:
df = scraper.get_user_favorites(screen_name='jack', max_num=5151517)

In [None]:
df.head()

# Get tweets

In [None]:
id_str_list = list(df['id_str'].values)[:612]
print(f"Number of tweets: {len(id_str_list)}")

In [None]:
df_i = scraper.get_tweets(id_str_list=id_str_list)
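
Twitter's tweet lookup endpoints accept at most 100 IDs per request. The notebook does not say whether `get_tweets` already splits long lists internally, so the sketch below only illustrates how a list of IDs could be split into batches by hand. `chunked` is a hypothetical helper, and the IDs are placeholders rather than real tweets.

```python
def chunked(items, size=100):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder IDs standing in for the 612 tweet IDs selected above.
example_ids = [str(i) for i in range(612)]
batches = list(chunked(example_ids, size=100))
print(len(batches), len(batches[-1]))  # prints: 7 12
```

Each batch could then be passed as `id_str_list` to `scraper.get_tweets`.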

In [None]:
df[df.type=='answer'].head().columns

# How to collect the activity of the followees of a user?

In [None]:
followees = scraper.get_followees(screen_name='jack')

for foll_id_str in followees[:20]:
    df = scraper.get_user_activity_limited(user_id=foll_id_str, max_num=5151517)
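
The loop above overwrites `df` on every iteration and stops at the first exception. A possible variant keeps one result per followee and records failures instead of aborting. This is a sketch under assumptions: `collect_for_users` is a hypothetical helper, and how `EasyTwitterAPI` reports errors (for instance for protected or suspended accounts) is not documented in this notebook.

```python
def collect_for_users(fetch, user_ids, limit=20):
    """Call `fetch(user_id)` for up to `limit` users.

    Returns a dict of results keyed by user ID and a list of
    (user_id, error) pairs for the calls that raised an exception.
    Hypothetical helper, not part of EasyTwitterAPI.
    """
    results, failed = {}, []
    for user_id in user_ids[:limit]:
        try:
            results[user_id] = fetch(user_id)
        except Exception as exc:
            # Record the failure and continue with the next user.
            failed.append((user_id, repr(exc)))
    return results, failed


# Possible usage with the scraper from this notebook (not executed here):
# results, failed = collect_for_users(
#     lambda uid: scraper.get_user_activity_limited(user_id=uid, max_num=5151517),
#     followees,
# )
```

Passing the fetch function as an argument keeps the helper independent of the scraper, so it can be tried with a stub before spending API quota.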
    