# Introduction

In [2]:
import tweepy
import pandas as pd
import numpy as np
import configparser

### 1. Configuration and Authentication 
This section covers setup and authentication. I use **configparser** to keep my API keys out of the notebook, and I suggest you do the same. To set up your own configuration:


1.  Create a project in the Twitter developer portal.
2.  Generate your API keys and access tokens.
3.  Save them in a `config.ini` file in the following format. Paste each value after the `=` without quotes, because configparser keeps quotes as part of the value:

    ```ini
    [twitter]
    CONSUMER_KEY =
    CONSUMER_SECRET =
    ACCESS_TOKEN =
    ACCESS_TOKEN_SECRET =
    ```

4.  You do not need to install anything for `configparser`: it is part of the Python 3 standard library.

> **Note:** If you don't plan to use configparser, remove the import and change the next cell accordingly, but keep the same variable names :)
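
If you skip the config file, one possible alternative is to read the same four values from environment variables. The sketch below is an assumption about how you might do that, not part of the original setup: `credentials_from_env` is a hypothetical helper, and it assumes you exported variables named exactly like the keys above.

```python
import os

# Variable names used by the rest of the notebook.
REQUIRED_KEYS = (
    "CONSUMER_KEY",
    "CONSUMER_SECRET",
    "ACCESS_TOKEN",
    "ACCESS_TOKEN_SECRET",
)


def credentials_from_env(environ=os.environ):
    """Return the four credentials as a dict, or raise if any is missing.

    Hypothetical helper: assumes each credential was exported as an
    environment variable with the same name as the notebook variable.
    """
    # Treat unset and empty variables the same way.
    missing = [key for key in REQUIRED_KEYS if not environ.get(key)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {key: environ[key] for key in REQUIRED_KEYS}
```

You could then assign `CONSUMER_KEY = creds['CONSUMER_KEY']` and so on from the returned dict, so the later cells run unchanged.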

In [21]:
# read the file from 'config.ini' 
config = configparser.ConfigParser()
config.read('config.ini')

# API Variables
CONSUMER_KEY = config['twitter']['CONSUMER_KEY']
CONSUMER_SECRET = config['twitter']['CONSUMER_SECRET']
ACCESS_TOKEN = config['twitter']['ACCESS_TOKEN']
ACCESS_TOKEN_SECRET = config['twitter']['ACCESS_TOKEN_SECRET']


In [22]:
# authenticate using tweepy
def twitter_setup():
    # pass all four credentials at construction; set_access_token is deprecated in tweepy 4.5+
    auth = tweepy.OAuth1UserHandler(
        CONSUMER_KEY, CONSUMER_SECRET,      # project access
        ACCESS_TOKEN, ACCESS_TOKEN_SECRET,  # user access
    )

    api = tweepy.API(auth = auth)
    return api

extractor = twitter_setup() 

### 2. Data Collection

In [10]:
# fetch up to 100 of the most recent tweets from the authenticated user's timeline
tweets = extractor.user_timeline(count=100)

### 3. Data Exploration

### 4. Data Visualization

### 5. Data Storage

In [11]:
columns_header = ['ID', 'Tweet', 'Timestamp', 'Likes', 'Retweets', 'Length']  # keep these column names identical in every exported .csv file
data = []

In [19]:
# rebuild the rows from scratch so re-running this cell does not append duplicates
data = [[tweet.id, tweet.text, tweet.created_at, tweet.favorite_count, tweet.retweet_count, len(tweet.text)]
        for tweet in tweets]

In [17]:
# convert to a dataframe
df = pd.DataFrame(data = data, columns = columns_header)
df.head()


|   | ID | Tweet | Timestamp | Likes | Retweets | Length |
|---|---|---|---|---|---|---|
| 0 | 1493976251899564039 | @AseeISadan كاميرتهم ترقع شوي ترا 😭😭😭😭😭 | 2022-02-16 15:51:09+00:00 | 0 | 0 | 39 |
| 1 | 1493967109516410884 | @AseeISadan هم يصورونك مضيع اخوي | 2022-02-16 15:14:50+00:00 | 0 | 0 | 32 |
| 2 | 1488910367065579521 | @AseeISadan Better late than never I guess 🤭\n... | 2022-02-02 16:21:08+00:00 | 0 | 0 | 76 |
| 3 | 1440984432345821189 | RT @stat_ksu: مشاركة طلاب قسم الإحصاء وبحوث ال... | 2021-09-23 10:20:35+00:00 | 0 | 10 | 124 |
| 4 | 1417558226552463372 | RT @ChesterBe: Me and my friends singing songs... | 2021-07-20 18:53:12+00:00 | 0 | 1021 | 71 |


In [23]:
df.to_csv('tweets_1.csv')  # use a new file name for each export so earlier files are not overwritten