# #RWFD Twitter Dataset Analysis

## I. Introduction

**Twitter** is a social media platform where people share their opinions publicly through tweets. A single tweet can go viral and make someone famous, or get them "canceled", which is why businesses use the platform to market their products.
    
**#RWFD** used Twitter to market its product from June 1 to October 19, 2020. How is the business performing? Is it getting enough engagement on its posts? If engagement is high, how can it be maintained? If it is low, how can it be improved?

## II. Data Description

* **ID** - ID of the tweet
* **Tweet** - message composed in the Twitter web app or mobile app
* **Time** - time the user posted the tweet
* **Engagement Rate** - the number of engagements (clicks, retweets, replies, follows, and likes) divided by the total number of impressions
* **Impressions** - number of times users saw the tweet on Twitter
* **Engagements** - total number of times users interacted with the tweet. This includes clicks anywhere on the tweet (including hashtags, links, avatar, username, and tweet expansion), retweets, replies, follows, and likes
* **Likes** - number of times people liked the tweet
* **Hashtag Clicks** - number of times users clicked a hashtag in the tweet
* **Retweets** - number of times people retweeted the tweet
* **Replies** - number of replies the tweet received
* **User Profile Clicks** - number of times the user profile was clicked
* **Url Clicks** - number of times a link in the tweet was clicked
* **Detail Expands** - number of times people expanded the tweet to view its details
* **Permalink Clicks** - number of times the permalink to the tweet (the individual page dedicated to this tweet) was clicked
* **App Opens** - number of times the app shared in the tweet was opened
* **App Installs** - number of times the app shared in the tweet was installed
* **Follows** - number of times the user was followed because of the tweet
* **Email Tweet** - number of times the tweet was shared by email
* **Dial Phone** - number of times a phone number in the tweet was dialed
* **Media Views** - number of times media in the tweet was viewed
* **Media Engagements** - number of clicks on media in the tweet, counted across videos, Vines, GIFs, and images
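
As a quick sanity check on the Engagement Rate definition, the short sketch below recomputes the rate from raw counts. The function name `engagement_rate` and the choice to return 0.0 when a tweet has no impressions are assumptions made for illustration; the sample counts (4 engagements over 365 impressions, 46 over 2644) are taken from the first rows of the dataset.

```python
def engagement_rate(engagements: int, impressions: int) -> float:
    """Engagements divided by impressions (hypothetical helper)."""
    if impressions == 0:
        # Assumption: treat a tweet that nobody saw as having a rate of 0.0
        return 0.0
    return engagements / impressions


# Sample counts taken from the first rows of the dataset
print(round(engagement_rate(4, 365), 6))    # 0.010959
print(round(engagement_rate(46, 2644), 6))  # 0.017398
```

Both values agree with the `engagement rate` column reported for those tweets.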

## III. Methodology


First, we import the libraries needed to analyze the data (install any that are missing beforehand).

In [1]:
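import os  # file path handling
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime as dt
from datetime import timedelta
from time import time
from pandas.plotting import register_matplotlib_converters
from statsmodels.tsa.stattools import acf, pacf
# Note: ARMA was deprecated in statsmodels 0.12 and no longer works in 0.13+;
# on newer versions use statsmodels.tsa.arima.model.ARIMA instead.
from statsmodels.tsa.arima_model import ARMA

register_matplotlib_converters()  # register pandas date converters with matplotlib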

In [2]:
pwd = os.getcwd()  # current working directory (where the notebook was launched)
path = os.path.join(pwd, 'SocialMedia.csv')  # full path to the social media CSV file
socmed_df = pd.read_csv(path)
socmed_df.head()

| | Tweet | id | time | impressions | engagements | engagement rate | retweets | replies | likes | user profile clicks | ... | hashtag clicks | detail expands | permalink clicks | app opens | app installs | follows | email tweet | dial phone | media views | media engagements |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | id ligula suspendisse ornare consequat lectus ... | 6672570000000000.0 | 2020-06-30 21:09 +0000 | 365 | 4 | 0.010959 | 0 | 0 | 1 | 2 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | euismod scelerisque quam turpis adipiscing lor... | 8265460000000000.0 | 2020-06-30 17:14 +0000 | 184 | 2 | 0.01087 | 0 | 0 | 2 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | leo rhoncus sed vestibulum sit amet cursus id ... | 281117000000000.0 | 2020-06-30 16:59 +0000 | 2644 | 46 | 0.017398 | 1 | 1 | 17 | 0 | ... | 0 | 23 | 0 | 0 | 0 | 0 | 0 | 0 | 354 | 1 |
| 3 | aenean lectus pellentesque eget nunc donec qui... | 7758030000000000.0 | 2020-06-30 13:55 +0000 | 301 | 3 | 0.009967 | 0 | 1 | 2 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | sed accumsan felis ut at dolor quis odio conse... | 6131840000000000.0 | 2020-06-30 12:13 +0000 | 528 | 0 | 0.0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |


In [3]:
socmed_df.describe() 

| | id | impressions | engagements | engagement rate | retweets | replies | likes | user profile clicks | url clicks | hashtag clicks | detail expands | permalink clicks | app opens | app installs | follows | email tweet | dial phone | media views | media engagements |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| count | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 | 1166.0 |
| mean | 4868715000000000.0 | 781.90223 | 79.111492 | 0.03638 | 0.403945 | 0.966552 | 6.76072 | 2.202401 | 2.426244 | 0.259863 | 13.294168 | 0.0 | 0.002573 | 0.0 | 0.0 | 0.0 | 0.0 | 76.393654 | 52.761578 |
| std | 2914026000000000.0 | 4705.174635 | 1587.72074 | 0.042312 | 4.928566 | 3.196524 | 73.221354 | 20.512202 | 14.847932 | 1.618631 | 137.062569 | 0.0 | 0.065462 | 0.0 | 0.0 | 0.0 | 0.0 | 1365.208497 | 1362.6146 |
| min | 4837990000000.0 | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% | 2342715000000000.0 | 166.0 | 3.0 | 0.012247 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 50% | 4785870000000000.0 | 298.5 | 6.0 | 0.02329 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 75% | 7473875000000000.0 | 710.75 | 18.0 | 0.042003 | 0.0 | 1.0 | 5.0 | 1.0 | 0.0 | 0.0 | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 30.0 | 1.0 |
| max | 9987610000000000.0 | 156685.0 | 54098.0 | 0.389892 | 165.0 | 65.0 | 2487.0 | 641.0 | 240.0 | 41.0 | 4024.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 46489.0 | 46489.0 |
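
The summary statistics show a mean of about 782 impressions against a median of 298.5, with a maximum of 156,685, which suggests that a handful of very popular tweets pull the average up. Here is a minimal sketch of that effect, using only the five impression counts visible in the `head()` output (not the full dataset) and Python's standard `statistics` module:

```python
import statistics

# Impression counts of the five tweets shown in the head() output
impressions = [365, 184, 2644, 301, 528]

mean_impressions = statistics.mean(impressions)
median_impressions = statistics.median(impressions)

# One large value (2644) drags the mean well above the median
print(mean_impressions, median_impressions)
```

On the full DataFrame, the same comparison could be made with `socmed_df['impressions'].mean()` and `socmed_df['impressions'].median()`.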


In [4]:
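socmed_df.select_dtypes('number').corr()  # correlation matrix of the numeric columns (recent pandas versions raise an error if text columns are included)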

| | id | impressions | engagements | engagement rate | retweets | replies | likes | user profile clicks | url clicks | hashtag clicks | detail expands | permalink clicks | app opens | app installs | follows | email tweet | dial phone | media views | media engagements |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| id | 1.0 | -0.001808 | 0.005715 | -0.008065 | 0.007981 | -0.004528 | 0.00696 | -0.002252 | 0.000621 | -0.026965 | -0.012775 | NaN | -0.030467 | NaN | NaN | NaN | NaN | 0.00931 | 0.007587 |
| impressions | -0.001808 | 1.0 | 0.983443 | 0.254709 | 0.974741 | 0.602432 | 0.986237 | 0.925781 | 0.499667 | 0.101875 | 0.917847 | NaN | 0.010493 | NaN | NaN | NaN | NaN | 0.977166 | 0.976146 |
| engagements | 0.005715 | 0.983443 | 1.0 | 0.244017 | 0.981747 | 0.512097 | 0.995858 | 0.922128 | 0.479686 | 0.026823 | 0.885499 | NaN | 6.3e-05 | NaN | NaN | NaN | NaN | 0.996528 | 0.998726 |
| engagement rate | -0.008065 | 0.254709 | 0.244017 | 1.0 | 0.241946 | 0.270397 | 0.246627 | 0.276519 | 0.345663 | 0.093746 | 0.271534 | NaN | 0.002793 | NaN | NaN | NaN | NaN | 0.231263 | 0.23421 |
| retweets | 0.007981 | 0.974741 | 0.981747 | 0.241946 | 1.0 | 0.498087 | 0.982569 | 0.907716 | 0.528239 | 0.039554 | 0.861761 | NaN | 0.031363 | NaN | NaN | NaN | NaN | 0.977964 | 0.980197 |
| replies | -0.004528 | 0.602432 | 0.512097 | 0.270397 | 0.498087 | 1.0 | 0.5222 | 0.520287 | 0.263318 | 0.6185 | 0.736193 | NaN | 0.012718 | NaN | NaN | NaN | NaN | 0.485635 | 0.478998 |
| likes | 0.00696 | 0.986237 | 0.995858 | 0.246627 | 0.982569 | 0.5222 | 1.0 | 0.92469 | 0.482823 | 0.028974 | 0.874752 | NaN | 0.002278 | NaN | NaN | NaN | NaN | 0.995123 | 0.994658 |
| user profile clicks | -0.002252 | 0.925781 | 0.922128 | 0.276519 | 0.907716 | 0.520287 | 0.92469 | 1.0 | 0.453077 | 0.051439 | 0.838301 | NaN | 0.00153 | NaN | NaN | NaN | NaN | 0.916038 | 0.915896 |
| url clicks | 0.000621 | 0.499667 | 0.479686 | 0.345663 | 0.528239 | 0.263318 | 0.482823 | 0.453077 | 1.0 | 0.053461 | 0.421225 | NaN | 0.034196 | NaN | NaN | NaN | NaN | 0.467803 | 0.47031 |
| hashtag clicks | -0.026965 | 0.101875 | 0.026823 | 0.093746 | 0.039554 | 0.6185 | 0.028974 | 0.051439 | 0.053461 | 1.0 | 0.263658 | NaN | 0.009887 | NaN | NaN | NaN | NaN | 0.000137 | -0.000968 |
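
The missing entries for permalink clicks, app installs, follows, email tweet, and dial phone line up with the columns whose standard deviation is 0 in the summary statistics. The Pearson correlation divides by each column's spread, so it is undefined for a constant column. The hand-written `pearson` function below is an illustrative sketch of that formula, not the pandas implementation. The `likes` and `retweets` values are the five rows from the `head()` output, and `follows` is all zeros, as in the dataset.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (illustrative sketch)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant column has zero variance, so the ratio is 0/0: undefined
        return math.nan
    return cov / math.sqrt(var_x * var_y)


likes = [1, 2, 17, 2, 0]      # varies from tweet to tweet
retweets = [0, 0, 1, 0, 0]    # varies as well
follows = [0, 0, 0, 0, 0]     # constant, like the all-zero columns in the dataset

print(pearson(likes, follows))   # nan
print(pearson(likes, retweets))  # a value close to 1 for these five rows
```

Columns that never vary carry no information for a correlation analysis, so dropping them before computing the matrix is one reasonable option.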


In [5]:
socmed_df.shape # to check the number of rows and columns

(1166, 21)

In [6]:
socmed_df.dtypes #check the datatypes of each column

Tweet                   object
id                     float64
time                    object
impressions              int64
engagements              int64
engagement rate        float64
retweets                 int64
replies                  int64
likes                    int64
user profile clicks      int64
url clicks               int64
hashtag clicks           int64
detail expands           int64
permalink clicks         int64
app opens                int64
app installs             int64
follows                  int64
email tweet              int64
dial phone               int64
media views              int64
media engagements        int64
dtype: object
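
One detail in the dtypes listing worth a second look is that `id` was read as `float64`. A 64-bit float represents every integer exactly only up to 2**53 (about 9.0e15), and the largest `id` in the summary statistics is above that, so some identifiers may have been rounded. The sketch below demonstrates the rounding with plain Python numbers. The identifier in it is hypothetical, and whether rounding actually affects this file is an assumption that would need checking against the raw CSV.

```python
# Every integer up to 2**53 fits exactly in a float64; above it, gaps appear
limit = 2 ** 53               # 9007199254740992

exact_id = limit + 1          # a hypothetical identifier just above the limit
as_float = float(exact_id)    # what a float64 column would store

print(exact_id)               # 9007199254740993
print(int(as_float))          # 9007199254740992 -> the last digit changed
```

If exact identifiers matter, one option is to read the column as text, for example `pd.read_csv(path, dtype={'id': str})`.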

In [7]:
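# Remove the trailing UTC offset. str.replace drops the exact ' +0000' text, whereas
# str.strip('+0000') would trim any run of '+' and '0' characters from both ends.
socmed_df['time'] = socmed_df['time'].str.replace(' +0000', '', regex=False)
socmed_df['time'] = pd.to_datetime(socmed_df['time'])  # convert the time column to a datetime data type
socmed_df.dtypes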

Tweet                          object
id                            float64
time                   datetime64[ns]
impressions                     int64
engagements                     int64
engagement rate               float64
retweets                        int64
replies                         int64
likes                           int64
user profile clicks             int64
url clicks                      int64
hashtag clicks                  int64
detail expands                  int64
permalink clicks                int64
app opens                       int64
app installs                    int64
follows                         int64
email tweet                     int64
dial phone                      int64
media views                     int64
media engagements               int64
dtype: object
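
Removing the `+0000` suffix is a spot where `str.strip` is easy to misuse: it treats its argument as a set of characters to trim from both ends, not as a literal suffix. The sketch below shows the difference on plain Python strings and then parses the offset directly with `datetime.strptime` and the `%z` directive. The sample strings are illustrative; the first one follows the timestamp format used in the dataset.

```python
from datetime import datetime

raw = "2020-06-30 21:00 +0000"

# strip() trims any run of '+' and '0' characters; here the space happens to stop it
print(repr(raw.strip("+0000")))           # '2020-06-30 21:00 '

# Without that space, digits belonging to the time itself are eaten as well
print(repr("21:00+0000".strip("+0000")))  # '21:'

# Parsing the offset explicitly avoids string surgery altogether
parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M %z")
naive_utc = parsed.replace(tzinfo=None)   # drop the (UTC) offset after parsing
print(naive_utc)                          # 2020-06-30 21:00:00
```

In pandas, `pd.to_datetime(socmed_df['time'], utc=True)` is a comparable route; it returns timezone-aware UTC values rather than the naive datetimes shown in the dtypes listing.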