# Customer service

The goal of this notebook is to predict how long a brand's Twitter account takes to reply to a tweet, so that the queue of incoming tweets can be prioritized.

In [313]:
import pandas as pd
import numpy as np

## Compute reply_time
### Data loading

In [314]:
df = pd.read_csv("extraction_twint/data_service_clients/wholefoods.csv").filter(['id','conversation_id','username','date','time'])

### Date parsing

In [315]:
df['datetime']=pd.to_datetime(df['date'] + ' ' + df['time'])
df.drop(['date','time'], axis=1, inplace=True)
df.head()

Unnamed: 0,id,conversation_id,username,datetime
0,1202231733690732544,1202231733690732544,henrystewartdam,2019-12-04 15:22:36
1,1202231286351507461,1202231286351507461,beverlyharzog,2019-12-04 15:20:49
2,1202231159859621888,1202231159859621888,coreycade,2019-12-04 15:20:19
3,1202230605725028352,1202046088649281537,wholefoods,2019-12-04 15:18:07
4,1202227816923828227,1201978721550385153,commasftw,2019-12-04 15:07:02


### Create reply_time

This column contains the time Wholefoods took to reply, stored as a pandas Timedelta (displayed as HH:MM:SS). It is NaT when Wholefoods did not reply.

In [316]:
##Filter Wholefoods tweets
wholefoods_tweets = df[df['username'] == 'wholefoods'].filter(['conversation_id','datetime'])

In [317]:
##Remove wholefoods tweets from the main df
df = df[df['username'] != 'wholefoods']

In [318]:
##Join the tables
join=df.set_index('conversation_id').join(wholefoods_tweets.set_index('conversation_id'), lsuffix='', rsuffix='_reply')

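The join produces one row per (tweet, brand reply) pair, so a tweet appears several times when the brand posted more than once in the same conversation. We need to filter these rows and keep only the first reply from the brand.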

In [319]:
##computes reply time
join['reply_time']=join['datetime_reply']-join['datetime']

In [320]:
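# Remove negative reply_time (brand tweets posted before the customer tweet);
# rows without any brand reply (NaT) are dropped here as well
join = join.loc[join['reply_time'] >= pd.Timedelta(0)]
# Keep only the smallest value, i.e. the first reply to each tweet
reply_times = join.groupby("id", as_index=False)["reply_time"].min().set_index('id')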
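The join, filter and minimum steps above can be checked on a small toy frame. This is only a sketch: the values are made up, and it uses `DataFrame.merge` rather than the notebook's `set_index(...).join(...)`, which should pair the rows in the same way. The names `customers`, `brand`, `pairs` and `first_reply` are illustrative only.

```python
import pandas as pd

# Toy data (hypothetical values): three customer tweets in two conversations,
# and two brand replies, both in conversation 10.
customers = pd.DataFrame({
    "id": [1, 2, 3],
    "conversation_id": [10, 10, 20],
    "datetime": pd.to_datetime([
        "2019-12-04 14:00:00",
        "2019-12-04 14:10:00",
        "2019-12-04 15:00:00",
    ]),
})
brand = pd.DataFrame({
    "conversation_id": [10, 10],
    "datetime": pd.to_datetime([
        "2019-12-04 14:05:00",
        "2019-12-04 14:30:00",
    ]),
})

# One row per (customer tweet, brand reply) pair within a conversation.
pairs = customers.merge(brand, on="conversation_id", how="left",
                        suffixes=("", "_reply"))
pairs["reply_time"] = pairs["datetime_reply"] - pairs["datetime"]

# Drop brand tweets posted before the customer tweet; unmatched rows (NaT)
# compare as False and are dropped too.
pairs = pairs[pairs["reply_time"] >= pd.Timedelta(0)]

# Keep the earliest remaining reply for each customer tweet.
first_reply = pairs.groupby("id")["reply_time"].min()
print(first_reply)
```

Tweet 2 was posted after the first brand reply, so only the second reply counts for it. Tweet 3 has no reply and is absent from `first_reply`, which is why it ends up as NaT once the result is assigned back to the main frame by index.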

In [321]:
#Add the reply time to the main df
df=df.set_index('id')
df.drop(['conversation_id'], axis=1, inplace=True)
df['reply_time']=reply_times['reply_time']

In [322]:
df.head(20000)

id,username,datetime,reply_time
1202231733690732544,henrystewartdam,2019-12-04 15:22:36,NaT
1202231286351507461,beverlyharzog,2019-12-04 15:20:49,NaT
1202231159859621888,coreycade,2019-12-04 15:20:19,NaT
1202227816923828227,commasftw,2019-12-04 15:07:02,NaT
1202226820856385536,theaustinot,2019-12-04 15:03:05,NaT
1202226526713974787,thefoodbankinc,2019-12-04 15:01:54,NaT
1202221503674273796,annajoh5575333,2019-12-04 14:41:57,NaT
1202219721350959104,msamysteele,2019-12-04 14:34:52,NaT
1202216537995857920,msamysteele,2019-12-04 14:22:13,00:11:43
1202216214451195904,annekpix,2019-12-04 14:20:56,NaT
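Since the goal is to predict reply time, a numeric target is likely to be more convenient than a Timedelta. The sketch below, on a hypothetical two-row sample mirroring the output above, converts `reply_time` to seconds and adds a flag for whether a reply exists. The column names `reply_time_s` and `replied` are assumptions for illustration, not part of the notebook.

```python
import pandas as pd

# Hypothetical sample: one answered tweet and one unanswered tweet.
sample = pd.DataFrame({
    "reply_time": pd.to_timedelta(["00:11:43", pd.NaT]),
})

# Timedelta -> float seconds; NaT becomes NaN.
sample["reply_time_s"] = sample["reply_time"].dt.total_seconds()

# Boolean flag: did the brand reply at all?
sample["replied"] = sample["reply_time"].notna()

print(sample)
```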


## Save to CSV

In [324]:
original = pd.read_csv("extraction_twint/data_service_clients/wholefoods.csv")
original = original.set_index('id')
original['reply_time']=reply_times['reply_time']
original.to_csv("extraction_twint/data_service_clients/wholefoods_computed.csv")
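When the computed file is read back, `reply_time` does not return as a Timedelta automatically: `read_csv` loads it as text. A minimal round-trip sketch follows, using an in-memory buffer and a hypothetical two-row frame in place of the real file. The names `out`, `buffer` and `back` are illustrative.

```python
import io
import pandas as pd

# Hypothetical two-row frame standing in for the saved file.
out = pd.DataFrame(
    {"reply_time": pd.to_timedelta(["00:11:43", pd.NaT])},
    index=pd.Index([1202216537995857920, 1202216214451195904], name="id"),
)

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
out.to_csv(buffer)
buffer.seek(0)

# reply_time comes back as text (for example "0 days 00:11:43"),
# so it has to be converted explicitly; empty cells become NaT.
back = pd.read_csv(buffer, index_col="id")
back["reply_time"] = pd.to_timedelta(back["reply_time"])

print(back.dtypes)
```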