# Message Analysis

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [3]:
whatsapp = pd.read_csv('whatsapp_messages.csv')
whatsapp = whatsapp.drop(['UserPhone', 'QuotedMessage', 'QuotedMessageDate', 'QuotedMessageTime'], axis=1)

In [4]:
df = whatsapp
len(df)

6264

In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6264 entries, 0 to 6263
Data columns (total 7 columns):
Date1          56 non-null object
Date2          6264 non-null object
Time           6264 non-null object
UserName       6264 non-null object
MessageBody    6264 non-null object
MediaType      76 non-null object
MediaLink      76 non-null object
dtypes: object(7)
memory usage: 342.6+ KB


## Data Cleaning
Let's start the data cleaning by tidying up the date/time columns.

First, every non-null value in the `Date1` column appears to match the `Date2` value in the same row. Once we confirm that, we can drop `Date1` and rename `Date2` to `Date`.

In [6]:
date1_neq_date2 = (~df['Date1'].isnull()) & (df['Date1'] != df['Date2'])

print('Number of rows where Date1 is not equal to Date2:', df[date1_neq_date2].shape[0])

Number of rows where Date1 is not equal to Date2: 0


In [7]:
# Consolidate Date1 and Date2
df = df.drop(['Date1'], axis=1)
df = df.rename(columns={'Date2': 'Date'})
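The check and the consolidation above can also be wrapped in one small helper that refuses to drop a column if the two ever disagree. This is only a sketch: the function name `consolidate_dates` and the toy data are hypothetical and not part of the original notebook.

```python
import pandas as pd

def consolidate_dates(frame, primary='Date2', secondary='Date1', new_name='Date'):
    """Drop `secondary` if it never disagrees with `primary`, then rename `primary`.

    Hypothetical helper for illustration; not part of the original analysis.
    """
    # Only rows where the secondary column is filled in can disagree
    both_present = frame[secondary].notna()
    mismatch = both_present & (frame[secondary] != frame[primary])
    if mismatch.any():
        raise ValueError(
            f'{int(mismatch.sum())} rows disagree between {secondary} and {primary}'
        )
    return frame.drop(columns=[secondary]).rename(columns={primary: new_name})

# Toy data (made up, not taken from the real chat export)
toy = pd.DataFrame({
    'Date1': [None, '2019-08-27', None],
    'Date2': ['2019-08-27', '2019-08-27', '2019-08-28'],
    'UserName': ['A', 'B', 'A'],
})
toy_clean = consolidate_dates(toy)
print(list(toy_clean.columns))  # ['Date', 'UserName']
```

Raising an error instead of silently dropping the column means the step stays safe if the notebook is rerun on a different export.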

In [8]:
# Combine the date and time columns into a single timestamp string
df['Date'] = df['Date'] + ' ' + df['Time']
df = df.drop(['Time'], axis=1)

# Convert to date/time objects
datetime_cols = ['Date']
for col in datetime_cols:
    df[col] = pd.to_datetime(df[col])
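As a possible safeguard for the conversion step, the sketch below parses with an explicit format and `errors='coerce'`, so rows that fail to parse become `NaT` and can be inspected rather than raising an error or being guessed at. The format string `'%Y-%m-%d %H:%M:%S'` is an assumption about how the export writes dates, and the toy rows are made up for illustration.

```python
import pandas as pd

# Toy rows (made up); the third has a deliberately malformed time
toy = pd.DataFrame({
    'Date': ['2019-08-27', '2019-08-27', '2019-08-27'],
    'Time': ['17:26:30', '18:43:17', 'not a time'],
})

# Same concatenation as in the cell above
combined = toy['Date'] + ' ' + toy['Time']

# Assumed format; adjust it to match the actual export.
# errors='coerce' turns unparseable strings into NaT.
parsed = pd.to_datetime(combined, format='%Y-%m-%d %H:%M:%S', errors='coerce')

# Rows that failed to parse, kept for inspection
bad_rows = toy[parsed.isna()]
print(len(bad_rows))  # 1
```

If `bad_rows` is empty on the real data, the assumed format matches every row and the parsed column can replace `Date` directly.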

In [9]:
df

Unnamed: 0,Date,UserName,MessageBody,MediaType,MediaLink
0,2019-08-27 17:26:30,Harrison,Let's do WhatsApp then,,
1,2019-08-27 17:27:20,Harrison,"My dad's a Jersey boy, Italian if you couldn't...",,
2,2019-08-27 17:28:05,Harrison,I was a music major for a year and nearly had ...,,
3,2019-08-27 17:28:50,Harrison,Insects are vegan right? Ever tried one?,,
4,2019-08-27 18:43:17,Jayqwellin Morsay,Cool glad you have WhatsApp! It's the best,,
5,2019-08-27 18:44:41,Jayqwellin Morsay,Gonna take full advantage of the reply feature...,,
6,2019-08-27 18:45:07,Jayqwellin Morsay,I think so? And no ew. You?,,
7,2019-08-27 18:46:03,Jayqwellin Morsay,Does that involve a specific instrument?? Soun...,,
8,2019-08-27 18:47:12,Jayqwellin Morsay,😄 yeah hahaha I could guess. That's cool. Do y...,,
9,2019-08-27 19:33:37,Harrison,Tried something I had no idea what it was in T...,,
