## Analysing Intercom Data

In this notebook we analyse customer and conversation data from Intercom, using a Python library that wraps the Intercom API to fetch the data. To run the notebook yourself, follow the install instructions in Readme.md.

### Introduction

GiveMeRates, a B2B SaaS company, has installed Intercom on its members' site. The company runs a price comparison site for software tools and products that businesses can integrate, covering anything from where to get the best corporate cards to software that focuses on employee wellbeing.

GiveMeRates gives its customers (each a group of employees at one company) access to the comparison site and to a live chat function where they can ask questions or report issues with the service.

Each conversation with an employee gets logged by Intercom and resolved by GiveMeRates.
In this example, we aim to analyse the customers (companies) and their respective users (employees).


### Setting up Intercom Connection

The Python library '[python-intercom](https://pypi.org/project/python-intercom/)' provides a simple wrapper around the [Intercom API](https://developers.intercom.com/building-apps/docs/rest-apis), and we will use it throughout this notebook. For more help with the Intercom API and authentication, see the [access token documentation](https://developers.intercom.com/building-apps/docs/authentication-types#section-access-tokens).

1. Set up an account with Intercom if you don't already have one. You will need some conversation activity logged in order to pull any data.
2. Navigate to the Intercom Developer Hub, go to 'Your apps' and create a New App.
3. In your app, navigate to Authentication on the left-hand side panel and make a note of your access token.
4. Install and import the intercom library:
    - Cmd: [pip install python-intercom](https://pypi.org/project/python-intercom/)
5. Insert your access token into the cell below and run it.

In [3]:
from intercom.client import Client
# Replace ACCESS_TOKEN with your own token; never commit a real token to version control.
intercom = Client(personal_access_token='ACCESS_TOKEN')
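One way to keep the token out of the notebook is to read it from an environment variable. The sketch below is only an illustration: the variable name `INTERCOM_ACCESS_TOKEN` and the helper `load_access_token` are assumptions, not part of python-intercom.

```python
import os


def load_access_token(env_var="INTERCOM_ACCESS_TOKEN"):
    """Return the token stored in env_var, or raise a clear error if it is unset.

    The variable name is a hypothetical choice for this sketch.
    """
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return token


# Usage, assuming python-intercom is installed and the variable is set:
# from intercom.client import Client
# intercom = Client(personal_access_token=load_access_token())
```

With this in place the notebook can be shared without exposing credentials, because the token lives only in the environment of whoever runs it.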

### Import Libraries for Data and Plotting

In [4]:
import pandas as pd
import json
import numpy as np
import os
import random
from datetime import date
import datetime

import plotly.offline as py
import plotly.graph_objs as go
import cufflinks as cf
from cufflinks import tools

### Pull Intercom Data Using the Library

##### Customer Data

In [23]:
customers = intercom.companies.find()
customer_data = customers.data

In [43]:
company_cols = ['company_id', 'company_name', 'user_count', 'revenue', 'tag']
# Collect one row per company and build the DataFrame once
# (DataFrame.append was removed in pandas 2.0).
customer_rows = []
for i in customer_data:
    tags = i['tags']['tags']
    customer_rows.append({'company_id': i['id'],
                          'company_name': i['name'],
                          'user_count': i['user_count'],
                          'revenue': i['monthly_spend'],
                          # Use the first tag name if the company has any tags
                          'tag': tags[0]['name'] if tags else ''})
df_customer = pd.DataFrame(customer_rows, columns=company_cols)

##### Conversation Data

In [47]:
conversations = intercom.conversations.find()
conversation_data = conversations.conversations

In [48]:
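conversation_cols = ['created_at', 'parts', 'contact_id']
# Collect rows in a list and build the DataFrame once
# (DataFrame.append was removed in pandas 2.0).
conversation_rows = []
for i in conversation_data:
    conversation_rows.append({'created_at': i['created_at'],
                              'parts': i['statistics']['count_conversation_parts'],
                              # Take the first contact attached to the conversation
                              'contact_id': i['contacts']['contacts'][0]['id']})
df_conversation = pd.DataFrame(conversation_rows, columns=conversation_cols)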
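The loop above indexes `['contacts']['contacts'][0]` directly, which would fail for a conversation with no contact attached. A more defensive flattening step could look like the sketch below. It assumes each conversation is a plain nested dict with the same keys as above; the helper `conversation_row` and the two sample records are made up for illustration and are not Intercom output.

```python
import pandas as pd


def conversation_row(conv):
    """Flatten one conversation dict; contact_id is None when no contact is attached."""
    contacts = conv.get('contacts', {}).get('contacts', [])
    return {
        'created_at': conv.get('created_at'),
        'parts': conv.get('statistics', {}).get('count_conversation_parts'),
        'contact_id': contacts[0]['id'] if contacts else None,
    }


# Made-up records that mimic the nested shape used in the loop above
sample = [
    {'created_at': 1582773790,
     'statistics': {'count_conversation_parts': 58},
     'contacts': {'contacts': [{'id': 10237}]}},
    {'created_at': 1582539129,
     'statistics': {'count_conversation_parts': 41},
     'contacts': {'contacts': []}},
]
df_sample = pd.DataFrame([conversation_row(c) for c in sample])
```

The second sample record has no contacts, so its `contact_id` ends up missing instead of raising an `IndexError`.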

##### User List per Customer - Mapping File

In [88]:
# Build a company -> user mapping; this makes one API request per company.
map_rows = []
for index, row in df_customer.iterrows():
    users = intercom.companies.users(row['company_id'])
    for i in users:
        map_rows.append({'company_id': row['company_id'],
                         'contact_id': i['id']})
df_customer_map = pd.DataFrame(map_rows, columns=['company_id', 'contact_id'])
    

In [None]:
df_conversation = df_conversation.merge(df_customer_map,on='contact_id')
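`merge` defaults to an inner join, so any conversation whose `contact_id` is missing from the mapping is silently dropped. A quick way to check how many rows would be lost is a left join with `indicator=True`. The toy frames below are invented for illustration only.

```python
import pandas as pd

# Toy data: contact 3 has a conversation but no company mapping
conv = pd.DataFrame({'contact_id': [1, 2, 3], 'parts': [5, 8, 2]})
mapping = pd.DataFrame({'company_id': [100, 100], 'contact_id': [1, 2]})

# Default inner join keeps only contacts present in both frames
inner = conv.merge(mapping, on='contact_id')

# Left join with an indicator column shows which conversations found no match
checked = conv.merge(mapping, on='contact_id', how='left', indicator=True)
unmatched = checked[checked['_merge'] == 'left_only']

print(len(inner), len(unmatched))
```

If `unmatched` is not empty on the real data, the mapping file is incomplete and the per-customer figures later in the notebook will undercount.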

### Caching Our Data

Let's save everything to a cache file so we don't need to download the data every time we run this notebook. Caching is also handy for APIs that only expose a limited window of historical data.

In [90]:
df_customer.to_csv('./data/customer_data.csv')
df_conversation.to_csv('./data/conversation_data.csv')

#### Pull in the data from cache

In [95]:
df_customer = pd.read_csv('./data/customer_data.csv')
df_conversation = pd.read_csv('./data/conversation_data.csv')
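The save and load steps can be combined into a single helper that only calls the API when no cache file exists. This is a minimal sketch: `load_or_fetch` and its `fetch` argument are hypothetical names, and `fetch` stands in for whatever code builds the DataFrame from the API.

```python
import os

import pandas as pd


def load_or_fetch(path, fetch):
    """Read path if it exists; otherwise call fetch(), save the result, and return it."""
    if os.path.exists(path):
        return pd.read_csv(path)
    df = fetch()
    # Create the cache directory if it is missing
    os.makedirs(os.path.dirname(path) or '.', exist_ok=True)
    # index=False avoids an extra 'Unnamed: 0' column when the file is read back
    df.to_csv(path, index=False)
    return df
```

Deleting the CSV file forces a fresh download on the next run.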

### Preparing our Data

Let's prepare the data to be used later in the analysis.

In [96]:
df_conversation['created_at'] = pd.to_datetime(df_conversation['created_at'], unit = 's')
df_conversation['created_date'] = pd.to_datetime(df_conversation['created_at']).dt.date
df_conversation.head()

| Unnamed: 0 | created_at | parts | contact_id | company_id | created_date |
|---|---|---|---|---|---|
| 0 | 2020-02-27 03:23:10 | 58.0 | 10237.0 | 10002 | 2020-02-27 |
| 1 | 2020-02-24 10:12:09 | 41.0 | 9754.0 | 10002 | 2020-02-24 |
| 2 | 2020-02-26 23:38:45 | 40.0 | 10197.0 | 10021 | 2020-02-26 |
| 3 | 2020-02-23 01:49:54 | 34.0 | 9929.0 | 10020 | 2020-02-23 |
| 4 | 2020-02-24 12:49:08 | 37.0 | 9929.0 | 10020 | 2020-02-24 |
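The `unit = 's'` argument tells pandas to treat the raw values as Unix timestamps, that is, seconds since 1970-01-01 UTC. The small sketch below shows the conversion on two example epoch values, which are illustrative inputs rather than values copied from the API response.

```python
import pandas as pd

# Two example Unix timestamps in seconds
raw = pd.Series([1582773790, 1582539129])

# Interpret the integers as seconds since the epoch
ts = pd.to_datetime(raw, unit='s')

# Keep only the calendar date for day-level grouping
dates = ts.dt.date
```

The resulting timestamps are timezone-naive and expressed in UTC, so day boundaries may differ from local time.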


## Analysing the Data

#### How much revenue is brought in per customer per month?

In [97]:
layout = cf.Layout(
    height = 600,
    width = 800,
    yaxis = dict(
    title = 'Revenue'),
    xaxis = dict(
    title = 'Customer'),
    title = 'Revenue per Customer'
)

fig = df_customer.set_index('company_name')[['revenue']].sort_values('revenue',ascending=False).\
    iplot(kind='bar',
          fill=True,
          width=2,
          asFigure=True,
          layout = layout)

    

fig.show()

In [99]:
layout = cf.Layout(
    height = 600,
    width = 800,
    yaxis = dict(
    title = 'Users'),
    xaxis = dict(
    title = 'Customer'),
    title = 'Users per Customer'
)

fig = df_customer.set_index('company_name')[['user_count']].sort_values('user_count',ascending=False).\
    iplot(kind='bar',
          fill=True,
          width=2,
          asFigure=True,
          layout = layout)

    

fig.show()

#### How many conversations are coming through Intercom per day?

In [73]:
layout = cf.Layout(
    height = 500,
    width = 800,
    yaxis = dict(
    title = 'Number of Conversations'),
    xaxis = dict(
    title = 'Date'),
    title = 'Number of Conversations per Day'
)

fig = df_conversation.groupby('created_date').agg({'contact_id':'count'}).\
    iplot(kind='line',
          fill=True,
          width=2,
          asFigure=True,
          layout = layout)

    

fig.show()
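Grouping by `created_date` only produces rows for days that had at least one conversation, so quiet days are skipped and the line chart connects straight across them. A possible alternative, sketched here on invented timestamps, is to resample on the datetime column so that empty days appear as zero.

```python
import pandas as pd

# Invented timestamps: two conversations on 23 Feb, none on 24-25 Feb, one on 26 Feb
df = pd.DataFrame({'created_at': pd.to_datetime(
    ['2020-02-23 01:49:54', '2020-02-23 09:00:00', '2020-02-26 23:38:45'])})

# Count conversations per calendar day; days without conversations get 0
daily = df.set_index('created_at').resample('D').size()
```

Here `daily` has one entry per day from 23 to 26 February, including the two empty days.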

#### What is the distribution of each user's conversation length?

In [101]:
layout = cf.Layout(
    height = 500,
    width = 800,
    yaxis = dict(
    title = 'Number of Conversations'),
    xaxis = dict(
    title = 'Number of Parts'),
    title = 'How Many Parts Per Conversation (length of conversation)'
)

fig = df_conversation.groupby('parts').agg({'contact_id':'count'}).\
    iplot(kind='bar',
          fill=True,
          width=2,
          asFigure=True,
          layout = layout)

    

fig.show()

In [103]:
print('Number of paying customers: '+str(len(df_customer.loc[df_customer.tag =='Paid','tag'])))
print('Number of trial customers: '+str(len(df_customer.loc[df_customer.tag =='Trial','tag'])))


Number of paying customers: 16
Number of trial customers: 10
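The same counts can be obtained for every tag at once with `value_counts`, which also makes untagged companies visible. The frame below is toy data; the label `'untagged'` is an arbitrary choice for this sketch.

```python
import pandas as pd

# Toy customer table; company D has no tag
df = pd.DataFrame({'company_name': ['A', 'B', 'C', 'D'],
                   'tag': ['Paid', 'Trial', 'Paid', None]})

# Label missing tags explicitly, then count companies per tag
counts = df['tag'].fillna('untagged').value_counts()
```

This matters after a CSV round trip, because empty tag strings are read back as missing values and `value_counts` drops those by default.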


#### What is the distribution of user activity? How many conversations per user?

In [104]:
layout = cf.Layout(
    height = 500,
    width = 800,
    yaxis = dict(
    title = 'Number of Users'),
    xaxis = dict(
    title = 'Number of Conversations'),
    title = 'How Many Conversations per User'
)

df_temp = df_conversation.groupby('contact_id').agg({'created_at':'count'}).\
    reset_index().groupby('created_at').agg({'contact_id':'count'})
    
fig = df_temp.\
    iplot(kind='bar',
          fill=True,
          width=2,
          asFigure=True,
          layout = layout)

    

fig.show()
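The double `groupby` above first counts conversations per user and then counts how many users share each count. The same idea can be written with two `value_counts` calls, shown here on invented contact ids.

```python
import pandas as pd

# Invented data: user 1 has 3 conversations, user 3 has 2, users 2 and 4 have 1 each
conv = pd.DataFrame({'contact_id': [1, 1, 1, 2, 3, 3, 4]})

# Step 1: number of conversations per user
per_user = conv['contact_id'].value_counts()

# Step 2: number of users for each conversation count, ordered by count
distribution = per_user.value_counts().sort_index()
```

In this toy case two users have one conversation, one user has two, and one user has three.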

#### What is the average activity rate of each customer vs the revenue?

In [108]:
df_merged = df_customer.merge(df_conversation,on='company_id')

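# Per company: total conversations, distinct users who started one, and revenue
df_merged2 = df_merged.groupby(['company_name']).agg({'created_at':'count',\
                                                      'contact_id':'nunique','revenue':'max'})
# Average conversations per active user (users with at least one conversation)
df_merged2['conv_per_user'] = df_merged2['created_at']/df_merged2['contact_id']
df_merged2 = df_merged2.reset_index()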

In [107]:
layout = cf.Layout(
    height = 500,
    width = 800,
    yaxis = dict(
    title = 'Customer Revenue'),
    xaxis = dict(
    title = 'Avg Conversations Per User'),
    title = 'Customer Activity vs Revenue'
)


fig = df_merged2.\
    iplot(kind='scatter',
          x='conv_per_user',
          y='revenue',
          categories ='company_name',
          fill=True,
          width=2,
          asFigure=True,
          layout = layout)

    

fig.show()