### Factors that influence user adoption
        
#### Creation source
The following table shows the percentage of users that were adopted for each `creation_source`:

| creation_source | Adopted users (%) |
|---|---|
| GUEST_INVITE | 20.528967 |
| ORG_INVITE | 15.401506 |
| PERSONAL_PROJECTS | 19.109948 |
| SIGNUP | 14.488936 |
| SIGNUP_GOOGLE_AUTH | 14.873646 |
        
Based on the above table, users created through GUEST_INVITE or PERSONAL_PROJECTS have the highest adoption rates (about 20.5% and 19.1%), compared with roughly 14.5% to 15.4% for ORG_INVITE, SIGNUP, and SIGNUP_GOOGLE_AUTH.

#### Opted in to mailing list
The following table shows the percentage of users that were adopted, split by whether the user opted into the mailing list (1) or not (0):

| opted_in_to_mailing_list | Adopted users (%) |
|---|---|
| 0 | 16.158860 |
| 1 | 16.981132 |

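Based on the above table, the adoption rate differs by only about 0.8 percentage points between users who opted in (about 17.0%) and those who did not (about 16.2%), so opting into the mailing list appears to have little or no effect on adoption.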
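
One way to check whether a gap this small could be due to chance is a two-proportion z-test. The sketch below uses only the standard library. The function name `two_proportion_z` and the counts passed to it are hypothetical placeholders, not figures from this dataset.

```python
import math

def two_proportion_z(adopted_a, total_a, adopted_b, total_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = adopted_a / total_a
    p_b = adopted_b / total_b
    # Pooled proportion under the null hypothesis that both groups share one rate
    pooled = (adopted_a + adopted_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, for illustration only (not taken from the dataset)
z, p = two_proportion_z(adopted_a=170, total_a=1000, adopted_b=162, total_b=1000)
print(f"z = {z:.3f}, p = {p:.3f}")
```

With the real adopted and total counts per group in place of the placeholders, a large p-value would be consistent with the "little or no effect" reading.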

#### Enabled for marketing drip
The following table shows the percentage of users that were adopted, split by whether the user enabled the marketing drip (1) or not (0):

| enabled_for_marketing_drip | Adopted users (%) |
|---|---|
| 0 | 16.305801 |
| 1 | 16.703952 |
        
Based on the above table, the adoption rate differs by only about 0.4 percentage points between users with the marketing drip enabled (about 16.7%) and those without it (about 16.3%), so enabling the marketing drip appears to have little or no effect on adoption.

### Code used to reach the above conclusions

In [1]:
import pandas as pd

In [2]:
takehome_user_engagement = pd.read_csv('takehome_user_engagement.csv')

In [3]:
takehome_users = pd.read_csv('takehome_users.csv',  encoding='latin-1')

In [4]:
takehome_users.head(10)

Unnamed: 0,object_id,creation_time,name,email,creation_source,last_session_creation_time,opted_in_to_mailing_list,enabled_for_marketing_drip,org_id,invited_by_user_id
0,1,2014-04-22 03:53:30,Clausen August,AugustCClausen@yahoo.com,GUEST_INVITE,1398139000.0,1,0,11,10803.0
1,2,2013-11-15 03:45:04,Poole Matthew,MatthewPoole@gustr.com,ORG_INVITE,1396238000.0,0,0,1,316.0
2,3,2013-03-19 23:14:52,Bottrill Mitchell,MitchellBottrill@gustr.com,ORG_INVITE,1363735000.0,0,0,94,1525.0
3,4,2013-05-21 08:09:28,Clausen Nicklas,NicklasSClausen@yahoo.com,GUEST_INVITE,1369210000.0,0,0,1,5151.0
4,5,2013-01-17 10:14:20,Raw Grace,GraceRaw@yahoo.com,GUEST_INVITE,1358850000.0,0,0,193,5240.0
5,6,2013-12-17 03:37:06,Cunha Eduardo,EduardoPereiraCunha@yahoo.com,GUEST_INVITE,1387424000.0,0,0,197,11241.0
6,7,2012-12-16 13:24:32,Sewell Tyler,TylerSewell@jourrapide.com,SIGNUP,1356010000.0,0,1,37,
7,8,2013-07-31 05:34:02,Hamilton Danielle,DanielleHamilton@yahoo.com,PERSONAL_PROJECTS,,1,1,74,
8,9,2013-11-05 04:04:24,Amsel Paul,PaulAmsel@hotmail.com,PERSONAL_PROJECTS,,0,0,302,
9,10,2013-01-16 22:08:03,Santos Carla,CarlaFerreiraSantos@gustr.com,ORG_INVITE,1401833000.0,1,1,318,4143.0


In [5]:
takehome_user_engagement.head()

Unnamed: 0,time_stamp,user_id,visited
0,2014-04-22 03:53:30,1,1
1,2013-11-15 03:45:04,2,1
2,2013-11-29 03:45:04,2,1
3,2013-12-09 03:45:04,2,1
4,2013-12-25 03:45:04,2,1


In [6]:
takehome_user_engagement.isnull().sum()

time_stamp    0
user_id       0
visited       0
dtype: int64

In [7]:
# Parse the timestamp strings into datetime64 values, then drop the raw string column
takehome_user_engagement['time_stamp_dt'] = pd.to_datetime(takehome_user_engagement['time_stamp'], format='%Y-%m-%d %H:%M:%S')
takehome_user_engagement.pop('time_stamp');

In [8]:
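# Visits per user per week; '%W' is week-of-year only, so equal week numbers from different years share a group
engagement = takehome_user_engagement.groupby(['user_id', takehome_user_engagement['time_stamp_dt'].dt.strftime('%W')])['visited'].sum()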
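
Because `'%W'` yields only the week-of-year number, the grouping above sums visits from the same week number in different years into one bucket. The sketch below shows the difference on a few made-up rows. Grouping on `'%Y-%W'` is one possible alternative and an assumption here, not what the notebook does.

```python
import pandas as pd

# Toy log (made-up rows): the same user visits in week 05 of 2013 and week 05 of 2014
log = pd.DataFrame({
    'user_id': [2, 2, 2, 2],
    'time_stamp_dt': pd.to_datetime([
        '2013-02-05 10:00:00', '2013-02-06 10:00:00',
        '2014-02-04 10:00:00', '2014-02-05 10:00:00',
    ]),
    'visited': [1, 1, 1, 1],
})

# Week number alone: all four visits fall into one group
by_week = log.groupby(['user_id', log['time_stamp_dt'].dt.strftime('%W')])['visited'].sum()

# Year plus week number: the two years stay separate
by_year_week = log.groupby(['user_id', log['time_stamp_dt'].dt.strftime('%Y-%W')])['visited'].sum()

print(by_week)
print(by_year_week)
```

On this toy log the week-only grouping reports a single week with four visits, while the year-aware grouping reports two weeks with two visits each, which changes whether the user crosses the "more than 2 visits in a week" threshold.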

In [9]:
new_engagement = engagement.reset_index()

In [10]:
new_engagement.head()

Unnamed: 0,user_id,time_stamp_dt,visited
0,1,16,1
1,2,1,1
2,2,5,3
3,2,6,2
4,2,9,1


In [11]:
from collections import defaultdict
d = defaultdict(list)

In [12]:
# Collect each user's weekly visit counts into a list keyed by user_id
for i, k in enumerate(new_engagement.user_id):
    d[k].append(new_engagement.loc[i, 'visited'])

In [13]:
# Flag a user as adopted (1) if any of their weekly counts exceeds 2, otherwise 0
d3 = {key: int(any(j > 2 for j in values)) for key, values in d.items()}
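
The per-user flag built in cells In [11] to In [13] can also be computed without Python loops. This is a minimal sketch on made-up weekly counts; it applies the same rule (any week with more than 2 visits) and reuses the notebook's column names, with the toy values being assumptions.

```python
import pandas as pd

# Toy weekly counts (made-up): user 1 never exceeds 2 visits in a week, user 2 does
new_engagement = pd.DataFrame({
    'user_id': [1, 2, 2, 2],
    'time_stamp_dt': ['16', '01', '05', '06'],
    'visited': [1, 1, 3, 2],
})

# A user is flagged 1 if their busiest week has more than 2 visits, else 0
adopted = (
    new_engagement.groupby('user_id')['visited'].max()
    .gt(2)
    .astype(int)
    .rename('adopted_users')
    .reset_index()
)
print(adopted)
```

Taking the per-user maximum is enough because "at least one week above the threshold" is equivalent to "the busiest week is above the threshold".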

In [14]:
# One-column frame indexed by user_id; avoids building an N x N frame from the dict of scalars
filtered_engagement = pd.Series(dict(d3), name='adopted_users').to_frame()

In [15]:
engagement = filtered_engagement.reset_index()

In [16]:
engagement.columns = ['index', 'adopted_users']

In [17]:
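# Outer join: users with no engagement rows get NaN in adopted_users and so drop out of the rates below
merged_df = pd.merge(takehome_users, engagement, left_on = 'object_id', right_on = 'index', how = 'outer')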
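
Users that appear in `takehome_users` but have no engagement rows come out of this merge with `NaN` in `adopted_users`. The sketch below, on made-up frames, shows how that choice affects the rate. Treating such users as not adopted (option B) is an assumption for illustration, not the notebook's approach.

```python
import pandas as pd

# Toy frames (made-up): user 3 has no engagement rows
users = pd.DataFrame({
    'object_id': [1, 2, 3],
    'creation_source': ['SIGNUP', 'ORG_INVITE', 'SIGNUP'],
})
flags = pd.DataFrame({'index': [1, 2], 'adopted_users': [0, 1]})

merged = pd.merge(users, flags, left_on='object_id', right_on='index', how='outer')

# Option A: leave NaN in place, so user 3 is ignored by the mean
rate_excluding = merged['adopted_users'].mean() * 100

# Option B (assumption): count users without engagement rows as not adopted
merged['adopted_filled'] = merged['adopted_users'].fillna(0).astype(int)
rate_including = merged['adopted_filled'].mean() * 100

print(rate_excluding, rate_including)
```

The percentages reported in this notebook correspond to option A, since the later cells select only rows where `adopted_users` is 0 or 1.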

In [18]:
import matplotlib.pyplot as plt

In [19]:
non_adp = merged_df[merged_df['adopted_users'] == 0]['creation_source'].reset_index().groupby('creation_source').count()

In [20]:
adp = merged_df[merged_df['adopted_users'] == 1]['creation_source'].reset_index().groupby('creation_source').count()

In [21]:
non_adp.columns = ['non_adopted']
adp.columns = ['adopted']
source = pd.concat([adp, non_adp], axis = 1)

In [22]:
(source['adopted']/(source['non_adopted'] + source['adopted']))*100

creation_source
GUEST_INVITE          20.528967
ORG_INVITE            15.401506
PERSONAL_PROJECTS     19.109948
SIGNUP                14.488936
SIGNUP_GOOGLE_AUTH    14.873646
dtype: float64
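
The same percentages can be obtained in one step, because the mean of a 0/1 column is the share of ones. This is a minimal sketch on made-up data that reuses the notebook's column names; the toy values are assumptions.

```python
import pandas as pd

# Toy data (made-up); NaN marks a user with no engagement rows
merged_df = pd.DataFrame({
    'creation_source': ['SIGNUP', 'SIGNUP', 'SIGNUP',
                        'ORG_INVITE', 'ORG_INVITE', 'ORG_INVITE'],
    'adopted_users': [1, 0, 0, 1, 1, float('nan')],
})

# Mean of the 0/1 flag per group, as a percentage; NaN rows are skipped by mean()
rate = merged_df.groupby('creation_source')['adopted_users'].mean() * 100
print(rate)
```

Because `mean()` skips `NaN`, this matches the adopted / (adopted + non_adopted) calculation above, which also leaves out users without an adoption flag.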

In [23]:
non_adp = merged_df[merged_df['adopted_users'] == 0]['opted_in_to_mailing_list'].reset_index().groupby('opted_in_to_mailing_list').count()
adp = merged_df[merged_df['adopted_users'] == 1]['opted_in_to_mailing_list'].reset_index().groupby('opted_in_to_mailing_list').count()
non_adp.columns = ['non_adopted']
adp.columns = ['adopted']
source = pd.concat([adp, non_adp], axis = 1)
(source['adopted']/(source['non_adopted'] + source['adopted']))*100

opted_in_to_mailing_list
0    16.158860
1    16.981132
dtype: float64

In [24]:
non_adp = merged_df[merged_df['adopted_users'] == 0]['enabled_for_marketing_drip'].reset_index().groupby('enabled_for_marketing_drip').count()
adp = merged_df[merged_df['adopted_users'] == 1]['enabled_for_marketing_drip'].reset_index().groupby('enabled_for_marketing_drip').count()
non_adp.columns = ['non_adopted']
adp.columns = ['adopted']
source = pd.concat([adp, non_adp], axis = 1)
(source['adopted']/(source['non_adopted'] + source['adopted']))*100

enabled_for_marketing_drip
0    16.305801
1    16.703952
dtype: float64
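
Cells In [19] to In [24] repeat the same steps for three columns. A small helper can do this once and also report the group sizes behind each percentage. The sketch below runs on made-up data; the function name `adoption_summary` is hypothetical.

```python
import pandas as pd

def adoption_summary(df, column):
    """Adopted count, user count and adoption rate (%) for each value of `column`."""
    grouped = df.groupby(column)['adopted_users']
    # 'count' ignores NaN, so users without an adoption flag are left out
    summary = grouped.agg(adopted='sum', users='count', rate='mean')
    summary['rate'] = summary['rate'] * 100
    return summary

# Toy data (made-up values)
merged_df = pd.DataFrame({
    'opted_in_to_mailing_list': [0, 0, 0, 1, 1],
    'enabled_for_marketing_drip': [0, 1, 0, 1, 1],
    'adopted_users': [0, 1, 0, 1, 1],
})

for col in ['opted_in_to_mailing_list', 'enabled_for_marketing_drip']:
    print(adoption_summary(merged_df, col))
```

Showing the counts next to the rates makes it easier to judge how much weight a difference of under one percentage point can carry.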