### Exercise 3:

#### Advertising Data and Logistic Regression 

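Imagine you are asked to predict whether a consumer will respond to an ad campaign's call to action, that is, whether they will click on the ad.

- This exercise walks through key steps of a typical data science workflow: loading the data, exploring it, and fitting and evaluating a logistic regression model.

- This example uses a made-up data set with the following columns: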

    * `Daily Time Spent on Site`: consumer time on site, in minutes
    * `Age`: consumer age, in years
    * `Area Income`: average income of the consumer's geographical area
    * `Daily Internet Usage`: average minutes per day the consumer spends on the internet
    * `Ad Topic Line`: headline of the advertisement
    * `City`: city of the consumer
    * `Male`: whether or not the consumer was male
    * `Country`: country of the consumer
    * `Timestamp`: time at which the consumer clicked on the ad or closed the window
    * `Clicked on Ad`: 0 or 1, indicating whether the consumer clicked on the ad
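
The column list above can double as a simple schema check once a file is loaded. The following is a minimal sketch, not part of the original exercise: `EXPECTED_COLUMNS`, `missing_columns`, and the tiny `demo` frame are hypothetical names and invented values used only for illustration.

```python
import pandas as pd

# Column names taken from the list above
EXPECTED_COLUMNS = [
    'Daily Time Spent on Site', 'Age', 'Area Income', 'Daily Internet Usage',
    'Ad Topic Line', 'City', 'Male', 'Country', 'Timestamp', 'Clicked on Ad',
]

def missing_columns(df, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from df (hypothetical helper)."""
    return [col for col in expected if col not in df.columns]

# Tiny hand-made frame with invented values, standing in for a loaded file
demo = pd.DataFrame({'Age': [35, 41], 'Male': [0, 1], 'Clicked on Ad': [0, 1]})
print(missing_columns(demo))
```

An empty list would mean every documented column is present.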

#### Import Libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

#### Get the Data

In [None]:
ad_data = pd.read_csv('data/3_advertising/advertising.csv')

In [None]:
ad_data.head()

In [None]:
ad_data.info()

In [None]:
ad_data.describe()
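
Beyond `info()` and `describe()`, it can help to count missing values and look at how balanced the target is before modeling. The sketch below assumes nothing about the real file: `demo` is a small hand-made frame with invented values, and the same two calls could be applied to `ad_data`.

```python
import pandas as pd

# Tiny hand-made frame standing in for ad_data (all values are invented)
demo = pd.DataFrame({
    'Age': [35, 41, None, 29],
    'Area Income': [55000.0, 62000.0, 48000.0, 51000.0],
    'Clicked on Ad': [0, 1, 1, 0],
})

# Number of missing values in each column
missing_per_column = demo.isnull().sum()
print(missing_per_column)

# Fraction of rows belonging to each class of the target
class_share = demo['Clicked on Ad'].value_counts(normalize=True)
print(class_share)
```

A strongly imbalanced target would be a reason to read accuracy with caution later on.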

#### Exploratory Data Analysis

In [None]:
sns.set_style('whitegrid')
ad_data['Age'].hist(bins=30)
plt.xlabel('Age')

In [None]:
sns.jointplot(x='Age',y='Area Income',data=ad_data)

In [None]:
sns.jointplot(x='Age',y='Daily Time Spent on Site',data=ad_data,color='red',kind='kde');

In [None]:
sns.jointplot(x='Daily Time Spent on Site',y='Daily Internet Usage',data=ad_data,color='green')

In [None]:
sns.pairplot(ad_data,hue='Clicked on Ad',palette='bwr')

#### Prepare the Model

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
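# Features: the numeric columns only (text columns and Timestamp are left out)
X = ad_data[['Daily Time Spent on Site', 'Age', 'Area Income', 'Daily Internet Usage', 'Male']]
# Target: whether the consumer clicked on the ad
y = ad_data['Clicked on Ad']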

In [None]:
# Split the data into a training set and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
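
One optional variation on the split is to pass `stratify`, which asks `train_test_split` to keep the class proportions roughly equal in both subsets. This is a sketch on synthetic stand-in data (`X_demo` and `y_demo` are invented), not a change to the exercise itself.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the feature matrix and target (invented values)
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({
    'Age': rng.integers(18, 65, size=100),
    'Daily Internet Usage': rng.uniform(100, 270, size=100),
})
y_demo = pd.Series([0] * 50 + [1] * 50, name='Clicked on Ad')

# stratify=y_demo keeps the 50/50 class balance in both subsets
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=42, stratify=y_demo)

print(X_tr.shape, X_te.shape)
print(y_tr.mean(), y_te.mean())
```

With 100 rows and `test_size=0.33`, 33 rows go to the test set and 67 to the training set.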

#### Train the Model

In [None]:
from sklearn.linear_model import LogisticRegression

In [None]:
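# Logistic regression with default settings. If scikit-learn emits a
# ConvergenceWarning on these unscaled features, raising max_iter or scaling
# the inputs may help.
logmodel = LogisticRegression()
logmodel.fit(X_train, y_train)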

In [None]:
# Predict class labels (0 or 1) for the held-out test set
predictions = logmodel.predict(X_test)
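
The features here sit on very different scales (income versus minutes), so one common variation is to standardize them before fitting, and to look at predicted probabilities rather than only hard labels. The following is a minimal sketch on synthetic data: the values, the labeling rule, and the name `scaled_model` are all invented for illustration and are not part of the original exercise.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (invented): two features on very different scales
rng = np.random.default_rng(0)
n = 200
income = rng.normal(55000, 13000, size=n)
usage = rng.normal(180, 40, size=n)
X_demo = np.column_stack([income, usage])
# Labeling rule invented for this sketch: below-average usage means a click
y_demo = (usage < 180).astype(int)

# Standardize each feature, then fit logistic regression on the scaled values
scaled_model = make_pipeline(StandardScaler(), LogisticRegression())
scaled_model.fit(X_demo, y_demo)

# One row per sample; columns are the probabilities of class 0 and class 1
proba = scaled_model.predict_proba(X_demo[:5])
print(proba)
```

Putting the scaler inside the pipeline means it is fitted only on whatever data is passed to `fit`, which avoids leaking test-set statistics into training.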

#### Evaluate the Model

In [None]:
from sklearn.metrics import classification_report

In [None]:
print(classification_report(y_test,predictions))
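
The precision and recall figures in the report can be traced back to a confusion matrix. The sketch below uses a tiny hand-made pair of label lists (`y_true` and `y_pred` are invented, not results from the model above) to show how the counts map to the two metrics for the positive class.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hand-made labels for illustration only
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Precision: share of predicted clicks that were real clicks
precision = tp / (tp + fp)
# Recall: share of real clicks that the model found
recall = tp / (tp + fn)

print(tn, fp, fn, tp)
print(precision, precision_score(y_true, y_pred))
print(recall, recall_score(y_true, y_pred))
```

The same calls could be applied to `y_test` and `predictions` from the cells above.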