# Ad Click Prediction on Internet Accessing Users

Here I predict whether a user will click on an ad, based on that user's features. Because the target is binary (clicked or not), logistic regression is a suitable model.
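
As a rough illustration of why logistic regression fits a binary target: it passes a linear score through the sigmoid function to get a probability between 0 and 1, then thresholds that probability. The sketch below uses only the standard library. The weights, bias, and user values are made up for illustration and are not fitted to the advertising data.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights, bias, and feature values, chosen only for illustration.
weights = {"Age": 0.05, "Daily Time Spent on Site": -0.04}
bias = 0.5
user = {"Age": 40, "Daily Time Spent on Site": 60.0}

# Linear score: bias plus the weighted sum of the features.
score = bias + sum(weights[name] * user[name] for name in weights)

# The sigmoid turns the score into a click probability.
probability = sigmoid(score)

# Predict a click when the probability reaches 0.5.
predicted_click = int(probability >= 0.5)

print(f"score={score:.2f}, probability={probability:.3f}, predicted_click={predicted_click}")
```

A fitted model learns the weights and bias from the training data; the thresholding step stays the same.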

Import Dependencies

In [None]:
import pandas as pd 
import numpy as np
import seaborn as sns 
import matplotlib.pyplot as plt
%matplotlib inline

Import Data

In [None]:
ad_data = pd.read_csv('advertising.csv')

In [None]:
ad_data.head()

In [None]:
ad_data.info()

In [None]:
# Count missing values per column (note the call parentheses on sum)
ad_data.isnull().sum()

In [None]:
ad_data.describe()

Exploratory Data Analysis

In [None]:
# Distribution of user ages
plt.hist(ad_data['Age'], bins=30)
plt.xlabel('Age')
plt.show()

Relationship between age and daily time spent on the site

In [None]:
# Recent seaborn versions require keyword arguments for x, y, and data
sns.jointplot(x='Age', y='Daily Time Spent on Site', data=ad_data)

Relationship between daily time spent on the site and daily internet usage

In [None]:
# Recent seaborn versions require keyword arguments for x, y, and data
sns.jointplot(x='Daily Time Spent on Site', y='Daily Internet Usage', data=ad_data)

In [None]:
sns.pairplot(ad_data, hue='Clicked on Ad')

Model Planning and Building

In [None]:
ad_data.columns

In [None]:
countries = pd.get_dummies(ad_data['Country'],drop_first=True)

In [None]:
ad_data = pd.concat([ad_data,countries], axis=1)

In [None]:
ad_data.drop(['Country','Ad Topic Line','City','Timestamp'],axis=1,inplace=True)
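
The dummy-encoding step above can be illustrated on a toy column. This is a minimal sketch with made-up country values, not rows from advertising.csv; the names `toy` and `toy_dummies` are hypothetical.

```python
import pandas as pd

# Made-up values for illustration; not taken from advertising.csv.
toy = pd.DataFrame({'Country': ['France', 'Germany', 'Spain', 'France']})

# One indicator column per category. drop_first=True omits the first
# category in sorted order ('France' here), which becomes the implicit
# baseline and avoids a redundant, perfectly collinear column.
toy_dummies = pd.get_dummies(toy['Country'], drop_first=True)

print(toy_dummies)
```

A row with all indicator columns set to zero (or False) therefore stands for the dropped baseline category.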

In [None]:
X=ad_data.drop('Clicked on Ad',axis=1)
Y=ad_data['Clicked on Ad']

Splitting the data into training and test sets

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=101)

Model Training

In [None]:
from sklearn.linear_model import LogisticRegression

In [None]:
# A higher max_iter gives the solver room to converge on unscaled features
logrec = LogisticRegression(max_iter=1000)

In [None]:
# Fit on the training split only, so the test set stays unseen for evaluation
logrec.fit(X_train, Y_train)

Predictions

In [None]:
pred = logrec.predict(X_test)

Evaluations

In [None]:
from sklearn.metrics import classification_report
print(classification_report(Y_test,pred))
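
To make the report easier to read, here is a minimal sketch of how precision, recall, and accuracy for the positive class (clicked = 1) follow from the confusion counts. The helper `binary_metrics` and the label lists are hypothetical and made up for illustration; they are not results from this notebook.

```python
def binary_metrics(y_true, y_pred):
    """Return confusion counts plus precision, recall, and accuracy for class 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {
        'tp': tp, 'tn': tn, 'fp': fp, 'fn': fn,
        'precision': precision, 'recall': recall, 'accuracy': accuracy,
    }


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Precision answers "of the users predicted to click, how many did?", while recall answers "of the users who clicked, how many were found?".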