# Sales Prediction Model

## Introduction
This Jupyter Notebook implements a sales prediction model using logistic regression. The dataset, 'DigitalAd_dataset.csv', has 400 rows and three columns (Age, Salary, Status). The goal is to predict Status, that is, whether a customer will buy the product, from Age and Salary.

## Libraries Used

In [2]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score

## Importing and Reviewing the Dataset

In [3]:
df = pd.read_csv('DigitalAd_dataset.csv')
df.head()
df.describe()
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 400 entries, 0 to 399
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Age     400 non-null    int64
 1   Salary  400 non-null    int64
 2   Status  400 non-null    int64
dtypes: int64(3)
memory usage: 9.5 KB


## Segmenting the Features of the Dataset

The *Age* and *Salary* columns are taken as the feature matrix *x*, and the *Status* column as the target *y*.

In [4]:
x = df.iloc[:,:-1].values
y = df.iloc[:,-1].values
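
The positional slicing above depends on *Status* being the last column. As a small illustration, the sketch below builds a made-up two-row frame (the values are assumptions, not rows from the dataset) and checks that selecting by position and selecting by column name give the same arrays.

```python
import pandas as pd

# Made-up rows with the same three columns as the dataset (assumption).
demo_df = pd.DataFrame({'Age': [25, 47], 'Salary': [30000, 90000], 'Status': [0, 1]})

x_pos = demo_df.iloc[:, :-1].values          # every column except the last
y_pos = demo_df.iloc[:, -1].values           # the last column only

x_named = demo_df[['Age', 'Salary']].values  # same features, selected by name
y_named = demo_df['Status'].values           # same target, selected by name

print((x_pos == x_named).all(), (y_pos == y_named).all())
```

Selecting by name is a little more verbose, but it keeps working if the column order in the CSV ever changes.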

## Train Test Splitting

The dataset is divided into training and testing sets using the train_test_split function from scikit-learn. 75% of the data is used for training the model, and the remaining 25% is reserved for evaluating its performance.

In [5]:
x_train , x_test , y_train , y_test = train_test_split(x,y,test_size=0.25,random_state=0)
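
As a quick check on the split sizes, the sketch below repeats the same call on synthetic stand-in data (the random arrays and their value ranges are assumptions, not the real dataset). With 400 rows and `test_size=0.25`, 300 rows go to training and 100 to testing. The second call adds `stratify`, which this notebook does not use; it is shown only as a possible variant that keeps the class balance similar in both sets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 400-row dataset (assumption: values are made up).
rng = np.random.default_rng(0)
x_demo = np.column_stack([
    rng.integers(18, 61, size=400),         # hypothetical ages
    rng.integers(15000, 150001, size=400),  # hypothetical salaries
])
y_demo = rng.integers(0, 2, size=400)       # hypothetical 0/1 status

# Same arguments as in the notebook: 75% train, 25% test, fixed seed.
xd_train, xd_test, yd_train, yd_test = train_test_split(
    x_demo, y_demo, test_size=0.25, random_state=0
)
print(xd_train.shape, xd_test.shape)

# Optional variant (not used in the notebook): preserve class proportions.
xs_train, xs_test, ys_train, ys_test = train_test_split(
    x_demo, y_demo, test_size=0.25, random_state=0, stratify=y_demo
)
print(y_demo.mean(), ys_test.mean())
```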

## Feature Scaling

The features are scaled in two versions, using two different feature scaling techniques: *Standardization* and *Min-Max* scaling. Each scaler is fitted on the training set only and then applied to the test set.

In [6]:
sc = StandardScaler()
x1_train = sc.fit_transform(x_train)
x1_test = sc.transform(x_test)

mm = MinMaxScaler()
x2_train = mm.fit_transform(x_train)
x2_test = mm.transform(x_test)
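
To make the two techniques concrete: standardization rescales each column to zero mean and unit variance, `(x - mean) / std`, while Min-Max scaling maps each column onto the range 0 to 1, `(x - min) / (max - min)`. The sketch below checks both formulas against scikit-learn on a tiny made-up sample (the four rows are assumptions, not taken from the dataset).

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Tiny made-up [Age, Salary] sample (assumption: not from the dataset).
demo = np.array(
    [[20, 20000], [30, 50000], [40, 80000], [50, 110000]], dtype=float
)

standardized = StandardScaler().fit_transform(demo)
min_maxed = MinMaxScaler().fit_transform(demo)

# Manual per-column equivalents of the two transforms.
manual_std = (demo - demo.mean(axis=0)) / demo.std(axis=0)  # population std (ddof=0)
manual_mm = (demo - demo.min(axis=0)) / (demo.max(axis=0) - demo.min(axis=0))

print(np.allclose(standardized, manual_std))  # scikit-learn matches the formula
print(np.allclose(min_maxed, manual_mm))
print(min_maxed[:, 0])                        # ages mapped onto [0, 1]
```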


## Model Training
Two *Logistic Regression* models are created:
- *model1* trained with *Standardization* scaled data
- *model2* trained with *Min-Max* scaled data

In [7]:
model1 = LogisticRegression(random_state=0)
model1.fit(x1_train,y_train)
model2 = LogisticRegression(random_state=0)
model2.fit(x2_train,y_train)
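
One optional way to keep each scaler paired with its model is scikit-learn's `make_pipeline`, so that scaling and prediction happen in a single call. This is not what the notebook does; it is a minimal sketch on synthetic data, and the label rule used to generate `y_demo` is purely hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic stand-in data (assumption: values and label rule are made up).
rng = np.random.default_rng(0)
x_demo = np.column_stack([
    rng.integers(18, 61, size=200),         # hypothetical ages
    rng.integers(15000, 150001, size=200),  # hypothetical salaries
]).astype(float)
y_demo = ((x_demo[:, 0] > 40) | (x_demo[:, 1] > 90000)).astype(int)

# Each pipeline scales the raw features, then fits logistic regression.
pipe_std = make_pipeline(StandardScaler(), LogisticRegression(random_state=0))
pipe_mm = make_pipeline(MinMaxScaler(), LogisticRegression(random_state=0))
pipe_std.fit(x_demo, y_demo)
pipe_mm.fit(x_demo, y_demo)

# Raw, unscaled input goes straight in; the pipeline applies its own scaler.
print(pipe_std.predict([[45, 60000]]), pipe_mm.predict([[45, 60000]]))
```

With this layout there is no separate `sc.transform` step to remember at prediction time.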

## Model Evaluation
Both models are evaluated on the test set and compared using their confusion matrices and accuracy scores.

In [10]:
y1_pred = model1.predict(x1_test)
y2_pred = model2.predict(x2_test)
cm1 = confusion_matrix(y_test,y1_pred)
cm2 = confusion_matrix(y_test,y2_pred)
print(f'Confusion Matrix for model1: {cm1}')
print(f'Confusion Matrix for model2: {cm2}')
print(f'Accuracy Score model1: {accuracy_score(y_test,y1_pred) * 100}')
print(f'Accuracy Score model2: {accuracy_score(y_test,y2_pred) * 100}')

Confusion Matrix for model1: [[61  0]
 [20 19]]
Confusion Matrix for model2: [[61  0]
 [21 18]]
Accuracy Score model1: 80.0
Accuracy Score model2: 79.0
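
Accuracy alone hides how the errors are distributed. The sketch below recomputes accuracy from the confusion matrix reported for *model1* and adds precision and recall, which are not computed in the notebook. In scikit-learn's layout, rows are actual classes and columns are predicted classes.

```python
import numpy as np

# Confusion matrix reported above for model1 (rows: actual, columns: predicted).
cm1 = np.array([[61, 0],
                [20, 19]])
tn, fp, fn, tp = cm1.ravel()

accuracy = (tp + tn) / cm1.sum()  # share of all test rows classified correctly
precision = tp / (tp + fp)        # share of predicted buyers who actually bought
recall = tp / (tp + fn)           # share of actual buyers the model found

print(f'accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}')
# accuracy=0.80 precision=1.00 recall=0.49
```

So *model1* never flags a non-buyer as a buyer on this test set, but it misses 20 of the 39 actual buyers.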


## Prediction
*model1* has the slightly higher accuracy score (80.0 versus 79.0), so it is used to predict the outcome for a new customer.

In [9]:
age = int(input("Enter Customer's Age : "))
salary = int(input("Enter Customer's Salary : "))

# Scale the new input with the same scaler that model1 was trained with
newCustomer = [[age, salary]]
result = model1.predict(sc.transform(newCustomer))
print(result)
# predict returns an array, so compare its single element
if result[0] == 1:
    print('Customer will buy the product.')
else:
    print('Customer will not buy the product.')

[0]
Customer will not buy the product.
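
The prediction step could also be wrapped in a small helper that reports the estimated probability alongside the label, using `predict_proba`. The function name `predict_purchase` is hypothetical, and the sketch trains on synthetic data with a made-up label rule rather than on the real dataset, so its numbers say nothing about the notebook's model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


def predict_purchase(age, salary, scaler, model):
    """Hypothetical helper: return (label, probability of class 1) for one customer."""
    features = scaler.transform([[age, salary]])
    label = int(model.predict(features)[0])
    prob_buy = float(model.predict_proba(features)[0, 1])
    return label, prob_buy


# Synthetic stand-in data (assumption: values and label rule are made up).
rng = np.random.default_rng(0)
x_demo = np.column_stack([
    rng.integers(18, 61, size=200),         # hypothetical ages
    rng.integers(15000, 150001, size=200),  # hypothetical salaries
]).astype(float)
y_demo = ((x_demo[:, 0] > 40) | (x_demo[:, 1] > 90000)).astype(int)

demo_sc = StandardScaler()
demo_model = LogisticRegression(random_state=0)
demo_model.fit(demo_sc.fit_transform(x_demo), y_demo)

label, prob_buy = predict_purchase(30, 40000, demo_sc, demo_model)
print(label, round(prob_buy, 3))
```

Returning the probability makes it easier to see how close a customer is to the decision threshold of 0.5.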
