# Sales Prediction

This notebook predicts product sales from advertising spend on three traditional media channels (TV, Radio, and Newspaper) by fitting a linear regression model to an advertising dataset of 200 observations.

<img src='https://i.pinimg.com/originals/95/2f/7b/952f7bb4f9b139ae50f64f6a6542b492.png' width=500 height=500>

### Importing the Libraries

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split

### Importing the dataset

In [2]:
df = pd.read_csv(r"C:\Users\aditya kumar\Downloads\advertising.csv")

In [3]:
df.head()

,TV,Radio,Newspaper,Sales
0,230.1,37.8,69.2,22.1
1,44.5,39.3,45.1,10.4
2,17.2,45.9,69.3,12.0
3,151.5,41.3,58.5,16.5
4,180.8,10.8,58.4,17.9


In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 4 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   TV         200 non-null    float64
 1   Radio      200 non-null    float64
 2   Newspaper  200 non-null    float64
 3   Sales      200 non-null    float64
dtypes: float64(4)
memory usage: 6.4 KB


In [5]:
df.isnull().sum()

TV           0
Radio        0
Newspaper    0
Sales        0
dtype: int64

In [6]:
x = df.iloc[:, :-1].values  # Independent variables: TV, Radio, Newspaper
y = df.iloc[:, -1].values   # Dependent variable: Sales
print(x)
print(y)

[[230.1  37.8  69.2]
 [ 44.5  39.3  45.1]
 [ 17.2  45.9  69.3]
 [151.5  41.3  58.5]
 [180.8  10.8  58.4]
 [  8.7  48.9  75. ]
 [ 57.5  32.8  23.5]
 [120.2  19.6  11.6]
 [  8.6   2.1   1. ]
 [199.8   2.6  21.2]
 [ 66.1   5.8  24.2]
 [214.7  24.    4. ]
 [ 23.8  35.1  65.9]
 [ 97.5   7.6   7.2]
 [204.1  32.9  46. ]
 [195.4  47.7  52.9]
 [ 67.8  36.6 114. ]
 [281.4  39.6  55.8]
 [ 69.2  20.5  18.3]
 [147.3  23.9  19.1]
 [218.4  27.7  53.4]
 [237.4   5.1  23.5]
 [ 13.2  15.9  49.6]
 [228.3  16.9  26.2]
 [ 62.3  12.6  18.3]
 [262.9   3.5  19.5]
 [142.9  29.3  12.6]
 [240.1  16.7  22.9]
 [248.8  27.1  22.9]
 [ 70.6  16.   40.8]
 [292.9  28.3  43.2]
 [112.9  17.4  38.6]
 [ 97.2   1.5  30. ]
 [265.6  20.    0.3]
 [ 95.7   1.4   7.4]
 [290.7   4.1   8.5]
 [266.9  43.8   5. ]
 [ 74.7  49.4  45.7]
 [ 43.1  26.7  35.1]
 [228.   37.7  32. ]
 [202.5  22.3  31.6]
 [177.   33.4  38.7]
 [293.6  27.7   1.8]
 [206.9   8.4  26.4]
 [ 25.1  25.7  43.3]
 [175.1  22.5  31.5]
 [ 89.7   9.9  35.7]
 [239.9  41.5
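The positional slicing in cell 6 relies on `Sales` being the last column. As a minimal sketch, the same split can be written with column names, which keeps working if the column order changes. The five rows below are the ones shown by `df.head()`; the names `df_head`, `feature_cols`, `x_named`, and `y_named` are introduced here for illustration only.

```python
import pandas as pd

# The five rows shown by df.head() above, rebuilt inline so the sketch
# runs without the CSV file.
df_head = pd.DataFrame({
    "TV": [230.1, 44.5, 17.2, 151.5, 180.8],
    "Radio": [37.8, 39.3, 45.9, 41.3, 10.8],
    "Newspaper": [69.2, 45.1, 69.3, 58.5, 58.4],
    "Sales": [22.1, 10.4, 12.0, 16.5, 17.9],
})

feature_cols = ["TV", "Radio", "Newspaper"]  # illustrative name
x_named = df_head[feature_cols].values       # features selected by name
y_named = df_head["Sales"].values            # target selected by name

print(x_named.shape, y_named.shape)
```

On this frame the result is identical to `df_head.iloc[:, :-1].values` and `df_head.iloc[:, -1].values`.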

### Splitting the dataset

In [7]:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 1/3, random_state = 0)
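A quick way to confirm what `test_size = 1/3` does is to check the shapes of the resulting arrays. The sketch below uses placeholder arrays with the same shape as the advertising data (200 rows, 3 features) rather than the real values; the `_demo`, `x_tr`, and `x_te` names are assumptions made for this example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with the same shape as the advertising data
# (200 rows, 3 features); the values themselves do not matter here.
x_demo = np.zeros((200, 3))
y_demo = np.zeros(200)

x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=1/3, random_state=0
)

# The test share is rounded up, so 200 rows split into 133 train / 67 test.
print(x_tr.shape, x_te.shape)  # (133, 3) (67, 3)
```

Fixing `random_state` makes the shuffle reproducible, so the same rows land in each split on every run.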

In [8]:
print(x_train,y_train)

[[163.5  36.8   7.4]
 [100.4   9.6   3.6]
 [ 76.3  27.5  16. ]
 [184.9  43.9   1.7]
 [134.3   4.9   9.3]
 [273.7  28.9  59.7]
 [296.4  36.3 100.9]
 [ 96.2  14.8  38.9]
 [109.8  47.8  51.4]
 [255.4  26.9   5.5]
 [204.1  32.9  46. ]
 [240.1  16.7  22.9]
 [193.7  35.4  75.6]
 [191.1  28.7  18.2]
 [ 89.7   9.9  35.7]
 [ 43.   25.9  20.5]
 [ 38.2   3.7  13.8]
 [ 13.1   0.4  25.6]
 [239.3  15.5  27.3]
 [ 17.2  45.9  69.3]
 [210.7  29.5   9.3]
 [ 25.6  39.    9.3]
 [177.    9.3   6.4]
 [206.9   8.4  26.4]
 [ 66.1   5.8  24.2]
 [149.7  35.6   6. ]
 [129.4   5.7  31.3]
 [ 94.2   4.9   8.1]
 [276.7   2.3  23.7]
 [276.9  48.9  41.8]
 [  7.8  38.9  50.6]
 [250.9  36.5  72.3]
 [175.7  15.4   2.4]
 [ 11.7  36.9  45.2]
 [ 75.5  10.8   6. ]
 [199.8   3.1  34.6]
 [230.1  37.8  69.2]
 [107.4  14.   10.9]
 [225.8   8.2  56.5]
 [163.3  31.6  52.9]
 [131.1  42.8  28.9]
 [206.8   5.2  19.4]
 [177.   33.4  38.7]
 [216.8  43.9  27.2]
 [ 66.9  11.7  36.8]
 [227.2  15.8  49.9]
 [193.2  18.4  65.7]
 [ 97.5   7.6

In [9]:
print(x_test,y_test)

[[ 69.2  20.5  18.3]
 [ 50.   11.6  18.4]
 [ 90.4   0.3  23.2]
 [289.7  42.3  51.2]
 [170.2   7.8  35.2]
 [ 56.2   5.7  29.7]
 [  8.7  48.9  75. ]
 [240.1   7.3   8.7]
 [ 23.8  35.1  65.9]
 [197.6  23.3  14.2]
 [261.3  42.7  54.7]
 [ 87.2  11.8  25.9]
 [156.6   2.6   8.3]
 [187.8  21.1   9.5]
 [ 76.4  26.7  22.3]
 [120.2  19.6  11.6]
 [265.6  20.    0.3]
 [  0.7  39.6   8.7]
 [ 74.7  49.4  45.7]
 [213.4  24.6  13.1]
 [287.6  43.   71.8]
 [140.3   1.9   9. ]
 [175.1  22.5  31.5]
 [131.7  18.4  34.6]
 [ 53.5   2.   21.4]
 [123.1  34.6  12.4]
 [165.6  10.   17.6]
 [205.   45.1  19.6]
 [224.    2.4  15.6]
 [ 25.1  25.7  43.3]
 [ 67.8  36.6 114. ]
 [198.9  49.4  60. ]
 [280.7  13.9  37. ]
 [241.7  38.   23.2]
 [ 13.2  15.9  49.6]
 [ 18.7  12.1  23.4]
 [ 59.6  12.   43.1]
 [180.8  10.8  58.4]
 [ 68.4  44.5  35.6]
 [ 25.   11.   29.7]
 [ 36.9  38.6  65.6]
 [ 31.5  24.6   2.2]
 [142.9  29.3  12.6]
 [209.6  20.6  10.7]
 [215.4  23.6  57.6]
 [102.7  29.6   8.4]
 [  8.6   2.1   1. ]
 [ 16.9  43.7

### Linear Regression

In [10]:
from sklearn.linear_model import LinearRegression

regressor = LinearRegression() 

regressor.fit(x_train, y_train) 
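After fitting, a `LinearRegression` object exposes one coefficient per feature in `coef_` and the constant term in `intercept_`. The sketch below shows how to read them, using synthetic, noise-free data with known weights so the result can be checked. The weights and variable names are illustrative assumptions, not values from the advertising dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic, noise-free data with known weights (not the advertising data).
true_coef = np.array([0.05, 0.10, 0.01])  # illustrative values only
true_intercept = 4.0
x_syn = rng.uniform(0, 300, size=(50, 3))
y_syn = x_syn @ true_coef + true_intercept

model = LinearRegression().fit(x_syn, y_syn)

# One coefficient per feature column, plus a single intercept.
for name, coef in zip(["TV", "Radio", "Newspaper"], model.coef_):
    print(f"{name}: {coef:.4f}")
print(f"intercept: {model.intercept_:.4f}")
```

Applied to `regressor` in this notebook, `regressor.coef_` would give the estimated change in Sales per unit of spend on each channel, holding the other two fixed.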

### Predicting the sales value for the first two rows of the dataset

In [11]:
y_pred1 = regressor.predict([[230.1,37.8,69.2]])
print(y_pred1)

y_pred2 = regressor.predict([[44.5,39.3,45.1]])
print(y_pred2)

[21.45633733]
[11.52723982]
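The two inputs above match rows 0 and 1 shown by `df.head()`, whose actual Sales values are 22.1 and 10.4, so the predictions can be compared directly with the truth. A minimal sketch of that comparison follows, with the numbers copied from the cells above. Note that row 0 also appears in the printed training split, so its residual is not an out-of-sample check.

```python
# Values copied from the cells above: actual Sales for rows 0 and 1 of
# df.head(), and the two predictions printed by the model.
actual = [22.1, 10.4]
predicted = [21.45633733, 11.52723982]

# Residual = actual - predicted; positive means the model under-predicted.
residuals = [a - p for a, p in zip(actual, predicted)]

for a, p, r in zip(actual, predicted, residuals):
    print(f"actual={a:.1f}  predicted={p:.2f}  residual={r:+.2f}")
```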


In [12]:
print("R squared (full dataset): {:.2f}%".format(regressor.score(x, y) * 100))

R squared (full dataset): 90.18%


### R² Score on the Test Set

In [13]:
regressor.score(x_test,y_test)

0.8671668543617773
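For a regressor, `score` returns the coefficient of determination R², not a classification accuracy: R² = 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the targets from their mean. The sketch below computes it by hand on made-up numbers; the function name `r_squared` and the demo values are assumptions for illustration only.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Made-up numbers for illustration, not taken from the dataset.
y_true_demo = [3.0, 5.0, 7.0, 9.0]
y_pred_demo = [2.5, 5.5, 7.0, 9.5]

r2_demo = r_squared(y_true_demo, y_pred_demo)
print(round(r2_demo, 4))  # 0.9625
```

The test-set value of about 0.867 is the more honest estimate of how the model generalizes, because the 0.9018 figure computed on the full dataset includes the rows the model was trained on.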