# Future Sales Prediction with Machine Learning

Predicting the future sales of a product helps a business plan how much to spend on manufacturing and advertising it.
That is only one of the benefits a sales forecast can bring.

The dataset used here records the sales of a product together with the amount the business spent on advertising it on several platforms. The columns are:

1. TV: advertising cost in dollars for advertising on TV
2. Radio: advertising cost in dollars for advertising on radio
3. Newspaper: advertising cost in dollars for advertising in newspapers
4. Sales: number of units sold

So, in this dataset, Sales is the target, and the three advertising costs are the features it is assumed to depend on.

In [1]:
# Future sales prediction with machine learning

# Import the necessary Python libraries and the dataset:

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt 
import seaborn as sns 

In [3]:
df = pd.read_csv(r"C:\Users\dell\Downloads\advertising.csv")

In [4]:
df.head()

      TV  Radio  Newspaper  Sales
0  230.1   37.8       69.2   22.1
1   44.5   39.3       45.1   10.4
2   17.2   45.9       69.3   12.0
3  151.5   41.3       58.5   16.5
4  180.8   10.8       58.4   17.9


In [5]:
# Check whether the dataset contains any null values

In [6]:
df.isna().sum()

TV           0
Radio        0
Newspaper    0
Sales        0
dtype: int64

In [8]:
#visualize the relationship between the amount spent on advertising on TV and units sold

In [7]:
import plotly.express as px
import plotly.graph_objects as go
figure = px.scatter(data_frame = df, x="Sales",
                    y="TV", size="TV", trendline="ols")
figure.show()
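
The `trendline="ols"` option draws an ordinary least squares line through the points. As a rough illustration of what such a line is, the sketch below fits one by hand to the five rows shown by `df.head()`, with Sales on the x-axis and TV on the y-axis to match the plot. The helper name `fit_line` is made up for this example, and five rows are far too few to draw conclusions from.

```python
# Sample rows copied from the df.head() output above (not the full dataset).
sales = [22.1, 10.4, 12.0, 16.5, 17.9]
tv = [230.1, 44.5, 17.2, 151.5, 180.8]


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The least-squares line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sales, tv)
print(f"TV = {slope:.2f} * Sales + {intercept:.2f}")
```

A positive slope here only says that, in these sample rows, higher sales go together with higher TV spend; the full dataset is needed for anything more.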

In [9]:
#visualize the relationship between the amount spent on advertising on newspapers and units sold:

In [10]:
figure = px.scatter(data_frame = df, x="Sales",
                    y="Newspaper", size="Newspaper", trendline="ols")
figure.show()

In [11]:
#visualize the relationship between the amount spent on advertising on radio and units sold:

In [12]:
figure = px.scatter(data_frame = df, x="Sales",
                    y="Radio", size="Radio", trendline="ols")
figure.show()

Of the three advertising platforms, the amount spent on TV shows the strongest relationship with the number of units sold.

In [13]:
# Correlation of all the columns with the Sales column:

In [15]:
correlation = df.corr()
print(correlation["Sales"].sort_values(ascending=False))

Sales        1.000000
TV           0.901208
Radio        0.349631
Newspaper    0.157960
Name: Sales, dtype: float64
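
By default, `df.corr()` computes the Pearson correlation coefficient. To make that number less of a black box, here is a minimal sketch of the formula, applied to the five TV and Sales values shown by `df.head()`. The function name `pearson` is an assumption for this example, and the result on five rows will not match the 0.901208 reported for the full dataset.

```python
import math

# Sample rows copied from the df.head() output above (not the full dataset).
tv = [230.1, 44.5, 17.2, 151.5, 180.8]
sales = [22.1, 10.4, 12.0, 16.5, 17.9]


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    # The 1/n factors in covariance and variances cancel, so they are omitted.
    return cov / math.sqrt(var_x * var_y)


r = pearson(tv, sales)
print(f"Pearson r on the five sample rows: {r:.3f}")
```

Values close to 1 indicate a strong positive linear relationship, values close to 0 a weak one, and values close to -1 a strong negative one.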


# Split and train

In [17]:
from sklearn.model_selection import train_test_split

In [21]:
x = df.drop(["Sales"], axis=1)
y = df["Sales"]
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2, random_state=42)
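
Conceptually, the split above shuffles the rows with a fixed seed and holds out 20% of them for testing. The following is a simplified sketch of that idea using only the standard library; `split_indices` is a hypothetical helper, and it will not select the same rows as scikit-learn's `train_test_split`, whose shuffling is implemented differently.

```python
import math
import random


def split_indices(n_rows, test_size=0.2, seed=42):
    """Shuffle row indices reproducibly and return (train_indices, test_indices)."""
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the shuffle reproducible for a given seed.
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n_rows * test_size)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(10)
print("train rows:", train_idx)
print("test rows:", test_idx)
```

Fixing the seed matters because the scores reported for each model depend on which rows end up in the test set.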
     

In [22]:
# Train the model to predict future sales:

# Linear regression

In [23]:
from sklearn.linear_model import LinearRegression
LR = LinearRegression()

In [24]:
modelLr = LR.fit(xtrain,ytrain)


In [26]:
print(modelLr.score(xtest, ytest))

0.9059011844150826
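
For scikit-learn regressors, `.score()` returns the coefficient of determination, R². As a small sketch of what that number means, the function below computes it by hand; `r2_score_manual` is a name chosen for this example, and the Sales values are the five rows from `df.head()`, not the actual test set.

```python
def r2_score_manual(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Sales values copied from the df.head() output above.
y_true = [22.1, 10.4, 12.0, 16.5, 17.9]
mean_sales = sum(y_true) / len(y_true)

perfect = r2_score_manual(y_true, y_true)  # predictions equal the targets
baseline = r2_score_manual(y_true, [mean_sales] * len(y_true))  # always predict the mean

print("perfect predictions:", perfect)
print("mean-only baseline:", baseline)
```

A score of 1 means the predictions match the targets exactly, while a score of 0 means the model does no better than always predicting the mean. A test score of about 0.906 therefore says the linear model explains roughly 91% of the variance in Sales on the held-out rows.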


# Random forest

In [28]:
from sklearn.ensemble import RandomForestRegressor
RF = RandomForestRegressor()

In [29]:
modelRF = RF.fit(xtrain,ytrain)

In [30]:
print(modelRF.score(xtest,ytest))

0.9518761387135086


# XGBoost

In [31]:
from xgboost import XGBRegressor
XGB = XGBRegressor()

In [32]:
modelXGB = XGB.fit(xtrain,ytrain)

In [33]:
print(modelXGB.score(xtest,ytest))

0.953029858014247


# Prediction

In [35]:
# XGBoost has the highest test score of the models trained above (about 0.953), narrowly ahead of the random forest (about 0.952)

Let's now pass values to the model in the same order as the features used to train it (TV, Radio, Newspaper) and predict how many units of the product can be sold for a given amount spent on advertising on each platform:

In [38]:
#features = [[TV, Radio, Newspaper]]
features = np.array([[230.1, 37.8, 69.2]])
print(modelXGB.predict(features))

[22.100317]


In [39]:
#features = [[TV, Radio, Newspaper]]
features = np.array([[100.1, 17.8, 39.2]])
print(modelXGB.predict(features))

[12.6595125]


In [40]:
#features = [[TV, Radio, Newspaper]]
features = np.array([[430.1, 137.8, 169.2]])
print(modelXGB.predict(features))

[26.791868]
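
Tree-based models such as random forests and gradient-boosted trees generally cannot extrapolate beyond the target values they saw during training, so predictions for inputs far outside the training range deserve extra caution. The sketch below shows one simple way to flag such inputs. The helper `out_of_range` is hypothetical, and the limits here come only from the five rows shown by `df.head()`; in the notebook they would be taken from the training data instead, for example `xtrain.min()` and `xtrain.max()`.

```python
# Sample rows copied from the df.head() output above (not the full training set).
sample_rows = {
    "TV": [230.1, 44.5, 17.2, 151.5, 180.8],
    "Radio": [37.8, 39.3, 45.9, 41.3, 10.8],
    "Newspaper": [69.2, 45.1, 69.3, 58.5, 58.4],
}


def out_of_range(features, reference):
    """Return the names of features whose value lies outside [min, max] of the reference data."""
    flagged = []
    for name, value in features.items():
        low, high = min(reference[name]), max(reference[name])
        if not low <= value <= high:
            flagged.append(name)
    return flagged


# The last input used above.
query = {"TV": 430.1, "Radio": 137.8, "Newspaper": 169.2}
print(out_of_range(query, sample_rows))  # ['TV', 'Radio', 'Newspaper']
```

With limits this narrow the check is stricter than it would be against the full training set, but the idea is the same: a flagged feature means the model is being asked about spending levels it has no examples of.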


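# Summary

Predicting the future sales of a product helps a business plan its manufacturing and advertising spend. In this notebook, linear regression, random forest, and XGBoost regressors were trained on TV, radio, and newspaper advertising costs to predict the number of units sold. On the 20% test split they scored about 0.906, 0.952, and 0.953 respectively, and the XGBoost model was used for the example predictions.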