In [5]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

data = pd.read_csv("https://raw.githubusercontent.com/amankharwal/Website-data/master/advertising.csv")
print(data.head())

      TV  Radio  Newspaper  Sales
0  230.1   37.8       69.2   22.1
1   44.5   39.3       45.1   10.4
2   17.2   45.9       69.3   12.0
3  151.5   41.3       58.5   16.5
4  180.8   10.8       58.4   17.9


In [6]:
print(data.isnull().sum())

TV           0
Radio        0
Newspaper    0
Sales        0
dtype: int64


#### So this dataset doesn’t have any null values. Now let’s visualize the relationship between the amount spent on advertising on TV and units sold:

In [9]:
import plotly.express as px
import plotly.graph_objects as go
# trendline="ols" needs the statsmodels package installed
figure = px.scatter(data_frame=data, x="Sales", y="TV", size="TV", trendline="ols")
figure.show()


#### Now let’s visualize the relationship between the amount spent on advertising on newspapers and units sold:



In [10]:
figure = px.scatter(data_frame=data, x="Sales", y="Newspaper", size="Newspaper", trendline="ols")
figure.show()

#### Now let’s visualize the relationship between the amount spent on advertising on radio and units sold:



In [11]:
figure = px.scatter(data_frame=data, x="Sales", y="Radio", size="Radio", trendline="ols")
figure.show()

In [12]:
correlation = data.corr()
print(correlation["Sales"].sort_values(ascending=False))

Sales        1.000000
TV           0.901208
Radio        0.349631
Newspaper    0.157960
Name: Sales, dtype: float64
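
The values above are Pearson correlation coefficients, which is what `DataFrame.corr()` computes by default. As a minimal sketch of what that number means, the snippet below computes the coefficient by hand for TV and Sales and checks it against `np.corrcoef`. It uses only the five rows printed by `data.head()` earlier, so it will not reproduce the 0.901 figure from the full dataset; the helper name `pearson` is made up for this illustration.

```python
import numpy as np

# First five rows of the dataset, copied from the data.head() output above.
tv = np.array([230.1, 44.5, 17.2, 151.5, 180.8])
sales = np.array([22.1, 10.4, 12.0, 16.5, 17.9])


def pearson(a, b):
    """Pearson correlation: co-variation of a and b, scaled to the range [-1, 1]."""
    a_centered = a - a.mean()
    b_centered = b - b.mean()
    numerator = (a_centered * b_centered).sum()
    denominator = np.sqrt((a_centered ** 2).sum() * (b_centered ** 2).sum())
    return numerator / denominator


r_manual = pearson(tv, sales)
r_numpy = np.corrcoef(tv, sales)[0, 1]  # off-diagonal entry of the 2x2 matrix
print(r_manual, r_numpy)
```

A value near 1 means the two columns rise together almost linearly, which is why TV stands out as the strongest single predictor of Sales in the full table.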


### Future Sales Prediction Model
In this section, I will train a machine learning model to predict the future sales of a product. Before training the model, let’s split the data into training and test sets, holding out 5% of the rows for testing:

In [31]:
x = np.array(data.drop(columns=["Sales"]))
y = np.array(data["Sales"])
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.05, random_state=42)



#### Now let’s train a linear regression model and print its R² score on the test set:



In [32]:
model = LinearRegression()
model.fit(xtrain,ytrain)
print(model.score(xtest,ytest))

0.9254875970050773
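
For a regressor such as `LinearRegression`, `score` returns the coefficient of determination R², so the number above is the share of variance in the test-set sales that the model explains. The sketch below shows the calculation on synthetic stand-in data (an assumption made so the snippet runs without downloading anything; the coefficients and noise level are invented) and checks that the manual formula agrees with `model.score` and `sklearn.metrics.r2_score`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic stand-in data: an assumption for illustration, not the advertising dataset.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(50, 3))
y = 5.0 + X @ np.array([0.05, 0.1, 0.0]) + rng.normal(0, 1, size=50)

# Fit on the first 40 rows and evaluate on the remaining 10.
model = LinearRegression().fit(X[:40], y[:40])
X_test, y_test = X[40:], y[40:]

pred = model.predict(X_test)
ss_res = ((y_test - pred) ** 2).sum()           # residual sum of squares
ss_tot = ((y_test - y_test.mean()) ** 2).sum()  # total sum of squares
r2_manual = 1 - ss_res / ss_tot

print(r2_manual, model.score(X_test, y_test), r2_score(y_test, pred))
```

Because R² is computed on held-out rows, a small test set (such as the 5% split used here) makes the score sensitive to which rows happen to land in it.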


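Now let’s pass the model a set of feature values, in the same order used for training (TV, Radio, Newspaper), and predict how many units can be sold for that advertising spend. The values below are the first row of the dataset, whose actual sales figure is 22.1, so the prediction can be compared with a known value: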

In [33]:
# features = [[TV, Radio, Newspaper]]
features = np.array([[230.1, 37.8, 69.2]])
print(model.predict(features))

[21.21335827]
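
A linear regression prediction is just the fitted intercept plus a weighted sum of the inputs. The sketch below illustrates this with `model.intercept_` and `model.coef_`. To stay self-contained it fits a model on only the five rows shown by `data.head()`, which is an assumption made for illustration, so its coefficients and prediction will differ from the model trained above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# First five rows of the dataset, copied from the data.head() output above.
X = np.array([
    [230.1, 37.8, 69.2],
    [44.5, 39.3, 45.1],
    [17.2, 45.9, 69.3],
    [151.5, 41.3, 58.5],
    [180.8, 10.8, 58.4],
])
y = np.array([22.1, 10.4, 12.0, 16.5, 17.9])

model = LinearRegression().fit(X, y)

features = np.array([[230.1, 37.8, 69.2]])  # [[TV, Radio, Newspaper]]

# prediction = intercept + coef_TV * TV + coef_Radio * Radio + coef_Newspaper * Newspaper
manual = model.intercept_ + features @ model.coef_

print(manual, model.predict(features))
```

Inspecting `model.coef_` on the real model gives the estimated change in sales per unit of spend on each platform, with the other two held fixed.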


### Summary
#### This is how we can train a machine learning model to predict the future sales of a product from its advertising spend. Sales forecasts like this help a business plan the manufacturing and advertising costs of the product. I hope you liked this article on future sales prediction with machine learning. Feel free to ask questions in the comments section below.