# Predicting sales

## Predict sales from advertising spend across marketing platforms
### This notebook uses the dataset *advertising.csv*

Dataset source details: https://www.kaggle.com/ashydv/advertising-dataset

This notebook is an example. It does not cover every CRISP-DM phase in exhaustive detail.

(c) 2020-2022 Nuno António - Rev. 1.02

### Initial setup/packages loading

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import linear_model
from sklearn.metrics import r2_score, mean_absolute_error

In [2]:
# Load dataset
ds = pd.read_csv("advertising.csv")

### Data understanding

In [3]:
# Verify content
ds.head()

|   | TV | Radio | Newspaper | Sales |
|---|---|---|---|---|
| 0 | 230.1 | 37.8 | 69.2 | 22.1 |
| 1 | 44.5 | 39.3 | 45.1 | 10.4 |
| 2 | 17.2 | 45.9 | 69.3 | 12.0 |
| 3 | 151.5 | 41.3 | 58.5 | 16.5 |
| 4 | 180.8 | 10.8 | 58.4 | 17.9 |
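Before modeling, it is usually worth confirming that every column is numeric and that no values are missing. The sketch below rebuilds the five rows shown by `ds.head()` by hand so that it runs without `advertising.csv`; the names `sample`, `missing_per_column`, and `non_numeric` are illustrative and not part of the notebook. In the notebook itself the same checks would be applied to `ds`.

```python
import pandas as pd

# The five rows shown by ds.head(), rebuilt by hand so the sketch runs
# without advertising.csv. In the notebook, use `ds` instead of `sample`.
sample = pd.DataFrame({
    "TV": [230.1, 44.5, 17.2, 151.5, 180.8],
    "Radio": [37.8, 39.3, 45.9, 41.3, 10.8],
    "Newspaper": [69.2, 45.1, 69.3, 58.5, 58.4],
    "Sales": [22.1, 10.4, 12.0, 16.5, 17.9],
})

# Number of missing (NaN) values in each column
missing_per_column = sample.isna().sum()

# Columns that would need encoding or cleaning before a linear regression
non_numeric = [c for c in sample.columns
               if not pd.api.types.is_numeric_dtype(sample[c])]

print(missing_per_column)
print("Non-numeric columns:", non_numeric)
```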


In [4]:
# Describe dataset
ds.describe().T

|   | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| TV | 200.0 | 147.0425 | 85.854236 | 0.7 | 74.375 | 149.75 | 218.825 | 296.4 |
| Radio | 200.0 | 23.264 | 14.846809 | 0.0 | 9.975 | 22.9 | 36.525 | 49.6 |
| Newspaper | 200.0 | 30.554 | 21.778621 | 0.3 | 12.75 | 25.75 | 45.1 | 114.0 |
| Sales | 200.0 | 15.1305 | 5.283892 | 1.6 | 11.0 | 16.0 | 19.05 | 27.0 |
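The summary statistics can also be used for a quick outlier screen. The sketch below applies the common 1.5 × IQR rule, which is one possible choice and not something the notebook itself uses, to the quartiles, minimum, and maximum copied from the `ds.describe()` output. The helper `iqr_fences` and the `stats` dictionary are illustrative names.

```python
# Quartiles, min and max copied from the ds.describe() output.
stats = {
    "TV":        {"min": 0.7, "q1": 74.375, "q3": 218.825, "max": 296.4},
    "Radio":     {"min": 0.0, "q1": 9.975,  "q3": 36.525,  "max": 49.6},
    "Newspaper": {"min": 0.3, "q1": 12.75,  "q3": 45.1,    "max": 114.0},
    "Sales":     {"min": 1.6, "q1": 11.0,   "q3": 19.05,   "max": 27.0},
}

def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences of the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# A column whose min or max lies outside the fences contains at least
# one outlier under this rule.
flagged = []
for name, s in stats.items():
    lower, upper = iqr_fences(s["q1"], s["q3"])
    if s["min"] < lower or s["max"] > upper:
        flagged.append(name)

print(flagged)  # ['Newspaper']
```

Only `Newspaper` is flagged: its maximum of 114.0 lies above the upper fence of about 93.6. Whether such rows should be kept, capped, or removed is a modeling decision that this screen does not settle.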


### Data preparation

In [5]:
# Features: advertising spend per channel
cols = ['TV','Radio','Newspaper']
X = ds[cols]
# Target: sales
y = ds['Sales']

### Modeling

In [6]:
# Split the data into training (75%) and test (25%) sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.75, test_size=0.25, random_state=123
)
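With 200 rows, a 75/25 split should yield 150 training rows and 50 test rows, and fixing `random_state` should make the split reproducible. The sketch below checks both on stand-in arrays with the same shape as the advertising data; `X_demo`, `y_demo`, and the helper `split` are illustrative and the values are arbitrary.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data with the same number of rows (200) and feature columns (3)
# as the advertising dataset. The values themselves are arbitrary.
X_demo = np.arange(600).reshape(200, 3)
y_demo = np.arange(200)

def split(seed):
    """Split the stand-in data with the same proportions as the notebook."""
    return train_test_split(
        X_demo, y_demo, train_size=0.75, test_size=0.25, random_state=seed
    )

X_tr, X_te, y_tr, y_te = split(123)
print(X_tr.shape, X_te.shape)  # (150, 3) (50, 3)

# Repeating the split with the same seed selects exactly the same rows.
_, _, _, y_te_again = split(123)
same_split = np.array_equal(y_te, y_te_again)
print(same_split)  # True
```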

In [7]:
# Create the linear regression estimator
regr = linear_model.LinearRegression()

# Fit the model on the training set
regr.fit(X_train, y_train)

In [8]:
# Predict sales for the test set
y_pred = regr.predict(X_test)

### Evaluation

In [9]:
print('Coefficients: \n', regr.coef_)

Coefficients: 
 [ 0.05491695  0.10454795 -0.0008706 ]
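Each coefficient is the change in predicted sales for a one-unit increase in that channel's spend, with the other two channels held fixed. The sketch below pairs the printed coefficients with their feature names and computes the effect of a hypothetical 10-unit increase per channel; `extra_spend` and `effect` are illustrative names, and the spending units are whatever the dataset uses.

```python
# Coefficients copied from the regr.coef_ output, in the order of `cols`.
cols = ["TV", "Radio", "Newspaper"]
coef = [0.05491695, 0.10454795, -0.0008706]

extra_spend = 10.0  # hypothetical increase, in the dataset's spending units

# In a linear model, raising one feature by `extra_spend` while holding the
# others fixed changes the prediction by coefficient * extra_spend.
effect = {name: c * extra_spend for name, c in zip(cols, coef)}

for name, delta in effect.items():
    print(f"{name:<10} {delta:+.4f}")
```

Radio has the largest per-unit coefficient, while the Newspaper coefficient is close to zero. A full prediction also needs the intercept (`regr.intercept_`), which the notebook does not print.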


In [10]:
print('Mean Absolute Error: %.6f' % mean_absolute_error(y_test, y_pred))

Mean Absolute Error: 1.332117
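The mean absolute error is the average of the absolute differences between actual and predicted values, so it is expressed in the same units as `Sales`. The sketch below computes it by hand on a few hypothetical values and then relates the notebook's test MAE to the mean of `Sales` reported by `ds.describe()`; the function `mae` and the sample lists are illustrative.

```python
# Hypothetical values, chosen only to make the arithmetic easy to follow.
y_true = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.5, 5.5, 7.0, 9.5]

def mae(actual, predicted):
    """Mean of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

print(mae(y_true, y_hat))  # 0.375

# The notebook's test MAE as a share of mean Sales. Note that 15.1305 is
# the mean over all 200 rows, not only the 50 test rows.
relative_mae = 1.332117 / 15.1305
print(f"{relative_mae:.1%}")  # 8.8%
```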


In [11]:
print('Coefficient of determination (r2): %.6f' % r2_score(y_test, y_pred))

Coefficient of determination (r2): 0.887046
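The coefficient of determination compares the model's squared errors with those of a baseline that always predicts the mean of the actual values: R² = 1 − SS_res / SS_tot. A value of about 0.887 therefore means the model's squared error on the test set is roughly 11% of that baseline's. The sketch below reproduces the calculation on a few hypothetical values; the function `r2` and the sample lists are illustrative.

```python
# Hypothetical values, chosen only to make the arithmetic easy to follow.
y_true = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.5, 5.5, 7.0, 9.5]

def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

score = r2(y_true, y_hat)
print(round(score, 4))  # 0.9625

# Always predicting the mean of the actual values scores exactly zero.
baseline = r2(y_true, [6.0] * len(y_true))
print(baseline)  # 0.0
```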
