The DNN approach is suitable for projects with a large amount of data. Since the DNN approach and the statistical approach with BG/NBD and Gamma-Gamma (Lifetimes) reportedly perform with nearly the same results, we only need to choose the one that suits the interface we are going to build.

(I thought of a section in the interface, such as a radio button or a dropdown, to choose between the statistical approach using Lifetimes (BG/NBD and Gamma-Gamma) and the DNN, and to compare the two. But the blog says the results are nearly the same, and since we are not going to integrate Lifetimes, we are just opting for DNN and Regression.)

### Necessary libraries to import

In [59]:
# pip install pandas-profiling[notebook]

In [60]:
# from pandas_profiling import ProfileReport

# ML vs Statistical Approach for CLV

In [50]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
from datetime import datetime

#Statistical LTV
from lifetimes import BetaGeoFitter, GammaGammaFitter
from lifetimes.utils import calibration_and_holdout_data, summary_data_from_transaction_data

# ML approach to LTV
import tensorflow as tf 
#import tensorflow_probability as tfp
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow_docs as tfdocs
import tensorflow_docs.modeling as tfmodel
import tensorflow_docs.plots


# evaluation
from sklearn.metrics import r2_score
from sklearn.metrics  import mean_absolute_error

# Plotting 
import matplotlib
import matplotlib.pyplot as plt
matplotlib.use('TkAgg')
import seaborn as sns 



## Data preprocessing and modeling (feature engineering)

In [2]:
# Make the default figures a bit bigger
plt.rcParams['figure.figsize'] = (10,7) 

In [3]:
# Read the dataset 
retail_ol_h1 = pd.read_csv('./data/Year 2009-2010_train.csv',  encoding= 'unicode_escape')
retail_ol_h2 = pd.read_csv('./data/Year 2010-2011_train.csv',  encoding= 'unicode_escape')


In [4]:
frames = [retail_ol_h1, retail_ol_h2]
results = pd.concat(frames)
data = results.copy()
data['InvoiceDate'] = pd.to_datetime(data.InvoiceDate, format = '%Y/%m/%d %H:%M')

In [5]:

#Datetime transformation
data['date'] = pd.to_datetime(data.InvoiceDate.dt.date)
data['time'] = data.InvoiceDate.dt.time
data['hour'] = data['time'].apply(lambda x: x.hour)
data['weekend'] = data['date'].apply(lambda x: x.weekday() in [5, 6])
data['dayofweek'] = data['date'].apply(lambda x: x.dayofweek)


In [6]:
data.drop(['Unnamed: 0'], axis =1, inplace=True)

In [7]:

print(data.sample(5))

       Invoice StockCode                         Description  Quantity  \
149492  517978     22634      CHILDS BREAKFAST SET SPACEBOY          2   
265418  501929     20738            GREEN MINI TAPE MEASURE         10   
252278  496290    84750A         PINK SMALL GLASS CAKE STAND         8   
29090   532659     22158  3 HEARTS HANGING DECORATION RUSTIC         5   
45207   576463     23356               LOVE HOT WATER BOTTLE         1   

               InvoiceDate  Price  Customer ID         Country       date  \
149492 2010-08-03 14:38:00   9.95      16316.0  United Kingdom 2010-08-03   
265418 2010-03-22 11:21:00   0.85      17760.0  United Kingdom 2010-03-22   
252278 2010-01-29 17:55:00   1.95          NaN  United Kingdom 2010-01-29   
29090  2010-11-14 11:18:00   2.95      16121.0  United Kingdom 2010-11-14   
45207  2011-11-15 11:37:00   5.95      17974.0  United Kingdom 2011-11-15   

            time  hour  weekend  dayofweek  
149492  14:38:00    14    False          1  

In [8]:
#Plots a time series of the total quantity sold per day
data.groupby('date')['Quantity'].sum().plot()
#Prints the total number of days between start and end
print(data['date'].max() - data['date'].min())

738 days 00:00:00


So, I have around 2 years of data (738 days). Because the ML approach requires separate time periods for feature creation, training targets, and validation targets, I'll split it into the following segments:
1. Training Features Period - from 2011-01-01 until 2011-06-11
2. Training Target Period - from 2011-06-12 until 2011-09-09
3. Testing Features Period - from 2011-04-02 until 2011-09-10
4. Testing Target Period - from 2011-09-11 until 2011-12-09
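
As a quick sanity check on the four windows listed above, the sketch below computes the length of each window and verifies that every target window starts after its feature window ends. The `windows` dictionary and the two helper functions are illustrative names, not part of the notebook.

```python
import pandas as pd

# The four windows listed above, as (start, end) pairs.
# The dictionary keys are illustrative names, not identifiers from the notebook.
windows = {
    "train_features": ("2011-01-01", "2011-06-11"),
    "train_target": ("2011-06-12", "2011-09-09"),
    "test_features": ("2011-04-02", "2011-09-10"),
    "test_target": ("2011-09-11", "2011-12-09"),
}


def window_days(start, end):
    """Number of days between the start and end date of a window."""
    return (pd.Timestamp(end) - pd.Timestamp(start)).days


def target_follows_features(features, target):
    """True if the target window starts strictly after the feature window ends."""
    return pd.Timestamp(target[0]) > pd.Timestamp(features[1])


lengths = {name: window_days(*bounds) for name, bounds in windows.items()}
for name, days in lengths.items():
    print(f"{name}: {days} days")

print(target_follows_features(windows["train_features"], windows["train_target"]))
print(target_follows_features(windows["test_features"], windows["test_target"]))
```

Both feature windows span 161 days and both target windows span 89 days, so the training and testing setups are symmetric.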

In [9]:
#Dataset info
print(f'Total Number of Purchases: {data.shape[0]}')
print(f'Total Number of transactions: {data.Invoice.nunique()}')
print(f'Total Unique Days: {data.date.nunique()}')
print(f"Total Unique Customers: {data['Customer ID'].nunique()}")
print(f"We are predicting {(data['date'].max() - datetime(2011, 9, 11)).days} days")

Total Number of Purchases: 853896
Total Number of transactions: 50850
Total Unique Days: 604
Total Unique Customers: 5910
We are predicting 89 days


## Baseline: BG/NBD + Gamma-Gamma model

BG/NBD and Gamma-Gamma are statistical models that describe purchasing behaviour (the number of transactions and the average order value) by fitting different types of distributions. This type of modelling requires data at the transaction level, so let's first aggregate the dataset by invoice. I'll aggregate the transactional revenue, which is simply the quantity times the item price.

### Data Prep

In [12]:
#Get revenue column
data['Revenue'] = data['Quantity'] * data['Price']

#Context data for the revenue (date & customerID)
id_lookup = data[['Customer ID', 'Invoice', 'date']].drop_duplicates()
id_lookup.index = id_lookup['Invoice']
id_lookup = id_lookup.drop('Invoice', axis=1)

transactions_data = pd.DataFrame(data.groupby('Invoice')['Revenue'].sum()).join(id_lookup)

In [13]:
transactions_data.head()

| Invoice | Revenue | Customer ID | date |
|---|---|---|---|
| 489434 | 364.9 | 13085.0 | 2009-12-01 |
| 489435 | 145.8 | 13085.0 | 2009-12-01 |
| 489436 | 436.6 | 13078.0 | 2009-12-01 |
| 489437 | 310.75 | 15362.0 | 2009-12-01 |
| 489438 | 2152.4 | 18102.0 | 2009-12-01 |


The `lifetimes` package has a utility function to split the data and aggregate the features into RFM format. I'm going to use it here to save some time, but if the data is large, it makes sense to do this yourself.
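
For readers who want to do the aggregation themselves, here is a minimal pandas sketch of an RFM summary. It assumes the conventions I understand `lifetimes` to use by default (frequency counts repeat purchase days, recency is the time between first and last purchase, T is the customer's age at the end of the observation period, and monetary value averages the repeat purchase days only). The function name `rfm_summary` and the toy data are hypothetical.

```python
import pandas as pd


def rfm_summary(transactions, customer_col, date_col, value_col, observation_end):
    """Hand-rolled RFM summary; a sketch, not the lifetimes implementation."""
    end = pd.Timestamp(observation_end)
    tx = transactions.loc[transactions[date_col] <= end]

    # Collapse to one row per customer per purchase day.
    daily = tx.groupby([customer_col, date_col], as_index=False)[value_col].sum()
    grouped = daily.groupby(customer_col)
    first = grouped[date_col].min()
    last = grouped[date_col].max()

    summary = pd.DataFrame({
        "frequency": grouped[date_col].count() - 1,  # repeat purchase days
        "recency": (last - first).dt.days,           # first to last purchase
        "T": (end - first).dt.days,                  # first purchase to period end
    })

    # Average revenue per purchase day, excluding each customer's first day.
    first_day = daily[customer_col].map(first)
    repeat = daily.loc[daily[date_col] > first_day]
    summary["monetary_value"] = repeat.groupby(customer_col)[value_col].mean()
    summary["monetary_value"] = summary["monetary_value"].fillna(0.0)
    return summary


# Toy data: customer 1 buys on two days, customer 2 buys once.
toy = pd.DataFrame({
    "Customer ID": [1, 1, 1, 2],
    "date": pd.to_datetime(["2011-01-01", "2011-01-11", "2011-01-11", "2011-02-01"]),
    "Revenue": [10.0, 20.0, 5.0, 7.0],
})
rfm = rfm_summary(toy, "Customer ID", "date", "Revenue", "2011-03-01")
print(rfm)
```

Customers with a single purchase day get a frequency and monetary value of 0, which is why the Gamma-Gamma step later keeps only customers with a positive monetary value.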

In [14]:
#Split into train - test
rfm_train_test = calibration_and_holdout_data(transactions_data, 'Customer ID', 'date',
                                        calibration_period_end='2011-09-10',
                                        monetary_value_col = 'Revenue')   

#Selecting only customers with positive value in the calibration period (otherwise Gamma-Gamma model doesn't work)
rfm_train_test = rfm_train_test.loc[rfm_train_test['monetary_value_cal'] > 0, :]

In [15]:
print(rfm_train_test.shape)
rfm_train_test.head()

(3608, 7)


| Customer ID | frequency_cal | recency_cal | T_cal | monetary_value_cal | frequency_holdout | monetary_value_holdout | duration_holdout |
|---|---|---|---|---|---|---|---|
| 12346.0 | 9.0 | 400.0 | 635.0 | 8569.27 | 0.0 | 0.0 | 90.0 |
| 12347.0 | 5.0 | 275.0 | 314.0 | 567.698 | 2.0 | 662.34 | 90.0 |
| 12348.0 | 3.0 | 190.0 | 348.0 | 439.693333 | 1.0 | 160.0 | 90.0 |
| 12349.0 | 3.0 | 328.0 | 645.0 | 777.383333 | 1.0 | 1326.49 | 90.0 |
| 12352.0 | 5.0 | 130.0 | 302.0 | 141.806 | 3.0 | 263.76 | 90.0 |


### Modelling

Now, we can use the RFM calibration data to train the models.

In [16]:
#Train the BG/NBD model
bgf = BetaGeoFitter(penalizer_coef=0.1)
bgf.fit(rfm_train_test['frequency_cal'], rfm_train_test['recency_cal'], rfm_train_test['T_cal'])

<lifetimes.BetaGeoFitter: fitted with 3608 subjects, a: 0.06, alpha: 54.11, b: 0.43, r: 0.95>

To fit the Gamma-Gamma model, we first need to make sure that the monetary value and frequency are not correlated, as this is one of the basic assumptions of the model.

In [17]:
#Check the correlation between monetary value and frequency before training Gamma-Gamma
rfm_train_test[['monetary_value_cal', 'frequency_cal']].corr()

| | monetary_value_cal | frequency_cal |
|---|---|---|
| monetary_value_cal | 1.0 | 0.170918 |
| frequency_cal | 0.170918 | 1.0 |


The correlation is weak (about 0.17), so we can continue with fitting.

In [18]:
ggf = GammaGammaFitter(penalizer_coef = 0)
ggf.fit(rfm_train_test['frequency_cal'],
        rfm_train_test['monetary_value_cal'])

<lifetimes.GammaGammaFitter: fitted with 3608 subjects, p: 1.39, q: 3.80, v: 553.14>

### Prediction

Prediction is done in three steps:
1. Predict the expected number of transactions
2. Predict the average order value
3. Multiply the number of transactions by the average order value
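
Written as a formula, the three steps above amount to the following product. The symbols are my own notation, not taken from the `lifetimes` documentation: $\widehat{N}_i(t)$ is the expected number of transactions of customer $i$ over the next $t$ days (here $t = 89$), and $\widehat{M}_i$ is the expected average order value of that customer.

$$
\widehat{\text{sales}}_i(t) = \widehat{N}_i(t) \times \widehat{M}_i
$$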

In [19]:
#Predict the expected number of transactions in the next 89 days
predicted_bgf = bgf.predict(89,
                        rfm_train_test['frequency_cal'], 
                        rfm_train_test['recency_cal'], 
                        rfm_train_test['T_cal'])
trans_pred = predicted_bgf.fillna(0)

#Predict the average order value
monetary_pred = ggf.conditional_expected_average_profit(rfm_train_test['frequency_cal'],
                                        rfm_train_test['monetary_value_cal'])

#Putting it all together
sales_pred = trans_pred * monetary_pred

### Evaluation

In [20]:
actual = rfm_train_test['monetary_value_holdout'] *  rfm_train_test['frequency_holdout']

In [21]:
def evaluate(actual, sales_prediction):
    print(f"Total Sales Actual: {np.round(actual.sum())}")
    print(f"Total Sales Predicted: {np.round(sales_prediction.sum())}")
    print(f"Individual R2 score: {r2_score(actual, sales_prediction)} ")
    print(f"Individual Mean Absolute Error: {mean_absolute_error(actual, sales_prediction)}")
    plt.scatter(sales_prediction, actual)
    plt.xlabel('Prediction')
    plt.ylabel('Actual')      
    plt.show()

In [22]:

evaluate(actual, sales_pred)

Total Sales Actual: 1644057.0
Total Sales Predicted: 1393624.0
Individual R2 score: 0.6680390533741762 
Individual Mean Absolute Error: 313.54211073164504


It seems like the model does a fairly good job at predicting the customers' revenue for the next 3 months. Let's see whether we can get similar performance from the Machine Learning approach.
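
To make the two metrics printed by `evaluate` concrete, here is a small NumPy sketch of how R2 and the mean absolute error are defined. The helper names `r2` and `mae` and the toy numbers are illustrative; the notebook itself uses `r2_score` and `mean_absolute_error` from scikit-learn.

```python
import numpy as np


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)      # squared error of the model
    ss_tot = np.sum((actual - actual.mean()) ** 2)  # squared error of the mean
    return 1.0 - ss_res / ss_tot


def mae(actual, predicted):
    """Mean absolute error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs(actual - predicted))


# Toy revenue figures, not taken from the dataset.
actual = [0.0, 100.0, 200.0, 300.0]
reasonable = [50.0, 100.0, 150.0, 400.0]
poor = [-10.0, -10.0, -10.0, -10.0]

print(r2(actual, reasonable), mae(actual, reasonable))
print(r2(actual, poor), mae(actual, poor))
```

R2 compares the model with the trivial prediction "always the mean", so it turns negative whenever the model does worse than that baseline.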

## ML Approach

This approach differs from the statistical one in the sense that it doesn't fit a distribution for latent parameters but models the conditional expectations explicitly. I'm going to use a small Deep Neural Network because it does quite a good job at implicit feature engineering and models interaction effects quite nicely. I'm going to split my work into 3 parts:
1. Feature engineering for train and test periods
2. Modelling
3. Evaluation
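
The phrase "models the conditional expectations explicitly" can be made precise with a standard result about the squared-error loss. In the idealized case of unlimited data and a sufficiently flexible model, the function that minimizes the mean squared error is the conditional mean of the target given the features. The notation is mine: $x$ is a customer's feature vector and $y$ the revenue in the target period.

$$
f^{*}(x) = \arg\min_{f} \; \mathbb{E}\big[(y - f(x))^2\big] = \mathbb{E}[\,y \mid x\,]
$$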

### Feature Engineering
The possibility of including additional features in the model besides RFM is the greatest advantage of this approach. However, it's also the main limitation, as **your model is going to be only as good as your features**. So make sure to spend a lot of time on this section, and experiment yourself to find the best features.

In [23]:
#  Feature engineering 
def get_features(data, feature_start, feature_end, target_start, target_end):
    """
    Function that outputs the features and targets on the user-level.
    Inputs:
        * data - a dataframe with raw data
        * feature_start - a string start date of feature period
        * feature_end - a  string end date of feature period
        * target_start - a  string start date of target period
        * target_end - a  string end date of target period
    """
    features_data = data.loc[(data.date >= feature_start) & (data.date <= feature_end), :]
    print(f'Using data from {(pd.to_datetime(feature_end) - pd.to_datetime(feature_start)).days} days')
    print(f'To predict {(pd.to_datetime(target_end) - pd.to_datetime(target_start)).days} days')
    
    #Transactions data features
    total_rev = features_data.groupby('Customer ID')['Revenue'].sum().rename('total_revenue')
    recency = (features_data.groupby('Customer ID')['date'].max() - features_data.groupby('Customer ID')['date'].min()).apply(lambda x: x.days).rename('recency')
    frequency = features_data.groupby('Customer ID')['InvoiceDate'].count().rename('frequency')
    # NOTE: the reference date below is hard-coded to 2011-06-11 and does not depend on feature_end
    t = features_data.groupby('Customer ID')['date'].min().apply(lambda x: (datetime(2011, 6, 11) - x).days).rename('t')
    time_between = (t / frequency).rename('time_between')
    avg_basket_value = (total_rev / frequency).rename('avg_basket_value')
    avg_basket_size = (features_data.groupby('Customer ID')['Quantity'].sum() / frequency).rename('avg_basket_Size')
    returns = features_data.loc[features_data['Revenue'] < 0, :].groupby('Customer ID')['InvoiceDate'].count().rename('num_returns')
    hour = features_data.groupby('Customer ID')['hour'].median().rename('purchase_hour_med')
    dow = features_data.groupby('Customer ID')['dayofweek'].median().rename('purchase_dow_med')
    weekend =  features_data.groupby('Customer ID')['weekend'].mean().rename('purchase_weekend_prop')
    train_data = pd.DataFrame(index = rfm_train_test.index)
    train_data = train_data.join([total_rev, recency, frequency, t, time_between, avg_basket_value, avg_basket_size, returns, hour, dow, weekend])
    train_data = train_data.fillna(0)
    
    #Target data
    target_data = data.loc[(data.date >= target_start) & (data.date <= target_end), :]
    target_quant = target_data.groupby(['Customer ID'])['date'].nunique()
    target_rev = target_data.groupby(['Customer ID'])['Revenue'].sum().rename('target_rev')
    train_data = train_data.join(target_rev).fillna(0)
    
    return train_data.iloc[:, :-1], train_data.iloc[:, -1]

In [24]:
X_train, y_train = get_features(data, '2010-01-01', '2010-12-30', '2010-12-31', '2011-11-30')

Using data from 363 days
To predict 334 days


In [25]:
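# NOTE: this call uses the same feature and target windows as the training call above,
# so X_test and y_test duplicate the training data instead of covering a later period.
X_test, y_test = get_features(data, '2010-01-01', '2010-12-30', '2010-12-31', '2011-11-30')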

Using data from 363 days
To predict 334 days


### Modelling
Here, I'm going to use the Keras API of TensorFlow to build a simple DNN. The architecture doesn't really matter here because the problem is simple and small enough. However, if you have more data, make sure to fine-tune the model. Start small, and see whether the performance increases as the complexity increases.

In [75]:
#DNN
def build_model():
    model = keras.Sequential([
    layers.Dense(32, activation='relu', input_shape=[len(X_train.columns), ]),
    layers.Dropout(0.3),
    layers.Dense(32, activation='relu'),
    layers.Dense(1)
    ])

    optimizer = tf.keras.optimizers.Adam(0.001)
   
    model.compile(loss='mse',
            optimizer=optimizer,
            metrics=['mae', 'mse'])
    
    return model



In [82]:
y_train

Customer ID
12346.0    77183.60
12347.0     2726.27
12348.0      612.68
12349.0     1326.49
12352.0     1373.01
             ...   
18280.0       95.70
18281.0       64.32
18283.0     1493.47
18286.0        0.00
18287.0     1554.32
Name: target_rev, Length: 3608, dtype: float64

In [81]:
X_train

| Customer ID | total_revenue | recency | frequency | t | time_between | avg_basket_value | avg_basket_Size | num_returns | purchase_hour_med | purchase_dow_med | purchase_weekend_prop |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 12346.0 | -83.67 | 273.0 | 34.0 | 523.0 | 15.382353 | -2.460882 | 0.558824 | 11.0 | 13.0 | 0.0 | 0.000000 |
| 12347.0 | 1558.26 | 37.0 | 78.0 | 223.0 | 2.858974 | 19.977692 | 11.038462 | 0.0 | 14.0 | 1.0 | 0.307692 |
| 12348.0 | 1070.40 | 80.0 | 32.0 | 257.0 | 8.031250 | 33.450000 | 48.187500 | 0.0 | 14.0 | 0.0 | 0.000000 |
| 12349.0 | 2332.15 | 182.0 | 85.0 | 408.0 | 4.800000 | 27.437059 | 9.552941 | 0.0 | 9.0 | 3.0 | 0.000000 |
| 12352.0 | 220.65 | 17.0 | 12.0 | 211.0 | 17.583333 | 18.387500 | 10.583333 | 0.0 | 10.0 | 0.0 | 0.000000 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 18280.0 | 253.75 | 14.0 | 19.0 | 213.0 | 11.210526 | 13.355263 | 6.157895 | 2.0 | 15.0 | 2.0 | 0.000000 |
| 18281.0 | 97.54 | 0.0 | 7.0 | 396.0 | 56.571429 | 13.934286 | 9.428571 | 0.0 | 10.0 | 1.0 | 0.000000 |
| 18283.0 | 544.02 | 276.0 | 190.0 | 477.0 | 2.510526 | 2.863263 | 1.436842 | 0.0 | 13.0 | 3.0 | 0.289474 |
| 18286.0 | 555.13 | 57.0 | 42.0 | 352.0 | 8.380952 | 13.217381 | 9.285714 | 3.0 | 11.0 | 4.0 | 0.000000 |


In [76]:
len(X_train.columns)

11

In [77]:
# The patience parameter is the number of epochs to wait for an improvement before stopping
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=20)
#early_stop = keras.callbacks.EarlyStopping(monitor='val_mse', patience=50)
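
To illustrate what `patience` means, the plain-Python sketch below walks through a list of validation losses and reports the epoch at which training would stop. It mirrors the idea behind the callback but is not Keras's actual implementation; the function name `early_stopping_epoch` and the loss values are made up for the example.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which training would stop, or None if it never stops.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss   # new best loss: reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses: the loss improves, stalls for a while, then improves again.
losses = [5.0, 4.0, 3.0, 3.5, 3.2, 3.1, 2.9]
print(early_stopping_epoch(losses, patience=2))
print(early_stopping_epoch(losses, patience=5))
```

With a small patience the run is cut off during the stall, while a larger patience lets it reach the later improvement. That trade-off is why the notebook uses a fairly generous value.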

In [78]:
model = build_model()

In [83]:
early_history = model.fit( X_train, y_train, 
                    epochs=1000, validation_split = 0.2, verbose=0,
                    callbacks=[early_stop, tfmodel.EpochDots()])

ValueError: Expect x to be a non-empty array or dataset.

## Evaluation

Let's see how well the model can predict the next 3 months, which it has never seen before. We're going to use data from the most recent period (X_test) to make sure that our forecast is as accurate as possible.

In [65]:
#Predicting
dnn_preds = model.predict(X_test).ravel()

In [66]:
#Putting the actual and predictions into the same dataframe for later comparison
compare_df = pd.DataFrame(index=X_test.index)
compare_df['dnn_preds'] = dnn_preds
compare_df = compare_df.join(sales_pred.rename('stat_pred')).fillna(0)
compare_df['actual'] = y_test

evaluate(compare_df['actual'], compare_df['dnn_preds'])

Total Sales Actual: 5465561.0
Total Sales Predicted: -832459.0
Individual R2 score: -0.3061928418961304 
Individual Mean Absolute Error: 1749.7772905954478


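In this run the DNN is not accurate: the R2 score is negative (about -0.31), the mean absolute error (about 1,750) is far above the roughly 314 obtained with BG/NBD + Gamma-Gamma, and the total predicted sales are negative. The actual totals also differ between the two evaluations (5,465,561 here against 1,644,057 above), because the targets do not cover the same period. Let's still attempt to compare the two models.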

## Comparison

I'll attempt to compare the performance of DNN and BG/NBD models by looking at:
1. How well do they fit the non-outlier distribution
2. How much revenue do the top 20% of CLV customers generate

It should be noted that the datasets to train the models do differ a bit. E.g. some customer IDs had to be dropped because of their returns so the expected value was replaced by 0 in BG/NBD. Still, I'm not looking at the prediction on the user level but at the aggregate so this should not affect the evaluation. 
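
The second comparison criterion can be wrapped in a small helper, sketched here on toy data. The function name `revenue_of_top_share` and the example values are hypothetical. The helper ranks customers by predicted value, keeps the top share, and sums the revenue those customers actually generated.

```python
import numpy as np
import pandas as pd


def revenue_of_top_share(predictions, actual_revenue, share):
    """Actual revenue generated by the top `share` of customers ranked by prediction.

    Both arguments are Series indexed by customer ID.
    """
    top_n = int(np.round(len(predictions) * share))
    top_ids = predictions.sort_values(ascending=False).index[:top_n]
    # Customers without revenue in the evaluation period count as 0.
    return actual_revenue.reindex(top_ids).fillna(0).sum()


# Toy predictions and actual revenue for five customers.
preds = pd.Series([5.0, 1.0, 9.0, 3.0, 7.0], index=list("abcde"))
actual = pd.Series([50.0, 10.0, 20.0, 30.0, 70.0], index=list("abcde"))

print(revenue_of_top_share(preds, actual, share=0.4))
```

Calling the helper once per model and per share (0.2, 0.1) avoids repeating the selection code and keeps the printed labels in sync with the share being evaluated.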

In [67]:
#Keep the data up to the 98.5th percentile of actual revenue
no_out = compare_df.loc[(compare_df['actual'] <= np.quantile(compare_df['actual'], 0.985)), :]

sns.distplot(no_out['actual'])
sns.distplot(no_out['dnn_preds'])
plt.title('Actual vs DNN Predictions')
plt.show()

In [68]:
sns.distplot(no_out['actual'])
sns.distplot(no_out['stat_pred'])
plt.title('Actual vs BG/NBD Predictions')
plt.show()

It looks like both models correctly model the revenue as heavily skewed with a long tail. Nevertheless, the DNN seems to fit the data better, as it doesn't have the second spike seen in the BG/NBD predictions. Let's now look at the revenue of the top 20%.

In [70]:
top_n = int(np.round(compare_df.shape[0] * 0.2))
print(f'Selecting the first {top_n} users')

#Selecting IDs
dnn_ids = compare_df['dnn_preds'].sort_values(ascending=False).index[:top_n].values
stat_ids = compare_df['stat_pred'].sort_values(ascending=False).index[:top_n].values

#Filtering the data
eval_subset = data.loc[data.date >= '2011-09-10', :]

#Sums
dnn_rev = eval_subset.loc[eval_subset['Customer ID'].isin(dnn_ids), 'Revenue'].sum() 
stat_rev = eval_subset.loc[eval_subset['Customer ID'].isin(stat_ids), 'Revenue'].sum()


print(f'Top 20% selected by DNN have generated {np.round(dnn_rev)}')
print(f'Top 20% selected by BG/NBD and Gamma Gamma have generated {np.round(stat_rev)}')
print(f'Thats {np.round(dnn_rev - stat_rev)} of marginal revenue')

Selecting the first 722 users
Top 20% selected by DNN have generated 234238.0
Top 20% selected by BG/NBD and Gamma Gamma have generated 1493337.0
Thats -1259099.0 of marginal revenue


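In this run the difference is -1,259,099: the top 20% selected by the DNN generated 234,238, against 1,493,337 for the BG/NBD + Gamma-Gamma selection. That gap is far from insignificant and is in line with the poor DNN fit in the evaluation above. With a properly trained DNN the two selections would be expected to be close, which would not be surprising given that we've used only the transactions data in our DNN model. What about the first 10%?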

In [71]:
top_n = int(np.round(compare_df.shape[0] * 0.1))
print(f'Selecting the first {top_n} users')

#Selecting IDs
dnn_ids = compare_df['dnn_preds'].sort_values(ascending=False).index[:top_n].values
stat_ids = compare_df['stat_pred'].sort_values(ascending=False).index[:top_n].values

#Filtering the data
eval_subset = data.loc[data.date >= '2011-09-10', :]

#Sums
dnn_rev = eval_subset.loc[eval_subset['Customer ID'].isin(dnn_ids), 'Revenue'].sum() 
stat_rev = eval_subset.loc[eval_subset['Customer ID'].isin(stat_ids), 'Revenue'].sum()


print(f'Top 10% selected by DNN have generated {np.round(dnn_rev)}')
print(f'Top 10% selected by BG/NBD and Gamma Gamma have generated {np.round(stat_rev)}')
print(f'Thats {np.round(dnn_rev - stat_rev)} of marginal revenue')

Selecting the first 361 users
Top 10% selected by DNN have generated 133932.0
Top 10% selected by BG/NBD and Gamma Gamma have generated 1194640.0
Thats -1060708.0 of marginal revenue


With the first 10%, the DNN selection is again far behind in this run (133,932 against 1,194,640). Because the DNN evaluated here has a negative R2, these outputs do not support a conclusion on their own. The conclusion of the blog this notebook follows is: **Given only the transactions data, the DNN's performance is similar to the BG/NBD + Gamma-Gamma approach**.

In [None]:
# #  Context data for the revenue (date & customerID)
# id_lookup = data[['Customer ID', 'InvoiceDate']].drop_duplicates()
# id_lookup.index = id_lookup['InvoiceDate']
# id_lookup = id_lookup.drop('InvoiceDate', axis=1)
# transaction_data = pd.DataFrame(data.groupby('InvoiceDate')['Revenue'].sum())


### DATA EXPLANATIONS
The data covers around two years. The ML approach needs separate time periods for feature creation, training targets, and validation targets, so the data will be split into the following segments:
*   Training Features Period - from 2011-01-01 until 2011-06-11
*   Training Target Period - from 2011-06-12 until 2011-09-09
*   Testing Features Period - from 2011-04-02 until 2011-09-10
*   Testing Target Period - from 2011-09-11 until 2011-12-09


In [43]:
#Dataset info
print(f'Total Number of Purchases: {data.shape[0]}')
print(f'Total Number of transactions: {data.InvoiceDate.nunique()}')
print(f'Total Unique Days: {data.date.nunique()}')
print(f"Total Unique Customers: {data['Customer ID'].nunique()}")
print(f"We are predicting {(data['date'].max() - datetime(2011, 9, 11)).days} days")

Total Number of Purchases: 853896
Total Number of transactions: 45660
Total Unique Days: 604
Total Unique Customers: 5910
We are predicting 89 days


In [None]:
#DNN
def build_model():
    model = keras.Sequential([
    layers.Dense(256, activation='relu', input_shape=[len(X_train.columns), ]),
    layers.Dropout(0.3),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.3),
    layers.Dense(32, activation='relu'),
    layers.Dense(1)
    ])

    optimizer = tf.keras.optimizers.Adam(0.001)

    model.compile(loss='mse',
            optimizer=optimizer,
            metrics=['mae', 'mse'])
    
    return model

# The patience parameter is the amount of epochs to check for improvement
early_stop = keras.callbacks.EarlyStopping(monitor='val_mse', patience=50)

model = build_model()
#Should take 10 sec
early_history = model.fit(X_train, y_train, 
                    epochs=1000, validation_split = 0.2, verbose=0, 
                    callbacks=[early_stop, tfdocs.modeling.EpochDots()])



In [45]:

def evaluate(actual, sales_prediction):
    print(f"Total Sales Actual: {np.round(actual.sum())}")
    print(f"Total Sales Predicted: {np.round(sales_prediction.sum())}")
    print(f"Individual R2 score: {r2_score(actual, sales_prediction)} ")
    print(f"Individual Mean Absolute Error: {mean_absolute_error(actual, sales_prediction)}")
    plt.scatter(sales_prediction, actual)
    plt.xlabel('Prediction')
    plt.ylabel('Actual')      
    plt.show()

#Predicting
dnn_preds = model.predict(X_test).ravel()

evaluate(y_test, dnn_preds)


Total Sales Actual: 5465561.0
Total Sales Predicted: -1676.0
Individual R2 score: -0.0421635794136197 
Individual Mean Absolute Error: 1520.956218701501
