# Building Interactive Dashboards for Machine Learning using [Plotly Dash](https://plotly.com/dash/) 

### Models
The dataset contains 2-D embeddings (an `_x` and a `_y` column each) produced by the following models:

- [Principal Component Analysis (PCA)](https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html)
- [Uniform Manifold Approximation and Projection (UMAP)](https://umap-learn.readthedocs.io/en/latest/)
- [Autoencoder (AE)](https://www.tensorflow.org/tutorials/generative/autoencoder)
- [Variational Autoencoder (VAE)](https://www.tensorflow.org/tutorials/generative/cvae)

In [30]:
# Import libraries
import pandas as pd

In [31]:
# Load data
df = pd.read_csv('data/customer_dataset.csv')
print('Shape of data', df.shape)
df.head()

Shape of data (440, 17)


| | Channel | Region | Fresh | Milk | Grocery | Frozen | Detergents_Paper | Delicatessen | pca_x | pca_y | umap_x | umap_y | ae_x | ae_y | vae_x | vae_y | Total_Spend |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 3 | 12669 | 9656 | 7561 | 214 | 2674 | 1338 | 0.193291 | -0.3051 | 7.08431 | 6.933166 | 3.548878 | 3.811006 | 0.82864 | 0.798793 | 34112 |
| 1 | 2 | 3 | 7057 | 9810 | 9568 | 1762 | 3293 | 1776 | 0.43442 | -0.328413 | 6.25288 | 7.05078 | 3.579156 | 2.955884 | 0.838629 | 0.814789 | 33266 |
| 2 | 2 | 3 | 6353 | 8808 | 7684 | 2405 | 3516 | 7844 | 0.811143 | 0.815096 | 8.588828 | 6.877347 | 1.341199 | 2.187068 | 0.841106 | 0.797111 | 36610 |
| 3 | 1 | 3 | 13265 | 1196 | 4221 | 6404 | 507 | 1788 | -0.778648 | 0.652754 | 13.654358 | 7.857928 | 6.34953 | 8.099434 | 0.814431 | 0.814974 | 27381 |
| 4 | 2 | 3 | 22615 | 5410 | 7198 | 3915 | 1777 | 5185 | 0.166287 | 1.271434 | 9.122227 | 5.977852 | 1.150562 | 3.304798 | 0.853156 | 0.828196 | 46100 |


In [52]:
# Columns of interest
columns = [col for col in df.columns if not (col.endswith('_x') or col.endswith('_y'))]
print(columns)

['Channel', 'Region', 'Fresh', 'Milk', 'Grocery', 'Frozen', 'Detergents_Paper', 'Delicatessen', 'Total_Spend']
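
The filter above drops the embedding coordinates. The complementary step, picking the x/y pair that belongs to one model, could be sketched as follows. `embedding_columns` is a hypothetical helper, and the `<model>_x` / `<model>_y` naming scheme is an assumption inferred from the column names in the data preview.

```python
from typing import Tuple

import pandas as pd


def embedding_columns(df: pd.DataFrame, model: str) -> Tuple[str, str]:
    """Return the (x, y) column names for a model code such as 'PCA'.

    Assumes columns are named '<model>_x' and '<model>_y' in lower case.
    """
    prefix = model.lower()
    x_col, y_col = f"{prefix}_x", f"{prefix}_y"
    # Fail early if the expected columns are not present
    missing = [col for col in (x_col, y_col) if col not in df.columns]
    if missing:
        raise KeyError(f"Missing embedding columns: {missing}")
    return x_col, y_col


# Tiny stand-in frame that follows the same naming scheme as the dataset
demo = pd.DataFrame({"Fresh": [1, 2], "pca_x": [0.1, 0.2], "pca_y": [0.3, 0.4]})
x_col, y_col = embedding_columns(demo, "PCA")
print(x_col, y_col)
```

The same call on the real `df` with `'UMAP'`, `'AE'` or `'VAE'` would return the matching pair, provided the columns follow that scheme.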


In [33]:
# Columns with at most 50 unique values (likely categorical)
for col in df.columns:
    if len(df[col].unique()) <= 50:
        print(col)
        print(df[col].unique())

Channel
[2 1]
Region
[3 1 2]


In [51]:
# Short model codes and the names shown to the user
models = ['PCA', 'UMAP', 'AE', 'VAE']
user_view_models = ['Principal Component Analysis',
                    'Uniform Manifold Approximation and Projection',
                    'Autoencoder', 'Variational Autoencoder']

# Map each user-facing name to its short code
res = dict(zip(user_view_models, models))

# Print one {'label': ..., 'value': ...} entry per model,
# ready to paste into a dropdown's options list
output_format = ''.join(
    f"{{'label': '{key}', 'value': '{val}'}}, \n" for key, val in res.items()
)
print(output_format)

{'label': 'Principal Component Analysis', 'value': 'PCA'}, 
{'label': 'Uniform Manifold Approximation and Projection', 'value': 'UMAP'}, 
{'label': 'Autoencoder', 'value': 'AE'}, 
{'label': 'Variational Autoencoder', 'value': 'VAE'}, 
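
Instead of printing the option dictionaries as text and pasting them into the app, the same entries can be built directly as Python objects. This is a minimal sketch; `to_options` is a hypothetical helper name, not something from the notebook or from Dash.

```python
def to_options(labels, values):
    """Pair user-facing labels with internal values as label/value dicts."""
    if len(labels) != len(values):
        raise ValueError("labels and values must have the same length")
    return [{"label": label, "value": value} for label, value in zip(labels, values)]


model_options = to_options(
    ["Principal Component Analysis",
     "Uniform Manifold Approximation and Projection",
     "Autoencoder",
     "Variational Autoencoder"],
    ["PCA", "UMAP", "AE", "VAE"],
)

for option in model_options:
    print(option)
```

A list in this label/value shape is what Dash dropdown components accept for their `options` argument, so `model_options` could be passed there without any copy-and-paste step.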



In [50]:
# Color schemes: internal values and the labels shown to the user
color_values = ['OrRd', 'Viridis', 'Plasma']
color_labels = ['Orange to Red', 'Viridis', 'Plasma']

# Map each user-facing label to its color scale name
res = dict(zip(color_labels, color_values))

# Print one {'label': ..., 'value': ...} entry per color scheme,
# ready to paste into a dropdown's options list
output_format = ''.join(
    f"{{'label': '{key}', 'value': '{val}'}}, \n" for key, val in res.items()
)
print(output_format)

{'label': 'Orange to Red', 'value': 'OrRd'}, 
{'label': 'Viridis', 'value': 'Viridis'}, 
{'label': 'Plasma', 'value': 'Plasma'}, 
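
To illustrate how a selected model, color column and color scale might come together in a plot, the sketch below assembles keyword arguments for a Plotly Express scatter plot. `scatter_kwargs` is a hypothetical helper, and the `<model>_x` / `<model>_y` column naming is an assumption based on the data preview; the keys `x`, `y`, `color` and `color_continuous_scale` are parameters of `px.scatter`.

```python
def scatter_kwargs(model, color_column, color_scale):
    """Build keyword arguments for a scatter plot of one model's embedding.

    Assumes embedding columns are named '<model>_x' and '<model>_y' in lower case.
    """
    prefix = model.lower()
    return {
        "x": f"{prefix}_x",
        "y": f"{prefix}_y",
        "color": color_column,                  # column used to color the points
        "color_continuous_scale": color_scale,  # e.g. 'OrRd', 'Viridis', 'Plasma'
    }


kwargs = scatter_kwargs("UMAP", "Total_Spend", "Viridis")
print(kwargs)
```

With Plotly Express imported as `px`, a call such as `px.scatter(df, **kwargs)` would then draw the chosen embedding, colored by the chosen column.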



In [54]:
# Mean of each column of interest
df_average = df[columns].mean()
df_average

Channel                 1.322727
Region                  2.543182
Fresh               12000.297727
Milk                 5796.265909
Grocery              7951.277273
Frozen               3071.931818
Detergents_Paper     2881.493182
Delicatessen         1524.870455
Total_Spend         33226.136364
dtype: float64
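
One possible use of these averages is to show how a single customer compares with the dataset mean. The sketch below is an assumption about how that might be done, not the notebook's method; `percent_vs_average` is a hypothetical helper. It is applied to spending columns only, on the assumption that `Channel` and `Region` are category codes (as their few unique values suggest) whose means carry little meaning.

```python
import pandas as pd


def percent_vs_average(df: pd.DataFrame, row_label, columns) -> pd.Series:
    """Percentage difference between one row and the column means."""
    avg = df[columns].mean()
    row = df.loc[row_label, columns].astype(float)
    return (row - avg) / avg * 100


# Tiny stand-in frame; the real dataset would use its spending columns instead
demo = pd.DataFrame({"Fresh": [100, 300], "Milk": [50, 150]})
diff = percent_vs_average(demo, 0, ["Fresh", "Milk"])
print(diff)
```

A negative value means the customer spends less than average in that category, a positive value more.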