<img src="./images/logo.png" alt="Drawing" style="width: 500px;"/>

# **Introduction**

You are a newly promoted data scientist at one of Europe's leading grocery chains. You are in charge of ingesting and presenting sales numbers from stores in several countries, gathering crucial insights into what customers like and how they shop so that the business can operate more effectively. Whilst presenting to your executives, you have been asked to come up with a new idea for the self-serve checkout experience that leverages vision-based machine learning - an exciting, futuristic and **smart** way to shop.

To make this experience a reality for customers, we'll first need to understand how they shop. You will learn to leverage **Apache Spark** and its powerful distributed processing capabilities to analyze sales data and uncover insights across stores in three major European countries. You will then learn how to build, store and manage datasets from that sales data and serve them to other applications using **EzPresto**, which will allow you to build a dashboard in **Apache Superset** that displays these datasets in a professional and modular manner.

You will then learn how to train an object recognition model using **TensorFlow** and optimize a scalable and collaborative machine learning workflow using **MLflow**. From there, you will learn to serve your newly created model at scale - hundreds of self-serve checkouts in multiple stores across several countries - using **KServe** on **Kubeflow**. Finally, we will deploy the application that every self-serve checkout runs to request inferences from this model and deliver a smart retail experience for thousands of customers!

All of these applications will be running natively within a clean installation of **HPE Ezmeral Unified Analytics** - no further setup required to get these best-in-class open-source tools to connect and leverage one another.

Let's get started!

# **Prerequisites**

1. **Before running the exercise notebooks**, open a new tab and start a new **Terminal** session.
1. Change to the directory that contains this notebook.
1. Run `pip install -r requirements.txt`. If your environment sits behind a proxy, make sure its proxy settings are configured first.
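
If you would like to confirm that the install step worked, the check below is one option. It is a minimal sketch rather than part of the official setup: the helper names `parse_requirement_names` and `missing_packages` are hypothetical, and the parser only understands plain `name` or `name==version` style lines (it skips comments and pip options such as `-r`, and does not handle URL-based requirements).

```python
import re
from importlib import metadata
from pathlib import Path


def parse_requirement_names(text):
    """Return the package names found in requirements-style text."""
    names = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blank lines, comments and pip options (-r, -e, ...)
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumes requirements.txt sits next to this notebook, as in step 2.
    requirements = Path("requirements.txt")
    if requirements.exists():
        names = parse_requirement_names(requirements.read_text())
        print("Missing packages:", missing_packages(names) or "none")
```

An empty result suggests every listed package is present; it does not verify that the installed versions match the pins in `requirements.txt`.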

# **Contents**

<a href="./01.exploring_data_with_spark.ipynb" style="color: black"><b style="color: #01a982;">Exercise 1:</b> Exploring Sales Data with Apache Spark</a>

<a href="./02.query_with_ezpreso.ipynb" style="color: black"><b style="color: #01a982;">Exercise 2:</b> Connecting and Querying Data Sources with EzPresto</a>

<a href="./03.visualizing_data_superset.ipynb" style="color: black"><b style="color: #01a982;">Exercise 3:</b> Visualizing Data with Superset</a>

<a href="./04.model_training.ipynb" style="color: black"><b style="color: #01a982;">Exercise 4:</b> Building an Image Classification Model with TensorFlow and MLflow</a>

<a href="./05.working_in_mlflow.ipynb" style="color: black"><b style="color: #01a982;">Exercise 5:</b> Tracking, Registering and Inferencing Models in MLflow</a>

<a href="./06.serve_model_kserve.ipynb" style="color: black"><b style="color: #01a982;">Exercise 6:</b> Serving Your Model with KServe</a>

<a href="./07.deploying_retail_application.ipynb" style="color: black"><b style="color: #01a982;">Exercise 7:</b> Deploying Custom Applications on HPE Ezmeral Unified Analytics</a>


# **Acknowledgements**

The Smart Retail Experience technical demonstration was written and prepared by Alex Ollman.

This technical demonstration is an adaptation of the *Retail Demo* developed by Dirk Derichsweiler, Isabelle Steinhauser and Vincent Charbonnier of the HPE Ezmeral DACH Solutions Architect team. Their collective assistance across various timezones in bringing this adaptation of their work to life is deeply appreciated.

Special thanks to Simhan Naveenam and Dimitrios Poulopoulos for their troubleshooting and guidance over several sessions; I learned an incredible amount from each one.
