# Retail Store Analysis Dashboard (Superset)

This tutorial provides an end-to-end workflow example for a retail store analysis scenario in HPE
Ezmeral Unified Analytics (EzUA) Software using EzPresto and Superset.

In this case, a data analyst working for a major retail company aims to visualize data sets from
MySQL, SQL Server, and Hive data sources using Superset. The analyst logs into HPE EzUA and
establishes connections to MySQL, SQL Server, and Hive data sources. Following this, the analyst
executes a federated query across the data sets and then creates a view based on this query. They
then access this view through Superset and utilize it to visualize the data in a bar chart,
ultimately creating a comprehensive dashboard. This dashboard provides a fully customized overview
of the retailer's operations.

## Table of Contents

- [Checking Data Sources](#checking-data-sources)
- [Select Data Sets and Create a View](#select-data-sets-and-create-a-view)
- [Connect to the Presto Database](#connect-to-the-presto-database)
- [Add the View to Superset and Create a Chart](#add-the-view-to-superset-and-create-a-chart)
- [Specify Query Conditions to Visualize Results in the Chart](#specify-query-conditions-to-visualize-results-in-the-chart)
- [Create a Superset Dashboard and Add the Chart](#create-a-superset-dashboard-and-add-the-chart)
- [Monitor Queries](#monitor-queries)
- [Conclusion](#conclusion)

# Checking Data Sources

In the sidebar navigation menu of HPE EzUA, click `Data Engineering > Data Sources`. Confirm that
the following data sources are listed:

- `mysql`
- `mssql`
- `hiveview`

> If you do not have these data sources, please follow the first tutorial notebook, "Data Source
> Connectivity and Exploration". 
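
If you prefer to confirm the data sources from SQL, the statements below are a minimal sketch. They
assume that each data source is exposed to EzPresto as a catalog of the same name (as the queries
later in this tutorial suggest) and that you run them from a SQL worksheet such as the Query Editor
described in the next section.

```sql
-- List every catalog visible to the query engine; mysql, mssql, and hiveview
-- are expected to appear if the data sources are connected.
SHOW CATALOGS;

-- List the schemas inside the two relational sources used in this tutorial.
SHOW SCHEMAS FROM mysql;
SHOW SCHEMAS FROM mssql;
```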


# Select Data Sets and Create a View

Next, create a view for the retail dashboard. First, select the data sources and data sets you want
to work with. Then, run a federated query against the selected data sets and create a view from the
query. This tutorial names the example view `qf_retailstore_view`:

1. In the sidebar navigation menu in HPE Ezmeral Unified Analytics, select `Data Engineering > Data Catalog`.
1. On the `Data Catalog` page, click the dropdown next to the `mysql` and `mssql` data sources
   to expose the available schemas in those data sources.
1. Select schemas for each of the data sources:
    - For the `mysql` data source, select the `retailstore` schema.
    - For the `mssql` data source, select the `dbo` schema.
1. In the `All Datasets` section, click the `filter` icon to open the `Filters` drawer.
1. Use the filter to identify and select the following data sets in the selected schemas:
    - For the `dbo` schema, filter for and select the following datasets:
        * `call_center`
        * `catalog_sales`
        * `date_dim`
        * `item`
    - For the `retailstore` schema, filter for and select the following datasets:
        * `customer`
        * `customer_address`
        * `customer_demographics`
1. After you select all the data sets, click `Apply`.
1. Click `Selected Datasets` (the button that displays the number of selected data sets).
1. In the drawer that opens, click `Query Editor`. Depending on the number of selected data sets,
   you may have to scroll down to the bottom of the drawer to see the `Query Editor` button.
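
Before writing the federated query, it can help to confirm that the selected tables exist and to
look at their join columns. The following is a minimal sketch, assuming the catalog and schema names
used in this tutorial, that you could run from a Query Editor worksheet:

```sql
-- List the tables in the two schemas selected above.
SHOW TABLES FROM mssql.dbo;
SHOW TABLES FROM mysql.retailstore;

-- Inspect column names and types for two of the tables that are joined later.
DESCRIBE mssql.dbo.catalog_sales;
DESCRIBE mysql.retailstore.customer;
```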


Now, let's query the datasets and create a view:

1. In the `Query Editor`, click `+` to `Add Worksheet`.
1. Run the following command to create a new schema, for example `hiveview.demoschema`:

    ```sql
    create schema if not exists hiveview.demoschema;
    ```
1. Run a query to create a new view from a federated query against the selected data sets, for
   example:

    ```sql
    create view hiveview.demoschema.qf_retailstore_view as select * from mssql.dbo.catalog_sales cs
    inner join mssql.dbo.call_center cc on cs.cs_call_center_sk = cc.cc_call_center_sk
    inner join mssql.dbo.date_dim d on cs.cs_sold_date_sk = d.d_date_sk
    inner join mssql.dbo.item i on cs.cs_item_sk = i.i_item_sk
    inner join mysql.retailstore.customer c on cs.cs_bill_customer_sk = c.c_customer_sk
    inner join mysql.retailstore.customer_address ca on c.c_current_addr_sk = ca.ca_address_sk
    inner join mysql.retailstore.customer_demographics cd on c.c_current_cdemo_sk = cd.cd_demo_sk
    ```
1. Click `Run`. When the query completes, the status `"Finished"` will display.
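
To check that the view was created and returns rows, a quick sanity check along these lines may be
useful. It is only a sketch: it assumes the schema and view names from the steps above and selects
three columns that are used later in this tutorial.

```sql
-- Confirm that the view is registered in the new schema.
SHOW TABLES FROM hiveview.demoschema;

-- Inspect the columns that the federated join exposes.
DESCRIBE hiveview.demoschema.qf_retailstore_view;

-- Sample a few rows using the columns that the chart relies on later.
SELECT cs_net_paid, ca_state, cc_name
FROM hiveview.demoschema.qf_retailstore_view
LIMIT 10;
```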

# Connect to the Presto Database

Complete the following steps to connect Superset to the Presto database for access to your data
sources and data sets in HPE EzUA. Once connected to the Presto database, you can access your data
sets in HPE EzUA from Superset.

To connect to the Presto database, you need the connection URI. You can get the URI from your HPE
EzUA administrator. To open Superset, in the left navigation pane of HPE Ezmeral Unified Analytics
Software, select `BI Reporting > Dashboards`. Superset opens in a new tab.

In Superset, perform the following:

1. Select `Settings > Database Connections`.
1. Click `+ DATABASE`.
1. In the `Connect a database` window, select the `Presto` tile.
1. Enter the SQLALCHEMY URI provided by your administrator.
1. Test the connection.
1. If the test was successful, click `Connect`.


# Add the View to Superset and Create a Chart

Complete the following steps to import the view you created in HPE EzUA and create a bar chart. This
tutorial demonstrates how to import the view `qf_retailstore_view`. To open Superset, in the left
navigation pane of HPE Ezmeral Unified Analytics Software, select `BI Reporting > Dashboards`.
Superset opens in a new tab.

In Superset, perform the following:

1. Click the `Datasets` tab.
1. Click `+ DATASET`.
1. In the Add Dataset window, select the following options:
    - `DATABASE:` Presto
    - `SCHEMA:` `<your_schema>`
    - `SEE TABLE SCHEMA:` `<your_view>`

    > This tutorial uses the `demoschema` schema and the `qf_retailstore_view` view created earlier.

1. Click `ADD DATASET AND CREATE CHART`.
1. In the `Create a New Chart` window, select `Bar Chart`.
1. Click `CREATE NEW CHART`.
1. Enter a name for the chart (for example, "Retail Store View").

![title](images/03a.png)

# Specify Query Conditions to Visualize Results in the Chart

In Superset, charts visualize data based on the query conditions that you specify. The charts
created in Superset automatically generate queries that Superset passes to the SQL query engine.
Superset visualizes the query results in the chart. Try applying query conditions to visualize your
data.

The following steps demonstrate how query conditions were applied to produce the example bar chart
shown at the end of this section. First, enter the specified query parameters in the following
fields:

- `METRICS`
    1. Click into the `METRICS` field (located on the `DATA` tab). A metrics window opens.
    1. Select the `Simple` tab.
    1. Click the `edit` icon and enter a name for the metric, such as `SUM(cs_net_paid)`.
    1. In the `Column` field, select `cs_net_paid`.
    1. In the `Aggregate` field, select `SUM`.
    1. Click `Save`.
- `FILTERS`
    1. Click into the `FILTERS` field (located on the `DATA` tab).
    1. In the window that opens, select the `CUSTOM SQL` tab.
    1. Select the `WHERE` filter and enter the following: `NULLIF(ca_state, '') IS NOT NULL`
    1. Click `Save`.
- `DIMENSIONS`
    1. Drag and drop the `ca_state` column into the `DIMENSIONS` field.
    1. Click into the `BREAKDOWNS` field.
    1. In the window that opens, select the `SIMPLE` tab and select the `cc_name` column.
    1. Click `Save`.
- `SORT BY`
    1. Click into the `SORT BY` field.
    1. In the window that opens, select the `SIMPLE` tab and enter `cs_net_paid` as the `COLUMN` and
       `SUM` as the `AGGREGATE`.
    1. Click `Save`.

Next, click `CREATE CHART`. The bar chart displays results when the query finishes processing.
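
For reference, the conditions above correspond roughly to the following aggregate query. This is a
hand-written approximation, not the exact SQL that Superset emits. The descending sort direction and
the fully qualified view name are assumptions, and Superset may add a row limit or restructure the
query.

```sql
-- Approximation of the chart query: total net paid per state, broken down by call center.
SELECT
    ca_state,                                -- DIMENSIONS
    cc_name,                                 -- BREAKDOWNS
    SUM(cs_net_paid) AS "SUM(cs_net_paid)"   -- METRICS
FROM hiveview.demoschema.qf_retailstore_view
WHERE NULLIF(ca_state, '') IS NOT NULL       -- FILTERS: skip empty or missing states
GROUP BY ca_state, cc_name
ORDER BY SUM(cs_net_paid) DESC;              -- SORT BY (direction assumed)
```

Running a query like this in the Query Editor is one way to cross-check the numbers that the chart
displays.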

![title](images/03b.png)

Click `Save` to save the chart. In the `Save Chart` window that opens, do not enter or select a
dashboard. Click `Save` to continue.

# Create a Superset Dashboard and Add the Chart

Complete the following steps to create a new dashboard and add your chart to the dashboard. This
tutorial adds the Retail Store View chart to a dashboard named Retail Store Analysis Dashboard.

To create a new dashboard and add your visualized data:

1. In Superset, click on the `Dashboards` tab.
1. Click `+ DASHBOARD`.
1. Enter a name for the dashboard (for example, "Retail Store Analysis Dashboard").
1. Drag and drop your chart into the dashboard.
1. Click `Save` to save the dashboard.

![title](images/03c.png)

# Monitor Queries

You can monitor queries generated through Superset from the EzPresto endpoint. You can access the
EzPresto endpoint in the EzPresto tile in the Applications & Frameworks space in HPE EzUA.

Complete the following steps to monitor the query that the chart generates:

1. Return to the HPE EzUA dashboard.
1. In the sidebar navigation menu, select `Applications & Frameworks`.
1. Under the `Data Engineering` tab, click the `EzPresto endpoint` in the `EzPresto` tile. The
   EzPresto interface will open in a new tab.
1. In the `Query Details` section, verify that `Finished` is selected. Selected options have a
   visible checkmark. 

![title](images/03d.png)

You can see the query that ran to populate the Retail Store View bar chart in the Retail Store
Analysis Dashboard. Click on the `Query ID` to see the query details.

![title](images/03e.png)

To see a visualized query plan and metadata for the query, click `Live Plan` and hover over
different areas of the visualized plan. You can also click on various parts of the visualized plan
to zoom in on details.
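
Presto also exposes recent queries through its `system.runtime.queries` table, which offers a SQL
alternative to the web interface. The sketch below assumes that the `system` catalog is queryable in
your EzPresto deployment and that you have permission to read it; neither is confirmed by this
tutorial.

```sql
-- List recent queries that touched the tutorial view, newest first.
SELECT query_id, state, created, query
FROM system.runtime.queries
WHERE query LIKE '%qf_retailstore_view%'
ORDER BY created DESC
LIMIT 20;
```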

# Conclusion

You have completed this tutorial series. This series demonstrated the integration of the HPE EzUA
SQL query engine (EzPresto) with Superset to visualize the results of a query on data sets made
available through the default Presto database connection. This tutorial also showed you how to
monitor queries from the EzPresto Cluster Monitoring tool.