# Retail Sales Performance and Inventory Reservation Use Case
This notebook uses RETAILER1_V, RETAILER2_V, and RETAILER3_V from DWC (SAP Data Warehouse Cloud), which are federated from BigQuery.
It also uses DISTRIBUTOR_V, PRODUCT_V, and RETAIL_V, which are local table views in DWC.

## Install fedml_azure package

In [1]:
pip install fedml-azure --force-reinstall

Processing ./fedml-azure
Collecting hdbcli
  Using cached hdbcli-2.12.20-cp34-abi3-manylinux1_x86_64.whl (11.7 MB)
Collecting ruamel.yaml
  Using cached ruamel.yaml-0.17.21-py3-none-any.whl (109 kB)
Collecting ruamel.yaml.clib>=0.2.6; platform_python_implementation == "CPython" and python_version < "3.11"
  Using cached ruamel.yaml.clib-0.2.6-cp36-cp36m-manylinux1_x86_64.whl (552 kB)
Installing collected packages: hdbcli, ruamel.yaml.clib, ruamel.yaml, fedml-azure-test
  Attempting uninstall: hdbcli
    Found existing installation: hdbcli 2.12.20
    Uninstalling hdbcli-2.12.20:
      Successfully uninstalled hdbcli-2.12.20
  Attempting uninstall: ruamel.yaml.clib
    Found existing installation: ruamel.yaml.clib 0.2.6
    Uninstalling ruamel.yaml.clib-0.2.6:
      Successfully uninstalled ruamel.yaml.clib-0.2.6
  Attempting uninstall: ruamel.yaml
    Found existing installation: ruamel.yaml 0.17.21
    Uninstalling ruamel.yaml-0.17.21:
      Successfully uninstalled ruamel.yaml-0.17.2

## Import the libraries needed in this notebook

In [2]:
from fedml_azure import create_workspace
from fedml_azure import create_compute
from fedml_azure import create_environment
from fedml_azure import DwcAzureTrain

## Set up
### Creating a workspace. This takes a dictionary as input for parameter workspace_args.

Before running the cell below, make sure that you have an Azure Machine Learning workspace, and replace subscription_id, resource_group, and workspace_name with your own values.
https://docs.microsoft.com/en-us/azure/machine-learning/how-to-manage-workspace?tabs=python


In [3]:
#creation of workspace
workspace=create_workspace(workspace_args={"subscription_id": "<subscription_id>",
                                        "resource_group": "<resource_group>",
                                        "workspace_name": "<workspace_name>"
                                        }
                        )



2022-03-31 18:27:59,586: fedml_azure.logger INFO: Getting existing Workspace
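
To avoid hard-coding identifiers in the notebook, the three values can be read from environment variables. The following is a minimal sketch, assuming the hypothetical variable names AZURE_SUBSCRIPTION_ID, AZURE_RESOURCE_GROUP, and AZURE_WORKSPACE_NAME (fedml_azure does not define them). The returned dictionary has the same shape as the workspace_args used above.

```python
import os

# Hypothetical environment variable names; fedml_azure does not define them.
ENV_TO_KEY = {
    "AZURE_SUBSCRIPTION_ID": "subscription_id",
    "AZURE_RESOURCE_GROUP": "resource_group",
    "AZURE_WORKSPACE_NAME": "workspace_name",
}


def workspace_args_from_env(environ=os.environ):
    """Build the workspace_args dictionary from environment variables."""
    missing = [name for name in ENV_TO_KEY if not environ.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: environ[name] for name, key in ENV_TO_KEY.items()}
```

The result can then be passed as create_workspace(workspace_args=workspace_args_from_env()).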


### Creating a Compute Cluster. This takes the workspace, a compute_type, and compute_args.
https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-attach-compute-cluster?tabs=python

https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.compute.amlcompute.amlcompute?view=azure-ml-py

In [4]:
#creation of compute target
compute=create_compute(workspace=workspace,
                   compute_type='AmlComputeCluster',
                   compute_args={'vm_size':'Standard_D1',
                                'compute_name':'cpu-cluster',
                                'max_nodes':1
                                }
                )

2022-03-31 18:28:00,241: fedml_azure.logger INFO: Creating Compute_target.
2022-03-31 18:28:00,596: fedml_azure.logger INFO: Found compute target. just use it. cpu-cluster


### Creating an Environment. This takes the workspace, environment_type, and environment_args.

The whl file for the fedml_azure library must be passed to the pip_wheel_files key in the environment_args.

In this example, the environment's conda dependencies are listed directly in the conda_packages key of environment_args rather than in a .yml file.

https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.environment(class)?view=azure-ml-py

https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-environments

In [5]:
#creation of environment
environment=create_environment(workspace=workspace,
                           environment_type='CondaPackageEnvironment',
                           environment_args={'name':'retail-env',
                           'conda_packages':['scikit-learn', 'lightgbm'],
                           'pip_wheel_files':['fedml-azure']})

2022-03-31 18:28:00,603: fedml_azure.logger INFO: Creating Environment.


## Now, let's train the model
### First, we need to instantiate the training class - this will assign the resources.

In [6]:
train=DwcAzureTrain(workspace=workspace,
                 environment=environment,
                 experiment_args={'name':'retail-exp'},
                 compute=compute)

2022-03-31 18:28:00,974: fedml_azure.logger INFO: Assigning Workspace.
2022-03-31 18:28:00,975: fedml_azure.logger INFO: Creating Experiment
2022-03-31 18:28:01,073: fedml_azure.logger INFO: Assigning compute.
2022-03-31 18:28:01,075: fedml_azure.logger INFO: Assigning Environment.


### Then, we need to generate the run config. This is needed to package the configuration specified so we can submit a job for training. 

Before running the following cell, you should have a config.json file containing the connection values that give you access to DWC. Pass the path of this file to config_file_path in the cell below.
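
A quick pre-flight check can catch a missing or incomplete config.json before a job is submitted. The sketch below assumes the file is a flat JSON object and that address, port, user, and password are among the required keys. These key names are an assumption here, so check them against the fedml_azure documentation for your version.

```python
import json

# Assumed key names; verify against the fedml_azure documentation.
ASSUMED_REQUIRED_KEYS = ("address", "port", "user", "password")


def missing_config_keys(config_file_path, required=ASSUMED_REQUIRED_KEYS):
    """Return the required keys that are absent or empty in the config file."""
    with open(config_file_path, encoding="utf-8") as handle:
        config = json.load(handle)
    return [key for key in required if not config.get(key)]
```

An empty list means every assumed key is present, for example missing_config_keys('config.json') == [].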

You should also have the views used in this notebook (DISTRIBUTOR_V, PRODUCT_V, RETAIL_V, RETAILER1_V, RETAILER2_V, and RETAILER3_V) created in your DWC.

https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.scriptrunconfig?view=azure-ml-py

In [7]:
#generating the run config
src=train.generate_run_config(config_file_path='config.json',
                          config_args={
                                          'source_directory':'Retail-Project',
                                          'script':'training_script.py',
                                          'arguments':['--model_file_name','retailmodel.pkl',
                                                       '--dist_table', 'DISTRIBUTOR_V',
                                                        '--dist_size', '1',
                                                        '--product_table', 'PRODUCT_V',
                                                        '--product_size', '1',
                                                        '--retailer_table', 'RETAIL_V',
                                                        '--retailer_size', '1',
                                                        '--retailer1_table', 'RETAILER1_V',
                                                        '--retailer1_size', '1',
                                                        '--retailer2_table', 'RETAILER2_V',
                                                        '--retailer2_size', '1',
                                                        '--retailer3_table', 'RETAILER3_V',
                                                        '--retailer3_size', '1',]
                                          }
                            )

2022-03-31 18:28:01,203: fedml_azure.logger INFO: Generating script run config.
2022-03-31 18:28:01,204: fedml_azure.logger INFO: Copying config file for db connection to script_directory Retail-Project
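
The arguments list is handed to training_script.py on the command line. The script itself is not shown in this notebook, so the following is only a minimal sketch of how such a script might read those flags with argparse. The flag names come from the cell above. The helper name, the float type for the size flags, and the default of 1.0 are assumptions.

```python
import argparse

# Table prefixes taken from the arguments list in the run config.
TABLE_PREFIXES = ("dist", "product", "retailer",
                  "retailer1", "retailer2", "retailer3")


def parse_training_args(argv=None):
    """Parse the flags that the run config passes to the training script."""
    # allow_abbrev=False requires every flag to be spelled out in full.
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--model_file_name", required=True)
    for prefix in TABLE_PREFIXES:
        parser.add_argument(f"--{prefix}_table", required=True)
        # Assumption: the size value is numeric; float accepts '1' as well
        # as fractional values.
        parser.add_argument(f"--{prefix}_size", type=float, default=1.0)
    return parser.parse_args(argv)
```

Inside the script, args = parse_training_args() would then expose values such as args.dist_table and args.retailer1_size.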


### Submitting the job for training

In [8]:
#submitting the training run
run=train.submit_run(src)

2022-03-31 18:28:01,360: fedml_azure.logger INFO: Submitting training run.
RunId: retail-exp_1648751281_981b0e88
Web View: https://ml.azure.com/runs/retail-exp_1648751281_981b0e88?wsid=/subscriptions/2a9092f3-f89d-4589-b9aa-8a1f9ff8b776/resourcegroups/fedml_rg/workspaces/fedml_ws&tid=997ff79a-373b-416b-8f24-a97cb0225c59

Execution Summary
RunId: retail-exp_1648751281_981b0e88
Web View: https://ml.azure.com/runs/retail-exp_1648751281_981b0e88?wsid=/subscriptions/2a9092f3-f89d-4589-b9aa-8a1f9ff8b776/resourcegroups/fedml_rg/workspaces/fedml_ws&tid=997ff79a-373b-416b-8f24-a97cb0225c59



## Register the model for deployment

In [9]:
model=train.register_model(run=run,
                           model_args={'model_name':'retail_model',
                                       'model_path':'outputs/retailmodel.pkl'},
                            resource_config_args={'cpu':1, 'memory_in_gb':0.5},
                            is_sklearn_model=False
                           )
print('Name:', model.name)
print('Version:', model.version)

2022-03-31 18:41:40,834: fedml_azure.logger INFO: Registering the model.
Name: retail_model
Version: 2
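
The model_path above points into the run's outputs folder, which Azure Machine Learning uploads to the run record when the job finishes. The following is a minimal sketch of how a training script might write the model file there, assuming the model can be pickled. The helper name and the use of pickle are illustrative; the actual training_script.py may serialize the model differently (for example with joblib).

```python
import os
import pickle


def save_model(model, model_file_name, output_dir="outputs"):
    """Pickle the model into the run's outputs folder and return the path."""
    os.makedirs(output_dir, exist_ok=True)
    model_path = os.path.join(output_dir, model_file_name)
    with open(model_path, "wb") as handle:
        pickle.dump(model, handle)
    return model_path
```

Called as save_model(model, 'retailmodel.pkl') on a Linux compute target, this writes outputs/retailmodel.pkl, which matches the model_path passed to register_model.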


## Deploying the model

In [13]:
inference_config_args = {
    'entry_script':'Retail-Project/predict.py',
    'environment': environment
}
deploy_args = {
    'workspace': workspace,
    'name': 'retail-usecase2',
    'models': [model],
    'kubeconfig_path': 'kubeconfig.yml',
    'sp_config_path': 'sp_config.json',
    'num_replicas': 2,

}

In [14]:
from fedml_azure import deploy

In [15]:
endpoint_url = deploy(compute_type='Kyma',deploy_args=deploy_args,inference_config_args=inference_config_args)

2022-03-31 18:42:37,208: fedml_azure.logger INFO: The command kubectl already exists.

2022-03-31 18:42:37,210: fedml_azure.logger INFO: The command jq already exists.

2022-03-31 18:42:37,288: fedml_azure.logger INFO: The deploy namespace is fedmlazure.

2022-03-31 18:42:37,964: fedml_azure.logger INFO: Creating deployment folder deployments/retail-usecase2.
2022-03-31 18:42:40,401: fedml_azure.logger INFO: Successfully created deployment folder deployments/retail-usecase2.
2022-03-31 18:42:40,402: fedml_azure.logger INFO: Creating docker file for deployment.
2022-03-31 18:42:41,411: fedml_azure.logger INFO: Successfully created docker file for kyma deployment.
2022-03-31 18:42:41,776: fedml_azure.logger INFO: Building and pushing the docker image to acr.

2022-03-31 18:42:48,083: fedml_azure.logger INFO: 2022/03/31 18:42:45 Downloading source code...

2022-03-31 18:42:48,085: fedml_azure.logger INFO: 2022/03/31 18:42:45 Finished downloading source code

2022-03-31 18:42:50,839: fedml

In [16]:
endpoint_url

'https://retail-usecase2.c-9da877f.kyma.shoot.live.k8s-hana.ondemand.com/score'

## Inferencing

In [17]:
import json
import pandas as pd
from fedml_azure import DbConnection
from fedml_azure import predict

In [20]:
db = DbConnection()
dist_data = db.get_data_with_headers(table_name='DISTRIBUTOR_V', size=1)
dist_data = pd.DataFrame(dist_data[0], columns=dist_data[1])

product_data = db.get_data_with_headers(table_name='PRODUCT_V', size=1)
product_data = pd.DataFrame(product_data[0], columns=product_data[1])

retailer_data = db.get_data_with_headers(table_name='RETAIL_V', size=1)
retailer_data = pd.DataFrame(retailer_data[0], columns=retailer_data[1])

retailer1_data = db.get_data_with_headers(table_name='RETAILER1_V', size=1)
retailer1_data = pd.DataFrame(retailer1_data[0], columns=retailer1_data[1])

retailer2_data = db.get_data_with_headers(table_name='RETAILER2_V', size=1)
retailer2_data = pd.DataFrame(retailer2_data[0], columns=retailer2_data[1])

retailer3_data = db.get_data_with_headers(table_name='RETAILER3_V', size=1)
retailer3_data = pd.DataFrame(retailer3_data[0], columns=retailer3_data[1])

In [21]:

# Request data goes here
data = {
    'dist_data': dist_data.values.tolist(),
    'product_data': product_data.values.tolist(),
    'retailer_data': retailer_data.values.tolist(),
    'retailer1_data': retailer1_data.values.tolist(),
    'retailer2_data': retailer2_data.values.tolist(),
    'retailer3_data': retailer3_data.values.tolist()
}
test_data = json.dumps(data)
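
Rows read from a database can contain values that json.dumps rejects, such as Decimal or date objects. Whether that applies to these views depends on their column types, which the notebook does not show. The following is a minimal sketch that builds the same payload shape from plain row lists and falls back to str for such values. The helper name is hypothetical, the sample rows are made up for illustration, and the str fallback is an assumption about what the scoring script can accept.

```python
import json
from datetime import date
from decimal import Decimal


def build_payload(tables):
    """Serialize a mapping of payload key -> list of rows to a JSON string."""
    # default=str converts values that json cannot encode natively
    # (for example Decimal or date) to their string form.
    return json.dumps(tables, default=str)


# Illustrative rows only; real rows come from the DWC views.
payload = build_payload({
    "dist_data": [[1, Decimal("2.50")]],
    "product_data": [[1002, date(2019, 1, 1)]],
})
print(payload)
```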

In [22]:
result = predict(endpoint_url=endpoint_url,compute_type='Kyma',data=test_data)

2022-03-31 18:49:11,192: fedml_azure.logger INFO: Using the parameters 'endpoint_url' and 'compute_type' for inferencing as service is not passed.


In [23]:
scores = result['result']
len(scores)

720000

In [24]:
type(scores)

list
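
Before loading 720,000 rows into a DataFrame with five named columns, it can help to confirm that every row actually has five fields. The following is a minimal sketch using only the standard library. The function name is hypothetical, and the expected width of five comes from the column list used in the next cell.

```python
def find_bad_rows(rows, expected_width=5, limit=10):
    """Return up to `limit` indexes of rows whose length is not expected_width."""
    bad = []
    for index, row in enumerate(rows):
        if len(row) != expected_width:
            bad.append(index)
            if len(bad) >= limit:
                break
    return bad
```

An empty result from find_bad_rows(scores) indicates that all rows match the expected width.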

In [25]:
result_df = pd.DataFrame(scores, columns=['retailer', 'productsku', 'calendar_year',
                          'calendar_month', 'Predictions'])

In [26]:
types = {'retailer': 'int',
'productsku': 'int',
'calendar_year': 'int',
'calendar_month': 'int'}
result_df = result_df.astype(types)

In [27]:
result_df['ID'] = result_df.index

In [29]:
result_df.head(10)

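|   | retailer | productsku | calendar_year | calendar_month | Predictions | ID |
|---|----------|------------|---------------|----------------|-------------|----|
| 0 | 10001 | 1002 | 2019 | 1 | 4916.0 | 0 |
| 1 | 10001 | 1002 | 2019 | 2 | 763.0 | 1 |
| 2 | 10001 | 1002 | 2019 | 3 | 633.0 | 2 |
| 3 | 10001 | 1002 | 2019 | 4 | 4591.0 | 3 |
| 4 | 10001 | 1002 | 2019 | 5 | 3390.0 | 4 |
| 5 | 10001 | 1002 | 2019 | 6 | 800.0 | 5 |
| 6 | 10001 | 1002 | 2019 | 7 | 4147.0 | 6 |
| 7 | 10001 | 1002 | 2019 | 8 | 989.0 | 7 |
| 8 | 10001 | 1002 | 2019 | 9 | 4674.0 | 8 |
| 9 | 10001 | 1002 | 2019 | 10 | 605.0 | 9 |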


In [30]:
db.create_table("CREATE TABLE Retail_Predictions (ID INTEGER PRIMARY KEY, retailer INTEGER, productsku INTEGER, calendar_year INTEGER, calendar_month INTEGER, Predictions FLOAT(2))")

2022-03-31 18:52:22,729: fedml_azure.logger INFO: creating table...
2022-03-31 18:52:22,730: fedml_azure.logger INFO: CREATE TABLE Retail_Predictions (ID INTEGER PRIMARY KEY, retailer INTEGER, productsku INTEGER, calendar_year INTEGER, calendar_month INTEGER, Predictions FLOAT(2), INSERTED_AT TIMESTAMP NOT NULL)


In [31]:
db.insert_into_table('Retail_Predictions', result_df)

2022-03-31 18:52:24,574: fedml_azure.logger INFO: inserting into table...
2022-03-31 18:52:24,575: fedml_azure.logger INFO: INSERT INTO Retail_Predictions (retailer, productsku, calendar_year, calendar_month, Predictions, ID, INSERTED_AT) VALUES (:retailer, :productsku, :calendar_year, :calendar_month, :Predictions, :ID, :INSERTED_AT)
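
With 720,000 rows, a single insert can be slow or run into timeouts. Whether insert_into_table batches rows internally is not shown here, so the following is only a sketch of splitting the row range into fixed-size chunks. The function name is hypothetical and the chunk size is an arbitrary assumption.

```python
def chunk_bounds(total_rows, chunk_size=50_000):
    """Yield (start, stop) index pairs that together cover range(total_rows)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_rows, chunk_size):
        yield start, min(start + chunk_size, total_rows)
```

Each pair can be used to slice the DataFrame, for example result_df.iloc[start:stop], before the slice is passed to db.insert_into_table.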
