**Important – Do not use in production, for demonstration purposes only – please review the legal notices before continuing**

## Retail Sales Forecasting Model Building with AutoML

This notebook uses Azure Machine Learning AutoML to train a retail sales forecasting model, retrieve the best run, and register the resulting model with MLflow.

 ![Sales Forecasting](https://stretailprod.blob.core.windows.net/notebookimages/sales_forecasting.jpg?sp=r&st=2022-02-24T20:44:58Z&se=2024-02-25T04:44:58Z&sv=2020-08-04&sr=b&sig=8GqfYHhGM1xthmngl8rBAKueOcXfbhj9cJQ%2FE4hQTT8%3D)

### Retail Data Dictionary
- store - store number 
- brand - brand indicator
- week - week number
- logmove - log of units sold
- price - price of a single unit
- feat - feature advertisement
- age60 - percentage of the population that is aged 60 or older
- educucation - percentage of the population that has a college degree
- ethinicity - percentage of the population that is Black or Hispanic
- income - median income
- hhlarge - percentage of households with 5 or more persons
- workwom - percentage of women with full-time jobs
- hval150 - percentage of households worth more than $150,000
- sstrdist - distance to the nearest warehouse store
- sstrvol - ratio of sales of this store to the nearest warehouse store
- cpdist5 - average distance in miles to the nearest 5 supermarkets
- cpwvol5 - ratio of sales of this store to the average of the nearest five stores
- time - date and time
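
Because `logmove` is the log of units sold, the model's label and its predictions are on a log scale. The sketch below shows one way to convert between the two scales. It assumes a natural logarithm, which the data dictionary does not state, and the helper names and the value 8000 are purely illustrative.

```python
import math


def logmove_to_units(logmove: float) -> float:
    """Convert a log-scale sales value back to units sold.

    Assumption: `logmove` is a natural logarithm. The data dictionary
    does not state the base, so verify this against the raw data.
    """
    return math.exp(logmove)


def units_to_logmove(units: float) -> float:
    """Convert units sold to the log scale used by the label column."""
    if units <= 0:
        raise ValueError("units must be positive to take a logarithm")
    return math.log(units)


# Illustrative value only, not taken from the dataset.
example_logmove = units_to_logmove(8000)
print(round(logmove_to_units(example_logmove)))
```

Forecasts of `logmove` would need the same back-transformation before being read as unit counts.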

### Importing libraries

In [1]:
import azureml.core
import pandas as pd
from azureml.core import Experiment, Workspace, Dataset, Datastore
from azureml.train.automl import AutoMLConfig
from azureml.automl.core.forecasting_parameters import ForecastingParameters
import mlflow

### Configuring Workspace and Experiment

In [None]:
#linkedService_name = "AzureMLService"
import azureml.core
import pandas as pd

print("SDK Version:", azureml.core.VERSION)

from azureml.core import Workspace, Datastore, Dataset
ws = Workspace.from_config()
ws

### Creating the dataset for Azure Machine Learning

In [3]:
df = pd.read_csv("retail_sales_datasetv2.csv")
datastore = Datastore.get_default(ws)
dataset = Dataset.Tabular.register_pandas_dataframe(df, datastore, "dataset_from_pandas_df", show_progress=True)

Validating arguments.
Arguments validated.
Successfully obtained datastore reference and path.
Uploading file to managed-dataset/079c868f-0bff-4840-b477-3436e6c23872/
Successfully uploaded file to datastore.
Creating and registering a new dataset.
Successfully created and registered a new dataset.
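
Forecasting treats each (`store`, `brand`) pair as its own series indexed by `time`, so it can be worth checking the data for repeated timestamps within a series and for unparseable dates before registering it. The following is a dependency-free sketch of such a check. The sample rows, brand names, and the `%Y-%m-%d` date format are hypothetical and would need to match the actual CSV.

```python
from collections import Counter
from datetime import datetime


def find_duplicate_timestamps(rows, id_columns=("store", "brand"), time_column="time"):
    """Return the (series id..., time) keys that occur more than once."""
    counts = Counter(
        tuple(row[c] for c in id_columns) + (row[time_column],) for row in rows
    )
    return [key for key, n in counts.items() if n > 1]


def unparseable_times(rows, time_column="time", fmt="%Y-%m-%d"):
    """Return the time values that do not match the expected format."""
    bad = []
    for row in rows:
        try:
            datetime.strptime(row[time_column], fmt)
        except ValueError:
            bad.append(row[time_column])
    return bad


# Hypothetical rows for illustration; not taken from retail_sales_datasetv2.csv.
sample_rows = [
    {"store": 2, "brand": "brand_a", "time": "1990-06-14", "logmove": 9.0},
    {"store": 2, "brand": "brand_a", "time": "1990-06-14", "logmove": 9.1},
    {"store": 2, "brand": "brand_b", "time": "1990-06-14", "logmove": 8.7},
    {"store": 5, "brand": "brand_a", "time": "not-a-date", "logmove": 8.9},
]

print(find_duplicate_timestamps(sample_rows))
print(unparseable_times(sample_rows))
```

With a pandas DataFrame, `df.duplicated(subset=["store", "brand", "time"]).any()` gives an equivalent duplicate check.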


### Model Configuration

In [4]:
forecasting_parameters = ForecastingParameters(
    time_column_name = "time",
    forecast_horizon = "auto",
    time_series_id_column_names = ["store","brand"],
)

automl_config = AutoMLConfig(
                             task = "forecasting",
                             training_data = dataset,
                             label_column_name = "logmove",
                             primary_metric = "normalized_root_mean_squared_error",
                             experiment_timeout_hours = 0.5,
                             max_concurrent_iterations = 2,
                             n_cross_validations = 5,
                             forecasting_parameters = forecasting_parameters)
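
The primary metric chosen above, `normalized_root_mean_squared_error`, is commonly described as the RMSE divided by the range of the actual values. The sketch below implements that definition in plain Python to make the metric concrete. It is an illustration under that assumption, not AutoML's own implementation, which may for example normalize per time series. The function name and sample values are hypothetical.

```python
import math


def normalized_rmse(actual, predicted):
    """RMSE divided by the range (max - min) of the actual values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and of equal length")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    value_range = max(actual) - min(actual)
    if value_range == 0:
        raise ValueError("actual values have zero range; the metric is undefined")
    return math.sqrt(mse) / value_range


# Illustrative log-scale values only, not results from this experiment.
actual = [8.0, 9.0, 10.0, 11.0]
predicted = [8.5, 8.5, 10.5, 10.5]
print(normalized_rmse(actual, predicted))
```

Lower values are better, and dividing by the range makes scores comparable across targets with different scales.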

### Submitting Experiment

In [None]:
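# Keep the name in a variable; later cells reuse it for the artifact path and the MLflow experiment
experiment_name = "retail_forecasting"
experiment = Experiment(ws, experiment_name)
run = experiment.submit(automl_config)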



| Experiment | Id | Type | Status | Details Page | Docs Page |
| --- | --- | --- | --- | --- | --- |
| retail_forecasting | AutoML_2b845561-db65-4815-b1e2-c2fbba7df8fc | automl | Preparing | Link to Azure Machine Learning studio | Link to Documentation |


INFO:interpret_community.common.explanation_utils:Using default datastore for uploads


In [6]:
run.wait_for_completion()

# Get best model from automl run
best_run, non_onnx_model = run.get_output()

artifact_path = experiment_name + "_artifact"

### Registering the best model using MLflow

In [None]:
mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
mlflow.set_experiment(experiment_name)

# Use a separate name so the AutoML `run` object is not overwritten
with mlflow.start_run() as mlflow_run:
    # Log the fitted model as an artifact of this MLflow run
    mlflow.sklearn.log_model(non_onnx_model, artifact_path)

    # Register the logged model in the Azure ML model registry
    model_uri = "runs:/" + mlflow_run.info.run_id + "/" + artifact_path
    mlflow.register_model(model_uri, "syndreamdemoretailprod-RetailSalesData-20211231100029-Best")
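
The registration call above addresses the logged model with a `runs:/<run_id>/<artifact_path>` URI, and a registered model can later be addressed as `models:/<name>/<version>`. The small helpers below only build those two URI strings. The function names, the run ID, and the version number are hypothetical placeholders.

```python
def run_artifact_uri(run_id: str, artifact_path: str) -> str:
    """Build the URI of a model logged under an MLflow run."""
    return f"runs:/{run_id}/{artifact_path}"


def registered_model_uri(name: str, version) -> str:
    """Build the URI of a specific version of a registered model."""
    return f"models:/{name}/{version}"


# Placeholder values for illustration; substitute the real run ID and version.
print(run_artifact_uri("0123456789abcdef", "retail_forecasting_artifact"))
print(registered_model_uri("syndreamdemoretailprod-RetailSalesData-20211231100029-Best", 1))
```

A `models:/` URI of this form can be passed to `mlflow.sklearn.load_model` to retrieve the registered model, assuming the tracking URI points at the same workspace.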