# EXECUTION SCRIPT

The execution script generates predictions with the model that has already been trained.

In technical terms, it only calls `.predict` (prediction) and never `.fit` (training).

The script loads the trained execution pipeline and generates stockout risk predictions for new incoming business data, which has no target column. It assumes the dataset contains all required features in the correct format, and it is meant to be used once the model is deployed in production.
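Since the script assumes that every required feature is present, a quick check before scoring can surface a missing column early. The following is a minimal sketch, not part of the original script: `REQUIRED_COLUMNS`, `find_missing_columns` and `assert_required_columns` are hypothetical names, and the list only contains the columns mentioned in this notebook, so the real pipeline may require more.

```python
from typing import Iterable, List

# Assumption: only the columns named in this notebook; the trained pipeline
# may expect additional features.
REQUIRED_COLUMNS = [
    "date",
    "weather_condition",
    "seasonality",
    "holiday_promo",
    "competitor_pricing",
]


def find_missing_columns(columns: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return the required column names that are absent, in the order given."""
    present = set(columns)
    return [name for name in required if name not in present]


def assert_required_columns(columns: Iterable[str],
                            required: Iterable[str] = REQUIRED_COLUMNS) -> None:
    """Raise a ValueError that lists every missing required column."""
    missing = find_missing_columns(columns, required)
    if missing:
        raise ValueError(f"Input data is missing required columns: {missing}")
```

With a DataFrame, this could be called as `assert_required_columns(X_new.columns)` right after reading the CSV.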

**Since no real new data is available yet, we use the `validation.csv` dataset only as a mock input to test the execution pipeline. In production, the final execution script will be run on actual incoming data from the business system, which will contain all the fields required for proper scoring.**

IMPORTANT: This code must be executed in the exact same environment in which it was originally created.

The environment can be installed on a new machine using the *riesgos.yml* file created (or activated) during the project setup.

Copy the file *riesgos.yml* to your working directory and run this command in the terminal (or Anaconda Prompt): 

`conda env create --file riesgos.yml --name riesgos`
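Once the environment is active, it can be useful to confirm that the installed package versions match the ones used during training. The sketch below relies only on the standard library `importlib.metadata`. The helper names and any version pins passed to them are hypothetical, since the contents of *riesgos.yml* are not shown here.

```python
from importlib import metadata
from typing import Dict, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each package whose installed version differs from the expected pin
    to the version actually found (None when the package is missing)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        if found != wanted:
            mismatches[package] = found
    return mismatches
```

For example, `version_mismatches({"pandas": "<pinned version>"})` returns an empty dictionary when the environment matches the pin, with the pinned version taken from *riesgos.yml*.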

## Execution script: TEST

In [4]:
import pandas as pd
import cloudpickle

project_path = '/Users/rober/retail-stockout-risk-scoring'  # no trailing slash: sub-paths below start with '/'
file_name_data = 'validation.csv'
path = project_path + '/02_Data/02_Validation/' + file_name_data

# Load validation as mock input
X_new = pd.read_csv(path)

# Ensure date column exists if missing
if 'date' not in X_new.columns:
    X_new['date'] = '2023-01-01'  # dummy date

# Convert to datetime
X_new['date'] = pd.to_datetime(X_new['date'], errors='coerce')

# Dummy columns for testing with validation.csv
categorical_missing = [
    'weather_condition',
    'seasonality',
    'holiday_promo'
]

numeric_missing = [
    'competitor_pricing'
]

# Create missing categorical columns
for col in categorical_missing:
    if col not in X_new.columns:
        X_new[col] = 'Unknown'

# Create missing numeric columns
for col in numeric_missing:
    if col not in X_new.columns:
        X_new[col] = 0  # numeric dummy value


# Load trained pipeline
pipe_execution_path = project_path + '/04_Models/pipe_execution.pkl'
with open(pipe_execution_path, 'rb') as f:
    pipeline = cloudpickle.load(f)

# Predict
preds = pipeline.predict(X_new)
X_new['stockout_14d_pred'] = preds

# Save test predictions
X_new.to_csv(project_path + '/05_Outputs/predictions_TEST.csv', index=False)

print("Test predictions saved in predictions_TEST.csv")

Test predictions saved in predictions_TEST.csv
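The test script builds its paths by concatenating strings, which makes the result depend on whether `project_path` ends with a slash. As an alternative, here is a minimal sketch using `pathlib`, which joins path segments without that concern. `build_project_paths` is a hypothetical helper, and the folder names are the ones used in the test script above.

```python
from pathlib import Path
from typing import Dict


def build_project_paths(project_root: str) -> Dict[str, Path]:
    """Build the paths used by the test script from the project root folder."""
    root = Path(project_root)
    return {
        "validation": root / "02_Data" / "02_Validation" / "validation.csv",
        "pipeline": root / "04_Models" / "pipe_execution.pkl",
        "test_output": root / "05_Outputs" / "predictions_TEST.csv",
    }


paths = build_project_paths("/Users/rober/retail-stockout-risk-scoring/")
print(paths["pipeline"].as_posix())
```

`Path` objects can be passed directly to `pd.read_csv`, `open` and `DataFrame.to_csv`.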


## Execution script: FINAL VERSION

In [None]:
import pandas as pd
import cloudpickle

project_path = '/Users/rober/retail-stockout-risk-scoring'  # no trailing slash: sub-paths below start with '/'
file_name_data = 'new_data.csv'  # real business input
path = project_path + '/02_Data/02_Input/' + file_name_data
X_new = pd.read_csv(path)

# Ensure date column is datetime
X_new['date'] = pd.to_datetime(X_new['date'], errors='coerce')

# Load trained execution pipeline
pipe_execution_path = project_path + '/04_Models/pipe_execution.pkl'
with open(pipe_execution_path, 'rb') as f:
    pipeline = cloudpickle.load(f)

# Predict
preds = pipeline.predict(X_new)
X_new['stockout_14d_pred'] = preds

# Save results
output_path = project_path + '/02_Data/03_Output/predictions.csv'
X_new.to_csv(output_path, index=False)

print("Final Production Predictions saved in:", output_path)
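Because `errors='coerce'` turns unparseable dates into `NaT` without raising an error, malformed dates in the business input would reach the pipeline silently. The following is a minimal sketch of a check that could run on the raw column before the conversion. `count_unparseable_dates` is a hypothetical helper, not part of the original script, and how the pipeline should react to bad dates is left open.

```python
import pandas as pd


def count_unparseable_dates(raw_dates: pd.Series) -> int:
    """Count values that are present in the input but cannot be parsed as dates."""
    parsed = pd.to_datetime(raw_dates, errors="coerce")
    # A value that was not missing originally but is NaT after parsing failed to parse.
    failed = parsed.isna() & raw_dates.notna()
    return int(failed.sum())
```

It would be called on the column as read from the CSV, for example `count_unparseable_dates(X_new['date'])` before the `pd.to_datetime` line, and a non-zero result could be logged or used to stop the run.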