# ROM demo

#### Author: Erik Vaknin

#### Description

This notebook demonstrates how to use the `ROM` class to train a model and to output recommended discounts.

#### Usage

Before running this notebook, preprocessed data must be available locally. To produce it, run the following notebooks in this order:

- `notebooks/0 - data import.ipynb`
- `notebooks/1 - preprocessing - edited - 1 - simplify original logic.ipynb`

Then run this notebook cell by cell to see how the `ROM` class can be used.

In [1]:
%load_ext autoreload
%autoreload 2

In [14]:
from rom import ROM, ROMConfig
from utils import *
from modelling_pkg.config import *
import pandas as pd
import os
import joblib

countries_dir = "../" + DATADIR + "/countries"

ORGANIZATIONS, ORGANIZATION_ID = get_organizations("../" + DATADIR)

# Load data

<div class="alert alert-block alert-warning"><b>Select country for training</b></div>

In [15]:
country = "CZECHIA"

In [None]:
%%time
# Depending on the version of the data preprocessing, the CSV file uses either ',' or '\t' as the separator.
data = pd.read_csv(
    os.path.join(countries_dir, f"{country}.csv"),  # sep='\t',
    parse_dates=["created_date_off", "updated_date_off"],
    low_memory=False,
)
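
Since the separator depends on which preprocessing version produced the file, it can be guessed from the header line instead of toggling `sep` by hand. The following is a minimal sketch of such a heuristic; `detect_separator` is a hypothetical helper (not part of `utils`), and the demo file content is toy data, not real offers.

```python
import csv
import os
import tempfile


def detect_separator(path, candidates=(",", "\t")):
    """Guess the column separator by counting candidates in the header line."""
    with open(path, newline="", encoding="utf-8") as f:
        header = f.readline()
    # Pick the candidate that occurs most often in the header row.
    return max(candidates, key=header.count)


# Demonstration on a throwaway tab-separated file (toy content).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerows([["organization_id", "final_outcome"], [1, "x"]])
    sep = detect_separator(demo_path)
```

The result could then be passed as `sep=detect_separator(path)` to `pd.read_csv`. pandas can also detect the separator itself when called with `sep=None` and `engine="python"`, at the cost of the slower Python parser.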

In [None]:
%%time
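# keep only rows for the selected country; .copy() avoids pandas' SettingWithCopyWarning on the assignment below
data = data[data["organization_id"] == ORGANIZATION_ID[country]].copy()
# add a boolean column indicating whether the offer was accepted
data["accepted"] = data.final_outcome.apply(map_final_outcome)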
data.head()

In [None]:
# split the data: trn is used to train the model, tst to make recommendations
# in production, trn corresponds to historical data and tst to the offers we want recommendations for
trn, tst, spans = split_trn_tst(
    data, country, [bin_cols, cat_cols, num_cols], verbose=True
)
tst = tst.drop(columns="accepted")
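
`split_trn_tst` comes from `utils` and its internals are not shown in this notebook. As a rough illustration of the production scenario described in the comment above (historical offers versus offers awaiting a recommendation), here is a minimal sketch that assumes the split is made on a cutoff date. The helper `split_by_cutoff`, the `offer_id` field, and the toy rows are hypothetical; only the column name `created_date_off` is taken from the notebook.

```python
from datetime import date

# Toy offers for illustration only.
offers = [
    {"offer_id": 1, "created_date_off": date(2023, 1, 10)},
    {"offer_id": 2, "created_date_off": date(2023, 2, 20)},
    {"offer_id": 3, "created_date_off": date(2023, 3, 5)},
]


def split_by_cutoff(rows, cutoff, date_key="created_date_off"):
    """Rows created before `cutoff` are history (trn); the rest are new offers (tst)."""
    trn_rows = [r for r in rows if r[date_key] < cutoff]
    tst_rows = [r for r in rows if r[date_key] >= cutoff]
    return trn_rows, tst_rows


trn_demo, tst_demo = split_by_cutoff(offers, cutoff=date(2023, 3, 1))
```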

# Configuration

ROM takes a configuration object as a mandatory parameter. This object must be an instance of the `ROMConfig` class.
There are two ways to initialize the configuration for ROM:

1) Set the parameters of `ROMConfig` manually.
2) Load the parameters from a JSON file.

A JSON configuration file is intended to be provided for each country.
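
To make the JSON option more concrete, the sketch below writes the parameters from the manual example in this notebook to a JSON file and reads them back with the standard `json` module. The exact schema that `ROMConfig.from_json_file` expects is not shown here; a flat object with the same keys as the constructor arguments is an assumption, and the file name is hypothetical.

```python
import json
import os
import tempfile

# Parameter names and values taken from the manual ROMConfig example in this notebook.
params = {
    "timeout_minutes": 0.2,
    "augm_variant": "basic",
    "augm_basic_params": (14, 28),
    "gamma": 6,
    "augm_basic_weight": 1.0,
}

with tempfile.TemporaryDirectory() as tmp:
    cfg_path = os.path.join(tmp, "CZECHIA.json")  # hypothetical file name
    with open(cfg_path, "w", encoding="utf-8") as f:
        json.dump(params, f, indent=2)
    with open(cfg_path, encoding="utf-8") as f:
        loaded = json.load(f)

# JSON has no tuple type, so (14, 28) comes back as the list [14, 28].
```

One practical consequence: any code that consumes a JSON-loaded configuration has to accept a list where the manual example passes a tuple, or convert it back explicitly.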

In [None]:
# option 1

config = ROMConfig(
    timeout_minutes=0.2,
    augm_variant="basic",
    augm_basic_params=(14, 28),
    gamma=6,
    augm_basic_weight=1.0,
)

# option 2

# cfg_path = f'{CONFIG_PATH}/{country}'
# config = ROMConfig.from_json_file(cfg_path)

config

# Train

In [None]:
%%time

rom = ROM(config, bin_cols, cat_cols, num_cols, verbose=True)
rom.train(trn)

# Recommend

In [None]:
# get recommendations from the model
# the recommended interval is [discount_low, discount_high]
# the optimal discount according to the model is discount_opt

recommendations = rom.recommend(tst, progress_bar=True)
recommendations
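
One possible way to use the recommended interval downstream is to limit a manually proposed discount to `[discount_low, discount_high]`. The sketch below shows this on a single made-up row; `clamp_to_interval` is a hypothetical helper, the numbers are illustrative only, and whether discounts are expressed as fractions or percentages is an assumption.

```python
def clamp_to_interval(proposed, low, high):
    """Return `proposed` limited to the closed interval [low, high]."""
    return max(low, min(proposed, high))


# One made-up row using the column names mentioned above.
row = {"discount_low": 0.05, "discount_opt": 0.08, "discount_high": 0.12}

# A proposal inside the interval is kept unchanged.
inside = clamp_to_interval(0.10, row["discount_low"], row["discount_high"])
# A proposal above the interval is reduced to discount_high.
too_high = clamp_to_interval(0.20, row["discount_low"], row["discount_high"])
```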

# Save model

Two parts of the ROM need to be saved in order to later initialize the same model:

- configuration
- ML model (scikit-learn pipeline)

There are two ways to do it:

1. Save a single dumpable object containing everything
2. Save both objects separately

## Option 1: Use dumpable object

In [None]:
dump_path = "temp_dump.pkl"

# get object that can be dumped and loaded back again to initialize a new identical ROM
dumpable = rom.get_dumpable()

# save to file
joblib.dump(dumpable, dump_path, compress=True)

In [37]:
# load the dumped object
dumpable1 = joblib.load(dump_path)

# remove the temporary file
os.remove(dump_path)

# use the object to initialize a new identical ROM
rom1 = ROM.from_dumpable(dumpable1)
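
The dump and load cells above write `temp_dump.pkl` to the working directory and delete it by hand, so the file is left behind if loading fails. A variant that cleans up automatically is sketched below using `tempfile.TemporaryDirectory`. To keep the sketch runnable without extra dependencies, it uses the standard `pickle` module as a stand-in for `joblib`, and a plain dictionary as a stand-in for the object returned by `rom.get_dumpable()`, whose real structure is not shown here.

```python
import os
import pickle
import tempfile

# Placeholder for the dumpable object; its real contents are an assumption.
dumpable_demo = {"config": {"gamma": 6}, "pipeline": "placeholder"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "rom_dump.pkl")
    with open(path, "wb") as f:
        pickle.dump(dumpable_demo, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)
    existed_inside = os.path.exists(path)

# The directory and everything in it are removed once the `with` block exits,
# even if an exception was raised inside it.
removed_after = not os.path.exists(path)
```

With `joblib`, the same pattern would call `joblib.dump(dumpable, path, compress=True)` and `joblib.load(path)` inside the `with` block. As with any pickle-based format, only files from a trusted source should be loaded.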

In [None]:
# make recommendations using the newly created model
recommendations1 = rom1.recommend(tst, progress_bar=True)

# verify that the recommendations are identical to the original - should NOT raise error
for c in recommendations.columns:
    assert (recommendations[c] != recommendations1[c]).sum() == 0
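
A caveat for element-wise checks with `!=`: NaN is not equal to itself, so a column holding NaN at the same position in both frames would be counted as a mismatch even though the outputs agree. Whether the recommendations can contain NaN is not known from this notebook, so this is a precaution, not a confirmed problem. The sketch below shows the effect with plain floats; `values_match` is a hypothetical helper.

```python
import math


def values_match(a, b):
    """Equality that treats two NaNs as equal, unlike the plain == and != operators."""
    if isinstance(a, float) and isinstance(b, float) and math.isnan(a) and math.isnan(b):
        return True
    return a == b


nan = float("nan")
plain_mismatch = nan != nan         # plain comparison reports a difference
nan_aware = values_match(nan, nan)  # NaN-aware comparison reports a match
```

In pandas, `recommendations[c].equals(recommendations1[c])` or `pd.testing.assert_frame_equal(recommendations, recommendations1)` apply this NaN-aware notion of equality.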

## ~Option 2: Save the pipeline and configuration separately~ (deprecated)

In [15]:
# cfg_dump_path = 'temp_cfg_dump.json'
# pipe_dump_path = 'temp_pipe_dump.pkl'

# # get config and pipeline
# cfg_dict = rom.get_config_as_dict()
# pipe = rom.get_pipeline()

# # save them
# # pipeline should be possible to save using mlflow.sklearn.save_model()
# joblib.dump(pipe, pipe_dump_path, compress=True)
# # using json library to save config as a json
# json.dump(cfg_dict, open(cfg_dump_path, 'w'))

# # load objects
# pipe2 = joblib.load(pipe_dump_path)
# cfg_dict2 = json.load(open(cfg_dump_path, 'r'))

# # remove the temporary files
# os.remove(cfg_dump_path)
# os.remove(pipe_dump_path)

# # use the objects to initialize a new identical ROM
# rom2 = ROM.from_config_and_pipe(cfg_dict2, pipe2)

# # make recommendations using the newly created model
# recommendations2 = rom2.recommend(tst, progress_bar=True)

# # verify that the recommendations are identical to the original - should NOT raise error
# for c in recommendations.columns:
#     assert (recommendations[c] != recommendations2[c]).sum() == 0