# Personalized product recommendations using Gen AI

This demo includes everything you need to set up IBM Decision Optimization CPLEX Modeling for Python (DOcplex), build a Mathematical Programming model, and get its solution by solving the model with IBM ILOG CPLEX Optimizer.

Table of contents:

* [Describe the business problem](#Describe-the-business-problem)
* [Prepare the data](#Prepare-the-data)
* [Use decision optimization](#Use-IBM-Decision-Optimization-CPLEX-Modeling-for-Python)
    * [Step 1: Set up the prescriptive model](#Step-1:-Set-up-the-prescriptive-model)
        * [Define the decision variables](#Define-the-decision-variables)
        * [Set up the constraints](#Set-up-the-constraints)
        * [Express the objective](#Express-the-objective)
        * [Solve with Decision Optimization](#Solve-with-Decision-Optimization)
    * [Step 2: Analyze the solution and run an example analysis](#Step-2:-Analyze-the-solution)



## Describe the business problem
* The Self-Learning Response Model (SLRM) node enables you to build a model that you can continually update. Such updates are useful when building a model that assists with predicting which offers are most appropriate for customers and the probability of offers being accepted. These sorts of models are most beneficial in customer relationship management, such as marketing applications or call centers.
* This demo is based on a fictional retail company 'Shop Basket'. 
* The marketing department wants to achieve more profitable results in future campaigns by matching the right offer to each customer. 
* The data science department has already identified the characteristics of customers who are most likely to respond favorably, based on previous offers and responses, and has predicted the best current offers for each customer. It now needs to compute the best offering plan.
<br>

A set of business constraints has to be respected:

* You have a limited budget to run a marketing campaign based on "Facebook", "Whatsapp", "Email"...
* You want to determine which is the best way to contact the customers.
* You need to identify which customers to contact.

## Prepare the data

The predictions show which offers a customer is most likely to accept, and the confidence level that they will accept, depending on each customer’s details.

For example:
`(139987, "Chairs", 0.13221, "Bed", 0.10675)` indicates that customer Id=139987 is unlikely to buy _Chairs_ or a _Bed_, as the confidence levels are only 13.2% and 10.7%,
whereas
`(140030, "Sofa", 0.95678, "Chairs", 0.84446)` indicates that customer Id=140030 is very likely to buy a _Sofa_ and _Chairs_, as the confidence levels are 95.7% and 84.4%.
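Each prediction row has the layout (customer id, first product, first confidence, second product, second confidence). As a minimal sketch, the snippet below unpacks one row copied from the data cell of this demo and prints the confidences as percentages. The variable names are illustrative only.

```python
# One prediction row, copied from the `data` list used in this demo
row = (140030, "Sofa", 0.95678, "Chairs", 0.84446)

# Layout: customer id, first product and its confidence, second product and its confidence
customer_id, product1, confidence1, product2, confidence2 = row

summary = "Customer {}: {} {:.1%}, {} {:.1%}".format(
    customer_id, product1, confidence1, product2, confidence2)
print(summary)
```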



In [1]:
import pandas as pd

names = {
    139987 : "shiv prakash", 140030 : "vikram singh", 140089 : "sanjay", 
    140097 : "abhi", 139068 : "ram dutt gupta", 139154 : "khadak singh", 139158 : "gurmit Singh", 
    139169 : "chanderpal", 139220 : "aman", 139261 : "khursid",
    139416 : "rajeev", 139422 : "durgesh", 139532 : "nahar singh", 
    139549 : "ram kumar", 139560 : "sunder paal", 139577 : "maansingh aswal", 139580 : "rohit", 
    139636 : "rohit deshpanday", 139647 : "sparsh", 139649 : "Santosh", 139665 : "Santosh", 
    139667 : "punit khandelwal", 139696 : "dinesh", 139752 : "gulshan"}


data =[(139987, "Chairs", 0.13221, "Bed", 0.10675), (140030, "Sofa", 0.95678, "Chairs", 0.84446), (140089, "Sofa", 0.95678, "Chairs", 0.80233), 
                        (140097, "Chairs", 0.13221, "Bed", 0.10675), (139068, "Chairs", 0.80506, "Sofa", 0.28391), (139154, "Chairs", 0.13221, "Bed", 0.10675), 
                        (139158, "Chairs", 0.13221, "Bed", 0.10675),(139169, "Chairs", 0.13221, "Bed", 0.10675), (139220, "Chairs", 0.13221, "Bed", 0.10675), 
                        (139261, "Chairs", 0.13221, "Bed", 0.10675), (139416, "Chairs", 0.13221, "Bed", 0.10675), (139422, "Chairs", 0.13221, "Bed", 0.10675), 
                        (139532, "Sofa", 0.95676, "Bed", 0.82269), (139549, "Sofa", 0.16428, "Chairs", 0.13221), (139560, "Sofa", 0.95678, "Chairs", 0.86779), 
                        (139577, "Chairs", 0.13225, "Bed", 0.10675), (139580, "Chairs", 0.13221, "Bed", 0.10675), (139636, "Chairs", 0.13221, "Bed", 0.10675), 
                        (139647, "Sofa", 0.28934, "Chairs", 0.13221), (139649, "Chairs", 0.13221, "Bed", 0.10675), (139665, "Sofa", 0.95675, "Chairs", 0.27248), 
                        (139667, "Chairs", 0.13221, "Bed", 0.10675), (139696, "Sofa", 0.16188, "Chairs", 0.13221), (139752, "Chairs", 0.13221, "Bed", 0.10675)]

products = ["Table", "Sofa", "Bed", "Chairs"]
productValue = [100, 200, 300, 400]
budgetShare = [0.6, 0.1, 0.2, 0.1]

availableBudget = 400
channels =  pd.DataFrame(data=[("Whatsapp", 20.0, 0.20), ("Facebook", 15.0, 0.05), ("Email", 23.0, 0.30)], columns=["name", "cost", "factor"])

In [2]:
offers = pd.DataFrame(data=data, index=range(0, len(data)), columns=["customerid", "Product1", "Confidence1", "Product2", "Confidence2"])
offers.insert(0,'name',pd.Series(names[i[0]] for i in data))

Customize the display of this data and show the confidence forecast for each customer.

In [3]:
CSS = """
body {
    margin: 0;
    font-family: Helvetica;
}
table.dataframe {
    border-collapse: collapse;
    border: none;
}
table.dataframe tr {
    border: none;
}
table.dataframe td, table.dataframe th {
    margin: 0;
    border: 1px solid white;
    padding-left: 0.25em;
    padding-right: 0.25em;
}
table.dataframe th:not(:empty) {
    background-color: #fec;
    text-align: left;
    font-weight: normal;
}
table.dataframe tr:nth-child(2) th:empty {
    border-left: none;
    border-right: 1px dashed #888;
}
table.dataframe td {
    border: 2px solid #ccf;
    background-color: #f4f4ff;
}
    table.dataframe thead th:first-child {
        display: none;
    }
    table.dataframe tbody th {
        display: none;
    }
"""

In [4]:
from IPython.display import HTML, display

# Inject the custom CSS, then show the offers sorted by customer name
display(HTML('<style>{}</style>'.format(CSS)))
display(offers.drop(columns='customerid').sort_values(by='name'))


| index | name | Product1 | Confidence1 | Product2 | Confidence2 |
|---|---|---|---|---|---|
| 20 | Santosh | Sofa | 0.95675 | Chairs | 0.27248 |
| 19 | Santosh | Chairs | 0.13221 | Bed | 0.10675 |
| 3 | abhi | Chairs | 0.13221 | Bed | 0.10675 |
| 8 | aman | Chairs | 0.13221 | Bed | 0.10675 |
| 7 | chanderpal | Chairs | 0.13221 | Bed | 0.10675 |
| 22 | dinesh | Sofa | 0.16188 | Chairs | 0.13221 |
| 11 | durgesh | Chairs | 0.13221 | Bed | 0.10675 |
| 23 | gulshan | Chairs | 0.13221 | Bed | 0.10675 |
| 6 | gurmit Singh | Chairs | 0.13221 | Bed | 0.10675 |
| 5 | khadak singh | Chairs | 0.13221 | Bed | 0.10675 |


## Use IBM Decision Optimization CPLEX Modeling for Python

Create the optimization model to select the best ways to contact customers and stay within the limited budget.

### Step 1: Set up the prescriptive model

Set up the prescriptive model using the Mathematical Programming (docplex.mp) modeling package. 

#### Create the model

In [5]:
from docplex.mp.model import Model

mdl = Model(name="marketing_campaign", round_solution=True)

#### Define the decision variables
- The binary decision variables `channelVars` represent whether or not a customer will be made an offer for a particular product via a particular channel.
- The integer decision variable `totaloffers` represents the total number of offers made.
- The continuous variable `budgetSpent` represents the total cost of the offers made.
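To see how large the set of `channelVars` is, the sketch below enumerates every (offer, product, channel) index triple in plain Python, without DOcplex. The sizes are taken from the data in this demo (24 customers, 4 products, 3 channels), and the variable names are illustrative.

```python
from itertools import product

# Sizes taken from the demo data: 24 offers (one row per customer), 4 products, 3 channels
n_offers, n_products, n_channels = 24, 4, 3

# One binary variable exists per (offer, product, channel) triple
triples = list(product(range(n_offers), range(n_products), range(n_channels)))

print(len(triples))
```

The count is 24 × 4 × 3 = 288, which matches the number of binary variables that `mdl.print_information()` reports for this model.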

In [6]:
offersR = range(0, len(offers))
productsR = range(0, len(products))
channelsR = range(0, len(channels))

channelVars = mdl.binary_var_cube(offersR, productsR, channelsR)
totaloffers = mdl.integer_var(lb=0)
budgetSpent = mdl.continuous_var()

#### Set up the constraints
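- Offer at most one product per customer.
- Compute the number of offers to be made.
- Compute the budget spent and keep it within `availableBudget`.
- Balance the offers among products: the number of offers for each product may not exceed its `budgetShare` fraction of the total number of offers.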
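The rules enforced by the model can also be checked on a finished plan in plain Python. The following is a minimal sketch, not part of the DOcplex model: `check_plan`, the customer labels, and the sample plans are hypothetical, while the costs, shares, and budget are copied from the data cell.

```python
from collections import Counter

# Figures copied from the data cell of this demo
CHANNEL_COST = {"Whatsapp": 20.0, "Facebook": 15.0, "Email": 23.0}
BUDGET_SHARE = {"Table": 0.6, "Sofa": 0.1, "Bed": 0.2, "Chairs": 0.1}
AVAILABLE_BUDGET = 400


def check_plan(plan):
    """Return (budget_spent, violations) for a list of (customer, product, channel) offers."""
    violations = []
    total_offers = len(plan)

    # At most one offer per customer
    for customer, count in Counter(c for c, _, _ in plan).items():
        if count > 1:
            violations.append("customer {} receives {} offers".format(customer, count))

    # Each product is capped at its share of the total number of offers
    for prod, count in Counter(p for _, p, _ in plan).items():
        if count > BUDGET_SHARE[prod] * total_offers + 1e-9:
            violations.append("too many {} offers".format(prod))

    # The total channel cost must stay within the budget
    budget_spent = sum(CHANNEL_COST[ch] for _, _, ch in plan)
    if budget_spent > AVAILABLE_BUDGET:
        violations.append("budget exceeded")

    return budget_spent, violations


# Hypothetical 10-offer plan: 6 Table, 1 Sofa, 2 Bed, 1 Chairs, all sent via Facebook
mix = ["Table"] * 6 + ["Sofa"] + ["Bed"] * 2 + ["Chairs"]
good_plan = [("cust{}".format(i), prod, "Facebook") for i, prod in enumerate(mix)]
print(check_plan(good_plan))

# Adding a second offer for cust0 breaks the one-offer rule and the Sofa share
bad_plan = good_plan + [("cust0", "Sofa", "Email")]
print(check_plan(bad_plan))
```

A checker like this is only a convenience for inspecting results. The solver enforces the same rules through the constraints added to `mdl`.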

In [7]:
# At most 1 product is offered to each customer
mdl.add_constraints( mdl.sum(channelVars[o,p,c] for p in productsR for c in channelsR) <=1
                   for o in offersR)

mdl.add_constraint( totaloffers == mdl.sum(channelVars[o,p,c] 
                                           for o in offersR 
                                           for p in productsR 
                                           for c in channelsR) )

mdl.add_constraint( budgetSpent == mdl.sum(channelVars[o,p,c]*channels.at[c, "cost"] 
                                           for o in offersR 
                                           for p in productsR 
                                           for c in channelsR) )

# Balance the offers among products   
for p in productsR:
    mdl.add_constraint( mdl.sum(channelVars[o,p,c] for o in offersR for c in channelsR) 
                       <= budgetShare[p] * totaloffers )
            
# Do not exceed the budget
mdl.add_constraint( mdl.sum(channelVars[o,p,c]*channels.at[c, "cost"]
                            for o in offersR 
                            for p in productsR 
                            for c in channelsR)  <= availableBudget )  

mdl.print_information()

Model: marketing_campaign
 - number of variables: 290
   - binary=288, integer=1, continuous=1
 - number of constraints: 31
   - linear=31
 - parameters: defaults
 - objective: none
 - problem type is: MILP


#### Express the objective

Maximize the expected revenue.
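Written out, the objective built in the next cell can be read as the formula below. The symbols are notation introduced here for illustration and do not appear in the code: $x_{o,p,c}$ stands for `channelVars[o,p,c]`, $f_c$ for the channel `factor`, $v_p$ for `productValue[p]`, $p_1(o)$ and $p_2(o)$ for the indices of `Product1` and `Product2` of offer row $o$, and $q_1(o)$ and $q_2(o)$ for `Confidence1` and `Confidence2`.

$$
\max \sum_{o} \sum_{c} f_c \left( v_{p_1(o)} \, q_1(o) \, x_{o,\,p_1(o),\,c} + v_{p_2(o)} \, q_2(o) \, x_{o,\,p_2(o),\,c} \right)
$$

Only the two products predicted for a customer contribute to the expected revenue. An offer for any other product adds nothing to the objective.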

In [8]:
mdl.maximize(
    mdl.sum( channelVars[idx,p,idx2] * c.factor * productValue[p]* o.Confidence1  
            for p in productsR 
            for idx,o in offers[offers['Product1'] == products[p]].iterrows()  
            for idx2, c in channels.iterrows())
    +
    mdl.sum( channelVars[idx,p,idx2] * c.factor * productValue[p]* o.Confidence2 
            for p in productsR 
            for idx,o in offers[offers['Product2'] == products[p]].iterrows() 
            for idx2, c in channels.iterrows())
    )

#### Solve with Decision Optimization

The free Community Edition of CPLEX limits the size of the models it can solve. If your problem exceeds those limits, the solve stage fails and you need the Commercial Edition of the CPLEX engines, which is included in the premium environments in Watson Studio.

In [9]:
s = mdl.solve()
assert s, "No Solution !!!"

### Step 2: Analyze the solution

First, display the **Optimal Marketing Channel per customer**.

In [10]:
report = [(channels.at[c, "name"], products[p], names[offers.at[o, "customerid"]]) 
          for c in channelsR 
          for p in productsR 
          for o in offersR  if channelVars[o,p,c].solution_value==1]

assert len(report) == totaloffers.solution_value

print("Marketing plan has {0} offers costing {1}".format(totaloffers.solution_value, budgetSpent.solution_value))

report_bd = pd.DataFrame(report, columns=['channel', 'product', 'customer'])
display(report_bd)

Marketing plan has 20.0 offers costing 364.0


| index | channel | product | customer |
|---|---|---|---|
| 0 | Facebook | Table | abhi |
| 1 | Facebook | Table | ram dutt gupta |
| 2 | Facebook | Table | gurmit Singh |
| 3 | Facebook | Table | rajeev |
| 4 | Facebook | Table | maansingh aswal |
| 5 | Facebook | Table | rohit |
| 6 | Facebook | Table | rohit deshpanday |
| 7 | Facebook | Table | sparsh |
| 8 | Facebook | Table | Santosh |
| 9 | Facebook | Table | punit khandelwal |



Now **focus on Email**.

In [12]:
# Keep only the Email offers and hide the now-redundant channel column
display(report_bd[report_bd['channel'] == "Email"].drop(columns='channel'))


| index | product | customer |
|---|---|---|
| 12 | Sofa | sanjay |
| 13 | Sofa | Santosh |
| 14 | Bed | khadak singh |
| 15 | Bed | aman |
| 16 | Bed | khursid |
| 17 | Bed | nahar singh |
| 18 | Chairs | vikram singh |
| 19 | Chairs | sunder paal |
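As a sanity check on the reported totals, the sketch below recomputes the campaign cost from the channel prices. It assumes that the 8 rows with indices 12 to 19 are the only Email offers and that the remaining 12 of the 20 offers all went through Facebook. That split is inferred from the row indices in the displayed output, not read from the solver.

```python
# Channel prices copied from the data cell of this demo
channel_cost = {"Whatsapp": 20.0, "Facebook": 15.0, "Email": 23.0}

total_offers = 20                                # reported by the solved model
email_offers = 8                                 # rows 12..19 in the Email view
facebook_offers = total_offers - email_offers    # assumption: all other offers use Facebook

cost = (facebook_offers * channel_cost["Facebook"]
        + email_offers * channel_cost["Email"])
print(cost)
```

Under that assumption the result is 12 × 15 + 8 × 23 = 364, which agrees with the cost printed by the solved model and stays below the available budget of 400.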
