<a href="https://colab.research.google.com/github/acedesci/scanalytics/blob/master/S8_9_retail_analytics/DT_S9_Module2B_Retail_Price_Optimization_Script.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Cbc (COIN-OR branch and cut) is an open-source mixed-integer programming solver. For an overview of how the branch-and-cut algorithm works, see Chapter 9, Section 9.6 of Wolsey's Integer Programming book (Wolsey, L. A. (1998). *Integer programming*. New York: J. Wiley & Sons.). This session uses GLPK, however, so Cbc does not need to be installed.

In [1]:
# Install Pyomo and GLPK
!pip install -q pyomo
!apt-get install -y -qq glpk-utils #if GLPK is used
# !apt-get install -y -qq coinor-cbc #if cbc is used

Selecting previously unselected package libsuitesparseconfig5:amd64.
(Reading database ... 134443 files and directories currently installed.)
Preparing to unpack .../libsuitesparseconfig5_1%3a5.1.2-2_amd64.deb ...
Unpacking libsuitesparseconfig5:amd64 (1:5.1.2-2) ...
Selecting previously unselected package libamd2:amd64.
Preparing to unpack .../libamd2_1%3a5.1.2-2_amd64.deb ...
Unpacking libamd2:amd64 (1:5.1.2-2) ...
Selecting previously unselected package libcolamd2:amd64.
Preparing to unpack .../libcolamd2_1%3a5.1.2-2_amd64.deb ...
Unpacking libcolamd2:amd64 (1:5.1.2-2) ...
Selecting previously unselected package libglpk40:amd64.
Preparing to unpack .../libglpk40_4.65-1_amd64.deb ...
Unpacking libglpk40:amd64 (4.65-1) ...
Selecting previously unsele

# Block 1: Data input

In [2]:
from google.colab import drive
drive.mount('/content/drive')
cwd = '/content/drive/My Drive/'


Mounted at /content/drive


With the new dataset, we first need to check how many average price values there are, because the optimization model must be run once for each value of the average price.

In [4]:
import pandas
predDemand = pandas.read_csv(cwd +'predictedSales_Prob2.csv')
avgPriceList = predDemand['avgPriceChoice'].unique()
inputColumns = ['avgPriceChoice', 'UPC', 'PRICE','predictSales']
print("Possible average price choices (k):"+str(avgPriceList))


Possible average price choices (k):[3.]


The next cell extracts the data for the first average price on the list (avgPriceList) compiled above. Here the list has only one item, so this is all the data we need for this dataset. Otherwise, we would repeat the procedure for each average price value, record the corresponding optimal solution, and compare the results to decide how each product should be priced and which average price level generates the highest revenue.
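
The repeat-and-compare procedure for several average price values can be sketched without a solver. This is a minimal illustration, not the notebook's method: it replaces the Pyomo model with brute-force enumeration via `itertools.product`, which is only practical for a handful of products and prices. The names `demand_by_k` and `best_pricing` are hypothetical. The demand values are those of this dataset for $k = 3.0$, and any further average price value would be added as another key of `demand_by_k`.

```python
from itertools import product
import math

# Predicted sales D[(upc, price)] for each average price value k.
# Only k = 3.0 exists in this dataset; more k values would be extra keys.
demand_by_k = {
    3.0: {
        (1600027528, 2.5): 94.9, (1600027528, 3.0): 67.0, (1600027528, 3.5): 46.4,
        (1600027564, 2.5): 24.1, (1600027564, 3.0): 22.6, (1600027564, 3.5): 19.8,
        (3000006340, 2.5): 6.2, (3000006340, 3.0): 4.0, (3000006340, 3.5): 3.0,
        (3800031829, 2.5): 32.9, (3800031829, 3.0): 24.3, (3800031829, 3.5): 20.4,
    },
}


def best_pricing(demand, k):
    """Enumerate all price assignments and keep the best one with average price k."""
    upcs = sorted({upc for upc, _ in demand})
    prices = sorted({price for _, price in demand})
    best_revenue, best_choice = None, None
    for choice in product(prices, repeat=len(upcs)):
        # Average price constraint: the chosen prices must sum to k * (number of products).
        if not math.isclose(sum(choice), k * len(upcs)):
            continue
        revenue = sum(price * demand[(upc, price)] for upc, price in zip(upcs, choice))
        if best_revenue is None or revenue > best_revenue:
            best_revenue, best_choice = revenue, dict(zip(upcs, choice))
    return best_revenue, best_choice


# One optimization per average price value; keep every result for comparison.
results = {k: best_pricing(demand, k) for k, demand in demand_by_k.items()}
feasible = {k: r for k, r in results.items() if r[0] is not None}
best_k = max(feasible, key=lambda k: feasible[k][0])
print(best_k, feasible[best_k])
```

With a single average price value the comparison is trivial, but the same loop would pick the revenue-maximizing $k$ if the dataset contained several.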

In [5]:
avgPriceValue =  avgPriceList[0]
predDemand_k = predDemand.loc[predDemand['avgPriceChoice'] == avgPriceValue][inputColumns]
print(predDemand_k)
productList = predDemand['UPC'].unique()
priceList = predDemand['PRICE'].unique()

p = {}
D = {}

for upc in productList:
  for price in priceList:
    p[(upc,price)] = price
    # Build both masks from predDemand_k so the boolean index aligns with the frame being filtered
    D[(upc,price)] = predDemand_k.loc[(predDemand_k['UPC'] == upc) & (predDemand_k['PRICE'] == price)]['predictSales'].values[0]

print(p)
print(D)

    avgPriceChoice         UPC  PRICE  predictSales
0              3.0  1600027528    2.5          94.9
1              3.0  1600027528    3.0          67.0
2              3.0  1600027528    3.5          46.4
3              3.0  1600027564    2.5          24.1
4              3.0  1600027564    3.0          22.6
5              3.0  1600027564    3.5          19.8
6              3.0  3000006340    2.5           6.2
7              3.0  3000006340    3.0           4.0
8              3.0  3000006340    3.5           3.0
9              3.0  3800031829    2.5          32.9
10             3.0  3800031829    3.0          24.3
11             3.0  3800031829    3.5          20.4
{(1600027528, 2.5): 2.5, (1600027528, 3.0): 3.0, (1600027528, 3.5): 3.5, (1600027564, 2.5): 2.5, (1600027564, 3.0): 3.0, (1600027564, 3.5): 3.5, (3000006340, 2.5): 2.5, (3000006340, 3.0): 3.0, (3000006340, 3.5): 3.5, (3800031829, 2.5): 2.5, (3800031829, 3.0): 3.0, (3800031829, 3.5): 3.5}
{(1600027528, 2.5): 94.9, (16000275

In [0]:
from pyomo.environ import *

iIndexList = list(range(len(productList)))
jIndexList = list(range(len(priceList)))


# Block 2: Variable declarations

Unlike the first part of today's session, we index the decision variables and demand parameters by the product and the price themselves rather than by their positions. Previously, we wrote $x_{01}=1$ if the product at position 0 on the product list is sold at the price at position 1 on the price list. Now the variable is written $x_{1600027528,\ 3.0}=1$, which means that product '1600027528' is sold for 3.0 dollars. The same notation applies to the predicted demand ($D_{ij}$) and price ($p_{ij}$) parameters. The average price subscript $k$ of $D_{ijk}$ can be dropped because we only have one average price value here. We can declare the constraint sets first (model.PriceChoiceUPC, model.sumPrice) and then **add** the constraint functions later.
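
The difference between the two indexing schemes can be seen in plain Python. This is a small sketch, and the names `positional_keys`, `value_keys` and `to_value_key` are hypothetical, introduced only for illustration. The product and price lists are those of this dataset.

```python
# Product and price lists as they appear in this notebook's dataset.
product_list = [1600027528, 1600027564, 3000006340, 3800031829]
price_list = [2.5, 3.0, 3.5]

# Positional indexing (first part of the session): key (0, 1) means
# "the product at position 0 is sold at the price at position 1".
positional_keys = [(i, j) for i in range(len(product_list)) for j in range(len(price_list))]

# Value-based indexing (this part): the key is the (UPC, price) pair itself.
value_keys = [(upc, price) for upc in product_list for price in price_list]

# Both schemes describe the same 12 variables; this dict maps one onto the other.
to_value_key = {(i, j): (product_list[i], price_list[j]) for i, j in positional_keys}
print(to_value_key[(0, 1)])  # (1600027528, 3.0)
```

Value-based keys make printed expressions and solver output readable without looking up positions in the lists.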

In [7]:
model = ConcreteModel()
# Variables
model.x = Var(productList, priceList, within = Binary)
model.PriceChoiceUPC = ConstraintList()
model.sumPrice = ConstraintList()
model.pprint()

5 Set Declarations
    PriceChoiceUPC_index : Dim=0, Dimen=1, Size=0, Domain=None, Ordered=False, Bounds=None
        []
    sumPrice_index : Dim=0, Dimen=1, Size=0, Domain=None, Ordered=False, Bounds=None
        []
    x_index : Dim=0, Dimen=2, Size=12, Domain=None, Ordered=False, Bounds=None
        Virtual
    x_index_0 : Dim=0, Dimen=1, Size=4, Domain=None, Ordered=False, Bounds=(1600027528, 3800031829)
        [1600027528, 1600027564, 3000006340, 3800031829]
    x_index_1 : Dim=0, Dimen=1, Size=3, Domain=None, Ordered=False, Bounds=(2.5, 3.5)
        [2.5, 3.0, 3.5]

1 Var Declarations
    x : Size=12, Index=x_index
        Key               : Lower : Value : Upper : Fixed : Stale : Domain
        (1600027528, 2.5) :     0 :  None :     1 : False :  True : Binary
        (1600027528, 3.0) :     0 :  None :     1 : False :  True : Binary
        (1600027528, 3.5) :     0 :  None :     1 : False :  True : Binary
        (1600027564, 2.5) :     0 :  None :     1 : False :  True : Bi

# Block 3: Modeling (obj function and constraints)

Instead of typing in each price and predicted sales value by hand, we can simply create a loop **for** each product and a loop **for** each price. The code now looks very much like the general expression $\sum_{i} \sum_{j} p_{ij} \cdot D_{ij} \cdot x_{ij}$ we saw in the first part of today's session, with some minor changes that simplify the notation.
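
As a worked illustration, write $I$ for the set of UPCs and $J$ for the set of candidate prices (these set names are notation assumed here, not taken from the session material). The objective is then

$$\max \; \sum_{i \in I} \sum_{j \in J} p_{ij} \, D_{ij} \, x_{ij}$$

and each coefficient is simply price times predicted sales. For instance, the coefficient of $x_{1600027528,\ 2.5}$ is $p_{ij} \cdot D_{ij} = 2.5 \times 94.9 = 237.25$.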

In [8]:
# Objective function

obj_expr = sum(p[(i,j)]*D[(i,j)]*model.x[i,j] for i in productList for j in priceList) 
print(obj_expr)
model.OBJ = Objective(expr = obj_expr, sense = maximize)

237.25*x[1600027528,2.5] + 201.0*x[1600027528,3.0] + 162.4*x[1600027528,3.5] + 60.25*x[1600027564,2.5] + 67.80000000000001*x[1600027564,3.0] + 69.3*x[1600027564,3.5] + 15.5*x[3000006340,2.5] + 12.0*x[3000006340,3.0] + 10.5*x[3000006340,3.5] + 82.25*x[3800031829,2.5] + 72.9*x[3800031829,3.0] + 71.39999999999999*x[3800031829,3.5]


Similarly, we can create a loop to **add** constraint functions to the constraint set **for** each product to ensure that exactly one price on the list is selected for that product. Unlike the first part of today's session, we need not type each constraint by hand.

In [9]:
# Constraints #1
for i in productList:
  const1_expr = sum(model.x[i,j] for j in priceList) == 1 
  print(const1_expr)
  model.PriceChoiceUPC.add(expr = const1_expr)


x[1600027528,2.5] + x[1600027528,3.0] + x[1600027528,3.5]  ==  1.0
x[1600027564,2.5] + x[1600027564,3.0] + x[1600027564,3.5]  ==  1.0
x[3000006340,2.5] + x[3000006340,3.0] + x[3000006340,3.5]  ==  1.0
x[3800031829,2.5] + x[3800031829,3.0] + x[3800031829,3.5]  ==  1.0


Similar **for** loops apply to the average price constraint. Please refer to the first part of today's session for detailed elaboration.
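
For reference, here is a short sketch of where the right-hand side of this constraint comes from, using $I$ for the set of products and $J$ for the set of prices (set names assumed here). Because exactly one price is selected per product, $\sum_{j \in J} p_{ij} x_{ij}$ is the price charged for product $i$, so requiring the average price to equal $k$ gives

$$\frac{1}{|I|} \sum_{i \in I} \sum_{j \in J} p_{ij} \, x_{ij} = k \quad\Longleftrightarrow\quad \sum_{i \in I} \sum_{j \in J} p_{ij} \, x_{ij} = k \cdot |I|$$

With $k = 3.0$ and $|I| = 4$ products, the right-hand side is $3.0 \times 4 = 12.0$.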

In [10]:
# Constraints #2
const2_expr = sum(p[i,j]*model.x[i,j] for i in productList for j in priceList) == avgPriceValue*len(productList) 
print(const2_expr)
model.sumPrice.add(expr = const2_expr)

model.pprint()

2.5*x[1600027528,2.5] + 3.0*x[1600027528,3.0] + 3.5*x[1600027528,3.5] + 2.5*x[1600027564,2.5] + 3.0*x[1600027564,3.0] + 3.5*x[1600027564,3.5] + 2.5*x[3000006340,2.5] + 3.0*x[3000006340,3.0] + 3.5*x[3000006340,3.5] + 2.5*x[3800031829,2.5] + 3.0*x[3800031829,3.0] + 3.5*x[3800031829,3.5]  ==  12.0
5 Set Declarations
    PriceChoiceUPC_index : Dim=0, Dimen=1, Size=4, Domain=None, Ordered=False, Bounds=None
        [1, 2, 3, 4]
    sumPrice_index : Dim=0, Dimen=1, Size=1, Domain=None, Ordered=False, Bounds=None
        [1]
    x_index : Dim=0, Dimen=2, Size=12, Domain=None, Ordered=False, Bounds=None
        Virtual
    x_index_0 : Dim=0, Dimen=1, Size=4, Domain=None, Ordered=False, Bounds=(1600027528, 3800031829)
        [1600027528, 1600027564, 3000006340, 3800031829]
    x_index_1 : Dim=0, Dimen=1, Size=3, Domain=None, Ordered=False, Bounds=(2.5, 3.5)
        [2.5, 3.0, 3.5]

1 Var Declarations
    x : Size=12, Index=x_index
        Key               : Lower : Value : Upper : Fixed : Sta

# Block 4: Solution and results

Finally, we call the solver and obtain the optimal solution. We can see that product '1600027528' is again sold at $\$2.5$, products '1600027564' and '3000006340' are both sold at $\$3.5$, and product '3800031829' is sold at $\$2.5$, but the optimal objective value is now $\$399.3$. Don't panic. Just check the demand parameters in the objective function and you will notice the reason for this difference. Can you then explain why the demand parameters changed?
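
The reported solution can be checked by hand. The sketch below copies the `x` values from the `model.display()` output of this notebook into a plain dict (the names `x_values`, `predicted_sales` and `chosen` are hypothetical), then recomputes the revenue and the average price. With a solved Pyomo model, the same dict could presumably be filled with `value(model.x[i, j])` instead of being typed in.

```python
# Values of x copied from the model.display() output of this notebook.
x_values = {
    (1600027528, 2.5): 1.0, (1600027528, 3.0): 0.0, (1600027528, 3.5): 0.0,
    (1600027564, 2.5): 0.0, (1600027564, 3.0): 0.0, (1600027564, 3.5): 1.0,
    (3000006340, 2.5): 0.0, (3000006340, 3.0): 0.0, (3000006340, 3.5): 1.0,
    (3800031829, 2.5): 1.0, (3800031829, 3.0): 0.0, (3800031829, 3.5): 0.0,
}

# Predicted sales D[(upc, price)] from the data table of this notebook.
predicted_sales = {
    (1600027528, 2.5): 94.9, (1600027528, 3.0): 67.0, (1600027528, 3.5): 46.4,
    (1600027564, 2.5): 24.1, (1600027564, 3.0): 22.6, (1600027564, 3.5): 19.8,
    (3000006340, 2.5): 6.2, (3000006340, 3.0): 4.0, (3000006340, 3.5): 3.0,
    (3800031829, 2.5): 32.9, (3800031829, 3.0): 24.3, (3800031829, 3.5): 20.4,
}

# A binary variable counts as selected when its value is (numerically) 1.
chosen = {upc: price for (upc, price), val in x_values.items() if val > 0.5}

# Revenue = sum of price * predicted sales over the selected (upc, price) pairs.
revenue = sum(price * predicted_sales[(upc, price)] for upc, price in chosen.items())
avg_price = sum(chosen.values()) / len(chosen)

print(chosen)
print(round(revenue, 2), avg_price)
```

The threshold `0.5` is used rather than `== 1` because a solver may return binary values with small numerical error.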

In [11]:
# Solve the model
opt = SolverFactory('glpk')
opt.solve(model) 

model.display()

Model unknown

  Variables:
    x : Size=12, Index=x_index
        Key               : Lower : Value : Upper : Fixed : Stale : Domain
        (1600027528, 2.5) :     0 :   1.0 :     1 : False : False : Binary
        (1600027528, 3.0) :     0 :   0.0 :     1 : False : False : Binary
        (1600027528, 3.5) :     0 :   0.0 :     1 : False : False : Binary
        (1600027564, 2.5) :     0 :   0.0 :     1 : False : False : Binary
        (1600027564, 3.0) :     0 :   0.0 :     1 : False : False : Binary
        (1600027564, 3.5) :     0 :   1.0 :     1 : False : False : Binary
        (3000006340, 2.5) :     0 :   0.0 :     1 : False : False : Binary
        (3000006340, 3.0) :     0 :   0.0 :     1 : False : False : Binary
        (3000006340, 3.5) :     0 :   1.0 :     1 : False : False : Binary
        (3800031829, 2.5) :     0 :   1.0 :     1 : False : False : Binary
        (3800031829, 3.0) :     0 :   0.0 :     1 : False : False : Binary
        (3800031829, 3.5) :     0 :   0.0