# Lecture 1

Today we retrieve Stigler’s diet data, compute the cost-minimizing diet, and compare it with the diet Stigler found himself. We do so from R, using first Gurobi and then GLPK.

First, let's load the libraries: `gurobi` and `Rglpk` for the two solvers, and `tictoc` for timing.

In [2]:
library(gurobi)
library(tictoc)
library(Rglpk)

Using the GLPK callable library version 4.47


Then import the data. The tab-separated file `StiglerData1939.txt` is expected in the working directory.

In [3]:
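# setwd('')  # set the working directory to the folder holding the data file, if needed
thepath = getwd()
filename = "/StiglerData1939.txt"
# read the tab-separated file and convert it to a character matrix
thedata = as.matrix(read.csv(paste0(thepath, filename), sep = "\t", header = TRUE))
head(thedata)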

Commodity,Unit,Price.Aug.15.1939.cents.,Edible.Weight.per..1.00..grams.,Calories..1000.,Protein.grams.,Calcium.grams.,Iron.mg..,Vitamin.A.1000.I.U.,Thiamine.mg..,Riboflavin.mg..,Niacin.mg..,Asorbic.Acid..mg..
1. Wheat Flour (Enriched),10 lb.,36.0,12600,44.7,1411,2.0,365,,55.4,33.3,441,
2. Macaroni,1 lb.,14.1,3217,11.6,418,0.7,54,,3.2,1.9,68,
3. Wheat Cereal (Enriched),28 oz.,24.2,3280,11.8,377,14.4,175,,14.4,8.8,114,
4. Corn Flakes,8 oz.,7.1,3194,11.4,252,0.1,56,,13.5,2.3,68,
5. Corn Meal,1 lb.,4.6,9861,36.0,897,1.7,99,30.9,17.4,7.9,106,
6. Hominy Grits,24 oz.,8.5,8005,28.6,680,0.8,80,,10.6,1.6,110,


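Recall that the problem is
\begin{align}
\min_{q \geq 0} \quad & c^T q \\
\text{s.t.} \quad & Nq \geq d
\end{align}
Here $q$ is the vector of daily purchases, one entry per commodity. The data report quantities per \$1.00 spent on each commodity, so $q$ is measured in dollars and $c$ is simply a vector of ones, with one entry per commodity. $N$ is the matrix of nutrient contents, with one row per nutrient and one column per commodity. $d$ is the vector of required daily allowances, one entry per nutrient.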

In [4]:
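# number of commodities: rows with a non-empty name, minus the final row of daily allowances
nbCommodities = length(which(thedata[, 1] != "")) - 1
names = thedata[1:nbCommodities, 1]
# numeric part of the table: price, edible weight, then the 9 nutrients
themat = matrix(as.numeric(thedata[, 3:13]), ncol = 11)
themat[is.na(themat)] = 0  # blank entries are treated as zero
N = t(themat[1:nbCommodities, 3:11])  # nutrients in rows, commodities in columns
d = themat[(nbCommodities + 1), 3:11]  # allowances sit in the row after the last commodity
c = rep(1, nbCommodities)  # one dollar of spending costs one dollar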
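To see what the slicing above does, here is a minimal sketch on a made-up miniature of the table. The layout is an assumption chosen to mimic the real file: commodity rows first, then a row of allowances, then a blank row. The toy has only 6 columns, so the column indices differ from the ones used on the real data, and all names ending in `_toy` are hypothetical.

```r
# Hypothetical miniature of the data: name, unit, price, weight, 2 nutrients.
# The spinach and allowance numbers are made up for illustration.
toy = rbind(
    c("1. Wheat Flour", "10 lb.", "36.0", "12600", "44.7", "1411"),
    c("2. Spinach",     "1 lb.",  "8.1",  "",      "1.1",  "106"),
    c("Daily allowance", "",      "",     "",      "3",    "70"),
    c("",               "",       "",     "",      "",     "")
)

# Rows with a name, minus the allowance row, gives the commodity count
nb_toy = length(which(toy[, 1] != "")) - 1
labels_toy = toy[1:nb_toy, 1]

# Numeric part: empty strings become NA, which we then set to zero
num_toy = matrix(as.numeric(toy[, 3:6]), ncol = 4)
num_toy[is.na(num_toy)] = 0

# Nutrient columns are the last two; transpose to nutrients x commodities
N_toy = t(num_toy[1:nb_toy, 3:4])
d_toy = num_toy[nb_toy + 1, 3:4]

print(N_toy)
print(d_toy)
```

The transpose matters: the constraint $Nq \geq d$ needs one row per nutrient, whereas the file stores one row per commodity.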

Now let's try out Gurobi!

In [5]:
?gurobi

In [15]:
?gurobi_model

In [7]:
?gurobi_params

Mapping Gurobi's notation to ours:
* `A` = $N$
* `obj` = $c$
* `sense` = '$>$' (Gurobi reads this as $\geq$)
* `rhs` = $d$
* `modelsense` = '$\min$'

In [8]:
tic()
# pass params = list(OutputFlag = 0) as a second argument to silence the solver log
result = gurobi(list(A = N, obj = c, sense = ">", rhs = d, modelsense = "min"))
toc()

Optimize a model with 9 rows, 77 columns and 570 nonzeros
Coefficient statistics:
  Matrix range     [1e-01, 5e+03]
  Objective range  [1e+00, 1e+00]
  Bounds range     [0e+00, 0e+00]
  RHS range        [8e-01, 8e+01]
Presolve removed 0 rows and 47 columns
Presolve time: 0.01s
Presolved: 9 rows, 30 columns, 240 nonzeros

Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0    0.0000000e+00   1.384688e+01   0.000000e+00      0s
       5    1.0866228e-01   0.000000e+00   0.000000e+00      0s

Solved in 5 iterations and 0.02 seconds
Optimal objective  1.086622782e-01
0.06 sec elapsed
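The mapping between Gurobi's fields and our notation may be easier to read when the model is built as a named list before it is passed to the solver. The sketch below does this on a hypothetical two-nutrient, two-commodity problem (the `_toy` numbers are made up, not Stigler's data). The solve step is guarded so that the block still runs when `gurobi` or its licence is unavailable.

```r
# Hypothetical toy problem: 2 nutrients (rows) x 2 commodities (columns)
N_toy = matrix(c(1, 3, 2, 1), nrow = 2)  # rows are (1, 2) and (3, 1)
d_toy = c(2, 3)
c_toy = rep(1, ncol(N_toy))

# Same fields as in the call above, spelled out one per line.
# No lower bound is given: Gurobi's R interface defaults it to 0, i.e. q >= 0.
model = list(
    A = N_toy,
    obj = c_toy,
    sense = rep(">", nrow(N_toy)),
    rhs = d_toy,
    modelsense = "min"
)
str(model)

# Solve only if the gurobi package can be loaded; errors (e.g. no licence) are caught
if (requireNamespace("gurobi", quietly = TRUE)) {
    res_toy = try(gurobi::gurobi(model, params = list(OutputFlag = 0)), silent = TRUE)
    if (!inherits(res_toy, "try-error")) print(res_toy$objval)
}
```

Checking the dimensions of the list before solving (`nrow(A)` against `length(rhs)`, `ncol(A)` against `length(obj)`) is a cheap way to catch a forgotten transpose.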


Let's see what is in the `result` list:

In [9]:
str(result)

List of 13
 $ status      : chr "OPTIMAL"
 $ runtime     : num 0.0213
 $ itercount   : num 5
 $ baritercount: int 0
 $ nodecount   : num 0
 $ objval      : num 0.109
 $ x           : num [1:77] 0.0295 0 0 0 0 ...
 $ slack       : num [1:9] 0 -77.4 0 -48.5 0 ...
 $ rc          : num [1:77] 0 0.845 0.296 0.859 0.489 ...
 $ pi          : num [1:9] 0.00877 0 0.03174 0 0.0004 ...
 $ vbasis      : int [1:77] 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
 $ cbasis      : int [1:9] -1 0 -1 0 -1 0 -1 0 -1
 $ objbound    : num 0.109


We are after the optimal solution `x`, the dual solution `pi`, and the optimal value of the objective `objval`.

In [10]:
q_yearly = result$x * 365  # convert into yearly cost
pi = result$pi
cost_daily = result$objval
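The dual solution is worth a quick sanity check. The dual of our problem is $\max_{\pi \geq 0} d^T \pi$ subject to $N^T \pi \leq c$, and at the optimum its value equals the primal cost. The sketch below illustrates this on a hypothetical toy problem in base R (the `_toy` numbers are made up). It relies on the assumption, true for this toy but not in general, that every constraint binds at the optimum, so that both solutions come from a linear system.

```r
# Hypothetical toy problem: min q1 + q2  s.t.  q1 + 2 q2 >= 2,  3 q1 + q2 >= 3
N_toy = matrix(c(1, 3, 2, 1), nrow = 2)
d_toy = c(2, 3)
c_toy = c(1, 1)

# Assumption for this toy: both nutrient constraints bind, so N q = d
q_toy = solve(N_toy, d_toy)
# ... and both commodities are bought, so the dual constraints bind: t(N) pi = c
pi_toy = solve(t(N_toy), c_toy)

primal_value = sum(c_toy * q_toy)
dual_value = sum(pi_toy * d_toy)

# q_toy >= 0 and pi_toy >= 0, so both are feasible; equal values then prove optimality
print(q_toy)
print(pi_toy)
print(c(primal = primal_value, dual = dual_value))
```

With the objects from this lecture, the analogous check would compare `sum(pi * d)` with `cost_daily`.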

Our optimal solution, listing only the foods bought in non-zero amounts (yearly spending per food):

In [11]:
print("*** optimal solution ***")
toKeep = which(q_yearly != 0)
foods = q_yearly[toKeep]
names(foods) = names[toKeep]
print(foods)
print(paste0("Total cost (optimal)= ", sum(q_yearly * c)))

[1] "*** optimal solution ***"
1. Wheat Flour (Enriched)          30. Liver (Beef)               46. Cabbage 
               10.7744575                 0.6907834                 4.0932689 
             52. Spinach       69. Navy Beans Dried 
                1.8277961                22.2754257 
[1] "Total cost (optimal)= 39.6617315454663"


Compare this with Stigler's solution:

In [12]:
print("*** Stigler's solution ***")
toKeepStigler = c(1, 15, 46, 52, 69)
foods_stigler = c(13.33, 3.84, 4.11, 1.85, 16.8)
names(foods_stigler) = names[toKeepStigler]
print(foods_stigler)
print(paste0("Total cost (Stigler)= ", sum(foods_stigler * c[toKeepStigler])))

[1] "*** Stigler's solution ***"
1. Wheat Flour (Enriched) 15. Evaporated Milk (can)               46. Cabbage 
                    13.33                      3.84                      4.11 
             52. Spinach       69. Navy Beans Dried 
                     1.85                     16.80 
[1] "Total cost (Stigler)= 39.93"
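The two diets share four foods and differ in one: the linear program picks beef liver where Stigler picked evaporated milk. The gap in yearly cost can be computed from the two totals printed above. This is a small arithmetic sketch and the variable names are only illustrative.

```r
# Yearly totals copied from the outputs above
cost_optimal = 39.6617315454663
cost_stigler = sum(c(13.33, 3.84, 4.11, 1.85, 16.8))

# Absolute gap and gap relative to the optimal cost
gap = cost_stigler - cost_optimal
gap_pct = 100 * gap / cost_optimal

print(round(gap, 2))
print(round(gap_pct, 2))
```

Stigler's diet costs about 27 cents a year more than the optimum, which is under 1% of the total.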


Alternatively, we could use GLPK through the `Rglpk` package:

In [13]:
tic()
print("*** Optimal solution using Rglpk ***")
resGlpk = Rglpk_solve_LP(obj = c, mat = N, dir = rep(">", length(d)), rhs = d, bounds = NULL, 
    max = FALSE, control = list())
print(resGlpk$optimum * 365)
toc()

[1] "*** Optimal solution using Rglpk ***"
[1] 39.66173
0 sec elapsed
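`Rglpk_solve_LP` returns the diet itself in the `solution` field, so it can be listed in the same way as the Gurobi one. The helper below is a hypothetical convenience function, not part of either package. It replaces the exact `!= 0` test with a small tolerance, on the assumption that a solver may return tiny non-zero values for foods that are effectively absent from the diet.

```r
# Hypothetical helper: keep the entries of a solution vector that exceed a
# tolerance in absolute value, and label them
nonzero_solution = function(sol, labels, tol = 1e-9) {
    keep = which(abs(sol) > tol)
    out = sol[keep]
    names(out) = labels[keep]
    out
}

# Made-up example: the third entry is numerical noise and is dropped
sol_toy = c(0.8, 0, 1e-15, 0.6)
labels_toy = c("A", "B", "C", "D")
print(nonzero_solution(sol_toy, labels_toy))

# With the objects from this lecture, the call would be (not run here):
# nonzero_solution(resGlpk$solution * 365, names)
```

The tolerance `1e-9` is an arbitrary choice for illustration and should be set relative to the scale of the quantities involved.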
