# BUAD 313 - Spring 2025 - Assignment 2

Notes:
 - You may work in teams of up to 3.  Submit one assignment for the three of you, but please ensure it has all 3 of your names and @usc.edu emails.
 - You must submit your work as a .ipynb file (Jupyter notebook). The grader has to be able to run your notebook. Code that doesn't run gets zero points.  A great way to check that your code runs is to select "Clear All Outputs", "Restart", and then "Run All", and make sure it still shows all your answers properly!
 - Use the existing sections below to submit your answers.  You can add additional Python/markdown cells to describe and explain your solution, but keep it tidy.  Consider formatting some of your markdown cells for the grader.  [Markdown Guide](https://www.markdownguide.org/basic-syntax/)

The deadline for this assignment is **11:59 PM Pacific Time on Friday, February 21, 2025**. Late submissions will not be accepted.

Below are the standard Python packages that we use for optimization models in this course. By running this next Python cell, you will have these packages available to use in all your answers contained in this file.

In [None]:
import numpy as np
from gurobipy import Model, GRB, quicksum
import pandas as pd


## Team Names and Emails:
 <font color="blue">**(Edit this cell)**</font>
 - Team Member 1
 - Team Member 2
 - Team Member 3

## Question 1 (45 Points):  Portfolio Allocation Revisited

In this problem, we are revisiting the portfolio allocation problem we started developing in Session 9.  You may want to review that lecture and the mathematical formulation we developed. As a reminder, here was the formulation for the base model from class:

<img src="PortfolioProblem.png" alt="Base Portfolio Allocation Model Model" width="400" height=auto>


Assume a target return of .01.  

Our data for this problem are available on Brightspace and include:
- monthly_ret_simple.csv
- asset_metadata.csv

I did all the data-wrangling for you (because I'm a nice guy).  Below I load up monthly_ret_simple.csv into a monthly_returns_dict like we did in class.  I also load up asset_metadata.csv into a dictionary called asset_dict.  You'll probably need that dictionary later in the question.

In [None]:
#read monthly_ret_simple.csv using numpy, but ignore the first row and ignore first column
monthly_returns = np.genfromtxt('monthly_ret_simple.csv', delimiter=',', skip_header=True)[:,1:]

#read in just the first row of monthly_ret_simple.csv, ignore first column and label as tickers
tickers = np.genfromtxt('monthly_ret_simple.csv', delimiter=',', max_rows=1, dtype=str) [1:]

#read in just the first column of monthly_ret_simple.csv, ignoring the first row and label as dates
dates = np.genfromtxt('monthly_ret_simple.csv', delimiter=',', skip_header=True, usecols=0, dtype=str)

#convert monthly_returns into a dictionary where
# the keys are pairs of (date, ticker) and values are the return
monthly_returns_dict = { (dates[i], tickers[j]) : monthly_returns[i,j] for i in range(len(dates)) for j in range(len(tickers)) }

#compute a dictionary of the average returns for each asset
average_returns = { ticker : np.mean([ monthly_returns_dict[(date, ticker)] for date in dates ]) for ticker in tickers }

In [None]:
#read asset_metadata.csv into a dataframe and convert to a dictionary
asset_metadata = pd.read_csv('asset_metadata.csv')

# Convert to a dictionary where (ticker, column_label) -> value
asset_dict = {(row["Ticker"], col): row[col] for _, row in asset_metadata.iterrows() for col in asset_metadata.columns if col != "Ticker"}

categories = [col for col in asset_metadata.columns if col != "Ticker"]

### Part a) (10 points):
One of the criticisms brought up in class was that absolute deviation penalizes being above the expected return the same way it penalizes being below the expected return.  That seems silly because being above the expected return is a good thing.  

One way to address this problem is to use semi-deviation.  The monthly semi-deviation of a portfolio is
 - 0 if the return of the portfolio in that month is above its expected return.
 - the expected return of the portfolio minus the return of the portfolio in that month, otherwise.

Thus, there is no penalty if the portfolio outperforms the expected return, but there is a penalty if it underperforms.
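To make the definition concrete, here is a minimal sketch that computes the monthly semi-deviations for a single, fixed return series. The function name `monthly_semi_deviations` and the numbers in `toy_returns` are made up for illustration; they are not part of the course data.

```python
def monthly_semi_deviations(portfolio_returns):
    """Return the semi-deviation of each month for one portfolio's return series."""
    # Expected return is taken to be the average return over the dataset.
    expected = sum(portfolio_returns) / len(portfolio_returns)
    # Only months that fall short of the expected return are penalized.
    return [max(0.0, expected - r) for r in portfolio_returns]


# Hypothetical monthly returns (made-up numbers, not from monthly_ret_simple.csv).
toy_returns = [0.03, -0.01, 0.02, 0.00]

semi_devs = monthly_semi_deviations(toy_returns)
average_semi_dev = sum(semi_devs) / len(semi_devs)

print("Monthly semi-deviations:", semi_devs)
print("Average semi-deviation:", average_semi_dev)
```

Note that the `max` here is fine for evaluating a given portfolio, but it is not a linear expression, so it cannot be written directly into a linear optimization model.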

Modify our base model to minimize the average semi-deviation over the dataset.  Write out a full linear optimization formulation for your new model (decision variables, constraints, and objective).  You may add/remove variables and constraints from the base model, and/or change the objective.  All variables should be continuous.  Be sure to explain any new variables, constraints or objective in words and in mathematical formulas.

<font color="blue">  Let
- $x_i$ be the fraction of the portfolio invested in asset i for i in tickers
- $d_t$ be a variable that will be the monthly semideviation at optimality for each t in dates

Our formulation is:

<img src="SemiDeviation_Solution.png" alt="Base Portfolio Allocation Model Model" width="400" height=auto>

All the constraints not involving $d_t$ mean the same as they did before.  The $d_t$ constraints are constructed so that, at optimality, $d_t$ will be the monthly semi-deviation in month $t$.  To see this, notice that $d_t$ will always be non-negative because of the first $d_t$ constraint. We then have two cases:
1. If the monthly return of the portfolio exceeds the expected return (outperformance), then the right side of the second $d_t$ constraint will be negative.  So because $d_t \geq 0$ and we are minimizing, the first constraint will become tight and $d_t$ will be zero in the optimal solution.  
2. If the expected return exceeds the monthly return (underperformance), the right side of the second $d_t$ constraint will be positive, so when we are minimizing, $d_t$ will be equal to this difference in the optimal solution.  

Either way, $d_t$ encodes the semi-deviation at optimality.  The objective then correctly represents the average of the semi-deviations.
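For readers who cannot see the image, the formulation described in words above can be sketched as follows. The symbols $r_{t,i}$ (the return of asset $i$ in month $t$), $\bar{r}_i$ (the average return of asset $i$) and $T$ (the number of months in the dataset) are notation assumed here and may differ from what the image uses.

$$
\begin{aligned}
\min_{x,\,d} \quad & \frac{1}{T} \sum_{t \text{ in Dates}} d_t \\
\text{s.t.} \quad & \sum_{i \text{ in Tickers}} x_i = 1 \\
& \sum_{i \text{ in Tickers}} \bar{r}_i x_i \geq .01 \\
& d_t \geq 0 && \text{for all } t \text{ in Dates} \\
& d_t \geq \sum_{i \text{ in Tickers}} \bar{r}_i x_i - \sum_{i \text{ in Tickers}} r_{t,i} x_i && \text{for all } t \text{ in Dates} \\
& x_i \geq 0 && \text{for all } i \text{ in Tickers}
\end{aligned}
$$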

<font color="red"> **Grading Guidelines** This question was difficult and we were lenient in grading.  Don't assume you really understood it just because you got high marks.  Check your work!

- 2 points were awarded for introducing variables and constraints that define semi-deviation. (Reusing $d_t$ as I did was not necessary as long as what you wrote was clear.) Attempts that represented semi-deviation with a non-linear constraint received no points.  Partial credit on this part was sparingly allocated. While there are possibly other solutions, your response must be essentially correct to earn these two points.
- 3 points were awarded for clearly explaining why the above constraints correctly represented semi-deviation.  Unclear explanations or explanations that did not explain both "halves" of the if received only partial credit. Reversing the direction of semi-deviation lost 1/2 point.  Failure to recognize that $d_t$ only represents semi-deviation at the optimal solution lost 1/2 point.  
- 5 points were awarded if the remainder of the formulation was correctly formulated as a linear optimization problem and described clearly.  (Hence, any reasonable answer should receive at least 5 points from this portion.)

### Part b) (10 points)
The client decides this whole semi-deviation idea is too complicated for their taste.  They propose a simpler optimization model similar to the base model. They want to change the objective to maximize the worst monthly return the portfolio earns over the whole dataset, but otherwise keep the other portions of the problem the same (invest all the wealth, don't short-sell, achieve a target return of .01).  

Write a linear optimization formulation for this new problem (decision variables, constraints, and objective).  Be clear and explain mathematics in words where appropriate. Your formulation should only use continuous decision variables.

**Hint** Even though the client only wants to change the objective, you might need to change the constraints and variables too to achieve what they want


<font color="blue"> Solution:
Let
- $x_i$ be the fraction of the portfolio invested in asset $i$ for i in Tickers
- w be an auxiliary decision variable that will represent the worst monthly return at optimality.

<img src="worst_return_solution.png" alt="Base Portfolio Allocation Model Model" width="400" height=auto>

Most of the constraints mean the same as they did in the base model.  The only difference is the constraints for $w$.  Since we are maximizing, the objective pushes $w$ to be as big as possible, but the constraints require it to be less than or equal to the monthly return of the portfolio for each $t$. So at optimality, it will be equal to the smallest of these monthly returns, i.e., the worst return.
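For readers who cannot see the image, the formulation described above can be sketched as follows. The symbols $r_{t,i}$ (the return of asset $i$ in month $t$) and $\bar{r}_i$ (the average return of asset $i$) are notation assumed here and may differ from what the image uses.

$$
\begin{aligned}
\max_{x,\,w} \quad & w \\
\text{s.t.} \quad & w \leq \sum_{i \text{ in Tickers}} r_{t,i} x_i && \text{for all } t \text{ in Dates} \\
& \sum_{i \text{ in Tickers}} x_i = 1 \\
& \sum_{i \text{ in Tickers}} \bar{r}_i x_i \geq .01 \\
& x_i \geq 0 && \text{for all } i \text{ in Tickers}
\end{aligned}
$$

Here $w$ is unrestricted in sign, since the worst monthly return may well be negative.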

<font color="red"> Grading Guidelines:
 - 3 points were allocated for correctly introducing a new decision variable $w$ and introducing the appropriate constraints.  If students reversed the $\leq$ to a $\geq$, they lost 1.5 points.  If they failed to specify these constraints held for all $t$ in dates, they lost 1 point.  Other attempts that were getting at the right notion but incorrect might receive some (limited) partial credit.  No points were given for formulations that introduced nonlinear constraints.
 - 3 points were allocated for clearly explaining why $w$ represents the worst monthly return.  Failure to note that this only holds at optimality lost 1/2 a point.
 - 4 points (generous) were given as long as the remainder of the problem was a cogent linear optimization problem.  1 point deduction if students left in the deviation or semi-deviation variables.  

### Part c) 10 points
Using your formulation in part b, code up your model in Gurobi and solve it.  (Use a target_return of .01).  

Include your Python code for the model in one or more Python cells.  Your code should print out the optimal value and optimal solution from your model as its last step. Be sure to label which is which and what the units are! Code that does not run earns no credit!

<font color="blue"> **Solution**

In [None]:
#create a gurobi model called wc
wc = Model("wc")

#add a continuous, nonnegative decision variable for each ticker
x = wc.addVars(tickers, vtype=GRB.CONTINUOUS, name='x')

#add a continuous, unbounded (free) decision variable w
w = wc.addVar(vtype=GRB.CONTINUOUS, name='w', lb = -GRB.INFINITY)

#add a constraint that the sum over i in tickers of x_i = 1
wc.addConstr(quicksum(x[i] for i in tickers) == 1)

#add a constraint that the sum over i in tickers of x_i * average_returns[i] >= .01
wc.addConstr(quicksum(x[i] * average_returns[i] for i in tickers) >= .01)

for t in dates:
    #add a constraint that w <= sum over i in tickers of x_i * monthly_returns_dict[(t,i)]
    wc.addConstr(w <= quicksum(x[i] * monthly_returns_dict[(t,i)] for i in tickers))

#set the objective to maximize w
wc.setObjective(w, GRB.MAXIMIZE)

#optimize the model
wc.optimize()


Gurobi Optimizer version 11.0.3 build v11.0.3rc0 (mac64[x86] - Darwin 23.6.0 23G93)

CPU model: Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 28 rows, 23 columns and 642 nonzeros
Model fingerprint: 0xbbffce86
Coefficient statistics:
  Matrix range     [9e-05, 1e+00]
  Objective range  [1e+00, 1e+00]
  Bounds range     [0e+00, 0e+00]
  RHS range        [1e-02, 1e+00]
Presolve time: 0.01s
Presolved: 28 rows, 23 columns, 642 nonzeros

Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0      handle free variables                          0s
       7   -3.4811432e-02   0.000000e+00   0.000000e+00      0s

Solved in 7 iterations and 0.03 seconds (0.00 work units)
Optimal objective -3.481143246e-02


In [None]:
print("Optimal Solution:")
for i in tickers:
    print(f"{i}: {x[i].x}")

#print the optimal value
print(f"Optimal Value: {w.x}")


Optimal Solution:
AAPL: 0.0
ARKK: 0.0
BABA: 0.0
BITO: 0.0
EEM: 0.0
EWJ: 0.0
FSLR: 0.11413340447750517
GLD: 0.6337683048220752
GRN: 0.0
HASI: 0.0
ICLN: 0.0
LIT: 0.0
MSFT: 0.0
NVDA: 0.0
PLD: 0.0
SWBI: 0.0
TSLA: 0.06303073061566174
TSM: 0.0
USO: 0.12476085274813556
VNO: 0.0
VOO: 0.0
XOM: 0.06430670733662228
Optimal Value: -0.03481143246158523


<font color = "red">**Grading guidelines:**
 - Code needs to implement the model defined by the student in the previous parts, i.e, earlier mistakes do *not* carry forward.  Partial credit may be awarded if some, but not all of the coded model matches the mathematical model described in earlier parts.  Full points should be reserved for the implementation matching the formulation exactly, solving, and reporting both the optimal solution (values for all decision variables) and the optimal value.
 - Failure to label the optimal solution and optimal value separately loses 1 point (as does switching the terminology).
 - Failure to specify the optimal solution loses 3 points.  Failure to specify optimal value loses 1 point.
 - No code presented receives zero points, as does code that does not run.  
 - Mild rounding of the optimal value or the values in the optimal solution does not matter.  But we should NOT round to the nearest integer (loses 1/2 point).
</font>

### Part d) (15 points)

The client saw your initial work in class and has now articulated a variety of additional constraints on the portfolio.  For each of the constraints below, write (paper and pencil) how you would represent the constraint mathematically as one or more linear constraints in our model.  You may add additional auxiliary variables if you need them, but clearly define them and what they should mean.  You may use the indexed data originally loaded at the top of this question.  You may also define new Index Sets if you so choose.

Add a Markdown Cell directly below this cell with your answer.  You do NOT need to code these constraints in your model.  Be sure to label each constraint (the same way as in the question) so we know which one is which!

1. (China Tariff Concerns) The client is concerned about the impending trade war with China.  They want at most 10% of the portfolio invested in Chinese assets.
1. (ESG Requirement) No more than 30% of the portfolio can be invested in assets with a "low" ESG rating.  
1. (Liquidity Requirement) The ratio of "Low" liquidity assets to "High" liquidity assets should be no more than 10%.
1. (Commodities Minimum) The amount invested in the Asset Class "Commodities" should be at least 10% of the amount invested in the Asset Class "Equities."
1. (Diversifying Equities) Within the investments in Equities, a third of them should be Small Cap, a third should be Mid Cap, and a third should be Large Cap.  
1. (No Tesla) The client does not want to invest in Tesla (TSLA) at all.

**Hint 1:**  DO THIS WITH PAPER AND PENCIL BEFORE TYPING ANYTHING.  

**Hint 2:** I'm going to solve the first part of the problem for you.  Read my answer and mimic it for the remaining parts.

**China Tariff Concerns** Let C be the set of tickers $i$ where asset_data[$i$, "Region"] = "China".  (C is an Index Set.)  Then, the constraint can be written as
- sum over i in C of x[i] <= .1.  
  
Or, if I want to make it look a bit prettier, I can write
 - $ \sum_{i \text{ in C}} x_i \leq .1 $

(Either answer is fine.)

<font color = "blue"> Solution:
The constraints are:

#### ESG Requirement
Let LowESG be the set of tickers i where asset_data[i, "ESG Rating"] = "Low".  (LowESG is an index set.)  Then the constraint can be written as:
 - $ \sum_{i \text{ in LowESG}} x_i \leq .3$

 #### Liquidity Requirement
 Let LowLiq be the set of tickers i where asset_data[i, "Liquidity"] = "Low" and let HighLiq be the set of tickers i where asset_data[i, "Liquidity"] = "High".  (LowLiq and HighLiq are both index sets.)  Then the constraint can be written as:
 - $\sum_{i \text{ in LowLiq}} x_i \leq .1 \sum_{i \text{ in HighLiq}} x_i$


 #### Commodities Minimum
 Let Com be the set of tickers i where asset_data[i, "Asset Class"] == "Commodities".  Let Eq be the tickers i where asset_data[i, "Asset Class"] == "Equities".  (Both Com and Eq are index sets.)  Then the constraint can be written as:
 - $\sum_{i \text{ in Com}} x_i \geq .1 \sum_{{i \text{ in Eq}}} x_i$

 #### Diversifying Equities
 Let
  - SC be the set of tickers i where asset_data[i, "Sub-Class"] = "Small Cap",
  - MC be the set of tickers i where asset_data[i, "Sub-Class"] = "Mid Cap",
  - LC be the set of tickers i where asset_data[i, "Sub-Class"] = "Large Cap".

  Then we need to add the constraints that
  - $ \sum_{i \text{ in SC} } x_i  = \sum_{i \text{ in MC}}  x_i$
  - $\sum_{i \text{ in SC} } x_i = \sum_{i \text{ in LC}} x_i$

  Note, this is not the only way to do this.  There are many other equivalent formulations, all correct.

  #### No Tesla
  - $x_{\text{TSLA}} = 0$.
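Although coding these constraints is not required, it can help to see how an index set is built from a dictionary keyed by (ticker, column). The sketch below uses a small made-up dictionary in the same layout as the one loaded at the top of the question; the tickers `AAA`, `BBB`, `CCC`, their attribute values, the helper `index_set`, and the candidate portfolio `toy_x` are all hypothetical.

```python
# Hypothetical metadata in a (ticker, column) -> value layout (made-up values).
toy_asset_data = {
    ("AAA", "Region"): "China", ("AAA", "ESG Rating"): "Low",
    ("BBB", "Region"): "US",    ("BBB", "ESG Rating"): "High",
    ("CCC", "Region"): "China", ("CCC", "ESG Rating"): "Medium",
}
toy_tickers = ["AAA", "BBB", "CCC"]


def index_set(asset_data, tickers, column, value):
    """Return the tickers whose entry in `column` equals `value`."""
    return [i for i in tickers if asset_data[(i, column)] == value]


china = index_set(toy_asset_data, toy_tickers, "Region", "China")
low_esg = index_set(toy_asset_data, toy_tickers, "ESG Rating", "Low")

# Check the two constraints for a hypothetical candidate portfolio.
toy_x = {"AAA": 0.05, "BBB": 0.90, "CCC": 0.05}
china_ok = sum(toy_x[i] for i in china) <= 0.1 + 1e-12
esg_ok = sum(toy_x[i] for i in low_esg) <= 0.3 + 1e-12

print("China set:", china, "constraint satisfied:", china_ok)
print("LowESG set:", low_esg, "constraint satisfied:", esg_ok)
```

In a Gurobi model, the same lists could be used inside `quicksum` to build each constraint, in the same way the code in part c sums over `tickers`.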



<font color = "red">**Grading guidelines:**
- You do not have to use latex in your markdown for full credit.  Any typed response that was clear was ok.  Unclear or vague responses may lose credit.
- Most constraints had many equivalent formulations.  Credit was awarded 3 points per constraint.
- Failure to clearly define an index set loses 1 pt.  Note, an alternate acceptable way to write the indexing without an index set would have been (for the China Tariff Concerns Constraint)
$$
\sum_{\substack{i \text{ in Tickers}\\ \text{asset\_data[i, "Region"] = "China"}}} x_i \leq .1
$$
Students who did this correctly (or described it correctly) received full credit.
- The following was also ok:
$$
\sum_{i} x_i \leq .1  \qquad \text{where the sum is over tickers such that asset\_data[i, "Region"] = "China"}
$$
- Logically incorrect constraints that were still syntactically correct received 1 point.  
</font>

## Question 2 (26 Points): After that last question, I could use a ...

Trojan Microbrewers brews four types of beer: Light, Dark, Ale, and Premium (abbreviated L, D, A, P).  Beer is made from 3 main ingredients: malt, hops and yeast.  Each of the different beers requires different amounts of each ingredient to make one beer, plus a lot of other minor ingredients like artificial flavors and preservatives.

Trojan Microbrewers currently has some inventory of malt, hops and yeast (in pounds), but it has an essentially unlimited supply of artificial flavors and preservatives.

Finally, each beer has its own revenue per bottle sold (in dollars).

Your colleague formulated a linear optimization model to maximize the revenue of Trojan Microbrewers
subject to the constraints on the availability of current inventory of main ingredients, taking into account the different recipes. (They assumed we're not buying any more main ingredients at the moment.)
                                                                      
After formulating the model, they solved it and computed the following sensitivity analysis table, but forgot to share the actual formulation with you.

<img src="beer_sensitivity2.png" alt="Sensitivity Table for the Beer model" width="800" height=auto>


#### Part a (5 points)
From just the sensitivity table, can you say what the optimal objective value is?  If so, be sure to explain your answer and how you deduced it from the table for full credit and give units. If not, explain why not and give the best bounds you can.

<font color="blue"> Solution:
Yes. Reading from the first half of the table, the Final Value column tells us the optimal solution (bottles of each beer). The third column (Objective Coefficient) gives us the corresponding objective function coefficients (revenue per bottle).  Multiplying and summing gives
$6 \times 30 + 5 \times 20 + 3 \times 35 + 7 \times 0 = 385$, i.e., an optimal revenue of 385 dollars.
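The same arithmetic as a short sketch. The coefficient and value pairs are the ones quoted in the solution; attaching the first two pairs to Light and Dark is an assumption based on the L, D, A, P ordering, since the table itself is only available as an image.

```python
# Objective coefficients (dollars per bottle) and final values (bottles),
# as quoted in the solution. The Light/Dark labels are an assumed ordering.
revenue_per_bottle = {"Light": 6, "Dark": 5, "Ale": 3, "Premium": 7}
bottles_made = {"Light": 30, "Dark": 20, "Ale": 35, "Premium": 0}

# Optimal objective value = sum over beers of (coefficient * final value).
optimal_revenue = sum(revenue_per_bottle[b] * bottles_made[b] for b in bottles_made)

print(f"Optimal revenue: {optimal_revenue} dollars")
```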

<font color="red"> Grading Notes:
3 points of partial credit were given for any correct interpretation of the first half of the table, but to earn full credit, a student must show work.  If a student shows numeric values but does not explain where they come from, they lose 1.5 points.

#### Part b (3 points)
From just the sensitivity table, can you say what the current inventory of Malt, Hops and yeast is?  If so, provide them and explain how you deduced this from the table.  If not, explain why not and provide any bounds you can.

<font color="blue"> **Solution** This is Constr. RHS of the second half of the table.  So we have 50, 150, 85 pounds respectively. </font>

 <font color = "red"> Grading Notes:  If student confused and answered with the final value or did not specify that it was the right hand side of the table, lose 1 point. </font>

#### Part c (3 points)
How would you describe the optimal solution to a non-technical stakeholder?  Explain why your description is sufficient to articulate the optimal solution.

<font color = "blue"> Solution:  Do not make any Premium Beer and make sure you use all your main ingredients.
This description is enough because it encodes all the tight constraints of the model.
</font>

<font color = "red">Grading Notes:  For full credit, the description should make clear all the tight constraints, and the explanation must mention tight constraints.  Failure to mention tight constraints in the explanation (even with a correct description) loses 1.5 points.  A description that just describes the numerical values loses 2 points.  Not providing an explanation loses 1.5 points.
</font>

The remaining questions all require you to answer the question and justify your response using the table.  If you cannot provide a precise response without resolving the model, indicate so, and give the best answer you can with the information at hand.

<font color = "red"> Grading Notes: For each of the following parts, the same grading was applied.  A correct answer with no explanation or an incorrect explanation receives no credit.  Failure to discuss whether the change is within the allowable increase/decrease loses 1 point.  For questions that require re-solving and stating a bound, simply stating the model needs to be re-solved only earns 1 point.  Partial credit may be given for some discussion of the bound, but full credit is reserved for a complete answer that provides a suitable (upper or lower) bound for the relevant question.

#### Part d) (3 points)
Upon inspection, your floor manager tells you some mice got into the yeast.  You've lost about 5 lbs of yeast that must be discarded.  How do you expect your optimal value to change?

<font color = "blue"> Solution:  5 is within the allowable decrease of yeast (which is 10) and the shadow price is $1, so the change in optimal value should be -$5.

#### Part e) (3 points)
Marketing suggests we can increase the price of ale $3 without affecting demand.  If we did this, what would be the change in objective value?

<font color = "blue"> Solution:  An increase of 3 is exactly the allowable increase (also 3) in the top table for Ale, so the optimal solution should not change.  As a consequence, we just make $3 extra for every bottle of ale we make, i.e., the objective value increases by $3 * 35 = $105.

#### Part f) (3 points)
Your buddy also runs a microbrewery, and they've had a similar problem with mice.  He wants to buy 20 lbs of Malt off you.  You want to quote him a fair price (i.e. you're not trying to make a profit off your friend). What would you sell it for?

<font color = "blue">  20 lbs decrease is within the allowable decrease of Malt which is 35.  So the shadow price pertains.  So I would sell it for $3 /lb, or $60 overall.
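The reasoning in parts d and f follows the same pattern: check that the change is inside the allowable range, then multiply by the shadow price. A minimal sketch of that rule is below; the helper name `predicted_value_change` is made up, and the numbers are the shadow prices and allowable decreases quoted in the solutions above.

```python
def predicted_value_change(shadow_price, delta,
                           allowable_increase=None, allowable_decrease=None):
    """Return shadow_price * delta if delta is inside the allowable range.

    Returns None when the change falls outside the range (or the relevant
    limit is unknown), since the shadow price alone no longer gives the answer.
    """
    limit = allowable_increase if delta >= 0 else allowable_decrease
    if limit is None or abs(delta) > limit:
        return None
    return shadow_price * delta


# Part d: lose 5 lbs of yeast (shadow price 1, allowable decrease 10).
yeast_change = predicted_value_change(1, -5, allowable_decrease=10)

# Part f: give up 20 lbs of malt (shadow price 3, allowable decrease 35).
malt_change = predicted_value_change(3, -20, allowable_decrease=35)

print("Change in optimal value from losing yeast:", yeast_change)
print("Change in optimal value from selling malt:", malt_change)
```

The malt result is the revenue given up by selling, which is why 60 dollars is the break-even price for the 20 lbs.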

#### Part g) (3 points)
Assuming demand stays fixed, what is the minimal price increase you'd ask for in Premium Beer to consider producing it?

<font color="blue"> Solution:  Right now the optimal solution makes no Premium Beer (final value is zero in the first half of the table).  The allowable increase is $7. We don't know whether increasing the price by more than $7 would make us start producing Premium Beer, but we certainly have to increase the price by at least $7 to consider it.  

#### Part h) (3 points)
An alternate supplier is willing to sell you 25 lbs of hops at $1 per lb.  Should you take the deal?  Would you expect to make a profit, neither make a profit nor lose money, or lose money if you did? If you can't be sure, explain why.  

<font color ="blue">Solution: I would not take the deal because I'd expect to lose money.  An increase of 25 lbs is outside the allowable increase (which is 20 lbs). Hence, I can't know precisely how much the optimal value will change.  But I do know that for the first 20 lbs, the shadow price is $1 per lb, so they're worth $20.  

The issue is the remaining 5 lbs, for which I should expect decreasing marginal returns. In a BEST case scenario, those 5 lbs are also worth $1 per lb, in which case I'd break even.  But if the marginal value does decrease, I'd lose money.  It's not worth the risk.
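The bound in this argument can be written out as a short calculation. This is only a sketch of the reasoning above, using the shadow price and allowable increase quoted in the solution; the variable names are made up.

```python
offer_lbs = 25            # pounds of hops on offer
offer_price = 1.0         # dollars per pound asked by the supplier
shadow_price = 1.0        # dollars per pound, valid inside the allowable range
allowable_increase = 20   # pounds of hops for which the shadow price holds

# Pounds whose value is known exactly, and pounds beyond the allowable range.
covered_lbs = min(offer_lbs, allowable_increase)
extra_lbs = offer_lbs - covered_lbs

known_gain = shadow_price * covered_lbs
# With decreasing marginal returns, the extra pounds are worth AT MOST the
# current shadow price each, so this is an upper bound on the total gain.
best_case_gain = known_gain + shadow_price * extra_lbs

cost = offer_lbs * offer_price
best_case_profit = best_case_gain - cost

print(f"Known gain on first {covered_lbs} lbs: {known_gain} dollars")
print(f"Upper bound on total gain: {best_case_gain} dollars")
print(f"Best-case profit: {best_case_profit} dollars")
```

Since even the best case only breaks even, any drop in the marginal value of the last 5 lbs turns the deal into a loss.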