Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE`, as well as your name below:

In [None]:
NAME = ""

Please save this notebook in a folder of the same name (`CME_PS` plus the corresponding number) and push it to your LRZ-Gitlab repository. Do not change the name of the notebook!

Note: some of the cells below are locked, so you cannot change them, but you can evaluate them!

---

# Computational Methods in Economics

## Problem Set 4 - Numerical Optimization and Pandas 

#### DEADLINE: Wednesday, January 8, 12 pm (Noon)

### Preliminaries

#### Import Modules

In [None]:
import numpy as np
import pandas as pd
import scipy.optimize

import matplotlib.pyplot as plt
%matplotlib inline

## Question 1

Run the cell below, which imports the data set on Bundesliga players and performs the same operations as in the lecture in order to make the players' last names the index of the DataFrame. (Make sure that you have the **BundesligaData.csv** file in the same folder!)

In [None]:
df = pd.read_csv('BundesligaData.csv', sep = ';')

def reverse_name(name):
    L = name.split(" ")
    try:
        S = L[1] + " " + L[0]
    except IndexError:
        S = L[0]
    return S
## apply reverse_name function on 'name' column
df['name'] = df['name'].apply(reverse_name)
## create new dataframe with last and first names
names = df['name'].str.split(expand=True)
# replace column 'name'
df['name'] = names[0]
## add column 'first name'
df.insert(1, 'first name', names[1])
## make name the index
df.set_index('name', drop = True, inplace = True)
## check dataframe
df.head()

Using this DataFrame, answer the following questions:

(a) Which player got the most scorer points in the 2016/17 season, and how many points did he get? A scorer point is awarded for both a goal and an assist. Your answer should be a tuple named **answer** consisting of a string (the player's last name) and an integer.

In [None]:
## (a)
# YOUR CODE HERE

In [None]:
## THIS IS A TEST CELL!

(b) Are there any players in the data set that are younger than 23 and have scored at least 8 goals? Your answer should be a tuple named **answer** containing only strings.
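
Conditions of this kind are typically combined with boolean masks. The following is only a minimal sketch on made-up data: the DataFrame `toy`, its labels and its numbers are hypothetical and not taken from **BundesligaData.csv**.

```python
import pandas as pd

# hypothetical toy data, not taken from BundesligaData.csv
toy = pd.DataFrame({'age': [21, 30, 22, 19],
                    'goals': [10, 12, 3, 8]},
                   index=['A', 'B', 'C', 'D'])

# each condition needs its own parentheses; & combines them element-wise
mask = (toy['age'] < 23) & (toy['goals'] >= 8)

# index labels of the rows that satisfy both conditions, as a tuple of strings
selected = tuple(toy[mask].index)
print(selected)
```

The parentheses around each comparison are needed because `&` binds more tightly than `<` and `>=` in Python.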

In [None]:
## (b)
# YOUR CODE HERE

In [None]:
## THIS IS A TEST CELL!

(c) At which positions do the players with the most red cards, the most assists, and the highest pass success, respectively, play? For your answer, use the following dictionary, where you need to add the corresponding positions.

In [None]:
answer = {'red cards': None,\
          'assists': None,\
          'pass success': None}

In [None]:
## (c)
# YOUR CODE HERE

In [None]:
## THIS IS A TEST CELL!

(d) For the first five players, update the **'goals'** column with the number of goals they scored on matchday 34, namely {'Lewandowski': 0, 'Aubameyang': 2, 'Mueller': 0, 'Costa': 0, 'Reus': 1}. This should change the DataFrame **df** in place.

**Hint**: All functions or methods needed to solve this question were discussed in the **pd.Series** part of the lecture.
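
As a reminder of how label-based operations on a **pd.Series** behave, here is a small sketch on a made-up Series (the labels and numbers are hypothetical). It contrasts adding values with overwriting them; which of the two the question calls for is left to you.

```python
import pandas as pd

# hypothetical toy Series, not taken from BundesligaData.csv
goals = pd.Series({'A': 5, 'B': 7, 'C': 2})
extra = pd.Series({'A': 1, 'C': 3})

# add aligns on the index labels; fill_value=0 treats missing labels as zero
total = goals.add(extra, fill_value=0)

# update overwrites the values at matching labels in place and returns None
replaced = goals.copy()
replaced.update(extra)

print(total)
print(replaced)
```

Both operations match entries by index label rather than by position, so the order of the labels in `extra` does not matter.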

In [None]:
## (d)
# YOUR CODE HERE

In [None]:
## THIS IS A TEST CELL!

(e) At what age do players on average have the highest value? At what age do they have the highest rating? Use the **groupby** and the **apply** methods. Your answer should be a tuple named **answer** consisting of two integers.
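
A minimal sketch of the **groupby**/**apply** pattern on made-up data (the DataFrame `toy` and its numbers are hypothetical, not taken from the data set):

```python
import pandas as pd

# hypothetical toy data, not taken from BundesligaData.csv
toy = pd.DataFrame({'age': [20, 20, 25, 25, 30],
                    'value': [4.0, 6.0, 10.0, 12.0, 3.0]})

# select the column, group by age, and apply a function to each group
mean_by_age = toy.groupby('age')['value'].apply(lambda s: s.mean())

# idxmax returns the index label (here: the age) with the largest group mean
best_age = int(mean_by_age.idxmax())
print(mean_by_age)
print(best_age)
```

Note that **idxmax** returns the label of the maximum, whereas **max** would return the maximal value itself.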

In [None]:
## (e)
# YOUR CODE HERE

In [None]:
## THIS IS A TEST CELL!

(f) Write a function **standardize** that takes a series or dataframe column and standardizes its data, i.e. transforms each value by removing the mean and by dividing by the standard deviation:
\begin{equation}
    \tilde{x}_i = \frac{x_i - \text{mean}(x)}{\text{std}(x)}
\end{equation}
When computing the standard deviation, keep the default settings.
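
The default settings matter because pandas and NumPy use different ones. The short sketch below only compares the two defaults; it is not the **standardize** function itself. The values in the check cell further down are consistent with the pandas default.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# pandas divides by N - 1 by default (ddof=1)
std_pandas = s.std()
# NumPy's function divides by N by default (ddof=0)
std_numpy = np.std(s.to_numpy())

print(std_pandas, std_numpy)
```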

Create a new DataFrame **df_stand** as a copy of **df**. Use the **apply** method to standardize **df_stand**. Keep in mind that **apply** only works if all columns have the right data type; use **drop** to get rid of those columns that don't.

In [None]:
## (f)
# YOUR CODE HERE

In [None]:
s = pd.Series([1,2,3,4,5])
assert np.allclose(standardize(s), pd.Series([-1.264911,-0.632456,0,0.632456,1.264911]) )

In [None]:
## THIS IS A TEST CELL!

## Question 2

Solve the following problem using *constrained optimization*:

$$
    \min_{\mathbf{x}}\ x_1 x_4  (x_1 + x_2 + x_3) + x_3
$$

s.t. 
$$
  x_1 x_2 x_3 x_4 \ge 25
$$

$$
  x_1^2 + x_2^2 + x_3^2 + x_4^2 = 40
$$

and $1 \le x_j \le 5$ for $j =1,2,3,4$. You can use $x0 = (1,5,5,1)$ as an initial guess. Save your solution vector under the name **x_sol**.

**Hint**: In order to solve the question, apart from the initial guess, you need to define three functions (either using **def** or the **lambda** notation), a list of dictionaries and a list of tuples. 
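
To illustrate the ingredients named in the hint, here is a sketch for a different, much simpler toy problem. The choice of `method='SLSQP'` is an assumption made for this sketch; the names `objective`, `constraints`, `bounds` and `guess` are likewise only illustrative.

```python
import numpy as np
import scipy.optimize

# toy problem (not the one from this question):
# min x^2 + y^2  s.t.  x + y = 1,  x >= 0.6,  0 <= x, y <= 1
def objective(x):
    return x[0]**2 + x[1]**2

# 'eq' functions must equal zero at a solution,
# 'ineq' functions must be non-negative
constraints = [{'type': 'eq', 'fun': lambda x: x[0] + x[1] - 1.0},
               {'type': 'ineq', 'fun': lambda x: x[0] - 0.6}]

# one (lower, upper) tuple per variable
bounds = [(0.0, 1.0), (0.0, 1.0)]

guess = np.array([0.9, 0.1])
res = scipy.optimize.minimize(objective, guess, method='SLSQP',
                              bounds=bounds, constraints=constraints)
print(res.x)
```

Because an `'ineq'` constraint is read as "function value $\ge 0$", a condition of the form $g(\mathbf{x}) \ge c$ has to be passed as $g(\mathbf{x}) - c$.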

In [None]:
# YOUR CODE HERE

In [None]:
assert np.allclose(x_sol, np.array([1. , 4.74299607, 3.82115466, 1.37940764]))

## Question 3
*Based on Judd (1998), chapter 4, question 2*. Consider an endowment economy with $m$ agents and $n$ goods. Throughout this question, *superscripts* will indicate agents, while *subscripts* will indicate goods.

Assume that agent $i$'s utility function over the $n$ goods is 

\begin{equation}
    u^{i}(\mathbf{x}^i) = \sum^{n}_{j = 1} a^i_j (x^i_j)^{v^i_j + 1} (1 + v^i_j)^{-1} 
\end{equation}

Suppose that agent $i$'s endowment of good $j$ is $e^i_j$. Assume that $a^i_j, e^i_j > 0$ and $v^i_j < 0$ (for $v^i_j =-1$, we replace $(x^i_j)^{v^i_j + 1} (1 + v^i_j)^{-1}$ with $\ln x^i_j$). 
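
To make the functional form concrete, the following sketch evaluates the per-good terms for a single agent, including the logarithmic case. The helper name `utility_terms` and the numbers passed to it are hypothetical and chosen only for illustration.

```python
import numpy as np

def utility_terms(x, a, v):
    """Per-good terms a * x**(v + 1) / (v + 1), with a * ln(x) where v == -1."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    v = np.asarray(v, dtype=float)
    is_log = np.isclose(v, -1.0)
    # placeholder exponent where v == -1, to avoid dividing by zero
    expo = np.where(is_log, 1.0, v + 1.0)
    power = a * x**expo / expo
    return np.where(is_log, a * np.log(x), power)

# hypothetical values for one agent and three goods
terms = utility_terms(x=[2.0, 4.0, np.e], a=[2.0, 1.5, 1.0], v=[-2.0, -0.5, -1.0])
print(terms, terms.sum())
```

For $v^i_j < -1$ the term is negative but still increasing in $x^i_j$, which is why the first entry above is negative.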

In this question, we use *numerical optimization* to solve for the outcome of the planning problem, in which the planner maximizes total welfare:

$$
    \max_{\left\{\mathbf{x}^i\right\}_{i = 1}^m} \sum_{i = 1}^m \lambda^i u^{i}(\mathbf{x}^i) 
$$
subject to the resource constraint:
<a id = 'rc'></a>
\begin{equation}
    \sum^{m}_{i = 1} x_j^i = \sum^{m}_{i = 1} e_j^i, \quad \forall j \tag{RC}
\end{equation}


where $\lambda^i$ is the social weight associated with agent $i$. We will consider a case with $m = 2$ (two agents) and $n = 3$ (three goods; note, however, that your code should be written in such a way that it works for any number of goods $n$!). Assume the following values for the parameters:

In [None]:
## read parameters
A = np.array([ [2.0, 1.5, 1.5],
               [1.5, 2.0, 1.5]])

V = np.array([ [-2.0, -0.5, -0.5],
               [-1.5, -0.5, -1.5]])

E = np.array([ [2.0, 3.0, 0.0],
               [1.0, 2.0, 4.0]])

The way to read these matrices is that an agent corresponds to a row and a good to a column. For example, agent 1's endowment of good 2, $e^1_2$, would be the element in the first row and second column of matrix **E**, and hence $e^1_2 = 3.0$.

(a) Write a function **objective_uncon** that takes as inputs a flat array **x1**, the arrays **V**, **A** and **E** given above, and a flat array **lam** containing the social weights $\lambda^1$ and $\lambda^2 = 1 - \lambda^1$, and returns the *negative* of total welfare as defined above (recall that we face a *maximization* problem here, while numerical optimizers minimize!).

**Hint**: A slightly tricky issue when answering this question using *unconstrained* numerical optimization methods is how to deal with the constraint that aggregate consumption of good $j$ must equal aggregate endowments, i.e.

\begin{equation}
    \sum^{m}_{i = 1} x_j^i = \sum^{m}_{i = 1} e_j^i
\end{equation}

With two agents, this can be easily addressed by maximizing over the consumption choice of agent 1, and then computing the consumption of agent 2 as the *residual*. Hence, for good $j$:

\begin{equation}
    x_j^2 = \sum^{2}_{i = 1} e^i_j - x_j^1
\end{equation}

Thus, the input in the **objective_uncon** function is a flat array of length $n$, in the example here of length 3, representing the vector $\mathbf{x}^1 = (x^1_1, x^1_2, x^1_3)$.
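
The residual step can be sketched as follows on made-up numbers (the arrays `E_toy` and `x1_toy` are hypothetical and unrelated to the parameter values above):

```python
import numpy as np

# hypothetical endowments: 2 agents (rows), 2 goods (columns)
E_toy = np.array([[1.0, 4.0],
                  [3.0, 2.0]])
# candidate consumption vector of agent 1
x1_toy = np.array([1.5, 2.5])

# aggregate endowment per good: sum over agents, i.e. over rows
total = E_toy.sum(axis=0)
# agent 2 consumes whatever agent 1 leaves
x2_toy = total - x1_toy
print(x2_toy)
```

Nothing in this substitution keeps the residual positive, so a starting point for which agent 1 consumes more than the aggregate endowment would make the utility of agent 2 undefined.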

In [None]:
def objective_uncon(x1, V, A, E, lam):
    """
    Implements the objective function for the social planner problem with two agents and n goods
    
    ((n,)np.array, (m,n)np.array, (m,n)np.array, (m,n)np.array, (m,)np.array ) -> float
    """
    ## get dimension of the problem 
    m, n = V.shape ## number of agents, number of goods

    ## check if the parameter matrices have the correct dimensions 
    assert A.shape == (m, n), "The dimensions of A and V must coincide!"
    assert E.shape == (m, n), "The dimensions of E and V must coincide!"
    assert len(lam) == m, "The length of lam is not consistent with the dimensions of V!"
    
    ## The remainder of the function below should include the following steps:
    ## 1. compute x2 for given x1 and E 
    ## 2. compute total welfare from x1, x2, lam, V and A
    
    
    # YOUR CODE HERE

In [None]:
x0 = np.array([2., 2., 2.])
lam1 = 0.5
lam = np.array([lam1, 1-lam1])
objective_uncon(x0, V, A, E, lam)

assert np.allclose(objective_uncon(x0, V, A, E, lam), -4.646082130477218) 

(b) Using SciPy's BFGS implementation, compute the solution to the planning problem. You should report a tuple **x_sol** that consists of the arrays **x1** and **x2**, the consumption vectors of agents 1 and 2, respectively.
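
As a reminder of the calling convention, here is a sketch that applies BFGS to an unrelated toy objective. The function `distance` and the vector `target` are hypothetical; the point is only how additional parameters are passed through **args**.

```python
import numpy as np
import scipy.optimize

# toy objective (not the planner's problem): squared distance to a target vector
def distance(x, target):
    return np.sum((x - target)**2)

target = np.array([1.0, -2.0, 0.5])
guess = np.zeros(3)

# parameters after x are handed to the objective through the args tuple
res = scipy.optimize.minimize(distance, guess, args=(target,), method='BFGS')
print(res.x)
```

The minimizer is stored in `res.x`; the remaining fields of the result object report, among other things, whether the algorithm converged.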

In [None]:
## b)
x0 = np.array([2., 2., 2.])
lam1 = 0.5
lam = np.array([lam1, 1-lam1])
# YOUR CODE HERE

In [None]:
assert np.allclose(x_sol,  (np.array([1.53646326, 1.79999756, 2.62120338]), np.array([1.46353674, 3.20000244, 1.37879662])) ) 

(c) Alternatively, we can use constrained optimization to solve the same problem. Write a function **objective_con** that takes as inputs a flat array **x**, the arrays **V**, **A** and **E** given above, and a flat array **lam**. This function is very similar to the one defined in question (a). The important difference is that we now maximize over the quantities of agent 1 and agent 2 simultaneously. Hence, the input **x** has length $2n$, here 6.
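
One possible way to recover the per-agent vectors from such a flat input is **reshape**, sketched below. The array `x_flat` is hypothetical, and the sketch assumes that the entries of agent 1 come first, followed by those of agent 2.

```python
import numpy as np

m, n = 2, 3                      # two agents, three goods
x_flat = np.arange(1.0, 7.0)     # hypothetical flat input of length m * n

# reshape returns an (m, n) array and leaves the shape of x_flat unchanged
X = x_flat.reshape((m, n))

# row i holds the n quantities of agent i + 1 (row-major order)
print(X[0], X[1])
```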

**Hint**: Depending on how you set up your function, the following property of NumPy's **resize** method may be useful (you do not have to use this to answer the question though!):

In [None]:
B = np.array(['a', 'b', 'c', 'd', 'e', 'f'])
B.resize((2,3))
print(B)

In [None]:
## c)

def objective_con(x, V, A, E, lam):
    """
    Computes the objective function for the m-by-n social planner problem
    
    ((m*n,)np.array, (m,n)np.array, (m,n)np.array, (m,n)np.array, (m,)np.array ) -> float
    """
    ## get dimension of the problem 
    m, n = V.shape ## number of agents, number of goods

    ## check if the parameter matrices have the correct dimensions 
    assert A.shape == (m, n), "The dimensions of A and V must coincide!"
    assert E.shape == (m, n), "The dimensions of E and V must coincide!"
    assert len(lam) == m, "The length of lam is not consistent with the dimensions of V!"
    
    # YOUR CODE HERE


In [None]:
# THIS IS A TEST CELL!

(d) Below, I give you a function **resource**, which is used to define the resource constraint [RC](#rc) in the **constr** list. Using this list, solve the constrained optimization problem (remember to define bounds for the variables!). Report your result in a flat NumPy array called **x_sol2**.

In [None]:
def resource(x, V, A, E, lam):
    """
    Reshapes the solution vector x into an m-by-n consumption matrix and returns, for each good,
    aggregate endowment minus aggregate consumption (zero if the resource constraint holds)
    """
    ## get dimension of the endowment matrix
    m, n = E.shape
    ## reshape x into matrix X 
    X = x.copy()
    X.resize((m, n))
    ## compute and return residual consumption
    return E.sum(axis = 0) - X.sum(axis = 0)

constr = [{'type': 'eq', 'fun': resource, 'args': (V, A, E, lam)}]

In [None]:
x0 = np.array([2., 2., 2., 1., 3., 2.])
lam1 = 0.5
lam = np.array([lam1, 1-lam1]) 
# YOUR CODE HERE

In [None]:
assert np.allclose(x_sol2, np.array([1.53677822, 1.79949481, 2.62096144, 1.46322178, 3.20050519, 1.37903856])) 