# Data with Pandas

1. Load the cleaned data. 

2. Remove the column 'last_pymnt_amnt'.

3. Add a column 'yearly_loan_amnt' showing the YEARLY loan amount for each loan. (Hint: think of what 'term' is)

4. Add a column 'yearly_interest_payable' showing the amount of interest payable at the end of each year. (Hint: int_rate is the yearly interest rate)

5. Add a column 'total_payable' which is the total loan amount plus all the yearly interest the borrower has to pay over the loan term.

6. Bonus question: What amount did the borrowers in each grade pay to Lendmark, including interest? (Assume that defaulted loans did not pay anything.)
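The column operations in steps 2 to 5 can be sketched on a toy frame. This is only one possible reading of the tasks: the values are made up, and it is an assumption that 'term' is stored as a number of months and that 'int_rate' is a yearly rate in percent. Check both against the real data before reusing the pattern.

```python
import pandas as pd

# Toy data with hypothetical values; the real frame comes from Lendmark_clean.pkl.
toy = pd.DataFrame({
    "loan_amnt": [12000.0, 6000.0],
    "term": [36, 60],                   # ASSUMPTION: term is a number of months
    "int_rate": [10.0, 5.0],            # ASSUMPTION: yearly rate in percent
    "last_pymnt_amnt": [100.0, 200.0],
})

# Remove a column by name.
toy = toy.drop(columns=["last_pymnt_amnt"])

# Loan term expressed in years.
years = toy["term"] / 12

# Principal spread evenly over the years of the term.
toy["yearly_loan_amnt"] = toy["loan_amnt"] / years

# Simple yearly interest on the full principal (one possible interpretation).
toy["yearly_interest_payable"] = toy["loan_amnt"] * toy["int_rate"] / 100

# Principal plus the interest of every year of the term.
toy["total_payable"] = toy["loan_amnt"] + toy["yearly_interest_payable"] * years
```

All derived columns are computed with vectorised arithmetic on whole columns, so no explicit loop over the rows is needed.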

In [1]:
##
# Load data. No changes needed.
##
import pandas as pd
data = pd.read_pickle('Lendmark_clean.pkl')

**Your solution below:**

# Loan default prediction

There is a dataset with the following target distribution:
* Target value 0: 5% of the cases
* Target value 1: 95% of the cases


1. Create a confusion matrix by hand for a model that predicts value 0 for every observation.
2. What would be the accuracy of the above model?
3. Explain the difference between the following performance metrics: Accuracy, Area under ROC curve (AUC)
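A hand-built confusion matrix can be cross-checked with a few lines of code. The sketch below uses a hypothetical 20% / 80% split (not the distribution of this exercise) and a model that always predicts 0. The helper name `confusion_counts` is illustrative only.

```python
from collections import Counter

def confusion_counts(actual, predicted):
    """Count (actual, predicted) pairs; each pair is one confusion-matrix cell."""
    return Counter(zip(actual, predicted))

# Hypothetical sample: 20 cases of class 0 and 80 cases of class 1.
actual = [0] * 20 + [1] * 80
# A model that predicts 0 for every observation.
predicted = [0] * len(actual)

counts = confusion_counts(actual, predicted)

# Accuracy is the share of observations on the diagonal (actual == predicted).
accuracy = sum(n for (a, p), n in counts.items() if a == p) / len(actual)

for a in (0, 1):
    print(f"actual {a}:", [counts[(a, p)] for p in (0, 1)])
print("accuracy:", accuracy)
```

In this toy case the accuracy equals the share of class 0, because only the class-0 cases are predicted correctly.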

**Your answers below:**

# Binomial tree

Construct a 2-step binomial tree model in which the underlying is an interest rate, $\ r\ (t)\ $. The three layers of the binomial tree are numbered: $\ i=0\ $ denotes the initial node, $\ i=1\ $ denotes the two nodes after the first time step, and $\ i=2\ $ denotes the three nodes after the second time step. At $\ t=0\ $ the interest rate is $\ r_{0} = 3\%\ $. When we step forward in time, the interest rate can change to either $\ u=1.25\ $ times its previous value or $\ d=0.8\ $ times its previous value. For both steps the 1-year risk-free rate is $\ R=3\%\ $. At time $\ 0\ $, what is the fair value of a zero coupon bond (ZCB) that has a notional of $\ N=\$100\ $ and a maturity of $\ T=2\ $? In other words, what is the present value of $\ \$100\ $ that we receive after the second step?

## Suggested steps
1. Based on $\ r_{0}\ $, $\ u\ $ and $\ d\ $: calculate the interest rate values at all three end points of the tree.
2. Based on $\ r_{0}\ $, $\ u\ $, $\ d\ $ and $\ R\ $: calculate $\ p\ $, the probability of stepping up.
3. Place $\ \$100\ $ at each of the three end points.
4. The probability of arriving at the three end points (from top to bottom) is $\ p^{\ 2}\ $, $\ 2\ p\ (1-p)\ $ and $\ (1-p)^{\ 2}\ $.
5. Discount values back to the original node.
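Steps 1 and 4 follow a general pattern that can be sketched for any layer of a recombining tree. The helper names and the numbers below ($x_0 = 0.05$, $u = 1.1$, $d = 0.9$, $p = 0.5$) are hypothetical and chosen only for illustration; in particular, $p = 0.5$ is a placeholder, not the value asked for in step 2.

```python
from math import comb

def lattice_values(x0, u, d, layer):
    """Node values of a layer, top to bottom: k counts the down moves."""
    return [x0 * u ** (layer - k) * d ** k for k in range(layer + 1)]

def arrival_probabilities(p, layer):
    """Probability of reaching each node of a layer, top to bottom.

    A node with k down moves is reached by comb(layer, k) different paths,
    each with probability p**(layer - k) * (1 - p)**k.
    """
    return [comb(layer, k) * p ** (layer - k) * (1 - p) ** k
            for k in range(layer + 1)]

# Hypothetical inputs, not the data of the exercise.
rates = lattice_values(0.05, 1.1, 0.9, layer=2)
probs = arrival_probabilities(0.5, layer=2)
print(rates)
print(probs, sum(probs))
```

For `layer=2` the probabilities reduce to $p^2$, $2p(1-p)$ and $(1-p)^2$, and they sum to 1 for any $p$ between 0 and 1.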

In [None]:
### --- Insert your code at "WRITE YOUR CODE HERE" ---

# importing libraries
import numpy as np

# 'BDT' stands for Black-Derman-Toy, the name of the related model
def ZCB_BDT_2(r0,R,u,d,N):
    # pricing a zero coupon bond (ZCB) with a two-step binomial tree (Black-Derman-Toy model)
    
    r0 = float(r0) # initial rate
    R = float(R) # risk-free rate, for example, 1% is R=0.01
    u = float(u) # ratio for stepping up
    d = float(d) # ratio for stepping down
    N = float(N) # notional
    
    # --- WRITE YOUR CODE HERE: interest rate values at the two nodes after the first step ---
    r_1_top = r0 * u # top node
    r_1_bot =        # bottom node
    
    # --- WRITE YOUR CODE HERE: interest rate values at the three nodes after the second step ---
    r_2_top = r0 * u * u # top node
    r_2_mid =            # middle node
    r_2_bot =            # bottom node
    
    # --- WRITE YOUR CODE HERE: calculate p ---
    p = 
    
    # --- WRITE YOUR CODE HERE: probability of arriving at the two nodes after the first step ---
    p_1_top =      # top node
    p_1_bot =      # bottom node

    # --- WRITE YOUR CODE HERE: probability of arriving at each of the final nodes ---
    p_2_top =               # top node
    p_2_mid =               # middle node
    p_2_bot =               # bottom node
    
    # --- WRITE YOUR CODE HERE: fair value at each of the three final nodes ---
    f_2_top = N             # top node
    f_2_mid =               # middle node
    f_2_bot =               # bottom node
    
    # --- WRITE YOUR CODE HERE: from the 3rd layer discount to the 2nd layer ---
    f_1_top = ( p * f_2_top + (1-p) * f_2_mid ) / ( 1 + r_1_top )
    f_1_bot =

    # --- WRITE YOUR CODE HERE: from the 2nd layer discount to the 1st layer ---
    f = 
    
    # final value: fair value of the $100 received at time t=2
    return f

# Print your solution
ZCB_BDT_2(0.03,0.03,1.25,0.8,100)

# American call option pricing

Construct a 2-step binomial tree model for an American call option on a single share of Microsoft stock. Assume that the spot price is $\ S_{\ 0}=\$100\ $. Both steps correspond to a time length of 1 year. The call option matures after the second time step and the strike price is $\ K=\$99\ $. When we step forward in time along the tree, the price of the underlying can change to either $\ u=1.05\ $ times its previous value or $\ d=1/u\ $ times its previous value. Assume that the risk-free rate is constant $\ R=2\%\ $.

What is the fair value of this option?

## Suggested steps
Calculate 
1. the value of the underlying at each node of the first ($i=1$) layer and the second ($i=2$) layer,
2. the parameter $\ p\ $ (probability of moving up) of the binomial tree model,
3. the value of the call option at each node of the 2nd layer,
4. the value of the call option at each node of the 1st layer,
5. the value of the call option at the starting node (the node of the 0th layer).

Note that the definition of the American call option implies that in this model the option can be exercised not only at the nodes of the 2nd layer (at maturity), but at every node inside the tree. At each node, decide whether to keep or exercise the option by calculating which of the two choices gives the higher value.
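The exercise-or-keep decision can be sketched as a generic backward induction. This is a sketch under assumptions, not the reference solution: it assumes the usual risk-neutral up-probability $p = (1 + R - d)/(u - d)$ with simple per-step discounting by $1/(1+R)$, and the function name and the inputs in the usage lines are hypothetical.

```python
def binomial_call(S0, K, R, u, d, steps, american=True):
    """Value of a call on a recombining binomial tree by backward induction."""
    # ASSUMPTION: risk-neutral probability for simple compounding per step.
    p = (1.0 + R - d) / (u - d)

    # Payoffs at maturity, top to bottom; k counts the down moves.
    values = [max(0.0, S0 * u ** (steps - k) * d ** k - K)
              for k in range(steps + 1)]

    # Walk back layer by layer to the root.
    for i in range(steps - 1, -1, -1):
        layer = []
        for k in range(i + 1):
            # Value of keeping the option: discounted expected value.
            keep = (p * values[k] + (1.0 - p) * values[k + 1]) / (1.0 + R)
            if american:
                # Value of exercising right now at this node.
                exercise = S0 * u ** (i - k) * d ** k - K
                keep = max(keep, exercise)
            layer.append(keep)
        values = layer
    return values[0]

# Hypothetical inputs, not the data of the exercise.
am = binomial_call(50.0, 50.0, 0.01, 1.1, 1 / 1.1, steps=2, american=True)
eu = binomial_call(50.0, 50.0, 0.01, 1.1, 1 / 1.1, steps=2, american=False)
print(am, eu)
```

With a non-negative rate and no dividends, the early-exercise check never changes the result for a call, so the American and European values coincide in this sketch.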

In [None]:
### --- Insert your code at "WRITE YOUR CODE HERE" ---

# importing libraries
import numpy as np

def AmCall(S0,R,u,K):
    # pricing an ATM American call option on an underlying stock
    
    # --- WRITE YOUR CODE HERE: convert all four input parameters from string to float ---
    S0 = float(S0) # initial spot price
    R = float(R)   # risk-free rate, for example, 1% is R=0.01
    u = float(u)   # ratio for stepping up
    d =            # ratio for stepping down
    N = 1.0        # notional
    T =            # time length of single step along the tree
    K = float(K)   # strike price
    
    # --- WRITE YOUR CODE HERE: value of the underlying at the two nodes after the first step ---
    S_1_top =        # top node
    S_1_bot =        # bottom node
    
    # --- WRITE YOUR CODE HERE: value of the underlying at the three nodes after the second step ---
    S_2_top = S0 * u * u # top node
    S_2_mid =            # middle node
    S_2_bot =            # bottom node
    
    # --- WRITE YOUR CODE HERE: p (probability of stepping up) ---
    p =
    
    # --- WRITE YOUR CODE HERE: probability of arriving at the two nodes after the first step ---
    p_1_top =     # top node
    p_1_bot =     # bottom node

    # --- WRITE YOUR CODE HERE: probability of arriving at each of the final nodes ---
    p_2_top =               # top node
    p_2_mid =               # middle node
    p_2_bot =               # bottom node
    
    # --- WRITE YOUR CODE HERE: value of the American call option at the three final nodes in layer i=2 ---
    f_2_top = np.maximum( 0.0, S_2_top - K ) # top node
    f_2_mid =                                # middle node
    f_2_bot =                                # bottom node
    
    # --- WRITE YOUR CODE HERE: value of the call option: from the i=2 layer calculate the i=1 layer ---
    # --- maximum of the following two values: (1) exercising (2) discounted from the i=2 layer ---
    f_1_top = np.maximum( S_1_top - K, ( p * f_2_top + (1-p) * f_2_mid ) / ( 1 + R ) )
    f_1_bot =

    # --- WRITE YOUR CODE HERE: American ATM call option: from the i=1 layer discount to the i=0 layer ---
    # --- maximum of these two: (1) immediately exercising the option (2) discounting from the i=1 layer ---
    f = 
    
    # final value: fair value of the American call option at time t=0
    return f

# Print your solution
AmCall(100,0.02,1.05,99)

# Monte Carlo

## 1.
When pricing an option, how can we decide whether to use the Black-Scholes formula or a Monte Carlo simulation? (Please answer with text.)

## 2.

Consider the EUR/HUF currency pair and assume that its price process is driven by the following stochastic differential equation:
$$S(t+\Delta t) - S(t) = \mu S(t) \Delta t + \sigma S(t) \epsilon \sqrt{\Delta t} ,$$
where $\epsilon$ is a random sample from a standard normal distribution. Assume that:
- $S(0) = 320$;
- $\mu = 0.02$;
- $\sigma = 0.05$.

a) Use Monte Carlo simulation to estimate the probability that the euro costs less than 310 HUF at any time in the following year (52 weeks).

I.e. $P(S(t) < 310, 0 < t \leq T) = ?$ if $T=1$.

b) What is the 95% confidence interval of this estimation?
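A simulation of this kind can be sketched with NumPy. The function name, the default number of paths and the inputs in the usage line are hypothetical, and the interval is the normal-approximation (Wald) interval for a proportion, which is one common choice and an assumption here.

```python
import numpy as np

def barrier_hit_probability(s0, mu, sigma, barrier, n_steps=52,
                            horizon=1.0, n_paths=20_000, seed=0):
    """Estimate P(min over the steps of S(t) < barrier) and a 95% interval."""
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps

    # One standard normal draw per path and per time step.
    eps = rng.standard_normal((n_paths, n_steps))

    # S(t + dt) = S(t) * (1 + mu * dt + sigma * eps * sqrt(dt))
    growth = 1.0 + mu * dt + sigma * np.sqrt(dt) * eps
    paths = s0 * np.cumprod(growth, axis=1)

    # A path counts as a hit if it is below the barrier at any step.
    hit = paths.min(axis=1) < barrier
    p_hat = float(hit.mean())

    # ASSUMPTION: normal approximation for the standard error of a proportion.
    half_width = 1.96 * np.sqrt(p_hat * (1.0 - p_hat) / n_paths)
    return p_hat, (p_hat - half_width, p_hat + half_width)

# Hypothetical inputs, not the data of the exercise.
p_hat, (low, high) = barrier_hit_probability(100.0, 0.0, 0.2, 90.0, n_paths=5000)
print(p_hat, low, high)
```

The interval narrows in proportion to the inverse square root of `n_paths`, so four times as many paths roughly halve its width. Because the path is only checked at the simulated steps, crossings between two steps are not detected.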

# Trade compression

## 1.

When doing trade compression with linear programming, why is it better to minimize the notional instead of the number of trades? (Please answer with text.)

## 2.

Imagine you are the manager of a Scandinavian furniture factory, where each piece of furniture is delivered as a package of standardized parts. Suppose you can make the following products from your inventory:

* TV stand: 2 screws and 5 furniture panels are needed, \$40 profit per piece
* Coffee table: 6 screws and 4 furniture panels are needed, \$55 profit per piece

The current inventory consists of 150 screws and 155 furniture panels.

How many TV stands and coffee tables should you package to maximize profits?
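For a problem with two products and small inventories, the optimal mix can be found by enumerating all whole-number combinations. The sketch below swaps in this brute-force search for a linear programming solver. The function name and the toy numbers are hypothetical and are not the data of the exercise.

```python
def best_mix(usage_a, usage_b, profit_a, profit_b, stock):
    """Return (units_a, units_b, profit) with the highest feasible profit.

    usage_a and usage_b list how much of each resource one unit consumes;
    stock lists the available amount of each resource, in the same order.
    ASSUMPTION: each product uses at least one resource, inputs are integers.
    """
    # Upper bound for each product if it were produced alone.
    max_a = min(s // u for s, u in zip(stock, usage_a) if u > 0)
    max_b = min(s // u for s, u in zip(stock, usage_b) if u > 0)

    best = (0, 0, 0)
    for a in range(max_a + 1):
        for b in range(max_b + 1):
            # Every resource constraint has to hold.
            feasible = all(a * ua + b * ub <= s
                           for ua, ub, s in zip(usage_a, usage_b, stock))
            profit = a * profit_a + b * profit_b
            if feasible and profit > best[2]:
                best = (a, b, profit)
    return best

# Toy problem: maximise 3a + 4b subject to a + 2b <= 14 and 3a + b <= 17.
print(best_mix((1, 3), (2, 1), 3, 4, (14, 17)))  # → (4, 5, 32)
```

The enumeration checks at most `(max_a + 1) * (max_b + 1)` combinations, which is fine for small inventories; larger problems are better served by a proper linear or integer programming solver.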