In [76]:
%pylab inline
import pandas as pd
import numpy as np
import fmt
from scipy.stats import norm

Populating the interactive namespace from numpy and matplotlib


`%matplotlib` prevents importing * from pylab and numpy


# Homework Set 7

This homework prices a [synthetic CDO](https://en.wikipedia.org/wiki/Synthetic_CDO) using the one-factor Gaussian copula model.

A synthetic CDO references a portfolio of $n$ CDS. The total loss of the portfolio is defined as:

$$ l(t) = \sum_{i=1}^n w_i \tilde {\mathbb{1}}_i(t) (1-r_i(t)) $$

where $w_i$ and $r_i(t)$ are the notional weight and recovery rate of the i-th name in the portfolio. The notional weights sum to 1: $\sum_i w_i = 1 $. $ \tilde {\mathbb{1}}_i(t) $ is the default indicator, equal to 1 if the i-th name has defaulted before time $t$; the default probability is therefore $p_i(t) = \mathbb E[\tilde {\mathbb{1}}_i(t) ]$.

For the purpose of this homework, we consider a simplified synthetic CDO that has no coupon payments. The PV of a \$1 notional synthetic CDO tranche with maturity $t$, attachment $a$ and detachment $d$ is therefore:

$$ v(a, d) = \frac{d(t)}{d-a} \min\left((l(t) - a)^+, d-a\right) $$

where $d(t)$ is the discount factor.
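
As a quick numerical check of the loss and tranche PV formulas, the sketch below evaluates them on a toy portfolio. The four names, their default pattern, the tranche boundaries and the helper name `tranche_pv` are made-up values for illustration only; they are not part of the homework setup.

```python
import numpy as np

# hypothetical toy portfolio: 4 equally weighted names, 40% recovery each
w = np.full(4, 0.25)
rec = np.full(4, 0.4)
defaulted = np.array([1, 0, 1, 0])   # default indicators at time t

# l(t) = sum_i w_i * 1_i(t) * (1 - r_i)
loss = np.sum(w * defaulted * (1 - rec))

def tranche_pv(loss, a, d, discf):
    """PV of $1 notional tranches [a, d] given the portfolio loss."""
    return discf / (d - a) * np.minimum(np.maximum(loss - a, 0.0), d - a)

# hypothetical tranches, paired element-wise
a = np.array([0.0, 0.2, 0.5])
d = np.array([0.2, 0.5, 1.0])
pv = tranche_pv(loss, a, d, discf=0.9)
print(loss, pv)
```

With a loss of 0.3, the first tranche is wiped out (PV equal to the discount factor), the second loses one third of its notional, and the third is untouched.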

The following are the parameters of the synthetic CDO and a straightforward Monte Carlo pricer:

In [55]:
n = 125
t = 5.
defProbs = 1 - np.exp(-(np.random.uniform(size=n)*.03)*t)
recovery = 0.4*np.ones(n)
w = 1./n*np.ones(n)
rho = 0.5
discf = .9
npath = 1000

# lists of attachments and detachments, paired element-wise
attachements = np.array([0, .03, .07, .1, .15, .3])
detachements = np.array([.03, .07, .1, .15, .3, .6])

#portfolio expected loss
el = np.sum(w*defProbs*(1-recovery))
print "portfolio expected loss is ", el

portfolio expected loss is  0.0418900853308


In [60]:
from scipy.stats import norm

class CDO(object) :
    def __init__(self, w, defProbs, recovery, a, d) :
        self.w = w/np.sum(w)
        self.p = defProbs
        self.rec = recovery
        self.a = a
        self.d = d

    def drawDefaultIndicator(self, z, rho) :
        '''return an array of default indicators given common factor z, using the one-factor Gaussian copula
        '''
        e = np.random.normal(size=np.shape(self.p))  # idiosyncratic factors, one per name
        x = z*np.sqrt(rho) + np.sqrt(1-rho)*e  # latent variables with pairwise correlation rho
        return np.less(norm.cdf(x), self.p)

    def portfolioLoss(self, defIndicator) :
        '''compute portfolio loss given default indicators'''
        return np.sum(defIndicator*self.w*(1-self.rec))

    def tranchePV(self, portfLoss, discf) :
        '''compute tranche PV from portfolio loss
        Args:
            portfLoss: the total portfolio loss
            discf: discount factor
        Returns:
            tranche PVs'''
        
        sz = self.d - self.a
        return discf/sz*np.minimum(np.maximum(portfLoss - self.a, 0), sz)

    def drawPV(self, z, rho, discf) :
        ''' compute PV and portfolio Loss conditioned on a common factor z'''
        di = self.drawDefaultIndicator(z, rho)
        pfLoss = self.portfolioLoss(di)
        return self.tranchePV(pfLoss, discf), pfLoss
    
    
cdo = CDO(w, defProbs, recovery, attachements, detachements)

In [61]:
## price the tranches using simulation
def simCDO(cdo, rho, disc, paths) :
    zs = np.random.normal(size=[paths])  # one common factor draw per path
    pv = np.zeros(np.shape(cdo.a))   # running sum of tranche PVs
    pv2 = np.zeros(np.shape(cdo.d))  # running sum of squared tranche PVs
    for z in zs:
        thisPV, _ = cdo.drawPV(z, rho, disc)
        pv += thisPV
        pv2 += thisPV*thisPV
        
    v = pv/paths
    var = pv2/paths - v**2  # per-path variance of the tranche PV
    return v, np.sqrt(var/paths), var

In [62]:
pv_0, err_0, var_0 = simCDO(cdo, rho, discf, npath)
df = pd.DataFrame(np.array([cdo.a, cdo.d, pv_0, err_0, var_0]), index=['Attach', 'Detach', 'PV', 'MC err', 'Var'])

fmt.displayDF(df, fmt='4g')

| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Attach | 0.0 | 0.03 | 0.07 | 0.1 | 0.15 | 0.3 |
| Detach | 0.03 | 0.07 | 0.1 | 0.15 | 0.3 | 0.6 |
| PV | 0.442 | 0.2217 | 0.1387 | 0.1003 | 0.03504 | 0.001728 |
| MC err | 0.01217 | 0.01137 | 0.009989 | 0.008639 | 0.004538 | 0.0006412 |
| Var | 0.1481 | 0.1292 | 0.09977 | 0.07463 | 0.02059 | 0.0004111 |


## Problem 1

Modify the simCDO function to implement the variance reduction techniques listed below, and show whether each technique is effective.

For this homework, we only apply variance reduction to the common market factor $z$. Do not change the random numbers $e$ that are drawn within the drawDefaultIndicator function, i.e., only modify the simCDO code; re-use but do not modify the CDO class. Unless explicitly mentioned, keep the number of simulation paths the same as in the base case above.

1. antithetic variates: reduce the number of paths by half to account for the 2x increase in computation
1. importance sampling: shift $z$ by -1
1. Sobol sequence
1. stratified sampling: sample $z$ using an equal-sized grid

Compute the **variance** reduction factor for each technique, and comment on the effectiveness of these variance reduction techniques.

## Solution:
#### 1.

In [71]:
def simCDO_1(cdo, rho, disc, paths) :
    zs = np.random.normal(size=[paths])
    pv = np.zeros(np.shape(cdo.a))
    pv2 = np.zeros(np.shape(cdo.d))
    for z in zs:
        # antithetic pair: price with z and with -z, then average the two PVs
        thisPV1, _ = cdo.drawPV(z, rho, disc)
        thisPV2, _ = cdo.drawPV(-z, rho, disc)
        thisPV = 0.5*(thisPV1 + thisPV2)
        pv += thisPV
        pv2 += thisPV*thisPV
        
    v = pv/paths
    var = pv2/paths - v**2  # variance of the pair average, not of a single path
    return v, np.sqrt(var/paths), var

In [72]:
pv_1, err_1, var_1 = simCDO_1(cdo, rho, discf, npath//2)
df = pd.DataFrame(np.array([cdo.a, cdo.d, pv_1, err_1, var_1]), index=['Attach', 'Detach', 'PV', 'MC err', 'Var'])

fmt.displayDF(df, fmt='4g')

| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Attach | 0.0 | 0.03 | 0.07 | 0.1 | 0.15 | 0.3 |
| Detach | 0.03 | 0.07 | 0.1 | 0.15 | 0.3 | 0.6 |
| PV | 0.4608 | 0.2312 | 0.1402 | 0.08302 | 0.03128 | 0.001613 |
| MC err | 0.004404 | 0.009073 | 0.008816 | 0.007285 | 0.004347 | 0.0007046 |
| Var | 0.009697 | 0.04116 | 0.03886 | 0.02654 | 0.00945 | 0.0002482 |


In [73]:
print " variance reduction factor:", var_0/var_1

 variance reduction factor: [ 15.27515584   3.13876415   2.56735642   2.81218782   2.17911876
   1.65636832]
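
Antithetic variates help when the payoff is monotone in $z$, because the two legs of each pair are then negatively correlated. The sketch below illustrates this on a made-up monotone payoff `f` (an assumption, not the tranche PV itself) and compares the two estimators at equal cost: $n$ plain evaluations against $n/2$ antithetic pairs. Note that this equal-cost comparison divides each per-sample variance by its own sample count, which is not the same as comparing the per-sample variances directly.

```python
import numpy as np

rng = np.random.RandomState(0)

def f(z):
    # hypothetical payoff, monotone decreasing in z (losses rise as z falls)
    return np.minimum(np.exp(-z), 5.0)

n = 20000
z = rng.normal(size=n)

plain = f(z)                                   # n evaluations
half = z[:n // 2]
anti = 0.5 * (f(half) + f(-half))              # n/2 pairs, also n evaluations

# variance of each estimator of E[f(Z)] at equal computational cost
var_plain = plain.var() / n
var_anti = anti.var() / (n // 2)
print("equal-cost variance reduction factor:", var_plain / var_anti)
```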


#### 2.

In [65]:
def simCDO_2(cdo, rho, disc, paths) :
    # NOTE: u = 1 shifts z by +1, whereas the problem asks for a shift of -1 (u = -1.);
    # defaults occur at low z, so a positive shift samples fewer losses
    u = 1.
    xs_q = np.random.normal(size=[paths]) # standard normal draws
    xs_p = xs_q + u # shifted draws, distributed as N(u, 1)
    zs = np.exp(-u*xs_p + .5*u*u) # likelihood ratio (R-N derivative) phi(x)/phi(x-u)
    
    pv = np.zeros(np.shape(cdo.a))
    pv2 = np.zeros(np.shape(cdo.d))
    for x, lr in zip(xs_p, zs):
        thisPV, _ = cdo.drawPV(x, rho, disc)
        thisPV *= lr  # reweight back to the original measure
        pv += thisPV
        pv2 += thisPV*thisPV
        
    v = pv/paths
    var = pv2/paths - v**2
    return v, np.sqrt(var/paths), var

In [66]:
pv_2, err_2, var_2 = simCDO_2(cdo, rho, discf, npath)
df = pd.DataFrame(np.array([cdo.a, cdo.d, pv_2, err_2, var_2]), index=['Attach', 'Detach', 'PV', 'MC err', 'Var'])

fmt.displayDF(df, fmt='4g')

| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Attach | 0.0 | 0.03 | 0.07 | 0.1 | 0.15 | 0.3 |
| Detach | 0.03 | 0.07 | 0.1 | 0.15 | 0.3 | 0.6 |
| PV | 0.4975 | 0.2674 | 0.1694 | 0.09264 | 0.03165 | 0.001284 |
| MC err | 0.03871 | 0.03519 | 0.03107 | 0.025 | 0.01434 | 0.001283 |
| Var | 1.499 | 1.238 | 0.965 | 0.625 | 0.2057 | 0.001646 |


In [67]:
print " variance reduction factor:", var_0/var_2

 variance reduction factor: [ 0.09883645  0.10434466  0.10338973  0.11941929  0.10012345  0.24969943]
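
The direction of the shift matters. In the CDO class a name defaults when its latent variable is low, so large losses come from negative $z$. The sketch below uses a made-up rare event, $P(Z < -2)$, as a stand-in for a senior tranche loss (an assumption for illustration) and compares a shift of -1 against a shift of +1 using the same likelihood ratio $\exp(-\mu x + \mu^2/2)$.

```python
import numpy as np

rng = np.random.RandomState(0)
n = 100000
g = rng.normal(size=n)

def is_samples(mu):
    """Per-path importance sampling terms for P(Z < -2), sampling from N(mu, 1)."""
    x = g + mu                                # draws from the shifted density
    lr = np.exp(-mu * x + 0.5 * mu * mu)      # phi(x) / phi(x - mu)
    return (x < -2.0) * lr

plain = is_samples(0.0)    # no shift: ordinary Monte Carlo
down = is_samples(-1.0)    # shift toward the rare event
up = is_samples(1.0)       # shift away from the rare event

print("estimates:", plain.mean(), down.mean(), up.mean())
print("variance reduction factors:", plain.var() / down.var(), plain.var() / up.var())
```

In this toy case the shift of -1 gives a factor above 1 and the shift of +1 gives a factor well below 1, which is qualitatively the pattern seen in the results above for `u = 1.`.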


#### 3.

In [77]:
import sobol

In [165]:
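e = np.zeros(1000)
for i in range(1, 26):
    # as used here, each call yields one 40-dimensional Sobol point; 25 points fill the 1000 slots
    a, _ = sobol.i4_sobol(40, i+100)
    e[(i-1)*40:i*40] = norm.ppf(a)  # map uniforms to standard normals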

In [166]:
def simCDO_3(cdo, rho, disc, paths, e) :
    zs = e  # pre-generated quasi-random normals; len(e) should equal paths
    pv = np.zeros(np.shape(cdo.a))
    pv2 = np.zeros(np.shape(cdo.d))
    for z in zs:
        thisPV, _ = cdo.drawPV(z, rho, disc)
        pv += thisPV
        pv2 += thisPV*thisPV
        
    v = pv/paths
    var = pv2/paths - v**2
    return v, np.sqrt(var/paths), var

In [167]:
pv_3, err_3, var_3 = simCDO_3(cdo, rho, discf, npath, e)
df = pd.DataFrame(np.array([cdo.a, cdo.d, pv_3, err_3, var_3]), index=['Attach', 'Detach', 'PV', 'MC err', 'Var'])

fmt.displayDF(df, fmt='4g')

| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Attach | 0.0 | 0.03 | 0.07 | 0.1 | 0.15 | 0.3 |
| Detach | 0.03 | 0.07 | 0.1 | 0.15 | 0.3 | 0.6 |
| PV | 0.4557 | 0.2402 | 0.1387 | 0.09262 | 0.03408 | 0.002455 |
| MC err | 0.01233 | 0.01167 | 0.00986 | 0.008266 | 0.004673 | 0.0006217 |
| Var | 0.1521 | 0.1363 | 0.09722 | 0.06832 | 0.02184 | 0.0003865 |


In [168]:
print " variance reduction factor:", var_0/var_3

 variance reduction factor: [ 0.973985    0.94787893  1.02623364  1.09232528  0.94299882  1.06380938]
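
A factor close to 1 is expected from the formula `pv2/paths - v**2`: it measures the spread of the individual path PVs, which is roughly the same for any point set that covers the distribution of $z$ well. One way to see the benefit of a low-discrepancy sequence is to measure the variance of the estimate itself across independent randomized replications. The sketch below does this on a toy problem. The base-2 van der Corput sequence, the random shift and the integrand `f` are assumptions chosen for illustration; they stand in for the Sobol points and the tranche PV. In the CDO problem the idiosyncratic draws stay random, so the gain would be smaller than here.

```python
import numpy as np

def van_der_corput(n):
    """First n points of the base-2 van der Corput sequence in [0, 1)."""
    pts = np.zeros(n)
    for i in range(n):
        k, denom = i + 1, 1.0
        while k > 0:
            denom *= 2.0
            pts[i] += (k % 2) / denom
            k //= 2
    return pts

def f(u):
    # hypothetical smooth integrand on [0, 1]
    return np.exp(u)

rng = np.random.RandomState(0)
n, reps = 1024, 200
base = van_der_corput(n)

# each replication applies an independent random shift (mod 1) to the same point set
qmc_est = np.array([f((base + rng.uniform()) % 1.0).mean() for _ in range(reps)])
mc_est = np.array([f(rng.uniform(size=n)).mean() for _ in range(reps)])

print("variance reduction factor of the estimate:", mc_est.var() / qmc_est.var())
```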


#### 4.

In [169]:
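def stratify(u, bs, shuffle) :
    '''map uniforms u to stratified uniforms, cycling through the bin indices bs'''
    b = len(bs)
    r = len(u)//b + 1  # number of passes over the bins needed to cover u
    sb = []
    
    for i in range(r) :
        if shuffle :
            np.random.shuffle(bs)
        sb = sb + bs.tolist()
            
    return [1.*(i + x)/b for x, i in zip(u, sb)]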

In [174]:
def simCDO_4(cdo, rho, disc, paths, e) :
    # e is unused; it is kept so the call signature matches simCDO_3
    bs = np.arange(100)  # 100 equal-probability strata
    u = np.random.uniform(size=[paths])
    us = stratify(u, bs, False)  # stratified uniforms in [0, 1)
    zs = norm.ppf(us)
    pv = np.zeros(np.shape(cdo.a))
    pv2 = np.zeros(np.shape(cdo.d))
    for z in zs:
        thisPV, _ = cdo.drawPV(z, rho, disc)
        pv += thisPV
        pv2 += thisPV*thisPV
        
    v = pv/paths
    var = pv2/paths - v**2
    return v, np.sqrt(var/paths), var

In [175]:
pv_4, err_4, var_4 = simCDO_4(cdo, rho, discf, npath, e)
df = pd.DataFrame(np.array([cdo.a, cdo.d, pv_4, err_4, var_4]), index=['Attach', 'Detach', 'PV', 'MC err', 'Var'])

fmt.displayDF(df, fmt='4g')

| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Attach | 0.0 | 0.03 | 0.07 | 0.1 | 0.15 | 0.3 |
| Detach | 0.03 | 0.07 | 0.1 | 0.15 | 0.3 | 0.6 |
| PV | 0.456 | 0.2338 | 0.1491 | 0.09612 | 0.03384 | 0.003326 |
| MC err | 0.01221 | 0.0116 | 0.01022 | 0.008262 | 0.004566 | 0.00115 |
| Var | 0.149 | 0.1347 | 0.1044 | 0.06826 | 0.02085 | 0.001323 |


In [176]:
print " variance reduction factor:", var_0/var_4

 variance reduction factor: [ 0.99376501  0.95941141  0.95600537  1.09335726  0.9875876   0.3106482 ]
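
With stratified sampling the pooled variance of the individual path PVs stays close to the unstratified value, so the factor above is not a measure of how much the estimate improved. The sketch below, on a toy integrand, builds stratified uniforms with an equal number of draws per stratum and measures the variance of the estimate over repeated runs. The helper `stratified_uniforms` and the integrand `f` are hypothetical names introduced for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)

def f(u):
    # hypothetical smooth integrand on [0, 1]
    return np.exp(u)

def stratified_uniforms(n, bins, rng):
    """n uniforms with n/bins draws in each of `bins` equal-width strata."""
    idx = np.arange(n) % bins                  # stratum index of each draw
    return (idx + rng.uniform(size=n)) / bins

n, bins, reps = 1000, 100, 200
plain = np.array([f(rng.uniform(size=n)).mean() for _ in range(reps)])
strat = np.array([f(stratified_uniforms(n, bins, rng)).mean() for _ in range(reps)])

print("variance reduction factor of the estimate:", plain.var() / strat.var())
```

Applied to the CDO, the same measurement would repeat the whole simulation several times and compare the variance of the resulting tranche PVs.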


## (Extra Credit) Problem 2

Consider a control variate for the problem above. The large pool model assumes that the portfolio is a large homogeneous pool with the average default probability $\bar p = \frac{1}{n}\sum_i p_i$. The portfolio loss conditioned on the market factor $z$ under the large pool model is then a deterministic scalar:

$$ l(z) = (1-r)\Phi\left(\frac{\Phi^{-1}(\bar p) - \sqrt \rho z}{\sqrt{1-\rho}}\right)$$

where $r$ is the constant recovery rate of all names, $\Phi()$ is the standard normal CDF, and $\Phi^{-1}()$ is its inverse. The tranche PVs can then be computed from $l(z)$.

Please investigate if the large pool model can be used as an effective control variate. Does it work better for some tranches?

Hint: to answer this question, you only need to compute the correlation between the actual and control variates. 
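
Following the hint, one possible starting point is sketched below: simulate the actual tranche PVs and the large pool tranche PVs on the same draws of $z$, then compute their correlation per tranche. The setup mirrors the base case but is regenerated here with its own seed and a vectorized simulation rather than the CDO class, so the numbers will not match the tables above. With the optimal control variate coefficient, a correlation of $\rho_{cv}$ corresponds to a variance reduction factor of $1/(1-\rho_{cv}^2)$.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.RandomState(0)

# assumed setup mirroring the base case: 125 equally weighted names, 40% recovery
n, rho, rec, discf, paths = 125, 0.5, 0.4, 0.9, 2000
p = 1 - np.exp(-rng.uniform(size=n) * 0.03 * 5.0)
a = np.array([0, .03, .07, .1, .15, .3])
d = np.array([.03, .07, .1, .15, .3, .6])

def tranche_pv(loss):
    """Tranche PVs for all (a, d) pairs given a scalar portfolio loss."""
    return discf / (d - a) * np.minimum(np.maximum(loss - a, 0), d - a)

z = rng.normal(size=paths)                    # common factor, shared by both models
e = rng.normal(size=(paths, n))               # idiosyncratic factors
x = np.sqrt(rho) * z[:, None] + np.sqrt(1 - rho) * e

# actual loss: a name defaults when its latent variable falls below Phi^{-1}(p_i)
actual_loss = np.mean((x < norm.ppf(p)) * (1 - rec), axis=1)
# large pool loss: deterministic function of z using the average default probability
pool_loss = (1 - rec) * norm.cdf((norm.ppf(p.mean()) - np.sqrt(rho) * z) / np.sqrt(1 - rho))

pv_actual = np.array([tranche_pv(l) for l in actual_loss])   # shape (paths, 6)
pv_pool = np.array([tranche_pv(l) for l in pool_loss])

corr = np.array([np.corrcoef(pv_actual[:, k], pv_pool[:, k])[0, 1] for k in range(len(a))])
print("correlation per tranche:", corr)
print("implied variance reduction factor:", 1 / (1 - corr**2))
```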