## Quantiles

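Given a continuous random variable $X$ with probability density function $f(x)$ (e.g. Gaussian), the *cumulative distribution function* (CDF) is defined as 

$$ F(x) = \mathbb{P}(X\le x) = \int_{-\infty}^{x}f(t)\,dt$$

For discrete distributions (e.g. Binomial, Poisson) the integral is replaced by a sum over the probability mass function.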
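As a quick numerical check of the definition, the sketch below integrates the standard Gaussian density up to an arbitrarily chosen point and compares the result with $\tt{norm.cdf}$. The evaluation point and the use of $\tt{scipy.integrate.quad}$ are illustrative choices, not part of the original notebook.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

x = -0.5244  # arbitrary evaluation point (roughly the 30% quantile)

# integrate the Gaussian PDF from -infinity up to x
area, abs_err = quad(norm.pdf, -np.inf, x)

print("integral of the PDF:", area)
print("norm.cdf           :", norm.cdf(x))
```

The two numbers should agree up to the integration tolerance.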


In [1]:
from matplotlib import pyplot as plt
from scipy.stats import norm
import numpy as np

fig = plt.figure(figsize=(10,4))
sub1 = plt.subplot(121)
x=np.arange(-4, 4, 0.01)
sub1.plot(x, norm.pdf(x), label='Gaussian PDF')
sub1.set_xlim(-4,4)
sub1.set_ylim(0, 0.5)
sub1.grid(True)
xfill=np.arange(-4,-0.5244, 0.01)
sub1.fill_between(xfill, 0, norm.pdf(xfill), color='red', label='30% area')
sub1.legend()

sub2 = plt.subplot(122)
x=np.arange(-4, 4, 0.01)
sub2.plot(x, norm.cdf(x), label='Gaussian CDF', linestyle='--')
sub2.set_xlim(-4,4)
sub2.set_ylim(0, 1.1)
sub2.grid(True)
xfill=np.arange(-4,-0.5244, 0.01)
sub2.plot(xfill, norm.cdf(xfill), color='red', 
          label='30% percentile')
sub2.legend()
sub2.hlines(0.3, -4, -0.5244, color='red', linestyle='-.')
sub2.text(-3.5, 0.35, "30%-percentile", color='red')
sub2.vlines(-0.5244, 0, 0.3, color='red', linestyle='-.')
sub2.text(-0.4, 0.15, "-0.5244", color='red')
plt.show()

<Figure size 1000x400 with 2 Axes>

The quantile function $Q$ returns the threshold value $x$ below which random draws from the given distribution should fall a fraction $p$ of the time. In other words, it inverts the CDF:

$$Q = F^{-1}\qquad\textrm{returns $x_p$ such that}~F(x_p)=p$$

The 50th percentile is therefore equivalent to the 0.5-quantile (the median).

In [2]:
from scipy.stats import norm

# ppf (percent point function) inverts the cdf
quantile = norm.ppf(0.3)
cdf = norm.cdf(quantile)

print ("quantile: ", quantile)
print ("CDF :", cdf)

quantile:  -0.5244005127080409
CDF : 0.29999999999999993


Percentiles (or quantiles) can also be computed when there is no analytical probability distribution but only a list of measurements, using $\tt{numpy.percentile}$:

In [3]:
import numpy as np
dist = [1, 2, 3, 4, 5, 6, 7, 8, 9]
percentile = np.percentile(dist, [1, 50])

print (percentile)

[1.08 5.  ]
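The value 1.08 comes from linear interpolation between neighbouring measurements, which is the default behaviour of $\tt{numpy.percentile}$. The sketch below reproduces it by hand; the helper name `percentile_linear` is hypothetical and only meant to illustrate the rule.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)

def percentile_linear(sorted_data, q):
    # fractional position of the q-th percentile among the n sorted points
    pos = q / 100.0 * (len(sorted_data) - 1)
    lo = int(np.floor(pos))
    hi = min(lo + 1, len(sorted_data) - 1)
    # linear interpolation between the two neighbouring measurements
    return sorted_data[lo] + (pos - lo) * (sorted_data[hi] - sorted_data[lo])

manual = [percentile_linear(np.sort(data), q) for q in (1, 50)]
print(manual)
print(np.percentile(data, [1, 50]))
```

For the 1st percentile the position is $0.01 \times 8 = 0.08$, i.e. 8% of the way from the first measurement (1) to the second (2).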


## Credit curves

Just like a discount curve is a way of representing the underlying interest rates (or equivalently discount factors) implicit in the market quotes of a collection of real-world interest rate products, **credit curves** are a way of representing survival probabilities implied by credit default swaps.

**Credit default swaps** (**CDS**) are instruments whose value depends on the likelihood that a given company (the curve's **issuer**) will suffer a credit event over a given period.

A **credit event** can be a default, the failure to make payments, the issuer entering into bankruptcy proceedings, or the occurrence of other legal events. The exact definition of what constitutes a credit event depends on a series of factors and is usually specified in an ISDA (International Swaps and Derivatives Association) master agreement.

In any case, we will generically call a credit event a *default*, and talk about **non-default probabilities** (**NDP**), i.e. the probability that the issuer will not suffer a credit event before a given date. Note that the *non-default probability is a cumulative probability*, since it refers to a time period. 

<table>
  <tr width=200>
    <th>Discount Curve</th>
    <th>Credit Curve</th>
  </tr>
  <tr>
    <td>Represents underlying rates implicit in market quotes of IR products</td>
    <td>Represents survival probabilities implied by credit default swaps</td>
  </tr>
  <tr>
    <td>made of pillar_dates and discount factors</td>
    <td>made of pillar_dates and survival probabilities</td>
  </tr>
  <tr>
    <td>discount factors</td>
    <td>non-default probabilities</td>
  </tr>
  <tr>
    <td>short rate</td>
    <td>hazard rate</td>
  </tr>
</table>   

We will implement a $\tt{CreditCurve}$ class that provides a method which interpolates between the pillars of the curve to return the NDP at an arbitrary value date between the pricing date and the last pillar date.
In addition, we'll also write a method which returns the **hazard rate** at an arbitrary value date. The hazard rate is the credit curve equivalent of the short rate or overnight rate for discount curves.


### Hazard Rate

The hazard rate is often called a *conditional failure rate* since its expression is a direct application of the concept of conditional probability.

Conditional probability answers the question "how should you update the probabilities of events when additional information is available?". To derive the general formula, let's start with an example.

A fair die is rolled. Let $A$ be the event that the outcome is an odd number ($A=\{1,3,5\}$). Also let $B$ be the event that the outcome is less than or equal to $3$ ($B=\{1,2,3\}$). What is the probability of $A$ ($P(A)$)? What is the probability of $A$ given $B$ ($P(A|B)$)?

Since this is a simple example, we can compute the result by hand:

$$P(A) = \frac{|A|}{|S|} = \frac{|\{1,3,5\}|}{6} = \frac{1}{2}\qquad\textrm{(where S is the entire sample space)}$$

Now let's find the conditional probability of $A$ given that $B$ occurred. If we know $B$ has occurred, the outcome must be among $\{1,2,3\}$. For $A$ to also happen the outcome must be in $A\cap B = \{1,3\}$. Since all die rolls are equally likely, we argue that $P(A|B)$ must be equal to

$$P(A|B) = \frac{|A\cap B|}{|B|} = \frac{2}{3}$$

To generalize our example we can rewrite the calculation by dividing the numerator and denominator by the entire space of the events $|S|$ hence:

$$P(A|B) = \cfrac{|A\cap B|}{|B|} = \cfrac{\cfrac{|A\cap B|}{|S|}}{\cfrac{|B|}{|S|}} = \cfrac{P(A\cap B)}{P(B)}$$

<img src="conditional_b.png" width=500>
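The die example can be checked by direct enumeration. The sketch below uses exact fractions so that the results match the hand calculation; the helper `prob` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space of a fair die
A = {1, 3, 5}            # the outcome is odd
B = {1, 2, 3}            # the outcome is less than or equal to 3

def prob(event):
    # all outcomes are equally likely, so P(E) = |E| / |S|
    return Fraction(len(event & S), len(S))

p_a = prob(A)
p_a_given_b = prob(A & B) / prob(B)   # P(A|B) = P(A and B) / P(B)

print("P(A)   =", p_a)
print("P(A|B) =", p_a_given_b)
```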

Hazard rate represents the instantaneous probability of the issuer defaulting *conditioned* on it not having defaulted until that moment. In practice we will calculate it numerically, and therefore it'll be the (annualized) conditional probability of the issuer defaulting between the value date and the day after.

In formulas, let $N$ denote the non-default probability, $\lambda$ the hazard rate, $A$ the event "default in $(t, t+dt)$" and $B$ the event "no default up to $t$" (so that $A\cap B = A$). Then:

$$\lambda\,dt = \cfrac{\mathbb{P}(A\cap B)}{\mathbb{P}(B)} = \cfrac{\mathbb{P}(\tau \in (t, t+dt))}{\mathbb{P}(\tau\gt t)} = \cfrac{d(1-N)}{N(t_0, t)}\qquad\Rightarrow\qquad\lambda = -\cfrac{dN}{dt}\cfrac{1}{N(t_0, t)}$$

where the minus sign derives from the fact that $N$ is a **non** default probability while the hazard rate is defined in terms of the probability of default.

Conversely, given the hazard rate, the non-default probability can be determined as:

$$\lambda = -\cfrac{1}{dt}\cdot\cfrac{dN}{N} = -\cfrac{d(\textrm{log}N)}{dt}$$

$$N(t_0, t) = e^{-\int_{t_0}^{t}\lambda(s)\,ds}$$
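To see the two relations at work, consider the simplest case of a constant hazard rate, where the integral reduces to $\lambda\,(t - t_0)$. The sketch below builds the non-default probabilities from an assumed $\lambda = 5\%$ and then recovers the hazard rate as $-d(\log N)/dt$ with a one-day finite difference. The numerical values are illustrative assumptions.

```python
import numpy as np

lam = 0.05                     # assumed constant hazard rate (per year)
t = np.linspace(0.0, 5.0, 6)   # times in years measured from t0

# with constant lambda, N(t0, t) = exp(-lam * t)
ndp = np.exp(-lam * t)

# recover the hazard rate as -d(log N)/dt using a one-day step
dt = 1.0 / 365.0
ndp_next = np.exp(-lam * (t + dt))
lam_numeric = -(np.log(ndp_next) - np.log(ndp)) / dt

print(ndp)
print(lam_numeric)
```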

In [4]:
# implement CreditCurve class
import numpy
from dateutil.relativedelta import relativedelta

class CreditCurve(object):
    
    def __init__(self, pillar_dates, ndps):
        self.pillar_dates = pillar_dates
        
        self.pillar_days = [
            (pd - pillar_dates[0]).days
            for pd in pillar_dates
        ]
        
        self.ndps = ndps
        
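    def ndp(self, value_date):
        # days elapsed between the first pillar and the value date
        value_days = (value_date - self.pillar_dates[0]).days
        # linear interpolation of the NDPs between pillars
        return numpy.interp(value_days,
                            self.pillar_days,
                            self.ndps)
    
    def hazard(self, value_date):
        # one-day finite difference approximation of -(dN/dt) / N
        ndp_1 = self.ndp(value_date)
        ndp_2 = self.ndp(value_date + relativedelta(days=1))
        delta_t = 1.0 / 365.0  # one day expressed in years
        h = -1.0 / ndp_1 * (ndp_2 - ndp_1) / delta_t
        return h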

As usual we test the newly developed class with some dummy data.

In [5]:
from datetime import date

pricing_date = date.today()

cc = CreditCurve(
    [pricing_date, pricing_date + relativedelta(years=2)],
    [1.0, 0.8]
)

In [6]:
cc.ndp(pricing_date + relativedelta(years=1))

0.9

In [7]:
cc.hazard(pricing_date + relativedelta(years=1))

0.11111111111112416
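The result can be verified by hand. With linear interpolation between $N=1.0$ today and $N=0.8$ in two years, and assuming the two-year period contains 730 days (no leap day), the NDP decreases by $0.1$ per year, so that at the one-year point:

$$\lambda \approx -\cfrac{1}{N}\cfrac{\Delta N}{\Delta t} = \cfrac{0.1}{0.9} \approx 0.1111$$

which agrees with the value returned by $\tt{hazard}$ up to floating point error.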

## Credit Default Swaps

Once we have implemented a $\tt{CreditCurve}$ class which allows us to interpolate survival probabilities, and also to calculate the hazard rate at arbitrary dates, we can use it to price **credit default swaps** (CDSs).

CDSs are made up of two legs:

* the *default* leg: which pays $LGD = 1 - R$, known as the **loss given default**, if and when the credit event occurs;
* the *premium* leg: which pays a *spread* $S$ every m months until the credit event occurs.

### Premium leg

Let's start with the premium leg. We will use the following notation:

* $d$ today's date;
* $d_0$ the start date of the CDS (could be different from $d$);
* $d_1, ..., d_n$ the payment dates of the premium leg, which occur at an $m$-month frequency (we assume that $d_n$ is the end date of the CDS);
* $D(d')$ the discount factor between $d$ and $d'$;
* $N(d')$ the non-default probability between $d$ and $d'$;
* $\tau$ the random variable representing the date of the credit event.

At each payment date $d_i$, a flow $S$ is paid if and only if the credit event has not occurred before that date. Therefore the NPV of each flow is

$$f_{\textrm{premium}}^i = \mathbb{E}\left[ S \times D(d_i) \times \mathbb{1}(\tau > d_i) \right] = S \cdot D(d_i) \cdot N(d_i)$$

where $\mathbb{1}(\tau > d_i)$ is the indicator function, equal to 1 if $\tau > d_i$ and 0 otherwise. Its expectation is the survival probability $N(d_i)$, so each flow is *weighted* by the probability that the issuer has not defaulted by $d_i$.

The NPV of the leg is then

$$\textrm{NPV}_{premium} = \sum_{i=1}^{n} S \cdot D(d_i) \cdot N(d_i)$$
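A minimal sketch of this sum, under stated assumptions: flat interest rate and flat hazard rate (so that $D$ and $N$ are simple exponentials), quarterly payments over three years, and times expressed in years rather than dates. None of these inputs comes from market data; they only serve to show the structure of the calculation.

```python
import math

S = 0.03      # spread paid at each payment date, per unit notional
r = 0.02      # assumed flat interest rate
lam = 0.05    # assumed flat hazard rate

# quarterly payment times d_1, ..., d_n in years (3-year CDS)
payment_times = [0.25 * i for i in range(1, 13)]

def D(t):
    # toy discount factor
    return math.exp(-r * t)

def N(t):
    # toy non-default probability
    return math.exp(-lam * t)

# each flow is discounted and weighted by the survival probability
flows = [S * D(t) * N(t) for t in payment_times]
npv_premium = sum(flows)
print(npv_premium)
```

With exponential curves the sum is a geometric series, which gives an easy cross-check of the result.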

### Default leg

The LGD $(1-R)$ is paid out on the same date on which the credit event occurs, i.e. it can potentially be paid out on any date between $d_0$ and $d_n$. Mathematically, therefore, the NPV of the default leg can be expressed as follows:

$$\mathrm{NPV_{default}} = \mathbb{E} \left[(1-R) \times D(\tau) \times \mathbb{1}(\tau \leq d_n) \right] $$

Using the laws of probability, we can break this down into the sum of "daily NPVs" calculated as a function of the daily default probabilities $\mathbb{P}$:

$$
\begin{align*}
\mathbb{E}\left[(1-R) \times D(\tau) \times \mathbb{1}(\tau \leq d_n) \right]
&= \sum_{d'=d_0}^{d_n} \mathbb{E}[ (1-R) \times D(\tau) | \tau = d'] \mathbb{P}[ \tau = d' ] \\
&= (1-R) \sum_{d'=d_0}^{d_n} D(d') \left( \mathbb{P}[ \tau \geq d' ] - \mathbb{P}[ \tau \geq d'+1 ] \right) \\
&= (1-R) \sum_{d'=d_0}^{d_n} D(d') \left( N(d') - N(d'+1) \right)
\end{align*}
$$

where the last step holds since $\mathbb{P}[\tau\geq d'] = 1 - \mathbb{P}[\tau < d'] = 1 - (1-N(d')) = N(d')$.
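The daily sum can be sketched with the same kind of toy curves: a flat interest rate and a flat hazard rate, with time measured in days from $d_0$. These inputs are assumptions chosen for illustration; with exponential curves the result can be compared against the continuous-time approximation $(1-R)\,\frac{\lambda}{r+\lambda}\left(1-e^{-(r+\lambda)T}\right)$.

```python
import math

R = 0.4            # recovery rate
r = 0.02           # assumed flat interest rate
lam = 0.05         # assumed flat hazard rate
n_days = 3 * 365   # days between d_0 and d_n (3-year CDS)

def D(day):
    # toy discount factor, day count in days from d_0
    return math.exp(-r * day / 365.0)

def N(day):
    # toy non-default probability
    return math.exp(-lam * day / 365.0)

# N(d') - N(d'+1) is the probability of defaulting on day d'
daily = [D(d) * (N(d) - N(d + 1)) for d in range(0, n_days + 1)]
npv_default = (1 - R) * sum(daily)

# continuous-time approximation used as a sanity check
T = n_days / 365.0
approx = (1 - R) * lam / (r + lam) * (1 - math.exp(-(r + lam) * T))
print(npv_default, approx)
```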

<img src="timeline.png">

In [11]:
# credit default swap class
from dateutil.relativedelta import relativedelta
from finmarkets import generate_swap_dates

class CreditDefaultSwap:
    
    def __init__(self, notional, start_date, fixed_spread, 
                 maturity, tenor=3, recovery=0.4):
        self.notional = notional
        self.payment_dates = generate_swap_dates(start_date, 
                                                 maturity*12, 
                                                 tenor)
        self.fixed_spread = fixed_spread
        self.recovery = recovery
    
    def premium_leg_npv(self, discount_curve, credit_curve):
        npv = 0
        for i in range(1, len(self.payment_dates)):
            npv += (
                self.fixed_spread *
                discount_curve.df(self.payment_dates[i]) *
                credit_curve.ndp(self.payment_dates[i])
            )
        return npv * self.notional
    
    def default_leg_npv(self, discount_curve, credit_curve):
        npv = 0
        d = self.payment_dates[0]
        while d <= self.payment_dates[-1]:
            npv += discount_curve.df(d) * (
                credit_curve.ndp(d) -
                credit_curve.ndp(d + relativedelta(days=1))
            )
            d += relativedelta(days=1)
        return npv * self.notional * (1 - self.recovery)
    
    def npv(self, discount_curve, credit_curve):
        return self.default_leg_npv(discount_curve, credit_curve) - \
               self.premium_leg_npv(discount_curve, credit_curve)

Below is a simple test of the class.

In [18]:
import pandas as pd
from finmarkets import DiscountCurve

dc_data = pd.read_excel('discount_curve.xlsx')
dc = DiscountCurve(pricing_date, 
                   dc_data['pillars'].dt.date.tolist(),
                   dc_data['discount_factors'].tolist())

credit_curve = CreditCurve([pricing_date, 
                            pricing_date + relativedelta(months=36)], 
                           [1.0, 0.7])

cds = CreditDefaultSwap(1e6, pricing_date, 0.03, 3)
cds.premium_leg_npv(dc, credit_curve)

72599.55760600582

In [20]:
cds.default_leg_npv(dc, credit_curve)

181346.1197957481

In [21]:
cds.npv(dc, credit_curve)

108746.56218974227
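Since the premium leg is linear in the spread, the two leg values above also give the *breakeven spread*, i.e. the spread at which the CDS has zero NPV. The function below is a hypothetical helper, not a method of the $\tt{CreditDefaultSwap}$ class, and simply rescales the numbers printed in the previous cells.

```python
def breakeven_spread(default_npv, premium_npv, fixed_spread):
    # premium leg NPV per unit of spread
    annuity = premium_npv / fixed_spread
    # spread for which premium leg NPV equals default leg NPV
    return default_npv / annuity

premium_npv = 72599.55760600582    # output of premium_leg_npv above
default_npv = 181346.1197957481    # output of default_leg_npv above

s_star = breakeven_spread(default_npv, premium_npv, 0.03)
print(s_star)
```

The result is well above the 3% contractual spread, consistent with the positive NPV for the protection buyer.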

## Estimate Default Probabilities from CDS

Just as discount curves can be derived from swap market quotes, we can estimate default probabilities (hence credit curves) from CDS quotes using a bootstrap.
Following the same steps seen for the discount curve, we can determine the non-default probabilities at discrete dates to fill our curve:

* collect market quotes for a number of CDS with different maturities;
* create the corresponding CDS objects;
* define a $\tt{CreditCurve}$ whose pillars are the CDS maturity dates and whose non-default probabilities are the unknowns;
* define an objective function, to be minimized, equal to the sum of the squared NPVs of the CDSs;
* set the non-default probabilities to an initial value, restrict them to the range $[0, 1]$ since they are probabilities, and fix "today's" probability to 1 since no default has occurred yet;
* run the minimization.


In [31]:
quotes = pd.read_excel('cds_quotes.xlsx')

pillars = [pricing_date]
cdss = []
for i in range(len(quotes)):
    cds = CreditDefaultSwap(1e6, pricing_date, 
                            quotes['quote'][i],
                            quotes['months'][i]//12)
    cdss.append(cds)
    pillars.append(cds.payment_dates[-1])

def objective_function(x):
    cc = CreditCurve(pillars, x)
    
    s=0
    for cds in cdss:
        s += cds.npv(dc, cc)**2
    return s

x0 = [1.0 for _ in range(len(pillars))]
bounds = [(0.001, 1.0) for _ in range(len(pillars))]
bounds[0] = (1, 1)

from scipy.optimize import minimize 
r = minimize(objective_function, x0, bounds=bounds)
print (r)

      fun: 4.418270168346569e-05
 hess_inv: <7x7 LbfgsInvHessProduct with dtype=float64>
      jac: array([ 4.14334144e+04, -1.82525181e-04, -1.81563231e-04, -2.66978239e-04,
       -2.85546564e-04,  5.65959667e-05,  4.73440591e-04])
  message: b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
     nfev: 96
      nit: 10
   status: 0
  success: True
        x: array([1.        , 0.97614123, 0.9471263 , 0.91808604, 0.83545363,
       0.74039854, 0.54530604])
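The fitted vector $\tt{x}$ contains the non-default probabilities at the pillar dates. As a sketch of how to read it, the snippet below converts the values printed above into cumulative default probabilities and into the probability of defaulting between two consecutive pillars, conditional on having survived up to the earlier one. The variable names are illustrative.

```python
import numpy as np

# non-default probabilities copied from the minimization output above
ndps = np.array([1.0, 0.97614123, 0.9471263, 0.91808604,
                 0.83545363, 0.74039854, 0.54530604])

# cumulative default probability up to each pillar
cum_default = 1.0 - ndps

# P(default between pillar i-1 and i | survival up to pillar i-1)
cond_default = 1.0 - ndps[1:] / ndps[:-1]

print(cum_default)
print(cond_default)
```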


### Determine Default Probabilities from Bond Prices

### Credit Ratings

A credit rating is a quantified assessment of the creditworthiness of a borrower, either in general terms or with respect to a particular debt or financial obligation. A credit rating can be assigned to any entity that seeks to borrow money (e.g. an individual, corporation, state or provincial authority, or sovereign government).

A loan is essentially a promise, and the credit rating estimates the likelihood that the borrower will be able to pay it back within the terms of the loan agreement. A high credit rating indicates a high probability of paying back the loan in its entirety without any issues; a poor credit rating suggests that the borrower has had trouble paying back loans in the past and might follow the same pattern in the future.

Individual credit is scored by credit bureaus (e.g. Experian and TransUnion) and is reported as a number, generally ranging from 300 to 850.

Credit assessment and evaluation for companies and governments is instead generally done by credit rating agencies (e.g. Standard & Poor’s (S&P), Moody’s, or Fitch), which typically assign letter grades to indicate ratings. Standard & Poor’s, for instance, has a credit rating scale ranging from AAA (excellent) to C and D. A debt instrument with a rating below BBB- is considered speculative grade, or a junk bond, which means its issuer is more likely to default.

#### Why Credit Ratings Are Important

A borrowing entity will strive to have the highest possible credit rating since it has a major impact on the interest rates charged by lenders. Rating agencies, on the other hand, must take a balanced and objective view of the borrower’s financial situation and capacity to service and repay the debt.

A credit rating not only determines whether or not a borrower will be approved for a loan but also determines the interest rate at which the loan will need to be repaid. Since companies depend on loans for many expenses, being denied a loan could spell disaster, and in any case a loan with a high interest rate is much more difficult to pay back. Credit ratings also play a large role in a potential investor’s decision on whether or not to purchase bonds. A bond with a poor credit rating is a risky investment; the rating indicates a larger probability that the company will be unable to make its bond payments.

It is important for a borrower to remain diligent in maintaining a high credit rating. Credit ratings are never static; in fact, they change all the time based on the newest data, and one bad debt can bring down even the best score. Credit also takes time to build up. An entity with good credit but a short credit history is not seen as positively as another entity with the same quality of credit but a longer history. Lenders want to know that a borrower can maintain good credit consistently over time.