## Simple Discrete Aggregate Distributions 

### Aggregate Frequency and Severity Models

A simulation algorithm for aggregate insurance losses:

```
    for i = 1 to 10000
        agg = 0
        simulate number of events n
        for j = 1 to n
            simulate loss amount X
            agg = agg + X
        output agg for simulation i
```

* Write $A = X_1 + \cdots + X_N$, where $N$ and the $X_i$ are random and independent, and the $X_i$ are identically distributed
* Model insured losses via the number of claims $N$, the **frequency**, and the amount $X_i$ of each claim, the **severity**
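
The pseudocode above can be sketched as runnable Python using only the standard library. This is a minimal illustration, not part of the `aggregate` package: the function name `simulate_aggregate` is hypothetical, and the frequency and severity distributions are assumed to be the discrete ones from the simple model later in this section.

```python
import random


def simulate_aggregate(n_sims, freq_outcomes, freq_probs,
                       sev_outcomes, sev_probs, seed=None):
    """Return a list of simulated aggregate losses, one per simulation."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_sims):
        # simulate the number of events n
        n = rng.choices(freq_outcomes, weights=freq_probs)[0]
        # simulate n independent loss amounts and add them up
        agg = sum(rng.choices(sev_outcomes, weights=sev_probs, k=n))
        results.append(agg)
    return results


# assumed example distributions (taken from the simple model in this section)
losses = simulate_aggregate(
    10_000,
    freq_outcomes=[1, 2, 3], freq_probs=[0.5, 0.25, 0.25],
    sev_outcomes=[5, 10, 15], sev_probs=[1 / 3, 1 / 3, 1 / 3],
    seed=42,
)
sample_mean = sum(losses) / len(losses)
print(f"simulated mean aggregate loss: {sample_mean:.2f}")
```

The sample mean should land close to $E[N]E[X] = 1.75 \times 10 = 17.5$, with simulation error that shrinks as the number of simulations grows.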

### Aggregate Statistics: the Mean
* Mean of sum = sum of means
* $A = X_1 + \cdots + X_N$
* If $N=n$ is fixed then $E[A] = nE[X]$, because all $E[X_i]=E[X]$
* In general, $E[A] = E[N]E[X]$, by conditioning on $N$ (the law of total expectation)
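
The conditioning step can be written out as follows, assuming as above that $N$ is independent of the $X_i$ and that the $X_i$ share a common mean $E[X]$:

$$
E[A] = E\big[\,E[A \mid N]\,\big] = E\big[\,N\,E[X]\,\big] = E[N]\,E[X].
$$

The inner expectation is the fixed-$N$ result from the previous bullet; the outer expectation averages it over the distribution of $N$.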

### Aggregate Statistics: the Variance
* For independent random variables, variance of sum = sum of variances
* $A = X_1 + \cdots + X_N$
* If $N=n$ is fixed then $Var(A) = nVar(X)$ and $Var(N)=0$
* If $X=x$ is fixed then $Var(A) = x^2Var(N)$ and $Var(X)=0$
* Obvious choices: $n=E[N]$, $x=E[X]$
* Combine: $Var(A) = E[N]Var(X) + E[X]^2Var(N)$
* Miraculously, this is the correct answer! It is what the law of total variance gives when conditioning on $N$
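
A short derivation shows why the combined formula is exact. It is a sketch that assumes $N$ is independent of the $X_i$ and that the $X_i$ are independent and identically distributed. Conditional on $N$, the fixed-count results give $E[A \mid N] = N\,E[X]$ and $Var(A \mid N) = N\,Var(X)$, so by the law of total variance

$$
\begin{aligned}
Var(A) &= E\big[\,Var(A \mid N)\,\big] + Var\big(E[A \mid N]\big) \\
       &= E\big[\,N\,Var(X)\,\big] + Var\big(N\,E[X]\big) \\
       &= E[N]\,Var(X) + E[X]^2\,Var(N).
\end{aligned}
$$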

### Simple Aggregate Model

In a given year there can be 1, 2 or 3 events. There is a 50% chance of 1 event, a 25% chance of 2, and a 25% chance of 3. Each event randomly causes a loss of 5, 10 or 15, each with equal probability.

1. What is the average annual event frequency?
1. What is the average event severity?
1. What are the average losses each year?
1. What is the coefficient of variation of losses for each year?
1. Create a table showing all possible outcomes from the model
1. What is the probability of an annual loss of 5? How can it occur?
1. What is the probability of an annual loss of 10? How can it occur?
1. What is the highest amount of total losses that can occur in one year? What is the probability that it occurs?
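
The questions above can be checked by brute-force enumeration before turning to the `aggregate` library. The sketch below uses only the standard library and exact fractions; the variable names are illustrative and nothing here depends on `aggregate`.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from math import sqrt

# frequency and severity distributions of the simple model
freq = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
sev = {5: Fraction(1, 3), 10: Fraction(1, 3), 15: Fraction(1, 3)}

# enumerate every (event count, loss sequence) and accumulate P(A = a)
agg = defaultdict(Fraction)
for n, p_n in freq.items():
    for losses in product(sev, repeat=n):
        p = p_n
        for x in losses:
            p *= sev[x]
        agg[sum(losses)] += p

mean_freq = sum(n * p for n, p in freq.items())
mean_sev = sum(x * p for x, p in sev.items())
mean_agg = sum(a * p for a, p in agg.items())
var_agg = sum(a * a * p for a, p in agg.items()) - mean_agg ** 2
cv_agg = sqrt(var_agg) / float(mean_agg)

print("mean frequency:", mean_freq)
print("mean severity:", mean_sev)
print("mean aggregate:", mean_agg)
print("CV of aggregate:", round(cv_agg, 4))
for a in sorted(agg):
    print(a, agg[a])
```

Because the arithmetic is exact, the enumerated variance can be compared directly with $E[N]Var(X) + E[X]^2Var(N)$, and the printed table lists every possible annual loss with its probability.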

In [None]:
import sys
# make a local checkout of the aggregate project importable (machine-specific path)
sys.path.append('c:\\s\\telos\\python\\aggregate_project')

In [None]:
import pandas as pd  # used below to tabulate the accuracy checks
from aggregate import build
# build.logger_level(30)

In [None]:
sam = build('agg SAM dfreq [1 2 3] [.5 .25 .25] dsev [5 10 15]')
sam.plot()
sam

In [None]:
# useful computed quantities 
sam.density_df

In [None]:
sam.density_df.query('p_total > 0')[['p_total', 'p_sev']]

In [None]:
# highest outcome of 45 has probability 0.25 * (1/3)**3 (1/4 for a count of 3, times three losses of 15); check accuracy
a, e = (1/4) * (1/3)**3, sam.density_df.loc[45, 'p_total']
pd.DataFrame([a, e, e/a-1], index=['Actual worst', 'Computed worst', 'error'], columns=['value']).style.format(lambda x: f'{x:.15g}')

### More Complex Aggregate Model

In a given year there can be 1, 2, 3 or 20 events. There is a 45% chance of 1 event, a 25% chance of 2, a 25% chance of 3, and a 5% chance of 20 events. Each event randomly causes a loss of 5, 10 or 50, each with equal probability.

1. What is the average annual event frequency?
1. What are the average losses each year?
1. What is the coefficient of variation of losses for each year?
1. What are the probabilities of each possible outcome? 
1. What are the 99 and 99.6 percentiles of aggregate losses?
1. What is the probability of the maximum possible loss of 1000?
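
With up to 20 events, direct enumeration of loss sequences becomes expensive, but the aggregate distribution can still be computed exactly by repeated convolution of the severity distribution. The sketch below is a standard-library cross-check, independent of the `aggregate` library; the helper names `convolve` and `quantile` are hypothetical, and `quantile` assumes the convention of returning the smallest outcome whose cumulative probability reaches `q`.

```python
from collections import defaultdict
from fractions import Fraction

# frequency and severity distributions of the more complex model
freq = {1: Fraction(45, 100), 2: Fraction(25, 100),
        3: Fraction(25, 100), 20: Fraction(5, 100)}
sev = {5: Fraction(1, 3), 10: Fraction(1, 3), 50: Fraction(1, 3)}


def convolve(d1, d2):
    """Distribution of the sum of two independent discrete variables."""
    out = defaultdict(Fraction)
    for a, pa in d1.items():
        for b, pb in d2.items():
            out[a + b] += pa * pb
    return dict(out)


def quantile(dist, q):
    """Smallest outcome whose cumulative probability is at least q."""
    cum = Fraction(0)
    for a in sorted(dist):
        cum += dist[a]
        if cum >= q:
            return a


# build the n-fold convolution of severity and mix over the event counts
agg = defaultdict(Fraction)
n_fold = {0: Fraction(1)}
for n in range(1, max(freq) + 1):
    n_fold = convolve(n_fold, sev)
    if n in freq:
        for a, p in n_fold.items():
            agg[a] += freq[n] * p

mean_agg = sum(a * p for a, p in agg.items())
p_max = agg[max(agg)]
q99 = quantile(agg, Fraction(99, 100))
q996 = quantile(agg, Fraction(996, 1000))
print("mean aggregate:", float(mean_agg))
print("P(max loss):", float(p_max))
print("99th and 99.6th percentiles:", q99, q996)
```

The exact probability of the maximum loss, `p_max`, is the benchmark against which a numerically computed density can be compared.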

In [None]:
cam = build('agg CAM dfreq [1 2 3 20] [.45 .25 .25 0.05] dsev [5 10 50] [1/3 1/3 1/3]', log2=11, bs=1)
cam.plot()
cam

In [None]:
# percentiles
cam.q(0.99), cam.q(0.996), cam.cdf(570)

In [None]:
# highest outcome of 1000 has probability 0.05 * (1/3)**20 (0.05 for a count of 20, times twenty losses of 50); check accuracy
a, e = 0.05 * (1/3)**20, cam.density_df.loc[1000, 'p_total']
pd.DataFrame([a, e, e/a-1], index=['Actual worst', 'Computed worst', 'error'], columns=['value'])

In [None]:
cam.density_df.query('p_total > 0')[['p_total', 'p_sev', 'F', 'S']]

\
\
Created July 6, 2022