# Cybersecurity Pricing Example

Big Data Analytics is looking to purchase additional cybersecurity insurance to indemnify lost profits in the event of a security breach or other cyber event, such as a network outage or service interruption. A prior breach has made it difficult for Big Data Analytics to find coverage in the standard market, so the company has decided to insure this risk through a captive.

Additionally, the company has stated that their policy should have the following conditions:
 - A per occurrence limit of 1,000,000
 - A deductible of 100,000
 - A maximum benefit of 1,000,000

In [11]:
# Policy specifications
OCCURRENCE_LIMIT = 1_000_000
DEDUCTIBLE = 100_000
MAX_BENEFIT = 1_000_000

To establish an appropriate premium for this coverage, Big Data Analytics has contracted you, a pricing actuary specializing in cyber risk policies. Because the company's loss information is immature, you have decided to use Monte Carlo simulation in Python to determine a premium. Based on discussions with management, a review of company loss history, and industry statistics, you have determined that the models below are appropriate, where $N$ is the number of claims in a year and $X$ is the size of an individual loss.

$$ N \sim \text{Poisson}(\lambda=0.2) $$
$$ X \sim \text{Lognormal}(\mu=10.5, \sigma=2.5) $$
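
As a quick sanity check on these assumptions, the sketch below computes a few closed-form quantities implied by the two distributions, using only the standard library. The variable names are illustrative and are not part of the original notebook.

```python
import math

# Model parameters taken from the distributions above
LAMBDA = 0.2   # Poisson mean: expected number of claims per year
MU = 10.5      # mean of log(loss)
SIGMA = 2.5    # standard deviation of log(loss)

# Probability that a given year has at least one claim: 1 - P(N = 0)
prob_any_claim = 1 - math.exp(-LAMBDA)

# Median and mean of the lognormal severity, before any limit or deductible
median_loss = math.exp(MU)
mean_loss = math.exp(MU + SIGMA ** 2 / 2)

print(f"P(N >= 1)   = {prob_any_claim:.4f}")
print(f"Median loss = {median_loss:,.0f}")
print(f"Mean loss   = {mean_loss:,.0f}")
```

Roughly 18% of years see at least one claim. With $\sigma = 2.5$ the severity is heavy-tailed, so the mean sits far above the median, which is why the occurrence limit has such a large effect on the final price.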

### The Python Ecosystem

Like R, Python has a huge community of data scientists and developers who have open-sourced libraries for you to use in your projects. Using these tools speeds up development and lets you focus on your business questions rather than on implementation details.

The two libraries we will use today, NumPy and pandas, are among the most widely used in Python data projects.

- [Pandas Documentation](https://pandas.pydata.org/pandas-docs/stable/)
- [Numpy Documentation](https://docs.scipy.org/doc/numpy-1.14.0/reference/)



In [12]:
import numpy
import pandas

### Think About Maintenance

Big data projects can quickly grow beyond what you can keep in your head. Deliberately organizing your code into small, named pieces helps both the people reviewing your work and you, when you come back six months later trying to figure out what you did.

In [13]:
def apply_per_occurrence_limit(loss, limit):
    """Cap a loss at the given limit."""
    return min(loss, limit)

def apply_deductible(loss, ded):
    """Reduce a loss by the deductible."""
    return loss - ded

### Double Check

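Tests give you evidence that your code behaves as intended. As a project grows, they also help confirm that new features don't break work you have already done. Here, a quick check exposes a bug: the first version of `apply_deductible` returns a negative amount when the loss is smaller than the deductible, so the function is redefined with a floor of zero.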

In [14]:
# A loss below the deductible should pay nothing
assert apply_deductible(200, 500) == 0

AssertionError: 

In [15]:
def apply_deductible(loss, ded):
    """Reduce a loss by the deductible, never paying less than zero."""
    return max(loss - ded, 0)
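
With the fix in place, it is worth checking a few more cases, including the boundaries. The sketch below repeats the two helpers so that it runs on its own; the specific test values are illustrative choices, not figures from the case study.

```python
def apply_per_occurrence_limit(loss, limit):
    return min(loss, limit)

def apply_deductible(loss, ded):
    return max(loss - ded, 0)

# Loss below the deductible: nothing is paid
assert apply_deductible(200, 500) == 0
# Loss exactly equal to the deductible: still nothing is paid
assert apply_deductible(500, 500) == 0
# Loss above the deductible: only the excess is paid
assert apply_deductible(800, 500) == 300

# Loss above the limit is capped; loss below the limit is unchanged
assert apply_per_occurrence_limit(2_000_000, 1_000_000) == 1_000_000
assert apply_per_occurrence_limit(50, 1_000_000) == 50

# Limit first, then deductible: the most a single occurrence can pay
capped = apply_per_occurrence_limit(5_000_000, 1_000_000)
assert apply_deductible(capped, 100_000) == 900_000
```

The last check makes one consequence of the ordering explicit: applying the limit before the deductible means a single occurrence pays at most 900,000.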

### Price Our Policy!

In [16]:
# Simulate the number of claims in each of 100,000 policy years
claims = numpy.random.poisson(lam=0.2, size=100_000)

In [17]:
# For each claim, simulate a loss
losses = numpy.random.lognormal(mean=10.5, sigma=2.5, size=numpy.sum(claims))
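
The two simulation cells above draw from NumPy's global random state, so the results change on every run. If you want a reproducible premium, one option is a seeded generator. The sketch below uses `numpy.random.default_rng`; the seed value and the `N_YEARS` name are arbitrary choices for illustration.

```python
import numpy

N_YEARS = 100_000  # number of simulated policy years

# A seeded generator returns the same draws every time the cell is run
rng = numpy.random.default_rng(seed=42)

# Claim count for each simulated year, then one loss per claim
claims = rng.poisson(lam=0.2, size=N_YEARS)
losses = rng.lognormal(mean=10.5, sigma=2.5, size=claims.sum())

print(claims.sum(), "claims simulated across", N_YEARS, "years")
```

Fixing the seed makes a review easier to follow, but it is still worth rerunning with a few different seeds to see how much the estimate moves.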

In [18]:
claims_data = pandas.DataFrame(
    {
        'year': numpy.repeat(numpy.arange(claims.size), claims),
        'loss': losses
    },
    columns=['year', 'loss']
)
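
The `numpy.repeat` call is what ties each loss to a simulated year: it repeats every year index once per claim in that year. A toy example with made-up claim counts shows the effect.

```python
import numpy

# Hypothetical claim counts for four simulated years
claim_counts = numpy.array([0, 2, 1, 0])

# Repeat each year index by its claim count
years = numpy.repeat(numpy.arange(claim_counts.size), claim_counts)

print(years)  # [1 1 2]
```

Years 0 and 3 had no claims, so they do not appear in the result at all. The same is true of `claims_data`: it holds one row per claim, not one row per year.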

In [19]:
claims_data['capped_loss'] = claims_data['loss'].apply(apply_per_occurrence_limit, limit=OCCURRENCE_LIMIT)

In [20]:
claims_data['loss_less_deductible'] = claims_data['capped_loss'].apply(apply_deductible, ded=DEDUCTIBLE)
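
Calling `.apply` with a Python function works row by row. As an alternative, the same limit and deductible logic can be written with the vectorized `Series.clip` method, which is usually faster on large simulations. This is a sketch on a small made-up table rather than a replacement for the cells above.

```python
import pandas

OCCURRENCE_LIMIT = 1_000_000
DEDUCTIBLE = 100_000

# Three hypothetical losses: below the deductible, inside the layer, above the limit
toy = pandas.DataFrame({'loss': [50_000.0, 250_000.0, 3_000_000.0]})

# Cap each loss at the per occurrence limit
toy['capped_loss'] = toy['loss'].clip(upper=OCCURRENCE_LIMIT)

# Subtract the deductible and floor the result at zero
toy['loss_less_deductible'] = (toy['capped_loss'] - DEDUCTIBLE).clip(lower=0)

print(toy)
```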

In [21]:
# Total each column by simulated year (years without claims do not appear)
aggregate_losses = claims_data.groupby(by='year').sum()

In [22]:
# Cap each year's total at the maximum benefit (reuses the same min() helper)
aggregate_losses['benefit'] = aggregate_losses['loss_less_deductible'].apply(apply_per_occurrence_limit, limit=MAX_BENEFIT)

In [23]:
def calculate_premium(losses, plr):
    """Calculate the premium from simulated annual losses and a PLR,
    the target ratio of losses to premium."""

    # Loss cost: the average simulated benefit
    loss_cost = losses.mean()
    # Premium = Loss Cost / PLR
    premium = loss_cost / plr
    return premium
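
To see what dividing by the PLR does, here is the same calculation on a few made-up annual benefits. The toy values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import pandas

def calculate_premium(losses, plr):
    loss_cost = losses.mean()
    return loss_cost / plr

# Hypothetical annual benefits for four simulated years
toy_benefits = pandas.Series([0.0, 0.0, 170.0, 170.0])

loss_cost = toy_benefits.mean()                       # 85.0
premium = calculate_premium(toy_benefits, plr=0.85)   # about 100
loading = premium - loss_cost                         # about 15

print(f"loss cost = {loss_cost:.2f}, premium = {premium:.2f}, loading = {loading:.2f}")
```

With a PLR of 0.85, expected losses make up 85% of the premium, and the remaining 15% is presumably what is left for items other than losses.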

In [24]:
calculate_premium(losses=aggregate_losses['benefit'], plr=0.85)

183324.85620034547
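
One thing worth checking in this result: `groupby` only returns years that appear in `claims_data`, so simulated years with no claims are missing from `aggregate_losses`, and the mean is taken over years with at least one claim. If the premium is meant to reflect the expected annual benefit across all simulated years, the zero-claim years need to be added back. A minimal sketch on toy data, where `N_YEARS` and the loss values are hypothetical stand-ins:

```python
import pandas

N_YEARS = 5  # total simulated years, including those with no claims

# Toy claim-level data: years 0, 2 and 4 have no claims
claims_data = pandas.DataFrame({
    'year': [1, 1, 3],
    'loss_less_deductible': [100.0, 200.0, 400.0],
})

# groupby keeps only the years that have at least one row
by_year = claims_data.groupby(by='year')['loss_less_deductible'].sum()

# Reindex over every simulated year, filling the missing years with zero
all_years = by_year.reindex(range(N_YEARS), fill_value=0.0)

print(by_year.mean())    # 350.0, averaged over the 2 years with claims
print(all_years.mean())  # 140.0, averaged over all 5 years
```

With a claim frequency of 0.2, most simulated years have no claims, so the choice between the two averages has a large effect on the premium.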