Patrick-David/Insurance-Data-Analysis

Insurance Data Analysis

The Setup...

Here we will look at a Data Science challenge within the Insurance space. Of all the industries rife with vast amounts of data, the Insurance market surely has to be one of the greatest treasure troves for data scientists and insurers alike.

However, despite this bounty, much of the Insurance industry is still built around 17th-century 'Actuarial' math, meaning this data is either underutilised or not used at all.

Even with the integration of more modern financial economics into the insurance process, much of it still relies on 'assumption-based' approaches, such as determining the Discount Rate to be used. This is where Machine Learning comes in.

The Challenge...

Using a data set provided by Prudential Insurance as part of their recent Kaggle Challenge (https://www.kaggle.com/c/prudential-life-insurance-assessment/download/train.csv.zip), we will apply a number of data science techniques to visualise, better understand, statistically analyse and prepare the data for prediction.

The focus of this script will not be on outright 'predictive performance'. Instead, we will take a more 'data science' / research-oriented approach, focusing on model robustness and data understanding.
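
As a first step towards the 'data understanding' described above, it can help to see how complete each column is before any modelling. The snippet below is a minimal sketch, not the method used in this repository: it assumes pandas is installed and that train.csv has been downloaded from the Kaggle link and unzipped locally. The helper names `load_train` and `summarise_missing`, and the default file path, are hypothetical and chosen only for illustration.

```python
import pandas as pd


def load_train(path: str = "train.csv") -> pd.DataFrame:
    """Load the training data.

    Assumption: the Kaggle archive has already been downloaded and
    unzipped, so that `path` points at the extracted CSV file.
    """
    return pd.read_csv(path)


def summarise_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Return per-column missing-value counts and fractions, worst first."""
    summary = pd.DataFrame(
        {
            "n_missing": df.isna().sum(),      # number of NaN cells per column
            "frac_missing": df.isna().mean(),  # share of NaN cells per column
        }
    )
    return summary.sort_values("frac_missing", ascending=False)
```

Calling `summarise_missing(load_train()).head(10)` would then list the ten least complete columns, which is one reasonable place to start deciding what to impute, drop or investigate further.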