The aim of this project is to analyse data on Health Insurance customers and predict whether they would be interested in additional Vehicle Insurance. More specifically, the analysis aims to answer the following questions:
- Is there an age range in which customers are more likely to be interested in Vehicle Insurance? Does this differ between men and women?
- Are there any communication channels that are more successful in convincing customers?
- Taking all the data into account, can we predict whether a customer would be interested in Vehicle Insurance?
More information about the context of this task can be found on Kaggle here.
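The first two questions above come down to comparing response rates across groups of customers. The snippet below is a minimal sketch of that idea, not the code from the notebooks. The column names (`Age`, `Gender`, `Policy_Sales_Channel`, `Response`), the age bands, and the sample rows are assumptions made up for illustration; the helper `response_rate` is hypothetical.

```python
import pandas as pd

# Tiny made-up sample; the column names are assumptions about the
# layout of data/train.csv, and the values are not real data.
df = pd.DataFrame({
    "Age": [22, 25, 34, 38, 45, 47, 52, 63],
    "Gender": ["Male", "Female", "Male", "Female",
               "Male", "Female", "Male", "Female"],
    "Policy_Sales_Channel": [152, 152, 26, 26, 124, 26, 124, 152],
    "Response": [0, 0, 1, 0, 1, 1, 0, 0],
})


def response_rate(frame, by):
    """Share of customers with Response == 1 in each group (hypothetical helper)."""
    return frame.groupby(by, observed=True)["Response"].mean()


# Question 1: interest by age band and gender (band edges chosen arbitrarily).
age_band = pd.cut(df["Age"], bins=[20, 30, 40, 50, 60, 90], right=False)
by_age_gender = response_rate(df.assign(Age_Band=age_band), ["Age_Band", "Gender"])

# Question 2: interest by sales channel, most successful first.
by_channel = response_rate(df, "Policy_Sales_Channel").sort_values(ascending=False)

print(by_age_gender)
print(by_channel)
```

On the real data, the same grouping would be applied to the DataFrame loaded from the training file, and channels with very few customers would probably need to be filtered out before their rates are compared.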
The analysis requires Python 3. No libraries beyond those included in the Anaconda distribution are needed.
- 0-Initial.ipynb: This is an initial exploratory analysis of the training data, to see how the response correlates with the other variables.
- 1-Age-Insurance.ipynb: Here I do a further exploratory analysis to get information about the age distribution of customers and to answer question 1.
- 2-Channels-Insurance.ipynb: Here I analyse the Policy Sales Channels in order to answer question 2.
- 3-Predicting-Response.ipynb: Here I preprocess the data and use various models in order to answer question 3.
- data: Contains train.csv and test.csv (obtained from Kaggle, see Acknowledgements).
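As a rough illustration of the preprocess-then-model step described for 3-Predicting-Response.ipynb, the sketch below builds a scikit-learn pipeline on synthetic data. It is not the notebook's code: the column names, the rule that generates the toy target, and the choice of logistic regression are all assumptions, and the notebook itself tries various models.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for data/train.csv; column names are assumed.
rng = np.random.RandomState(0)
n = 200
df = pd.DataFrame({
    "Age": rng.randint(20, 80, size=n),
    "Gender": rng.choice(["Male", "Female"], size=n),
    "Vehicle_Damage": rng.choice(["Yes", "No"], size=n),
})
# Made-up rule so that the toy target carries some learnable signal.
df["Response"] = ((df["Vehicle_Damage"] == "Yes") & (df["Age"] > 35)).astype(int)

numeric = ["Age"]
categorical = ["Gender", "Vehicle_Damage"]

# Scale numeric columns and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hold out a stratified validation split to estimate performance.
X_train, X_valid, y_train, y_valid = train_test_split(
    df[numeric + categorical], df["Response"],
    test_size=0.25, random_state=0, stratify=df["Response"],
)
model.fit(X_train, y_train)
valid_pred = model.predict(X_valid)
valid_accuracy = model.score(X_valid, y_valid)
print(f"Validation accuracy: {valid_accuracy:.2f}")
```

Keeping the preprocessing inside the pipeline means the scaler and encoder are fitted on the training split only, so the same fitted object can later be applied unchanged to new customers. If the real response is imbalanced, accuracy alone may be misleading and a metric such as ROC AUC would be worth reporting as well.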
The findings of this analysis are discussed in an article that can be found here.
The data used in the analysis and other related information can be found on Kaggle here.