Customer information is valuable to most businesses. For an insurance company, customer attributes like the ones in the dataset below can be crucial in making business decisions. The goal here is to explore the relationships and correlations between the attributes in the dataset and to summarize what the data contains.
Link to Dataset: https://www.kaggle.com/datasets/mirichoi0218/insurance
Dataset Information:
- age: age of the primary beneficiary
- sex: gender of the insurance contractor (female, male)
- bmi: body mass index, an objective index of body weight relative to height, computed as weight divided by height squared (kg / m^2); ideally 18.5 to 24.9
- children: number of children covered by health insurance (number of dependents)
- smoker: whether the beneficiary smokes
- region: the beneficiary's residential area in the US (northeast, southeast, southwest, northwest)
- charges: individual medical costs billed by health insurance
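
As a quick illustration of the bmi attribute, the index is weight in kilograms divided by the square of height in metres. The sketch below is not part of the dataset or its tooling; the function names `bmi` and `in_ideal_range` are hypothetical, and the 18.5 to 24.9 bounds are the ideal range quoted in the description above.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def in_ideal_range(value: float, low: float = 18.5, high: float = 24.9) -> bool:
    """True if a BMI value falls inside the ideal range quoted above."""
    return low <= value <= high


# Illustrative inputs only (not taken from the dataset).
example = bmi(70.0, 1.75)
print(round(example, 2), in_ideal_range(example))
```

The dataset already stores bmi as a computed column, so a helper like this would only be useful for sanity-checking or interpreting individual values.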
Our dataset has the following types of variables:
- Categorical variables: sex, smoker, region, children
- Quantitative variables: age, bmi, charges. Here children is a discrete variable (a count), which is why it is grouped with the categorical variables, whereas age, bmi, and charges are continuous variables.
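
The grouping above can be sketched in pandas as follows. This is a minimal sketch, not the analysis itself: the helper name `summarize_variables` is hypothetical, and the four rows in `sample` are made up purely so the snippet runs on its own.

```python
import pandas as pd

# Variable grouping as described above.
CATEGORICAL = ["sex", "smoker", "region", "children"]
CONTINUOUS = ["age", "bmi", "charges"]


def summarize_variables(df: pd.DataFrame) -> dict:
    """Cast categorical columns and return a small per-group summary."""
    df = df.copy()
    for col in CATEGORICAL:
        df[col] = df[col].astype("category")
    return {
        # Number of distinct levels for each categorical variable.
        "categorical": {col: df[col].nunique() for col in CATEGORICAL},
        # Basic statistics for each continuous variable.
        "continuous": df[CONTINUOUS].describe().loc[["mean", "min", "max"]],
    }


# Made-up rows for illustration only; these are NOT real records.
sample = pd.DataFrame(
    {
        "age": [19, 33, 45, 52],
        "sex": ["female", "male", "male", "female"],
        "bmi": [27.9, 22.7, 30.1, 25.3],
        "children": [0, 1, 3, 1],
        "smoker": ["yes", "no", "no", "no"],
        "region": ["southwest", "northwest", "southeast", "northeast"],
        "charges": [16884.92, 4449.46, 8240.59, 11090.72],
    }
)

summary = summarize_variables(sample)
print(summary["categorical"])
print(summary["continuous"])
```

To run this on the real data, one would replace `sample` with something like `pd.read_csv("insurance.csv")`, assuming the file from the Kaggle link has been downloaded and saved under that name.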