# Medical Insurance Cost Prediction Analysis

- Author : Ibrahim
- Date : 09 January 2026
- Tools : Python (Variables, Control Flow)
- Data Source : [Medical Cost Personal Datasets (Kaggle)](https://www.kaggle.com/mirichoi0218/insurance)

## Project Background
The objective of this analysis is to quantify how individual attributes (age, sex, BMI, number of children, and smoking status) affect projected medical insurance costs. Using a predictive cost formula derived from historical data, we run a one-at-a-time sensitivity analysis: change a single input, hold the others at their baseline values, and compare the new estimate with the baseline to see which factors contribute most to premium increases.

## 1. Initial Data Setup

Define the baseline patient parameters used throughout the analysis.

Baseline profile:
- Age: 28
- Sex: Female (0)
- BMI: 26.2
- Children: 3
- Smoker: Non-smoker (0)

In [1]:
age = 28  # years
sex = 0  # 0 = female, 1 = male
bmi = 26.2  # body mass index
num_of_children = 3
smoker = 0  # 0 = non-smoker, 1 = smoker

## 2. Cost Estimation Model
Implementing the insurance cost formula to calculate the baseline premium.

$$
\begin{aligned}
insurance\_cost &= 250 \times age - 128 \times sex \\
&\quad + 370 \times bmi + 425 \times num\_of\_children \\
&\quad + 24000 \times smoker - 12500
\end{aligned}
$$

In [2]:
insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500

In [3]:
print("This person's insurance cost is " + str(insurance_cost) + " dollars.")

This person's insurance cost is 5469.0 dollars.
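
The same formula is re-typed in every later cell, so one option is to wrap it in a function. The sketch below is only an illustration: the name `estimate_insurance_cost` is hypothetical and not part of the original notebook, and the coefficients are copied unchanged from the formula above.

```python
def estimate_insurance_cost(age, sex, bmi, num_of_children, smoker):
    """Return the estimated insurance cost in dollars.

    Hypothetical helper: it only wraps the formula from this section.
    sex: 0 = female, 1 = male; smoker: 0 = non-smoker, 1 = smoker.
    """
    return (250 * age - 128 * sex + 370 * bmi
            + 425 * num_of_children + 24000 * smoker - 12500)


# Baseline profile from section 1.
baseline_cost = estimate_insurance_cost(
    age=28, sex=0, bmi=26.2, num_of_children=3, smoker=0
)
print(f"Baseline estimate: {baseline_cost:.2f} dollars")  # Baseline estimate: 5469.00 dollars
```

With a helper like this, each scenario becomes a single call with one argument changed, which removes the risk of a typo in a repeated formula.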


## 3. Sensitivity Analysis: Age Factor
Analyzing the financial impact of increasing the patient's age by 4 years, assuming all other health metrics remain constant.

In [4]:
age += 4

In [5]:
new_insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500

In [6]:
change_in_insurance_cost = new_insurance_cost - insurance_cost

In [7]:
print("The change in cost of insurance after increasing the age by 4 years is " + str(change_in_insurance_cost) + " dollars.")

The change in cost of insurance after increasing the age by 4 years is 1000.0 dollars.
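
Because the formula is linear in each input, the change depends only on the age coefficient and the size of the step, not on the rest of the profile. A short check of the figure above, where $\Delta$ simply denotes "new value minus baseline value":

$$
\Delta insurance\_cost = 250 \times \Delta age = 250 \times 4 = 1000
$$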


## 4. Sensitivity Analysis: BMI Factor
Calculating the change in insurance cost resulting from a 3.1 point increase in Body Mass Index (BMI).

In [8]:
age = 28  # reset age to the baseline so that only BMI changes
bmi += 3.1

In [9]:
new_insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500 

In [10]:
change_in_insurance_cost = new_insurance_cost - insurance_cost

In [11]:
print("The change in estimated cost after increasing BMI by 3.1 is " + str(change_in_insurance_cost) + " dollars.")

The change in estimated cost after increasing BMI by 3.1 is 1147.0 dollars.
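
The cells above change `age` and `bmi` in place and then reset them by hand, which is easy to forget. A possible alternative, sketched here under the assumption that the profile is stored in a dictionary, leaves the baseline untouched and applies each change to a copy. The names `BASELINE`, `estimate_insurance_cost`, and `cost_change` are hypothetical.

```python
# Hypothetical container for the baseline profile from section 1.
BASELINE = {"age": 28, "sex": 0, "bmi": 26.2, "num_of_children": 3, "smoker": 0}


def estimate_insurance_cost(age, sex, bmi, num_of_children, smoker):
    """Cost formula from section 2, unchanged."""
    return (250 * age - 128 * sex + 370 * bmi
            + 425 * num_of_children + 24000 * smoker - 12500)


def cost_change(**overrides):
    """Change in estimated cost when the given fields replace baseline values."""
    scenario = {**BASELINE, **overrides}  # copy; BASELINE itself is not modified
    return estimate_insurance_cost(**scenario) - estimate_insurance_cost(**BASELINE)


print(f"{cost_change(age=BASELINE['age'] + 4):.2f}")    # 1000.00
print(f"{cost_change(bmi=BASELINE['bmi'] + 3.1):.2f}")  # 1147.00
```

Since every scenario starts from the same untouched baseline, no reset step is needed between experiments.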


## 5. Gender Impact Analysis
Comparing the estimated insurance premium for a Male patient vs a Female patient with identical profiles.

In [12]:
bmi = 26.2  # reset BMI to the baseline so that only sex changes
sex = 1  # 1 = male

In [13]:
new_insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500

Observation on gender: switching the sex variable from Female (0) to Male (1) lowers the estimate by 128 dollars, which is exactly the sex coefficient in the formula (the difference is computed in the next cells). Within this specific formula, a male patient therefore receives a slightly lower estimate than a female patient with otherwise identical parameters.

In [14]:
change_in_insurance_cost = new_insurance_cost - insurance_cost

In [15]:
print("The cost difference when the patient is male instead of female is " + str(change_in_insurance_cost) + " dollars.")

The cost difference when the patient is male instead of female is -128.0 dollars.


## Conclusion
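Based on the sensitivity analysis of the baseline profile, we found that:
1. Age: each additional year adds 250 dollars, so 4 more years raised the estimate by 1000 dollars.
2. BMI: each additional BMI point adds 370 dollars, so a 3.1 point increase raised the estimate by 1147 dollars.
3. Gender: switching from female to male lowered the estimate by 128 dollars, the smallest effect of the three scenarios.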

Future recommendation: run the same comparison for smoking status, whose coefficient (24000) is by far the largest in the formula, and for number of children (425). Neither input was varied in this analysis.
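
A minimal sketch of that follow-up for smoking status, using the same formula and baseline profile. The dictionary layout and the names `COEFFICIENTS`, `INTERCEPT`, and `estimate_insurance_cost` are assumptions made for illustration. The numbers come straight from the formula in section 2.

```python
# Coefficients and intercept copied from the formula in section 2.
COEFFICIENTS = {"age": 250, "sex": -128, "bmi": 370,
                "num_of_children": 425, "smoker": 24000}
INTERCEPT = -12500


def estimate_insurance_cost(profile):
    """Apply the linear cost formula to a profile given as a dict."""
    return sum(COEFFICIENTS[name] * profile[name] for name in COEFFICIENTS) + INTERCEPT


baseline = {"age": 28, "sex": 0, "bmi": 26.2, "num_of_children": 3, "smoker": 0}
smoker_profile = {**baseline, "smoker": 1}  # same patient, but a smoker

smoking_change = estimate_insurance_cost(smoker_profile) - estimate_insurance_cost(baseline)
print(f"Change in cost for a smoker: {smoking_change:.2f} dollars")  # 24000.00
```

Under this formula the smoking change equals the smoker coefficient itself, which is 24 times the 4-year age effect computed in section 3.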