# U.S. Medical Insurance Costs

## Introduction
This notebook explores trends and relationships in a U.S. medical insurance dataset. Each record includes age, sex, BMI, number of children, smoking status, region, and insurance charges.

The goal is to answer several exploratory questions to uncover patterns related to smoking, gender, region, and insurance cost.


## Goals
We aim to answer the following questions:
- Do people in the Southwest smoke more than people in the Northwest?
- What is the average age of someone with at least one child?
- What is the average insurance cost for smokers in the Northwest?
- What is the average BMI for people who smoke?
- What is the average BMI for people who do not smoke?
- What is the average insurance cost for females?
- What is the average insurance cost for males?
- Is there a significant difference in average insurance cost between smokers and non-smokers?
- Which region has the highest number of smokers?
- Are there more male smokers than female smokers?


## Load Data



In [None]:
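import csv

# newline="" is the setting the csv module documentation recommends for file objects.
with open("insurance.csv", newline="") as insurance_data:
    insurance_data_object = csv.DictReader(insurance_data)
    # Materialize the rows so they remain available after the file closes.
    insurance_object = list(insurance_data_object)
# Preview the first few rows instead of printing the whole dataset.
print(insurance_object[:3])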
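`csv.DictReader` returns every field as a string, which is why the later cells convert `age`, `children`, `bmi`, and `charges` before doing arithmetic. One possible alternative, sketched below, is to convert all numeric columns once at load time. The helper name `load_rows` and the two sample rows are hypothetical and made up for illustration; the column names are assumed from the keys used elsewhere in this notebook.

```python
import csv
import io

# Hypothetical sample in the assumed column layout of insurance.csv.
# These values are invented for illustration; they are not real records.
SAMPLE_CSV = """age,sex,bmi,children,smoker,region,charges
30,female,25.0,0,yes,southwest,15000.0
40,male,30.0,2,no,northwest,8000.0
"""

def load_rows(file_obj):
    """Read CSV rows and convert the numeric columns from strings."""
    rows = []
    for row in csv.DictReader(file_obj):
        row["age"] = int(row["age"])
        row["children"] = int(row["children"])
        row["bmi"] = float(row["bmi"])
        row["charges"] = float(row["charges"])
        rows.append(row)
    return rows

# io.StringIO stands in for an open file so the sketch runs without insurance.csv.
sample_rows = load_rows(io.StringIO(SAMPLE_CSV))
print(sample_rows[0]["age"], sample_rows[1]["charges"])
```

With the real file, the same helper could be called as `load_rows(open_file)` inside the `with open(...)` block, assuming the file has no missing values in those four columns.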

## Exploratory Data Analysis (EDA)

### Do people in the Southwest smoke more than people in the Northwest?

In [9]:
southwest_smokers = []
northwest_smokers = []
# Collect the smokers from each of the two regions being compared.
for i in insurance_object:
    if i["smoker"] == "yes" and i["region"] == "southwest":
        southwest_smokers.append(i)
    elif i["smoker"] == "yes" and i["region"] == "northwest":
        northwest_smokers.append(i)
# Print the number of smokers in the Southwest, then the Northwest.
print(len(southwest_smokers), len(northwest_smokers))

58 58
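The cell above compares raw smoker counts (58 and 58). Counts only answer "who smokes more" fairly if the two regions have similar numbers of records, so a smoking rate (smokers divided by all residents of the region) may be a useful complement. The sketch below shows one way to compute it; the function name `smoking_rates` and the sample rows are hypothetical and made up for illustration.

```python
from collections import Counter

# Hypothetical rows containing only the two keys this sketch needs.
rows = [
    {"smoker": "yes", "region": "southwest"},
    {"smoker": "no", "region": "southwest"},
    {"smoker": "yes", "region": "northwest"},
    {"smoker": "no", "region": "northwest"},
    {"smoker": "no", "region": "northwest"},
    {"smoker": "no", "region": "northwest"},
]

def smoking_rates(rows):
    """Return {region: share of that region's rows marked as smokers}."""
    totals = Counter(r["region"] for r in rows)
    smokers = Counter(r["region"] for r in rows if r["smoker"] == "yes")
    # Counter returns 0 for regions with no smokers, so the division is safe.
    return {region: smokers[region] / totals[region] for region in totals}

rates = smoking_rates(rows)
print(rates["southwest"], rates["northwest"])
```

Passing `insurance_object` instead of the sample rows would give the rates for the real dataset.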


### What is the average age of someone with at least one child?

In [19]:
# Convert the numeric columns from strings to integers.
# Assumes every row has a value in both columns; safe to re-run,
# because int() leaves existing integers unchanged.
for i in insurance_object:
    i["children"] = int(i["children"])
    i["age"] = int(i["age"])

people_with_children = []
sum_of_ages = 0

# Sum the ages of everyone with at least one child.
for j in insurance_object:
    if j["children"] > 0:
        people_with_children.append(j)
        sum_of_ages += j["age"]
average_age = round(sum_of_ages / len(people_with_children), 1)
print(average_age)

39.2
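The same average can be written more compactly with a list comprehension and `statistics.mean` from the standard library. This is only a sketch of an alternative style, not a replacement for the cell above; the sample rows are hypothetical and assume `age` and `children` are already integers.

```python
from statistics import mean

# Hypothetical rows with already-converted integer fields.
rows = [
    {"age": 30, "children": 0},
    {"age": 40, "children": 2},
    {"age": 51, "children": 1},
]

# Keep only the ages of people with at least one child.
ages_of_parents = [r["age"] for r in rows if r["children"] > 0]

# mean() raises StatisticsError on an empty list, so guard against that case.
average_parent_age = round(mean(ages_of_parents), 1) if ages_of_parents else None
print(average_parent_age)
```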


### What is the average insurance cost for smokers in the Northwest?

In [None]:
sum_of_insurance_cost = 0
count = 0

# Convert charges from strings to floats.
for i in insurance_object:
    if i["charges"]:
        i["charges"] = float(i["charges"])

# Average the charges of smokers who live in the Northwest.
for j in insurance_object:
    if j["smoker"] == "yes" and j["region"] == "northwest":
        sum_of_insurance_cost += j["charges"]
        count += 1
average_insurance_northwest = round(sum_of_insurance_cost / count, 2)
print(average_insurance_northwest)


### What is the average BMI for people who smoke?

In [30]:
bmi_sum = 0
count = 0

# Convert BMI from strings to floats.
for i in insurance_object:
    if i["bmi"]:
        i["bmi"] = float(i["bmi"])

# Sum the BMI of every smoker and count how many there are.
for j in insurance_object:
    if j["smoker"] == "yes":
        bmi_sum += j["bmi"]
        count += 1
average_smoker_bmi = round(bmi_sum / count, 2)
print(average_smoker_bmi)

30.71
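Several of the questions in this notebook follow the same pattern: split the rows by a category and average a numeric column within each group. The sketch below shows one way to do that in a single pass, which would give the smoker and non-smoker BMI averages together. The helper name `average_by` and the sample rows are hypothetical and made up for illustration; the helper assumes the numeric column has already been converted from strings.

```python
from collections import defaultdict

# Hypothetical rows; bmi is assumed to be a float already.
rows = [
    {"smoker": "yes", "bmi": 30.0},
    {"smoker": "yes", "bmi": 32.0},
    {"smoker": "no", "bmi": 28.0},
]

def average_by(rows, group_key, value_key):
    """Return {group: mean of value_key}, computed in a single pass."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[value_key]
        counts[row[group_key]] += 1
    # Every group in totals has at least one row, so counts[group] >= 1.
    return {group: totals[group] / counts[group] for group in totals}

bmi_by_smoker = average_by(rows, "smoker", "bmi")
print(bmi_by_smoker)
```

The same call with different keys, for example `average_by(insurance_object, "sex", "charges")`, would cover the per-sex cost questions, assuming the dataset's column is named `sex` and `charges` has been converted to floats.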
