# U.S. Medical Insurance Cost Analysis

## Dataset Overview
The Medical Cost Personal Dataset is an aggregation of medical data from 2017. Each row in the dataset represents a single beneficiary, and each column represents one insurance cost variable. The first six (6) columns hold the variables that determine the final insurance cost, which is stored in the seventh (7) and final column. In total, the dataset contains 1,340 rows of data and seven columns. The variables in the dataset are:

		Age: Age of the primary beneficiary
		Sex: Insurance beneficiary's gender: female, male
		Body Mass Index (BMI): Body mass index of the insurance beneficiary (BMI = weight in kg / (height in m)^2)
		Children: Number of children / dependents covered by the beneficiary's health insurance
		Smoker: Whether the beneficiary smokes: yes, no
		Region: The beneficiary's residential area in the United States:
			Northeast, Southeast, Southwest, Northwest.
		Charges: Beneficiary's medical costs billed by health insurance (yearly)

Additional information about the Medical Cost Personal Dataset, including licensing, authors, collaborators, and more, can be found on the Kaggle website.

## Project Scope
The scope of the Medical Cost Analysis project is to analyze how different health variables, such as age, sex, BMI, number of children, smoker status, and region, affect yearly medical insurance costs. The project includes individual and relational analysis of the variables to better understand how insurance costs are impacted. Over the course of this project, the following analyses will be performed:

  **Demographic Overview:**

  **Medical Cost Analysis**

## Building Data Structure

### Variable Lists

In [227]:
age = list()
sex = list()
body_mass_index = list()
nummber_of_children = list()
smoker_status = list()
geographic_region = list()
medical_charges = list()

### Building Lists From CSV

In [228]:
import csv
with open('insurance.csv') as medical_data_raw:
  medical_data = csv.DictReader(medical_data_raw)
  for item in medical_data:
    age.append(int(item['age']))
    sex.append(item['sex'])
    body_mass_index.append(round(float(item['bmi']), 1))
    nummber_of_children.append(int(item['children']))
    smoker_status.append(item['smoker'])
    geographic_region.append(item['region'])
    medical_charges.append(round(float(item['charges']), 2))
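
The parsing step above can be tried without the real file. The following is a minimal sketch, assuming the same column names as `insurance.csv`; the helper name `load_columns` and the two sample rows (copied from the rounded values in the printed output, so the real file may carry more decimals) are illustrative only.

```python
import csv
import io

# Two sample rows using the rounded values shown in the notebook output.
# The real insurance.csv may store more decimal places.
SAMPLE_CSV = """age,sex,bmi,children,smoker,region,charges
19,female,27.9,0,yes,southwest,16884.92
18,male,33.8,1,no,southeast,1725.55
"""

def load_columns(handle):
    """Read insurance rows into one list per column, converting types."""
    columns = {"age": [], "sex": [], "bmi": [], "children": [],
               "smoker": [], "region": [], "charges": []}
    for row in csv.DictReader(handle):
        columns["age"].append(int(row["age"]))
        columns["sex"].append(row["sex"])
        columns["bmi"].append(round(float(row["bmi"]), 1))
        columns["children"].append(int(row["children"]))
        columns["smoker"].append(row["smoker"])
        columns["region"].append(row["region"])
        columns["charges"].append(round(float(row["charges"]), 2))
    return columns

# io.StringIO stands in for the open file handle.
columns = load_columns(io.StringIO(SAMPLE_CSV))
print(columns["age"], columns["charges"])
```

Keeping the columns in one dictionary is a design choice for the sketch; it avoids seven module-level lists but is otherwise equivalent to the cell above.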

### Building Master Dictionary

In [229]:
def build_master_dictionary(age, sex, bmi, children, smoker, region, charges):
  master_dict = {}
  item_length = len(age)
  key_counter = 1
  for i in range(item_length):
    master_dict[key_counter] = {"Age": age[i], "Sex": sex[i], "BMI": bmi[i], "Children": children[i], "Smoker": smoker[i], "Region": region[i], "Charges": charges[i]}
    key_counter += 1
  return master_dict
master_dict = build_master_dictionary(age, sex, body_mass_index, nummber_of_children, smoker_status, geographic_region, medical_charges)
print(master_dict)

{1: {'Age': 19, 'Sex': 'female', 'BMI': 27.9, 'Children': 0, 'Smoker': 'yes', 'Region': 'southwest', 'Charges': 16884.92}, 2: {'Age': 18, 'Sex': 'male', 'BMI': 33.8, 'Children': 1, 'Smoker': 'no', 'Region': 'southeast', 'Charges': 1725.55}, 3: {'Age': 28, 'Sex': 'male', 'BMI': 33.0, 'Children': 3, 'Smoker': 'no', 'Region': 'southeast', 'Charges': 4449.46}, 4: {'Age': 33, 'Sex': 'male', 'BMI': 22.7, 'Children': 0, 'Smoker': 'no', 'Region': 'northwest', 'Charges': 21984.47}, 5: {'Age': 32, 'Sex': 'male', 'BMI': 28.9, 'Children': 0, 'Smoker': 'no', 'Region': 'northwest', 'Charges': 3866.86}, 6: {'Age': 31, 'Sex': 'female', 'BMI': 25.7, 'Children': 0, 'Smoker': 'no', 'Region': 'southeast', 'Charges': 3756.62}, 7: {'Age': 46, 'Sex': 'female', 'BMI': 33.4, 'Children': 1, 'Smoker': 'no', 'Region': 'southeast', 'Charges': 8240.59}, 8: {'Age': 37, 'Sex': 'female', 'BMI': 27.7, 'Children': 3, 'Smoker': 'no', 'Region': 'northwest', 'Charges': 7281.51}, 9: {'Age': 37, 'Sex': 'male', 'BMI': 29.8, '
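
As an alternative to indexing each list by position, the same structure can be built with `zip` and `enumerate`. This is a sketch rather than the notebook's method; `build_records` is a hypothetical name, and the two sample beneficiaries are taken from the printed output above. Note that `zip` silently stops at the shortest list, so the lists are assumed to have equal length.

```python
def build_records(age, sex, bmi, children, smoker, region, charges):
    """Return {1: {...}, 2: {...}, ...} with one dict per beneficiary."""
    fields = ("Age", "Sex", "BMI", "Children", "Smoker", "Region", "Charges")
    rows = zip(age, sex, bmi, children, smoker, region, charges)
    # enumerate(..., start=1) reproduces the 1-based keys used above.
    return {key: dict(zip(fields, row)) for key, row in enumerate(rows, start=1)}

records = build_records(
    [19, 18], ["female", "male"], [27.9, 33.8], [0, 1],
    ["yes", "no"], ["southwest", "southeast"], [16884.92, 1725.55],
)

# Print only a few records instead of the whole dictionary.
for key in list(records)[:2]:
    print(key, records[key])
```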

## Analyzing Medical Data

### Key Insights

#### *Average Age of Beneficiary*

In [230]:
def calculate_average(var):
  total = 0
  for v in var:
    total += v
  average = total / len(var)
  return average
average_age = calculate_average(age)
print(f"Average Beneficiary Age: {round(average_age, 2)}")

Average Beneficiary Age: 39.2


#### *Average Body Mass Index of Beneficiary*

In [231]:
average_bmi = calculate_average(body_mass_index)
print(f"Average Beneficiary Body Mass Index: {round(average_bmi, 2)}")

Average Beneficiary Body Mass Index: 30.66


#### *Average Medical Cost of Beneficiary*

In [232]:
average_cost = calculate_average(medical_charges)
print(f"Average Beneficiary Medical Cost: {round(average_cost, 2)}")

Average Beneficiary Medical Cost: 13265.17


#### *Average Dependents Per Beneficiary*

In [233]:
average_child = calculate_average(nummber_of_children)
print(f"Average Beneficiary Dependents: {round(average_child, 2)}")

Average Beneficiary Dependents: 1.09
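
Because a mean can be pulled upward by a few very large values, it may help to report the median alongside it. The sketch below uses the standard library's `statistics` module; the helper name `summarize` is hypothetical, and the sample charges are the first five values from the printed master dictionary, not the full dataset.

```python
from statistics import mean, median

def summarize(values):
    """Return (mean, median), each rounded to two decimals."""
    if not values:
        # Guard against dividing by zero on an empty column.
        raise ValueError("cannot summarize an empty list")
    return round(mean(values), 2), round(median(values), 2)

# First five charges from the printed master dictionary.
sample_charges = [16884.92, 1725.55, 4449.46, 21984.47, 3866.86]
sample_mean, sample_median = summarize(sample_charges)
print(f"mean={sample_mean}, median={sample_median}")
```

In this small sample the mean is more than twice the median, which illustrates why the two figures can tell different stories.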


### Exploratory Analysis

#### *Age Distribution Of Beneficiaries*

In [234]:
def age_distribution_of_benificiaries(age):
  benificiary_age_groups = {'18 to 24': 0, '25 to 34': 0, '35 to 44': 0, '45 to 54': 0, '55 to 64': 0}
  for a in age:
    if a >= 18  and a <= 24:
      benificiary_age_groups['18 to 24'] += 1
    elif a >= 25 and a <= 34:
      benificiary_age_groups['25 to 34'] += 1
    elif a >= 35 and a <= 44:
      benificiary_age_groups['35 to 44'] += 1
    elif a >= 45 and a <= 54:
      benificiary_age_groups['45 to 54'] += 1
    elif a >= 55 and a <= 64:
      benificiary_age_groups['55 to 64'] += 1
  return benificiary_age_groups
benificiary_age_groups = age_distribution_of_benificiaries(age)
print(f"Beneficiary Age Distribution: {benificiary_age_groups}")

Beneficiary Age Distribution: {'18 to 24': 278, '25 to 34': 272, '35 to 44': 260, '45 to 54': 287, '55 to 64': 242}
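
The chain of `elif` branches above can also be driven by a table of ranges, which keeps the group boundaries in one place. This is a sketch under the same inclusive bounds; `AGE_GROUPS` and `age_distribution` are hypothetical names, and the sample ages are the first nine from the printed master dictionary.

```python
# Inclusive (low, high) bounds for each age group.
AGE_GROUPS = [(18, 24), (25, 34), (35, 44), (45, 54), (55, 64)]

def age_distribution(ages):
    """Count ages per group; ages outside every group are ignored."""
    counts = {f"{low} to {high}": 0 for low, high in AGE_GROUPS}
    for a in ages:
        for low, high in AGE_GROUPS:
            if low <= a <= high:
                counts[f"{low} to {high}"] += 1
                break  # each age belongs to at most one group
    return counts

sample_ages = [19, 18, 28, 33, 32, 31, 46, 37, 37]
age_counts = age_distribution(sample_ages)
print(age_counts)
```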


#### *Gender Distribution Of Beneficiaries*

In [235]:
def gender_distribution_of_benificiaries(sex):
  benificiary_gender_groups = {'Male': 0, 'Female': 0}
  for s in sex:
    if s == 'male':
      benificiary_gender_groups['Male'] += 1
    else:
      benificiary_gender_groups['Female'] += 1
  return benificiary_gender_groups
benificiary_gender_groups = gender_distribution_of_benificiaries(sex)
print(f"Beneficiary Gender Distribution: {benificiary_gender_groups}")

Beneficiary Gender Distribution: {'Male': 677, 'Female': 662}


#### *Body Mass Index Distribution Of Beneficiaries*

In [236]:
def bmi_distribution_of_benificiaries(body_mass_index):
  benificiary_bmi_groupings = {'Under Weight': 0, 'Healthy Weight': 0, 'Over Weight': 0, 'Obese': 0, 'Morbidly Obese': 0}
  for bmi in body_mass_index:
    if bmi < 18.5:
      benificiary_bmi_groupings['Under Weight'] += 1
    elif bmi >= 18.5 and bmi <= 24.9:
      benificiary_bmi_groupings['Healthy Weight'] += 1
    elif bmi >= 25.0 and bmi <= 29.9:
      benificiary_bmi_groupings['Over Weight'] += 1
    elif bmi >= 30.0 and bmi <= 39.9:
      benificiary_bmi_groupings['Obese'] += 1
    elif bmi >= 40.0:
      benificiary_bmi_groupings['Morbidly Obese'] += 1
  return benificiary_bmi_groupings
benificiary_bmi_groupings = bmi_distribution_of_benificiaries(body_mass_index)
print(f"Beneficiary Body Mass Index Distribution: {benificiary_bmi_groupings}")

Beneficiary Body Mass Index Distribution: {'Under Weight': 20, 'Healthy Weight': 222, 'Over Weight': 390, 'Obese': 615, 'Morbidly Obese': 92}
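
Since the BMI categories are defined by ascending thresholds, a binary search over the lower bounds can replace the comparison chain. The sketch below uses `bisect.bisect_right` from the standard library with the same cut-offs as the function above; `BMI_CUTOFFS`, `bmi_category`, and `bmi_distribution` are hypothetical names, and the sample values are the first nine BMIs from the printed master dictionary.

```python
from bisect import bisect_right

# Lower bound at which each category after 'Under Weight' begins.
BMI_CUTOFFS = [18.5, 25.0, 30.0, 40.0]
BMI_LABELS = ["Under Weight", "Healthy Weight", "Over Weight", "Obese", "Morbidly Obese"]

def bmi_category(bmi):
    """Map a BMI value to its category label."""
    # bisect_right counts how many cut-offs are <= bmi,
    # which is exactly the index of the matching label.
    return BMI_LABELS[bisect_right(BMI_CUTOFFS, bmi)]

def bmi_distribution(values):
    """Count BMI values per category, keeping zero-count categories."""
    counts = {label: 0 for label in BMI_LABELS}
    for bmi in values:
        counts[bmi_category(bmi)] += 1
    return counts

sample_bmi = [27.9, 33.8, 33.0, 22.7, 28.9, 25.7, 33.4, 27.7, 29.8]
bmi_counts = bmi_distribution(sample_bmi)
print(bmi_counts)
```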


#### *Children Distribution Of Beneficiaries*

In [237]:
def children_distribution_of_benificiaries(children):
  benificiary_children_groupings = {'0': 0, '1': 0, '2': 0, '3': 0, '4': 0, '5+': 0}
  for child in children:
    if child == 0:
      benificiary_children_groupings['0'] += 1
    elif child == 1:
      benificiary_children_groupings['1'] += 1
    elif child == 2:
      benificiary_children_groupings['2'] += 1
    elif child == 3:
      benificiary_children_groupings['3'] += 1
    elif child == 4:
      benificiary_children_groupings['4'] += 1
    elif child >= 5:
      benificiary_children_groupings['5+'] += 1
  return benificiary_children_groupings
benificiary_children_groupings = children_distribution_of_benificiaries(nummber_of_children)
print(f"Beneficiary Child Distribution: {benificiary_children_groupings}")

Beneficiary Child Distribution: {'0': 575, '1': 324, '2': 240, '3': 157, '4': 25, '5+': 18}


#### *Smoking Distribution Of Beneficiaries*

In [238]:
def smoker_distribution_of_benificiaries(smoker):
  benificiary_smoker_groupings = {'Smoker': 0, 'Non-Smoker': 0}
  for s in smoker:
    if s == 'yes':
      benificiary_smoker_groupings['Smoker'] += 1
    else:
      benificiary_smoker_groupings['Non-Smoker'] += 1
  return benificiary_smoker_groupings
benificiary_smoker_groupings = smoker_distribution_of_benificiaries(smoker_status)
print(f"Beneficiary Smoker Status: {benificiary_smoker_groupings}")

Beneficiary Smoker Status: {'Smoker': 274, 'Non-Smoker': 1065}


#### *Regional Distribution Of Beneficiaries*

In [239]:
def regional_distribution_of_benificiaries(region):
  benificiary_region_groupings = {'Northeast': 0, 'Southeast': 0, 'Northwest': 0, 'Southwest': 0}
  for r in region:
    if r == 'northeast':
      benificiary_region_groupings['Northeast'] += 1
    elif r == 'southeast':
      benificiary_region_groupings['Southeast'] += 1
    elif r == 'northwest':
      benificiary_region_groupings['Northwest'] += 1
    elif r == 'southwest':
      benificiary_region_groupings['Southwest'] += 1
  return benificiary_region_groupings
benificiary_region_groupings = regional_distribution_of_benificiaries(geographic_region)
print(f"Beneficiary Regional Distribution: {benificiary_region_groupings}")

Beneficiary Regional Distribution: {'Northeast': 324, 'Southeast': 365, 'Northwest': 325, 'Southwest': 325}
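
For purely categorical columns such as region, sex, or smoker status, `collections.Counter` can do the counting in one pass. This is a sketch, not the notebook's approach; `categorical_distribution` is a hypothetical name and the sample regions are the first eight from the printed master dictionary. One trade-off to keep in mind: unlike the pre-filled dictionaries above, a `Counter` only contains categories that actually occur.

```python
from collections import Counter

def categorical_distribution(values):
    """Count how often each category appears, with capitalized labels."""
    return dict(Counter(value.capitalize() for value in values))

sample_regions = ["southwest", "southeast", "southeast", "northwest",
                  "northwest", "southeast", "southeast", "northwest"]
region_counts = categorical_distribution(sample_regions)
print(region_counts)
```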


#### *Medical Cost Distribution Of Beneficiaries*

In [240]:
def cost_distribution_of_benificiaries(costs):
  benificiary_cost_groupings = {'$1,000.00 - $14,999.99': 0, '$15,000.00 - $29,999.99': 0, '$30,000.00 - $44,999.99': 0, '$45,000.00 - $59,999.99': 0, '$60,000.00 - $74,999.99': 0}
  for c in costs:
    if c >= 1000.00 and c <= 14999.99:
      benificiary_cost_groupings['$1,000.00 - $14,999.99'] += 1
    elif c >= 15000.00 and c <= 29999.99:
      benificiary_cost_groupings['$15,000.00 - $29,999.99'] += 1
    elif c >= 30000.00 and c <= 44999.99:
      benificiary_cost_groupings['$30,000.00 - $44,999.99'] += 1
    elif c >= 45000.00 and c <= 59999.99:
      benificiary_cost_groupings['$45,000.00 - $59,999.99'] += 1
    elif c >= 60000.00 and c <= 74999.99:
      benificiary_cost_groupings['$60,000.00 - $74,999.99'] += 1
  return benificiary_cost_groupings
benificiary_cost_groupings = cost_distribution_of_benificiaries(medical_charges)
print(f"Beneficiary Cost Distribution: {benificiary_cost_groupings}")

Beneficiary Cost Distribution: {'$1,000.00 - $14,999.99': 981, '$15,000.00 - $29,999.99': 196, '$30,000.00 - $44,999.99': 124, '$45,000.00 - $59,999.99': 35, '$60,000.00 - $74,999.99': 3}
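
Raw counts are easier to compare once they are expressed as shares of the total. The sketch below converts the cost distribution printed above into percentages; `to_percentages` is a hypothetical helper, and the counts are copied from that output.

```python
def to_percentages(counts):
    """Convert a {label: count} mapping to {label: percent of total}."""
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in counts}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

# Counts copied from the printed cost distribution.
cost_counts = {
    "$1,000.00 - $14,999.99": 981,
    "$15,000.00 - $29,999.99": 196,
    "$30,000.00 - $44,999.99": 124,
    "$45,000.00 - $59,999.99": 35,
    "$60,000.00 - $74,999.99": 3,
}
cost_percentages = to_percentages(cost_counts)
for label, percent in cost_percentages.items():
    print(f"{label}: {percent}%")
```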


### Medical Cost Analysis

#### *Relationship Between Age & Medical Charges*

In [241]:
def age_grouping_medical_charge_distribution (cost, age):
  benificiary_cost_grouping_by_age = {'$1,000.00 - $14,999.99': {'18 to 24': 0, '25 to 34': 0, '35 to 44': 0, '45 to 54': 0, '55 to 64': 0},
                                      '$15,000.00 - $29,999.99': {'18 to 24': 0, '25 to 34': 0, '35 to 44': 0, '45 to 54': 0, '55 to 64': 0},
                                      '$30,000.00 - $44,999.99': {'18 to 24': 0, '25 to 34': 0, '35 to 44': 0, '45 to 54': 0, '55 to 64': 0},
                                      '$45,000.00 - $59,999.99': {'18 to 24': 0, '25 to 34': 0, '35 to 44': 0, '45 to 54': 0, '55 to 64': 0},
                                      '$60,000.00 - $74,999.99': {'18 to 24': 0, '25 to 34': 0, '35 to 44': 0, '45 to 54': 0, '55 to 64': 0}}
  for c, a in zip(cost, age):
    if c >= 1000.00 and c <= 14999.99:
      if a >= 18  and a <= 24:
        benificiary_cost_grouping_by_age['$1,000.00 - $14,999.99']['18 to 24'] += 1
      elif a >= 25 and a <= 34:
        benificiary_cost_grouping_by_age['$1,000.00 - $14,999.99']['25 to 34'] += 1
      elif a >= 35 and a <= 44:
        benificiary_cost_grouping_by_age['$1,000.00 - $14,999.99']['35 to 44'] += 1
      elif a >= 45 and a <= 54:
        benificiary_cost_grouping_by_age['$1,000.00 - $14,999.99']['45 to 54'] += 1
      elif a >= 55 and a <= 64:
        benificiary_cost_grouping_by_age['$1,000.00 - $14,999.99']['55 to 64'] += 1
    elif c >= 15000.00 and c <= 29999.99:
      if a >= 18  and a <= 24:
        benificiary_cost_grouping_by_age['$15,000.00 - $29,999.99']['18 to 24'] += 1
      elif a >= 25 and a <= 34:
        benificiary_cost_grouping_by_age['$15,000.00 - $29,999.99']['25 to 34'] += 1
      elif a >= 35 and a <= 44:
        benificiary_cost_grouping_by_age['$15,000.00 - $29,999.99']['35 to 44'] += 1
      elif a >= 45 and a <= 54:
        benificiary_cost_grouping_by_age['$15,000.00 - $29,999.99']['45 to 54'] += 1
      elif a >= 55 and a <= 64:
        benificiary_cost_grouping_by_age['$15,000.00 - $29,999.99']['55 to 64'] += 1
    elif c >= 30000.00 and c <= 44999.99:
      if a >= 18  and a <= 24:
        benificiary_cost_grouping_by_age['$30,000.00 - $44,999.99']['18 to 24'] += 1
      elif a >= 25 and a <= 34:
        benificiary_cost_grouping_by_age['$30,000.00 - $44,999.99']['25 to 34'] += 1
      elif a >= 35 and a <= 44:
        benificiary_cost_grouping_by_age['$30,000.00 - $44,999.99']['35 to 44'] += 1
      elif a >= 45 and a <= 54:
        benificiary_cost_grouping_by_age['$30,000.00 - $44,999.99']['45 to 54'] += 1
      elif a >= 55 and a <= 64:
        benificiary_cost_grouping_by_age['$30,000.00 - $44,999.99']['55 to 64'] += 1
    elif c >= 45000.00 and c <= 59999.99:
      if a >= 18  and a <= 24:
        benificiary_cost_grouping_by_age['$45,000.00 - $59,999.99']['18 to 24'] += 1
      elif a >= 25 and a <= 34:
        benificiary_cost_grouping_by_age['$45,000.00 - $59,999.99']['25 to 34'] += 1
      elif a >= 35 and a <= 44:
        benificiary_cost_grouping_by_age['$45,000.00 - $59,999.99']['35 to 44'] += 1
      elif a >= 45 and a <= 54:
        benificiary_cost_grouping_by_age['$45,000.00 - $59,999.99']['45 to 54'] += 1
      elif a >= 55 and a <= 64:
        benificiary_cost_grouping_by_age['$45,000.00 - $59,999.99']['55 to 64'] += 1
    elif c >= 60000.00 and c <= 74999.99:
      if a >= 18  and a <= 24:
        benificiary_cost_grouping_by_age['$60,000.00 - $74,999.99']['18 to 24'] += 1
      elif a >= 25 and a <= 34:
        benificiary_cost_grouping_by_age['$60,000.00 - $74,999.99']['25 to 34'] += 1
      elif a >= 35 and a <= 44:
        benificiary_cost_grouping_by_age['$60,000.00 - $74,999.99']['35 to 44'] += 1
      elif a >= 45 and a <= 54:
        benificiary_cost_grouping_by_age['$60,000.00 - $74,999.99']['45 to 54'] += 1
      elif a >= 55 and a <= 64:
        benificiary_cost_grouping_by_age['$60,000.00 - $74,999.99']['55 to 64'] += 1 
  return benificiary_cost_grouping_by_age
benificiary_cost_grouping_by_age = age_grouping_medical_charge_distribution(medical_charges, age)
print(f"Beneficiary Cost Distribution By Age Group: {benificiary_cost_grouping_by_age}")

Beneficiary Cost Distribution By Age Group: {'$1,000.00 - $14,999.99': {'18 to 24': 211, '25 to 34': 200, '35 to 44': 189, '45 to 54': 208, '55 to 64': 173}, '$15,000.00 - $29,999.99': {'18 to 24': 33, '25 to 34': 45, '35 to 44': 37, '45 to 54': 47, '55 to 64': 34}, '$30,000.00 - $44,999.99': {'18 to 24': 34, '25 to 34': 24, '35 to 44': 30, '45 to 54': 22, '55 to 64': 14}, '$45,000.00 - $59,999.99': {'18 to 24': 0, '25 to 34': 3, '35 to 44': 4, '45 to 54': 7, '55 to 64': 21}, '$60,000.00 - $74,999.99': {'18 to 24': 0, '25 to 34': 0, '35 to 44': 0, '45 to 54': 3, '55 to 64': 0}}
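
The nested branches above repeat the same cost-band test for every variable. One possible refactoring, offered as a sketch rather than the notebook's method, is a single cross-tabulation helper that takes category labels the caller has already computed. The names `cost_band` and `cross_tabulate` are hypothetical. Because every band boundary is a multiple of 15,000, the band index is found by integer division, which assumes charges are rounded to two decimals as in the loading cell.

```python
COST_BAND_WIDTH = 15000
COST_LABELS = [
    "$1,000.00 - $14,999.99",
    "$15,000.00 - $29,999.99",
    "$30,000.00 - $44,999.99",
    "$45,000.00 - $59,999.99",
    "$60,000.00 - $74,999.99",
]

def cost_band(charge):
    """Return the cost-band label for a charge, or None if out of range."""
    if charge < 1000 or charge >= COST_BAND_WIDTH * len(COST_LABELS):
        return None
    return COST_LABELS[int(charge // COST_BAND_WIDTH)]

def cross_tabulate(charges, categories, category_labels):
    """Count beneficiaries per cost band and category label."""
    table = {band: {label: 0 for label in category_labels} for band in COST_LABELS}
    for charge, category in zip(charges, categories):
        band = cost_band(charge)
        # Skip charges outside every band and labels that were not requested.
        if band is not None and category in table[band]:
            table[band][category] += 1
    return table

# First eight beneficiaries from the printed master dictionary.
sample_charges = [16884.92, 1725.55, 4449.46, 21984.47, 3866.86, 3756.62, 8240.59, 7281.51]
sample_sex = ["female", "male", "male", "male", "male", "female", "female", "female"]
table = cross_tabulate(sample_charges, [s.capitalize() for s in sample_sex], ["Male", "Female"])
print(table["$1,000.00 - $14,999.99"])
```

For numeric variables such as age or BMI, the caller would first map each value to its group label and then pass those labels as `categories`.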


#### *Relationship Between Gender & Medical Charges*

In [242]:
def gender_grouping_medical_charge_distribution (cost, gender):
  benificiary_cost_groupings_by_gender = {'$1,000.00 - $14,999.99': {'Male': 0, 'Female': 0},
                                          '$15,000.00 - $29,999.99': {'Male': 0, 'Female': 0},
                                          '$30,000.00 - $44,999.99': {'Male': 0, 'Female': 0},
                                          '$45,000.00 - $59,999.99': {'Male': 0, 'Female': 0},
                                          '$60,000.00 - $74,999.99': {'Male': 0, 'Female': 0}}
  for c, g in zip(cost, gender):
    if c >= 1000.00 and c <= 14999.99:
      if g == 'male':
        benificiary_cost_groupings_by_gender['$1,000.00 - $14,999.99']['Male'] += 1
      else:
        benificiary_cost_groupings_by_gender['$1,000.00 - $14,999.99']['Female'] += 1
    elif c >= 15000.00 and c <= 29999.99:
      if g == 'male':
        benificiary_cost_groupings_by_gender['$15,000.00 - $29,999.99']['Male'] += 1
      else:
        benificiary_cost_groupings_by_gender['$15,000.00 - $29,999.99']['Female'] += 1
    elif c >= 30000.00 and c <= 44999.99:
      if g == 'male':
        benificiary_cost_groupings_by_gender['$30,000.00 - $44,999.99']['Male'] += 1
      else:
        benificiary_cost_groupings_by_gender['$30,000.00 - $44,999.99']['Female'] += 1
    elif c >= 45000.00 and c <= 59999.99:
      if g == 'male':
        benificiary_cost_groupings_by_gender['$45,000.00 - $59,999.99']['Male'] += 1
      else:
        benificiary_cost_groupings_by_gender['$45,000.00 - $59,999.99']['Female'] += 1
    elif c >= 60000.00 and c <= 74999.99:
      if g == 'male':
        benificiary_cost_groupings_by_gender['$60,000.00 - $74,999.99']['Male'] += 1
      else:
        benificiary_cost_groupings_by_gender['$60,000.00 - $74,999.99']['Female'] += 1
  return benificiary_cost_groupings_by_gender
benificiary_cost_groupings_by_gender = gender_grouping_medical_charge_distribution (medical_charges, sex)
print(f"Beneficiary Cost Distribution By Gender: {benificiary_cost_groupings_by_gender}")

Beneficiary Cost Distribution By Gender: {'$1,000.00 - $14,999.99': {'Male': 478, 'Female': 503}, '$15,000.00 - $29,999.99': {'Male': 96, 'Female': 100}, '$30,000.00 - $44,999.99': {'Male': 81, 'Female': 43}, '$45,000.00 - $59,999.99': {'Male': 20, 'Female': 15}, '$60,000.00 - $74,999.99': {'Male': 2, 'Female': 1}}


#### *Relationship Between Body Mass Index & Medical Charges*

In [243]:
def bmi_grouping_medical_charge_distribution (cost, body_mass_index):
  benificiary_cost_groupings_by_bmi = {'$1,000.00 - $14,999.99': {'Under Weight': 0, 'Healthy Weight': 0, 'Over Weight': 0, 'Obese': 0, 'Morbidly Obese': 0},
                                       '$15,000.00 - $29,999.99': {'Under Weight': 0, 'Healthy Weight': 0, 'Over Weight': 0, 'Obese': 0, 'Morbidly Obese': 0},
                                       '$30,000.00 - $44,999.99': {'Under Weight': 0, 'Healthy Weight': 0, 'Over Weight': 0, 'Obese': 0, 'Morbidly Obese': 0},
                                       '$45,000.00 - $59,999.99': {'Under Weight': 0, 'Healthy Weight': 0, 'Over Weight': 0, 'Obese': 0, 'Morbidly Obese': 0},
                                       '$60,000.00 - $74,999.99': {'Under Weight': 0, 'Healthy Weight': 0, 'Over Weight': 0, 'Obese': 0, 'Morbidly Obese': 0}}
  for c, bmi in zip(cost, body_mass_index):
    if c >= 1000.00 and c <= 14999.99:
      if bmi < 18.5:
        benificiary_cost_groupings_by_bmi['$1,000.00 - $14,999.99']['Under Weight'] += 1
      elif bmi >= 18.5 and bmi <= 24.9:
        benificiary_cost_groupings_by_bmi['$1,000.00 - $14,999.99']['Healthy Weight'] += 1
      elif bmi >= 25.0 and bmi <= 29.9:
        benificiary_cost_groupings_by_bmi['$1,000.00 - $14,999.99']['Over Weight'] += 1
      elif bmi >= 30.0 and bmi <= 39.9:
        benificiary_cost_groupings_by_bmi['$1,000.00 - $14,999.99']['Obese'] += 1
      elif bmi >= 40.0:
        benificiary_cost_groupings_by_bmi['$1,000.00 - $14,999.99']['Morbidly Obese'] += 1
    elif c >= 15000.00 and c <= 29999.99:
      if bmi < 18.5:
        benificiary_cost_groupings_by_bmi['$15,000.00 - $29,999.99']['Under Weight'] += 1
      elif bmi >= 18.5 and bmi <= 24.9:
        benificiary_cost_groupings_by_bmi['$15,000.00 - $29,999.99']['Healthy Weight'] += 1
      elif bmi >= 25.0 and bmi <= 29.9:
        benificiary_cost_groupings_by_bmi['$15,000.00 - $29,999.99']['Over Weight'] += 1
      elif bmi >= 30.0 and bmi <= 39.9:
        benificiary_cost_groupings_by_bmi['$15,000.00 - $29,999.99']['Obese'] += 1
      elif bmi >= 40.0:
        benificiary_cost_groupings_by_bmi['$15,000.00 - $29,999.99']['Morbidly Obese'] += 1
    elif c >= 30000.00 and c <= 44999.99:
      if bmi < 18.5:
        benificiary_cost_groupings_by_bmi['$30,000.00 - $44,999.99']['Under Weight'] += 1
      elif bmi >= 18.5 and bmi <= 24.9:
        benificiary_cost_groupings_by_bmi['$30,000.00 - $44,999.99']['Healthy Weight'] += 1
      elif bmi >= 25.0 and bmi <= 29.9:
        benificiary_cost_groupings_by_bmi['$30,000.00 - $44,999.99']['Over Weight'] += 1
      elif bmi >= 30.0 and bmi <= 39.9:
        benificiary_cost_groupings_by_bmi['$30,000.00 - $44,999.99']['Obese'] += 1
      elif bmi >= 40.0:
        benificiary_cost_groupings_by_bmi['$30,000.00 - $44,999.99']['Morbidly Obese'] += 1
    elif c >= 45000.00 and c <= 59999.99:
      if bmi < 18.5:
        benificiary_cost_groupings_by_bmi['$45,000.00 - $59,999.99']['Under Weight'] += 1
      elif bmi >= 18.5 and bmi <= 24.9:
        benificiary_cost_groupings_by_bmi['$45,000.00 - $59,999.99']['Healthy Weight'] += 1
      elif bmi >= 25.0 and bmi <= 29.9:
        benificiary_cost_groupings_by_bmi['$45,000.00 - $59,999.99']['Over Weight'] += 1
      elif bmi >= 30.0 and bmi <= 39.9:
        benificiary_cost_groupings_by_bmi['$45,000.00 - $59,999.99']['Obese'] += 1
      elif bmi >= 40.0:
        benificiary_cost_groupings_by_bmi['$45,000.00 - $59,999.99']['Morbidly Obese'] += 1
    elif c >= 60000.00 and c <= 74999.99:
      if bmi < 18.5:
        benificiary_cost_groupings_by_bmi['$60,000.00 - $74,999.99']['Under Weight'] += 1
      elif bmi >= 18.5 and bmi <= 24.9:
        benificiary_cost_groupings_by_bmi['$60,000.00 - $74,999.99']['Healthy Weight'] += 1
      elif bmi >= 25.0 and bmi <= 29.9:
        benificiary_cost_groupings_by_bmi['$60,000.00 - $74,999.99']['Over Weight'] += 1
      elif bmi >= 30.0 and bmi <= 39.9:
        benificiary_cost_groupings_by_bmi['$60,000.00 - $74,999.99']['Obese'] += 1
      elif bmi >= 40.0:
        benificiary_cost_groupings_by_bmi['$60,000.00 - $74,999.99']['Morbidly Obese'] += 1
  return benificiary_cost_groupings_by_bmi
benificiary_cost_grouping_by_bmi = bmi_grouping_medical_charge_distribution (medical_charges, body_mass_index)
print(f"Beneficiary Cost Distribution By Body Mass Index: {benificiary_cost_grouping_by_bmi}")

Beneficiary Cost Distribution By Body Mass Index: {'$1,000.00 - $14,999.99': {'Under Weight': 17, 'Healthy Weight': 166, 'Over Weight': 292, 'Obese': 440, 'Morbidly Obese': 66}, '$15,000.00 - $29,999.99': {'Under Weight': 2, 'Healthy Weight': 54, 'Over Weight': 88, 'Obese': 47, 'Morbidly Obese': 5}, '$30,000.00 - $44,999.99': {'Under Weight': 1, 'Healthy Weight': 2, 'Over Weight': 10, 'Obese': 102, 'Morbidly Obese': 9}, '$45,000.00 - $59,999.99': {'Under Weight': 0, 'Healthy Weight': 0, 'Over Weight': 0, 'Obese': 24, 'Morbidly Obese': 11}, '$60,000.00 - $74,999.99': {'Under Weight': 0, 'Healthy Weight': 0, 'Over Weight': 0, 'Obese': 2, 'Morbidly Obese': 1}}


#### *Relationship Between Dependents & Medical Charges*

In [244]:
def children_grouping_medical_charge_distribution (cost, children):
  benificiary_cost_groupings_by_children = {'$1,000.00 - $14,999.99': {'0': 0, '1': 0, '2': 0, '3': 0, '4': 0, '5+': 0},
                                            '$15,000.00 - $29,999.99': {'0': 0, '1': 0, '2': 0, '3': 0, '4': 0, '5+': 0},
                                            '$30,000.00 - $44,999.99': {'0': 0, '1': 0, '2': 0, '3': 0, '4': 0, '5+': 0},
                                            '$45,000.00 - $59,999.99': {'0': 0, '1': 0, '2': 0, '3': 0, '4': 0, '5+': 0},
                                            '$60,000.00 - $74,999.99': {'0': 0, '1': 0, '2': 0, '3': 0, '4': 0, '5+': 0}}
  for c, ch in zip(cost, children):
    if c >= 1000.00 and c <= 14999.99:
      if ch == 0:
        benificiary_cost_groupings_by_children['$1,000.00 - $14,999.99']['0'] += 1
      elif ch == 1:
        benificiary_cost_groupings_by_children['$1,000.00 - $14,999.99']['1'] += 1
      elif ch == 2:
        benificiary_cost_groupings_by_children['$1,000.00 - $14,999.99']['2'] += 1
      elif ch == 3:
        benificiary_cost_groupings_by_children['$1,000.00 - $14,999.99']['3'] += 1
      elif ch == 4:
        benificiary_cost_groupings_by_children['$1,000.00 - $14,999.99']['4'] += 1
      elif ch >= 5:
        benificiary_cost_groupings_by_children['$1,000.00 - $14,999.99']['5+'] += 1
    elif c >= 15000.00 and c <= 29999.99:
      if ch == 0:
        benificiary_cost_groupings_by_children['$15,000.00 - $29,999.99']['0'] += 1
      elif ch == 1:
        benificiary_cost_groupings_by_children['$15,000.00 - $29,999.99']['1'] += 1
      elif ch == 2:
        benificiary_cost_groupings_by_children['$15,000.00 - $29,999.99']['2'] += 1
      elif ch == 3:
        benificiary_cost_groupings_by_children['$15,000.00 - $29,999.99']['3'] += 1
      elif ch == 4:
        benificiary_cost_groupings_by_children['$15,000.00 - $29,999.99']['4'] += 1
      elif ch >= 5:
        benificiary_cost_groupings_by_children['$15,000.00 - $29,999.99']['5+'] += 1
    elif c >= 30000.00 and c <= 44999.99:
      if ch == 0:
        benificiary_cost_groupings_by_children['$30,000.00 - $44,999.99']['0'] += 1
      elif ch == 1:
        benificiary_cost_groupings_by_children['$30,000.00 - $44,999.99']['1'] += 1
      elif ch == 2:
        benificiary_cost_groupings_by_children['$30,000.00 - $44,999.99']['2'] += 1
      elif ch == 3:
        benificiary_cost_groupings_by_children['$30,000.00 - $44,999.99']['3'] += 1
      elif ch == 4:
        benificiary_cost_groupings_by_children['$30,000.00 - $44,999.99']['4'] += 1
      elif ch >= 5:
        benificiary_cost_groupings_by_children['$30,000.00 - $44,999.99']['5+'] += 1
    elif c >= 45000.00 and c <= 59999.99:
      if ch == 0:
        benificiary_cost_groupings_by_children['$45,000.00 - $59,999.99']['0'] += 1
      elif ch == 1:
        benificiary_cost_groupings_by_children['$45,000.00 - $59,999.99']['1'] += 1
      elif ch == 2:
        benificiary_cost_groupings_by_children['$45,000.00 - $59,999.99']['2'] += 1
      elif ch == 3:
        benificiary_cost_groupings_by_children['$45,000.00 - $59,999.99']['3'] += 1
      elif ch == 4:
        benificiary_cost_groupings_by_children['$45,000.00 - $59,999.99']['4'] += 1
      elif ch >= 5:
        benificiary_cost_groupings_by_children['$45,000.00 - $59,999.99']['5+'] += 1
    elif c >= 60000.00 and c <= 74999.99:
      if ch == 0:
        benificiary_cost_groupings_by_children['$60,000.00 - $74,999.99']['0'] += 1
      elif ch == 1:
        benificiary_cost_groupings_by_children['$60,000.00 - $74,999.99']['1'] += 1
      elif ch == 2:
        benificiary_cost_groupings_by_children['$60,000.00 - $74,999.99']['2'] += 1
      elif ch == 3:
        benificiary_cost_groupings_by_children['$60,000.00 - $74,999.99']['3'] += 1
      elif ch == 4:
        benificiary_cost_groupings_by_children['$60,000.00 - $74,999.99']['4'] += 1
      elif ch >= 5:
        benificiary_cost_groupings_by_children['$60,000.00 - $74,999.99']['5+'] += 1
  return benificiary_cost_groupings_by_children
benificiary_cost_groupings_by_children = children_grouping_medical_charge_distribution(medical_charges, nummber_of_children)
print(f"Beneficiary Cost Distribution By Children: {benificiary_cost_groupings_by_children}")

Beneficiary Cost Distribution By Children: {'$1,000.00 - $14,999.99': {'0': 437, '1': 242, '2': 164, '3': 104, '4': 17, '5+': 17}, '$15,000.00 - $29,999.99': {'0': 74, '1': 48, '2': 37, '3': 30, '4': 6, '5+': 1}, '$30,000.00 - $44,999.99': {'0': 50, '1': 26, '2': 30, '3': 16, '4': 2, '5+': 0}, '$45,000.00 - $59,999.99': {'0': 12, '1': 8, '2': 9, '3': 6, '4': 0, '5+': 0}, '$60,000.00 - $74,999.99': {'0': 2, '1': 0, '2': 0, '3': 1, '4': 0, '5+': 0}}


#### *Relationship Between Region & Medical Charges*

In [245]:
def region_grouping_medical_charge_distribution (cost, region):
  benificiary_cost_groupings_by_region = {'$1,000.00 - $14,999.99': {'Northeast': 0, 'Southeast': 0, 'Northwest': 0, 'Southwest': 0},
                                          '$15,000.00 - $29,999.99': {'Northeast': 0, 'Southeast': 0, 'Northwest': 0, 'Southwest': 0},
                                          '$30,000.00 - $44,999.99': {'Northeast': 0, 'Southeast': 0, 'Northwest': 0, 'Southwest': 0},
                                          '$45,000.00 - $59,999.99': {'Northeast': 0, 'Southeast': 0, 'Northwest': 0, 'Southwest': 0},
                                          '$60,000.00 - $74,999.99': {'Northeast': 0, 'Southeast': 0, 'Northwest': 0, 'Southwest': 0}}
  for c, r in zip(cost, region):
    if c >= 1000.00 and c <= 14999.99:
      if r == 'northeast':
        benificiary_cost_groupings_by_region['$1,000.00 - $14,999.99']['Northeast'] += 1
      elif r == 'southeast':
        benificiary_cost_groupings_by_region['$1,000.00 - $14,999.99']['Southeast'] += 1
      elif r == 'northwest':
        benificiary_cost_groupings_by_region['$1,000.00 - $14,999.99']['Northwest'] += 1
      elif r == 'southwest':
        benificiary_cost_groupings_by_region['$1,000.00 - $14,999.99']['Southwest'] += 1
    elif c >= 15000.00 and c <= 29999.99:
      if r == 'northeast':
        benificiary_cost_groupings_by_region['$15,000.00 - $29,999.99']['Northeast'] += 1
      elif r == 'southeast':
        benificiary_cost_groupings_by_region['$15,000.00 - $29,999.99']['Southeast'] += 1
      elif r == 'northwest':
        benificiary_cost_groupings_by_region['$15,000.00 - $29,999.99']['Northwest'] += 1
      elif r == 'southwest':
        benificiary_cost_groupings_by_region['$15,000.00 - $29,999.99']['Southwest'] += 1
    elif c >= 30000.00 and c <= 44999.99:
      if r == 'northeast':
        benificiary_cost_groupings_by_region['$30,000.00 - $44,999.99']['Northeast'] += 1
      elif r == 'southeast':
        benificiary_cost_groupings_by_region['$30,000.00 - $44,999.99']['Southeast'] += 1
      elif r == 'northwest':
        benificiary_cost_groupings_by_region['$30,000.00 - $44,999.99']['Northwest'] += 1
      elif r == 'southwest':
        benificiary_cost_groupings_by_region['$30,000.00 - $44,999.99']['Southwest'] += 1
    elif c >= 45000.00 and c <= 59999.99:
      if r == 'northeast':
        benificiary_cost_groupings_by_region['$45,000.00 - $59,999.99']['Northeast'] += 1
      elif r == 'southeast':
        benificiary_cost_groupings_by_region['$45,000.00 - $59,999.99']['Southeast'] += 1
      elif r == 'northwest':
        benificiary_cost_groupings_by_region['$45,000.00 - $59,999.99']['Northwest'] += 1
      elif r == 'southwest':
        benificiary_cost_groupings_by_region['$45,000.00 - $59,999.99']['Southwest'] += 1
    elif c >= 60000.00 and c <= 74999.99:
      if r == 'northeast':
        benificiary_cost_groupings_by_region['$60,000.00 - $74,999.99']['Northeast'] += 1
      elif r == 'southeast':
        benificiary_cost_groupings_by_region['$60,000.00 - $74,999.99']['Southeast'] += 1
      elif r == 'northwest':
        benificiary_cost_groupings_by_region['$60,000.00 - $74,999.99']['Northwest'] += 1
      elif r == 'southwest':
        benificiary_cost_groupings_by_region['$60,000.00 - $74,999.99']['Southwest'] += 1
  return benificiary_cost_groupings_by_region
benificiary_cost_groupings_by_region = region_grouping_medical_charge_distribution(medical_charges, geographic_region)
print(f"Beneficiary Cost Distribution By Region: {benificiary_cost_groupings_by_region}")

Beneficiary Cost Distribution By Region: {'$1,000.00 - $14,999.99': {'Northeast': 235, 'Southeast': 249, 'Northwest': 244, 'Southwest': 253}, '$15,000.00 - $29,999.99': {'Northeast': 54, 'Southeast': 55, 'Northwest': 52, 'Southwest': 35}, '$30,000.00 - $44,999.99': {'Northeast': 29, 'Southeast': 45, 'Northwest': 22, 'Southwest': 28}, '$45,000.00 - $59,999.99': {'Northeast': 6, 'Southeast': 14, 'Northwest': 6, 'Southwest': 9}, '$60,000.00 - $74,999.99': {'Northeast': 0, 'Southeast': 2, 'Northwest': 1, 'Southwest': 0}}


#### *Relationship Between Smoking History & Medical Charges*

In [246]:
def smoking_history_medical_charge_distribution(cost, smoking_history):
  benificiary_smoking_groupings_by_cost = {'$1,000.00 - $14,999.99': {'Smoker': 0, 'Non-Smoker': 0},
                                           '$15,000.00 - $29,999.99': {'Smoker': 0, 'Non-Smoker': 0},
                                           '$30,000.00 - $44,999.99': {'Smoker': 0, 'Non-Smoker': 0},
                                           '$45,000.00 - $59,999.99': {'Smoker': 0, 'Non-Smoker': 0},
                                           '$60,000.00 - $74,999.99': {'Smoker': 0, 'Non-Smoker': 0}}
  for c, sh in zip(cost, smoking_history):
    if c >= 1000.00 and c <= 14999.99:
      if sh == 'yes':
        benificiary_smoking_groupings_by_cost['$1,000.00 - $14,999.99']['Smoker'] += 1
      else:
        benificiary_smoking_groupings_by_cost['$1,000.00 - $14,999.99']['Non-Smoker'] += 1
    elif c >= 15000.00 and c <= 29999.99:
      if sh == 'yes':
        benificiary_smoking_groupings_by_cost['$15,000.00 - $29,999.99']['Smoker'] += 1
      else:
        benificiary_smoking_groupings_by_cost['$15,000.00 - $29,999.99']['Non-Smoker'] += 1
    elif c >= 30000.00 and c <= 44999.99:
      if sh == 'yes':
        benificiary_smoking_groupings_by_cost['$30,000.00 - $44,999.99']['Smoker'] += 1
      else:
        benificiary_smoking_groupings_by_cost['$30,000.00 - $44,999.99']['Non-Smoker'] += 1
    elif c >= 45000.00 and c <= 59999.99:
      if sh == 'yes':
        benificiary_smoking_groupings_by_cost['$45,000.00 - $59,999.99']['Smoker'] += 1
      else:
        benificiary_smoking_groupings_by_cost['$45,000.00 - $59,999.99']['Non-Smoker'] += 1
    elif c >= 60000.00 and c <= 74999.99:
      if sh == 'yes':
        benificiary_smoking_groupings_by_cost['$60,000.00 - $74,999.99']['Smoker'] += 1
      else:
        benificiary_smoking_groupings_by_cost['$60,000.00 - $74,999.99']['Non-Smoker'] += 1
  return benificiary_smoking_groupings_by_cost
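# Store the result under its own name so the function is not overwritten
# (rebinding the function name would make re-running this cell fail).
benificiary_smoking_groupings_by_cost = smoking_history_medical_charge_distribution(medical_charges, smoker_status)
print(f"Medical Cost Variation By Smoking Status: {benificiary_smoking_groupings_by_cost}")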

Medical Cost Variation By Smoking Status: {'$1,000.00 - $14,999.99': {'Smoker': 7, 'Non-Smoker': 974}, '$15,000.00 - $29,999.99': {'Smoker': 115, 'Non-Smoker': 81}, '$30,000.00 - $44,999.99': {'Smoker': 114, 'Non-Smoker': 10}, '$45,000.00 - $59,999.99': {'Smoker': 35, 'Non-Smoker': 0}, '$60,000.00 - $74,999.99': {'Smoker': 3, 'Non-Smoker': 0}}
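
To make the pattern in the output above easier to read, the counts can be turned into the share of smokers within each cost band. This is a sketch; `share_within_bands` is a hypothetical helper, and the counts are copied from the printed output.

```python
def share_within_bands(table, category):
    """Return the percentage of `category` within each cost band."""
    shares = {}
    for band, counts in table.items():
        total = sum(counts.values())
        # An empty band has no meaningful share.
        shares[band] = round(100 * counts[category] / total, 1) if total else None
    return shares

# Counts copied from the printed smoking-status output.
smoking_by_cost = {
    "$1,000.00 - $14,999.99": {"Smoker": 7, "Non-Smoker": 974},
    "$15,000.00 - $29,999.99": {"Smoker": 115, "Non-Smoker": 81},
    "$30,000.00 - $44,999.99": {"Smoker": 114, "Non-Smoker": 10},
    "$45,000.00 - $59,999.99": {"Smoker": 35, "Non-Smoker": 0},
    "$60,000.00 - $74,999.99": {"Smoker": 3, "Non-Smoker": 0},
}
smoker_share = share_within_bands(smoking_by_cost, "Smoker")
for band, share in smoker_share.items():
    print(f"{band}: {share}% smokers")
```

With these counts, smokers make up under 1% of the lowest cost band and all of the two highest bands.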
