# Learn Python with U.S. Medical Insurance Costs
***

## Python Syntax: Medical Insurance Project

Suppose you are a medical professional curious about how certain factors contribute to medical insurance costs. Using a formula that estimates a person's yearly insurance costs, you will investigate how different factors such as age, sex, BMI, etc. affect the prediction.

In [1]:
# Create the initial variables for a given patient
age = 28
sex = 0 # 0 for female, 1 for male
bmi = 26.2 
num_of_children = 3
smoker = 0 # 0 for a nonsmoker, 1 for a smoker

In [2]:
# Add insurance estimate formula below
insurance_cost = 250 * age\
                 - 128 * sex\
                 + 370 * bmi\
                 + 425 * num_of_children\
                 + 24000 * smoker - 12500

# Print the result
print(f"This person's insurance cost is {insurance_cost} dollars.")

This person's insurance cost is 5469.0 dollars.


In [3]:
# What happens if the patient is 4 years older?
age += 4 # shorthand for age = age + 4

new_insurance_cost = 250 * age\
                 - 128 * sex\
                 + 370 * bmi\
                 + 425 * num_of_children\
                 + 24000 * smoker - 12500
print(f"This person's insurance cost is {new_insurance_cost} dollars.")

change_in_insurance_cost = new_insurance_cost - insurance_cost
print(f"The change in cost of insurance after increasing the age by 4 years is {change_in_insurance_cost} dollars.")

This person's insurance cost is 6469.0 dollars.
The change in cost of insurance after increasing the age by 4 years is 1000.0 dollars.


In [4]:
# Compare male vs. female while holding every other factor constant
age = 28 # reset age to its original value
sex = 1

new_insurance_cost = 250 * age\
                 - 128 * sex\
                 + 370 * bmi\
                 + 425 * num_of_children\
                 + 24000 * smoker - 12500
print(f"This person's insurance cost is {new_insurance_cost} dollars.")

change_in_insurance_cost = new_insurance_cost - insurance_cost
print(f"The change in estimated cost for being male instead of female is {change_in_insurance_cost} dollars.")

# According to this formula, the estimate for a male patient is 128 dollars lower
# than for an otherwise identical female patient.

This person's insurance cost is 5341.0 dollars.
The change in estimated cost for being male instead of female is -128.0 dollars.
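
The formula is retyped in each of the cells above. One way to avoid the repetition is to wrap it in a function. The sketch below assumes a hypothetical helper named `estimate_cost` (it is not part of the original cells) and reuses the same coefficients.

```python
# Hypothetical helper: same coefficients as the formula used in the cells above.
def estimate_cost(age, sex, bmi, num_of_children, smoker):
    return (250 * age
            - 128 * sex
            + 370 * bmi
            + 425 * num_of_children
            + 24000 * smoker
            - 12500)

# Change one factor at a time and compare against the baseline patient.
base = estimate_cost(age=28, sex=0, bmi=26.2, num_of_children=3, smoker=0)
older = estimate_cost(age=32, sex=0, bmi=26.2, num_of_children=3, smoker=0)
male = estimate_cost(age=28, sex=1, bmi=26.2, num_of_children=3, smoker=0)

print(f"Four years older: {older - base} dollars")
print(f"Male instead of female: {male - base} dollars")
```

Because every factor is passed in explicitly, there is no need to reset `age` back to 28 between comparisons.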


***

## Python Strings: Medical Insurance Project


You are a doctor who needs to clean up medical patient records, which are currently stored in one large string.

In this project, you will use your new knowledge of Python strings to obtain and clean up medical data so that it is easier to read and analyze.

Let's get started!

In [5]:
# The string `medical_data` stores the medical records for ten individuals. 
# Each record is separated by a `;` and contains the name, age, BMI 
# (body mass index), and insurance cost for an individual, in that order.

medical_data = \
"""Marina Allison   ,27   ,   31.1 , 
#7010.0   ;Markus Valdez   ,   30, 
22.4,   #4050.0 ;Connie Ballard ,43 
,   25.3 , #12060.0 ;Darnell Weber   
,   35   , 20.6   , #7500.0;
Sylvie Charles   ,22, 22.1 
,#3022.0   ;   Vinay Padilla,24,   
26.9 ,#4620.0 ;Meredith Santiago, 51   , 
29.3 ,#16330.0;   Andre Mccarty, 
19,22.7 , #2900.0 ; 
Lorena Hodson ,65, 33.1 , #19370.0; 
Isaac Vu ,34, 24.8,   #7045.0"""

print(medical_data)

Marina Allison   ,27   ,   31.1 , 
#7010.0   ;Markus Valdez   ,   30, 
22.4,   #4050.0 ;Connie Ballard ,43 
,   25.3 , #12060.0 ;Darnell Weber   
,   35   , 20.6   , #7500.0;
Sylvie Charles   ,22, 22.1 
,#3022.0   ;   Vinay Padilla,24,   
26.9 ,#4620.0 ;Meredith Santiago, 51   , 
29.3 ,#16330.0;   Andre Mccarty, 
19,22.7 , #2900.0 ; 
Lorena Hodson ,65, 33.1 , #19370.0; 
Isaac Vu ,34, 24.8,   #7045.0


In [6]:
# Clean the string so that it is easier to analyze
# Replace # with $ to show that the value is an insurance cost
updated_medical_data = medical_data.replace('#','$')
print(updated_medical_data)

Marina Allison   ,27   ,   31.1 , 
$7010.0   ;Markus Valdez   ,   30, 
22.4,   $4050.0 ;Connie Ballard ,43 
,   25.3 , $12060.0 ;Darnell Weber   
,   35   , 20.6   , $7500.0;
Sylvie Charles   ,22, 22.1 
,$3022.0   ;   Vinay Padilla,24,   
26.9 ,$4620.0 ;Meredith Santiago, 51   , 
29.3 ,$16330.0;   Andre Mccarty, 
19,22.7 , $2900.0 ; 
Lorena Hodson ,65, 33.1 , $19370.0; 
Isaac Vu ,34, 24.8,   $7045.0


In [7]:
# Calculate the number of medical records in our data
num_records = 0
for character in updated_medical_data:
  if character == '$':
    num_records += 1 
print(f"There are {num_records} medical records in the data.")

There are 10 medical records in the data.
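
The character-by-character loop can probably be replaced by a single call to `str.count()`. The sketch below uses a made-up two-record string (`sample`) in the same layout as `medical_data`, and also counts the records by splitting on the `;` separator as a cross-check.

```python
# Made-up sample in the same format as medical_data (values are for illustration only).
sample = "Ann Lee ,30, 22.0, $100.0 ; Bob Ray ,41, 27.5, $250.0"

num_by_marker = sample.count("$")       # one "$" per record
num_by_split = len(sample.split(";"))   # splitting on ";" yields one piece per record

print(num_by_marker, num_by_split)
```

The two counts agree only when every record has exactly one `$` and there is no trailing `;`, so comparing them is a cheap sanity check on the raw data.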


In [8]:
# Split the string into individual records at each ';'
medical_data_split = updated_medical_data.split(';')
print(medical_data_split)


['Marina Allison   ,27   ,   31.1 , \n$7010.0   ', 'Markus Valdez   ,   30, \n22.4,   $4050.0 ', 'Connie Ballard ,43 \n,   25.3 , $12060.0 ', 'Darnell Weber   \n,   35   , 20.6   , $7500.0', '\nSylvie Charles   ,22, 22.1 \n,$3022.0   ', '   Vinay Padilla,24,   \n26.9 ,$4620.0 ', 'Meredith Santiago, 51   , \n29.3 ,$16330.0', '   Andre Mccarty, \n19,22.7 , $2900.0 ', ' \nLorena Hodson ,65, 33.1 , $19370.0', ' \nIsaac Vu ,34, 24.8,   $7045.0']


In [9]:
# The data is now stored in one list. Split each medical record into its own list.
medical_records = []
for record in medical_data_split:
  medical_records.append(record.split(','))

print(medical_records)

[['Marina Allison   ', '27   ', '   31.1 ', ' \n$7010.0   '], ['Markus Valdez   ', '   30', ' \n22.4', '   $4050.0 '], ['Connie Ballard ', '43 \n', '   25.3 ', ' $12060.0 '], ['Darnell Weber   \n', '   35   ', ' 20.6   ', ' $7500.0'], ['\nSylvie Charles   ', '22', ' 22.1 \n', '$3022.0   '], ['   Vinay Padilla', '24', '   \n26.9 ', '$4620.0 '], ['Meredith Santiago', ' 51   ', ' \n29.3 ', '$16330.0'], ['   Andre Mccarty', ' \n19', '22.7 ', ' $2900.0 '], [' \nLorena Hodson ', '65', ' 33.1 ', ' $19370.0'], [' \nIsaac Vu ', '34', ' 24.8', '   $7045.0']]


In [10]:
# There is unnecessary whitespace; remove it using a nested for loop
medical_records_clean = []

for record in medical_records:
    record_clean = []
    for item in record:
        record_clean.append(item.strip())
    medical_records_clean.append(record_clean)

# Print each record
for record in medical_records_clean:
    print(record)

# Print each record using for loop
#for record in medical_records_clean:
    #print(f"Names: {record[0]}")
    #print(f"Ages: {record[1]}")
    #print(f"BMI: {record[2]}")
    #print(f"Insurance Costs: {record[3]}")  

# To uppercase the names, e.g. 'MARINA ALLISON'
#for record in medical_records_clean:
    #record[0] = record[0].upper()
    #print(record)

# To return the names to title case, e.g. 'Marina Allison'
#for record in medical_records_clean:
    #record[0] = record[0].title()
    #print(record)

# Store each value of the lists to separate lists of the following:
# names, ages, bmis, insurance_costs
names = []
ages = []
bmis = []
insurance_costs = []
for record in medical_records_clean:
    names.append(record[0])
    ages.append(record[1])
    bmis.append(record[2])
    insurance_costs.append(record[3])

# print each list
print(f"Names: {names}")
print(f"Ages: {ages}")
print(f"BMI: {bmis}")
print(f"Insurance Costs: {insurance_costs}")


['Marina Allison', '27', '31.1', '$7010.0']
['Markus Valdez', '30', '22.4', '$4050.0']
['Connie Ballard', '43', '25.3', '$12060.0']
['Darnell Weber', '35', '20.6', '$7500.0']
['Sylvie Charles', '22', '22.1', '$3022.0']
['Vinay Padilla', '24', '26.9', '$4620.0']
['Meredith Santiago', '51', '29.3', '$16330.0']
['Andre Mccarty', '19', '22.7', '$2900.0']
['Lorena Hodson', '65', '33.1', '$19370.0']
['Isaac Vu', '34', '24.8', '$7045.0']
Names: ['Marina Allison', 'Markus Valdez', 'Connie Ballard', 'Darnell Weber', 'Sylvie Charles', 'Vinay Padilla', 'Meredith Santiago', 'Andre Mccarty', 'Lorena Hodson', 'Isaac Vu']
Ages: ['27', '30', '43', '35', '22', '24', '51', '19', '65', '34']
BMI: ['31.1', '22.4', '25.3', '20.6', '22.1', '26.9', '29.3', '22.7', '33.1', '24.8']
Insurance Costs: ['$7010.0', '$4050.0', '$12060.0', '$7500.0', '$3022.0', '$4620.0', '$16330.0', '$2900.0', '$19370.0', '$7045.0']
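
The splitting, stripping, and column-building steps above can also be written more compactly. The sketch below is one possible alternative, not the method used in the cells above. It runs on a made-up two-record string (`raw`) so that it is self-contained.

```python
# Made-up sample in the same "name, age, BMI, cost" layout, with ";" between records.
raw = " Ann Lee ,30 , 22.0 ,\n$100.0 ;  Bob Ray, 41,27.5 , $250.0 "

# Split into records, then into fields, stripping whitespace from every field.
records = [[field.strip() for field in record.split(",")]
           for record in raw.split(";")]

# zip(*records) regroups the rows into columns.
names, ages, bmis, insurance_costs = (list(column) for column in zip(*records))

print(records)
print(names, ages, bmis, insurance_costs)
```

`zip(*records)` only works cleanly when every record has the same number of fields, which holds for this data.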


In [11]:
# Now that all of the data is in separate lists, we can easily perform analysis
# on that data.
# Calculate the average BMI in our dataset.
total_bmi = 0
for bmi_str in bmis:
  total_bmi += float(bmi_str)  # convert each BMI string to a float before adding
avg_bmi = total_bmi/len(bmis)
print(f"Average BMI: {avg_bmi:.2f}")

Average BMI: 25.83


In [12]:
# Calculate the average insurance cost.
# Need to clean the data before analysis by removing special character $
insurance_costs_clean = []
for costs in insurance_costs:
  insurance_costs_clean.append(costs.strip('$'))
print(insurance_costs_clean)

# Other method to do the same thing
#insurance_costs_clean = list(map(lambda each:each.strip('$'),insurance_costs))
#print(insurance_costs_clean)

['7010.0', '4050.0', '12060.0', '7500.0', '3022.0', '4620.0', '16330.0', '2900.0', '19370.0', '7045.0']


In [13]:
# Calculations
# for loop method practiced in this activity
total_cost = 0
for element in insurance_costs_clean:
  total_cost += float(element)
print(insurance_costs_clean)

avg_cost = total_cost/len(insurance_costs_clean)
print(f"The total cost of insurance for this dataset is ${total_cost}.")
print(f"The average cost of insurance for this dataset is ${avg_cost}.")

# list comprehension method
#floats = [float(x) for x in insurance_costs_clean]
#print(floats)

#total_cost = sum(floats)
#avg_cost = total_cost/len(floats)
#print(f"The total cost of insurance for this dataset is ${total_cost}.")
#print(f"The average cost of insurance for this dataset is ${avg_cost}.")


['7010.0', '4050.0', '12060.0', '7500.0', '3022.0', '4620.0', '16330.0', '2900.0', '19370.0', '7045.0']
The total cost of insurance for this dataset is $83907.0.
The average cost of insurance for this dataset is $8390.7.
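
As a further alternative to the manual loop, the standard library's `statistics.mean` can compute the average once the strings have been converted to floats. A minimal sketch, using the cleaned cost strings shown above:

```python
from statistics import mean

# Cost strings as they appear after the "$" has been stripped.
insurance_costs_clean = ['7010.0', '4050.0', '12060.0', '7500.0', '3022.0',
                         '4620.0', '16330.0', '2900.0', '19370.0', '7045.0']

costs = [float(cost) for cost in insurance_costs_clean]
total_cost = sum(costs)
avg_cost = mean(costs)

# The :.2f format shows the amounts with two decimal places, like currency.
print(f"Total: ${total_cost:.2f}")
print(f"Average: ${avg_cost:.2f}")
```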


In [14]:
# Upper case the name in names list
names_upper = list(map(lambda each:each.upper(),names))
print(names_upper)

# Clean the names list to be back to original format
names_clean = list(map(lambda each:each.title(),names_upper))
print(names_clean)

# Write a for loop that outputs a string for each individual
for i in range(len(names_clean)):
  print(f"{names_clean[i]} is {ages[i]} years old with a BMI of {bmis[i]} and an insurance cost of {insurance_costs[i]}.")

['MARINA ALLISON', 'MARKUS VALDEZ', 'CONNIE BALLARD', 'DARNELL WEBER', 'SYLVIE CHARLES', 'VINAY PADILLA', 'MEREDITH SANTIAGO', 'ANDRE MCCARTY', 'LORENA HODSON', 'ISAAC VU']
['Marina Allison', 'Markus Valdez', 'Connie Ballard', 'Darnell Weber', 'Sylvie Charles', 'Vinay Padilla', 'Meredith Santiago', 'Andre Mccarty', 'Lorena Hodson', 'Isaac Vu']
Marina Allison is 27 years old with a BMI of 31.1 and an insurance cost of $7010.0.
Markus Valdez is 30 years old with a BMI of 22.4 and an insurance cost of $4050.0.
Connie Ballard is 43 years old with a BMI of 25.3 and an insurance cost of $12060.0.
Darnell Weber is 35 years old with a BMI of 20.6 and an insurance cost of $7500.0.
Sylvie Charles is 22 years old with a BMI of 22.1 and an insurance cost of $3022.0.
Vinay Padilla is 24 years old with a BMI of 26.9 and an insurance cost of $4620.0.
Meredith Santiago is 51 years old with a BMI of 29.3 and an insurance cost of $16330.0.
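Andre Mccarty is 19 years old with a BMI of 22.7 and an insurance cost of $2900.0.
Lorena Hodson is 65 years old with a BMI of 33.1 and an insurance cost of $19370.0.
Isaac Vu is 34 years old with a BMI of 24.8 and an insurance cost of $7045.0.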

***

## Python Lists: Medical Insurance Estimation Project

In this project, you will examine how factors such as age, sex, BMI, number of children, and smoking status contribute to medical insurance costs.

You will apply your new knowledge of Python Lists to store insurance cost data in a list as well as compare estimated insurance costs to actual insurance costs.

In [15]:
# Function to estimate insurance cost:
def estimate_insurance_cost(name, age, sex, bmi, num_of_children, smoker):
    estimated_cost = 250*age - 128*sex + 370*bmi + 425*num_of_children + 24000*smoker - 12500
    print(name + "'s Estimated Insurance Cost: " + str(estimated_cost) + " dollars.")
    return estimated_cost

# Estimate Maria's insurance cost
maria_insurance_cost = estimate_insurance_cost(name = "Maria", age = 31, sex = 0, bmi = 23.1, num_of_children = 1, smoker = 0)

# Estimate Rohan's insurance cost
rohan_insurance_cost = estimate_insurance_cost(name = "Rohan", age = 25, sex = 1, bmi = 28.5, num_of_children = 3, smoker = 0)

# Estimate Valentina's insurance cost
valentina_insurance_cost = estimate_insurance_cost(name = "Valentina", age = 53, sex = 0, bmi = 31.4, num_of_children = 0, smoker = 1)

# Lists for names and insurance_costs
names = ['Maria','Rohan','Valentina']
insurance_costs = [4150.0,5320.0,35210.0]

Maria's Estimated Insurance Cost: 4222.0 dollars.
Rohan's Estimated Insurance Cost: 5442.0 dollars.
Valentina's Estimated Insurance Cost: 36368.0 dollars.


In [16]:
# Combine lists
insurance_data = list(zip(names,insurance_costs)) # zip() returns an iterator, so wrap it in list() to get a list
print(f"Here is the actual insurance cost data: {insurance_data}")

Here is the actual insurance cost data: [('Maria', 4150.0), ('Rohan', 5320.0), ('Valentina', 35210.0)]


In [17]:
# Appending to a list.
# Append the separate lists of the estimated insurance costs
estimated_insurance_data = []
estimated_insurance_data.append(('Maria',maria_insurance_cost))
estimated_insurance_data.append(('Rohan',rohan_insurance_cost))
estimated_insurance_data.append(('Valentina',valentina_insurance_cost))
print(f"Here is the estimated insurance cost data: {estimated_insurance_data}")

Here is the estimated insurance cost data: [('Maria', 4222.0), ('Rohan', 5442.0), ('Valentina', 36368.0)]


It should now be much clearer from the output what each of the two lists represents, which helps you better understand the data you're working with.

You may notice that there are differences between the actual insurance costs and estimated insurance costs. This means that our estimate_insurance_cost() function does not calculate insurance costs with 100% accuracy.

Compare the estimated insurance data to the actual insurance data. Do the estimated insurance costs seem to be overestimated or underestimated?

In [18]:
insurance_data

[('Maria', 4150.0), ('Rohan', 5320.0), ('Valentina', 35210.0)]

In [19]:
# Create a new list for actual cost and estimated cost from the current lists 
# of insurance_data and estimated_insurance_data
actual_insurance_cost = []
for data in insurance_data:
  actual_insurance_cost.append(data[1])

estimated_insurance_cost = []
for data in estimated_insurance_data:
  estimated_insurance_cost.append(data[1])

# Subtract the two lists element by element
difference = [actual_cost - estimated_cost for (actual_cost,estimated_cost) in zip(actual_insurance_cost,estimated_insurance_cost)]
print(f"Here is the difference between actual_cost and estimated_cost: {difference}")
for value in difference:
  if value < 0:
    # a negative difference means the estimate is higher than the actual cost
    print(f"The estimated insurance data is overestimated by {abs(value)}.")
  elif value > 0:
    print(f"The estimated insurance data is underestimated by {value}.")
  else:
    print("The estimated insurance data matches insurance cost.")

Here is the difference between actual_cost and estimated_cost: [-72.0, -122.0, -1158.0]
The estimated insurance data is overestimated by 72.0.
The estimated insurance data is overestimated by 122.0.
The estimated insurance data is overestimated by 1158.0.
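
Absolute differences are hard to compare across patients whose costs differ widely, so a relative error may be more informative. The sketch below is an illustration only. It stores the actual and estimated costs from the cells above in two dictionaries (`actual` and `estimated`, names chosen here) and reports each error as a percentage of the actual cost.

```python
actual = {"Maria": 4150.0, "Rohan": 5320.0, "Valentina": 35210.0}
estimated = {"Maria": 4222.0, "Rohan": 5442.0, "Valentina": 36368.0}

percent_errors = {}
for name, actual_cost in actual.items():
    error = estimated[name] - actual_cost            # positive means overestimate
    percent_errors[name] = 100 * error / actual_cost
    print(f"{name}: estimate is off by {error:+.1f} dollars "
          f"({percent_errors[name]:+.2f}%)")
```

All three errors are positive, so for these patients the formula overestimates the cost in every case.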


***

## Working with Python Lists: Medical Insurance Costs Project

You are a doctor sorting through medical insurance cost data for some patients.

Using your knowledge of Python lists, you will store medical data and see what valuable insights you can gain from that data.

In [20]:
# Exploring list data
names = ["Mohamed", "Sara", "Xia", "Paul", "Valentina", "Jide", "Aaron", "Emily", "Nikita", "Paul"]
insurance_costs = [13262.0, 4816.0, 6839.0, 5054.0, 14724.0, 5360.0, 7640.0, 6072.0, 2750.0, 12064.0]

names.append("Priscilla")
insurance_costs.append(8320.0)

print(names)
print(insurance_costs)

# combine the two separate lists into a single list
medical_records = list(zip(insurance_costs,names))
print(medical_records)

# count how many records we have
num_medical_records = len(medical_records)
print(f"There are {num_medical_records} medical records")

# selecting list elements
# select the first record
first_medical_record = medical_records[0]
print(f"Here is the first medical record: {first_medical_record}")

# sort the lists 
# list the individuals with the lowest insurance costs first on the list
sorted_mr = sorted(medical_records,key = lambda x:x[0], reverse = False)
print(f"Here are the medical records sorted by insurance cost: {sorted_mr}")

# another method to sort
#medical_records.sort()
#print(f"Here are the medical records sorted by insurance cost: {medical_records}")

#sorted_mr = medical_records.sort() does not work because .sort() sorts the list in place and returns None.
#in other words, .sort() modifies the existing list, while sorted() returns a new list

['Mohamed', 'Sara', 'Xia', 'Paul', 'Valentina', 'Jide', 'Aaron', 'Emily', 'Nikita', 'Paul', 'Priscilla']
[13262.0, 4816.0, 6839.0, 5054.0, 14724.0, 5360.0, 7640.0, 6072.0, 2750.0, 12064.0, 8320.0]
[(13262.0, 'Mohamed'), (4816.0, 'Sara'), (6839.0, 'Xia'), (5054.0, 'Paul'), (14724.0, 'Valentina'), (5360.0, 'Jide'), (7640.0, 'Aaron'), (6072.0, 'Emily'), (2750.0, 'Nikita'), (12064.0, 'Paul'), (8320.0, 'Priscilla')]
There are 11 medical records
Here is the first medical record: (13262.0, 'Mohamed')
Here are the medical records sorted by insurance cost: [(2750.0, 'Nikita'), (4816.0, 'Sara'), (5054.0, 'Paul'), (5360.0, 'Jide'), (6072.0, 'Emily'), (6839.0, 'Xia'), (7640.0, 'Aaron'), (8320.0, 'Priscilla'), (12064.0, 'Paul'), (13262.0, 'Mohamed'), (14724.0, 'Valentina')]


In [21]:
# slicing lists and selecting
# find the cheapest three; recall that a slice [start:stop] includes start but excludes stop
# use sorted_mr here, because medical_records is still in its original order
cheapest_three = sorted_mr[:3]
print(f"Here are the three cheapest insurance costs in our medical records: {cheapest_three}")

# select index 3 through index 7 (the stop index 8 is excluded)
middle_five_records = medical_records[3:8]
print(f"The medical records for the middle five records are as follows: {middle_five_records}")

# counting elements in a list
occurrences_paul = names.count('Paul')
print(f"There are {occurrences_paul} individuals with the name Paul in our medical records.")

# sort medical records alphabetically
# note that a new list is built with zip (names first) before sorting
medical_records = list(zip(names,insurance_costs))
medical_records.sort()
print(f"The medical records are as follows in alphabetical order: {medical_records}")

Here are the three cheapest insurance costs in our medical records: [(2750.0, 'Nikita'), (4816.0, 'Sara'), (5054.0, 'Paul')]
The medical records for the middle five records are as follows: [(5054.0, 'Paul'), (14724.0, 'Valentina'), (5360.0, 'Jide'), (7640.0, 'Aaron'), (6072.0, 'Emily')]
There are 2 individuals with the name Paul in our medical records.
The medical records are as follows in alphabetical order: [('Aaron', 7640.0), ('Emily', 6072.0), ('Jide', 5360.0), ('Mohamed', 13262.0), ('Nikita', 2750.0), ('Paul', 5054.0), ('Paul', 12064.0), ('Priscilla', 8320.0), ('Sara', 4816.0), ('Valentina', 14724.0), ('Xia', 6839.0)]
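
The difference between `sorted()` and `.sort()` is easy to check directly. The sketch below uses a shortened three-record list (`records`, a name chosen for this example) taken from the data above.

```python
records = [(13262.0, "Mohamed"), (4816.0, "Sara"), (2750.0, "Nikita")]

by_cost = sorted(records)                    # returns a new list; records is unchanged
result = records.sort(key=lambda r: r[1])    # sorts records in place by name; returns None

print(by_cost)
print(records)
print(result)
```

Tuples are compared element by element, which is why `sorted(records)` orders by cost here and why the name-first list in the cell above sorts alphabetically.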


***

## Python Dictionaries: Medical Insurance Project

You have been asked to create a program that organizes and updates medical records efficiently.

In this project, you will use your new knowledge of Python dictionaries to create a database of medical records for patients.

### Python Collections (Arrays) 
There are four built-in collection data types in the Python programming language:

* **List** is a collection which is ordered and changeable. Allows duplicate members. Create an empty list with `[]`.

* **Tuple** is a collection which is ordered and unchangeable. Allows duplicate members.

* **Set** is a collection which is unordered and unindexed. Its items cannot be changed in place, but items can be added and removed. No duplicate members.

* **Dictionary** is a collection which is changeable and, as of Python 3.7, keeps its items in insertion order. No duplicate keys. Create an empty dictionary with `{}`.
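
The properties listed above can be seen in a few lines. This is a small illustrative sketch; the variable names are chosen for this example only.

```python
patients_list = ["Marina", "Vinay", "Marina"]   # ordered, changeable, duplicates kept
patient_tuple = ("Marina", 27)                  # ordered, cannot be changed
patient_set = {"Marina", "Vinay", "Marina"}     # duplicates collapse into one item
patient_dict = {"Marina": 27, "Vinay": 24}      # keys are unique

patients_list.append("Connie")
patient_set.add("Connie")
patient_dict["Marina"] = 28                     # overwrites the existing key

print(len(patients_list), len(patient_set), patient_dict)

# Assigning to a tuple element raises TypeError.
try:
    patient_tuple[1] = 28
except TypeError as error:
    print(f"Tuples are immutable: {error}")
```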

In [22]:
# Storing patient names and costs
# create an empty dictionary
medical_costs = {}

# populate dictionary with key-value pairs
medical_costs['Marina'] = 6607.0
medical_costs['Vinay'] = 3225.0

# using one line of code to update with more data
medical_costs.update({"Connie": 8886.0, "Isaac": 16444.0, "Valentina": 6420.0})

print(medical_costs)

{'Marina': 6607.0, 'Vinay': 3225.0, 'Connie': 8886.0, 'Isaac': 16444.0, 'Valentina': 6420.0}


In [23]:
# Vinay's record was found to be incorrect, so update it
medical_costs["Vinay"] = 3325.0
print(medical_costs)

{'Marina': 6607.0, 'Vinay': 3325.0, 'Connie': 8886.0, 'Isaac': 16444.0, 'Valentina': 6420.0}


In [24]:
# calculate the average medical cost per patient
# create variable total_cost and set it to 0. Then iterate through the values in 
# `medical_costs` and add each value to the `total_cost` variable.
total_cost = 0
for cost in medical_costs.values():
  total_cost += cost
  
print(f"The Total Cost: {total_cost}")

average_cost = total_cost/len(medical_costs)
print(f"Average Insurance Cost: {average_cost}")

The Total Cost: 41682.0
Average Insurance Cost: 8336.4
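
As a cross-check on the loop, the built-in `sum()` can total a dictionary's values directly. A minimal sketch, using the same five values that `medical_costs` holds after Vinay's correction:

```python
medical_costs = {"Marina": 6607.0, "Vinay": 3325.0, "Connie": 8886.0,
                 "Isaac": 16444.0, "Valentina": 6420.0}

total_cost = sum(medical_costs.values())
average_cost = total_cost / len(medical_costs)

print(f"Total: {total_cost}")
print(f"Average: {average_cost}")
```

If a hand-written loop and `sum()` disagree, the loop's accumulation step is the first place to look.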


### List Comprehension to Dictionary
You have been asked to create a second dictionary that maps patient names to their ages.
First, create two lists called names and ages with the following data:

| names | ages |
| --- | --- |
| Marina | 27 |
| Vinay | 24 |
| Connie | 43 |
| Isaac | 35 |
| Valentina | 52 |

In [25]:
names = ['Marina','Vinay','Connie','Isaac','Valentina']
ages = [27,24,43,35,52]
zipped_ages = list(zip(names,ages))

# Create a dictionary called names_to_ages by using a dictionary comprehension that iterates 
# through zipped_ages and turns each pair into a key : value item.
names_to_ages = {key: value for key, value in zipped_ages}
print(names_to_ages)

# to get values, use .get()
# get Marina's age and store it to marina_age. Use None as a default value if the key doesn't exist.
marina_age = names_to_ages.get("Marina", None)
print(f"Marina's age is {marina_age}.")

# get each name and print it out
for i in names_to_ages:
  age = names_to_ages.get(i,None)
  print(f"{i}'s age is {age}.")


{'Marina': 27, 'Vinay': 24, 'Connie': 43, 'Isaac': 35, 'Valentina': 52}
Marina's age is 27.
Marina's age is 27.
Vinay's age is 24.
Connie's age is 43.
Isaac's age is 35.
Valentina's age is 52.
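
Two variations on the cell above may be worth knowing. `dict()` accepts the zipped pairs directly, and `.items()` yields each key together with its value, so no second lookup is needed inside the loop. In the sketch below, "Hande" is used only as an example of a name that is not in the dictionary.

```python
names = ["Marina", "Vinay", "Connie", "Isaac", "Valentina"]
ages = [27, 24, 43, 35, 52]

names_to_ages = dict(zip(names, ages))   # same result as the comprehension

# .get() returns the default instead of raising KeyError for a missing key.
marina_age = names_to_ages.get("Marina")
missing_age = names_to_ages.get("Hande", "unknown")
print(marina_age, missing_age)

# .items() yields (key, value) pairs.
for name, age in names_to_ages.items():
    print(f"{name}'s age is {age}.")
```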


### Using a Dictionary to Create a Medical Database

Let's create a third dictionary to represent a database of medical records that contains information such as a patient's name, age, sex, BMI, number of children, smoker status, and insurance cost.

In [26]:
medical_records = {}
# add data into dictionary
medical_records["Marina"] = {"Age": 27, "Sex": "Female", "BMI": 31.1, "Children": 2, "Smoker": "Non-smoker", "Insurance_cost": 6607.0}
medical_records["Vinay"] = {"Age": 24, "Sex": "Male", "BMI": 26.9, "Children": 0, "Smoker": "Non-smoker", "Insurance_cost": 3225.0}
medical_records["Connie"] = {"Age": 43, "Sex": "Female", "BMI": 25.3, "Children": 3, "Smoker": "Non-smoker", "Insurance_cost": 8886.0}
medical_records["Isaac"] = {"Age": 35, "Sex": "Male", "BMI": 20.6, "Children": 4, "Smoker": "Smoker", "Insurance_cost": 16444.0}
medical_records["Valentina"] = {"Age": 52, "Sex": "Female", "BMI": 18.7, "Children": 1, "Smoker": "Non-smoker", "Insurance_cost": 6420.0}

print(f"The medical records are the following: {medical_records}")


The medical records are the following: {'Marina': {'Age': 27, 'Sex': 'Female', 'BMI': 31.1, 'Children': 2, 'Smoker': 'Non-smoker', 'Insurance_cost': 6607.0}, 'Vinay': {'Age': 24, 'Sex': 'Male', 'BMI': 26.9, 'Children': 0, 'Smoker': 'Non-smoker', 'Insurance_cost': 3225.0}, 'Connie': {'Age': 43, 'Sex': 'Female', 'BMI': 25.3, 'Children': 3, 'Smoker': 'Non-smoker', 'Insurance_cost': 8886.0}, 'Isaac': {'Age': 35, 'Sex': 'Male', 'BMI': 20.6, 'Children': 4, 'Smoker': 'Smoker', 'Insurance_cost': 16444.0}, 'Valentina': {'Age': 52, 'Sex': 'Female', 'BMI': 18.7, 'Children': 1, 'Smoker': 'Non-smoker', 'Insurance_cost': 6420.0}}


In [27]:
# The medical_records dictionary acts like a database of medical records. 
# Let's access a specific piece of data in medical_records.
#print("Connie's insurance cost is " + str(medical_records["Connie"]["Insurance_cost"]) + " dollars.")
print(f"Connie's insurance cost is {medical_records['Connie']['Insurance_cost']} dollars.")

Connie's insurance cost is 8886.0 dollars.


In [28]:
# Remove Vinay's record because he moved away
medical_records.pop("Vinay")

# Check that record was removed
print(medical_records)

{'Marina': {'Age': 27, 'Sex': 'Female', 'BMI': 31.1, 'Children': 2, 'Smoker': 'Non-smoker', 'Insurance_cost': 6607.0}, 'Connie': {'Age': 43, 'Sex': 'Female', 'BMI': 25.3, 'Children': 3, 'Smoker': 'Non-smoker', 'Insurance_cost': 8886.0}, 'Isaac': {'Age': 35, 'Sex': 'Male', 'BMI': 20.6, 'Children': 4, 'Smoker': 'Smoker', 'Insurance_cost': 16444.0}, 'Valentina': {'Age': 52, 'Sex': 'Female', 'BMI': 18.7, 'Children': 1, 'Smoker': 'Non-smoker', 'Insurance_cost': 6420.0}}


In [29]:
for name, record in medical_records.items():
    print(name + " is a " + str(record["Age"]) + \
          " year old " + record["Sex"] + " " + record["Smoker"] \
         + " with a BMI of " + str(record["BMI"]) + \
         " and insurance cost of " + str(record["Insurance_cost"]))

Marina is a 27 year old Female Non-smoker with a BMI of 31.1 and insurance cost of 6607.0
Connie is a 43 year old Female Non-smoker with a BMI of 25.3 and insurance cost of 8886.0
Isaac is a 35 year old Male Smoker with a BMI of 20.6 and insurance cost of 16444.0
Valentina is a 52 year old Female Non-smoker with a BMI of 18.7 and insurance cost of 6420.0


In [30]:
# Use a for loop to iterate through the items of medical_records. For each key-value pair, print out a string
for name, record in medical_records.items():
    print(f"{name} is a {record['Age']} year old {record['Sex']} {record['Smoker']} with a BMI of {record['BMI']} and insurance cost of {record['Insurance_cost']}.")

Marina is a 27 year old Female Non-smoker with a BMI of 31.1 and insurance cost of 6607.0.
Connie is a 43 year old Female Non-smoker with a BMI of 25.3 and insurance cost of 8886.0.
Isaac is a 35 year old Male Smoker with a BMI of 20.6 and insurance cost of 16444.0.
Valentina is a 52 year old Female Non-smoker with a BMI of 18.7 and insurance cost of 6420.0.


In [31]:
# Create a function called update_medical_records() that takes in the name of an individual 
# as well as their medical data, and then updates the medical_records dictionary accordingly.
def update_medical_records(name, age, sex, bmi, children, smoker, insurance_cost):
  medical_records[name] = {"Age": age, "Sex": sex, "BMI": bmi, "Children": children, "Smoker": smoker, "Insurance_cost": insurance_cost}

update_medical_records('Vinay', 25, 'Female', 23, 2, 0, 555)

medical_records

{'Marina': {'Age': 27,
  'Sex': 'Female',
  'BMI': 31.1,
  'Children': 2,
  'Smoker': 'Non-smoker',
  'Insurance_cost': 6607.0},
 'Connie': {'Age': 43,
  'Sex': 'Female',
  'BMI': 25.3,
  'Children': 3,
  'Smoker': 'Non-smoker',
  'Insurance_cost': 8886.0},
 'Isaac': {'Age': 35,
  'Sex': 'Male',
  'BMI': 20.6,
  'Children': 4,
  'Smoker': 'Smoker',
  'Insurance_cost': 16444.0},
 'Valentina': {'Age': 52,
  'Sex': 'Female',
  'BMI': 18.7,
  'Children': 1,
  'Smoker': 'Non-smoker',
  'Insurance_cost': 6420.0},
 'Vinay': {'Age': 25,
  'Sex': 'Female',
  'BMI': 23,
  'Children': 2,
  'Smoker': 0,
  'Insurance_cost': 555}}

In [32]:
def update_medical_records(dictionary):
  medical_records.update(dictionary)

dict_update = {}
dict_update["Vinay"] = {"Age": 23, "Sex": "Male", "BMI": 33, "Children": 3, "Smoker": "smoker", "Insurance_cost": 34423}
dict_update["Hande"] = {"Age": 29, "Sex": "Male", "BMI": 22, "Children": 4, "Smoker": "smoker", "Insurance_cost": 34556}
update_medical_records(dict_update)

medical_records

{'Marina': {'Age': 27,
  'Sex': 'Female',
  'BMI': 31.1,
  'Children': 2,
  'Smoker': 'Non-smoker',
  'Insurance_cost': 6607.0},
 'Connie': {'Age': 43,
  'Sex': 'Female',
  'BMI': 25.3,
  'Children': 3,
  'Smoker': 'Non-smoker',
  'Insurance_cost': 8886.0},
 'Isaac': {'Age': 35,
  'Sex': 'Male',
  'BMI': 20.6,
  'Children': 4,
  'Smoker': 'Smoker',
  'Insurance_cost': 16444.0},
 'Valentina': {'Age': 52,
  'Sex': 'Female',
  'BMI': 18.7,
  'Children': 1,
  'Smoker': 'Non-smoker',
  'Insurance_cost': 6420.0},
 'Vinay': {'Age': 23,
  'Sex': 'Male',
  'BMI': 33,
  'Children': 3,
  'Smoker': 'smoker',
  'Insurance_cost': 34423},
 'Hande': {'Age': 29,
  'Sex': 'Male',
  'BMI': 22,
  'Children': 4,
  'Smoker': 'smoker',
  'Insurance_cost': 34556}}
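
Both versions of `update_medical_records()` above modify the global `medical_records` dictionary. A possible alternative design, sketched here with a hypothetical function named `add_record`, takes the dictionary as a parameter and returns an updated copy, which makes the function easier to test and reuse.

```python
def add_record(records, name, *, age, sex, bmi, children, smoker, insurance_cost):
    """Return a new dict with `name` added or replaced; `records` is not modified."""
    updated = dict(records)  # shallow copy of the outer dictionary
    updated[name] = {"Age": age, "Sex": sex, "BMI": bmi, "Children": children,
                     "Smoker": smoker, "Insurance_cost": insurance_cost}
    return updated

records = {"Marina": {"Age": 27, "Sex": "Female", "BMI": 31.1, "Children": 2,
                      "Smoker": "Non-smoker", "Insurance_cost": 6607.0}}

new_records = add_record(records, "Hande", age=29, sex="Male", bmi=22,
                         children=4, smoker="Smoker", insurance_cost=34556)

print(sorted(new_records))
print("Hande" in records)
```

The `*` in the signature makes the medical fields keyword-only, so a call cannot silently swap, say, `bmi` and `children`.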

***

## Python Loops: Medical Insurance Estimates vs. Costs Project

You are interested in analyzing medical insurance cost data efficiently without writing repetitive code.

In this project, you will use your new knowledge of Python loops to iterate through and analyze medical insurance cost data.

### Creating a For loop

In [33]:
names = ["Judith", "Abel", "Tyson", "Martha", "Beverley", "David", "Anabel"]
estimated_insurance_costs = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0, 6000.0, 7000.0]
actual_insurance_costs = [1100.0, 2200.0, 3300.0, 4400.0, 5500.0, 6600.0, 7700.0]
total_cost = 0

# Use a for loop to iterate through actual_insurance_costs and add each insurance cost to the variable total_cost
for insurance_cost in actual_insurance_costs:
    total_cost += insurance_cost
print(total_cost)

# Check if the loop worked
total_cost_check = sum(actual_insurance_costs)
if total_cost_check == total_cost:
  print("The for loop worked.")
else:
  print("The for loop did not work.")

average_cost = total_cost/len(actual_insurance_costs)
print(f'Average Insurance Cost: {average_cost} dollars.')

30800.0
The for loop worked.
Average Insurance Cost: 4400.0 dollars.


### Using Range in Loops

Write a for loop with variable i that goes from 0 up to, but not including, len(names).

Inside of the for loop, do the following:
* Create a variable name, which stores names[i].
* Create a variable insurance_cost, which stores actual_insurance_costs[i].
* Print out the insurance cost for each individual

In [34]:
for i in range(len(names)):
  name = names[i]
  insurance_cost = actual_insurance_costs[i]
  print (f'The insurance cost for {name} is {insurance_cost} dollars.')


The insurance cost for Judith is 1100.0 dollars.
The insurance cost for Abel is 2200.0 dollars.
The insurance cost for Tyson is 3300.0 dollars.
The insurance cost for Martha is 4400.0 dollars.
The insurance cost for Beverley is 5500.0 dollars.
The insurance cost for David is 6600.0 dollars.
The insurance cost for Anabel is 7700.0 dollars.


### Conditions inside a Loop

For each individual in names, we want to determine whether their insurance cost is above or below average.

Inside of the for loop, use if, elif, else statements after the print statement to check whether the insurance cost is above, below, or equal to the average. Print out messages for each case.

Observe the output. You should see messages indicating the insurance cost for each of the seven individuals and where their insurance cost stands relative to the average.

In [35]:
for i in range(len(names)):
  name = names[i]
  insurance_cost = actual_insurance_costs[i]
  print (f'The insurance cost for {name} is {insurance_cost} dollars.')
  if insurance_cost > average_cost:
    print(f'The insurance cost for {name} is above the average.')
  elif insurance_cost < average_cost:
    print(f'The insurance cost for {name} is below the average.')
  else : 
    print(f'The insurance cost for {name} is equal to the average.')

The insurance cost for Judith is 1100.0 dollars.
The insurance cost for Judith is below the average.
The insurance cost for Abel is 2200.0 dollars.
The insurance cost for Abel is below the average.
The insurance cost for Tyson is 3300.0 dollars.
The insurance cost for Tyson is below the average.
The insurance cost for Martha is 4400.0 dollars.
The insurance cost for Martha is equal to the average.
The insurance cost for Beverley is 5500.0 dollars.
The insurance cost for Beverley is above the average.
The insurance cost for David is 6600.0 dollars.
The insurance cost for David is above the average.
The insurance cost for Anabel is 7700.0 dollars.
The insurance cost for Anabel is above the average.
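
The comparison above relies on average_cost, which is computed earlier in the notebook. As a reminder of where that value comes from, here is a small sketch that derives it from the list and records each person's standing. The helper name compare_to_average is hypothetical, and the list values are copied from the output above.

```python
# Assumed to match the lists defined earlier in the notebook
names = ["Judith", "Abel", "Tyson", "Martha", "Beverley", "David", "Anabel"]
actual_insurance_costs = [1100.0, 2200.0, 3300.0, 4400.0, 5500.0, 6600.0, 7700.0]

# The mean is the sum of the costs divided by how many there are
average_cost = sum(actual_insurance_costs) / len(actual_insurance_costs)

def compare_to_average(cost, average):
    """Return 'above', 'below', or 'equal to' (hypothetical helper)."""
    if cost > average:
        return "above"
    elif cost < average:
        return "below"
    else:
        return "equal to"

# Map each name to its standing relative to the average
standing = {}
for name, cost in zip(names, actual_insurance_costs):
    standing[name] = compare_to_average(cost, average_cost)

print(average_cost)        # 4400.0
print(standing["Martha"])  # equal to
```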


### Creating a List Comprehension

If you look closely at actual_insurance_costs and estimated_insurance_costs, you will notice that each of the actual insurance costs is 10% higher than the estimated insurance costs.

Using a list comprehension, create a new list called updated_estimated_costs, which has each element in estimated_insurance_costs multiplied by 11/10.

In [36]:
updated_estimated_costs = [estimated_cost * 11/10 for estimated_cost in estimated_insurance_costs]

print(f'The estimated_insurance_costs: {estimated_insurance_costs}')
print(f'The updated_estimated_costs: {updated_estimated_costs}')
print(f'The actual_insurance_costs: {actual_insurance_costs}')

The estimated_insurance_costs: [1000.0, 2000.0, 3000.0, 4000.0, 5000.0, 6000.0, 7000.0]
The updated_estimated_costs: [1100.0, 2200.0, 3300.0, 4400.0, 5500.0, 6600.0, 7700.0]
The actual_insurance_costs: [1100.0, 2200.0, 3300.0, 4400.0, 5500.0, 6600.0, 7700.0]
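
Comparing the two printed lists by eye works for seven values, but the check can also be done in code. Because floating-point arithmetic can introduce tiny rounding errors, the sketch below compares with a tolerance using math.isclose from the standard library. The list values are copied from the output above.

```python
import math

# Assumed to match the lists defined earlier in the notebook
estimated_insurance_costs = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0, 6000.0, 7000.0]
actual_insurance_costs = [1100.0, 2200.0, 3300.0, 4400.0, 5500.0, 6600.0, 7700.0]

updated_estimated_costs = [cost * 11 / 10 for cost in estimated_insurance_costs]

# Compare element by element, allowing for floating-point rounding
all_match = all(
    math.isclose(updated, actual)
    for updated, actual in zip(updated_estimated_costs, actual_insurance_costs)
)
print(all_match)  # True
```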


### Extra

* Convert the first for loop in the code to a while loop.
* Modify the second for loop so that it also calculates how far above or below the average each individual's insurance cost is.

In [37]:
# A while loop executes a set of statements for as long as its condition is true
# Convert the first for loop in the code to a while loop
# (taken here to be the loop that adds up actual_insurance_costs into total_cost)
# First step: create an iteration variable and initialize it to the starting value: i = 0
# Second step: turn the end of the range into the loop condition: while i < len(actual_insurance_costs):
# Third step: advance the loop by incrementing the iteration variable: i += 1

total_cost = 0
i = 0
while i < len(actual_insurance_costs):
  total_cost += actual_insurance_costs[i]
  i += 1

print(total_cost)

30800.0
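
For comparison, the same three-step pattern (initialize a counter, test it in the loop condition, increment it in the body) can be applied to the loop that prints each person's cost. This is only a sketch; the lists are redefined from the output shown earlier so that the block runs on its own.

```python
# Assumed to match the lists defined earlier in the notebook
names = ["Judith", "Abel", "Tyson", "Martha", "Beverley", "David", "Anabel"]
actual_insurance_costs = [1100.0, 2200.0, 3300.0, 4400.0, 5500.0, 6600.0, 7700.0]

i = 0                  # step 1: start the counter at the first index
while i < len(names):  # step 2: keep going while the index is valid
    print(f"The insurance cost for {names[i]} is {actual_insurance_costs[i]} dollars.")
    i += 1             # step 3: advance, otherwise the loop never ends
```

When the loop finishes, i equals len(names), which is exactly the value that makes the condition false.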


In [38]:
# Modify the second for loop so that it also calculates how far above or below the average each insurance cost is.
for i in range(len(names)):
  name = names[i]
  insurance_cost = actual_insurance_costs[i]
  difference = insurance_cost - average_cost
  print(f'The difference from the average cost for {name} is {difference} dollars.')

The difference from the average cost for Judith is -3300.0 dollars.
The difference from the average cost for Abel is -2200.0 dollars.
The difference from the average cost for Tyson is -1100.0 dollars.
The difference from the average cost for Martha is 0.0 dollars.
The difference from the average cost for Beverley is 1100.0 dollars.
The difference from the average cost for David is 2200.0 dollars.
The difference from the average cost for Anabel is 3300.0 dollars.


The sign of each value tells you the direction of the difference. Since the message can state "above" or "below" in words, the negative sign becomes redundant, so clean up the output by reporting the absolute value of the difference.

In [39]:
for i in range(len(names)):
  name = names[i]
  insurance_cost = actual_insurance_costs[i]
  abs_difference = abs(insurance_cost - average_cost)
  if insurance_cost > average_cost:
    print(f"{name}'s insurance cost of {insurance_cost} is {abs_difference} above the average cost.")
  elif insurance_cost < average_cost:
    print(f"{name}'s insurance cost of {insurance_cost} is {abs_difference} below the average cost.")
  else:
    print(f"{name}'s insurance cost of {insurance_cost} is equal to the average cost.")

Judith's insurance cost of 1100.0 is 3300.0 below the average cost.
Abel's insurance cost of 2200.0 is 2200.0 below the average cost.
Tyson's insurance cost of 3300.0 is 1100.0 below the average cost.
Martha's insurance cost of 4400.0 is equal to the average cost.
Beverley's insurance cost of 5500.0 is 1100.0 above the average cost.
David's insurance cost of 6600.0 is 2200.0 above the average cost.
Anabel's insurance cost of 7700.0 is 3300.0 above the average cost.


***

## Python Classes: Medical Insurance Project

You have been asked to develop a system that makes it easier to organize patient data. You will create a class that does the following:
* Takes in patient parameters regarding their personal information
* Contains methods that allow users to update their information
* Gives patients insight into their potential medical fees.

### Building our Constructor

If you look at the code block below, you will see that we have started a class called Patient. It currently has an __init__ method with two instance variables: self.name and self.age.

Let's start by adding in some more patient parameters:
* sex: patient's biological identification, 0 for male and 1 for female
* bmi: patient BMI
* num_of_children: number of children patient has
* smoker: patient smoking status, 0 for a non-smoker and 1 for a smoker

Add these into the __init__ method so that we can use them as we create our class methods.

In [40]:
class Patient:
    def __init__(self, name, age, sex, bmi, num_of_children, smoker):
        self.name = name
        self.age = age
        # add more parameters here
        self.sex = sex # 0 for male, 1 for female
        self.bmi = bmi
        self.num_of_children = num_of_children
        self.smoker = smoker # 0 for non-smoker, 1 for smoker

patient1 = Patient("John Doe", 25, 1, 22.2, 0, 0)
print(f'Print out the name of the patient: {patient1.name}')

Print out the name of the patient: John Doe
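
As an aside, Python's standard dataclasses module can generate an equivalent __init__ from a list of field declarations. The sketch below is an alternative way to write the constructor and is not part of the original exercise. The class name PatientRecord is hypothetical, chosen so that it does not clash with Patient.

```python
from dataclasses import dataclass

@dataclass
class PatientRecord:
    # Each annotated field becomes an __init__ parameter and an instance attribute
    name: str
    age: int
    sex: int
    bmi: float
    num_of_children: int
    smoker: int  # 0 for non-smoker, 1 for smoker

record = PatientRecord("John Doe", 25, 1, 22.2, 0, 0)
print(f'Print out the name of the patient: {record.name}')
```

Note that the type annotations are documentation only; a dataclass does not enforce them at runtime.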


### Adding Functionality with Methods


In [41]:
class Patient:
    def __init__(self, name, age, sex, bmi, num_of_children, smoker):
        self.name = name
        self.age = age
        self.sex = sex
        self.bmi = bmi
        self.num_of_children = num_of_children
        self.smoker = smoker
    
    # method 1: estimated_insurance_cost(), which takes our instance's parameters 
    # (representing our patient's information) and returns their expected yearly medical fees
    def estimated_insurance_cost(self):
        # ensures that patient data is uploaded using numerical values
        try:
            estimated_cost = 250 * self.age - 128 * self.sex + 370 * self.bmi + 425 * self.num_of_children + 24000 * self.smoker - 12500
        except TypeError:
            # return early so that estimated_cost is never used before it is assigned
            print("Only numerical values are allowed.")
            return
        print(f"{self.name}'s estimated insurance cost is {estimated_cost} dollars.")
    
    # method 2: updates the age and recalculates the estimated insurance cost
    def update_age(self, new_age):
        self.age = new_age
        print(f"{self.name} is now {self.age} years old.")
        self.estimated_insurance_cost()
    
    # method 3: updates the bmi and recalculates the estimated insurance cost
    def update_bmi(self, new_bmi):
        self.bmi = new_bmi
        print(f"{self.name}'s new bmi is {self.bmi}.")
        self.estimated_insurance_cost()
    
    # method 4: updates the number of children and recalculates the estimated insurance cost
    def update_num_of_children(self, new_num_children):
        self.num_of_children = new_num_children
        # use control flow to pick the grammatically correct word: child vs. children
        if self.num_of_children == 1:
            print(f'{self.name} has {self.num_of_children} child.')
        else:
            print(f'{self.name} has {self.num_of_children} children.')
        self.estimated_insurance_cost()
    
    # method 5: updates the smoking status and recalculates the estimated insurance cost
    def update_smoking_status(self, new_smoker):
        self.smoker = new_smoker
        if self.smoker == 1:
            print(f'{self.name} is now a smoker.')
        else:
            print(f'{self.name} is no longer a smoker.')
        self.estimated_insurance_cost()
    
    # method 6: uses a dictionary to store a patient's information in one convenient variable. 
    # We can use our parameters as the keys and their specific data as the values.
    def patient_profile(self):
        patient_information = {}
        patient_information["Name"] = self.name
        patient_information["Age"] = self.age
        patient_information["Sex"] = self.sex
        patient_information["BMI"] = self.bmi
        patient_information["Number of Children"] = self.num_of_children
        patient_information["Smoker"] = self.smoker
        
        # The dictionary is printed here, so calling patient_profile() displays it directly
        # and there is no need to wrap the method call in a print statement.
        print(f"Patient's information: {patient_information}")

**Test** some scenarios for the class

if type(patient1.name) is not str:
    raise TypeError("Only strings are allowed")

if type(patient1.age) is not int:
    raise TypeError("Only integers are allowed")

if type(patient1.sex) is not int:
    raise TypeError("Only integers are allowed")

if type(patient1.bmi) is not float:
    raise TypeError("Only floats are allowed")

if type(patient1.num_of_children) is not int:
    raise TypeError("Only integers are allowed")

if type(patient1.smoker) is not int:
    raise TypeError("Only integers are allowed")

Update the class so that users can upload lists of patient data rather than just individual numbers.

In [42]:
patient1 = Patient("John Doe", 25, 1, 22.2, 0, 0)

In [43]:
# pass a string instead of an integer to confirm that invalid input is caught
patient1.update_age('26')

John Doe is now 26 years old.
Only numerical values are allowed.


In [44]:
patient1.patient_profile()

Patient's information: {'Name': 'John Doe', 'Age': '26', 'Sex': 1, 'BMI': 22.2, 'Number of Children': 0, 'Smoker': 0}
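
The profile above still stores the age as the string '26', because nothing stops invalid data from being assigned. The type checks listed in the test notes could instead run automatically whenever a patient is created. The following is a minimal sketch, assuming it is acceptable to reject bad input in the constructor. ValidatedPatient is a hypothetical name, and only two of the six fields are checked to keep the example short.

```python
class ValidatedPatient:
    def __init__(self, name, age):
        if not isinstance(name, str):
            raise TypeError("name must be a string")
        # bool is a subclass of int, so exclude it explicitly
        if not isinstance(age, int) or isinstance(age, bool):
            raise TypeError("age must be an integer")
        self.name = name
        self.age = age

# Valid input creates the object as usual
patient = ValidatedPatient("John Doe", 25)

# Invalid input is rejected before any attribute is stored
try:
    ValidatedPatient("John Doe", "26")
    rejected = False
except TypeError as error:
    rejected = True
    print(f"Rejected: {error}")
```

The same checks could be repeated in the update methods so that a bad value never replaces a good one.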


***

## Python Control Flow: Medical Insurance Project

In this project, you will examine how factors such as age, sex, BMI, number of children, and smoking status contribute to medical insurance costs.

You will apply your knowledge of Python control flow to write code that gives people advice on how to lower their medical insurance costs.

In general, insurance costs are higher for smokers as well as people with a higher BMI. We can use the data from the variables smoker and bmi to provide advice on how to lower insurance costs.

According to the WHO (World Health Organization), here are the nutritional statuses for various BMI ranges:
* BMI >= 30: obese
* BMI >= 25 and BMI < 30: overweight
* BMI >= 18.5 and BMI < 25: normal weight
* BMI < 18.5: underweight

In [45]:
# Function analyze_smoker
def analyze_smoker(smoker_status):
    if smoker_status == 1:
        print("To lower your cost, you should consider quitting smoking.")
    else:
        print("Smoking is not an issue for you.")

# Function analyze_bmi
def analyze_bmi(bmi_value):
    if bmi_value >= 30:
        print("Your BMI is in the obese range. To lower your cost, you should significantly lower your BMI.")
    elif bmi_value >= 25 and bmi_value < 30:
        print("Your BMI is in the overweight range. To lower your cost, you should lower your BMI.")
    elif bmi_value >= 18.5 and bmi_value < 25:
        print("Your BMI is in a healthy range.")
    else:
        print("Your BMI is in the underweight range. Increasing your BMI will not help lower your cost, but it will improve your health.")

# Function to estimate insurance cost:
def estimate_insurance_cost(name, age, sex, bmi, num_of_children, smoker):
    estimated_cost = 250*age - 128*sex + 370*bmi + 425*num_of_children + 24000*smoker - 12500
    print(f"{name}'s estimated insurance cost: {estimated_cost} dollars.")
    # make function call to `analyze_smoker()` here
    analyze_smoker(smoker)
    # make function call to `analyze_bmi()` here
    analyze_bmi(bmi)
    return estimated_cost

# Estimate Keanu's insurance cost
keanu_insurance_cost = estimate_insurance_cost(name = 'Keanu', age = 29, sex = 1, bmi = 26.2, num_of_children = 3, smoker = 1)

Keanu's estimated insurance cost: 29591.0 dollars.
To lower your cost, you should consider quitting smoking.
Your BMI is in the overweight range. To lower your cost, you should lower your BMI.
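
To make the smoking advice more concrete, the estimate can be computed twice, once with smoker = 1 and once with smoker = 0, and the two results compared. The sketch below reuses the formula from estimate_insurance_cost() in a hypothetical helper, insurance_estimate, that returns the value without printing any advice. The inputs are Keanu's values from the call above.

```python
def insurance_estimate(age, sex, bmi, num_of_children, smoker):
    """Same formula as estimate_insurance_cost(), without the printing."""
    return 250*age - 128*sex + 370*bmi + 425*num_of_children + 24000*smoker - 12500

# Keanu's values, with only the smoking status changed between the two calls
as_smoker = insurance_estimate(age=29, sex=1, bmi=26.2, num_of_children=3, smoker=1)
as_nonsmoker = insurance_estimate(age=29, sex=1, bmi=26.2, num_of_children=3, smoker=0)

# round() hides any tiny floating-point residue in the subtraction
savings = as_smoker - as_nonsmoker
print(f"Estimated yearly savings from quitting smoking: {round(savings)} dollars.")
```

Since smoker only appears in the term 24000 * smoker, the difference is the same for every patient, whatever their other values are.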
