# Python Syntax: Medical Insurance Project - Lesson 1 - Python Basics

Suppose you are a medical professional curious about how certain factors contribute to medical insurance costs. Using a formula that estimates a person's yearly insurance costs, you will investigate how different factors such as age, sex, BMI, etc. affect the prediction.

## Setting up Factors

1. Our first step is to create the variables for each factor we will consider when estimating medical insurance costs.

   These are the variables we will need to create:
   - `age`: age of the individual in years
   - `sex`: 0 for female, 1 for male*
   - `bmi`: individual's body mass index
   - `num_of_children`: number of children the individual has
   - `smoker`: 0 for a non-smoker, 1 for a smoker
   
   In the code block below, create the following variables for a **28**-year-old, **nonsmoking woman** who has **three children** and a **BMI** of **26.2**.
   
   **Note**: We are using this [medical insurance dataset](https://www.kaggle.com/mirichoi0218/insurance) as a guide, which unfortunately does not include data for non-binary individuals.

In [1]:
# create the initial variables below
age = 28
sex = 0
bmi = 26.2
num_of_children = 3
smoker = 0

## Working with the Formula

2. After the declaration of the variables, create a variable called `insurance_cost` that utilizes the following formula:

   $$
   \begin{aligned}
   insurance\_cost = 250*age - 128*sex \\
   + 370*bmi + 425*num\_of\_children \\
   + 24000*smoker - 12500 \\
   \end{aligned}
   $$

In [2]:
# Add insurance estimate formula below
insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500
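
As a sanity check, the formula can be evaluated one term at a time. The sketch below repeats the values from step 1 so that it runs on its own; the `terms` dictionary and its keys are illustrative names, not part of the project.

```python
# Values from step 1 (28-year-old nonsmoking woman, three children, BMI 26.2)
age = 28
sex = 0
bmi = 26.2
num_of_children = 3
smoker = 0

# Each term of the formula, kept separate so its contribution is visible.
# The dictionary and its keys are illustrative, not part of the project.
terms = {
    "age": 250 * age,
    "sex": -128 * sex,
    "bmi": 370 * bmi,
    "num_of_children": 425 * num_of_children,
    "smoker": 24000 * smoker,
    "constant": -12500,
}

for label, value in terms.items():
    print(label, value)

# Summing the terms gives the same estimate as the one-line formula
# (up to floating-point rounding).
total = sum(terms.values())
print(total)
```

Seeing the terms side by side makes it easier to predict how the estimate will move when a single factor changes.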

3. Let's display this value in an informative way. Print out the following string in the kernel:

   ```
   This person's insurance cost is {insurance_cost} dollars.
   ```
   
   You will need to use string concatenation, including the `str()` function to print out the `insurance_cost`.

In [4]:
print("This person's insurance cost is " + str(insurance_cost) + " dollars.")

This person's insurance cost is 5469.0 dollars.
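
For comparison, the same message can be built with an f-string, which later lessons in this project use. This is a minimal sketch; `message_concat` and `message_fstring` are illustrative names, and the cost is hard-coded to the value computed in step 2.

```python
insurance_cost = 5469.0  # value computed in step 2

# String concatenation: the number must be converted with str() first.
message_concat = "This person's insurance cost is " + str(insurance_cost) + " dollars."

# The same message as an f-string: the conversion happens automatically.
message_fstring = f"This person's insurance cost is {insurance_cost} dollars."

print(message_concat == message_fstring)
```

Both approaches produce the same string; concatenation simply makes the `str()` conversion explicit.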


## Looking at Age Factor

4. We have seen how our formula can estimate costs for one individual. Now let's play with some individual factors to see what role each one plays in our estimation!

   Let's start with the `age` factor. Using a plus-equal operator, add 4 years to our `age` variable.

In [6]:
age += 4

5. Now that we have changed our `age` value, we want to recalculate our insurance cost. Declare a new variable called `new_insurance_cost` in the code block below.

   Make sure you leave the `insurance_cost` variable the same as in Task 2. We will use it later in our program!

In [7]:
new_insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500

6. Next, we want to find the difference between our `new_insurance_cost` and `insurance_cost`. To do this, let's create a new variable called `change_in_insurance_cost` and set it equal to the difference between `new_insurance_cost` and `insurance_cost`.

   Note: depending on the order in which we subtract (e.g., `new_insurance_cost - insurance_cost` vs. `insurance_cost - new_insurance_cost`), we'll get a positive or negative version of the same number. To make this difference interpretable, let's calculate `new_insurance_cost - insurance_cost`. Then we can say, "people who are four years older have estimated insurance costs that are `change_in_insurance_cost` dollars different, where the sign of `change_in_insurance_cost` tells us whether the cost is higher or lower".

In [8]:
change_in_insurance_cost = new_insurance_cost - insurance_cost

7. We want to display this information in an informative way similar to the output from instruction 3. In the code block below, print the following string, where `XXX` is replaced by the value of `change_in_insurance_cost`:

   ```
   The change in cost of insurance after increasing the age by 4 years is XXX dollars.
   ```
   
   Doing this will tell us how 4 years in age affects medical insurance cost estimates assuming that all other variables remain the same.
   
   You will need to concatenate strings and use the `str()` function.

In [9]:
print("The change in cost of insurance after increasing the age by 4 years is " + str(change_in_insurance_cost) + " dollars.")

The change in cost of insurance after increasing the age by 4 years is 1000.0 dollars.
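
The result of 1000 dollars follows directly from the formula: only the age term changes, and its coefficient is 250. The sketch below checks this on its own copy of the baseline values; `older_age` is an illustrative name that is not used in the project.

```python
# Baseline values from step 1
age, sex, bmi, num_of_children, smoker = 28, 0, 26.2, 3, 0

insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500

# Same person, four years older (illustrative variable name)
older_age = age + 4
new_insurance_cost = 250 * older_age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500

change_in_insurance_cost = new_insurance_cost - insurance_cost

# Only the age term differs, so the change is the age coefficient times 4.
print(change_in_insurance_cost, 250 * 4)
```

The same reasoning applies to every factor in the formula: the change in the estimate is the factor's coefficient multiplied by the change in that factor.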


## Looking at BMI Factor

8. Now that you have looked at the age factor, let's move on to another one: BMI. First, we have to reset our `age` variable to its original value.

   Set `age` to `28`. This will reset its value and allow us to focus on just the change in the BMI factor moving forward.
   
   On the next line, using the plus-equal operator, add `3.1` to our `bmi` variable.

In [10]:
age = 28
bmi += 3.1

9. Now let's find out how a change in BMI affects insurance costs. Our next steps are pretty much the same as we have done before when looking at `age`.
   1. Below the line where `bmi` was increased by `3.1`, rewrite the insurance cost formula and assign it to the variable name `new_insurance_cost`.
   2. Save the difference between `new_insurance_cost` and `insurance_cost` in a variable called `change_in_insurance_cost`.
   3. Display the following string in the output terminal, where `XXX` is replaced by the value of `change_in_insurance_cost`:
   
   ```
   The change in estimated insurance cost after increasing BMI by 3.1 is XXX dollars.
   ```

In [11]:
new_insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500
change_in_insurance_cost = new_insurance_cost - insurance_cost
print(f"The change in estimated insurance cost after increasing BMI by 3.1 is {change_in_insurance_cost} dollars.")

The change in estimated insurance cost after increasing BMI by 3.1 is 1147.0 dollars.


## Looking at Male vs. Female Factor

10. Let's look at the effect sex has on medical insurance costs. Before we make any additional changes, first reassign your `bmi` variable back to its original value of `26.2`.

    On a new line of code in the code block below, reassign the value of `sex` to `1`. A reminder that `1` identifies male individuals and `0` identifies female individuals.

In [12]:
bmi = 26.2
sex = 1

11. Perform the steps below!
    1. Rewrite the insurance cost formula and assign it to the variable name `new_insurance_cost`.
    2. Save the difference between `new_insurance_cost` and `insurance_cost` in a variable called `change_in_insurance_cost`.
    3. Display the following string, where `XXX` is replaced by the value of `change_in_insurance_cost`:
    ```
    The change in estimated cost for being male instead of female is XXX dollars.
    ```

In [14]:
new_insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500
change_in_insurance_cost = new_insurance_cost - insurance_cost
print(f"The change in estimated cost for being male instead of female is {change_in_insurance_cost} dollars.")

The change in estimated cost for being male instead of female is -128.0 dollars.


12. Notice that this time you got a negative value for `change_in_insurance_cost`. Let's think about what that means. We changed the `sex` variable from `0` (female) to `1` (male) and it decreased the estimated insurance cost.

    This means that, according to the formula, men tend to have lower estimated medical insurance costs than women when all other factors are equal (128 dollars lower here). Reflect on the other findings you have dug up from this investigation so far.

## Extra Practice

13. Great job on the project!!!

    So far we have looked at 3 of the 5 factors in the insurance costs formula. The two remaining are `smoker` and `num_of_children`. If you want to keep challenging yourself, spend some time investigating these factors!
    1. Rewrite the insurance cost formula and assign it to the variable name `new_insurance_cost`.
    2. Save the difference between `new_insurance_cost` and `insurance_cost` in a variable called `change_in_insurance_cost`.
    3. Display the information below!

In [15]:
sex = 0
smoker = 1
new_insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500
change_in_insurance_cost = new_insurance_cost - insurance_cost
print(f"The change in estimated insurance for being a smoker is {change_in_insurance_cost} dollars.")

The change in estimated insurance for being a smoker is 24000.0 dollars.


In [16]:
smoker = 0
num_of_children = 0
new_insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500
change_in_insurance_cost = new_insurance_cost - insurance_cost
print(f"The change in estimated insurance for not having children is {change_in_insurance_cost} dollars.")

The change in estimated insurance for not having children is -1275.0 dollars.


# Python Syntax: Medical Insurance Project - Lesson 2 - Python Functions

In this code, we estimate the medical insurance costs for two individuals, Maria and Omar, based on five variables:

`age`: age of the individual in years

`sex`: 0 for female, 1 for male

`bmi`: individual’s body mass index

`num_of_children`: number of children the individual has

`smoker`: 0 for a non-smoker, 1 for a smoker

These variables are used in the following formula to estimate an individual’s insurance cost (in USD):

   $$
   \begin{aligned}
   insurance\_cost = 250*age - 128*sex \\
   + 370*bmi + 425*num\_of\_children \\
   + 24000*smoker - 12500 \\
   \end{aligned}
   $$

In [1]:
# Initial variables for Maria 
age = 28
sex = 0  
bmi = 26.2
num_of_children = 3
smoker = 0  

# Estimate Maria's insurance cost
insurance_cost = 250*age - 128*sex + 370*bmi + 425*num_of_children + 24000*smoker - 12500

print("The estimated insurance cost for Maria is " + str(insurance_cost) + " dollars.")

# Initial variables for Omar
age = 35
sex = 1 
bmi = 22.2
num_of_children = 0
smoker = 1  

# Estimate Omar's insurance cost 
insurance_cost = 250*age - 128*sex + 370*bmi + 425*num_of_children + 24000*smoker - 12500

print("The estimated insurance cost for Omar is " + str(insurance_cost) + " dollars.")

The estimated insurance cost for Maria is 5469.0 dollars.
The estimated insurance cost for Omar is 28336.0 dollars.


The code used to estimate insurance costs for Maria and Omar looks quite similar – in both cases we calculate the insurance cost using the same formula and then print the output.

This code is a great candidate for a function because it involves repeating almost identical commands in multiple places.

Let’s start by defining a function called `calculate_insurance_cost()`.

Then, inside of `calculate_insurance_cost()`, do the following:

* Create a variable called `estimated_cost`. For now, set this variable equal to a value of `1000`. You’ll add the full formula in the next step.
* Add a print statement that prints `estimated_cost`. You should output a message similar to: `"The estimated insurance cost for this person is xxx dollars."`
* Return `estimated_cost`


In [3]:
def calculate_insurance_cost():
    estimated_cost = 1000
    print(f"The estimated insurance cost for this person is {estimated_cost} dollars.")
    return estimated_cost

The function currently returns a value of 1000. We want it to return the result of our insurance cost formula instead.

Modify the function definition so that it contains five parameters:

* `age`
* `sex`
* `bmi`
* `num_of_children`
* `smoker`

In [4]:
def calculate_insurance_cost(age, sex, bmi, num_of_children, smoker):
    estimated_cost = 1000
    print(f"The estimated insurance cost for this person is {estimated_cost} dollars.")
    return estimated_cost

Now that we have set up the function to take inputs for each of the values needed in the insurance formula, we can make use of them inside of our function.

In `calculate_insurance_cost()`, change the value of `estimated_cost` from `1000` to our formula for insurance cost.

In [5]:
def calculate_insurance_cost(age, sex, bmi, num_of_children, smoker):
    estimated_cost = 250*age - 128*sex + 370*bmi + 425*num_of_children + 24000*smoker - 12500
    print(f"The estimated insurance cost for this person is {estimated_cost} dollars.")
    return estimated_cost

The function is now properly set up to calculate an individual’s medical insurance costs based on the five variables passed into it. Let’s test this out!

Go to the section of code that estimates Maria’s insurance cost.

Rename `insurance_cost` as `maria_insurance_cost` and set it equal to `calculate_insurance_cost()` with the appropriate values for Maria as arguments.

Now, remove the print statement for Maria since our function will take care of printing the estimated cost for us.

Additionally, remove the five lines of code defining the initial variables for Maria, as we are now passing in these values directly in the function call.

Repeat for Omar.

In [8]:
# Estimate Maria's insurance cost
maria_insurance_cost = calculate_insurance_cost(28, 0, 26.2, 3, 0) 

# Estimate Omar's insurance cost 
omar_insurance_cost = calculate_insurance_cost(35, 1, 22.2, 0, 1)

The estimated insurance cost for this person is 5469.0 dollars.
The estimated insurance cost for this person is 28336.0 dollars.


In the output terminal, notice that it says `"The estimated insurance cost for this person is..."` but it does not specify the actual name of the person.

To fix this, begin by adding an additional parameter called `name` to the function definition.

Next, modify the print statement in the function so that it includes the new `name` parameter, replacing `"this person"` with the actual name of the person.

We must also update our function calls, passing in the `name` variable as an argument.

Update the function call for `maria_insurance_cost`, passing in `name = "Maria"` as an argument.

Do the same for Omar, passing in `name = "Omar"`.

In [9]:
def calculate_insurance_cost(age, sex, bmi, num_of_children, smoker, name):
    estimated_cost = 250*age - 128*sex + 370*bmi + 425*num_of_children + 24000*smoker - 12500
    print(f"The estimated insurance cost for {name} is {estimated_cost} dollars.")
    return estimated_cost

In [None]:
# Estimate Maria's insurance cost
maria_insurance_cost = calculate_insurance_cost(28, 0, 26.2, 3, 0, "Maria") 

# Estimate Omar's insurance cost 
omar_insurance_cost = calculate_insurance_cost(35, 1, 22.2, 0, 1, "Omar")
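
With six parameters, it is easy to pass values in the wrong order. The sketch below repeats the function so that it runs on its own, then calls it once with positional arguments and once with keyword arguments; `maria_positional` and `maria_keyword` are illustrative names.

```python
def calculate_insurance_cost(age, sex, bmi, num_of_children, smoker, name):
    estimated_cost = 250*age - 128*sex + 370*bmi + 425*num_of_children + 24000*smoker - 12500
    print(f"The estimated insurance cost for {name} is {estimated_cost} dollars.")
    return estimated_cost

# Positional arguments must follow the parameter order exactly.
maria_positional = calculate_insurance_cost(28, 0, 26.2, 3, 0, "Maria")

# Keyword arguments can be given in any order and label each value.
maria_keyword = calculate_insurance_cost(
    name="Maria", smoker=0, num_of_children=3, bmi=26.2, sex=0, age=28
)

print(maria_positional == maria_keyword)
```

Both calls return the same estimate. The keyword form is longer, but each value is labeled at the call site.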

# Python Syntax: Medical Insurance Project - Lesson 3 - Control Flow

The function `estimate_insurance_cost()` estimates the medical insurance cost for an individual, based on four variables:

* `age`: age of the individual in years
* `sex`: 0 for female, 1 for male
* `num_of_children`: number of children the individual has
* `smoker`: 0 for a non-smoker, 1 for a smoker

These variables are used in the following formula to estimate an individual’s insurance cost:

   $$
   \begin{aligned}
   insurance\_cost = 400*age - 128*sex \\
   + 425*num\_of\_children \\
   + 10000*smoker - 2500 \\
   \end{aligned}
   $$

In [12]:
# Function to estimate insurance cost:
def estimate_insurance_cost(name, age, sex, num_of_children, smoker):
    estimated_cost = 400*age - 128*sex + 425*num_of_children + 10000*smoker - 2500
    print(name + "'s Estimated Insurance Cost: " + str(estimated_cost) + " dollars.")
    return estimated_cost
 
# Estimate Keanu's insurance cost
keanu_insurance_cost = estimate_insurance_cost(name = 'Keanu', age = 29, sex = 1, num_of_children = 3, smoker = 1)

Keanu's Estimated Insurance Cost: 20247 dollars.


Currently, our function prints out the estimated insurance cost based on the values passed into the function. But it doesn’t do much beyond that.

It would be much more helpful if our function could provide more insight into how we can lower our insurance cost. We’ll learn to do exactly that by using control flow – `if`, `elif`, and `else` statements – in our code.

In general, insurance costs are higher for smokers. We can use data from the smoker variable to provide advice on how to lower insurance costs.

Let’s create a function that analyzes an individual’s smoking status.

At the top of your code, define a function called `analyze_smoker()` that takes an input `smoker_status`.

Inside of the `analyze_smoker()` function, write an if/else statement that does the following:

* If `smoker_status` is equal to 1, print `"To lower your cost, you should consider quitting smoking."`
* Otherwise, print `"Smoking is not an issue for you."`

In [13]:
def analyze_smoker(smoker_status):
    if smoker_status == 1:
        print("To lower your cost, you should consider quitting smoking.")
    else:
        print("Smoking is not an issue for you.")

Now that we’ve written the `analyze_smoker()` function, let’s make use of it.

In the `estimate_insurance_cost()` function, go to the line of code that prints the estimated insurance cost. On the next line, make a function call to `analyze_smoker()`, passing in the smoker variable as an argument.

In [14]:
# Function to estimate insurance cost:
def estimate_insurance_cost(name, age, sex, num_of_children, smoker):
    estimated_cost = 400*age - 128*sex + 425*num_of_children + 10000*smoker - 2500
    print(name + "'s Estimated Insurance Cost: " + str(estimated_cost) + " dollars.")
    analyze_smoker(smoker)
    return estimated_cost
 
# Estimate Keanu's insurance cost
keanu_insurance_cost = estimate_insurance_cost(name = 'Keanu', age = 29, sex = 1, num_of_children = 3, smoker = 1)

Keanu's Estimated Insurance Cost: 20247 dollars.
To lower your cost, you should consider quitting smoking.
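
The `analyze_smoker()` function uses only `if` and `else`. As a sketch of how an `elif` branch fits in, the hypothetical variant below, `describe_smoker_status()`, is not part of the project. It returns its message instead of printing it and adds a fallback for values other than 0 and 1.

```python
def describe_smoker_status(smoker_status):
    # Hypothetical variant of analyze_smoker(): returns the message
    # and handles unexpected values in a final else branch.
    if smoker_status == 1:
        return "To lower your cost, you should consider quitting smoking."
    elif smoker_status == 0:
        return "Smoking is not an issue for you."
    else:
        return "Unknown smoker status: expected 0 or 1."

for status in [1, 0, 2]:
    print(describe_smoker_status(status))
```

Python checks the branches from top to bottom and runs only the first one whose condition is true, so the `else` branch is reached only when neither comparison matches.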


# Python Syntax: Medical Insurance Project - Lesson 4 - Python Lists

First, take a look at the code below.

The function `estimate_insurance_cost()` estimates the medical insurance cost for an individual, based on five variables:

* `age`: age of the individual in years
* `sex`: 0 for female, 1 for male
* `bmi`: individual’s body mass index
* `num_of_children`: number of children the individual has
* `smoker`: 0 for a non-smoker, 1 for a smoker

These variables are used in the following formula to estimate an individual’s insurance cost (in USD):

   $$
   \begin{aligned}
   insurance\_cost = 250*age - 128*sex \\
   + 370*bmi + 425*num\_of\_children \\
   + 24000*smoker - 12500 \\
   \end{aligned}
   $$

In [20]:
# Function to estimate insurance cost:
def estimate_insurance_cost(name, age, sex, bmi, num_of_children, smoker):
    estimated_cost = 250*age - 128*sex + 370*bmi + 425*num_of_children + 24000*smoker - 12500
    print(name + "'s Estimated Insurance Cost: " + str(estimated_cost) + " dollars.")
    return estimated_cost
 
# Estimate Maria's insurance cost
maria_insurance_cost = estimate_insurance_cost(name = "Maria", age = 31, sex = 0, bmi = 23.1, num_of_children = 1, smoker = 0)

# Estimate Rohan's insurance cost
rohan_insurance_cost = estimate_insurance_cost(name = "Rohan", age = 25, sex = 1, bmi = 28.5, num_of_children = 3, smoker = 0)

# Estimate Valentina's insurance cost
valentina_insurance_cost = estimate_insurance_cost(name = "Valentina", age = 53, sex = 0, bmi = 31.4, num_of_children = 0, smoker = 1)

Maria's Estimated Insurance Cost: 4222.0 dollars.
Rohan's Estimated Insurance Cost: 5442.0 dollars.
Valentina's Estimated Insurance Cost: 36368.0 dollars.


We want to compare the estimated insurance costs (as calculated by our function) to the actual amounts that Maria, Rohan, and Valentina paid.

Create a list called `names` and fill it with the names of the individuals you are estimating insurance costs for:

* "Maria"
* "Rohan"
* "Valentina"

Next, create a list called `insurance_costs` and fill it with the actual amounts that Maria, Rohan, and Valentina paid for insurance:

* 4150.0
* 5320.0
* 35210.0

In [21]:
names = ["Maria", "Rohan", "Valentina"]
insurance_costs = [4150.0, 5320.0, 35210.0]

Currently the `names` and `insurance_costs` lists are separate, but we want each name to be paired with an insurance cost.

Create a new variable called `insurance_data` that combines `names` and `insurance_costs` using the `zip()` function.

Print this new variable.

In [22]:
insurance_data = zip(names, insurance_costs)
insurance_data = list(insurance_data)
print(insurance_data)

[('Maria', 4150.0), ('Rohan', 5320.0), ('Valentina', 35210.0)]
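
The call to `list()` in the cell above matters because `zip()` returns an iterator that can be read only once. The sketch below illustrates this with the same two lists; `pairs`, `first_pass`, and `second_pass` are illustrative names.

```python
names = ["Maria", "Rohan", "Valentina"]
insurance_costs = [4150.0, 5320.0, 35210.0]

pairs = zip(names, insurance_costs)  # a lazy iterator, not a list

first_pass = list(pairs)   # reads every pair out of the iterator
second_pass = list(pairs)  # the iterator is already exhausted

print(first_pass)
print(second_pass)
```

Converting to a list once and keeping that list means the paired data can be printed, indexed, and reused as often as needed.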


Next, create an empty list called `estimated_insurance_data`.

This is the list we’ll use to store the estimated insurance costs for our three individuals.

We want to add our estimated insurance data for Maria, Rohan, and Valentina to the `estimated_insurance_data` list.

Use `.append()` to add `("Maria", maria_insurance_cost)` to `estimated_insurance_data`. Do the same for Rohan and Valentina.

Print `estimated_insurance_data`.

In [23]:
estimated_insurance_data = []
estimated_insurance_data.append(("Maria", maria_insurance_cost))
estimated_insurance_data.append(("Rohan", rohan_insurance_cost))
estimated_insurance_data.append(("Valentina", valentina_insurance_cost))
print(estimated_insurance_data)

[('Maria', 4222.0), ('Rohan', 5442.0), ('Valentina', 36368.0)]


Now we have two lists. The first one represents the actual insurance cost data and the second one represents the estimated insurance cost data.

However, it’s difficult to know this just by looking at the output. As a data scientist, you want to make sure that your data is clean and easy to understand.

Add to the print statement for `insurance_data` so that it’s clear what the list contains. The output of the print statement should look like:

Here is the actual insurance cost data: [...list output...]

Do the same for the print statement that prints `estimated_insurance_data`. The output should look like:

Here is the estimated insurance cost data: [...list output...]

In [24]:
print(f'Here is the actual insurance cost data: {insurance_data}')
print(f'Here is the estimated insurance cost data: {estimated_insurance_data}')

Here is the actual insurance cost data: [('Maria', 4150.0), ('Rohan', 5320.0), ('Valentina', 35210.0)]
Here is the estimated insurance cost data: [('Maria', 4222.0), ('Rohan', 5442.0), ('Valentina', 36368.0)]


It should be much clearer from the output what each of the two lists represents, helping you better understand the data you’re working with.

You may notice that there are differences between the actual insurance costs and estimated insurance costs. This means that our `estimate_insurance_cost()` function does not calculate insurance costs with 100% accuracy.

Compare the estimated insurance data to the actual insurance data. Do the estimated insurance costs seem to be overestimated or underestimated?

If you’d like additional practice on lists, here are some ways you might extend this project:

* Calculate the difference between the actual insurance cost data and the estimated insurance cost data for each individual, and store the results in a list called `insurance_cost_difference`.
* Estimate the insurance cost for a new individual, Akira, who is a 19-year-old male non-smoker with no children and a BMI of 27.1. Make sure to append his name to `names` and his actual insurance cost, 2930.0, to `insurance_costs`.

In [25]:
insurance_cost_difference = [estimated_insurance_data[0][1]-insurance_data[0][1],
                            estimated_insurance_data[1][1]-insurance_data[1][1],
                            estimated_insurance_data[2][1]-insurance_data[2][1]]
print(insurance_cost_difference)

[72.0, 122.0, 1158.0]


In [26]:
names.append("Akira")
akira_insurance_cost = estimate_insurance_cost(name = "Akira", age = 19, sex = 1, bmi = 27.1, num_of_children = 0, smoker = 0)
insurance_costs.append(2930.0)  # Akira's actual insurance cost, not the estimate

Akira's Estimated Insurance Cost: 2149.0 dollars.


In [27]:
print(names)
print(insurance_costs)

['Maria', 'Rohan', 'Valentina', 'Akira']
[4150.0, 5320.0, 35210.0, 2930.0]


# Python Syntax: Medical Insurance Project - Lesson 5 - Working with Python Lists

The list `names` stores the names of ten individuals, and `insurance_costs` stores their medical insurance costs.

Let’s add additional data to these lists:

* Append a new individual, "Priscilla", to `names`.
* Append her insurance cost, 8320.0, to `insurance_costs`.

Currently, the `names` and `insurance_costs` lists are separate, but we want each insurance cost to be paired with a name.

Create a new variable called `medical_records` that combines `insurance_costs` and `names` into a list using the `zip()` function.

Print out `medical_records`.

In [28]:
names = ["Mohamed", "Sara", "Xia", "Paul", "Valentina", "Jide", "Aaron", "Emily", "Nikita", "Paul"]
insurance_costs = [13262.0, 4816.0, 6839.0, 5054.0, 14724.0, 5360.0, 7640.0, 6072.0, 2750.0, 12064.0]

In [29]:
names.append("Priscilla")
insurance_costs.append(8320.0)

In [30]:
medical_records = list(zip(insurance_costs, names))
print(medical_records)

[(13262.0, 'Mohamed'), (4816.0, 'Sara'), (6839.0, 'Xia'), (5054.0, 'Paul'), (14724.0, 'Valentina'), (5360.0, 'Jide'), (7640.0, 'Aaron'), (6072.0, 'Emily'), (2750.0, 'Nikita'), (12064.0, 'Paul'), (8320.0, 'Priscilla')]


Let’s explore our medical data.

We want to see how many medical records we’re dealing with. Create a variable called `num_medical_records` that stores the length of `medical_records`.

Print `num_medical_records` with the following message:

There are {number of medical records} medical records.

In [32]:
num_medical_records = len(medical_records)
print(f'There are {num_medical_records} medical records.')

There are 11 medical records.


Select the first medical record in `medical_records`, and save it to a variable called `first_medical_record`.

Print `first_medical_record` with the following message:

Here is the first medical record: {first medical record}

In [33]:
first_medical_record = medical_records[0]
print(f'Here is the first medical record: {first_medical_record}')

Here is the first medical record: (13262.0, 'Mohamed')


Sort `medical_records` so that the individuals with the lowest insurance costs appear at the start of the list.

Print the sorted `medical_records` with the following message:

Here are the medical records sorted by insurance cost: {sorted list}

In [35]:
medical_records.sort()
print(f'Here are the medical records sorted by insurance cost: {medical_records}')

Here are the medical records sorted by insurance cost: [(2750.0, 'Nikita'), (4816.0, 'Sara'), (5054.0, 'Paul'), (5360.0, 'Jide'), (6072.0, 'Emily'), (6839.0, 'Xia'), (7640.0, 'Aaron'), (8320.0, 'Priscilla'), (12064.0, 'Paul'), (13262.0, 'Mohamed'), (14724.0, 'Valentina')]


Let’s look at the three cheapest insurance costs in our medical records.

Slice the `medical_records` list, and store the three cheapest insurance costs in a list called `cheapest_three`.

Print `cheapest_three` with the following message:

Here are the three cheapest insurance costs in our medical records: {cheapest three}

In [37]:
cheapest_three = medical_records[0:3]
print(f'Here are the three cheapest insurance costs in our medical records: {cheapest_three}')

Here are the three cheapest insurance costs in our medical records: [(2750.0, 'Nikita'), (4816.0, 'Sara'), (5054.0, 'Paul')]


Let’s look at the three most expensive insurance costs in our medical records.

Slice the `medical_records` list, and store the three most expensive insurance costs in a list called `priciest_three`.

Print `priciest_three` with the following message:

Here are the three most expensive insurance costs in our medical records: {priciest three}


In [39]:
priciest_three = medical_records[-3:]
print(f'Here are the three most expensive insurance costs in our medical records: {priciest_three}')

Here are the three most expensive insurance costs in our medical records: [(12064.0, 'Paul'), (13262.0, 'Mohamed'), (14724.0, 'Valentina')]
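
The two slices above rely on different conventions for the start index. The sketch below rebuilds the sorted records so that it runs on its own and shows equivalent ways of writing each slice; `same_as_priciest` is an illustrative name.

```python
names = ["Mohamed", "Sara", "Xia", "Paul", "Valentina", "Jide",
         "Aaron", "Emily", "Nikita", "Paul", "Priscilla"]
insurance_costs = [13262.0, 4816.0, 6839.0, 5054.0, 14724.0, 5360.0,
                   7640.0, 6072.0, 2750.0, 12064.0, 8320.0]

# Tuples sort by their first element (the cost), then by the second.
medical_records = sorted(zip(insurance_costs, names))

# Omitting the start index means "from the beginning of the list".
cheapest_three = medical_records[:3]

# A negative start index counts back from the end of the list.
priciest_three = medical_records[-3:]

# The same slice written with an explicit, non-negative start index.
same_as_priciest = medical_records[len(medical_records) - 3:]

print(cheapest_three)
print(priciest_three)
```

Negative indices keep the slice correct even if more records are appended later, since they do not depend on the list's length.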


Some individuals in our medical records have the same name. For example, the name “Paul” shows up twice.

Count the number of occurrences of “Paul” in the `names` list, and store the result in a variable called `occurrences_paul`.

Print `occurrences_paul` with the following message:

There are {occurrences Paul} individuals with the name Paul in our medical records.

In [40]:
occurrences_paul = names.count("Paul")
print(f'There are {occurrences_paul} individuals with the name Paul in our medical records.')

There are 2 individuals with the name Paul in our medical records.


If you’d like additional practice on lists, here are some ways you might extend this project:

* Sort the medical records alphabetically by name. You’ll have to create a new list using `zip()` to do this.
* Select the medical records starting at index 3 and ending at index 7, and save them in a variable called `middle_five_records`.

In [41]:
medical_records_by_name = list(zip(names, insurance_costs))
medical_records_by_name.sort()

In [42]:
print(medical_records_by_name)

[('Aaron', 7640.0), ('Emily', 6072.0), ('Jide', 5360.0), ('Mohamed', 13262.0), ('Nikita', 2750.0), ('Paul', 5054.0), ('Paul', 12064.0), ('Priscilla', 8320.0), ('Sara', 4816.0), ('Valentina', 14724.0), ('Xia', 6839.0)]
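
Re-zipping is one way to sort by name. Another option, sketched here on a shortened copy of the records, is to pass a `key` function to the built-in `sorted()`. The helper `name_of()` and the variable `records_by_name` are illustrative names, not part of the project.

```python
# A shortened copy of the (cost, name) records, for illustration only
medical_records = [(13262.0, "Mohamed"), (4816.0, "Sara"), (6839.0, "Xia"), (5054.0, "Paul")]

def name_of(record):
    # Each record is a (cost, name) tuple, so the name is at index 1.
    return record[1]

# sorted() returns a new list and leaves medical_records unchanged.
records_by_name = sorted(medical_records, key=name_of)

print(records_by_name)
```

With this approach the tuples keep their `(cost, name)` layout, and records that share a name stay in their original relative order because Python's sort is stable.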


In [44]:
middle_five_records = medical_records[3:8]
print(middle_five_records)

[(5360.0, 'Jide'), (6072.0, 'Emily'), (6839.0, 'Xia'), (7640.0, 'Aaron'), (8320.0, 'Priscilla')]


# Python Syntax: Medical Insurance Project - Lesson 6 - Python Loops

First, let’s take a look at the three lists below.

* `names` stores the names of seven individuals.
* `estimated_insurance_costs` stores the estimated medical insurance costs for the individuals.
* `actual_insurance_costs` stores the actual insurance costs paid by the individuals.

We want to calculate the average insurance cost each person paid. We’ll start by adding up all of the insurance costs.

Create a variable `total_cost` and initialize it to 0.

Use a `for` loop to iterate through `actual_insurance_costs` and add each insurance cost to the variable `total_cost`.

After the `for` loop, create a variable called `average_cost` that stores the `total_cost` divided by the length of the `actual_insurance_costs` list.

Print `average_cost` with the following message:

Average Insurance Cost: {average_cost} dollars.

In [45]:
names = ["Judith", "Abel", "Tyson", "Martha", "Beverley", "David", "Anabel"]
estimated_insurance_costs = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0, 6000.0, 7000.0]
actual_insurance_costs = [1100.0, 2200.0, 3300.0, 4400.0, 5500.0, 6600.0, 7700.0]

In [46]:
total_cost = 0

for actual_cost in actual_insurance_costs:
    total_cost += actual_cost

average_cost = total_cost / len(actual_insurance_costs)

print(f'Average Insurance Cost: {average_cost} dollars.')

Average Insurance Cost: 4400.0 dollars.
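
The explicit loop shows how accumulation works. For comparison, the sketch below computes the same total with the built-in `sum()` function; the list is repeated so that the example runs on its own.

```python
actual_insurance_costs = [1100.0, 2200.0, 3300.0, 4400.0, 5500.0, 6600.0, 7700.0]

# Accumulate the total with an explicit loop, as in the cell above.
total_cost = 0
for actual_cost in actual_insurance_costs:
    total_cost += actual_cost

# The built-in sum() replaces the loop with a single call.
average_cost = sum(actual_insurance_costs) / len(actual_insurance_costs)

print(total_cost, average_cost)
```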


For each individual in `names`, we want to determine whether their insurance cost is above or below average.

Write a `for` loop with variable `i` that goes from `0` up to, but not including, `len(names)`.

Inside of the for loop, do the following:

* Create a variable `name`, which stores `names[i]`.
* Create a variable `insurance_cost`, which stores `actual_insurance_costs[i]`.
* Print out the insurance cost for each individual, with the following message:
The insurance cost for {name} is {insurance_cost} dollars.

Inside of the `for` loop, use `if`, `elif`, `else` statements after the print statement to check whether the insurance cost is above, below, or equal to the average. Print out messages for each case:

When `insurance_cost` is higher than the average, print out the following:
The insurance cost for {name} is above average.

When `insurance_cost` is lower than the average, print out the following:
The insurance cost for {name} is below average.

Otherwise, print out the following:
The insurance cost for {name} is equal to the average.


In [47]:
for i in range(0, len(names)):
    name = names[i]
    insurance_cost = actual_insurance_costs[i]
    print(f'The insurance cost for {name} is {insurance_cost} dollars.')
    
    if insurance_cost > average_cost:
        print(f'The insurance cost for {name} is above average.')
    elif insurance_cost < average_cost:
        print(f'The insurance cost for {name} is below average.')
    else:
        print(f'The insurance cost for {name} is equal to the average.')

The insurance cost for Judith is 1100.0 dollars.
The insurance cost for Judith is below average.
The insurance cost for Abel is 2200.0 dollars.
The insurance cost for Abel is below average.
The insurance cost for Tyson is 3300.0 dollars.
The insurance cost for Tyson is below average.
The insurance cost for Martha is 4400.0 dollars.
The insurance cost for Martha is equal to the average.
The insurance cost for Beverley is 5500.0 dollars.
The insurance cost for Beverley is above average.
The insurance cost for David is 6600.0 dollars.
The insurance cost for David is above average.
The insurance cost for Anabel is 7700.0 dollars.
The insurance cost for Anabel is above average.
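
As an alternative to indexing with `i`, the two lists can be walked in parallel with `zip()`. The sketch below assumes the same `names` and `actual_insurance_costs` lists as above; the `status` and `messages` names are introduced here only for illustration.

```python
names = ["Judith", "Abel", "Tyson", "Martha", "Beverley", "David", "Anabel"]
actual_insurance_costs = [1100.0, 2200.0, 3300.0, 4400.0, 5500.0, 6600.0, 7700.0]
average_cost = sum(actual_insurance_costs) / len(actual_insurance_costs)

messages = []  # hypothetical helper list, used so the results can be inspected

# zip() yields one (name, cost) pair per position in the two lists.
for name, insurance_cost in zip(names, actual_insurance_costs):
    if insurance_cost > average_cost:
        status = 'above average'
    elif insurance_cost < average_cost:
        status = 'below average'
    else:
        status = 'equal to the average'
    messages.append(f'The insurance cost for {name} is {status}.')

print('\n'.join(messages))
```

This avoids the index variable entirely, at the cost of assuming both lists have the same length and order.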


If you look closely at `actual_insurance_costs` and `estimated_insurance_costs`, you will notice that each actual insurance cost is 10% higher than the corresponding estimated insurance cost.

Using a list comprehension, create a new list called `updated_estimated_costs`, which has each element in `estimated_insurance_costs` multiplied by 11/10.

Print `updated_estimated_costs`.

You should see that the list now looks the same as `actual_insurance_costs`.

In [48]:
updated_estimated_costs = [estimated_cost * 11/10 for estimated_cost in estimated_insurance_costs]
print(updated_estimated_costs)

[1100.0, 2200.0, 3300.0, 4400.0, 5500.0, 6600.0, 7700.0]
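
Rather than comparing the two lists by eye, the match can be checked in code. The sketch below uses `math.isclose` from the standard library, which tolerates the small rounding differences that floating-point arithmetic can introduce; the `all_close` name is an assumption made for this example.

```python
import math

estimated_insurance_costs = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0, 6000.0, 7000.0]
actual_insurance_costs = [1100.0, 2200.0, 3300.0, 4400.0, 5500.0, 6600.0, 7700.0]

updated_estimated_costs = [cost * 11 / 10 for cost in estimated_insurance_costs]

# Compare the lists element by element, allowing for tiny rounding differences.
all_close = all(
    math.isclose(updated, actual)
    for updated, actual in zip(updated_estimated_costs, actual_insurance_costs)
)

print(all_close)
```

A tolerance-based check like this is usually a safer habit than `==` when the values come out of float arithmetic.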


If you’d like extra practice with Python loops, here are some suggestions to get you started:

* Convert the first `for` loop in the code to a `while` loop.
* Modify the second `for` loop so that it also calculates how far above or below the average the estimated insurance cost is.

In [52]:
total_cost = 0
i = 0

while i < len(actual_insurance_costs):
    total_cost += actual_insurance_costs[i]
    i += 1

average_cost = total_cost / len(actual_insurance_costs)

print(f'Average Insurance Cost: {average_cost} dollars.')

Average Insurance Cost: 4400.0 dollars.


In [54]:
for i in range(0, len(names)):
    name = names[i]
    insurance_cost = actual_insurance_costs[i]
    print(f'The insurance cost for {name} is {insurance_cost} dollars.')
    
    difference = insurance_cost - average_cost
    
    if insurance_cost > average_cost:
        print(f'The insurance cost for {name} is above average. It is {difference} dollars above the average.')
    elif insurance_cost < average_cost:
        print(f'The insurance cost for {name} is below average. It is {abs(difference)} dollars below the average.')
    else:
        print(f'The insurance cost for {name} is equal to the average.')

The insurance cost for Judith is 1100.0 dollars.
The insurance cost for Judith is below average. It is 3300.0 dollars below the average.
The insurance cost for Abel is 2200.0 dollars.
The insurance cost for Abel is below average. It is 2200.0 dollars below the average.
The insurance cost for Tyson is 3300.0 dollars.
The insurance cost for Tyson is below average. It is 1100.0 dollars below the average.
The insurance cost for Martha is 4400.0 dollars.
The insurance cost for Martha is equal to the average.
The insurance cost for Beverley is 5500.0 dollars.
The insurance cost for Beverley is above average. It is 1100.0 dollars above the average.
The insurance cost for David is 6600.0 dollars.
The insurance cost for David is above average. It is 2200.0 dollars above the average.
The insurance cost for Anabel is 7700.0 dollars.
The insurance cost for Anabel is above average. It is 3300.0 dollars above the average.


# Python Syntax: Medical Insurance Project - Lesson 7 - Python Strings

First, take a look at the code below.

The string `medical_data` stores the medical records for ten individuals. Each record is separated by a `;` and contains the name, age, BMI (body mass index), and insurance cost for an individual, in that order.

Print `medical_data` to see the output in the terminal.

In [1]:
medical_data = \
"""Marina Allison   ,27   ,   31.1 , 
#7010.0   ;Markus Valdez   ,   30, 
22.4,   #4050.0 ;Connie Ballard ,43 
,   25.3 , #12060.0 ;Darnell Weber   
,   35   , 20.6   , #7500.0;
Sylvie Charles   ,22, 22.1 
,#3022.0   ;   Vinay Padilla,24,   
26.9 ,#4620.0 ;Meredith Santiago, 51   , 
29.3 ,#16330.0;   Andre Mccarty, 
19,22.7 , #2900.0 ; 
Lorena Hodson ,65, 33.1 , #19370.0; 
Isaac Vu ,34, 24.8,   #7045.0"""

In [2]:
print(medical_data)

Marina Allison   ,27   ,   31.1 , 
#7010.0   ;Markus Valdez   ,   30, 
22.4,   #4050.0 ;Connie Ballard ,43 
,   25.3 , #12060.0 ;Darnell Weber   
,   35   , 20.6   , #7500.0;
Sylvie Charles   ,22, 22.1 
,#3022.0   ;   Vinay Padilla,24,   
26.9 ,#4620.0 ;Meredith Santiago, 51   , 
29.3 ,#16330.0;   Andre Mccarty, 
19,22.7 , #2900.0 ; 
Lorena Hodson ,65, 33.1 , #19370.0; 
Isaac Vu ,34, 24.8,   #7045.0


We want the insurance costs to be represented in US dollars.

Replace all instances of `#` in `medical_data` with `$`. Store the result in a variable called `updated_medical_data`.

Print `updated_medical_data`.

In [4]:
updated_medical_data = medical_data.replace('#', '$')
print(updated_medical_data)

Marina Allison   ,27   ,   31.1 , 
$7010.0   ;Markus Valdez   ,   30, 
22.4,   $4050.0 ;Connie Ballard ,43 
,   25.3 , $12060.0 ;Darnell Weber   
,   35   , 20.6   , $7500.0;
Sylvie Charles   ,22, 22.1 
,$3022.0   ;   Vinay Padilla,24,   
26.9 ,$4620.0 ;Meredith Santiago, 51   , 
29.3 ,$16330.0;   Andre Mccarty, 
19,22.7 , $2900.0 ; 
Lorena Hodson ,65, 33.1 , $19370.0; 
Isaac Vu ,34, 24.8,   $7045.0


We want to calculate the number of medical records in our data.

Create a variable called `num_records` and initialize it at 0.

Next, write a `for` loop to iterate through the `updated_medical_data` string. Inside of the loop, add `1` to `num_records` when the current character is equal to `$`.

Outside of the loop, print `num_records` with the following message:

There are {num_records} medical records in the data.

In [6]:
num_records = 0
for ch in updated_medical_data:
    if ch == '$':
        num_records += 1
print(f'There are {num_records} medical records in the data.')

There are 10 medical records in the data.
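
The character-by-character loop can also be expressed with the string method `str.count()`. The sketch below uses a shortened, hypothetical `sample_data` string with three records so that it runs on its own; it is not the full dataset from above.

```python
# Shortened sample in the same layout as updated_medical_data (assumption: 3 records).
sample_data = (
    "Marina Allison,27,31.1,$7010.0;"
    "Markus Valdez,30,22.4,$4050.0;"
    "Connie Ballard,43,25.3,$12060.0"
)

# Each record contains exactly one '$', so counting '$' counts the records.
num_records = sample_data.count('$')

print(f'There are {num_records} medical records in the data.')
```

This relies on the same assumption as the loop: every record contains exactly one `$`.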


The medical data in its current form is difficult to analyze. An essential job for a data scientist is to clean up data so that it’s easy to work with.

Let’s start off by splitting the `updated_medical_data` string into a list of each medical record. Remember that each medical record is separated by a `;` in the string.

Store the result in a variable called `medical_data_split` and print this variable.

In [8]:
medical_data_split = updated_medical_data.split(';')
print(medical_data_split)

['Marina Allison   ,27   ,   31.1 , \n$7010.0   ', 'Markus Valdez   ,   30, \n22.4,   $4050.0 ', 'Connie Ballard ,43 \n,   25.3 , $12060.0 ', 'Darnell Weber   \n,   35   , 20.6   , $7500.0', '\nSylvie Charles   ,22, 22.1 \n,$3022.0   ', '   Vinay Padilla,24,   \n26.9 ,$4620.0 ', 'Meredith Santiago, 51   , \n29.3 ,$16330.0', '   Andre Mccarty, \n19,22.7 , $2900.0 ', ' \nLorena Hodson ,65, 33.1 , $19370.0', ' \nIsaac Vu ,34, 24.8,   $7045.0']


Our data is now stored in a list, but it is still hard to read. Let’s split each medical record into its own list.

First, define an empty list called `medical_records`.

Next, iterate through `medical_data_split` and, for each record, split the string at each comma (`,`) and append the resulting list to `medical_records`.

Print `medical_records` after the loop.

In [9]:
medical_records = []
for record in medical_data_split:
    medical_records.append(record.split(','))
print(medical_records)

[['Marina Allison   ', '27   ', '   31.1 ', ' \n$7010.0   '], ['Markus Valdez   ', '   30', ' \n22.4', '   $4050.0 '], ['Connie Ballard ', '43 \n', '   25.3 ', ' $12060.0 '], ['Darnell Weber   \n', '   35   ', ' 20.6   ', ' $7500.0'], ['\nSylvie Charles   ', '22', ' 22.1 \n', '$3022.0   '], ['   Vinay Padilla', '24', '   \n26.9 ', '$4620.0 '], ['Meredith Santiago', ' 51   ', ' \n29.3 ', '$16330.0'], ['   Andre Mccarty', ' \n19', '22.7 ', ' $2900.0 '], [' \nLorena Hodson ', '65', ' 33.1 ', ' $19370.0'], [' \nIsaac Vu ', '34', ' 24.8', '   $7045.0']]


Our data is now slightly more readable. However, it is not properly formatted – it contains unnecessary whitespace.

To fix this, let’s start by creating an empty list called `medical_records_clean`.

Next, use a for loop to iterate through `medical_records`.

Inside of the loop, create an empty list called `record_clean`. We’ll use this list to store a formatted version of each medical record.

After the `record_clean` variable, create a nested `for` loop that goes through each `item` in the current `record`.

Inside of this loop, append `item.strip()` to `record_clean` to remove any whitespace from the string.

Finally, we need to add each cleaned up record to `medical_records_clean`.

Outside of the nested for loop, append `record_clean` to `medical_records_clean`.

Print `medical_records_clean` outside of the for loops to see the output.

You should see output that is formatted and much easier to read.

In [11]:
medical_records_clean = []

for record in medical_records:
    record_clean = []
    for item in record:
        record_clean.append(item.strip())
    medical_records_clean.append(record_clean)

print(medical_records_clean)

[['Marina Allison', '27', '31.1', '$7010.0'], ['Markus Valdez', '30', '22.4', '$4050.0'], ['Connie Ballard', '43', '25.3', '$12060.0'], ['Darnell Weber', '35', '20.6', '$7500.0'], ['Sylvie Charles', '22', '22.1', '$3022.0'], ['Vinay Padilla', '24', '26.9', '$4620.0'], ['Meredith Santiago', '51', '29.3', '$16330.0'], ['Andre Mccarty', '19', '22.7', '$2900.0'], ['Lorena Hodson', '65', '33.1', '$19370.0'], ['Isaac Vu', '34', '24.8', '$7045.0']]
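
The split and strip steps above can be combined into one nested list comprehension. This is a sketch on a two-record, hypothetical `raw` string in the same messy layout as the original data, not a replacement for the step-by-step version.

```python
# Two records in the same messy layout as the original data (shortened sample).
raw = "Marina Allison   ,27   ,   31.1 , \n$7010.0   ;Markus Valdez   ,   30, \n22.4,   $4050.0 "

# Outer comprehension: one list per record (split on ';').
# Inner comprehension: split the record on ',' and strip whitespace/newlines.
records = [
    [item.strip() for item in record.split(',')]
    for record in raw.split(';')
]

print(records)
```

Note that `.strip()` with no arguments removes newlines as well as spaces, which is why the `\n` characters disappear.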


Our data is now clean and ready for analysis.

For example, to print out the names of each of the ten individuals, we can use the following loop:

<code>for record in medical_records_clean:
    print(record[0])</code>

You want all of the names in the medical records to be in uppercase characters.

In the `for` loop, update `record[0]` before the print statement so that all of the characters are uppercase.

In [12]:
for record in medical_records_clean:
    print(record[0].upper())

MARINA ALLISON
MARKUS VALDEZ
CONNIE BALLARD
DARNELL WEBER
SYLVIE CHARLES
VINAY PADILLA
MEREDITH SANTIAGO
ANDRE MCCARTY
LORENA HODSON
ISAAC VU


Let’s store each name, age, BMI, and insurance cost in separate lists.

To start, create four empty lists:

* `names`
* `ages`
* `bmis`
* `insurance_costs`

Next, iterate through `medical_records_clean` and for each record:

* Append the name to `names`.
* Append the age to `ages`.
* Append the BMI to `bmis`.
* Append the insurance cost to `insurance_costs`.

Print `names`, `ages`, `bmis`, and `insurance_costs` outside of the loop.

In [15]:
names = []
ages = []
bmis = []
insurance_costs = []

for record in medical_records_clean:
    names.append(record[0])
    ages.append(record[1])
    bmis.append(record[2])
    insurance_costs.append(record[3])
    
print(names)
print(ages)
print(bmis)
print(insurance_costs)

['Marina Allison', 'Markus Valdez', 'Connie Ballard', 'Darnell Weber', 'Sylvie Charles', 'Vinay Padilla', 'Meredith Santiago', 'Andre Mccarty', 'Lorena Hodson', 'Isaac Vu']
['27', '30', '43', '35', '22', '24', '51', '19', '65', '34']
['31.1', '22.4', '25.3', '20.6', '22.1', '26.9', '29.3', '22.7', '33.1', '24.8']
['$7010.0', '$4050.0', '$12060.0', '$7500.0', '$3022.0', '$4620.0', '$16330.0', '$2900.0', '$19370.0', '$7045.0']
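
Every value in these lists is still a string, as the quotes in the output show. Before doing arithmetic, the values need to be converted. The sketch below uses the first three entries of each list; the names `ages_int`, `bmis_float`, and `costs_float` are assumptions made for this example.

```python
# First three entries of each list from the output above.
ages = ['27', '30', '43']
bmis = ['31.1', '22.4', '25.3']
insurance_costs = ['$7010.0', '$4050.0', '$12060.0']

ages_int = [int(age) for age in ages]
bmis_float = [float(bmi) for bmi in bmis]
# Remove the leading '$' before converting, since float('$7010.0') would raise a ValueError.
costs_float = [float(cost.lstrip('$')) for cost in insurance_costs]

print(ages_int)
print(bmis_float)
print(costs_float)
```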


Now that all of our data is in separate lists, we can easily perform analysis on that data. Let’s calculate the average BMI in our dataset.

First, create a variable called `total_bmi` and set it equal to 0.

Next, use a `for` loop to iterate through `bmis` and add each BMI to `total_bmi`. The values in `bmis` are strings, so convert each one with `float()` before adding it.

After the `for` loop, create a variable called `average_bmi` that stores `total_bmi` divided by the length of the `bmis` list.

Print out `average_bmi` with the following message:

Average BMI: {average_bmi}

In [17]:
total_bmi = 0

for bmi in bmis:
    total_bmi += float(bmi)
    
average_bmi = total_bmi / len(bmis)

print(f'Average BMI: {average_bmi}')

Average BMI: 25.830000000000002
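
The trailing digits in the output above come from floating-point arithmetic. A format specifier inside the f-string can round the value for display. This is a minimal sketch, assuming the same `bmis` values as above; the `formatted` name is introduced only for this example.

```python
bmis = ['31.1', '22.4', '25.3', '20.6', '22.1', '26.9', '29.3', '22.7', '33.1', '24.8']

# Convert each string to a float while summing, then divide by the count.
average_bmi = sum(float(bmi) for bmi in bmis) / len(bmis)

# ':.2f' formats the number with two digits after the decimal point.
formatted = f'Average BMI: {average_bmi:.2f}'
print(formatted)
```

The format specifier changes only how the value is displayed; `average_bmi` itself keeps its full precision.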


If you’d like extra practice with Python strings, here are some suggestions to get you started:

* Calculate the average insurance cost in `insurance_costs`. You will have to remove the `$` in order to calculate this.
* Write a for loop that outputs a string for each individual in the following format:

Marina is 27 years old with a BMI of 31.1 and an insurance cost of $7010.0.

...
...

In [19]:
total_insurance_costs = 0

for insurance_cost in insurance_costs:
    total_insurance_costs += float(insurance_cost.replace('$',''))
    
average_insurance_cost = total_insurance_costs / len(insurance_costs)

print(f'Average Insurance Cost: {average_insurance_cost}')

Average Insurance Cost: 8390.7


In [22]:
for record in medical_records_clean:
    print('{name} is {age} years old with a BMI of {bmi} and an insurance cost of {insurance_cost}'.format(name = record[0], age = record[1], bmi = record[2], insurance_cost = record[3]))

Marina Allison is 27 years old with a BMI of 31.1 and an insurance cost of $7010.0
Markus Valdez is 30 years old with a BMI of 22.4 and an insurance cost of $4050.0
Connie Ballard is 43 years old with a BMI of 25.3 and an insurance cost of $12060.0
Darnell Weber is 35 years old with a BMI of 20.6 and an insurance cost of $7500.0
Sylvie Charles is 22 years old with a BMI of 22.1 and an insurance cost of $3022.0
Vinay Padilla is 24 years old with a BMI of 26.9 and an insurance cost of $4620.0
Meredith Santiago is 51 years old with a BMI of 29.3 and an insurance cost of $16330.0
Andre Mccarty is 19 years old with a BMI of 22.7 and an insurance cost of $2900.0
Lorena Hodson is 65 years old with a BMI of 33.1 and an insurance cost of $19370.0
Isaac Vu is 34 years old with a BMI of 24.8 and an insurance cost of $7045.0


# Python Syntax: Medical Insurance Project - Lesson 8 - Python Dictionaries

We would like to keep a record of medical patients and their insurance costs.

First, create an empty dictionary called `medical_costs`.

Let’s populate our `medical_costs` dictionary by adding the following key-value pairs:

* Add `"Marina"` to `medical_costs` as a key with a value of `6607.0`.
* Add `"Vinay"` to `medical_costs` as a key with a value of `3225.0`.

In [2]:
medical_costs = {}

medical_costs['Marina'] = 6607.0
medical_costs['Vinay'] = 3225.0

Using one line of code, add the following three patients to the `medical_costs` dictionary:

* `"Connie"`, with an insurance cost of `8886.0`
* `"Isaac"`, with an insurance cost of `16444.0`
* `"Valentina"`, with an insurance cost of `6420.0`

Print `medical_costs`. Make sure the dictionary is what you expected.

In [3]:
medical_costs.update({"Connie": 8886.0, "Isaac": 16444.0, "Valentina": 6420.0})
print(medical_costs)

{'Marina': 6607.0, 'Vinay': 3225.0, 'Connie': 8886.0, 'Isaac': 16444.0, 'Valentina': 6420.0}


You notice that `Vinay`’s insurance cost was entered incorrectly. Update the value associated with `Vinay` to `3325.0`.

Print the updated dictionary.

In [6]:
medical_costs['Vinay'] = 3325.0
print(medical_costs)

{'Marina': 6607.0, 'Vinay': 3325.0, 'Connie': 8886.0, 'Isaac': 16444.0, 'Valentina': 6420.0}


Let’s calculate the average medical cost across all patients. Create a variable called `total_cost` and set it equal to `0`.

Next, iterate through the values in `medical_costs` and add each value to the `total_cost` variable.

After the loop, create a variable called `average_cost` that stores the `total_cost` divided by the length of the `medical_costs` dictionary.

Print `average_cost` with the following message:

Average Insurance Cost: {average_cost}

In [7]:
total_cost = 0

for cost in medical_costs.values():
    total_cost += cost

average_cost = total_cost / len(medical_costs)

In [8]:
print(f"Average Insurance Cost: {average_cost}")

Average Insurance Cost: 8336.4


### List Comprehension to Dictionary
You have been asked to create a second dictionary that maps patient names to their ages.

First, create two lists called `names` and `ages` with the following data:

| names 	| ages  | 
| --------- | ----- | 
| Marina	| 27    | 
| Vinay	    | 24    | 
| Connie	| 43    | 
| Isaac	    | 35    | 
| Valentina	| 52    | 

Next, create a variable called `zipped_ages` that is a zipped list of pairs between the `names` list and the `ages` list.

In [9]:
names = ['Marina', 'Vinay', 'Connie', 'Isaac', 'Valentina']
ages = [27, 24, 43, 35, 52]

zipped_ages = zip(names, ages)

Create a dictionary called `names_to_ages` by using a dictionary comprehension (the dictionary counterpart of a list comprehension) that iterates through `zipped_ages` and turns each pair into a `key: value` item.

Print `names_to_ages` to see the result.

In [10]:
names_to_ages = {name : age for name, age in zipped_ages}
print(names_to_ages)

{'Marina': 27, 'Vinay': 24, 'Connie': 43, 'Isaac': 35, 'Valentina': 52}
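
One detail worth knowing: in Python 3, `zip()` returns an iterator that can be consumed only once. The sketch below illustrates this with the same `names` and `ages` lists; the names `second_pass` and `names_to_ages_direct` are assumptions made for this example.

```python
names = ['Marina', 'Vinay', 'Connie', 'Isaac', 'Valentina']
ages = [27, 24, 43, 35, 52]

zipped_ages = zip(names, ages)

# The first pass consumes every pair from the iterator.
names_to_ages = {name: age for name, age in zipped_ages}

# A second pass over the same iterator finds nothing left.
second_pass = {name: age for name, age in zipped_ages}
print(second_pass)  # {}

# dict() accepts an iterable of pairs directly, so this builds the same mapping.
names_to_ages_direct = dict(zip(names, ages))
print(names_to_ages_direct == names_to_ages)  # True
```

If the pairs are needed more than once, calling `zip(names, ages)` again or wrapping it in `list()` avoids the empty second pass.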


Use `.get()` to get the value of Marina’s age and store it in a variable called `marina_age`. Use `None` as a default value if the key doesn’t exist.

Print `marina_age` with the following message:

Marina's age is {marina_age}

In [11]:
marina_age = names_to_ages.get('Marina', None)

print(f"Marina's age is {marina_age}")

Marina's age is 27
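
The default value matters only when the key is missing. The sketch below compares `.get()` on a present key and on an absent one; `"Alex"` is used here purely as an example of a name that is not in `names_to_ages`.

```python
names_to_ages = {'Marina': 27, 'Vinay': 24, 'Connie': 43, 'Isaac': 35, 'Valentina': 52}

# Key present: .get() returns the stored value.
marina_age = names_to_ages.get('Marina', None)

# Key absent: .get() returns the default instead of raising KeyError.
alex_age = names_to_ages.get('Alex', None)

print(f"Marina's age is {marina_age}")
print(f"Alex's age is {alex_age}")
```

By contrast, `names_to_ages['Alex']` would raise a `KeyError`. Because `None` is already the default for `.get()`, passing it explicitly is optional.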


### Using a Dictionary to create a medical database

Let’s create a third dictionary to represent a database of medical records that contains information such as a patient’s name, age, sex, BMI, number of children, smoker status, and insurance cost.

First, create an empty dictionary called `medical_records`.

Next, add `"Marina"` to `medical_records` as a key with the value being a dictionary of medical data:

`{"Age": 27, "Sex": "Female", "BMI": 31.1, "Children": 2, "Smoker": "Non-smoker", "Insurance_cost": 6607.0}`

Do the same for the following individuals:

| Name	    | Age	| Sex	    | BMI	| Children	| Smoker	    | Insurance Cost | 
| --------- | ----- | --------- | ----- | --------- | ------------- | -------------- |
| Vinay	    | 24	| Male	    | 26.9	| 0	        | Non-smoker	| 3225.0         | 
| Connie	| 43	| Female	| 25.3	| 3	        | Non-smoker	| 8886.0         | 
| Isaac	    | 35	| Male	    | 20.6	| 4	        | Smoker	    | 16444.0        | 
| Valentina	| 52	| Female	| 18.7	| 1	        | Non-smoker	| 6420.0         | 

In [12]:
medical_records = {}

medical_records['Marina'] = {"Age": 27, "Sex": "Female", "BMI": 31.1, "Children": 2, "Smoker": "Non-smoker", "Insurance_cost": 6607.0}

new_data = {
    "Vinay": {
        "Age": 24,
        "Sex": "Male",
        "BMI": 26.9,
        "Children": 0,
        "Smoker": "Non-smoker",
        "Insurance_cost": 3225.0
    },
    "Connie": {
        "Age": 43,
        "Sex": "Female",
        "BMI": 25.3,
        "Children": 3,
        "Smoker": "Non-smoker",
        "Insurance_cost": 8886.0
    },
    "Isaac": {
        "Age": 35,
        "Sex": "Male",
        "BMI": 20.6,
        "Children": 4,
        "Smoker": "Smoker",
        "Insurance_cost": 16444.0
    },
    "Valentina": {
        "Age": 52,
        "Sex": "Female",
        "BMI": 18.7,
        "Children": 1,
        "Smoker": "Non-smoker",
        "Insurance_cost": 6420.0
    }
}

# Update the medical_records dictionary with the new data
medical_records.update(new_data)

print(medical_records)

{'Marina': {'Age': 27, 'Sex': 'Female', 'BMI': 31.1, 'Children': 2, 'Smoker': 'Non-smoker', 'Insurance_cost': 6607.0}, 'Vinay': {'Age': 24, 'Sex': 'Male', 'BMI': 26.9, 'Children': 0, 'Smoker': 'Non-smoker', 'Insurance_cost': 3225.0}, 'Connie': {'Age': 43, 'Sex': 'Female', 'BMI': 25.3, 'Children': 3, 'Smoker': 'Non-smoker', 'Insurance_cost': 8886.0}, 'Isaac': {'Age': 35, 'Sex': 'Male', 'BMI': 20.6, 'Children': 4, 'Smoker': 'Smoker', 'Insurance_cost': 16444.0}, 'Valentina': {'Age': 52, 'Sex': 'Female', 'BMI': 18.7, 'Children': 1, 'Smoker': 'Non-smoker', 'Insurance_cost': 6420.0}}


The `medical_records` dictionary acts like a database of medical records. Let’s access a specific piece of data in `medical_records`.

Print out Connie’s insurance cost with the following message:

Connie's insurance cost is X dollars.

In [15]:
connie_insurance_cost = medical_records["Connie"]["Insurance_cost"]
print(f"Connie's insurance cost is {connie_insurance_cost} dollars.")

Connie's insurance cost is 8886.0 dollars.


Vinay has moved to a new country and we no longer want to include him in our medical records.

Remove `Vinay` from `medical_records`.

In [16]:
medical_records.pop('Vinay')
print(medical_records)

{'Marina': {'Age': 27, 'Sex': 'Female', 'BMI': 31.1, 'Children': 2, 'Smoker': 'Non-smoker', 'Insurance_cost': 6607.0}, 'Connie': {'Age': 43, 'Sex': 'Female', 'BMI': 25.3, 'Children': 3, 'Smoker': 'Non-smoker', 'Insurance_cost': 8886.0}, 'Isaac': {'Age': 35, 'Sex': 'Male', 'BMI': 20.6, 'Children': 4, 'Smoker': 'Smoker', 'Insurance_cost': 16444.0}, 'Valentina': {'Age': 52, 'Sex': 'Female', 'BMI': 18.7, 'Children': 1, 'Smoker': 'Non-smoker', 'Insurance_cost': 6420.0}}
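
`.pop()` also returns the value it removes, and it accepts an optional default for keys that are not present. The sketch below uses a trimmed-down, two-patient version of `medical_records`; the names `removed` and `missing` are assumptions made for this example.

```python
# Trimmed-down sample with only two patients and two fields each.
medical_records = {
    'Marina': {'Age': 27, 'Insurance_cost': 6607.0},
    'Vinay': {'Age': 24, 'Insurance_cost': 3225.0},
}

# pop() removes the key and hands back the associated value.
removed = medical_records.pop('Vinay')
print(removed)

# Popping the same key again would raise KeyError; a default avoids that.
missing = medical_records.pop('Vinay', None)
print(missing)  # None
```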


Let’s take a closer look at each patient’s medical record.

Use a `for` loop to iterate through the items of `medical_records`. For each key-value pair, print out a string that looks like the following:

{Name} is a {Age} year old {Sex} {Smoker} with a BMI of {BMI} and insurance cost of {Insurance_cost}

In [19]:
for name, data in medical_records.items():
    age = data["Age"]
    sex = data["Sex"]
    smoker = data["Smoker"]
    bmi = data["BMI"]
    insurance_cost = data["Insurance_cost"]
        
    print(f"{name} is a {age} year old {sex} {smoker} with a BMI of {bmi} and insurance cost of {insurance_cost}")
    

Marina is a 27 year old Female Non-smoker with a BMI of 31.1 and insurance cost of 6607.0
Connie is a 43 year old Female Non-smoker with a BMI of 25.3 and insurance cost of 8886.0
Isaac is a 35 year old Male Smoker with a BMI of 20.6 and insurance cost of 16444.0
Valentina is a 52 year old Female Non-smoker with a BMI of 18.7 and insurance cost of 6420.0


If you’d like extra practice with dictionaries, here are some suggestions to go further with this project:

Create a function called `update_medical_records()` that takes in the name of an individual as well as their medical data, and then updates the `medical_records` dictionary accordingly.


In [20]:
def update_medical_records(name, medical_data):
    medical_records.update({name : medical_data})

name = "Alex"
medical_data = {
    'Age': 35, 
    'Sex': 'Male', 
    'BMI': 20.6, 
    'Children': 0, 
    'Smoker': 'Smoker', 
    'Insurance_cost': 10444.0}

update_medical_records(name, medical_data)
print(medical_records)

{'Marina': {'Age': 27, 'Sex': 'Female', 'BMI': 31.1, 'Children': 2, 'Smoker': 'Non-smoker', 'Insurance_cost': 6607.0}, 'Connie': {'Age': 43, 'Sex': 'Female', 'BMI': 25.3, 'Children': 3, 'Smoker': 'Non-smoker', 'Insurance_cost': 8886.0}, 'Isaac': {'Age': 35, 'Sex': 'Male', 'BMI': 20.6, 'Children': 4, 'Smoker': 'Smoker', 'Insurance_cost': 16444.0}, 'Valentina': {'Age': 52, 'Sex': 'Female', 'BMI': 18.7, 'Children': 1, 'Smoker': 'Non-smoker', 'Insurance_cost': 6420.0}, 'Alex': {'Age': 35, 'Sex': 'Male', 'BMI': 20.6, 'Children': 0, 'Smoker': 'Smoker', 'Insurance_cost': 10444.0}}
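
The function above modifies the global `medical_records` dictionary. One possible variation, shown here only as a sketch, passes the dictionary in as a parameter so the function can be reused with any records dictionary. The `records` parameter and the returned value are assumptions made for this example, not part of the exercise.

```python
def update_medical_records(records, name, medical_data):
    """Add or replace the entry for `name` in `records` and return the dictionary."""
    records[name] = medical_data
    return records


# Small sample dictionary so the snippet runs on its own.
medical_records = {
    'Marina': {'Age': 27, 'Sex': 'Female', 'BMI': 31.1, 'Children': 2,
               'Smoker': 'Non-smoker', 'Insurance_cost': 6607.0},
}

alex_data = {'Age': 35, 'Sex': 'Male', 'BMI': 20.6, 'Children': 0,
             'Smoker': 'Smoker', 'Insurance_cost': 10444.0}

update_medical_records(medical_records, 'Alex', alex_data)
print(list(medical_records))
```

Calling the function with a name that already exists overwrites that patient's record, which matches how `dict.update()` behaves in the original version.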
