### U.S. Medical Insurance Costs
This project investigates a CSV file of medical insurance costs using Python fundamentals. The goal is to analyze the attributes in insurance.csv to learn more about the patients it describes and to identify potential use cases for the dataset.

In [12]:
import pandas as pd
import numpy as np

In [13]:
medical = pd.read_csv("insurance.csv")

In [14]:
medical.insert(loc=0, column='ID', value=np.arange(len(medical)))
print(medical.head())

   ID  age     sex     bmi  children smoker     region      charges
0   0   19  female  27.900         0    yes  southwest  16884.92400
1   1   18    male  33.770         1     no  southeast   1725.55230
2   2   28    male  33.000         3     no  southeast   4449.46200
3   3   33    male  22.705         0     no  northwest  21984.47061
4   4   32    male  28.880         0     no  northwest   3866.85520


**insurance.csv** contains the following columns:
* Patient Age
* Patient Sex 
* Patient BMI
* Patient Number of Children
* Patient Smoking Status
* Patient U.S. Geographical Region
* Patient Yearly Medical Insurance Cost

There are no signs of missing data. 
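One way to back up the missing-data observation is to count null values per column. The sketch below is illustrative only: `missing_summary` is a hypothetical helper, and `sample` is a small made-up frame standing in for `medical` (it deliberately contains one missing age so the check has something to find).

```python
import pandas as pd

def missing_summary(df: pd.DataFrame) -> pd.Series:
    """Return the number of missing (NaN/None) values in each column."""
    return df.isna().sum()

# Hypothetical sample standing in for the real data; not taken from insurance.csv.
sample = pd.DataFrame({
    "age": [19, 18, None],
    "sex": ["female", "male", "male"],
    "charges": [16884.924, 1725.5523, 4449.462],
})

counts = missing_summary(sample)
print(counts)
```

Calling `missing_summary(medical)` on the real DataFrame should show zero in every column if the file is in fact complete.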

I will investigate the differences between male and female patients, as well as regional differences in insurance costs. 

In [15]:
print("The average age of a female patient is ")
print(np.mean(medical.age[medical.sex == 'female']))
print("The average age of a male patient is ")
print(np.mean(medical.age[medical.sex == 'male']))

The average age of a female patient is 
39.503021148036254
The average age of a male patient is 
38.917159763313606


In [16]:
print("The average number of children for a female patient is ")
print(round(np.mean(medical.children[medical.sex == 'female'])))
print("The average number of children for a male patient is ")
print(round(np.mean(medical.children[medical.sex == 'male'])))
print("The average number of children for both female and male patients rounds to 1.")

The average number of children for a female patient is 
1
The average number of children for a male patient is 
1
The average number of children for both female and male patients rounds to 1.
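Rounding each mean to a whole number can hide small differences between the groups. As a hedged alternative, the sketch below computes unrounded means for several columns in one `groupby` call. `mean_by_sex` is a hypothetical helper and `sample` is made-up data, not rows from insurance.csv.

```python
import pandas as pd

def mean_by_sex(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Return the unrounded mean of each requested column, split by sex."""
    return df.groupby("sex")[columns].mean()

# Hypothetical sample; on the real data, pass `medical` instead.
sample = pd.DataFrame({
    "sex": ["female", "male", "male", "female"],
    "age": [19, 18, 28, 31],
    "children": [0, 1, 3, 2],
})

means = mean_by_sex(sample, ["age", "children"])
print(means)
```

On the real data this would be `mean_by_sex(medical, ["age", "children"])`, which returns one row per sex and keeps the decimal places.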


In [17]:
region_counts = medical.groupby('region').ID.count().reset_index()
region_counts = region_counts.rename(columns={"ID": "counts"})
print(region_counts)

      region  counts
0  northeast     324
1  northwest     325
2  southeast     364
3  southwest     325
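The region counts above rely on the added `ID` column. A possible alternative, sketched here under the assumption that only row counts are needed, is `groupby(...).size()`, which counts rows per group directly. `count_by_region` and `sample` are hypothetical names used for illustration.

```python
import pandas as pd

def count_by_region(df: pd.DataFrame) -> pd.DataFrame:
    """Count rows per region without depending on any particular column."""
    return df.groupby("region").size().reset_index(name="counts")

# Hypothetical sample; not taken from insurance.csv.
sample = pd.DataFrame({
    "region": ["southwest", "southeast", "southeast", "northwest"],
    "charges": [100.0, 200.0, 300.0, 400.0],
})

region_counts_sample = count_by_region(sample)
print(region_counts_sample)
```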


In [18]:
northeast_df = medical[medical.region == 'northeast']
northwest_df = medical[medical.region == 'northwest']
southeast_df = medical[medical.region == 'southeast']
southwest_df = medical[medical.region == 'southwest']

In [19]:
region_cost = medical.groupby('region').charges.mean().reset_index()
print("The average medical insurance cost per region is as follows: ")
print(region_cost)

The average medical insurance cost per region is as follows: 
      region       charges
0  northeast  13406.384516
1  northwest  12417.575374
2  southeast  14735.411438
3  southwest  12346.937377


In [20]:
ne_by_gender = northeast_df.groupby('sex').charges.mean().reset_index()
print("The average medical insurance cost for the Northeast by gender is as follows: ")
print(ne_by_gender)
nw_by_gender = northwest_df.groupby('sex').charges.mean().reset_index()
print("The average medical insurance cost for the Northwest by gender is as follows: ")
print(nw_by_gender)
se_by_gender = southeast_df.groupby('sex').charges.mean().reset_index()
print("The average medical insurance cost for the Southeast by gender is as follows: ")
print(se_by_gender)
sw_by_gender = southwest_df.groupby('sex').charges.mean().reset_index()
print("The average medical insurance cost for the Southwest by gender is as follows: ")
print(sw_by_gender)

The average medical insurance cost for the Northeast by gender is as follows: 
      sex       charges
0  female  12953.203151
1    male  13854.005374
The average medical insurance cost for the Northwest by gender is as follows: 
      sex       charges
0  female  12479.870397
1    male  12354.119575
The average medical insurance cost for the Southeast by gender is as follows: 
      sex       charges
0  female  13499.669243
1    male  15879.617173
The average medical insurance cost for the Southwest by gender is as follows: 
      sex       charges
0  female  11274.411264
1    male  13412.883576
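The four per-region tables can also be produced as a single region-by-sex table, which makes the comparison easier to read at a glance. The sketch below uses `DataFrame.pivot_table`. `mean_charges_by_region_and_sex` is a hypothetical helper and `sample` is made-up data.

```python
import pandas as pd

def mean_charges_by_region_and_sex(df: pd.DataFrame) -> pd.DataFrame:
    """Return mean charges with one row per region and one column per sex."""
    return df.pivot_table(index="region", columns="sex",
                          values="charges", aggfunc="mean")

# Hypothetical sample; not taken from insurance.csv.
sample = pd.DataFrame({
    "region": ["northeast", "northeast", "northeast",
               "southwest", "southwest", "southwest"],
    "sex": ["female", "female", "male", "female", "male", "male"],
    "charges": [100.0, 200.0, 300.0, 400.0, 500.0, 700.0],
})

table = mean_charges_by_region_and_sex(sample)
print(table)
```

Applied to `medical`, this should reproduce the same eight means as the cell above without creating a separate DataFrame for each region.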


In [21]:
cost_by_child = medical.groupby('children').charges.mean().reset_index()
#cost_by_child_region = medical.groupby(['children', 'region']).charges.mean().reset_index()
print(cost_by_child)
#print(cost_by_child_region)

   children       charges
0         0  12365.975602
1         1  12731.171832
2         2  15073.563734
3         3  15355.318367
4         4  13850.656311
5         5   8786.035247
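A group mean is easier to interpret when the group size is shown next to it, since a mean over only a few patients can be unrepresentative. The sketch below reports both. `charges_by_children` is a hypothetical helper and `sample` is made-up data, so it says nothing about the actual group sizes in insurance.csv.

```python
import pandas as pd

def charges_by_children(df: pd.DataFrame) -> pd.DataFrame:
    """Return mean charges and number of patients for each children count."""
    return df.groupby("children")["charges"].agg(["mean", "count"])

# Hypothetical sample; not taken from insurance.csv.
sample = pd.DataFrame({
    "children": [0, 0, 1, 5],
    "charges": [100.0, 300.0, 200.0, 50.0],
})

summary = charges_by_children(sample)
print(summary)
```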


In [22]:
ne_by_child = northeast_df.groupby('children').charges.mean().reset_index()
print("The average medical insurance cost for the Northeast by number of children is as follows: ")
print(ne_by_child)
nw_by_child = northwest_df.groupby('children').charges.mean().reset_index()
print("The average medical insurance cost for the Northwest by number of children is as follows: ")
print(nw_by_child)
se_by_child = southeast_df.groupby('children').charges.mean().reset_index()
print("The average medical insurance cost for the Southeast by number of children is as follows: ")
print(se_by_child)
sw_by_child = southwest_df.groupby('children').charges.mean().reset_index()
print("The average medical insurance cost for the Southwest by number of children is as follows: ")
print(sw_by_child)

The average medical insurance cost for the Northeast by number of children is as follows: 
   children       charges
0         0  11626.462658
1         1  16310.206403
2         2  13615.152722
3         3  14409.913296
4         4  14485.193120
5         5   6978.973483
The average medical insurance cost for the Northwest by number of children is as follows: 
   children       charges
0         0  11324.370919
1         1  10230.256309
2         2  13464.314687
3         3  17786.160672
4         4  11347.018725
5         5   8965.795750
The average medical insurance cost for the Southeast by number of children is as follows: 
   children       charges
0         0  14309.868378
1         1  13687.041971
2         2  15728.470623
3         3  18449.846015
4         4  14451.023972
5         5  10115.441542
The average medical insurance cost for the Southwest by number of children is as follows: 
   children       charges
0         0  11938.504986
1         1  10406.484953
2       