# Health Stats Part 1: Waist 2 Hip Ratios

<!--- Write an explanation of the Waist To Hips Ratio statistic used by health professionals. Please include an explanation of what it is used for, exactly how it is calculated, and how to interpret the results. Note: Formmatting matters. Make this as professional as you can using Markdown.  --->

<!--- feel free to use any web resources, including [Wikipedia](https://en.wikipedia.org/wiki/Waist%E2%80%93hip_ratio) or any other resources that you can find online. Just MAKE SURE you provide a link to every resource you decide to use. --->

<!--- Including the formula, or that fancy diagram/table you see on wikipedia is DEFINITELY a good idea! How? The LaTeX equations section in [This link](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html) might help. --->

<!--- For extra points, try to create a table similar to the one on the wikipedia page on your own. --->

### Overview

The *Waist to Hip Ratio* (WHR) is one of several measures doctors use to determine whether a patient is overweight and whether that weight may be putting their health at risk. The ratio compares your waist circumference to your hip circumference, which indicates how much fat is stored around your waist relative to your hips.

Your *Waist to Hip Ratio* can be calculated by dividing your waist circumference by your hip circumference as follows: 

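$$ \text{ratio}_{w2h} = \frac{w}{h} $$

where $w$ is the waist circumference and $h$ is the hip circumference, both measured in the same unit.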
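
As a quick worked example, the formula can be written as a small Python function. This is only a sketch: the function name `waist_to_hip_ratio` is an illustrative assumption, and the sample measurements (a 30-inch waist and 32-inch hips) are chosen for the arithmetic.

```python
def waist_to_hip_ratio(waist, hip):
    """Return the waist circumference divided by the hip circumference."""
    # A hip measurement of zero or less cannot be valid and would break the division
    if hip <= 0:
        raise ValueError("hip circumference must be positive")
    return waist / hip


# A 30-inch waist and 32-inch hips give a ratio of 0.9375
print(waist_to_hip_ratio(30, 32))
```

The units cancel out, so the same function works for inches or centimeters as long as both measurements use the same unit.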

After calculating your Waist to Hip Ratio using the formula, compare it to the chart below:

### Waist to Hip Ratio Chart

| Health Risk   | Women          | Men            |
|:--------------|:---------------|:---------------|
| **Low**       | 0.80 or lower  | 0.95 or lower  |
| **Moderate**  | 0.81 - 0.85    | 0.96 - 0.99    |
| **High**      | 0.86 or higher | 1.0 or higher  |
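
The chart lookup can also be sketched in Python. The function name `health_risk` and the decision to round the ratio to two decimals before comparing are assumptions made for this illustration; the chart itself only lists two-decimal ranges and does not say how to treat values that fall between them.

```python
def health_risk(ratio, gender):
    """Look up the chart's risk category for a ratio and a gender ('F' or 'M')."""
    # (largest Low value, largest Moderate value), taken from the chart
    limits = {"F": (0.80, 0.85), "M": (0.95, 0.99)}
    if gender not in limits:
        raise ValueError("gender must be 'F' or 'M'")

    # Assumption: round to two decimals first, because the chart's ranges
    # are stated to two decimals (0.80 / 0.81, 0.85 / 0.86, ...)
    ratio = round(ratio, 2)

    low_max, moderate_max = limits[gender]
    if ratio <= low_max:
        return "Low"
    if ratio <= moderate_max:
        return "Moderate"
    return "High"


print(health_risk(0.83, "F"))  # falls in the 0.81 - 0.85 range for women
```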

According to the World Health Organization, a healthy WHR is:
  + 0.9 or less for men 
  + 0.85 or less for women
  
In both men and women, a WHR of 1.0 or higher increases the risk of heart disease and other conditions that are linked to being overweight.

<a href = "https://www.healthline.com/health/waist-to-hip-ratio">Click here to see the source for the information provided above</a>
    
<img src = 'https://upload.wikimedia.org/wikipedia/commons/3/34/Abdominal_obesity_in_men.jpg' />



## Source Data 

<!--- Replace the text below with a Markdown bullet list that defines the columns of the CSV file. Be sure to indicate the data type for each column. --->

<!--- Example can be: ID, unique identifier of each person, integer. Remember you need to put this into a bullet list! How? [This link](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html) might help. --->

<!--- These two markdown cells are required in almost any analytical report. --->

+ **ID**:
  + unique identifier for each person
  + integer
+ **Waist**:
  + measured circumference of the waist in inches
  + integer
+ **Hip**:
  + measured circumference of the hip in inches
  + integer
+ **Gender**:
  + one-letter indicator for male (M) or female (F)
  + string
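
One way to make these column definitions concrete is to describe a row as a typed record. The sketch below is illustrative only: the names `Measurement` and `parse_line` are assumptions, and the sample line is written to match the column order listed above.

```python
from typing import NamedTuple


class Measurement(NamedTuple):
    """One data row, using the column types from the list above."""
    id: int
    waist: int   # inches
    hip: int     # inches
    gender: str  # 'M' or 'F'


def parse_line(line):
    """Convert one comma-separated data line into a Measurement."""
    id_, waist, hip, gender = line.strip().split(",")
    return Measurement(int(id_), int(waist), int(hip), gender)


print(parse_line("1,30,32,M"))
```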




## Data Import

Any type of analysis starts with reading in the data.

The code below shows the basic way to read a file in Python.

For more information on this topic, read Chapter 7 of your PY4E textbook.

In [15]:
```python
# Goal: Extract the data from the file

# Open w2h_data.csv for reading; the `with` block closes the file automatically
with open("w2h_data.csv", "r") as f:
    # Load the file into a list of strings, one string per line
    raw_lines = list(f)
```
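
As an alternative sketch, Python's standard `csv` module can split the lines for you. To keep the example runnable on its own, an in-memory `io.StringIO` object stands in for the real file; the two sample rows and the name `sample_rows` are assumptions for illustration.

```python
import csv
import io

# Stand-in for the real file so the sketch runs on its own; with the actual
# data you would pass open("w2h_data.csv", newline="") to csv.reader instead
sample = io.StringIO("ID,Waist,Hip,Gender\n1,30,32,M\n2,32,37,M\n")

# csv.reader splits each line on commas and drops the line endings
sample_rows = list(csv.reader(sample))
print(sample_rows)
```

Every value is still a string at this point, so the type conversion step is needed either way.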

Data are not useful when they have the wrong data type, wrong values, or missing values.

Cleaning up your data is an important step in any analysis.

In [19]:
```python
# Goal: Scrub and convert the data, loading it into a new list called rows

# Strip the trailing newline '\n' from each line and split it on commas,
# giving one list of strings per line (this is a list comprehension)
raw_rows = [r.rstrip('\n').split(',') for r in raw_lines]

# Create a new list `rows`, starting with just the column names
rows = [raw_rows[0]]

# Convert each `raw_row`, starting with the second (the first is the header)
for raw_row in raw_rows[1:]:
    # The values in `raw_row` are all strings: convert ID, Waist, and Hip
    # to integers and keep Gender as a string
    row = [int(raw_row[0]), int(raw_row[1]), int(raw_row[2]), raw_row[3]]
    # Append the new `row` to the `rows` list
    rows.append(row)

# From here on, use the `rows` list instead of `raw_rows` or `raw_lines`
# print(rows)  # uncomment to check that the conversion worked
```
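
The conversion above assumes every line is complete and well formed. A more defensive version, sketched below, rejects rows with a wrong type, a wrong value, or a missing value. The function name `clean_row` and the specific checks (exactly four fields, positive measurements, gender of `M` or `F`) are assumptions for illustration, not requirements stated for this data set.

```python
def clean_row(raw_row):
    """Return [id, waist, hip, gender] with proper types, or None if the row is unusable."""
    # A usable row has exactly four fields
    if len(raw_row) != 4:
        return None

    try:
        person_id, waist, hip = (int(value) for value in raw_row[:3])
    except ValueError:
        # A missing or non-numeric value cannot be converted to an integer
        return None

    gender = raw_row[3].strip().upper()
    # Wrong values: measurements must be positive and gender must be M or F
    if waist <= 0 or hip <= 0 or gender not in ("M", "F"):
        return None

    return [person_id, waist, hip, gender]


print(clean_row(["1", "30", "32", "M"]))
print(clean_row(["2", "", "37", "M"]))
```

Rows that come back as `None` can then be skipped or reported instead of stopping the whole import with an error.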

## Calculations

Sometimes the data given to you do not directly contain the values you need, so you have to calculate them.

In this part, you calculate two new features, `W2H Ratio` and `Shape`.

In [20]:

```python
# Add the two new column names to the header row
rows[0].extend(["W2H Ratio", "Shape"])

# For each row in the rows list, calculate the waist to hips ratio and shape
for row in rows[1:]:
    # Calculate the w2h_ratio; in Python 3, `/` on two integers returns a float
    w2h_ratio = row[1] / row[2]

    # Based on the ratio and the gender, set the variable shape to either 'Apple' or 'Pear'
    if row[3] == 'F' and w2h_ratio <= 0.80:
        shape = 'Pear'
    elif row[3] == 'F':
        shape = 'Apple'
    elif row[3] == 'M' and w2h_ratio <= 0.90:
        shape = 'Pear'
    else:
        shape = 'Apple'

    # Add the new data to the end of the row
    row += [w2h_ratio, shape]  # note: += is shorthand for the extend method used above
```
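
The shape rule can also be pulled out into a small function so that it is easy to test on its own. This is a sketch under the same cut-offs as the loop above (0.80 for `F`, 0.90 for `M`); the names `PEAR_MAX` and `body_shape` are assumptions for illustration.

```python
# Largest ratio still classed as 'Pear' for each gender
PEAR_MAX = {"F": 0.80, "M": 0.90}


def body_shape(w2h_ratio, gender):
    """Return 'Pear' if the ratio is at or below the gender's cut-off, else 'Apple'."""
    cutoff = PEAR_MAX.get(gender)
    # As in the if/elif chain, an unrecognized gender falls through to 'Apple'
    if cutoff is not None and w2h_ratio <= cutoff:
        return "Pear"
    return "Apple"


print(body_shape(30 / 32, "M"))  # 0.9375 is above the 0.90 cut-off for men
print(body_shape(32 / 37, "M"))  # about 0.865 is below it
```

Keeping the cut-offs in one dictionary means a threshold only has to be changed in one place.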

    

## Output

In an analysis report, it is always helpful to display your data.

The code below is a very rudimentary way of displaying your data, including both the original features and the new features you just calculated.

In [21]:
```python
# Goal: pretty print the rows as an HTML table
from IPython.display import HTML, display

# Note: this works, but we can do this much better with pandas
# Build the header row from the column names
html_table = '<table><tr><th>'
html_table += "</th><th>".join(rows[0])
html_table += '</th></tr>'

# Add one table row per data row, converting every value to a string
for row in rows[1:]:
    html_table += "<tr><td>"
    html_table += "</td><td>".join(str(col) for col in row)
    html_table += "</td></tr>"
html_table += "</table>"

# Render the finished HTML in the notebook
display(HTML(html_table))
```

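| ID | Waist | Hip | Gender | W2H Ratio          | Shape |
|:---|:------|:----|:-------|:-------------------|:------|
| 1  | 30    | 32  | M      | 0.9375             | Apple |
| 2  | 32    | 37  | M      | 0.8648648648648649 | Pear  |
| 3  | 30    | 36  | M      | 0.8333333333333334 | Pear  |
| 4  | 33    | 39  | M      | 0.8461538461538461 | Pear  |
| 5  | 29    | 33  | M      | 0.8787878787878788 | Pear  |
| 6  | 32    | 38  | M      | 0.8421052631578947 | Pear  |
| 7  | 33    | 42  | M      | 0.7857142857142857 | Pear  |
| 8  | 30    | 40  | M      | 0.75               | Pear  |
| 9  | 30    | 37  | M      | 0.8108108108108109 | Pear  |
| 10 | 32    | 39  | M      | 0.8205128205128205 | Pear  |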
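
The long decimals in the ratio column are hard to read. One possible refinement, sketched below with only the standard library, wraps the table building in a function that rounds floats for display and escapes each value. The function name `rows_to_html`, the `digits` parameter, and the two-row `demo` list are assumptions for illustration.

```python
from html import escape


def rows_to_html(rows, digits=2):
    """Build an HTML table from a header row followed by data rows."""
    def cell(value):
        # Round floats for display; escape everything so a stray '<' or '&' is safe
        text = f"{value:.{digits}f}" if isinstance(value, float) else str(value)
        return escape(text)

    header = "".join(f"<th>{cell(col)}</th>" for col in rows[0])
    body = "".join(
        "<tr>" + "".join(f"<td>{cell(col)}</td>" for col in row) + "</tr>"
        for row in rows[1:]
    )
    return f"<table><tr>{header}</tr>{body}</table>"


demo = [["ID", "W2H Ratio"], [1, 0.9375]]
print(rows_to_html(demo))
```

In a notebook, the returned string can be passed to `display(HTML(...))` in the same way as `html_table`.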
