# Health Stats Part 1: Waist 2 Hip Ratios

W2H Ratio
- Health professionals use the waist-to-hip (W2H) ratio as an indicator of health.
- The formula for the waist-to-hip ratio is $ ratio_{w2h} = \frac{w}{h} $, where $w$ is the waist measurement and $h$ is the hip measurement, both in the same unit.
- A person's body shape is determined by their gender and their waist-to-hip ratio.
- Based on these two features, a person is classified as having either an apple or a pear body shape.
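
The formula above can be sketched as a small Python function. The name `waist_to_hip_ratio` and the check for a non-positive hip value are illustrative assumptions, not part of the original notebook.

```python
def waist_to_hip_ratio(waist, hip):
    """Return waist / hip; both measurements must use the same unit."""
    if hip <= 0:
        # Assumed guard: a hip measurement of zero or less is not meaningful
        raise ValueError("hip measurement must be positive")
    return waist / hip


# Example: a waist of 30 and a hip of 40 (same unit)
print(waist_to_hip_ratio(30, 40))  # 0.75
```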

<img src = 'https://upload.wikimedia.org/wikipedia/commons/d/dd/Waist-hip_ratio.svg'>
<img src = 'https://cdn.psychologytoday.com/sites/default/files/styles/article-inline-half/public/field_blog_entry_images/2017-06/screen_shot_2017-06-18_at_8.00.28_am.png?itok=vwFW1irE'>

| Characteristics | DGSP        | WHO    | NIDDK  |
| --------------- | ----------- | ------ | ------ |
| Gender          | Women       | Women  | Women  |
| Underweight     | ?           | ?      | ?      |
| Normal weight   | < 0.80      | ?      | ?      |
| Overweight      | 0.80 - 0.84 | ?      | ?      |
| Obesity         | > 0.85      | > 0.85 | > 0.80 |

| Characteristics | DGSP        | WHO    | NIDDK  |
| --------------- | ----------- | ------ | ------ |
| Gender          | Men         | Men    | Men    |
| Underweight     | ?           | ?      | ?      |
| Normal weight   | < 0.90      | ?      | ?      |
| Overweight      | 0.90 - 0.99 | ?      | ?      |
| Obesity         | > 1.00      | > 0.90 | > 1.00 |

Sources: https://www.healthline.com/health/waist-to-hip-ratio and https://en.wikipedia.org/wiki/Waist%E2%80%93hip_ratio
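
As a rough illustration of how the DGSP column of the tables could be applied in code, here is a minimal sketch. The function name `dgsp_category`, the `'F'`/`'M'` gender codes, and the handling of values that fall exactly on a cut-off are assumptions; the tables do not say which side a boundary value belongs to.

```python
def dgsp_category(ratio, gender):
    """Map a W2H ratio to a DGSP category; gender is 'F' or 'M' (assumed codes)."""
    # (upper limit of normal, upper limit of overweight) from the DGSP column
    cutoffs = {"F": (0.80, 0.85), "M": (0.90, 1.00)}
    normal_upper, overweight_upper = cutoffs[gender]
    if ratio < normal_upper:
        return "normal"
    if ratio < overweight_upper:
        # Assumption: ratios between the two limits count as overweight
        return "overweight"
    return "obese"


print(dgsp_category(0.75, "F"))    # normal
print(dgsp_category(0.9375, "M"))  # overweight
```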

## Source Data 



W2H Data
- Column one is the unique identifier of the person, stored as an integer.
- Column two is the person's waist measurement, stored as an integer.
- Column three is the person's hip measurement, stored as an integer.
- Column four is the person's gender, stored as a string (`M` or `F`).
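
To make the column layout concrete, the sketch below parses two sample records with the standard-library `csv` module and converts each column to the type listed above. The in-memory `sample` string stands in for the real file, and the variable names are illustrative.

```python
import csv
import io

# Two sample records in the layout described above (ID, Waist, Hip, Gender)
sample = "ID,Waist,Hip,Gender\n1,30,32,M\n11,24,35,F\n"

reader = csv.DictReader(io.StringIO(sample))
records = [
    {
        "ID": int(r["ID"]),        # column one: integer identifier
        "Waist": int(r["Waist"]),  # column two: integer waist measurement
        "Hip": int(r["Hip"]),      # column three: integer hip measurement
        "Gender": r["Gender"],     # column four: string
    }
    for r in reader
]

print(records[0])  # {'ID': 1, 'Waist': 30, 'Hip': 32, 'Gender': 'M'}
```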


## Data Import

For any type of analysis, we first need to read in the data.

This is the basic way to read in data with Python.

For more information on this part, read Chapter 7 of your PY4E textbook.

In [1]:
# Goal: Extract the data from the file

# opens the w2h_data.csv for reading
f = open("w2h_data.csv", "r")

# loads the file into a list of strings, one string per line
raw_lines = list(f)

# closes the file
f.close()
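
A common alternative to calling `close()` yourself is a `with` block, which closes the file automatically. The sketch below writes a tiny stand-in file first so that it runs on its own; the file name `w2h_sample.csv` and the variable `sample_lines` are made up for illustration.

```python
import os
import tempfile

# Write a tiny stand-in file so the sketch is self-contained;
# in the notebook you would open "w2h_data.csv" directly.
path = os.path.join(tempfile.mkdtemp(), "w2h_sample.csv")
with open(path, "w") as f:
    f.write("ID,Waist,Hip,Gender\n1,30,32,M\n")

# The `with` block closes the file when it ends, even if an error occurs
with open(path, "r") as f:
    sample_lines = list(f)

print(sample_lines)  # ['ID,Waist,Hip,Gender\n', '1,30,32,M\n']
print(f.closed)      # True
```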

Data are not useful when they have the wrong data type, wrong values, or missing values.

Cleaning up your data is an important step in any analysis.

In [1]:
# Goal: Extract the data from the file

# opens the w2h_data.csv for reading
f = open("w2h_data.csv", "r")

# loads the file into a list of strings, one string per line
raw_lines = list(f)

# closes the file
f.close()

# Strips the newline '\n' characters and splits each line into a list of values
raw_rows = [r.rstrip('\n').split(',') for r in raw_lines]

# Creates a new list `rows`, starting with just the column names
rows = list()
rows.append(raw_rows[0])

# Convert each `raw_row`, starting with the second
for raw_row in raw_rows[1:]:
    # Note: the values in the `raw_row` list are all strings.
    # Create a new list called `row` that converts each item in `raw_row` to the right data type
    row = [int(raw_row[0]), int(raw_row[1]), int(raw_row[2]), raw_row[3]]
    rows.append(row)

print(rows)

[['ID', 'Waist', 'Hip', 'Gender'], [1, 30, 32, 'M'], [2, 32, 37, 'M'], [3, 30, 36, 'M'], [4, 33, 39, 'M'], [5, 29, 33, 'M'], [6, 32, 38, 'M'], [7, 33, 42, 'M'], [8, 30, 40, 'M'], [9, 30, 37, 'M'], [10, 32, 39, 'M'], [11, 24, 35, 'F'], [12, 25, 37, 'F'], [13, 24, 37, 'F'], [14, 22, 34, 'F'], [15, 26, 38, 'F'], [16, 26, 37, 'F'], [17, 25, 38, 'F'], [18, 26, 37, 'F'], [19, 28, 40, 'F'], [20, 23, 35, 'F']]
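
The conversion above assumes that every line is well formed. If the file might contain missing or wrong values, one possible approach is to validate each row before keeping it. The helper `clean_row` and its rules (four columns, positive integer measurements, gender `M` or `F`) are assumptions for illustration, not part of the original notebook.

```python
def clean_row(raw_row):
    """Return a typed row [id, waist, hip, gender], or None if the row is unusable."""
    if len(raw_row) != 4:
        return None  # wrong number of columns
    try:
        person_id, waist, hip = (int(value) for value in raw_row[:3])
    except ValueError:
        return None  # empty or non-numeric value
    gender = raw_row[3].strip().upper()
    if gender not in {"M", "F"} or waist <= 0 or hip <= 0:
        return None  # unknown gender code or impossible measurement
    return [person_id, waist, hip, gender]


print(clean_row(["1", "30", "32", "M"]))  # [1, 30, 32, 'M']
print(clean_row(["2", "", "37", "M"]))    # None
```

Rows for which the helper returns `None` could then be skipped or reported, depending on the analysis.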


## Calculations

Sometimes the data you are given do not directly contain the values you need, so you have to calculate them.

In this part, you calculate two new features, namely `W2H Ratio` and `Shape`.

In [2]:
# Goal: For each row of data calculate and store the w2h_ratio and shape.
# Source: https://www.healthline.com/health/waist-to-hip-ratio

# Adds columns for the two new features
rows[0].extend(["W2H Ratio", "Shape"])

# For each row in the rows list, calculate the waist to hips ratio and shape
for row in rows[1:]:
    # Calculate the w2h_ratio; waist and hip are already integers,
    # and `/` is true division in Python 3, so the result is a float
    w2h_ratio = row[1] / row[2]

    # Based on the ratio and the gender, set the variable shape to either 'Apple' or 'Pear'
    if row[3] == "F" and w2h_ratio <= .85:
        shape = "Pear"
    elif row[3] == "M" and w2h_ratio <= .9:
        shape = "Pear"
    else:
        shape = "Apple"

    # Add the new data to the end of the row
    row += [w2h_ratio, shape]  # note: += is shorthand for the extend method used above

print(rows)

[['ID', 'Waist', 'Hip', 'Gender', 'W2H Ratio', 'Shape'], [1, 30, 32, 'M', 0.9375, 'Apple'], [2, 32, 37, 'M', 0.8648648648648649, 'Pear'], [3, 30, 36, 'M', 0.8333333333333334, 'Pear'], [4, 33, 39, 'M', 0.8461538461538461, 'Pear'], [5, 29, 33, 'M', 0.8787878787878788, 'Pear'], [6, 32, 38, 'M', 0.8421052631578947, 'Pear'], [7, 33, 42, 'M', 0.7857142857142857, 'Pear'], [8, 30, 40, 'M', 0.75, 'Pear'], [9, 30, 37, 'M', 0.8108108108108109, 'Pear'], [10, 32, 39, 'M', 0.8205128205128205, 'Pear'], [11, 24, 35, 'F', 0.6857142857142857, 'Pear'], [12, 25, 37, 'F', 0.6756756756756757, 'Pear'], [13, 24, 37, 'F', 0.6486486486486487, 'Pear'], [14, 22, 34, 'F', 0.6470588235294118, 'Pear'], [15, 26, 38, 'F', 0.6842105263157895, 'Pear'], [16, 26, 37, 'F', 0.7027027027027027, 'Pear'], [17, 25, 38, 'F', 0.6578947368421053, 'Pear'], [18, 26, 37, 'F', 0.7027027027027027, 'Pear'], [19, 28, 40, 'F', 0.7, 'Pear'], [20, 23, 35, 'F', 0.6571428571428571, 'Pear']]
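
Because the cell above changes `rows` in place, running it a second time would append the two new columns again. A possible variation, sketched here with the same cut-offs (0.85 for `F`, 0.9 for `M`), builds a new list instead. The names `body_shape`, `add_features`, and `sample` are illustrative.

```python
def body_shape(ratio, gender):
    """Return 'Pear' or 'Apple' using the same cut-offs as the cell above."""
    if gender == "F" and ratio <= 0.85:
        return "Pear"
    if gender == "M" and ratio <= 0.9:
        return "Pear"
    return "Apple"


def add_features(rows):
    """Return a new table with 'W2H Ratio' and 'Shape' columns; `rows` is left unchanged."""
    header = rows[0] + ["W2H Ratio", "Shape"]
    body = []
    for row in rows[1:]:
        ratio = row[1] / row[2]
        body.append(row + [ratio, body_shape(ratio, row[3])])
    return [header] + body


# Three typed rows copied from the cleaned data
sample = [["ID", "Waist", "Hip", "Gender"], [1, 30, 32, "M"], [8, 30, 40, "M"], [19, 28, 40, "F"]]
result = add_features(sample)
print(result[1])  # [1, 30, 32, 'M', 0.9375, 'Apple']
```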


## Output

In an analysis report, it is always helpful to display your data.

This is a very rudimentary way of displaying your data, including the original features and the new features you just calculated.

In [3]:
# Goal: pretty print the rows as an HTML table

# Note: this works, but we can do this much better with pandas
html_table = '<table><tr><th>'
html_table += "</th><th>".join(rows[0])
html_table += '</th></tr>'
for row in rows[1:]:
    html_table += "<tr><td>"
    html_table += "</td><td>".join(str(col) for col in row)
    html_table += "</td></tr>"
html_table += "</table>"

from IPython.display import HTML, display
display(HTML(html_table))

ID,Waist,Hip,Gender,W2H Ratio,Shape
1,30,32,M,0.9375,Apple
2,32,37,M,0.8648648648648649,Pear
3,30,36,M,0.8333333333333334,Pear
4,33,39,M,0.8461538461538461,Pear
5,29,33,M,0.8787878787878788,Pear
6,32,38,M,0.8421052631578947,Pear
7,33,42,M,0.7857142857142857,Pear
8,30,40,M,0.75,Pear
9,30,37,M,0.8108108108108109,Pear
10,32,39,M,0.8205128205128205,Pear
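
The long decimals in the `W2H Ratio` column are hard to read. As one possible refinement that needs only the standard library, the sketch below rounds floats and escapes cell text before building the table string. The function name `rows_to_html` and the `digits` parameter are assumptions made for this example.

```python
from html import escape


def rows_to_html(rows, digits=2):
    """Build an HTML table string; floats are formatted with `digits` decimals."""
    def cell(value):
        text = f"{value:.{digits}f}" if isinstance(value, float) else str(value)
        return escape(text)  # guards against characters such as '<' in the data

    header = "".join(f"<th>{cell(c)}</th>" for c in rows[0])
    body = "".join(
        "<tr>" + "".join(f"<td>{cell(c)}</td>" for c in row) + "</tr>"
        for row in rows[1:]
    )
    return f"<table><tr>{header}</tr>{body}</table>"


sample = [["ID", "W2H Ratio", "Shape"], [2, 0.8648648648648649, "Pear"]]
print(rows_to_html(sample))
# <table><tr><th>ID</th><th>W2H Ratio</th><th>Shape</th></tr><tr><td>2</td><td>0.86</td><td>Pear</td></tr></table>
```

In a notebook, the returned string could be passed to `display(HTML(...))` in the same way as `html_table` above.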


In [22]:
# source: https://en.wikipedia.org/wiki/Waist%E2%80%93hip_ratio