# Health Stats Part 1: Waist 2 Hip Ratios

# W2H Ratio
- the ratio of the circumference of a person's waist to the circumference of their hips (see the diagram below)
    - waist circumference is measured just above the belly button
    - hip circumference is measured at the widest part of the hips
- the ratio is calculated by dividing the waist measurement by the hip measurement:
    $ratio_{w2h} = \frac{w}{h}$, where $w$ is the waist circumference and $h$ is the hip circumference
- this ratio is used as an indicator of a person's relative health and of their risk of developing a serious health condition in the future
    - a larger waist circumference (apple-shaped) is associated with greater health risk than a larger hip circumference (pear-shaped)
    - for instance, the risk of diabetes increases with a W2H ratio above 0.85 for females and above 1.0 for males, due to fat distribution
    - this ratio is also thought to be correlated with fertility

- according to the WHO, abdominal obesity is defined as a W2H ratio:
    - above 0.90 for males
    - above 0.85 for females
  
source: https://en.wikipedia.org/wiki/Waist–hip_ratio
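
As a quick illustration of the formula and the cut-offs listed above, the following sketch computes the ratio and compares it with the 0.90 (male) and 0.85 (female) thresholds. The names `waist_to_hip_ratio`, `exceeds_threshold` and `OBESITY_THRESHOLDS`, as well as the `'M'`/`'F'` codes, are illustrative choices and not part of any library.

```python
# Thresholds taken from the list above; the dictionary layout is an illustrative assumption.
OBESITY_THRESHOLDS = {'M': 0.90, 'F': 0.85}

def waist_to_hip_ratio(waist, hip):
    """Return waist circumference divided by hip circumference (same units for both)."""
    if hip <= 0:
        raise ValueError("hip circumference must be positive")
    return waist / hip

def exceeds_threshold(waist, hip, gender):
    """Return True when the ratio is above the threshold for the given gender code."""
    return waist_to_hip_ratio(waist, hip) > OBESITY_THRESHOLDS[gender]

print(waist_to_hip_ratio(30, 32))      # 0.9375
print(exceeds_threshold(30, 32, 'M'))  # True
print(exceeds_threshold(24, 35, 'F'))  # False
```

Because the ratio is dimensionless, the result is the same whether both measurements are in inches or in centimetres.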

<img src = 'https://upload.wikimedia.org/wikipedia/commons/d/dd/Waist-hip_ratio.svg' />


The following table shows the W2H ratio cut-offs used by three well-known organizations:

- DGSP
    - the first pair of Women and Men columns
- WHO
    - the second pair of Women and Men columns
- NIDDK
    - the third pair of Women and Men columns
    

| DGSP Women | DGSP Men | WHO Women | WHO Men | NIDDK Women | NIDDK Men | Category |
| ---------: | -------: | --------: | ------: | ----------: | --------: | :------- |
| ? | ? | ? | ? | ? | ? | **under-weight** |
| < 0.80 | < 0.90 | ? | ? | ? | ? | **normal weight** |
| 0.80-0.84 | 0.90-0.99 | ? | ? | ? | ? | **over-weight** |
| > 0.85 | > 1.00 | > 0.85 | > 0.90 | > 0.80 | > 1.00 | **obesity** |
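
The DGSP columns can be read as a small lookup structure. The sketch below is one possible encoding, not an official algorithm: `DGSP_BOUNDS` and `dgsp_category` are hypothetical names, and ratios that fall between two listed ranges (for example exactly 0.85 for women) are assumed to belong to the higher category.

```python
# DGSP columns from the table above, stored as (upper bound, category) pairs.
# Assumption: a ratio that falls in a gap between two listed ranges
# is assigned to the higher category.
DGSP_BOUNDS = {
    'F': [(0.80, 'normal weight'), (0.85, 'over-weight')],
    'M': [(0.90, 'normal weight'), (1.00, 'over-weight')],
}

def dgsp_category(ratio, gender):
    """Return the DGSP category label for a ratio and a gender code ('M' or 'F')."""
    for upper, category in DGSP_BOUNDS[gender]:
        if ratio < upper:
            return category
    return 'obesity'

print(dgsp_category(0.78, 'F'))    # normal weight
print(dgsp_category(0.82, 'F'))    # over-weight
print(dgsp_category(0.9375, 'M'))  # over-weight
print(dgsp_category(1.05, 'M'))    # obesity
```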
   
   
   




<!--- Write an explanation of the Waist To Hips Ratio statistic used by health professionals. Please include an explanation of what it is used for, exactly how it is calculated, and how to interpret the results. Note: Formmatting matters. Make this as professional as you can using Markdown.  --->

<!--- feel free to use any web resources, including [Wikipedia](https://en.wikipedia.org/wiki/Waist%E2%80%93hip_ratio) or any other resources that you can find online. Just MAKE SURE you provide a link to every resource you decide to use. --->

<!--- Including the formula, or that fancy diagram/table you see on wikipedia is DEFINITELY a good idea! How? The LaTeX equations section in [This link](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html) might help. --->

<!--- For extra points, try to create a table similar to the one on the wikipedia page on your own. --->


## Source Data 



## Definitions of Columns in CSV File
- **ID**: unique identifier of each person, integer
- **Waist**: waist circumference, measured just above the belly button, integer
- **Hip**: hip circumference, measured at the widest part of the hips, integer
- **Gender**: gender of each person (`M` or `F`), string



## Data Import

Any type of analysis starts with reading in the data.

The cells below show a basic way to read data in Python.

For more information regarding this part, read Chapter 7 in your PY4E textbook.
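
As an alternative to splitting each line by hand, the standard-library `csv` module can parse the rows. The sketch below is only an illustration: it reads two sample lines from an `io.StringIO` object that stands in for `w2h_data.csv`, and the name `records` is an arbitrary choice.

```python
import csv
import io

# Two sample rows in the same layout as w2h_data.csv; io.StringIO stands in for the real file.
sample = io.StringIO("ID,Waist,Hip,Gender\n1,30,32,M\n11,24,35,F\n")

reader = csv.DictReader(sample)  # uses the first line as the column names
records = [
    {'ID': int(r['ID']), 'Waist': int(r['Waist']), 'Hip': int(r['Hip']), 'Gender': r['Gender']}
    for r in reader
]
print(records)
# [{'ID': 1, 'Waist': 30, 'Hip': 32, 'Gender': 'M'}, {'ID': 11, 'Waist': 24, 'Hip': 35, 'Gender': 'F'}]
```

With the real file, the `io.StringIO` object would be replaced by `open("w2h_data.csv", newline="")` inside a `with` block.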

In [1]:
# Goal: Extract the data from the file

import csv  # standard-library CSV module; the cells below split each line manually instead

# open w2h_data.csv for reading; the with-block closes the file automatically
with open("w2h_data.csv", "r") as f:
    # load the file into a list of strings, one string per line
    raw_lines = list(f)

In [28]:
#import re
#import csv

#with open('w2h_data.csv', 'r') as fp: 
    #for line in fp: 
        #line = line.rstrip()  
       # if re.search('raw_lines', line): 
            #print(line)

In [2]:
#old code that created list
raw_rows = [r.rstrip('\n').split(',') for r in raw_lines]

In [3]:
print(raw_rows)

[['ID', 'Waist', 'Hip', 'Gender'], ['1', '30', '32', 'M'], ['2', '32', '37', 'M'], ['3', '30', '36', 'M'], ['4', '33', '39', 'M'], ['5', '29', '33', 'M'], ['6', '32', '38', 'M'], ['7', '33', '42', 'M'], ['8', '30', '40', 'M'], ['9', '30', '37', 'M'], ['10', '32', '39', 'M'], ['11', '24', '35', 'F'], ['12', '25', '37', 'F'], ['13', '24', '37', 'F'], ['14', '22', '34', 'F'], ['15', '26', '38', 'F'], ['16', '26', '37', 'F'], ['17', '25', '38', 'F'], ['18', '26', '37', 'F'], ['19', '28', '40', 'F'], ['20', '23', '35', 'F']]


In [62]:
#turn above into a dictionary
#w2h_dict = {'ID':[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20],'Waist':[30, 32, 30, 33, 29, 32, 33, 30, 30, 32, 24, 25, 24, 22, 26, 26, 25, 26, 28, 23],'Hip': [32, 37, 36, 39, 33, 38, 42, 40, 37, 39, 35, 37, 37, 34, 38, 37, 38, 37, 40, 35],'Gender': ['M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F']}
#print(w2h_dict)

{'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], 'Waist': [30, 32, 30, 33, 29, 32, 33, 30, 30, 32, 24, 25, 24, 22, 26, 26, 25, 26, 28, 23], 'Hip': [32, 37, 36, 39, 33, 38, 42, 40, 37, 39, 35, 37, 37, 34, 38, 37, 38, 37, 40, 35], 'Gender': ['M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F']}


In [7]:
raw_rows = [r.rstrip('\n').split(',') for r in raw_lines] 
rows = list() 
#rows.append(raw_rows[0]);

for raw_row in raw_rows[1:]:
    row = [int(raw_row[0]),int(raw_row[1]),int(raw_row[2]),raw_row[3]]
    rows.append(row)
    
print(rows)   

[[1, 30, 32, 'M'], [2, 32, 37, 'M'], [3, 30, 36, 'M'], [4, 33, 39, 'M'], [5, 29, 33, 'M'], [6, 32, 38, 'M'], [7, 33, 42, 'M'], [8, 30, 40, 'M'], [9, 30, 37, 'M'], [10, 32, 39, 'M'], [11, 24, 35, 'F'], [12, 25, 37, 'F'], [13, 24, 37, 'F'], [14, 22, 34, 'F'], [15, 26, 38, 'F'], [16, 26, 37, 'F'], [17, 25, 38, 'F'], [18, 26, 37, 'F'], [19, 28, 40, 'F'], [20, 23, 35, 'F']]


In [15]:
# example= lst_of_lsts = [[1, 30, 32, 'M'], [2, 32, 37, 'M'], [3, 30, 36, 'M']]

# create one empty list per column, to be filled from the rows above
id_lst = []
waist_lst = []
hip_lst =[]
gender_lst =[]
for lst in rows:
    #print(lst)
    id_lst.append(lst[0])
    waist_lst.append(lst[1])
    hip_lst.append(lst[2])
    gender_lst.append(lst[3])
print(id_lst)
print(waist_lst)
print(hip_lst)
print(gender_lst)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
[30, 32, 30, 33, 29, 32, 33, 30, 30, 32, 24, 25, 24, 22, 26, 26, 25, 26, 28, 23]
[32, 37, 36, 39, 33, 38, 42, 40, 37, 39, 35, 37, 37, 34, 38, 37, 38, 37, 40, 35]
['M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F']


In [38]:
value_lst = [id_lst, waist_lst, hip_lst, gender_lst]
key_lst = raw_rows[0]
w2h_dict = dict(zip(key_lst, value_lst))

print(w2h_dict)

{'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], 'Waist': [30, 32, 30, 33, 29, 32, 33, 30, 30, 32, 24, 25, 24, 22, 26, 26, 25, 26, 28, 23], 'Hip': [32, 37, 36, 39, 33, 38, 42, 40, 37, 39, 35, 37, 37, 34, 38, 37, 38, 37, 40, 35], 'Gender': ['M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F']}
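
The same row-to-column conversion can be written more compactly with `zip(*rows)`, which regroups the values by position. This is a sketch on three sample rows; `sample_rows`, `column_names` and `sample_dict` are illustrative names only.

```python
# Three sample rows in the same shape as the converted data above.
sample_rows = [[1, 30, 32, 'M'], [2, 32, 37, 'M'], [11, 24, 35, 'F']]
column_names = ['ID', 'Waist', 'Hip', 'Gender']

# zip(*sample_rows) regroups the values by position: all IDs, all waists, and so on.
columns = [list(col) for col in zip(*sample_rows)]
sample_dict = dict(zip(column_names, columns))
print(sample_dict)
# {'ID': [1, 2, 11], 'Waist': [30, 32, 24], 'Hip': [32, 37, 35], 'Gender': ['M', 'M', 'F']}
```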


Data are not useful when they have the wrong data type, wrong values, or missing values.

Cleaning up your data is an important step in any analysis.
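
A minimal sketch of such a clean-up step is shown below. The rules it applies are assumptions made for illustration (exactly four fields, positive whole-number measurements, gender coded as `M` or `F`), and `clean_row` is a hypothetical helper rather than part of the assignment.

```python
def clean_row(raw_row):
    """Convert one raw CSV row to [int, int, int, str], or return None if it is unusable."""
    if len(raw_row) != 4:
        return None
    id_text, waist_text, hip_text, gender = (field.strip() for field in raw_row)
    try:
        row = [int(id_text), int(waist_text), int(hip_text), gender.upper()]
    except ValueError:  # missing or non-numeric value
        return None
    # assumed validity rules: positive measurements and a known gender code
    if row[1] <= 0 or row[2] <= 0 or row[3] not in ('M', 'F'):
        return None
    return row

raw_sample = [['1', '30', '32', 'M'], ['2', '', '37', 'M'], ['3', '30', 'n/a', 'F'], ['4', ' 26', '38 ', 'f']]
cleaned = [row for row in map(clean_row, raw_sample) if row is not None]
print(cleaned)  # [[1, 30, 32, 'M'], [4, 26, 38, 'F']]
```

Dropping unusable rows is only one option; depending on the analysis, it may be preferable to report them or fill them in instead.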

In [17]:
# column names of the original data, kept in a list so that their order is preserved
header = ['ID', 'Waist', 'Hip', 'Gender']
print(header)

['ID', 'Waist', 'Hip', 'Gender']


## Calculations

Sometimes the data given to you do not directly contain the values you need, so you have to calculate them.

In this part, you calculate two new features, namely `W2H Ratio` and `Shape`.

In [73]:
# extend the column names to include "W2H_Ratio" and "Shape"
header = ['ID', 'Waist', 'Hip', 'Gender', 'W2H_Ratio', 'Shape']
print(header)

['ID', 'Waist', 'Hip', 'Gender', 'W2H_Ratio', 'Shape']


In [20]:
# divide each waist measurement by the matching hip measurement
# (list comprehension with zip, found on Stack Overflow; it builds on our lesson on for loops:
# https://stackoverflow.com/questions/43047685/divide-two-list-of-numbers-in-python-using-list-comprehension-and-not-using-zip)
w2h_ratio = [x/y for x,y in zip(waist_lst,hip_lst)]

print(w2h_ratio)

[0.9375, 0.8648648648648649, 0.8333333333333334, 0.8461538461538461, 0.8787878787878788, 0.8421052631578947, 0.7857142857142857, 0.75, 0.8108108108108109, 0.8205128205128205, 0.6857142857142857, 0.6756756756756757, 0.6486486486486487, 0.6470588235294118, 0.6842105263157895, 0.7027027027027027, 0.6578947368421053, 0.7027027027027027, 0.7, 0.6571428571428571]


In [22]:
#update dictionary to include w2h_ratio

w2h_dict['w2h_ratio']= w2h_ratio

print(w2h_dict)

{'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], 'Waist': [30, 32, 30, 33, 29, 32, 33, 30, 30, 32, 24, 25, 24, 22, 26, 26, 25, 26, 28, 23], 'Hip': [32, 37, 36, 39, 33, 38, 42, 40, 37, 39, 35, 37, 37, 34, 38, 37, 38, 37, 40, 35], 'Gender': ['M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'F'], 'w2h_ratio': [0.9375, 0.8648648648648649, 0.8333333333333334, 0.8461538461538461, 0.8787878787878788, 0.8421052631578947, 0.7857142857142857, 0.75, 0.8108108108108109, 0.8205128205128205, 0.6857142857142857, 0.6756756756756757, 0.6486486486486487, 0.6470588235294118, 0.6842105263157895, 0.7027027027027027, 0.6578947368421053, 0.7027027027027027, 0.7, 0.6571428571428571]}


In [41]:
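# Based on the ratio and the gender, set the variable shape to either 'Apple' or 'Pear'

shape_lst = []
w2h_ratio = [x/y for x,y in zip(waist_lst,hip_lst)]

# walk through the ratios and genders together, one person at a time
for ratio, gender in zip(w2h_ratio, gender_lst):
    # cut-offs from the W2H Ratio section (above 0.90 for males, above 0.85 for females),
    # assumed here as the Apple/Pear boundary
    threshold = 0.90 if gender == 'M' else 0.85
    if ratio > threshold:
        shape_lst.append('Apple')
    else:
        shape_lst.append('Pear')

print(shape_lst)

['Apple', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear', 'Pear']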

In [31]:
w2h_dict['shape'] = shape_lst

## Output

In an analysis report, it is always helpful to display your data.

The cell below is a very rudimentary way of displaying the data, including the original features and the new features you just calculated.

In [32]:
# Goal: pretty print the rows as an HTML table

# Note: this works, but we can do this much better with pandas
from IPython.display import HTML, display

# column names come from the dictionary keys; zip(*...) turns the columns back into rows
header = list(w2h_dict.keys())
table_rows = zip(*w2h_dict.values())

html_table = '<table><tr><th>'
html_table += "</th><th>".join(header)
html_table += '</th></tr>'
for row in table_rows:
    html_table += "<tr><td>"
    html_table += "</td><td>".join(str(col) for col in row)
    html_table += "</td></tr>"
html_table += "</table>"

display(HTML(html_table))
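
Outside a notebook, where HTML is not rendered, the same dictionary of columns can be shown as an aligned plain-text table. The function below is a small sketch under the assumption that all columns have the same length; `format_text_table` and the sample values are illustrative only.

```python
def format_text_table(columns):
    """Return a dict of equal-length columns as an aligned plain-text table."""
    names = list(columns)
    rows = [[str(value) for value in row] for row in zip(*columns.values())]
    # each column is as wide as its longest entry, including the column name
    widths = [max([len(name)] + [len(row[i]) for row in rows]) for i, name in enumerate(names)]
    lines = ['  '.join(name.ljust(w) for name, w in zip(names, widths))]
    for row in rows:
        lines.append('  '.join(cell.ljust(w) for cell, w in zip(row, widths)))
    return '\n'.join(lines)

sample = {'ID': [1, 11], 'Gender': ['M', 'F'], 'w2h_ratio': [0.9375, 0.6857]}
print(format_text_table(sample))
```

Left-justifying every cell with `str.ljust` keeps the code short; numeric columns could be right-aligned with `str.rjust` instead.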