# Health Stats Part 3: NumPy Structured Arrays

<!--- Write an explanation of the Waist To Hips Ratio statistic used by health professionals. Please include an explanation of what it is used for, exactly how it is calculated, and how to interpret the results. Note: Formmatting matters. Make this as professional as you can using Markdown.  --->
The Waist To Hips Ratio (WTHR) is the waist measurement divided by the hip measurement. Any unit (inches, centimeters, and so on) can be used as long as both measurements use the same one. Health professionals use the ratio as an indicator of obesity, and the healthy range depends on sex. Different medical authorities, such as the National Institute of Diabetes and Digestive and Kidney Diseases, define their own cut-offs for the obese, overweight, normal-weight, and underweight categories. For men, a value above 1.0 (a waist larger than the hips) indicates obesity; for women, the corresponding value is 0.8.
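
As a minimal sketch of the arithmetic described above, the ratio and the threshold check could be written as follows. The function names and the `THRESHOLDS` table are illustrative assumptions, not part of the notebook; the cut-offs are the ones quoted in the paragraph above (1.0 for men, 0.8 for women).

```python
# Illustrative sketch: these names are assumptions, not the notebook's own code.
THRESHOLDS = {"M": 1.0, "F": 0.8}  # cut-offs quoted in the text above


def waist_to_hip_ratio(waist, hip):
    """Return waist / hip; both measurements must use the same unit."""
    if hip <= 0:
        raise ValueError("hip measurement must be positive")
    return waist / hip


def exceeds_threshold(ratio, sex):
    """True when the ratio is above the cut-off for the given sex ('M' or 'F')."""
    return ratio > THRESHOLDS[sex]


ratio = waist_to_hip_ratio(30, 32)    # 30 in waist, 32 in hip
print(ratio)                          # 0.9375
print(exceeds_threshold(ratio, "M"))  # False with the 1.0 cut-off
```

The calculation cell later in this notebook uses 0.9 rather than 1.0 as the cut-off for men, so the same measurements are flagged there.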

## Source Data 

<!--- Replace the text below with a Markdown bullet list that defines the columns of the w2h_data.csv file. Be sure to indicate the data type for each column. --->
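- ID: Unique identifier of a patient (numeric, loaded as `float64`)
- Waist: Waist measurement in inches (numeric, loaded as `float64`)
- Hip: Hip measurement in inches (numeric, loaded as `float64`)
- Gender: `M` (male) or `F` (female) (single-character string)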
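
The column list above maps naturally onto a NumPy structured dtype, with one `(name, type)` pair per column. The sketch below is only an illustration: `sample_csv` is a hypothetical in-memory stand-in for the real file, and the `encoding` argument is an assumption added so the example behaves the same across NumPy versions.

```python
import io

import numpy as np

# Hypothetical in-memory stand-in for a couple of rows of w2h_data.csv.
sample_csv = io.StringIO(
    "ID,Waist,Hip,Gender\n"
    "1,30,32,M\n"
    "11,24,35,F\n"
)

# One (name, type) pair per column; 'U1' is a one-character Unicode string.
column_types = [("ID", "f8"), ("Waist", "f8"), ("Hip", "f8"), ("Gender", "U1")]

sample = np.genfromtxt(
    sample_csv,
    delimiter=",",
    dtype=column_types,
    names=True,          # read the field names from the header line
    encoding="utf-8",    # assumption: explicit encoding for consistent behavior
)

print(sample.dtype.names)  # ('ID', 'Waist', 'Hip', 'Gender')
print(sample["Gender"])    # ['M' 'F']
```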

## Data Import

In [1]:
# Goal: Extract the data from the file
import numpy as np

# Read w2h_data.csv into a structured array, one record per row.
# names=True takes the field names from the header line of the file.
f = np.genfromtxt(
    'w2h_data.csv',
    delimiter=',',
    dtype=[('ID', '<f8'), ('Waist', '<f8'), ('Hip', '<f8'), ('Gender', 'U1')],
    names=True,
)

# Display the loaded array
f

array([( 1., 30., 32., 'M'), ( 2., 32., 37., 'M'), ( 3., 30., 36., 'M'),
       ( 4., 33., 39., 'M'), ( 5., 29., 33., 'M'), ( 6., 32., 38., 'M'),
       ( 7., 33., 42., 'M'), ( 8., 30., 40., 'M'), ( 9., 30., 37., 'M'),
       (10., 32., 39., 'M'), (11., 24., 35., 'F'), (12., 25., 37., 'F'),
       (13., 24., 37., 'F'), (14., 22., 34., 'F'), (15., 26., 38., 'F'),
       (16., 26., 37., 'F'), (17., 25., 38., 'F'), (18., 26., 37., 'F'),
       (19., 28., 40., 'F'), (20., 23., 35., 'F')],
      dtype=[('ID', '<f8'), ('Waist', '<f8'), ('Hip', '<f8'), ('Gender', '<U1')])
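
Once the data is in a structured array, each column can be pulled out by field name and rows can be filtered with a boolean mask. The sketch below rebuilds three of the rows shown above by hand so that it runs on its own; the variable names are illustrative.

```python
import numpy as np

# Three rows copied from the output above, built by hand so the sketch is self-contained.
people = np.array(
    [(1.0, 30.0, 32.0, "M"), (2.0, 32.0, 37.0, "M"), (11.0, 24.0, 35.0, "F")],
    dtype=[("ID", "f8"), ("Waist", "f8"), ("Hip", "f8"), ("Gender", "U1")],
)

waists = people["Waist"]           # indexing by field name returns a 1-D column
is_male = people["Gender"] == "M"  # element-wise comparison gives a boolean mask

print(waists)                 # [30. 32. 24.]
print(people[is_male]["ID"])  # [1. 2.]
```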

## Calculations

In [18]:
# Goal: For each row of data calculate and store the w2h_ratio and shape.

w2h_ratio = f['Waist'] / f['Hip']

# Flag rows whose ratio is above the threshold used here for their gender
# (0.9 for 'M', 0.8 for 'F'). True is stored as 1 and False as 0.
shape = ((f['Gender'] == 'M') & (w2h_ratio > 0.9)) | ((f['Gender'] == 'F') & (w2h_ratio > 0.8))

# Build a dtype with the original fields plus the two new ones
dt = np.dtype(f.dtype.descr + [('W2H Ratio', float), ('Shape', int)])
results = np.zeros(f.shape, dtype=dt)

# Copy the original columns over, field by field
for c in f.dtype.names:
    results[c] = f[c]

# Add the two new columns
results['W2H Ratio'] = w2h_ratio
results['Shape'] = shape
results

array([( 1., 30., 32., 'M', 0.9375    , 1),
       ( 2., 32., 37., 'M', 0.86486486, 0),
       ( 3., 30., 36., 'M', 0.83333333, 0),
       ( 4., 33., 39., 'M', 0.84615385, 0),
       ( 5., 29., 33., 'M', 0.87878788, 0),
       ( 6., 32., 38., 'M', 0.84210526, 0),
       ( 7., 33., 42., 'M', 0.78571429, 0),
       ( 8., 30., 40., 'M', 0.75      , 0),
       ( 9., 30., 37., 'M', 0.81081081, 0),
       (10., 32., 39., 'M', 0.82051282, 0),
       (11., 24., 35., 'F', 0.68571429, 0),
       (12., 25., 37., 'F', 0.67567568, 0),
       (13., 24., 37., 'F', 0.64864865, 0),
       (14., 22., 34., 'F', 0.64705882, 0),
       (15., 26., 38., 'F', 0.68421053, 0),
       (16., 26., 37., 'F', 0.7027027 , 0),
       (17., 25., 38., 'F', 0.65789474, 0),
       (18., 26., 37., 'F', 0.7027027 , 0),
       (19., 28., 40., 'F', 0.7       , 0),
       (20., 23., 35., 'F', 0.65714286, 0)],
      dtype=[('ID', '<f8'), ('Waist', '<f8'), ('Hip', '<f8'), ('Gender', '<U1'), ('W2H Ratio', '<f8'), ('Shape', '<i4')])
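
The cell above builds the wider array by hand (new dtype, empty array, copy field by field). As an alternative sketch, not the approach this notebook takes, NumPy's `numpy.lib.recfunctions.append_fields` helper can add the two columns in one call. The two sample rows are rebuilt by hand from the data above, and the names `base`, `flag`, and `extended` are illustrative.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Two rows taken from the data above, rebuilt by hand so the sketch is self-contained.
base = np.array(
    [(1.0, 30.0, 32.0, "M"), (11.0, 24.0, 35.0, "F")],
    dtype=[("ID", "f8"), ("Waist", "f8"), ("Hip", "f8"), ("Gender", "U1")],
)

ratio = base["Waist"] / base["Hip"]
# Same thresholds as the calculation cell: 0.9 for 'M', 0.8 for 'F'.
flag = ((base["Gender"] == "M") & (ratio > 0.9)) | ((base["Gender"] == "F") & (ratio > 0.8))

# usemask=False asks for a regular ndarray instead of a masked array.
extended = rfn.append_fields(
    base, ["W2H Ratio", "Shape"], [ratio, flag.astype(int)], usemask=False
)

print(extended.dtype.names)
print(extended["Shape"])  # [1 0]
```

The manual approach keeps every step visible, which suits a teaching notebook; the helper is shorter when the new columns are already computed.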

## Output

In [21]:
# Goal: pretty print the rows as an HTML table
from IPython.display import HTML, display

# Note: this works, but we can do this much better with pandas
# Header row: one <th> cell per field name
html_table = '<table><tr><th>'
html_table += "</th><th>".join(results.dtype.names)
html_table += '</th></tr>'
# Data rows: one <td> cell per value
for row in results:
    html_table += "<tr><td>"
    html_table += "</td><td>".join(str(v) for v in row)
    html_table += "</td></tr>"
html_table += "</table>"

display(HTML(html_table))

# Export to "StatsResults.csv" with the field names as the header line
np.savetxt("StatsResults.csv", results, fmt='%s', delimiter=',',
           header=','.join(results.dtype.names), comments="")

ID,Waist,Hip,Gender,W2H Ratio,Shape
1.0,30.0,32.0,M,0.9375,1
2.0,32.0,37.0,M,0.8648648648648649,0
3.0,30.0,36.0,M,0.8333333333333334,0
4.0,33.0,39.0,M,0.8461538461538461,0
5.0,29.0,33.0,M,0.8787878787878788,0
6.0,32.0,38.0,M,0.8421052631578947,0
7.0,33.0,42.0,M,0.7857142857142857,0
8.0,30.0,40.0,M,0.75,0
9.0,30.0,37.0,M,0.8108108108108109,0
10.0,32.0,39.0,M,0.8205128205128205,0
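
The note in the output cell mentions that pandas can do this more conveniently. A minimal sketch of that idea follows, assuming pandas is installed; the two sample rows are rebuilt by hand from the results above, and an in-memory buffer stands in for the `StatsResults.csv` file so the example has no side effects.

```python
import io

import numpy as np
import pandas as pd

# Two result rows rebuilt by hand; in the notebook this would be the `results` array.
rows = np.array(
    [(1.0, 30.0, 32.0, "M", 0.9375, 1), (8.0, 30.0, 40.0, "M", 0.75, 0)],
    dtype=[("ID", "f8"), ("Waist", "f8"), ("Hip", "f8"),
           ("Gender", "U1"), ("W2H Ratio", "f8"), ("Shape", "i8")],
)

df = pd.DataFrame(rows)               # field names become column labels

html_table = df.to_html(index=False)  # string containing a <table> element

buffer = io.StringIO()                # stands in for a file such as "StatsResults.csv"
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

In a notebook, leaving `df` as the last expression of a cell renders it as a table directly, so the explicit HTML string is only needed when the markup itself is wanted.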
