# GUESS MY WEIGHT 
A program to predict my weight from my health data

![guess_your_weight.gif](images/guess_your_weight.gif)

## Overview
Health and wellness is big business, and weight loss in particular. We’re all trying because it’s very, very hard. I recently went on my own weight loss journey, losing about 50 lbs in roughly 18 months. Weighing myself every morning, I agonized over every tenth of a pound, recording it in an app on my phone. I realized that losing big chunks of weight starts with small, incremental progress on the scale.

But I didn’t stop there. As a data nerd I thought, “let’s record every meal.” So I did that too. I wondered: given all this data, could I predict my weight? My watch and phone capture my exercise, sleep, eating, and so much more. There must be trends here. At a minimum, I should be able to predict whether my weight will go up or down from the previous day. So let’s do it.
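As a rough sketch of that up-or-down target, here is one way the daily label could be derived from a list of weigh-ins. It assumes the weights are already in date order with one reading per day; the function name `label_direction` and the sample numbers are made up for illustration and are not taken from my data.

```python
def label_direction(weights):
    """Label each day after the first as 'up', 'down', or 'same' versus the previous day."""
    labels = []
    for previous, current in zip(weights, weights[1:]):
        if current > previous:
            labels.append("up")
        elif current < previous:
            labels.append("down")
        else:
            labels.append("same")
    return labels


# Made-up morning weigh-ins in lb, for illustration only.
sample_weights = [201.4, 200.9, 201.2, 201.2, 200.6]
directions = label_direction(sample_weights)
print(directions)
```

The first day has no previous reading, so the result is one element shorter than the input.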

## Data Understanding
I have much (and probably too much) of this data on my iPhone and Apple Watch. It contains weight information, workouts, heart rate, and meals broken down into subcategories (proteins, fats, etc.). Most important is the weight: that will be the feature I primarily use for classification.

Because it’s my data, there’s more clarity about data entry methods. This is more subjective than a controlled experiment with many participants. I know what data I was diligent about collecting, so I should be able to scrub it appropriately. For instance, I didn’t record my fluids (water, tea, coffee) consistently. Water consumption is a big part of this, so I’ll have to be clear about the gaps in the data.

### Weigh-In Protocol
The routine for entering the weigh-in was pretty basic. I recorded my weight in a third-party app, using the same bathroom scale, in the morning after urinating but before drinking any fluids. A morning weigh-in works well because it's a simple routine. More importantly, you likely weigh the least then because you're dehydrated after a night of sleep.

### Apple Health Data
Besides the weigh-in and meal logging, all of the other data is generated by Apple's proprietary software. I cannot speak to its accuracy.

### Meal Logging
All of the meal logging was done to the best of my ability using judgements about serving sizes, volumes, weights, etc. A kitchen scale was incorporated after January, so the measurements should have improved in accuracy after that time. There are certain weeks where there is no data, especially around holidays and weekends. You'll have to do your best there.
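To get a feel for where those holes are, a small helper like the one below could list the days with no meal entry at all. This is only a sketch: `missing_days` is a hypothetical name, and the sample dates are invented rather than pulled from my log.

```python
from datetime import date, timedelta


def missing_days(logged_dates):
    """Return the calendar days between the first and last log that have no entry."""
    if not logged_dates:
        return []
    logged = set(logged_dates)
    first, last = min(logged), max(logged)
    span = (last - first).days
    all_days = (first + timedelta(days=offset) for offset in range(span + 1))
    return [day for day in all_days if day not in logged]


# Made-up meal-log dates, for illustration only.
sample_log = [date(2022, 7, 10), date(2022, 7, 12), date(2022, 7, 13), date(2022, 7, 15)]
gaps = missing_days(sample_log)
print(gaps)
```

It only looks between the first and last logged day, so gaps before the first entry or after the last one are not reported.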

### Data scrubbing and transfer to Kaggle

To execute this project, personal data was taken from the iPhone, scrubbed, and uploaded to Kaggle for storage. The file is approximately 40 MB, so a public area where this is easil

#### Data Import 
So... let's get started. To begin this project, I AirDropped my health data from my iPhone to my personal laptop.

    import pandas as pd
    import xml.etree.ElementTree as ET

    # Extract data from the XML file
    tree = ET.parse("data_raw/export.xml")
    root = tree.getroot()

    # Collect the attributes of every Record element into a DataFrame
    health_records = [x.attrib for x in root.iter('Record')]
    record_data = pd.DataFrame(health_records)

    record_data.head()

| | type | sourceName | sourceVersion | unit | creationDate | startDate | endDate | value | device |
|---|---|---|---|---|---|---|---|---|---|
| 0 | HKQuantityTypeIdentifierDietaryWater | MyPlate | 4 | mL | 2022-06-01 13:20:27 -0500 | 2022-05-31 23:00:00 -0500 | 2022-05-31 23:00:00 -0500 | 354.84 | |
| 1 | HKQuantityTypeIdentifierDietaryWater | MyPlate | 4 | mL | 2022-07-11 09:43:30 -0500 | 2022-07-10 23:00:00 -0500 | 2022-07-10 23:00:00 -0500 | 1064.52 | |
| 2 | HKQuantityTypeIdentifierDietaryWater | MyPlate | 4 | mL | 2022-07-13 20:57:54 -0500 | 2022-07-12 23:00:00 -0500 | 2022-07-12 23:00:00 -0500 | 2129.04 | |
| 3 | HKQuantityTypeIdentifierDietaryWater | MyPlate | 4 | mL | 2022-07-14 12:42:54 -0500 | 2022-07-13 23:00:00 -0500 | 2022-07-13 23:00:00 -0500 | 946.24 | |
| 4 | HKQuantityTypeIdentifierDietaryWater | MyPlate | 4 | mL | 2022-07-16 18:11:29 -0500 | 2022-07-15 23:00:00 -0500 | 2022-07-15 23:00:00 -0500 | 2129.04 | |
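Everything read out of the XML attributes arrives as a string, so the dates and values need converting before any analysis. The sketch below shows one way to do that and then pull out the weigh-ins. It runs on a tiny stand-in frame: the two water rows are copied from the preview above, while the body-mass row (its date, value, and `lb` unit) is made up for illustration. `HKQuantityTypeIdentifierBodyMass` is the HealthKit type name for weight; whether the real export needs extra cleaning beyond this is an open question.

```python
import pandas as pd

# Two rows copied from the preview above, plus one made-up body-mass row.
sample = pd.DataFrame(
    {
        "type": [
            "HKQuantityTypeIdentifierDietaryWater",
            "HKQuantityTypeIdentifierDietaryWater",
            "HKQuantityTypeIdentifierBodyMass",
        ],
        "unit": ["mL", "mL", "lb"],
        "startDate": [
            "2022-05-31 23:00:00 -0500",
            "2022-07-10 23:00:00 -0500",
            "2022-07-11 06:30:00 -0500",
        ],
        "value": ["354.84", "1064.52", "200.9"],
    }
)

# Parse the timestamps; utc=True normalizes every offset to UTC so rows
# recorded under different offsets still share one datetime dtype.
sample["startDate"] = pd.to_datetime(
    sample["startDate"], format="%Y-%m-%d %H:%M:%S %z", utc=True
)

# Convert values to numbers; anything non-numeric becomes NaN.
sample["value"] = pd.to_numeric(sample["value"], errors="coerce")

# Keep only the weigh-in records.
weights = sample[sample["type"] == "HKQuantityTypeIdentifierBodyMass"]
print(weights[["startDate", "value", "unit"]])
```

The same three steps should carry over to `record_data`, with `creationDate` and `endDate` parsed the same way as `startDate`.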
