In [None]:
# fitness_calories_estimator_milestone.ipynb

# Fitness Calorie Estimator Web App - Project Milestone

## Overview
This project aims to develop a web application that estimates calories burned during physical activity from user inputs such as age, gender, weight, heart rate, duration, and type of activity. The output is a calorie-burn estimate produced by a regression model.

The goal of this milestone is to demonstrate dataset exploration, preprocessing, and initial model experiments. The web app integration and deployment will be implemented in later milestones.

## Dataset Description
The dataset includes the following features:
- 'Age': User age in years
- 'Gender': Male/Female
- 'Height': User height in feet
- 'Weight': User weight in lbs
- 'Duration': Duration of activity in minutes
- 'Heart_Rate': Average heart rate during activity
- 'Body_Temp': Body temperature in Fahrenheit
- 'Activity': Type of physical activity (e.g., Walking, Running, Cycling)
- 'Calories_Burned': Target variable for regression

We will use Kaggle's Calories Burned Dataset (or a similar public dataset).

In [None]:
import pandas as pd

# Load dataset (replace with your dataset path)
df = pd.read_csv('calories_burned_dataset.csv')

# Display first few rows
df.head()

In [None]:
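# Basic dataset information (info() prints its report directly)
df.info()

# Summary statistics (printed explicitly, because a notebook cell
# only auto-displays its last expression)
print(df.describe())

# Check for missing values in each column
print(df.isnull().sum())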

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

# Scatter plot: Duration vs Calories Burned
plt.figure(figsize=(8, 5))
sns.scatterplot(data=df, x='Duration', y='Calories_Burned', hue='Activity')
plt.title('Calories Burned vs Activity Duration')
plt.show()

# Histogram of Calories Burned
plt.figure(figsize=(8, 5))
sns.histplot(df['Calories_Burned'], bins=30, kde=True)
plt.title('Distribution of Calories Burned')
plt.show()

## Data Preprocessing
- Handle missing values
- Encode categorical variables (Gender, Activity)
- Scale numerical features if necessary
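
As a rough sketch of how these three steps could be combined, the example below builds a scikit-learn `ColumnTransformer` on a tiny made-up sample. The sample values, the chosen subset of columns, and the imputation strategies are all assumptions for illustration, not decisions taken in this project. It uses one-hot encoding, which is one possible option for unordered categories such as activity type.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny made-up sample using column names from the dataset description.
sample = pd.DataFrame({
    'Age': [25, 40, np.nan, 31],
    'Gender': ['Male', 'Female', 'Female', 'Male'],
    'Weight': [165.0, 140.0, 150.0, np.nan],
    'Duration': [30, 45, 20, 60],
    'Activity': ['Running', 'Walking', 'Cycling', 'Running'],
})

numeric_cols = ['Age', 'Weight', 'Duration']
categorical_cols = ['Gender', 'Activity']

# Numeric columns: fill gaps with the median, then standardize.
numeric_steps = Pipeline([
    ('impute', SimpleImputer(strategy='median')),
    ('scale', StandardScaler()),
])

# Categorical columns: fill gaps with the most frequent label, then one-hot encode.
categorical_steps = Pipeline([
    ('impute', SimpleImputer(strategy='most_frequent')),
    ('encode', OneHotEncoder(handle_unknown='ignore')),
])

preprocessor = ColumnTransformer([
    ('num', numeric_steps, numeric_cols),
    ('cat', categorical_steps, categorical_cols),
])

features = preprocessor.fit_transform(sample)
print(features.shape)  # 3 numeric + 2 gender + 3 activity columns
```

Fitting the transformer on the training split only, and reusing it unchanged on the test split and on later user input, would keep the encoding and scaling consistent across all three.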

In [None]:
from sklearn.preprocessing import LabelEncoder

# Example: drop rows with missing target values before encoding
df = df.dropna(subset=['Calories_Burned'])

# Encode categorical variables as integer codes.
# Note: the codes follow alphabetical order of the labels and carry no
# real ranking, which a linear model will still treat as numeric.
le_gender = LabelEncoder()
df['Gender'] = le_gender.fit_transform(df['Gender'])

le_activity = LabelEncoder()
df['Activity'] = le_activity.fit_transform(df['Activity'])

# Features and target
X = df[['Age', 'Gender', 'Height', 'Weight', 'Duration', 'Heart_Rate', 'Body_Temp', 'Activity']]
y = df['Calories_Burned']

## Model Development
We will start with a simple Linear Regression model to predict calories burned.

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
import numpy as np

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Linear Regression model
lr_model = LinearRegression()
lr_model.fit(X_train, y_train)

# Predictions
y_pred = lr_model.predict(X_test)

# Evaluation
mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print(f'Mean Absolute Error (MAE): {mae:.2f}')
print(f'Root Mean Squared Error (RMSE): {rmse:.2f}')
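
For reference, the two metrics printed above can be reproduced by hand. The sketch below uses four made-up actual and predicted values (not results from this project) to show that MAE averages the absolute errors, while RMSE squares them first and therefore weighs the single large miss more heavily.

```python
import math

# Made-up true and predicted calorie values, for illustration only.
actual = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 190.0, 280.0]

# Signed errors: 10, -10, -10, 30
errors = [p - a for a, p in zip(actual, predicted)]

# MAE: mean of the absolute errors.
mae_by_hand = sum(abs(e) for e in errors) / len(errors)

# RMSE: square root of the mean squared error.
rmse_by_hand = math.sqrt(sum(e ** 2 for e in errors) / len(errors))

print(f'MAE:  {mae_by_hand:.2f}')   # 15.00
print(f'RMSE: {rmse_by_hand:.2f}')  # 17.32
```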

## Future Web App Integration

We plan to build a web application using the following technologies:

- **Backend:** Flask (Python) to handle user inputs and model predictions
- **Frontend:** HTML, CSS, and JavaScript for a simple user interface
- **Visualization:** Matplotlib or Plotly to display charts of calories burned vs. activity duration
- **Deployment:** Local testing and possible deployment on Heroku or similar platforms

Users will input their personal and workout data, and the app will display the predicted calories burned. Future enhancements include allowing users to compare predictions with actual data.
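
Before a prediction can be made, the submitted form values need to be checked and put in the order the model expects. The sketch below shows one way this could look in plain Python. The helper name `parse_workout_form`, the constants, and the validation rules are hypothetical and not part of the project yet. It only relies on a mapping with a `.get` method, so a plain dict or a web framework's form object could be passed in.

```python
# Hypothetical helper: none of these names come from the project itself.
FEATURE_ORDER = ['Age', 'Gender', 'Height', 'Weight',
                 'Duration', 'Heart_Rate', 'Body_Temp', 'Activity']
NUMERIC_FIELDS = ['Age', 'Height', 'Weight',
                  'Duration', 'Heart_Rate', 'Body_Temp']


def parse_workout_form(form):
    """Validate raw form strings and return values in model feature order."""
    missing = [name for name in FEATURE_ORDER if not form.get(name)]
    if missing:
        raise ValueError(f"Missing fields: {', '.join(missing)}")

    parsed = {}
    for name in NUMERIC_FIELDS:
        try:
            value = float(form[name])
        except ValueError:
            raise ValueError(f'{name} must be a number') from None
        if value <= 0:
            raise ValueError(f'{name} must be positive')
        parsed[name] = value

    # Categorical fields stay as text; encoding is left to the same
    # preprocessing that was used at training time.
    parsed['Gender'] = form['Gender'].strip()
    parsed['Activity'] = form['Activity'].strip()
    return [parsed[name] for name in FEATURE_ORDER]


example_form = {
    'Age': '29', 'Gender': 'Female', 'Height': '5.5', 'Weight': '140',
    'Duration': '30', 'Heart_Rate': '120', 'Body_Temp': '99.1',
    'Activity': 'Running',
}
row = parse_workout_form(example_form)
print(row)
```

Keeping validation separate from the route handler would let it be unit-tested without starting a web server.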

## Next Steps

1. Tune hyperparameters and experiment with a Random Forest Regressor.
2. Develop Flask backend routes and HTML templates.
3. Integrate ML model into the web app.
4. Add data visualizations and user feedback.
5. Deploy the web app for public use.
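
As a possible starting point for step 1, the sketch below runs a small grid search over a Random Forest Regressor with scikit-learn's `GridSearchCV`. The data is synthetic and the parameter grid is an arbitrary assumption chosen to keep the run short. In the notebook, `X_train` and `y_train` would take the place of `X_demo` and `y_demo`.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data; replace with X_train and y_train from the notebook.
rng = np.random.default_rng(42)
X_demo = rng.uniform(0, 1, size=(120, 4))
y_demo = 300 * X_demo[:, 0] + 50 * X_demo[:, 1] + rng.normal(0, 5, size=120)

# Small illustrative grid; real experiments would likely search more values.
param_grid = {
    'n_estimators': [50, 100],
    'max_depth': [3, None],
}

search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    scoring='neg_mean_absolute_error',  # scikit-learn maximizes scores, hence the sign
    cv=3,
)
search.fit(X_demo, y_demo)

print('Best parameters:', search.best_params_)
print(f'Cross-validated MAE: {-search.best_score_:.2f}')
```

Scoring with MAE keeps the search comparable to the linear regression baseline, which reports the same metric.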