{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Feature Engineering\n",
        "\n",
        "This notebook engineers one new feature and preprocesses the cleaned dataset for modeling. It adds `weight_difference` (current weight minus an assumed fixed target weight of 65 kg) and standardizes the numerical features."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Import Libraries"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import pandas as pd\n",
        "from sklearn.preprocessing import StandardScaler\n",
        "import matplotlib.pyplot as plt\n",
        "import seaborn as sns\n",
        "\n",
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Load Dataset\n",
        "\n",
        "Load the cleaned dataset from the previous notebook."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "df = pd.read_csv('cleaned_fitness_data.csv')\n",
        "df"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Create New Feature\n",
        "\n",
        "Add `weight_difference`, defined as `current_weight - target_weight`. This analysis assumes a single fixed target weight of 65 kg for every row, chosen as a rough stand-in for typical fitness goals; it is a working assumption, not a value estimated from the data."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "TARGET_WEIGHT_KG = 65  # assumed fixed target weight (kg) for every row\n",
        "df['weight_difference'] = df['current_weight'] - TARGET_WEIGHT_KG\n",
        "df"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Standardize Features\n",
        "\n",
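        "Standardize the numerical features (`current_weight`, `exercise_hours`, `calorie_intake`, `weight_difference`) to zero mean and unit variance. Because `weight_difference` is `current_weight` minus a constant, the two columns become identical after standardization, so they carry the same information for a linear model."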
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "scaler = StandardScaler()\n",
        "features = ['current_weight', 'exercise_hours', 'calorie_intake', 'weight_difference']\n",
        "df[features] = scaler.fit_transform(df[features])\n",
        "df"
      ]
    },
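    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a minimal sketch of what standardization computes, the next cell applies the z-score formula `(x - mean) / std` by hand using only the standard library. The weights are hypothetical toy values, not rows from the dataset, and `standardize` is an illustrative helper rather than part of scikit-learn. It uses the population standard deviation, which is what `StandardScaler` uses as well. The last check shows that subtracting a constant target weight does not change the standardized values."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from math import isclose\n",
        "from statistics import fmean, pstdev\n",
        "\n",
        "\n",
        "def standardize(values):\n",
        "    # z-score: subtract the mean, divide by the population standard deviation\n",
        "    mu = fmean(values)\n",
        "    sigma = pstdev(values)\n",
        "    return [(v - mu) / sigma for v in values]\n",
        "\n",
        "\n",
        "# Hypothetical toy weights in kg (not taken from the fitness dataset)\n",
        "weights = [60.0, 72.5, 81.0, 68.0, 90.5]\n",
        "differences = [w - 65 for w in weights]  # same assumed 65 kg target\n",
        "\n",
        "z_weights = standardize(weights)\n",
        "z_differences = standardize(differences)\n",
        "\n",
        "# Standardized values have mean 0 and population standard deviation 1\n",
        "print(round(fmean(z_weights), 6), round(pstdev(z_weights), 6))\n",
        "\n",
        "# A constant shift cancels out, so both standardized lists match\n",
        "same = all(isclose(a, b, abs_tol=1e-9) for a, b in zip(z_weights, z_differences))\n",
        "print('identical after standardization:', same)"
      ]
    },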
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Visualize Engineered Features\n",
        "\n",
        "Plot the distribution of the new `weight_difference` feature."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Set up the plotting style\n",
        "sns.set(style='whitegrid')\n",
        "\n",
        "# Plot histogram for weight_difference\n",
        "plt.figure(figsize=(6, 4))\n",
        "sns.histplot(df['weight_difference'], kde=True)\n",
        "plt.title('Distribution of Weight Difference')\n",
        "plt.xlabel('Weight Difference (Standardized)')\n",
        "plt.savefig('weight_difference_distribution.png')\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Save Engineered Dataset\n",
        "\n",
        "Save the dataset with the new feature and standardized values."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "df.to_csv('engineered_fitness_data.csv', index=False)\n",
        "print('Engineered dataset saved as engineered_fitness_data.csv')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Summary\n",
        "\n",
        "Added `weight_difference` as a new feature, assuming a fixed target weight of 65 kg, and standardized the four numerical features to zero mean and unit variance. The engineered dataset is saved as `engineered_fitness_data.csv` for training the linear regression model."
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.8.10"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 4
}