Commit

Update model training
mr-perseus committed Dec 2, 2023
1 parent dd1b650 commit 8e7d8ea
Showing 2 changed files with 156 additions and 76 deletions.
217 changes: 146 additions & 71 deletions model/W&B_PPSG_LSTM.ipynb
@@ -7,38 +7,17 @@
},
"source": [
"\n",
"# Hyperparameter Sweeps\n",
"# Hyperparameter Sweeps for Parking Data in St. Gallen with LSTM\n",
"\n",
"In this project, we use Hyperparemter sweeps with Pytorch on \"Weights & Biases\". For further details, check out this [Colab](http://wandb.me/sweeps-colab).\n",
"\n",
"Inspired by https://github.com/SheezaShabbir/Time-series-Analysis-using-LSTM-RNN-and-GRU"
"A lot of inspirations come from [this GitHub repository](https://github.com/SheezaShabbir/Time-series-Analysis-using-LSTM-RNN-and-GRU)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hjckmLcx5qL_"
},
"source": [
"## Setup\n",
"\n",
"Start out by installing the experiment tracking library and setting up your free W&B account:\n",
"\n",
"1. Install with `!pip install`\n",
"2. `import` the library into Python\n",
"3. `.login()` so you can log metrics to your projects\n",
"\n",
"If you've never used Weights & Biases before,\n",
"the call to `login` will give you a link to sign up for an account.\n",
"W&B is free to use for personal and academic projects!"
]
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"!pip install wandb -Uq"
"## Imports"
],
"metadata": {
"collapsed": false
@@ -49,12 +28,34 @@
"execution_count": null,
"outputs": [],
"source": [
"import wandb"
"import wandb\n",
"import os\n",
"import torch\n",
"import torch.optim as optim\n",
"import torch.nn.functional as F\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from torch.utils.data import TensorDataset, DataLoader\n",
"from sklearn.preprocessing import StandardScaler, MaxAbsScaler, MinMaxScaler, RobustScaler\n",
"from models import LSTMModel, RNNModel, GRUModel\n",
"from scaler import Scaler\n",
"from data.metadata.metadata import parking_data_labels\n",
"from data.preprocessing.preprocess_features import PreprocessFeatures"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"metadata": {
"id": "hjckmLcx5qL_"
},
"source": [
"## Login to W&B"
]
},
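The login cell itself is collapsed in this view; a minimal sketch of the login step, using only the standard `wandb` API and no project-specific assumptions, would be:

```python
import wandb

# Prompts for an API key (or reads WANDB_API_KEY) and authenticates the session.
wandb.login()
```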
{
"cell_type": "code",
"execution_count": null,
@@ -76,7 +77,9 @@
"\n",
"We define the sweep config via dict in our Jupyter notebook. You can find more information on sweeps in the [documentation](https://docs.wandb.com/sweeps/configuration).\n",
"\n",
"You can find a list of all configuration options [here](https://docs.wandb.com/library/sweeps/configuration) and a big collection of examples in YAML format [here](https://github.com/wandb/examples/tree/master/examples/keras/keras-cnn-fashion)."
"You can find a list of all configuration options [here](https://docs.wandb.com/library/sweeps/configuration) and a big collection of examples in YAML format [here](https://github.com/wandb/examples/tree/master/examples/keras/keras-cnn-fashion).\n",
"\n",
"We use some information for good values from this [source](https://towardsdatascience.com/choosing-the-right-hyperparameters-for-a-simple-lstm-using-keras-f8e9ed76f046)."
]
},
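The sweep-config cell that follows is truncated in the diff, so here is a hedged sketch of what a Bayesian sweep over the hyperparameters used later in the notebook (`model`, `optimizer`, `fc_layer_size`, `num_layers`, `dropout`, `learning_rate`, `batch_size`, `epochs`) could look like. The metric name and all value ranges are illustrative assumptions, not the committed settings:

```python
sweep_config = {
    'method': 'bayes',  # Bayesian optimization over the search space
    'metric': {'name': 'val_loss', 'goal': 'minimize'},  # metric name assumed
    'parameters': {
        'model': {'values': ['lstm', 'gru', 'rnn']},
        'optimizer': {'values': ['adam', 'sgd']},
        'fc_layer_size': {'values': [64, 128, 256]},
        'num_layers': {'values': [1, 2, 3]},
        'dropout': {'values': [0.0, 0.2, 0.4]},
        'learning_rate': {'min': 0.0001, 'max': 0.01},
        'batch_size': {'values': [32, 64, 128]},
        'epochs': {'value': 20},
    },
}
```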
{
@@ -87,9 +90,6 @@
},
"outputs": [],
"source": [
"# See also https://towardsdatascience.com/choosing-the-right-hyperparameters-for-a-simple-lstm-using-keras-f8e9ed76f046\n",
"# Runs are here: https://wandb.ai/parcaster/pp-sg-lstm\n",
"\n",
"sweep_config = {\n",
" 'method': 'bayes',\n",
" 'metric': {\n",
@@ -142,7 +142,9 @@
"id": "bHesSoz85qMF"
},
"source": [
"## Initialize the setup"
"## Initialize the setup\n",
"\n",
"You can find the runs [here](https://wandb.ai/parcaster/pp-sg-lstm)."
]
},
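The setup cell is collapsed here; registering the sweep is typically a single call. The project and entity names below are taken from the run URL above (wandb.ai/parcaster/pp-sg-lstm), while the exact call used in the notebook is an assumption:

```python
# Register the sweep with W&B and get back an id the agent can poll for configs.
sweep_id = wandb.sweep(sweep_config, project="pp-sg-lstm", entity="parcaster")
```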
{
@@ -191,20 +193,6 @@
},
"outputs": [],
"source": [
"import os\n",
"import torch\n",
"import torch.optim as optim\n",
"import torch.nn.functional as F\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from torch.utils.data import TensorDataset, DataLoader\n",
"from sklearn.preprocessing import StandardScaler, MaxAbsScaler, MinMaxScaler, RobustScaler\n",
"from models import LSTMModel, RNNModel, GRUModel\n",
"from scaler import Scaler\n",
"from data.metadata.metadata import feature_columns, parking_data_labels\n",
"from data.preprocessing.preprocess_features import PreprocessFeatures\n",
"\n",
"device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
"\n",
"print(f\"Training on {device}\")\n",
Expand Down Expand Up @@ -249,6 +237,17 @@
" save_model_scaler(network, scaler)"
]
},
{
"cell_type": "markdown",
"source": [
"### Load the data\n",
"\n",
"We load the data from the CSV files and split it into training and validation sets."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
@@ -259,9 +258,6 @@
"\n",
" preprocess_features = PreprocessFeatures(df)\n",
"\n",
" # TODO unify this\n",
" # df['datetime'] = pd.to_datetime(df['datetime'], format='%d.%m.%Y %H:%M')\n",
"\n",
" y = df[parking_data_labels]\n",
" X, input_dim = preprocess_features.get_features_for_model()\n",
"\n",
@@ -283,6 +279,17 @@
"collapsed": false
}
},
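The data-loading cell above is only partially visible. A hedged sketch of how it might continue, reusing the names that do appear (`PreprocessFeatures`, `get_features_for_model`, `parking_data_labels`); the function name, CSV path, and 80/10/10 chronological split are assumptions:

```python
def load_data(csv_path="data/parking_data.csv"):  # function name and path are hypothetical
    df = pd.read_csv(csv_path)

    preprocess_features = PreprocessFeatures(df)

    y = df[parking_data_labels]
    X, input_dim = preprocess_features.get_features_for_model()

    # Chronological split so validation and test data lie after the training period.
    n = len(df)
    train_end, val_end = int(n * 0.8), int(n * 0.9)
    X_train, y_train = X[:train_end], y[:train_end]
    X_val, y_val = X[train_end:val_end], y[train_end:val_end]
    X_test, y_test = X[val_end:], y[val_end:]

    return (X_train, y_train), (X_val, y_val), (X_test, y_test), input_dim
```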
{
"cell_type": "markdown",
"source": [
"### Apply the scaler\n",
"\n",
"We apply the scaler to the data."
],
"metadata": {
"collapsed": false
}
},
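The scaling cell below is collapsed. The notebook imports several sklearn scalers plus a project-specific `Scaler` wrapper whose API is not visible here, so this stand-in sketch uses plain `MinMaxScaler` objects fitted on the training split only (to avoid leaking validation and test statistics):

```python
# Separate scalers for features and targets, fitted on the training data only.
x_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()

X_train_scaled = x_scaler.fit_transform(X_train)
X_val_scaled = x_scaler.transform(X_val)
X_test_scaled = x_scaler.transform(X_test)

y_train_scaled = y_scaler.fit_transform(y_train)
y_val_scaled = y_scaler.transform(y_val)
y_test_scaled = y_scaler.transform(y_test)
```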
{
"cell_type": "code",
"execution_count": null,
@@ -310,12 +317,9 @@
"id": "Lq8SQD9s5qMG"
},
"source": [
"This cell defines the four pieces of our training procedure:\n",
"`build_dataset`, `build_network`, `build_optimizer`, and `train_epoch`.\n",
"### Build the dataset\n",
"\n",
"All of these are a standard part of a basic PyTorch pipeline,\n",
"and their implementation is unaffected by the use of W&B,\n",
"so we won't comment on them."
"We build the dataset from features (X) and labels (y). This is used for training data, validation data and test data."
]
},
{
@@ -332,9 +336,27 @@
"\n",
" dataset = TensorDataset(features, targets)\n",
"\n",
" return DataLoader(dataset, batch_size=batch_size, shuffle=False, drop_last=True)\n",
" return DataLoader(dataset, batch_size=batch_size, shuffle=False, drop_last=True)"
]
},
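The beginning of `build_dataset` is cut off in the diff; a hedged reconstruction of the full helper, consistent with the visible `TensorDataset`/`DataLoader` lines (the tensor-conversion details are assumptions):

```python
def build_dataset(X, y, batch_size):
    # Convert array/DataFrame inputs to float tensors; the loader keeps temporal
    # order (shuffle=False) and drops the last incomplete batch (drop_last=True).
    features = torch.tensor(np.asarray(X), dtype=torch.float32)
    targets = torch.tensor(np.asarray(y), dtype=torch.float32)

    dataset = TensorDataset(features, targets)

    return DataLoader(dataset, batch_size=batch_size, shuffle=False, drop_last=True)
```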
{
"cell_type": "markdown",
"source": [
"### Build the network\n",
"\n",
"We build the network with the given hyperparameters. We can choose between RNN, LSTM and GRU.\n",
"\n",
"We decided to use LSTM, because it is the most powerful of the three and we have a lot of data."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"def build_network(fc_layer_size, dropout, num_layers, input_dim, output_dim, model):\n",
" if model == \"rnn\":\n",
" network = RNNModel(input_dim=input_dim, hidden_dim=fc_layer_size, layer_dim=num_layers, output_dim=output_dim,\n",
@@ -348,19 +370,59 @@
" else:\n",
" raise ValueError(f\"Invalid model value: {model}\")\n",
"\n",
" return network.to(device)\n",
"\n",
" return network.to(device)\n"
],
"metadata": {
"collapsed": false
}
},
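For context, `build_network` would be called from the sweep's `train` function roughly like this; `config` stands for the `wandb.config` of the current run, and deriving `output_dim` from `parking_data_labels` is an assumption:

```python
network = build_network(
    fc_layer_size=config.fc_layer_size,
    dropout=config.dropout,
    num_layers=config.num_layers,
    input_dim=input_dim,
    output_dim=len(parking_data_labels),  # one output per parking lot label (assumed)
    model=config.model,
)
```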
{
"cell_type": "markdown",
"source": [
"### Build the optimizer\n",
"\n",
"We build the optimizer with the given hyperparameters. We can choose between SGD and Adam."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"def build_optimizer(network, optimizer, learning_rate):\n",
" if optimizer == \"sgd\":\n",
" optimizer = optim.SGD(network.parameters(),\n",
" lr=learning_rate, momentum=0.9)\n",
" elif optimizer == \"adam\":\n",
" optimizer = optim.Adam(network.parameters(),\n",
" lr=learning_rate)\n",
" return optimizer\n",
"\n",
" return optimizer"
],
"metadata": {
"collapsed": false
}
},
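A matching usage example, again with `config` standing in for the run's `wandb.config` (assumption). Note that, unlike `build_network`, this helper does not raise on an unrecognized optimizer name; it would simply return its input unchanged.

```python
optimizer = build_optimizer(network, config.optimizer, config.learning_rate)
```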
{
"cell_type": "markdown",
"source": [
"### Define the training and validation epochs, and the test network\n",
"\n",
"- `train_epoch` is used in every epoch to train the network. The training loss is calculated with the mean squared error.\n",
"- `val_epoch` is used in every epoch to validate the network. The validation loss is calculated with the mean squared error.\n",
"- `test_network` is used after the training to test the network. The test loss is calculated with the mean squared error."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"def train_epoch(network, loader, optimizer, batch_size, input_dim):\n",
" losses = []\n",
" network.train()\n",
@@ -412,7 +474,21 @@
" losses.append(loss.item())\n",
"\n",
" return np.mean(losses), outputs, targets"
]
],
"metadata": {
"collapsed": false
}
},
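The body of the training loop is truncated above; here is a hedged sketch of one MSE-based training epoch. The reshape to `(batch, seq_len, input_dim)` and the return value are assumptions about details the diff does not show:

```python
def train_epoch(network, loader, optimizer, batch_size, input_dim):
    losses = []
    network.train()
    for data, targets in loader:
        # Reshape the flat feature vector into (batch, seq_len, features) for the recurrent model.
        data = data.view(batch_size, -1, input_dim).to(device)
        targets = targets.to(device)

        optimizer.zero_grad()
        outputs = network(data)
        loss = F.mse_loss(outputs, targets)  # training loss is the mean squared error
        loss.backward()
        optimizer.step()

        losses.append(loss.item())

    return np.mean(losses)
```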
{
"cell_type": "markdown",
"source": [
"## Plot the test prediction\n",
"\n",
"We invert the scaler and plot the prediction and the target."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
@@ -465,6 +541,17 @@
"collapsed": false
}
},
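The plotting cell is truncated; a hedged sketch of inverting the target scaler and plotting predictions against ground truth for one parking lot. `inverse_transform` matches the sklearn scaler API; the function name, axis labels, and index handling are illustrative:

```python
def plot_test_prediction(outputs, targets, y_scaler, label_index=0):
    # Bring predictions and targets back to the original scale before plotting.
    preds = y_scaler.inverse_transform(outputs.detach().cpu().numpy())
    truth = y_scaler.inverse_transform(targets.detach().cpu().numpy())

    plt.figure(figsize=(12, 4))
    plt.plot(truth[:, label_index], label="target")
    plt.plot(preds[:, label_index], label="prediction")
    plt.xlabel("time step")
    plt.ylabel(parking_data_labels[label_index])
    plt.legend()
    plt.show()
```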
{
"cell_type": "markdown",
"source": [
"## Save the model and the scaler\n",
"\n",
"This can later be reused to predict real values. (see app.py)"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
@@ -483,20 +570,8 @@
},
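The implementation of `save_model_scaler` (called at the end of the training pipeline above) is not visible in this hunk; one plausible sketch persists the weights with `torch.save` and the fitted scaler with `joblib`. The file names and the choice of `joblib` are assumptions:

```python
import joblib  # assumption: any pickle-compatible serializer would work

def save_model_scaler(network, scaler, model_path="model.pt", scaler_path="scaler.pkl"):
    # Persist only the weights; the architecture is rebuilt from the sweep config at load time.
    torch.save(network.state_dict(), model_path)
    joblib.dump(scaler, scaler_path)
```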
{
"cell_type": "markdown",
"metadata": {
"id": "UIEd9-Gm5qMH"
},
"source": [
"The cell below will launch an `agent` that runs `train` 5 times,\n",
"usingly the randomly-generated hyperparameter values returned by the Sweep Controller. Execution takes under 5 minutes."
]
},
{
"cell_type": "markdown",
"source": [
"Now, we're ready to start sweeping! 🧹🧹🧹\n",
"\n",
"Sweep Controllers, like the one we made by running `wandb.sweep`, sit waiting for someone to ask them for a `config` to try out.\n",
"# Start the agent and run the sweep\n",
"\n",
"That someone is an `agent`, and they are created with `wandb.agent`.\n",
"To get going, the agent just needs to know\n",
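The agent call itself falls outside the visible part of the diff. From the surrounding text (an agent that runs `train` 5 times against the sweep created earlier), it plausibly looks like this:

```python
# Pull hyperparameter configs from the sweep controller and run train() five times.
wandb.agent(sweep_id, function=train, count=5)
```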
