Commit

Update model training
mr-perseus committed Dec 2, 2023
1 parent dd1b650 commit 8e7d8ea
Showing 2 changed files with 156 additions and 76 deletions.
217 changes: 146 additions & 71 deletions model/W&B_PPSG_LSTM.ipynb
@@ -7,38 +7,17 @@
},
"source": [
"\n",
"# Hyperparameter Sweeps\n",
"# Hyperparameter Sweeps for Parking Data in St. Gallen with LSTM\n",
"\n",
"In this project, we use Hyperparemter sweeps with Pytorch on \"Weights & Biases\". For further details, check out this [Colab](http://wandb.me/sweeps-colab).\n",
"\n",
"Inspired by https://github.com/SheezaShabbir/Time-series-Analysis-using-LSTM-RNN-and-GRU"
"A lot of inspirations come from [this GitHub repository](https://github.com/SheezaShabbir/Time-series-Analysis-using-LSTM-RNN-and-GRU)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hjckmLcx5qL_"
},
"source": [
"## Setup\n",
"\n",
"Start out by installing the experiment tracking library and setting up your free W&B account:\n",
"\n",
"1. Install with `!pip install`\n",
"2. `import` the library into Python\n",
"3. `.login()` so you can log metrics to your projects\n",
"\n",
"If you've never used Weights & Biases before,\n",
"the call to `login` will give you a link to sign up for an account.\n",
"W&B is free to use for personal and academic projects!"
]
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"!pip install wandb -Uq"
"## Imports"
],
"metadata": {
"collapsed": false
@@ -49,12 +28,34 @@
"execution_count": null,
"outputs": [],
"source": [
"import wandb"
"import wandb\n",
"import os\n",
"import torch\n",
"import torch.optim as optim\n",
"import torch.nn.functional as F\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from torch.utils.data import TensorDataset, DataLoader\n",
"from sklearn.preprocessing import StandardScaler, MaxAbsScaler, MinMaxScaler, RobustScaler\n",
"from models import LSTMModel, RNNModel, GRUModel\n",
"from scaler import Scaler\n",
"from data.metadata.metadata import parking_data_labels\n",
"from data.preprocessing.preprocess_features import PreprocessFeatures"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"metadata": {
"id": "hjckmLcx5qL_"
},
"source": [
"## Login to W&B"
]
},
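The login cell itself is collapsed in this view; a minimal sketch of the login step, using only the standard `wandb` API and no project-specific assumptions, would be:

```python
import wandb

# Prompts for an API key (or reads WANDB_API_KEY) and authenticates the session.
wandb.login()
```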
{
"cell_type": "code",
"execution_count": null,
@@ -76,7 +77,9 @@
"\n",
"We define the sweep config via dict in our Jupyter notebook. You can find more information on sweeps in the [documentation](https://docs.wandb.com/sweeps/configuration).\n",
"\n",
"You can find a list of all configuration options [here](https://docs.wandb.com/library/sweeps/configuration) and a big collection of examples in YAML format [here](https://github.com/wandb/examples/tree/master/examples/keras/keras-cnn-fashion)."
"You can find a list of all configuration options [here](https://docs.wandb.com/library/sweeps/configuration) and a big collection of examples in YAML format [here](https://github.com/wandb/examples/tree/master/examples/keras/keras-cnn-fashion).\n",
"\n",
"We use some information for good values from this [source](https://towardsdatascience.com/choosing-the-right-hyperparameters-for-a-simple-lstm-using-keras-f8e9ed76f046)."
]
},
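The sweep-config cell that follows is truncated in the diff, so here is a hedged sketch of what a Bayesian sweep over the hyperparameters used later in the notebook (`model`, `optimizer`, `fc_layer_size`, `num_layers`, `dropout`, `learning_rate`, `batch_size`, `epochs`) could look like. The metric name and all value ranges are illustrative assumptions, not the committed settings:

```python
sweep_config = {
    'method': 'bayes',  # Bayesian optimization over the search space
    'metric': {'name': 'val_loss', 'goal': 'minimize'},  # metric name assumed
    'parameters': {
        'model': {'values': ['lstm', 'gru', 'rnn']},
        'optimizer': {'values': ['adam', 'sgd']},
        'fc_layer_size': {'values': [64, 128, 256]},
        'num_layers': {'values': [1, 2, 3]},
        'dropout': {'values': [0.0, 0.2, 0.4]},
        'learning_rate': {'min': 0.0001, 'max': 0.01},
        'batch_size': {'values': [32, 64, 128]},
        'epochs': {'value': 20},
    },
}
```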
{
@@ -87,9 +90,6 @@
},
"outputs": [],
"source": [
"# See also https://towardsdatascience.com/choosing-the-right-hyperparameters-for-a-simple-lstm-using-keras-f8e9ed76f046\n",
"# Runs are here: https://wandb.ai/parcaster/pp-sg-lstm\n",
"\n",
"sweep_config = {\n",
" 'method': 'bayes',\n",
" 'metric': {\n",
@@ -142,7 +142,9 @@
"id": "bHesSoz85qMF"
},
"source": [
"## Initialize the setup"
"## Initialize the setup\n",
"\n",
"You can find the runs [here](https://wandb.ai/parcaster/pp-sg-lstm)."
]
},
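The setup cell is collapsed here; registering the sweep is typically a single call. The project and entity names below are taken from the run URL above (wandb.ai/parcaster/pp-sg-lstm), while the exact call used in the notebook is an assumption:

```python
# Register the sweep with W&B and get back an id the agent can poll for configs.
sweep_id = wandb.sweep(sweep_config, project="pp-sg-lstm", entity="parcaster")
```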
{
@@ -191,20 +193,6 @@
},
"outputs": [],
"source": [
"import os\n",
"import torch\n",
"import torch.optim as optim\n",
"import torch.nn.functional as F\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from torch.utils.data import TensorDataset, DataLoader\n",
"from sklearn.preprocessing import StandardScaler, MaxAbsScaler, MinMaxScaler, RobustScaler\n",
"from models import LSTMModel, RNNModel, GRUModel\n",
"from scaler import Scaler\n",
"from data.metadata.metadata import feature_columns, parking_data_labels\n",
"from data.preprocessing.preprocess_features import PreprocessFeatures\n",
"\n",
"device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
"\n",
"print(f\"Training on {device}\")\n",
Expand Down Expand Up @@ -249,6 +237,17 @@
" save_model_scaler(network, scaler)"
]
},
{
"cell_type": "markdown",
"source": [
"### Load the data\n",
"\n",
"We load the data from the CSV files and split it into training and validation sets."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
@@ -259,9 +258,6 @@
"\n",
" preprocess_features = PreprocessFeatures(df)\n",
"\n",
" # TODO unify this\n",
" # df['datetime'] = pd.to_datetime(df['datetime'], format='%d.%m.%Y %H:%M')\n",
"\n",
" y = df[parking_data_labels]\n",
" X, input_dim = preprocess_features.get_features_for_model()\n",
"\n",
@@ -283,6 +279,17 @@
"collapsed": false
}
},
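The data-loading cell above is only partially visible. A hedged sketch of how it might continue, reusing the names that do appear (`PreprocessFeatures`, `get_features_for_model`, `parking_data_labels`); the function name, CSV path, and 80/10/10 chronological split are assumptions:

```python
def load_data(csv_path="data/parking_data.csv"):  # function name and path are hypothetical
    df = pd.read_csv(csv_path)

    preprocess_features = PreprocessFeatures(df)

    y = df[parking_data_labels]
    X, input_dim = preprocess_features.get_features_for_model()

    # Chronological split so validation and test data lie after the training period.
    n = len(df)
    train_end, val_end = int(n * 0.8), int(n * 0.9)
    X_train, y_train = X[:train_end], y[:train_end]
    X_val, y_val = X[train_end:val_end], y[train_end:val_end]
    X_test, y_test = X[val_end:], y[val_end:]

    return (X_train, y_train), (X_val, y_val), (X_test, y_test), input_dim
```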
{
"cell_type": "markdown",
"source": [
"### Apply the scaler\n",
"\n",
"We apply the scaler to the data."
],
"metadata": {
"collapsed": false
}
},
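The scaling cell below is collapsed. The notebook imports several sklearn scalers plus a project-specific `Scaler` wrapper whose API is not visible here, so this stand-in sketch uses plain `MinMaxScaler` objects fitted on the training split only (to avoid leaking validation and test statistics):

```python
# Separate scalers for features and targets, fitted on the training data only.
x_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()

X_train_scaled = x_scaler.fit_transform(X_train)
X_val_scaled = x_scaler.transform(X_val)
X_test_scaled = x_scaler.transform(X_test)

y_train_scaled = y_scaler.fit_transform(y_train)
y_val_scaled = y_scaler.transform(y_val)
y_test_scaled = y_scaler.transform(y_test)
```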
{
"cell_type": "code",
"execution_count": null,
@@ -310,12 +317,9 @@
"id": "Lq8SQD9s5qMG"
},
"source": [
"This cell defines the four pieces of our training procedure:\n",
"`build_dataset`, `build_network`, `build_optimizer`, and `train_epoch`.\n",
"### Build the dataset\n",
"\n",
"All of these are a standard part of a basic PyTorch pipeline,\n",
"and their implementation is unaffected by the use of W&B,\n",
"so we won't comment on them."
"We build the dataset from features (X) and labels (y). This is used for training data, validation data and test data."
]
},
{
@@ -332,9 +336,27 @@
"\n",
" dataset = TensorDataset(features, targets)\n",
"\n",
" return DataLoader(dataset, batch_size=batch_size, shuffle=False, drop_last=True)\n",
" return DataLoader(dataset, batch_size=batch_size, shuffle=False, drop_last=True)"
]
},
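The beginning of `build_dataset` is cut off in the diff; a hedged reconstruction of the full helper, consistent with the visible `TensorDataset`/`DataLoader` lines (the tensor-conversion details are assumptions):

```python
def build_dataset(X, y, batch_size):
    # Convert array/DataFrame inputs to float tensors; the loader keeps temporal
    # order (shuffle=False) and drops the last incomplete batch (drop_last=True).
    features = torch.tensor(np.asarray(X), dtype=torch.float32)
    targets = torch.tensor(np.asarray(y), dtype=torch.float32)

    dataset = TensorDataset(features, targets)

    return DataLoader(dataset, batch_size=batch_size, shuffle=False, drop_last=True)
```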
{
"cell_type": "markdown",
"source": [
"### Build the network\n",
"\n",
"We build the network with the given hyperparameters. We can choose between RNN, LSTM and GRU.\n",
"\n",
"We decided to use LSTM, because it is the most powerful of the three and we have a lot of data."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"def build_network(fc_layer_size, dropout, num_layers, input_dim, output_dim, model):\n",
" if model == \"rnn\":\n",
" network = RNNModel(input_dim=input_dim, hidden_dim=fc_layer_size, layer_dim=num_layers, output_dim=output_dim,\n",
@@ -348,19 +370,59 @@
" else:\n",
" raise ValueError(f\"Invalid model value: {model}\")\n",
"\n",
" return network.to(device)\n",
"\n",
" return network.to(device)\n"
],
"metadata": {
"collapsed": false
}
},
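For context, `build_network` would be called from the sweep's `train` function roughly like this; `config` stands for the `wandb.config` of the current run, and deriving `output_dim` from `parking_data_labels` is an assumption:

```python
network = build_network(
    fc_layer_size=config.fc_layer_size,
    dropout=config.dropout,
    num_layers=config.num_layers,
    input_dim=input_dim,
    output_dim=len(parking_data_labels),  # one output per parking lot label (assumed)
    model=config.model,
)
```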
{
"cell_type": "markdown",
"source": [
"### Build the optimizer\n",
"\n",
"We build the optimizer with the given hyperparameters. We can choose between SGD and Adam."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"def build_optimizer(network, optimizer, learning_rate):\n",
" if optimizer == \"sgd\":\n",
" optimizer = optim.SGD(network.parameters(),\n",
" lr=learning_rate, momentum=0.9)\n",
" elif optimizer == \"adam\":\n",
" optimizer = optim.Adam(network.parameters(),\n",
" lr=learning_rate)\n",
" return optimizer\n",
"\n",
" return optimizer"
],
"metadata": {
"collapsed": false
}
},
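A matching usage example, again with `config` standing in for the run's `wandb.config` (assumption). Note that, unlike `build_network`, this helper does not raise on an unrecognized optimizer name; it would simply return its input unchanged.

```python
optimizer = build_optimizer(network, config.optimizer, config.learning_rate)
```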
{
"cell_type": "markdown",
"source": [
"### Define the training and validation epochs, and the test network\n",
"\n",
"- `train_epoch` is used in every epoch to train the network. The training loss is calculated with the mean squared error.\n",
"- `val_epoch` is used in every epoch to validate the network. The validation loss is calculated with the mean squared error.\n",
"- `test_network` is used after the training to test the network. The test loss is calculated with the mean squared error."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"def train_epoch(network, loader, optimizer, batch_size, input_dim):\n",
" losses = []\n",
" network.train()\n",
@@ -412,7 +474,21 @@
" losses.append(loss.item())\n",
"\n",
" return np.mean(losses), outputs, targets"
]
],
"metadata": {
"collapsed": false
}
},
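The body of the training loop is truncated above; here is a hedged sketch of one MSE-based training epoch. The reshape to `(batch, seq_len, input_dim)` and the return value are assumptions about details the diff does not show:

```python
def train_epoch(network, loader, optimizer, batch_size, input_dim):
    losses = []
    network.train()
    for data, targets in loader:
        # Reshape the flat feature vector into (batch, seq_len, features) for the recurrent model.
        data = data.view(batch_size, -1, input_dim).to(device)
        targets = targets.to(device)

        optimizer.zero_grad()
        outputs = network(data)
        loss = F.mse_loss(outputs, targets)  # training loss is the mean squared error
        loss.backward()
        optimizer.step()

        losses.append(loss.item())

    return np.mean(losses)
```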
{
"cell_type": "markdown",
"source": [
"## Plot the test prediction\n",
"\n",
"We invert the scaler and plot the prediction and the target."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
@@ -465,6 +541,17 @@
"collapsed": false
}
},
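The plotting cell is truncated; a hedged sketch of inverting the target scaler and plotting predictions against ground truth for one parking lot. `inverse_transform` matches the sklearn scaler API; the function name, axis labels, and index handling are illustrative:

```python
def plot_test_prediction(outputs, targets, y_scaler, label_index=0):
    # Bring predictions and targets back to the original scale before plotting.
    preds = y_scaler.inverse_transform(outputs.detach().cpu().numpy())
    truth = y_scaler.inverse_transform(targets.detach().cpu().numpy())

    plt.figure(figsize=(12, 4))
    plt.plot(truth[:, label_index], label="target")
    plt.plot(preds[:, label_index], label="prediction")
    plt.xlabel("time step")
    plt.ylabel(parking_data_labels[label_index])
    plt.legend()
    plt.show()
```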
{
"cell_type": "markdown",
"source": [
"## Save the model and the scaler\n",
"\n",
"This can later be reused to predict real values. (see app.py)"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
@@ -483,20 +570,8 @@
},
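The implementation of `save_model_scaler` (called at the end of the training pipeline above) is not visible in this hunk; one plausible sketch persists the weights with `torch.save` and the fitted scaler with `joblib`. The file names and the choice of `joblib` are assumptions:

```python
import joblib  # assumption: any pickle-compatible serializer would work

def save_model_scaler(network, scaler, model_path="model.pt", scaler_path="scaler.pkl"):
    # Persist only the weights; the architecture is rebuilt from the sweep config at load time.
    torch.save(network.state_dict(), model_path)
    joblib.dump(scaler, scaler_path)
```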
{
"cell_type": "markdown",
"metadata": {
"id": "UIEd9-Gm5qMH"
},
"source": [
"The cell below will launch an `agent` that runs `train` 5 times,\n",
"usingly the randomly-generated hyperparameter values returned by the Sweep Controller. Execution takes under 5 minutes."
]
},
{
"cell_type": "markdown",
"source": [
"Now, we're ready to start sweeping! 🧹🧹🧹\n",
"\n",
"Sweep Controllers, like the one we made by running `wandb.sweep`, sit waiting for someone to ask them for a `config` to try out.\n",
"# Start the agent and run the sweep\n",
"\n",
"That someone is an `agent`, and they are created with `wandb.agent`.\n",
"To get going, the agent just needs to know\n",
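The agent call itself falls outside the visible part of the diff. From the surrounding text (an agent that runs `train` 5 times against the sweep created earlier), it plausibly looks like this:

```python
# Pull hyperparameter configs from the sweep controller and run train() five times.
wandb.agent(sweep_id, function=train, count=5)
```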
