Merged

Changes from all commits (24 commits)
bd39227
WIP: add callbacks scheme before and after loss step
Feb 4, 2022
f8f9a93
DOC: add TODO
Feb 4, 2022
18a61d9
ENH: add 'predict' method to Trainer
Feb 18, 2022
36917c4
BUG: fix minor error in predict
Feb 18, 2022
547360b
ENH, CLN: use black. Improve predict method. Fix bugs in preprocessing.
Xylambda Mar 20, 2022
17eb4f1
CLN, BUG: fix bug in LR scheduler callback. Fix flake8 complaints.
Mar 29, 2022
fececb3
CLN: use isort
Mar 29, 2022
bd8b7c7
CLN: minor code cleaning
Mar 30, 2022
45a0004
BUG: fix bug on synchronization through cuda
Xylambda Mar 30, 2022
ad014ae
DOC: update doc string for 'tabular_to_sliding_dataset'
Mar 31, 2022
aa7f1b1
ENH: allow multiple inputs for the model
Apr 1, 2022
cfe7bbe
ENH: add back '__predict_loader'. Remove some prediction callbacks
Apr 4, 2022
4bf61a6
BLD: update NumPy requirements
Apr 4, 2022
9794f32
CLN: remove Pandas
Apr 4, 2022
3c0cad8
ENH: add missing __repr__. Add method to return history dict in traine…
Xylambda Apr 19, 2022
138c8a0
ENH: add torch distribution example
May 9, 2022
af79e50
ENH, CLN: add loss update after loss step
Jun 3, 2022
140b8c3
ENH: store only the number of each batch loss
Jun 3, 2022
63eb47a
CLN: use isort
Jun 6, 2022
e76d684
ENH: regularization as callbacks. Remove regularization module
Xylambda Jul 31, 2022
10401db
Merge branch 'enh/lib_enh' of https://github.com/Xylambda/torchfitter…
Xylambda Jul 31, 2022
27d0f8d
BLD: update requirements to avoid conflicts
Xylambda Jul 31, 2022
c4a161e
TST: update utils tests
Xylambda Aug 8, 2022
dfcf15a
CLN: update CHANGELOG
Xylambda Aug 8, 2022
27 changes: 27 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,30 @@ All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

## [4.2.0] - 2022-08-08

### Added
- Add `torch.distributions` example, with code taken from [Romain Strock](https://romainstrock.com/blog/modeling-uncertainty-with-pytorch.html).
- Add `predict` method to `Trainer` (see the sketch below). #38
- Add functions to freeze and unfreeze the model. #43
- Add function to transform tabular datasets into sliding-window time series datasets.
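
A minimal usage sketch for `predict`, mirroring the updated `examples/regression.py` in this PR (`X_val` is a NumPy feature array; `as_array=True` returns a NumPy array):

```python
# after trainer.fit(train_loader, val_loader, epochs=n_epochs)
y_pred = trainer.predict(X_val, as_array=True)
```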

### Fixed
- Metrics are now moved to the execution device. #41
- Log level is now used in the Trainer. #40
- `LearningRateScheduler` no longer crashes in the first epoch when `on_train` is False. #36

### Changed
- Make regularization part of the callbacks system. #37
- Divide utils into three submodules: `convenience`, `preprocessing` and `data` (import paths sketched below).
- Update requirements to avoid conflicts.
- Update some tests.
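
The import paths after the split, as exercised by the updated examples in this PR (`convenience` does not appear in this diff):

```python
from torchfitter.utils.data import DataWrapper
from torchfitter.utils.preprocessing import train_test_val_split, torch_to_numpy
```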

### Removed
- Remove old regularization module and all related code.


## [4.1.2] - 2021-12-24

### Fixed
@@ -88,6 +112,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Update tests with new testing methods.
- Make some methods on Trainer and Manager private.


## [3.0.0] - 2021-07-27

### Fixed
@@ -113,6 +138,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Add testing utility to check gradients: `compute_forward_gradient`.
- Add more functions to `utils`: `FastTensorDataLoader`, `check_model_on_cuda`.


## [2.0.2] - 2021-05-10

### Fixed
@@ -126,6 +152,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Change `_validate` in favour of `validation_step`.
- Update tests to be correct.


## [2.0.1] - 2021-04-29

### Added
39 changes: 1 addition & 38 deletions README.md
@@ -21,17 +21,14 @@ The library also provides a callbacks API that can be used to interact with
the model during the training process, as well as a set of basic regularization
procedures.

Additionally, you will find the `Manager` class which allows you to run
multiple experiments for different random seeds.

## Installation
**Normal user**
```bash
pip install torchfitter
```

This library does not ship CUDA or XLA. Follow the
[official PyTorch documentarion](https://pytorch.org/get-started/locally/) for
[official PyTorch documentation](https://pytorch.org/get-started/locally/) for
more information about how to install CUDA binaries.

**Developer**
@@ -130,40 +127,6 @@ trainer = Trainer(
)
```


## Regularization
`TorchFitter` includes regularization algorithms but you can also create your
own procedures. To create your own algorithm:
1. Inherit from `RegularizerBase` and call the `super` operator appropriately.
2. Implement the procedure in the `compute_penalty` method.

Here's an example implementing L1 from scratch:

```python
import torch
from torchfitter.regularization.base import RegularizerBase


class L1Regularization(RegularizerBase):
def __init__(self, regularization_rate, biases=False):
super(L1Regularization, self).__init__(regularization_rate, biases)

def compute_penalty(self, named_parameters, device):
# Initialize with tensor, cannot be scalar
penalty_term = torch.zeros(1, 1, requires_grad=True).to(device)

for name, param in named_parameters:
if not self.biases and name.endswith("bias"):
pass
else:
penalty_term = penalty_term + param.norm(p=1)

return self.rate * penalty_term
```

Notice how the `penalty_term` is moved to the given `device`. This is necessary
to avoid operations between tensors stored on different devices.
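
With this PR, the same behaviour is expressed through the callbacks system instead; a minimal sketch, mirroring the updated `examples/regression.py` in this diff:

```python
from torchfitter.callbacks import L1Regularization

# regularization is now just another callback handed to the Trainer
callbacks = [L1Regularization(regularization_rate=0.01, biases=False)]
```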

## Callbacks
Callbacks allow you to interact with the model during the fitting process. They
provide different methods that are called at different stages. To create a
30 changes: 12 additions & 18 deletions examples/trainer.py → examples/regression.py
@@ -14,11 +14,11 @@
from torchfitter.utils.data import DataWrapper
from torchfitter.conventions import ParamsDict
from sklearn.model_selection import train_test_split
from torchfitter.regularization import L1Regularization
from torchfitter.callbacks import (
EarlyStopping,
RichProgressBar,
StochasticWeightAveraging,
L1Regularization
)

# -----------------------------------------------------------------------------
@@ -29,12 +29,19 @@


def main():
# -------------------------------------------------------------------------
# argument parsing
parser = argparse.ArgumentParser("")
parser.add_argument("--epochs", type=int, default=5000)

args = parser.parse_args()
n_epochs = args.epochs

# -------------------------------------------------------------------------
X = np.load(DATA_PATH / "features.npy")
y = np.load(DATA_PATH / "labels.npy")
y = y.reshape(-1, 1)


# simplest case of cross-validation
X_train, X_val, y_train, y_val = train_test_split(
X, y, test_size=0.33, random_state=42
@@ -43,7 +50,6 @@ def main():
# -------------------------------------------------------------------------
model = nn.Linear(in_features=1, out_features=1)

regularizer = L1Regularization(regularization_rate=0.01, biases=False)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.005)

@@ -58,6 +64,7 @@
EarlyStopping(patience=100, load_best=True),
swa_callback,
RichProgressBar(display_step=100, log_lr=False),
L1Regularization(regularization_rate=0.01, biases=False)
]

metrics = [
@@ -80,27 +87,14 @@
model=model,
criterion=criterion,
optimizer=optimizer,
regularizer=regularizer,
callbacks=callbacks,
metrics=metrics,
)

# -------------------------------------------------------------------------
# argument parsing
parser = argparse.ArgumentParser("")
parser.add_argument("--epochs", type=int, default=5000)

args = parser.parse_args()
n_epochs = args.epochs

# -------------------------------------------------------------------------
# fitting process
# fitting process and predictions
history = trainer.fit(train_loader, val_loader, epochs=n_epochs)

# predictions
with torch.no_grad():
to_predict = torch.from_numpy(X_val).float()
y_pred = model(to_predict).cpu().numpy()
y_pred = trainer.predict(X_val, as_array=True)

# -------------------------------------------------------------------------
# plot predictions, losses and learning rate
194 changes: 194 additions & 0 deletions examples/torchdist.py
@@ -0,0 +1,194 @@
"""
In this example, a regression model with the ability to predict a mean and
standard deviation is created and trained using torchfitter.

By predicting a mean and a standard deviation, one can define an uncertainty
interval around the predictions (i.e. how sure is my model about the
prediction of this sample?).
"""

import torch
import argparse
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
from torchfitter.conventions import ParamsDict
from sklearn.datasets import make_regression
from torchfitter.utils.preprocessing import train_test_val_split, torch_to_numpy
from torchfitter.trainer import Trainer
from torch.utils.data import DataLoader
from torchfitter.utils.data import DataWrapper
from torchfitter.callbacks import RichProgressBar, EarlyStopping


class DeepNormal(nn.Module):
"""Neural network with parametrizable normal distribution as output.

Taken from [1].

References
----------
.. [1] Romain Strock - Modeling uncertainty with Pytorch:
https://romainstrock.com/blog/modeling-uncertainty-with-pytorch.html
"""
def __init__(self, n_inputs, n_hidden):
super().__init__()

# Shared parameters
self.shared_layer = nn.Sequential(
nn.Linear(n_inputs, n_hidden),
nn.ReLU(),
nn.Dropout(),
)

# Mean parameters
self.mean_layer = nn.Sequential(
nn.Linear(n_hidden, n_hidden),
nn.ReLU(),
nn.Dropout(),
nn.Linear(n_hidden, 1),
)

# Standard deviation parameters
self.std_layer = nn.Sequential(
nn.Linear(n_hidden, n_hidden),
nn.ReLU(),
nn.Dropout(),
nn.Linear(n_hidden, 1),
nn.Softplus(), # enforces positivity
)

def forward(self, x):
# Shared embedding
shared = self.shared_layer(x)

# Parametrization of the mean
mean = self.mean_layer(shared)

# Parametrization of the standard deviation
std = self.std_layer(shared)

return torch.distributions.Normal(mean, std)


class NLLLoss(nn.Module):
def __init__(self):
super().__init__()

def forward(self, output, target):
"""
Assumes `output` is a distribution.
"""
neg_log_likelihood = -output.log_prob(target)
return torch.mean(neg_log_likelihood)


def main():
# -------------------------------------------------------------------------
# argument parsing
parser = argparse.ArgumentParser("")
parser.add_argument("--epochs", type=int, default=5000)

args = parser.parse_args()
n_epochs = args.epochs

# -------------------------------------------------------------------------
# generate dummy data
X, y = make_regression(
n_samples=5000, n_features=1, n_informative=1, noise=5, random_state=0
)
    y = y.reshape(-1, 1)

# split data into train, test and validation
_tup = train_test_val_split(X, y)
X_train, y_train, X_val, y_val, X_test, y_test = _tup

# wrap data in Dataset
train_wrapper = DataWrapper(
X_train, y_train, dtype_X="float", dtype_y="float"
)
val_wrapper = DataWrapper(X_val, y_val, dtype_X="float", dtype_y="float")

# torch Loaders
train_loader = DataLoader(train_wrapper, batch_size=64, pin_memory=True)
val_loader = DataLoader(val_wrapper, batch_size=64, pin_memory=True)

# -------------------------------------------------------------------------
# define model, optimizer and loss
criterion = NLLLoss()
model = DeepNormal(n_inputs=X.shape[1], n_hidden=15)
optimizer = optim.AdamW(model.parameters(), lr=1e-3)

# callbacks list
callbacks = [
EarlyStopping(patience=150, load_best=True),
RichProgressBar(display_step=50)
]

# instantiate Trainer object with all the configuration
trainer = Trainer(
model=model,
criterion=criterion,
optimizer=optimizer,
callbacks=callbacks,
)

# train process
history = trainer.fit(train_loader, val_loader, epochs=n_epochs)

# -------------------------------------------------------------------------
# this is a torch distribution
distr_prediction = trainer.predict(X_test)

# get mean and standard deviation for each sample in test
y_pred = distr_prediction.mean
y_pred_std = distr_prediction.stddev

# to array
y_pred = torch_to_numpy(y_pred)
y_pred_std = torch_to_numpy(y_pred_std)

# -------------------------------------------------------------------------
# plot losses, mean predictions and lr
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(19, 4))
epoch_hist = history[ParamsDict.EPOCH_HISTORY]

ax[0].plot(epoch_hist[ParamsDict.LOSS]["train"], label="Train loss")
ax[0].plot(
epoch_hist[ParamsDict.LOSS]["validation"], label="Validation loss"
)
ax[0].set_title("Train and validation losses")
ax[0].grid()
ax[0].legend()

ax[1].plot(X_test, y_test, ".", label="Real")
ax[1].plot(X_test, y_pred, ".", label="Prediction")
ax[1].set_title("Predictions")
ax[1].grid()
ax[1].legend()

ax[2].plot(epoch_hist[ParamsDict.HISTORY_LR], label="Learning rate")
ax[2].set_title("Learning Rate")
ax[2].legend()
ax[2].grid()
plt.show()

# -------------------------------------------------------------------------
    # create upper and lower bounds: ±2 std ≈ a 95% interval for a normal
lower = y_pred - 2 * y_pred_std
upper = y_pred + 2 * y_pred_std

fig, ax = plt.subplots(1, 1, figsize=(15,8))

    ax.plot(X_test, y_test, "*k", label="real")
    ax.scatter(X_test.flatten(), y_pred, label="predicted means")

    ax.scatter(X_test.flatten(), lower, label="lower bound (-2 std)")
    ax.scatter(X_test.flatten(), upper, label="upper bound (+2 std)")

    ax.grid(True)
    ax.legend()
    plt.show()


if __name__ == "__main__":
main()
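
To run the example (the `--epochs` flag defaults to 5000, as defined in `main`):

```bash
python examples/torchdist.py --epochs 2000
```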