In Section 2.1 of the paper:
> We report a normalized version of all MSE numbers, where we divide by a baseline reconstruction error of always predicting the mean activations.
In the README example:

```python
normalized_mse = (reconstructed_activations - input_tensor).pow(2).sum(dim=1) / (input_tensor).pow(2).sum(dim=1)
```
This is equivalent to `normalized_mean_squared_error` in loss.py (per example, the `1/n_inputs` factors cancel in the ratio):
```python
def normalized_mean_squared_error(
    reconstruction: torch.Tensor,
    original_input: torch.Tensor,
) -> torch.Tensor:
    """
    :param reconstruction: output of Autoencoder.decode (shape: [batch, n_inputs])
    :param original_input: input of Autoencoder.encode (shape: [batch, n_inputs])
    :return: normalized mean squared error (shape: [1])
    """
    return (
        ((reconstruction - original_input) ** 2).mean(dim=1) / (original_input**2).mean(dim=1)
    ).mean()
```
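Here is a quick sanity check (my own sketch, variable names are just illustrative) confirming the two expressions agree per example, with loss.py additionally averaging over the batch:

```python
import torch

torch.manual_seed(0)
input_tensor = torch.randn(4, 8)  # [batch, n_inputs]
reconstructed_activations = input_tensor + 0.1 * torch.randn(4, 8)

# README: per-example ratio of summed squared errors
readme_ratio = (reconstructed_activations - input_tensor).pow(2).sum(dim=1) / (input_tensor).pow(2).sum(dim=1)
# loss.py: per-example ratio of mean squared errors (same values, 1/n_inputs cancels)
mean_based_ratio = ((reconstructed_activations - input_tensor) ** 2).mean(dim=1) / (input_tensor**2).mean(dim=1)

assert torch.allclose(readme_ratio, mean_based_ratio)  # identical per-example values
print(mean_based_ratio.mean())  # loss.py then averages over the batch
```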
The way I understand "normalized MSE" and "divide by a baseline reconstruction error of always predicting the mean activations" is:
```python
# baseline: always predict each feature's mean activation (mean over the batch)
mean_activations = input_tensor.mean(dim=0)
baseline_mse = (input_tensor - mean_activations).pow(2).mean()
actual_mse = (reconstructed_activations - input_tensor).pow(2).mean()
normalized_mse = actual_mse / baseline_mse
```
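To make the discrepancy concrete (again my own sketch, not from the repo): the loss.py denominator is the raw second moment of the activations, while a baseline that always predicts the mean would give the variance; the two coincide only when the activations have zero mean:

```python
import torch

torch.manual_seed(0)
input_tensor = torch.randn(1024, 8) + 3.0  # activations with a nonzero mean

second_moment = input_tensor.pow(2).mean()  # what loss.py divides by
variance = (input_tensor - input_tensor.mean(dim=0)).pow(2).mean()  # MSE of predicting the mean

print(second_moment, variance)  # differ by roughly mean(x)^2; equal only if mean(x) == 0
```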
What did I miss? Did I misunderstand the paper or the code? Thanks for your time.