
question about normalized MSE #8

@lukaemon

In the paper, section 2.1:

> We report a normalized version of all MSE numbers, where we divide by a baseline reconstruction error of always predicting the mean activations.

In the README example:

```python
normalized_mse = (reconstructed_activations - input_tensor).pow(2).sum(dim=1) / (input_tensor).pow(2).sum(dim=1)
```

Which is the same ratio as in loss.py (swapping .sum for .mean over dim=1 doesn't change it, since the 1/n_inputs factors cancel):

```python
def normalized_mean_squared_error(
    reconstruction: torch.Tensor,
    original_input: torch.Tensor,
) -> torch.Tensor:
    """
    :param reconstruction: output of Autoencoder.decode (shape: [batch, n_inputs])
    :param original_input: input of Autoencoder.encode (shape: [batch, n_inputs])
    :return: normalized mean squared error (shape: [1])
    """
    return (
        ((reconstruction - original_input) ** 2).mean(dim=1) / (original_input**2).mean(dim=1)
    ).mean()
```
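
For reference, this is how I'm calling it; the shapes below are just my guess from the docstring, and the import path assumes loss.py is importable as shown:

```python
import torch

from loss import normalized_mean_squared_error  # assuming loss.py is on the path

reconstruction = torch.randn(32, 128)   # stand-in for Autoencoder.decode output, [batch, n_inputs]
original_input = torch.randn(32, 128)   # stand-in for the Autoencoder.encode input, [batch, n_inputs]

nmse = normalized_mean_squared_error(reconstruction, original_input)
print(nmse.item())  # a single scalar, averaged over the batch
```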

The way I understand "normalized MSE" and "divide by a baseline reconstruction error of always predicting the mean activations" is:

```python
mean_activations = input_tensor.mean(dim=1, keepdim=True)  # keepdim so the subtraction broadcasts
baseline_mse = (input_tensor - mean_activations).pow(2).mean()
actual_mse = (reconstructed_activations - input_tensor).pow(2).mean()
normalized_mse = actual_mse / baseline_mse
```
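
For concreteness, here is a minimal sketch of how the two quantities diverge on random data (all tensor names and values are illustrative, not from the repo):

```python
import torch

torch.manual_seed(0)
input_tensor = torch.randn(4, 8) + 1.0                              # fake activations with nonzero mean
reconstructed_activations = input_tensor + 0.1 * torch.randn(4, 8)  # fake reconstruction

# loss.py / README version: normalize by the mean *squared* activation
repo_nmse = (
    ((reconstructed_activations - input_tensor) ** 2).mean(dim=1)
    / (input_tensor ** 2).mean(dim=1)
).mean()

# My reading of the paper: normalize by the error of always predicting the mean activations
mean_activations = input_tensor.mean(dim=1, keepdim=True)
baseline_mse = (input_tensor - mean_activations).pow(2).mean()
paper_nmse = ((reconstructed_activations - input_tensor) ** 2).mean() / baseline_mse

print(repo_nmse.item(), paper_nmse.item())  # these differ whenever the activations have nonzero mean
```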

What did I miss? Did I misunderstand the paper or the code? Thanks for your time.
