In Section 2.1 of the paper:
> We report a normalized version of all MSE numbers, where we divide by a baseline reconstruction error of always predicting the mean activations.
In the README example:

```python
normalized_mse = (reconstructed_activations - input_tensor).pow(2).sum(dim=1) / (input_tensor).pow(2).sum(dim=1)
```
This is equivalent to `normalized_mean_squared_error` in loss.py (per example, the `1/n_inputs` factors cancel in the ratio):
```python
def normalized_mean_squared_error(
    reconstruction: torch.Tensor,
    original_input: torch.Tensor,
) -> torch.Tensor:
    """
    :param reconstruction: output of Autoencoder.decode (shape: [batch, n_inputs])
    :param original_input: input of Autoencoder.encode (shape: [batch, n_inputs])
    :return: normalized mean squared error (shape: [1])
    """
    return (
        ((reconstruction - original_input) ** 2).mean(dim=1) / (original_input**2).mean(dim=1)
    ).mean()
```
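Here is a quick sanity check (my own sketch, variable names are just illustrative) confirming the two expressions agree per example, with loss.py additionally averaging over the batch:

```python
import torch

torch.manual_seed(0)
input_tensor = torch.randn(4, 8)  # [batch, n_inputs]
reconstructed_activations = input_tensor + 0.1 * torch.randn(4, 8)

# README: per-example ratio of summed squared errors
readme_ratio = (reconstructed_activations - input_tensor).pow(2).sum(dim=1) / (input_tensor).pow(2).sum(dim=1)
# loss.py: per-example ratio of mean squared errors (same values, 1/n_inputs cancels)
mean_based_ratio = ((reconstructed_activations - input_tensor) ** 2).mean(dim=1) / (input_tensor**2).mean(dim=1)

assert torch.allclose(readme_ratio, mean_based_ratio)  # identical per-example values
print(mean_based_ratio.mean())  # loss.py then averages over the batch
```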
The way I understand "normalized MSE" and "divide by a baseline reconstruction error of always predicting the mean activations" is:
```python
# baseline: always predict each feature's mean activation (mean over the batch)
mean_activations = input_tensor.mean(dim=0)
baseline_mse = (input_tensor - mean_activations).pow(2).mean()
actual_mse = (reconstructed_activations - input_tensor).pow(2).mean()
normalized_mse = actual_mse / baseline_mse
```
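To make the discrepancy concrete (again my own sketch, not from the repo): the loss.py denominator is the raw second moment of the activations, while a baseline that always predicts the mean would give the variance; the two coincide only when the activations have zero mean:

```python
import torch

torch.manual_seed(0)
input_tensor = torch.randn(1024, 8) + 3.0  # activations with a nonzero mean

second_moment = input_tensor.pow(2).mean()  # what loss.py divides by
variance = (input_tensor - input_tensor.mean(dim=0)).pow(2).mean()  # MSE of predicting the mean

print(second_moment, variance)  # differ by roughly mean(x)^2; equal only if mean(x) == 0
```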
What did I miss? Did I misunderstand the paper or the code? Thanks for your time.