
The number of parameters is doubled #89

Open · sansiro77 opened this issue Jul 1, 2021 · 5 comments

sansiro77 (Contributor) commented Jul 1, 2021

Here is the simplest example.

from blitz.modules import BayesianLinear  # assuming the layer comes from the blitz package

fc1 = BayesianLinear(1, 1)
print(list(fc1.parameters()))
pytorch_total_params = sum(p.numel() for p in fc1.parameters() if p.requires_grad)
print("total parameters:", pytorch_total_params)

The output is:

[Parameter containing:
tensor([[0.0651]], requires_grad=True), Parameter containing:
tensor([[-7.1001]], requires_grad=True), Parameter containing:
tensor([-0.0429], requires_grad=True), Parameter containing:
tensor([-6.9712], requires_grad=True), Parameter containing:
tensor([[0.0651]], requires_grad=True), Parameter containing:
tensor([[-7.1001]], requires_grad=True), Parameter containing:
tensor([-0.0429], requires_grad=True), Parameter containing:
tensor([-6.9712], requires_grad=True)]
total parameters: 8

The parameters are fc1.weight_mu, fc1.weight_rho, fc1.bias_mu, fc1.bias_rho, fc1.weight_sampler.mu, fc1.weight_sampler.rho, fc1.bias_sampler.mu, fc1.bias_sampler.rho, respectively, which is double what is expected.
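The duplication is easier to see by parameter name; a quick check under the same setup as above (named_parameters is standard PyTorch, nothing library-specific):

for name, p in fc1.named_parameters():
    print(name, tuple(p.shape))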

sansiro77 (Contributor, Author) commented Jul 1, 2021

My current solution is:

count = 0
for name, param in net.named_parameters():
    if ("sampler" not in name) and param.requires_grad:
        count += param.numel()
print(count)
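The same workaround can be packaged as a small helper; a minimal sketch, assuming the only duplicated entries are the ones whose names contain "sampler":

def count_trainable_params(net):
    # Count trainable parameters, skipping the duplicated "sampler" copies.
    return sum(p.numel() for name, p in net.named_parameters()
               if "sampler" not in name and p.requires_grad)

For the BayesianLinear(1, 1) example above this reports 4 (weight_mu, weight_rho, bias_mu, bias_rho) instead of 8.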

Philippe-Drolet commented

I am also wondering what these parameters mean, respectively. Thank you.

sansiro77 (Contributor, Author) commented

> I am also wondering what these parameters mean, respectively. Thank you.

In Bayesian neural networks, each parameter ("weight" and "bias") is a random variable that follows a distribution, which here is Gaussian.
"mu" is the mean, and "rho" parameterizes the standard deviation sigma via self.sigma = torch.log1p(torch.exp(self.rho)) (a softplus).
A "sampler" draws a concrete value from that distribution every time the model runs a forward pass.

Philippe-Drolet commented

Thanks for the reply! I knew that, but what I mean is: from what I have seen with BNNs in general, the weight distribution is rarely a perfect normal distribution centered at mu with scale sigma (it is usually more of a Gaussian mixture), yet here every weight distribution I obtain looks exactly like that. Is it variational inference that always gives perfect normal distributions?

sansiro77 (Contributor, Author) commented

In the paper "Weight Uncertainty in Neural Networks", the authors use a Gaussian variational posterior together with a scale mixture prior. The per-weight posterior is therefore a single Gaussian by construction; the mixture appears only in the prior.
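For reference, a minimal sketch of such a scale mixture prior as described in Blundell et al., "Weight Uncertainty in Neural Networks" (2015); the hyperparameter values below are placeholders, not the ones used in the paper or in this library:

import torch
from torch.distributions import Normal

# Placeholder hyperparameters for the two-component Gaussian scale mixture prior,
# P(w) = pi * N(w | 0, sigma1^2) + (1 - pi) * N(w | 0, sigma2^2)
pi, sigma1, sigma2 = 0.5, 1.0, 0.0025

def scale_mixture_log_prior(w):
    # Log-density of the mixture prior, summed over all entries of w.
    p1 = Normal(0.0, sigma1).log_prob(w).exp()
    p2 = Normal(0.0, sigma2).log_prob(w).exp()
    return torch.log(pi * p1 + (1 - pi) * p2).sum()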
