I tried to implement SqueezeNet as a torchvision model and train it with the ImageNet example, and found that it doesn't converge as is. The reference code differs in two aspects:
- All convolutions except the last are initialized with the Xavier (Glorot) initializer; the last is initialized from a normal distribution with stddev 0.01.
- The learning rate is decreased linearly (a polynomial schedule with power=1).
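For reference, here is a minimal sketch of how both aspects could be expressed in PyTorch. The `init_squeezenet` helper and the use of `LambdaLR` for the linear decay are my own illustration, not the reference implementation itself:

```python
import torch
import torch.nn as nn

def init_squeezenet(model, final_conv):
    """Hypothetical helper: Xavier init for every conv except the final
    classifier conv, which is drawn from N(0, 0.01)."""
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            if m is final_conv:
                nn.init.normal_(m.weight, mean=0.0, std=0.01)
            else:
                nn.init.xavier_uniform_(m.weight)
            if m.bias is not None:
                nn.init.zeros_(m.bias)

# Toy stand-in for the network; the real model has Fire modules.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 10, 1))
init_squeezenet(model, final_conv=model[2])

# Linear decay (polynomial schedule with power=1) over `total_epochs`,
# expressed as a LambdaLR multiplier on the base learning rate.
total_epochs = 10
optimizer = torch.optim.SGD(model.parameters(), lr=0.04)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda epoch: 1.0 - epoch / total_epochs)
```

Calling `scheduler.step()` once per epoch then drives the learning rate from the base value down to zero by the final epoch.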
In PyTorch these aspects are hard-coded inside the ImageNet example, but I think it makes sense to make them part of the model definition in torchvision. What's your position on this?