Borrowing Weights from a Pretrained Network
Domenic Curro edited this page Feb 23, 2016 · 5 revisions
Borrowing Weights
What Are Weights?
Weights are the learnable parameters of the network. They are tuned during training, using the gradients computed by backpropagation.
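As a rough illustration of what "tuned" means, the sketch below applies one plain gradient-descent update to a list of weights. It is a simplification and not Caffe's solver code: it assumes no momentum or weight decay, and the function name `sgd_step` and the sample numbers are hypothetical. The one Caffe-specific detail is that a layer's `lr_mult` scales the solver's base learning rate for that parameter.

```python
def sgd_step(weights, grads, base_lr, lr_mult=1.0):
    """Return the weights after one plain gradient-descent update.

    Simplified on purpose: no momentum and no weight decay.
    The effective learning rate is base_lr * lr_mult.
    """
    step = base_lr * lr_mult
    return [w - step * g for w, g in zip(weights, grads)]


# Hypothetical values, chosen only for illustration.
weights = [0.5, -0.25]
grads = [1.0, -2.0]  # gradients as backpropagation would supply them
updated = sgd_step(weights, grads, base_lr=0.01, lr_mult=1.0)
print(updated)
```

With `lr_mult: 2`, as is common for bias terms, the same gradient would move the parameter twice as far.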
Who Makes the Weights?
Caffe generates the weights when your network is constructed, initializing them according to each layer's fillers.
Borrowing Weights from a Pretrained Network
To borrow the weights of an already trained model, we need to do two things:
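- Rename our layer to match the name of the corresponding layer in the original model. Caffe assigns weights by layer name, so by reusing the original network's layer name we get its weights. The layer's parameter shapes must also match the original's; otherwise Caffe reports a shape mismatch when loading.
For example, let's say the original model has a layer named ip1; then we should name our layer ip1 too: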
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
- Train our new hybrid model, passing the location of the pretrained weights:
caffe train --solver ourSolver.prototxt --weights theirModel.caffemodel
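The name-matching rule can be sketched in a few lines of plain Python. This is a toy model of the behavior, not Caffe's implementation: networks are represented as dictionaries mapping a layer name to its parameter blobs (flat lists here), and the function name `copy_weights_by_name`, the layer name `ip2_new`, and all values are hypothetical.

```python
import copy


def copy_weights_by_name(our_net, pretrained):
    """Copy parameters from `pretrained` into `our_net` where layer names match.

    Both arguments map a layer name to a list of parameter blobs, each blob
    being a flat list of floats. Returns the names of the layers that
    received pretrained weights.
    """
    copied = []
    for name, blobs in pretrained.items():
        if name not in our_net:
            continue  # pretrained layer that our network does not use
        ours = our_net[name]
        if [len(b) for b in ours] != [len(b) for b in blobs]:
            raise ValueError("shape mismatch for layer %r" % name)
        our_net[name] = copy.deepcopy(blobs)
        copied.append(name)
    return copied


# Toy "pretrained" model: each layer has a weight blob and a bias blob.
their_model = {
    "ip1": [[0.3, -0.1, 0.7, 0.2], [0.05, 0.0]],
    "ip2": [[1.0, 2.0], [0.0]],
}
# Our model reuses the name "ip1" but gives its last layer a new name.
our_model = {
    "ip1": [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0]],
    "ip2_new": [[0.01, -0.02, 0.03], [0.0]],
}

copied = copy_weights_by_name(our_model, their_model)
print(copied)
```

Only `ip1` is overwritten. `ip2_new` has no counterpart in the pretrained model, so it keeps its own initialization, which is why renaming a layer is the usual way to opt it out of borrowing.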
What About the Other Layers of Our Network?
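Layers whose names do not match any layer in the pretrained model are initialized like any other brand-new layer, according to their weight_filler and bias_filler settings (usually values at or near zero).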
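To give a sense of the scale of such initial values, here is a minimal sketch of the two fillers used in the ip1 example, written in plain Python. It assumes the "xavier" filler's default behavior (uniform samples in [-sqrt(3 / fan_in), +sqrt(3 / fan_in)]) and a "constant" filler value of 0. The fan-in of 800 is an assumed input size chosen for illustration, and the function names are hypothetical.

```python
import math
import random


def xavier_fill(num_output, fan_in, rng):
    """Sample a num_output x fan_in weight matrix uniformly in [-s, s],
    with s = sqrt(3 / fan_in), mirroring the default 'xavier' filler."""
    scale = math.sqrt(3.0 / fan_in)
    return [[rng.uniform(-scale, scale) for _ in range(fan_in)]
            for _ in range(num_output)]


def constant_fill(num_output, value=0.0):
    """Fill a bias vector with one constant value (0 by default)."""
    return [value] * num_output


rng = random.Random(0)  # fixed seed so the sketch is repeatable
fan_in = 800            # assumed number of inputs to the layer
weights = xavier_fill(num_output=500, fan_in=fan_in, rng=rng)
biases = constant_fill(500)

bound = math.sqrt(3.0 / fan_in)
largest = max(abs(w) for row in weights for w in row)
print("bound: %.4f, largest |weight|: %.4f" % (bound, largest))
```

With a fan-in of 800 the bound is roughly 0.06, so the weights start small but not exactly at zero, while the biases start at exactly zero.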