Load pre-trained model from original MobileNet #17
Hi Alessandro,
Thank you so much for the amazing work. I have changed the official PyTorch MobileNet-V2 code to a quantized version by referring to your MobileNet-V1 code. Is there any possibility that I could load the pre-trained model that PyTorch officially released (the original float MobileNet-V2) into my quantized MobileNet-V2 so as to fine-tune from it?
Thank you so much!
Best,
Tracy

Comments
Hi Tracy, yes, it can be done with some work. Alessandro
Hi @alessandro, Thank you so much for such a detailed answer. (This is my other account, by the way.) I did as you said above. One problem: I set the env flag BREVITAS_IGNORE_MISSING_KEYS=1, and loading works fine when I set the quantization type to QuantType.FP. But when I set the quantization type to QuantType.INT, loading the pre-trained model complains about missing quantization-related keys. Do you have any ideas about this? Thank you so much! Best,
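For reference, a minimal sketch of the loading step being discussed, assuming BREVITAS_IGNORE_MISSING_KEYS is read by brevitas.config at import time; quant_mobilenet_v2 and the checkpoint filename are hypothetical placeholders for the user's quantized model and the official float weights:

```python
import os
# Assumption: the flag must be set before brevitas is imported, since
# brevitas.config reads the environment variable at import time.
os.environ["BREVITAS_IGNORE_MISSING_KEYS"] = "1"

import torch
import brevitas.config as config
assert config.IGNORE_MISSING_KEYS  # the sanity check suggested below

model = quant_mobilenet_v2()  # hypothetical: the user's quantized MobileNet-V2
float_state_dict = torch.load("mobilenet_v2.pth", map_location="cpu")
# With the flag set, quantization-specific keys missing from the float
# checkpoint should be ignored rather than raising an error.
model.load_state_dict(float_state_dict)
```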
Hi Tracy, Can you please double check that, when you set the env flag BREVITAS_IGNORE_MISSING_KEYS=1, the following prints True?

```python
import brevitas.config as config
print(config.IGNORE_MISSING_KEYS)
```

That's the variable that reads the env setting, so it should be True; otherwise, it means the env variable is not being set properly. Regarding dropout, I don't have a pre-made layer, but something like this should work (off the top of my head, haven't tested it). QuantTensor is just a tuple, so you can simply unpack it, pass the plain tensor through the forward function, and then pack the output back into a QuantTensor:

```python
import torch.nn as nn
from brevitas.quant_tensor import QuantTensor

class QuantDropout(nn.Dropout):
    def forward(self, input_quant_tensor):
        # Unpack the QuantTensor tuple, apply dropout to the raw tensor,
        # then repack the result with the original scale and bit width.
        inp, scale, bit_width = input_quant_tensor
        output = super(QuantDropout, self).forward(inp)
        output_quant_tensor = QuantTensor(tensor=output, scale=scale, bit_width=bit_width)
        return output_quant_tensor
```

You can take the same approach for nn.MaxPool2d too (see the sketch below). Alessandro
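Following that suggestion, here is a minimal untested sketch of the same wrapper pattern applied to nn.MaxPool2d, under the same assumptions as the QuantDropout example above:

```python
import torch.nn as nn
from brevitas.quant_tensor import QuantTensor

class QuantMaxPool2d(nn.MaxPool2d):
    def forward(self, input_quant_tensor):
        # Unpack, pool the raw tensor, then repack; max pooling only selects
        # existing quantized values, so scale and bit width are unchanged.
        inp, scale, bit_width = input_quant_tensor
        output = super(QuantMaxPool2d, self).forward(inp)
        return QuantTensor(tensor=output, scale=scale, bit_width=bit_width)
```

Constructor arguments (e.g. QuantMaxPool2d(kernel_size=2)) are inherited from nn.MaxPool2d unchanged.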
Hi Alessandro, Thanks for your quick reply. Now I can train it successfully, but I have one more issue. When I set the quantization flag to FP, the model trains fine: while training, `free -m` shows memory usage increasing slowly, and the training process proceeds normally. But when I set the quantization flag to INT, `free -m` shows memory usage increasing much faster until it runs out of CPU memory, and the training process gets stuck. Do you have any ideas about this issue? Best,
Hi Tracy, Quantization-aware training is expensive compute- and memory-wise. The idea is always that you are trading off increased training cost for reduced inference cost. Good luck with your training. Alessandro