Rescale the network #9
Thank you! I have experimented with this before. When I tried just halving the resolution, it stopped producing useful output. If you need more speed, you can try using the new 30 point model (lines 666 to 680 in 46e26f5).
@emilianavt Thanks for your reply, I will also try the 56x56 network.

```python
from model import *

PATH = "./weights/lm_model0.pth"
model = OpenSeeFaceLandmarks("small", 0.5, True)
model.load_state_dict(torch.load(PATH))
model.eval()
```

This throws an error. My environment is Python 3.7, PyTorch 1.6.
Alright. The issue with model.py is caused by an update of geffnet. I fixed the problem by adding the following to model.py:

```python
from geffnet.efficientnet_builder import *
from geffnet.config import layer_config_kwargs
from geffnet.activations import get_act_fn, get_act_layer

def _gen_mobilenet_v3(variant, channel_multiplier=1.0, pretrained=False, **kwargs):
    """Creates a MobileNet-V3 large/small/minimal model.

    Ref impl: https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet_v3.py
    Paper: https://arxiv.org/abs/1905.02244

    Args:
      channel_multiplier: multiplier to number of channels per layer.
    """
    if 'small' in variant:
        num_features = 1024
        if 'minimal' in variant:
            act_layer = 'relu'
            arch_def = [
                # stage 0, 112x112 in
                ['ds_r1_k3_s2_e1_c16'],
                # stage 1, 56x56 in
                ['ir_r1_k3_s2_e4.5_c24', 'ir_r1_k3_s1_e3.67_c24'],
                # stage 2, 28x28 in
                ['ir_r1_k3_s2_e4_c40', 'ir_r2_k3_s1_e6_c40'],
                # stage 3, 14x14 in
                ['ir_r2_k3_s1_e3_c48'],
                # stage 4, 14x14 in
                ['ir_r3_k3_s2_e6_c96'],
                # stage 6, 7x7 in
                ['cn_r1_k1_s1_c576'],
            ]
        else:
            act_layer = 'hard_swish'
            arch_def = [
                # stage 0, 112x112 in
                ['ds_r1_k3_s2_e1_c16_se0.25_nre'],  # relu
                # stage 1, 56x56 in
                ['ir_r1_k3_s2_e4.5_c24_nre', 'ir_r1_k3_s1_e3.67_c24_nre'],  # relu
                # stage 2, 28x28 in
                ['ir_r1_k5_s2_e4_c40_se0.25', 'ir_r2_k5_s1_e6_c40_se0.25'],  # hard-swish
                # stage 3, 14x14 in
                ['ir_r2_k5_s1_e3_c48_se0.25'],  # hard-swish
                # stage 4, 14x14 in
                ['ir_r3_k5_s2_e6_c96_se0.25'],  # hard-swish
                # stage 6, 7x7 in
                ['cn_r1_k1_s1_c576'],  # hard-swish
            ]
    else:
        num_features = 1280
        if 'minimal' in variant:
            act_layer = 'relu'
            arch_def = [
                # stage 0, 112x112 in
                ['ds_r1_k3_s1_e1_c16'],
                # stage 1, 112x112 in
                ['ir_r1_k3_s2_e4_c24', 'ir_r1_k3_s1_e3_c24'],
                # stage 2, 56x56 in
                ['ir_r3_k3_s2_e3_c40'],
                # stage 3, 28x28 in
                ['ir_r1_k3_s2_e6_c80', 'ir_r1_k3_s1_e2.5_c80', 'ir_r2_k3_s1_e2.3_c80'],
                # stage 4, 14x14 in
                ['ir_r2_k3_s1_e6_c112'],
                # stage 5, 14x14 in
                ['ir_r3_k3_s2_e6_c160'],
                # stage 6, 7x7 in
                ['cn_r1_k1_s1_c960'],
            ]
        else:
            act_layer = 'hard_swish'
            arch_def = [
                # stage 0, 112x112 in
                ['ds_r1_k3_s1_e1_c16_nre'],  # relu
                # stage 1, 112x112 in
                ['ir_r1_k3_s2_e4_c24_nre', 'ir_r1_k3_s1_e3_c24_nre'],  # relu
                # stage 2, 56x56 in
                ['ir_r3_k5_s2_e3_c40_se0.25_nre'],  # relu
                # stage 3, 28x28 in
                ['ir_r1_k3_s2_e6_c80', 'ir_r1_k3_s1_e2.5_c80', 'ir_r2_k3_s1_e2.3_c80'],  # hard-swish
                # stage 4, 14x14 in
                ['ir_r2_k3_s1_e6_c112_se0.25'],  # hard-swish
                # stage 5, 14x14 in
                ['ir_r3_k5_s2_e6_c160_se0.25'],  # hard-swish
                # stage 6, 7x7 in
                ['cn_r1_k1_s1_c960'],  # hard-swish
            ]
    with layer_config_kwargs(kwargs):
        model_kwargs = dict(
            block_args=decode_arch_def(arch_def),
            num_features=num_features,
            stem_size=16,
            channel_multiplier=channel_multiplier,
            act_layer=resolve_act_layer(kwargs, act_layer),
            se_kwargs=dict(
                act_layer=get_act_layer('relu'), gate_fn=get_act_fn('hard_sigmoid'),
                reduce_mid=True, divisor=8),
            norm_kwargs=resolve_bn_args(kwargs),
            **kwargs,
        )
    return model_kwargs
```

and replacing `kwargs = geffnet.mobilenetv3._gen_mobilenet_v3([size], channel_multiplier=channel_multiplier)` with a call to this local `_gen_mobilenet_v3`.
I should probably bundle the necessary geffnet code.
I am trying to export the PyTorch model to an ONNX model on my own and met this error. With some hardcoding I got a 112x112 model working. However, when exporting like this:

```python
dummy_input = torch.randn(1, 3, 112, 112, device='cpu')
torch.onnx.export(model, dummy_input, "lm_model0.onnx", verbose=True,
                  input_names=["input"], output_names=["output"], opset_version=11)
```

it throws an error.

Edit: I reset to commit 8795d3298d to solve this issue.
I have not encountered this issue before. I'm on c450c12ae6ffb1757f62dde3c2765da3c10f6def of geffnet.
I modified the UNetUp class, as well as the input and output sizes, to make the original model work at 112x112 -> 14x14, and exported it to ONNX:

```python
from model import *

PATH = "weights/lm_model0.pth"
model = OpenSeeFaceLandmarks("small", 0.5, True)
model.load_state_dict(torch.load(PATH))
dummy_input = torch.randn(1, 3, 112, 112, device='cpu')
torch.onnx.export(model, dummy_input, "lm_model0_small.onnx", verbose=True,
                  input_names=["input"], output_names=["output"], opset_version=11)
```

The network is 4x faster, which is suitable for my application. However, it doesn't seem to produce any result that matches the face. Do I need to retrain the model, or is there maybe some error in my heatmap processing? My heatmap processing code looks like this:

```cpp
float logit(float p)
{
    if (p >= 1.0)
        p = 0.99999;
    else if (p <= 0.0)
        p = 0.0000001;
    p = p / (1 - p);
    return log(p) / 16;
}

CvPts proc_heatmaps(float* heatmaps, int x0, int y0, float scale_x, float scale_y)
{
    CvPts facial_landmarks;
    int heatmap_size = EMI_NN_OUTPUT_SIZE * EMI_NN_OUTPUT_SIZE;
    for (int landmark = 0; landmark < 66; landmark++)
    {
        int offset = heatmap_size * landmark;
        int argmax = -100;
        float maxval = -100;
        // argmax over this landmark's confidence map
        for (int i = 0; i < heatmap_size; i++)
        {
            if (heatmaps[offset + i] > maxval)
            {
                argmax = i;
                maxval = heatmaps[offset + i];
            }
        }
        int x = argmax / EMI_NN_OUTPUT_SIZE;
        int y = argmax % EMI_NN_OUTPUT_SIZE;
        float conf = heatmaps[offset + argmax];
        float res = EMI_NN_SIZE - 1;
        // refine the cell position using the logit-encoded offset maps
        int off_x = floor(res * (logit(heatmaps[66 * heatmap_size + offset + argmax])) + 0.1);
        int off_y = floor(res * (logit(heatmaps[2 * 66 * heatmap_size + offset + argmax])) + 0.1);
        float lm_y = (float)y0 + (float)(scale_x * (res * (float(x) / (EMI_NN_OUTPUT_SIZE - 1)) + off_x));
        float lm_x = (float)x0 + (float)(scale_y * (res * (float(y) / (EMI_NN_OUTPUT_SIZE - 1)) + off_y));
        facial_landmarks.push_back(cv::Point2f(lm_x, lm_y));
    }
    return facial_landmarks;
}
```
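For cross-checking the decoding independently of the C++ code, here is a minimal pure-Python sketch of the same logic. The function and variable names are mine; the defaults assume the original 224x224 input / 28x28 heatmap layout (confidence maps first, then x-offset maps, then y-offset maps, 66 landmarks each), and the x/y roles from the C++ above are kept as-is. Crop offsets and scaling are omitted for simplicity.

```python
import math

def logit(p, factor=16.0):
    """Clamped inverse sigmoid; mirrors the C++ logit() above."""
    p = min(max(p, 1e-7), 0.99999)
    return math.log(p / (1 - p)) / factor

def decode_landmark(heatmaps, landmark, out_size=28, in_size=224):
    """Decode one landmark from a flat [3 * 66 * out_size**2] float list."""
    n_landmarks = 66
    hm = out_size * out_size
    off = hm * landmark
    # argmax over this landmark's confidence map
    argmax = max(range(hm), key=lambda i: heatmaps[off + i])
    x_cell, y_cell = divmod(argmax, out_size)
    res = in_size - 1
    # refine with the logit-encoded offset maps, as in the C++ code
    off_x = math.floor(res * logit(heatmaps[n_landmarks * hm + off + argmax]) + 0.1)
    off_y = math.floor(res * logit(heatmaps[2 * n_landmarks * hm + off + argmax]) + 0.1)
    lm_y = res * (x_cell / (out_size - 1)) + off_x
    lm_x = res * (y_cell / (out_size - 1)) + off_y
    return lm_x, lm_y
```

Feeding both versions the same heatmap buffer and comparing outputs should isolate whether the problem is in the decoding or in the model itself.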
The landmark decoding looks correct to me, but this matches my experience with just reducing the resolution. I'm not sure why it doesn't work. I suspect the offset layers might give bad results if the resolution is different. That's the reason I trained the special lower resolution model with fewer points. I also tried training at 112x112 before, but found that the gain in performance was smaller compared to the reduction in accuracy, so I settled on 56x56 to make the performance gain worthwhile.
Thanks for your patience again.

Ok! Thanks again!
I have tested the model. The 30 point model is too noisy for me =.=. It looks like a 112x112 model with 66 points may be a good balance, since I need more than 75 FPS on CPU.
I have looked a bit more at the results of the downscaled models and they just look completely broken. I don't think I have my old results, but I might try training another 112x112 model some time soon.

@emilianavt Thanks a lot! Waiting for your update.

Training will probably take a few more days, as I mainly train overnight.

Thanks a lot!
Validation loss doesn't seem to be improving anymore, so you can give this one a try. The logit factor is 16, the input resolution is 112x112, and the output resolution is 14x14.
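In decoder terms, these parameters replace the original constants; a small sketch for clarity (variable names are mine, values taken from the comment above):

```python
import math

# Assumed decode parameters for the new model, per the comment above
INPUT_SIZE = 112     # network input resolution
OUTPUT_SIZE = 14     # heatmap resolution
LOGIT_FACTOR = 16.0  # divisor in log(p / (1 - p)) / factor

res = INPUT_SIZE - 1  # 111

def offset_pixels(p):
    """Pixel offset encoded by an offset-map activation p (my naming)."""
    p = min(max(p, 1e-7), 0.99999)
    return res * math.log(p / (1 - p)) / LOGIT_FACTOR

# Before the offset refinement, heatmap cell c maps to this input pixel:
cell_to_pixel = [res * c / (OUTPUT_SIZE - 1) for c in range(OUTPUT_SIZE)]
```

So a heatmap peak at the last cell lands on pixel 111, and an offset activation of 0.5 encodes a zero-pixel correction.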
Thanks a lot!!! I have tried it in my own code and this model works pretty well; its running speed is similar to model 0, with accuracy closer to models 1 or 2. I will do more analysis later.

BTW, how is your quantization of these models progressing? I see you are trying to quantize them in onnxruntime's repo.
I have encountered the same issue as you. Models get smaller but slower when successfully quantized. I haven't bothered evaluating accuracy because of this. Hopefully something can be fixed on the onnxruntime side. I believe that, in theory, quantization should give a good speedup, which would help a lot.

I also trained a faster version. I'm still trying to figure out where the two 112x112 models fit among the other models quality-wise.
In my testing, lm_modelV_opt.zip is more accurate than lm_model0 with the same inference time, but not as good as lm_model1.

Btw, how did you solve the clip issue while quantizing the model?
I only tried dynamic quantization on that model, which worked without solving that issue, but it caused things to run slower.
It looks like modelU has accuracy similar to model0 but is much faster in my application.
Thank you for your feedback!

Thank you for your excellent work again!
Thanks for your contribution! I will include your license and your repo link correctly.

I also have a question: estimating facial landmarks from a 224x224 image is sometimes unnecessary, since my input images are close to 100x100. Is it possible to rescale the network? I will also try this on my own.