Structured pruning for YOLOv5 #925
Hey @hawrot. Regarding model size: structured pruning of YOLOv5 is possible, though we haven't looked into it yet. It may be a bit tricky due to the architecture of YOLO (e.g. the "long" skip connections may force the network to prune the same channels at multiple depths, which can be problematic). If you are interested in trying, here are the structured pruning modifiers that you can incorporate into your recipe, and we can provide further advice if needed. If size is what you care about, we would also recommend YOLO quantization (if not already employed), not only for further speedup but also for a significant reduction in model file size.
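For intuition, here is a minimal sketch of what structured (channel) pruning does, using PyTorch's generic `torch.nn.utils.prune` utilities rather than the SparseML recipe modifiers mentioned above; the layer shapes and the 50% amount are illustrative only:

```python
import torch
import torch.nn.utils.prune as prune

# Illustrative layer; real YOLOv5 convolutions have other shapes.
conv = torch.nn.Conv2d(64, 128, kernel_size=3)

# Structured pruning: zero out 50% of whole output filters (dim=0) by L2 norm,
# instead of zeroing individual weights as unstructured pruning does.
prune.ln_structured(conv, name="weight", amount=0.5, n=2, dim=0)
prune.remove(conv, "weight")  # bake the mask into the weight tensor

# Entire filters are now zero; a follow-up compaction step could drop them
# and shrink the layer. This is exactly where the "long" skip connections
# get tricky: both sides of an elementwise add must drop the same channels.
channel_norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
print((channel_norms == 0).sum().item(), "of", conv.weight.shape[0], "filters zeroed")
```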
I also pruned YOLOv5 with SparseML, but it didn't seem to have much effect:

```
python train.py --cfg ../models_v5.0/yolov5s.yaml --weights PATH_TO_COCO_PRETRAINED_WEIGHTS --data coco.yaml --hyp data/hyps/hyp.scratch.yaml --recipe ../recipes/yolov5s.pruned.md
```

I want to know how to get a bigger effect. Can you tell me?
@OSMasterSoohwan what do you mean by "big effect"? Inference speedup, right?
Yes. I'd like to know how to make it much faster.
Did you try running the pruned model through the DeepSparse engine?
@OSMasterSoohwan, @hawrot makes a very good point.
The command you posted launches a training run that takes the original, dense YOLOv5s and applies the sparsification recipe, which prunes the weights of the network. Once training is complete and you want to use the sparsified model for fast inference, you need to export the trained weights to an ONNX file and then compile the model with the DeepSparse engine. If you have any pending questions, feel free to follow up.
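As a rough sketch of that second step, assuming the pruned model has already been exported to an ONNX file (the file name below is hypothetical, and the input shape assumes the default 640x640 YOLOv5 export), compiling and running it with DeepSparse looks roughly like this:

```python
import numpy as np
from deepsparse import compile_model

# Hypothetical path to the ONNX file exported from the sparsified checkpoint.
onnx_path = "yolov5s-pruned.onnx"

# Compile the sparse model for CPU inference with the DeepSparse engine.
engine = compile_model(onnx_path, batch_size=1)

# Dummy 640x640 RGB input; adjust to match your export settings.
dummy_input = [np.random.rand(1, 3, 640, 640).astype(np.float32)]
outputs = engine.run(dummy_input)
print([o.shape for o in outputs])
```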
@dbogunowicz can the exported ONNX model be converted to a Qualcomm DLC model, so that it runs on Qualcomm hardware-accelerated devices and keeps the inference speed benefit?
@tsangz189 I am not aware of any support from our side. I would say we should probably ask the Qualcomm team whether their accelerator:
Closing due to inactivity. |
@hawrot If I understand correctly, you saw reduced inference time even with unstructured pruning, right? Does this reduction happen only when you run the pruned model with the DeepSparse engine, or does it also happen when you run the pruned model with the normal PyTorch engine?
I have been using SparseML to prune YOLOv5 recently, and I can see a big improvement in inference time; however, the model size stays the same. I have realised that this happens because unstructured pruning only fills weights with zeros rather than removing them.
I was wondering whether it is possible to implement structured pruning for YOLOv5.
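To make the "fills with zeros" point concrete, here is a minimal sketch using PyTorch's generic pruning utilities (not SparseML specifically): unstructured pruning zeroes individual weights in place, so the tensor, and therefore the checkpoint, keeps its original dense storage:

```python
import torch
import torch.nn.utils.prune as prune

conv = torch.nn.Conv2d(64, 128, kernel_size=3)
print(conv.weight.numel())  # 73728 elements before pruning

# Unstructured pruning: zero the 80% of weights with the smallest L1 magnitude.
prune.l1_unstructured(conv, name="weight", amount=0.8)
prune.remove(conv, "weight")  # bake the mask into the weight tensor

zeros = (conv.weight == 0).sum().item()
print(zeros / conv.weight.numel())  # ~0.8 sparsity
print(conv.weight.numel())  # still 73728: the zeros are stored densely
```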