Quantization and pruning to my pre-trained model #13077

Open
hsaine opened this issue Jun 10, 2024 · 5 comments
Labels
question Further information is requested

Comments

@hsaine

hsaine commented Jun 10, 2024

Search before asking

Question

Hello,

I want to apply quantization and pruning to my pre-trained YOLOv5 model. Specifically, I want to use post-training quantization (PTQ) and unstructured pruning. Could you provide the steps and a tutorial on how to do this?

Thank you.

Additional

No response

@hsaine hsaine added the question Further information is requested label Jun 10, 2024
@hsaine hsaine changed the title from "Quantification and pruning to my pre-trained model" to "Quantization and pruning to my pre-trained model" Jun 10, 2024
@glenn-jocher
Member

Hello,

Thank you for reaching out! It's great to hear that you're interested in applying quantization and pruning to your pre-trained YOLOv5 model. Let's walk through the steps for both post-training quantization (PTQ) and unstructured pruning.

Pruning

First, let's start with unstructured pruning. Pruning helps in reducing the model size and potentially increasing inference speed by setting a percentage of the model's weights to zero. Here's a concise guide to get you started:

  1. Clone the YOLOv5 repository and install the required dependencies:

    git clone https://github.com/ultralytics/yolov5
    cd yolov5
    pip install -r requirements.txt
  2. Test your model to establish a baseline performance:

    python val.py --weights your_model.pt --data coco.yaml --img 640 --half
  3. Apply pruning to your model:
    You can do this in a short standalone script run from the repository root (or add the same step to val.py). For example, to prune your model to 30% sparsity:

    import torch
    
    from utils.torch_utils import prune
    
    # Load your checkpoint and extract the FP32 model
    model = torch.load('your_model.pt', map_location='cpu')['model'].float()
    
    # Apply global unstructured pruning (30% of weights set to zero)
    prune(model, amount=0.3)
    
    # Save in the checkpoint format that val.py / attempt_load expect (a dict with a 'model' key)
    torch.save({'model': model}, 'your_pruned_model.pt')
  4. Evaluate the pruned model:

    python val.py --weights your_pruned_model.pt --data coco.yaml --img 640 --half
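
To confirm the sparsity you actually achieved, you can run a quick check on the saved checkpoint. This is a minimal sketch and assumes the pruned model was saved with a 'model' key as in step 3:

    import torch
    
    # Report the global fraction of zero-valued weights in the pruned model
    model = torch.load('your_pruned_model.pt', map_location='cpu')['model'].float()
    zeros = sum((p == 0).sum().item() for p in model.parameters())
    total = sum(p.numel() for p in model.parameters())
    print(f'Global sparsity: {zeros / total:.1%}')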

For more detailed information, you can refer to our Model Pruning and Sparsity Tutorial.

Quantization

For post-training quantization (PTQ), you can use PyTorch's built-in quantization tools. The simplest option is dynamic quantization, which converts weights to int8 ahead of time and quantizes activations at runtime; here's a basic example (a static PTQ option is sketched after these steps):

  1. Prepare your model for quantization:

    import torch
    from torch.quantization import quantize_dynamic
    
    # Load your checkpoint and extract the FP32 model
    model = torch.load('your_model.pt', map_location='cpu')['model'].float()
    
    # Dynamic quantization replaces the listed module types (here nn.Linear) with int8
    # versions; activations are quantized at runtime
    quantized_model = quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)
    
    # Save the quantized model
    torch.save(quantized_model, 'your_quantized_model.pt')
  2. Evaluate the quantized model:

    python val.py --weights your_quantized_model.pt --data coco.yaml --img 640 --half
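
If you want full static PTQ (weights and activations calibrated to int8 on representative data) rather than dynamic quantization, one practical route is the repository's export script, which supports INT8 quantization for the TFLite export. The exact flags and output filename can vary with your YOLOv5 version, so treat this as a sketch:

    # Static PTQ via TFLite export; images from --data are used for calibration
    python export.py --weights your_model.pt --include tflite --int8 --data coco.yaml --img 640
    
    # Evaluate the resulting INT8 model (the export typically writes your_model-int8.tflite)
    python val.py --weights your_model-int8.tflite --data coco.yaml --img 640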

Additional Resources

For a more comprehensive guide on quantization, you can refer to the PyTorch Quantization Documentation.

If you encounter any issues or have further questions, please ensure you provide a minimum reproducible example as outlined here. This will help us assist you more effectively.

Happy coding! 😊

@hsaine
Author

hsaine commented Jun 11, 2024

Thank you very much. I have tested unstructured pruning and now I want to apply structured pruning to compare both methods. How can I do it? Thanks.

@glenn-jocher
Member

Hello,

Thank you for your interest in exploring structured pruning! It's great to hear that you've successfully tested unstructured pruning. Structured pruning can further help in reducing the model size and potentially improving inference speed by removing entire channels or filters from the model.

Here's a step-by-step guide to apply structured pruning to your YOLOv5 model:

Structured Pruning

  1. Clone the YOLOv5 repository and install the required dependencies (if you haven't already):

    git clone https://github.com/ultralytics/yolov5
    cd yolov5
    pip install -r requirements.txt
  2. Test your model to establish a baseline performance (if not done already):

    python val.py --weights your_model.pt --data coco.yaml --img 640 --half
  3. Apply structured pruning to your model:
    Structured pruning removes (here, zeroes) entire filters or channels rather than individual weights. Here's an example using PyTorch's pruning utilities (a quick per-layer check of the zeroed filters is shown after these steps):

    import torch
    import torch.nn.utils.prune as prune
    
    # Load your checkpoint and extract the FP32 model
    model = torch.load('your_model.pt', map_location='cpu')['model'].float()
    
    # Apply L2-norm structured pruning along dim=0 (output channels) of every Conv2d.
    # Note: the pruned filters are zeroed but tensor shapes are unchanged, so size and
    # speed gains require a runtime that exploits the sparsity or a later channel-removal step.
    def apply_structured_pruning(model, amount=0.3):
        for name, module in model.named_modules():
            if isinstance(module, torch.nn.Conv2d):
                prune.ln_structured(module, name='weight', amount=amount, n=2, dim=0)
                prune.remove(module, 'weight')  # make the pruning permanent
        return model
    
    # Apply structured pruning
    pruned_model = apply_structured_pruning(model, amount=0.3)  # prune 30% of filters per layer
    
    # Save in the checkpoint format that val.py / attempt_load expect (a dict with a 'model' key)
    torch.save({'model': pruned_model}, 'your_structured_pruned_model.pt')
  4. Evaluate the structured pruned model:

    python val.py --weights your_structured_pruned_model.pt --data coco.yaml --img 640 --half
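
To see how many output channels were actually zeroed in each convolution, a quick check like the following can help (a sketch, assuming the checkpoint was saved with a 'model' key as in step 3):

    import torch
    
    # Count fully-zeroed output channels (filters) in each Conv2d layer
    model = torch.load('your_structured_pruned_model.pt', map_location='cpu')['model'].float()
    for name, m in model.named_modules():
        if isinstance(m, torch.nn.Conv2d):
            zeroed = (m.weight.detach().abs().sum(dim=(1, 2, 3)) == 0).sum().item()
            print(f'{name}: {zeroed}/{m.weight.shape[0]} filters zeroed')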

Additional Considerations

  • Comparison: After applying structured pruning, you can compare the performance metrics (e.g., mAP, inference time) with those obtained from unstructured pruning to see which method better suits your needs.
  • Fine-tuning: Structured pruning usually costs some accuracy, so fine-tuning the pruned model on your dataset is recommended to recover it; see the example command below.
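
A typical fine-tuning run resumes training from the pruned checkpoint. The dataset YAML, image size, and epoch count here are placeholders to adapt to your setup:

    python train.py --weights your_structured_pruned_model.pt --data your_data.yaml --img 640 --epochs 50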

If you encounter any issues or have further questions, please ensure you provide a minimum reproducible example as outlined here. This will help us assist you more effectively.

Happy experimenting! 😊

@hsaine
Author

hsaine commented Jun 12, 2024

Hello, when I try to validate the structured-pruned model, I get the error below. For confidentiality, I have replaced the actual paths with the placeholder "path". How can I solve this problem so I can evaluate the model's performance? Thanks.

data=path/Bureau/yolov5/data/data2aug.yaml, weights=['path/Bureau/yolov5/runs/pruning/pruned_model_nano_structured.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.6, max_det=300, task=val, device=, workers=8, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=False, project=runs/val, name=exp, exist_ok=False, half=False, dnn=False
YOLOv5 🚀 v7.0-317-gc1803846 Python-3.10.12 torch-2.0.1+cu117 CUDA:0 (NVIDIA GeForce RTX 2070, 7972MiB)

Traceback (most recent call last):
File "path/Bureau/yolov5/val.py", line 438, in
main(opt)
File "path/Bureau/yolov5/val.py", line 409, in main
run(**vars(opt))
File "/path/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "path/Bureau/yolov5/val.py", line 165, in run
model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
File "path/yolov5/models/common.py", line 467, in init
model = attempt_load(weights if isinstance(weights, list) else w, device=device, inplace=True, fuse=fuse)
File "path/Bureau/yolov5/models/experimental.py", line 99, in attempt_load
ckpt = (ckpt.get("ema") or ckpt["model"]).to(device).float() # FP32 model
AttributeError: 'collections.OrderedDict' object has no attribute 'to'
File "path/Bureau/yolov5/models/common.py", line 467, in init
model = attempt_load(weights if isinstance(weights, list) else w, device=device, inplace=True, fuse=fuse)
File "path/Bureau/yolov5/models/experimental.py", line 99, in attempt_load
ckpt = (ckpt.get("ema") or ckpt["model"]).to(device).float() # FP32 model
AttributeError: 'collections.OrderedDict' object has no attribute 'to'

@glenn-jocher
Member

Hello,

Thank you for reaching out and providing detailed information about the issue you're encountering. It looks like you're facing an AttributeError related to the model loading process during validation after applying structured pruning.

To assist you effectively, could you please provide a minimum reproducible example of your code? This will help us better understand the context and reproduce the issue on our end. You can refer to our Minimum Reproducible Example Guide for more details on how to create one. This step is crucial for us to investigate and resolve the problem efficiently.

In the meantime, please ensure that you are using the latest versions of torch and the YOLOv5 repository. You can update your packages with the following commands:

pip install --upgrade torch
git pull

From the error message, it looks like the checkpoint was saved as a bare state_dict (an OrderedDict) rather than as a checkpoint dict containing the model object. The attempt_load function expects the checkpoint to have either an "ema" or "model" key whose value is a model object that can be moved to the device. Here's a potential fix you can try:

  1. Check the structure of your checkpoint file:
    Ensure that your checkpoint file contains the correct keys. You can inspect the checkpoint file as follows:

    import torch
    
    checkpoint = torch.load('path/Bureau/yolov5/runs/pruning/pruned_model_nano_structured.pt')
    print(checkpoint.keys())

    The output should include either "ema" or "model". If it prints state_dict parameter names instead (an OrderedDict, as in your traceback), you'll need to adjust how the model is saved; see the re-saving sketch after these steps.

  2. Modify the checkpoint loading process:
    If the checkpoint structure is different, you can modify the loading process to handle it appropriately. For example:

    import torch
    
    from models.common import DetectMultiBackend  # run this from the yolov5 repository root
    
    # Load the checkpoint
    weights = 'path/Bureau/yolov5/runs/pruning/pruned_model_nano_structured.pt'
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    checkpoint = torch.load(weights, map_location=device)
    
    # Check and load the model
    model = checkpoint.get('ema') or checkpoint.get('model')
    if model is None:
        raise ValueError("Checkpoint does not contain 'ema' or 'model' keys.")
    model = model.to(device).float()
    
    # Continue with validation
    detect_backend = DetectMultiBackend(weights, device=device, dnn=False, data='path/Bureau/yolov5/data/data2aug.yaml', fp16=False)
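
If the checkpoint really is a bare state_dict (which matches the OrderedDict in your traceback), the simplest fix is to re-save the pruned model object itself under a 'model' key so attempt_load can load it. A minimal sketch, assuming you still have (or re-create) the pruned model object; the original-checkpoint path is a placeholder:

    import torch
    
    # Rebuild the model from your original (unpruned) checkpoint, then re-apply your pruning steps
    model = torch.load('your_original_model.pt', map_location='cpu')['model'].float()
    # ... re-apply structured pruning here ...
    
    # Save in the format attempt_load expects: a dict whose 'model' key holds the model object
    torch.save({'model': model}, 'path/Bureau/yolov5/runs/pruning/pruned_model_nano_structured.pt')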

Please try these steps and let us know if the issue persists. Providing the minimum reproducible example will greatly help us in diagnosing the problem further.

Thank you for your cooperation, and we look forward to helping you resolve this issue!
