Quantization and pruning to my pre-trained model #13077

Open
hsaine opened this issue Jun 10, 2024 · 5 comments
Labels
question Further information is requested

Comments

@hsaine

hsaine commented Jun 10, 2024

Search before asking

Question

Hello,

I want to apply quantization and pruning to my pre-trained YOLOv5 model. Specifically, I want to use post-training quantization (PTQ) and unstructured pruning. Could you provide the steps and a tutorial on how to do this?

Thank you.

Additional

No response

@hsaine hsaine added the question Further information is requested label Jun 10, 2024
@hsaine hsaine changed the title from "Quantification and pruning to my pre-trained model" to "Quantization and pruning to my pre-trained model" Jun 10, 2024
@glenn-jocher
Member

Hello,

Thank you for reaching out! It's great to hear that you're interested in applying quantization and pruning to your pre-trained YOLOv5 model. Let's walk through the steps for both post-training quantization (PTQ) and unstructured pruning.

Pruning

First, let's start with unstructured pruning. Pruning helps in reducing the model size and potentially increasing inference speed by setting a percentage of the model's weights to zero. Here's a concise guide to get you started:

  1. Clone the YOLOv5 repository and install the required dependencies:

    git clone https://github.com/ultralytics/yolov5
    cd yolov5
    pip install -r requirements.txt
  2. Test your model to establish a baseline performance:

    python val.py --weights your_model.pt --data coco.yaml --img 640 --half
  3. Apply pruning to your model:
    You can do this in a short standalone script run from the repository root (or add the same step to val.py). For example, to prune your model to 30% sparsity:

    import torch
    
    from utils.torch_utils import prune
    
    # Load your checkpoint and extract the FP32 model
    model = torch.load('your_model.pt', map_location='cpu')['model'].float()
    
    # Apply global unstructured pruning (30% of weights set to zero)
    prune(model, amount=0.3)
    
    # Save in the checkpoint format that val.py / attempt_load expect (a dict with a 'model' key)
    torch.save({'model': model}, 'your_pruned_model.pt')
  4. Evaluate the pruned model:

    python val.py --weights your_pruned_model.pt --data coco.yaml --img 640 --half
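
To confirm the sparsity you actually achieved, you can run a quick check on the saved checkpoint. This is a minimal sketch and assumes the pruned model was saved with a 'model' key as in step 3:

    import torch
    
    # Report the global fraction of zero-valued weights in the pruned model
    model = torch.load('your_pruned_model.pt', map_location='cpu')['model'].float()
    zeros = sum((p == 0).sum().item() for p in model.parameters())
    total = sum(p.numel() for p in model.parameters())
    print(f'Global sparsity: {zeros / total:.1%}')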

For more detailed information, you can refer to our Model Pruning and Sparsity Tutorial.

Quantization

For post-training quantization (PTQ), you can use PyTorch's built-in quantization tools. The simplest option is dynamic quantization, which converts weights to int8 ahead of time and quantizes activations at runtime; here's a basic example (a static PTQ option is sketched after these steps):

  1. Prepare your model for quantization:

    import torch
    from torch.quantization import quantize_dynamic
    
    # Load your checkpoint and extract the FP32 model
    model = torch.load('your_model.pt', map_location='cpu')['model'].float()
    
    # Dynamic quantization replaces the listed module types (here nn.Linear) with int8
    # versions; activations are quantized at runtime
    quantized_model = quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)
    
    # Save the quantized model
    torch.save(quantized_model, 'your_quantized_model.pt')
  2. Evaluate the quantized model:

    python val.py --weights your_quantized_model.pt --data coco.yaml --img 640 --half
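
If you want full static PTQ (weights and activations calibrated to int8 on representative data) rather than dynamic quantization, one practical route is the repository's export script, which supports INT8 quantization for the TFLite export. The exact flags and output filename can vary with your YOLOv5 version, so treat this as a sketch:

    # Static PTQ via TFLite export; images from --data are used for calibration
    python export.py --weights your_model.pt --include tflite --int8 --data coco.yaml --img 640
    
    # Evaluate the resulting INT8 model (the export typically writes your_model-int8.tflite)
    python val.py --weights your_model-int8.tflite --data coco.yaml --img 640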

Additional Resources

For a more comprehensive guide on quantization, you can refer to the PyTorch Quantization Documentation.

If you encounter any issues or have further questions, please ensure you provide a minimum reproducible example as outlined here. This will help us assist you more effectively.

Happy coding! 😊

@hsaine
Author

hsaine commented Jun 11, 2024

Thank you very much. I have tested unstructured pruning and now I want to apply structured pruning to compare both methods. How can I do it? Thanks.

@glenn-jocher
Member

Hello,

Thank you for your interest in exploring structured pruning! It's great to hear that you've successfully tested unstructured pruning. Structured pruning can further help in reducing the model size and potentially improving inference speed by removing entire channels or filters from the model.

Here's a step-by-step guide to apply structured pruning to your YOLOv5 model:

Structured Pruning

  1. Clone the YOLOv5 repository and install the required dependencies (if you haven't already):

    git clone https://github.com/ultralytics/yolov5
    cd yolov5
    pip install -r requirements.txt
  2. Test your model to establish a baseline performance (if not done already):

    python val.py --weights your_model.pt --data coco.yaml --img 640 --half
  3. Apply structured pruning to your model:
    Structured pruning removes (here, zeroes) entire filters or channels rather than individual weights. Here's an example using PyTorch's pruning utilities (a quick per-layer check of the zeroed filters is shown after these steps):

    import torch
    import torch.nn.utils.prune as prune
    
    # Load your checkpoint and extract the FP32 model
    model = torch.load('your_model.pt', map_location='cpu')['model'].float()
    
    # Apply L2-norm structured pruning along dim=0 (output channels) of every Conv2d.
    # Note: the pruned filters are zeroed but tensor shapes are unchanged, so size and
    # speed gains require a runtime that exploits the sparsity or a later channel-removal step.
    def apply_structured_pruning(model, amount=0.3):
        for name, module in model.named_modules():
            if isinstance(module, torch.nn.Conv2d):
                prune.ln_structured(module, name='weight', amount=amount, n=2, dim=0)
                prune.remove(module, 'weight')  # make the pruning permanent
        return model
    
    # Apply structured pruning
    pruned_model = apply_structured_pruning(model, amount=0.3)  # prune 30% of filters per layer
    
    # Save in the checkpoint format that val.py / attempt_load expect (a dict with a 'model' key)
    torch.save({'model': pruned_model}, 'your_structured_pruned_model.pt')
  4. Evaluate the structured pruned model:

    python val.py --weights your_structured_pruned_model.pt --data coco.yaml --img 640 --half
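
To see how many output channels were actually zeroed in each convolution, a quick check like the following can help (a sketch, assuming the checkpoint was saved with a 'model' key as in step 3):

    import torch
    
    # Count fully-zeroed output channels (filters) in each Conv2d layer
    model = torch.load('your_structured_pruned_model.pt', map_location='cpu')['model'].float()
    for name, m in model.named_modules():
        if isinstance(m, torch.nn.Conv2d):
            zeroed = (m.weight.detach().abs().sum(dim=(1, 2, 3)) == 0).sum().item()
            print(f'{name}: {zeroed}/{m.weight.shape[0]} filters zeroed')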

Additional Considerations

  • Comparison: After applying structured pruning, you can compare the performance metrics (e.g., mAP, inference time) with those obtained from unstructured pruning to see which method better suits your needs.
  • Fine-tuning: Structured pruning usually costs some accuracy, so fine-tuning the pruned model on your dataset is recommended to recover it; see the example command below.
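
A typical fine-tuning run resumes training from the pruned checkpoint. The dataset YAML, image size, and epoch count here are placeholders to adapt to your setup:

    python train.py --weights your_structured_pruned_model.pt --data your_data.yaml --img 640 --epochs 50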

If you encounter any issues or have further questions, please ensure you provide a minimum reproducible example as outlined here. This will help us assist you more effectively.

Happy experimenting! 😊

@hsaine
Author

hsaine commented Jun 12, 2024

Hello, when I try to validate the structured-pruned model, I get the error below. For confidentiality, I have replaced the actual paths with the placeholder "path". How can I solve this problem so I can evaluate the model's performance? Thanks.

data=path/Bureau/yolov5/data/data2aug.yaml, weights=['path/Bureau/yolov5/runs/pruning/pruned_model_nano_structured.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.6, max_det=300, task=val, device=, workers=8, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=False, project=runs/val, name=exp, exist_ok=False, half=False, dnn=False
YOLOv5 🚀 v7.0-317-gc1803846 Python-3.10.12 torch-2.0.1+cu117 CUDA:0 (NVIDIA GeForce RTX 2070, 7972MiB)

Traceback (most recent call last):
File "path/Bureau/yolov5/val.py", line 438, in
main(opt)
File "path/Bureau/yolov5/val.py", line 409, in main
run(**vars(opt))
File "/path/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "path/Bureau/yolov5/val.py", line 165, in run
model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
File "path/yolov5/models/common.py", line 467, in init
model = attempt_load(weights if isinstance(weights, list) else w, device=device, inplace=True, fuse=fuse)
File "path/Bureau/yolov5/models/experimental.py", line 99, in attempt_load
ckpt = (ckpt.get("ema") or ckpt["model"]).to(device).float() # FP32 model
AttributeError: 'collections.OrderedDict' object has no attribute 'to'
File "path/Bureau/yolov5/models/common.py", line 467, in init
model = attempt_load(weights if isinstance(weights, list) else w, device=device, inplace=True, fuse=fuse)
File "path/Bureau/yolov5/models/experimental.py", line 99, in attempt_load
ckpt = (ckpt.get("ema") or ckpt["model"]).to(device).float() # FP32 model
AttributeError: 'collections.OrderedDict' object has no attribute 'to'

@glenn-jocher
Member

Hello,

Thank you for reaching out and providing detailed information about the issue you're encountering. It looks like you're facing an AttributeError related to the model loading process during validation after applying structured pruning.

To assist you effectively, could you please provide a minimum reproducible example of your code? This will help us better understand the context and reproduce the issue on our end. You can refer to our Minimum Reproducible Example Guide for more details on how to create one. This step is crucial for us to investigate and resolve the problem efficiently.

In the meantime, please ensure that you are using the latest versions of torch and the YOLOv5 repository. You can update your packages with the following commands:

pip install --upgrade torch
git pull

From the error message, it looks like the checkpoint was saved as a bare state_dict (an OrderedDict) rather than as a checkpoint dict containing the model object. The attempt_load function expects the checkpoint to have either an "ema" or "model" key whose value is a model object that can be moved to the device. Here's a potential fix you can try:

  1. Check the structure of your checkpoint file:
    Ensure that your checkpoint file contains the correct keys. You can inspect the checkpoint file as follows:

    import torch
    
    checkpoint = torch.load('path/Bureau/yolov5/runs/pruning/pruned_model_nano_structured.pt')
    print(checkpoint.keys())

    The output should include either "ema" or "model". If it prints state_dict parameter names instead (an OrderedDict, as in your traceback), you'll need to adjust how the model is saved; see the re-saving sketch after these steps.

  2. Modify the checkpoint loading process:
    If the checkpoint structure is different, you can modify the loading process to handle it appropriately. For example:

    import torch
    
    from models.common import DetectMultiBackend  # run this from the yolov5 repository root
    
    # Load the checkpoint
    weights = 'path/Bureau/yolov5/runs/pruning/pruned_model_nano_structured.pt'
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    checkpoint = torch.load(weights, map_location=device)
    
    # Check and load the model
    model = checkpoint.get('ema') or checkpoint.get('model')
    if model is None:
        raise ValueError("Checkpoint does not contain 'ema' or 'model' keys.")
    model = model.to(device).float()
    
    # Continue with validation
    detect_backend = DetectMultiBackend(weights, device=device, dnn=False, data='path/Bureau/yolov5/data/data2aug.yaml', fp16=False)
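
If the checkpoint really is a bare state_dict (which matches the OrderedDict in your traceback), the simplest fix is to re-save the pruned model object itself under a 'model' key so attempt_load can load it. A minimal sketch, assuming you still have (or re-create) the pruned model object; the original-checkpoint path is a placeholder:

    import torch
    
    # Rebuild the model from your original (unpruned) checkpoint, then re-apply your pruning steps
    model = torch.load('your_original_model.pt', map_location='cpu')['model'].float()
    # ... re-apply structured pruning here ...
    
    # Save in the format attempt_load expects: a dict whose 'model' key holds the model object
    torch.save({'model': model}, 'path/Bureau/yolov5/runs/pruning/pruned_model_nano_structured.pt')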

Please try these steps and let us know if the issue persists. Providing the minimum reproducible example will greatly help us in diagnosing the problem further.

Thank you for your cooperation, and we look forward to helping you resolve this issue!
