How many training epochs should we use with 300 "evolve" iterations? #13083

Closed
1 task done
Pedro-Leitek opened this issue Jun 12, 2024 · 2 comments
Labels
question Further information is requested Stale

Comments

@Pedro-Leitek

Search before asking

Question

Hi there!

I was wondering how many epochs we should use per iteration so that 300 evolve iterations yield good model hyperparameters.
Comparing 10 vs. 20 epochs with 300 evolve iterations, will the resulting hyperparameters be the same? In other words, is the number of epochs "irrelevant"? My issue with the evolve tool is that it takes a very long time to produce a result. The idea is to first find the hyperparameters with a few epochs and then use them for a full training run of, let's say, 100 epochs.

Thank you

Additional

No response

@Pedro-Leitek Pedro-Leitek added the question Further information is requested label Jun 12, 2024
@glenn-jocher
Member

@Pedro-Leitek hi there!

Great question! The number of training epochs you choose for each evolution iteration can significantly affect the quality of the hyperparameters you obtain, so the epoch count is not irrelevant. Here are the main considerations:

Epochs and Hyperparameter Evolution

  1. Fewer Epochs for Initial Evolution:

    • Using fewer epochs (e.g., 10) per evolution iteration speeds up the hyperparameter search, letting you identify promising configurations without spending much time on each iteration.
    • The trade-off is that the model may not converge within so few epochs, so the hyperparameters that win the short runs may be suboptimal for longer training runs.
  2. More Epochs for Final Training:

    • Once you have identified promising hyperparameters with the short runs, train the model for longer (e.g., 100 epochs) using them. This two-step approach saves time while still achieving good performance.
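
To make the time trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes each evolution iteration is one complete training run of the chosen length; the helper name `evolve_budget` and the 2 minutes per epoch figure are hypothetical placeholders, so measure the per-epoch time on your own dataset and hardware.

```python
def evolve_budget(generations: int, epochs_per_generation: int, minutes_per_epoch: float) -> dict:
    """Estimate the cost of an evolve run, assuming one full training run per generation."""
    total_epochs = generations * epochs_per_generation
    total_hours = total_epochs * minutes_per_epoch / 60
    return {"total_epochs": total_epochs, "total_hours": total_hours}


if __name__ == "__main__":
    # minutes_per_epoch=2.0 is an assumed placeholder, not a measured value
    for epochs in (10, 20):
        budget = evolve_budget(generations=300, epochs_per_generation=epochs, minutes_per_epoch=2.0)
        print(f"{epochs} epochs/generation -> {budget}")
```

Under these assumptions, doubling the epochs per iteration doubles the total cost of the search, which is why the short-run-then-long-run split is attractive.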

Practical Approach

Given your scenario, here's a practical approach:

  1. Initial Evolution with Fewer Epochs:

    • Start with 10 epochs for each of the 300 evolution iterations. This lets you explore a wide range of hyperparameters relatively quickly.
    • Command example (--evolve without a value runs the default 300 generations):
      python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --evolve
  2. Final Training with Selected Hyperparameters:

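    • After the evolution process, use the best hyperparameters found to train your model for a longer period, such as 100 epochs. In recent YOLOv5 versions they are saved as hyp_evolve.yaml inside the evolve run directory (e.g., runs/evolve/exp/hyp_evolve.yaml); check the path printed at the end of your own run.
    • Command example:
      python train.py --epochs 100 --data coco128.yaml --weights yolov5s.pt --hyp runs/evolve/exp/hyp_evolve.yaml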
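
As a rough illustration of what "best hyperparameters" means here, the sketch below ranks generations by a weighted fitness score. To my understanding, YOLOv5's `fitness()` in `utils/metrics.py` weights [P, R, mAP@0.5, mAP@0.5:0.95] as [0.0, 0.0, 0.1, 0.9]; treat that as an assumption and verify it against your installed version. The column names and the sample rows are made up for illustration and do not match the real evolve.csv header.

```python
import csv
import io

# Assumed weights for [P, R, mAP@0.5, mAP@0.5:0.95]; verify against utils/metrics.py
FITNESS_WEIGHTS = (0.0, 0.0, 0.1, 0.9)


def fitness(precision: float, recall: float, map50: float, map50_95: float) -> float:
    """Weighted combination of validation metrics used to rank generations."""
    metrics = (precision, recall, map50, map50_95)
    return sum(w * m for w, m in zip(FITNESS_WEIGHTS, metrics))


def best_generation(rows: list) -> dict:
    """Return the row with the highest fitness."""
    return max(rows, key=lambda r: fitness(r["precision"], r["recall"], r["map50"], r["map50_95"]))


# Hypothetical log with simplified column names (the real evolve.csv differs)
SAMPLE_CSV = """generation,precision,recall,map50,map50_95,lr0
0,0.70,0.60,0.65,0.40,0.0100
1,0.72,0.58,0.66,0.43,0.0085
2,0.75,0.61,0.64,0.41,0.0120
"""

rows = [{k: float(v) for k, v in row.items()} for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]

if __name__ == "__main__":
    best = best_generation(rows)
    print("best generation:", int(best["generation"]), "lr0:", best["lr0"])
```

Note that generation 2 has the highest precision and recall in this made-up log but is not selected, because under the assumed weights only the two mAP columns contribute to the score.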

Considerations

  • Convergence: Check that the model shows signs of convergence within the few epochs used during evolution. If it does not, increase the number of epochs per iteration slightly.
  • Time and Resources: Hyperparameter evolution is computationally expensive. Each iteration is a full training run, so the total cost scales with epochs × iterations (300 iterations at 10 epochs is 3,000 epochs of training). Choose both numbers to fit your time and hardware budget.
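
One simple way to check the convergence point is to compare the average of the last few epochs of a validation metric against the few epochs before them. This is only a heuristic sketch: `has_plateaued`, the window size, and the 1% threshold are hypothetical choices, and the metric history is assumed to be per-epoch validation mAP values taken from a normal (non-evolve) training run of the same length.

```python
def has_plateaued(metric_history: list, window: int = 3, min_rel_gain: float = 0.01) -> bool:
    """Return True if the mean of the last `window` values improved by less than
    `min_rel_gain` (relative) over the mean of the `window` values before them."""
    if len(metric_history) < 2 * window:
        return False  # not enough epochs to judge
    recent = sum(metric_history[-window:]) / window
    previous = sum(metric_history[-2 * window:-window]) / window
    if previous <= 0:
        return False  # avoid dividing by zero; the metric is still at its floor
    return (recent - previous) / previous < min_rel_gain


# Made-up mAP curves for illustration only
still_rising = [0.10, 0.20, 0.30, 0.38, 0.45, 0.51]
flattening = [0.30, 0.42, 0.47, 0.490, 0.492, 0.493, 0.493, 0.494, 0.494]

if __name__ == "__main__":
    print("still rising plateaued:", has_plateaued(still_rising))
    print("flattening plateaued:", has_plateaued(flattening))
```

If a curve at your chosen epoch count looks like the first example, the hyperparameter ranking from evolution is more likely to change with longer training, which is a signal to raise the epochs per iteration.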

For more detailed information on hyperparameter evolution, you can refer to our Hyperparameter Evolution Guide.

I hope this helps! If you have any further questions, feel free to ask. Happy training! 🚀

Contributor

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

@github-actions github-actions bot added the Stale label Jul 14, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jul 25, 2024