How to modify the activation function? #3013
@ilem777 see the Conv() module in the activations study branch for example implementations of alternative activation functions: Lines 34 to 57 in 0824388
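The Conv() module referenced above defines its activation in a single line, which is what makes the study branch's swaps easy. A minimal, simplified sketch of that pattern (not the exact YOLOv5 source; grouped convolutions and autopadding are omitted):

```python
import torch
import torch.nn as nn


class Conv(nn.Module):
    """YOLOv5-style convolution block: Conv2d -> BatchNorm2d -> activation (sketch)."""

    def __init__(self, c1, c2, k=1, s=1, act=True):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        # The activation lives in this one line; swapping nn.SiLU() for another
        # nn.Module changes the activation for every Conv in the model.
        self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))
```

Passing `act=nn.ReLU()` (or any other `nn.Module`) overrides the default, while `act=False` disables the activation entirely.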
Hi @glenn-jocher, why did you replace the ReLU activation function with the sigmoid function in the last version? I'm really curious about the results it gives you (I don't have time to try all the parameters, that's why I'm asking XD).
@besmaGuesmi architecture updates are typically informed by empirical results of experiments and studies we run. You can see our Activations Study at https://wandb.ai/glenn-jocher/activations, and a discussion at #2891.
@glenn-jocher what I've seen is that FReLU provides the best result! Why didn't you choose it?
@besmaGuesmi FReLU may be suitable for smaller models like 5n and 5s, but it adds too many operations to larger models and causes earlier overfitting. It also requires substantially increased resources like CUDA memory, which is not compatible with our goal of good results on consumer hardware using fewer resources.
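The extra cost is visible directly in the parameter counts: SiLU is parameter-free, while FReLU adds a depthwise k×k convolution plus a BatchNorm per activation. A rough sketch, assuming the standard FReLU formulation max(x, BN(depthwise_conv(x))):

```python
import torch
import torch.nn as nn


class FReLU(nn.Module):
    """FReLU(x) = max(x, T(x)), where T is a depthwise 3x3 conv + BatchNorm (sketch)."""

    def __init__(self, c1, k=3):
        super().__init__()
        self.conv = nn.Conv2d(c1, c1, k, 1, k // 2, groups=c1, bias=False)
        self.bn = nn.BatchNorm2d(c1)

    def forward(self, x):
        return torch.max(x, self.bn(self.conv(x)))


silu_params = sum(p.numel() for p in nn.SiLU().parameters())    # 0: SiLU is parameter-free
frelu_params = sum(p.numel() for p in FReLU(256).parameters())  # depthwise conv + BN affine
print(silu_params, frelu_params)
```

For a 256-channel layer this is 256·3·3 conv weights plus 2·256 BatchNorm parameters, and that cost (plus the activations kept for the backward pass) repeats at every Conv in the network, which is where the extra CUDA memory goes.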
Understood. Could you please tell me how I can change the activation function in the model? I would really like to use FReLU instead of SiLU, because I am using the YOLOv5n model. Thanks.
Is it enough to change it only in the common.py and experimental.py files?
@besmaGuesmi activations are defined in one place for all official YOLOv5 models: Line 38 in 79bca2b
@glenn-jocher I tried the FReLU function as below, but there is an error in training. Are there any mistakes in the implementation above? Thanks.
@besmaGuesmi see #3013 (comment) |
Hi @glenn-jocher, when I tried to change the Conv class there were some issues because the files differ. What I understood is that I have to change the contents of some files to match https://github.com/ultralytics/yolov5/tree/0824388b9e1afb5a888ce4c302acfe2ad3da8101, but is there another way to use FReLU directly, without having to change files like general.py, utils.py, etc.? Thanks.
@besmaGuesmi the only file you need to update is common.py; you just import and use FReLU as in #3013 (comment).
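Put concretely, the common.py edit amounts to one import plus one changed line in Conv. A self-contained sketch (in an actual clone you would write `from utils.activations import FReLU` instead of the stand-in class defined here; the exact line numbers vary by version):

```python
import torch
import torch.nn as nn


# Stand-in for `from utils.activations import FReLU`, so this sketch runs on its own.
class FReLU(nn.Module):
    def __init__(self, c1, k=3):
        super().__init__()
        self.conv = nn.Conv2d(c1, c1, k, 1, k // 2, groups=c1, bias=False)
        self.bn = nn.BatchNorm2d(c1)

    def forward(self, x):
        return torch.max(x, self.bn(self.conv(x)))


class Conv(nn.Module):
    def __init__(self, c1, c2, k=1, s=1, act=True):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        # The only change vs. the default: FReLU(c2) instead of nn.SiLU().
        # Note FReLU needs the output channel count, while nn.SiLU() takes no arguments.
        self.act = FReLU(c2) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))
```

No other files need to change; every Conv in the model picks up the new default.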
@glenn-jocher yes, I did exactly the same, but I ran into an error during training. Do I have to remove the forward_fuse function from Conv?
@besmaGuesmi your Python indentation is incorrect. This is unrelated to YOLOv5. You may want to take a beginner's Python course first to learn the basics.
Sorry, I uploaded the wrong screenshot; I was talking about this one. What I understand is that when I cloned yolov5 the version was not up to date, so we also have to change other Python files (activations.py, general.py, ...). In addition, when I checked common.py it was missing the activation function imports (FReLU, FReLU_noBN_biasFalse, FReLU_noBN_biasTrue, ...). The common.py, activations.py, etc. that I obtained after cloning are not the same as here: https://github.com/ultralytics/yolov5/tree/0824388b9e1afb5a888ce4c302acfe2ad3da8101/models. Do you understand what I mean by my first question?
Solved by removing the other activation function, thanks |
Sir, I will experiment with the new activation function, as well as a lighter backbone, etc. If there is progress, I will let you know~ |
@fanghua2021 architecture updates are typically informed by empirical results of experiments and studies we run. You can see our Activations Study with YOLOv5s on COCO for 300 epochs at https://wandb.ai/glenn-jocher/activations, and a discussion at #2891. Results may vary by dataset and model.
Hi @glenn-jocher, what do you think about DY-ReLU? It might work better.
@hellodennis4 you can see our Activations Study with YOLOv5s on COCO for 300 epochs at https://wandb.ai/glenn-jocher/activations, and a discussion at #2891. Results may vary by dataset and model.
How do I add a new ELU activation function to yolov5 and use it? |
@marziyemahmoudifar you can simply replace the default nn.SiLU() activation with another one of your design; this will affect all activations in the whole YOLOv5 model: Lines 38 to 45 in 2e57b84
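Since ELU ships with PyTorch, no custom class is needed; only the default-activation line changes. A small sketch of the swap and of ELU's behavior:

```python
import math
import torch
import torch.nn as nn

# Default line in models/common.py (sketch):
#   self.act = nn.SiLU() if act is True else ...
# ELU is built into PyTorch, so the swap is simply:
#   self.act = nn.ELU() if act is True else ...
act = nn.ELU(alpha=1.0)

x = torch.tensor([-2.0, 0.0, 2.0])
y = act(x)
print(y)  # negatives are squashed toward -alpha, zero stays zero, positives pass through
```

For x < 0, ELU computes alpha·(exp(x) − 1), so the output saturates smoothly at −alpha instead of cutting off hard like ReLU.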
Hi, I think it still doesn't work for FReLU. When I follow the comments and modify it, Colab still says there is no module called FReLU.
@bzha5848 you can import FReLU from utils.activations |
@glenn-jocher Hi, I have seen the results you mentioned here: https://wandb.ai/glenn-jocher/activations. For FReLU-noBN-biasTrue and FReLU-noBN-biasFalse, it seems like you stopped them early. Could you please tell me why you did this, or are these two activation variants better than the original FReLU?
@ZhixiongSun I don’t remember exactly, but typical early-stopping reasons are excess resource usage (i.e. CUDA memory) or slow training speed. You should be able to reproduce the runs yourself using the commands shown in the wandb logs.
@passerbythesun please keep in mind that the performance of activation functions can vary depending on the dataset and model architecture. While the FReLU activation function may yield promising results in some scenarios, it appears to have limited efficacy in your custom dataset, as indicated by the training procedure stopping at epoch 145 due to no observed improvement in the last 100 epochs. It is recommended to explore other activation functions or further tune the model parameters to achieve better performance on your dataset. |
Can LeakyReLU be used in the YOLOv5x version, and how much will it help improve the model's (v5x) performance?
@IlamSaran LeakyReLU can be used in YOLOv5x and has been shown to improve performance in certain cases, particularly in addressing the vanishing gradient problem. However, the extent of performance improvement may vary depending on the specific dataset and model architecture. It is recommended to experiment with different activation functions and evaluate their impact on model performance to determine the most effective configuration for your use case. |
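For illustration, LeakyReLU keeps a small slope on negative inputs instead of zeroing them, which is the mechanism behind its help with vanishing gradients; swapping it in follows the same one-line pattern in common.py. A quick sketch (0.1 is a slope earlier YOLO versions used, not a recommendation for YOLOv5x specifically):

```python
import torch
import torch.nn as nn

# LeakyReLU(x) = x for x >= 0, negative_slope * x for x < 0,
# so gradients never vanish entirely on the negative side.
act = nn.LeakyReLU(negative_slope=0.1)

x = torch.tensor([-10.0, 0.0, 5.0])
y = act(x)
print(y)  # negatives scaled by 0.1, positives unchanged
```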
Hi!
@IlamSaran The validation set is used during training for model selection and hyperparameter tuning, while the test set is reserved for final evaluation of the trained model's performance. Testing on an unlabeled set of data, containing similar classes to those in the training set, helps assess the model's generalization capabilities and its ability to make accurate predictions on previously unseen examples. |
My DL model for an object detection task achieves mAP@0.5 = 90% and mAP@0.5:0.95 = 78% on my custom-created dataset.
@IlamSaran Your model's performance as measured by mAP@0.5 is indeed higher on the public benchmark dataset, indicating good detection at a specific IoU threshold of 0.5. However, the lower mAP@0.5:0.95 score suggests that the model's performance across a range of IoU thresholds from 0.5 to 0.95 is not as robust on the public dataset compared to your custom dataset. The mAP@0.5:0.95 metric provides a more comprehensive assessment of detection performance across various IoU levels, reflecting both localization and detection accuracy. The comparative decrease in this metric for the public dataset suggests that while the model detects objects well at a lower IoU threshold, it struggles with precise localization at higher IoU thresholds. In conclusion, your model does appear to work better on your custom dataset when considering the overall detection and localization performance (mAP@0.5:0.95). However, it's also important to consider dataset size, diversity, and difficulty when comparing these metrics. |
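Concretely, mAP@0.5:0.95 is the mean of the AP computed at ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, which is why it penalizes imprecise localization that mAP@0.5 hides. A toy calculation with hypothetical per-threshold AP values:

```python
import numpy as np

# Hypothetical per-threshold AP values, for illustration only.
ious = 0.50 + 0.05 * np.arange(10)  # the ten IoU thresholds 0.50, 0.55, ..., 0.95
ap = np.array([0.90, 0.88, 0.86, 0.84, 0.80, 0.75, 0.68, 0.58, 0.45, 0.26])

map50 = ap[0]        # mAP@0.5: AP at the most lenient threshold only
map5095 = ap.mean()  # mAP@0.5:0.95: averaged over all ten thresholds
print(map50, round(float(map5095), 2))
```

In this toy example the model scores 0.90 at IoU 0.5 but only 0.70 averaged across thresholds, mirroring the pattern described above: good detection, weaker high-IoU localization.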
❔Question
Hello author, I have seen that new activation functions have been added to the program, but I'm not quite sure whether I've modified the code correctly, and I'd like to ask for your advice.
Additional context
I see that you have provided the relevant lines of code under this question, but I am a bit confused:
yolov5/models/common.py
Lines 34 to 51 in c9c95fb
self.act = nn.FReLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
or
self.act = FReLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
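For reference, torch.nn does not define an FReLU class, so the first spelling would raise an AttributeError; as noted earlier in the thread, FReLU is imported from utils.activations (and, per that implementation, it takes the input channel count as an argument, e.g. FReLU(c2)). A quick check of the first point:

```python
import torch.nn as nn

# torch.nn ships PReLU and RReLU but no FReLU, so the nn.FReLU() spelling
# cannot work; FReLU must come from YOLOv5's utils.activations module.
print(hasattr(nn, "FReLU"))  # False
```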