
How to modify the activation function? #3013

Closed
zxsitu opened this issue May 2, 2021 · 37 comments
Labels
question Further information is requested

Comments

@zxsitu

zxsitu commented May 2, 2021

❔Question

Hello author, I have seen that new activation functions have been added to the code, but I'm not sure whether I've modified it correctly, and I'd like your advice.

Additional context

I see that you have provided the prepared lines of code for this question, but I am a bit confused:

yolov5/models/common.py

Lines 34 to 51 in c9c95fb

class Conv(nn.Module):
    # Standard convolution
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super(Conv, self).__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        # self.act = nn.Identity() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = nn.Tanh() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = nn.Sigmoid() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = nn.ReLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = nn.LeakyReLU(0.1) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = nn.Hardswish() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = Mish() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = AconC() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = MetaAconC() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = SiLU_beta() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        self.act = MetaAconC(c2) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

1. How should FReLU be added here? Which format is correct:
   self.act = nn.FReLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity()) or
   self.act = FReLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())?
2. Why do some activation functions start with nn.xxx, while others start directly with the name of the activation function? Should I use the former or the latter?
@zxsitu zxsitu added the question Further information is requested label May 2, 2021
@glenn-jocher
Member

@ilem777 see Conv() module in activations study branch for example implementations of alternative activation functions:

yolov5/models/common.py

Lines 34 to 57 in 0824388

class Conv(nn.Module):
    # Standard convolution
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super(Conv, self).__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        # self.act = nn.Identity() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = nn.Tanh() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = nn.Sigmoid() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = nn.ReLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = nn.LeakyReLU(0.1) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = nn.Hardswish() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = Mish() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = FReLU(c2) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = AconC(c2) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = MetaAconC(c2) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = SiLU_beta(c2) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        self.act = FReLU_noBN_biasFalse(c2) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
        # self.act = FReLU_noBN_biasTrue(c2) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

@zxsitu zxsitu closed this as completed May 6, 2021
@Guemann-ui

Hi @glenn-jocher, why did you replace the ReLU activation function with the sigmoid function in the latest version? I'm really curious to understand what results led you to that choice (I don't have the time to try all the parameters myself, which is why I'm asking XD).
Thanks.

@glenn-jocher
Member

@besmaGuesmi architecture updates are typically informed by empirical results of experiments and studies we run. You can see our Activations Study at https://wandb.ai/glenn-jocher/activations, and a discussion at #2891

@Guemann-ui

@glenn-jocher from what I've seen, FReLU provides the best results! Why didn't you choose it?

@glenn-jocher
Member

@besmaGuesmi FReLU may be suitable for smaller models like 5n and 5s, but it adds too many operations to larger models and causes earlier overfitting. It also requires substantially more resources, such as CUDA memory, which is not compatible with our goal of good results on consumer hardware using fewer resources.
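For a rough sense of that overhead, here is a back-of-the-envelope sketch (an illustration, not from the repo) counting the extra learnable parameters FReLU adds to a single Conv block, assuming the k×k depthwise conv + BatchNorm form of FReLU discussed in this thread:

```python
# Extra learnable parameters FReLU adds to one Conv block
# (k x k depthwise conv with bias, plus a BatchNorm layer).
def frelu_extra_params(channels: int, k: int = 3) -> int:
    depthwise = channels * k * k + channels  # depthwise weights + bias
    batchnorm = 2 * channels                 # BN gamma + beta
    return depthwise + batchnorm

print(frelu_extra_params(64))    # 768   -> cheap on narrow early layers
print(frelu_extra_params(1024))  # 12288 -> noticeable on wide layers, repeated across many Convs
```

Roughly speaking, the depthwise branch also produces an extra feature map per Conv during training, which is where much of the additional CUDA memory usage comes from.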

@Guemann-ui

Understood. Could you please tell me how I can change the activation function in the model? I would really like to use FReLU instead of SiLU, since I am using the YOLOv5n model.

Thanks.

@Guemann-ui

Guemann-ui commented Nov 8, 2021

Is it enough to change it only in the common.py and experimental.py files?

@glenn-jocher
Member

@besmaGuesmi activations are defined in one place for all official YOLOv5 models:

self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

@Guemann-ui

Guemann-ui commented Nov 8, 2021

@glenn-jocher I tried the FReLU function as below

```python
class FReLU(nn.Module):
    def __init__(self, c1, k=3):  # ch_in, kernel
        super().__init()__()
        self.conv = nn.Conv2d(c1, c1, k, 1, 1, groups=c1)
        self.bn = nn.BatchNorm2d(c1)

    @staticmethod
    def forward(self, x):
        return torch.max(x, self.bn(self.conv(x)))


class Conv(nn.Module):
    # Standard convolution
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.FReLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
```

But there is an error during training. Are there any mistakes in the implementation above? Thanks

@glenn-jocher
Member

@besmaGuesmi see #3013 (comment)
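For readers hitting the same error, here is a corrected sketch of the snippet above (assuming FReLU as defined in utils/activations.py). The apparent issues are the `super().__init()__()` typo, the `@staticmethod` decorator on a `forward` that takes `self`, and using `nn.FReLU()` (torch.nn has no FReLU) instead of `FReLU(c2)`:

```python
import torch
import torch.nn as nn

class FReLU(nn.Module):
    # FReLU: max(x, T(x)), where T is a depthwise conv followed by BatchNorm
    def __init__(self, c1, k=3):  # ch_in, kernel
        super().__init__()  # fixed: was super().__init()__()
        self.conv = nn.Conv2d(c1, c1, k, 1, 1, groups=c1)
        self.bn = nn.BatchNorm2d(c1)

    def forward(self, x):  # fixed: no @staticmethod, since forward uses self
        return torch.max(x, self.bn(self.conv(x)))

# In Conv.__init__ the activation line should then instantiate FReLU with the output channels:
#   self.act = FReLU(c2) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
print(FReLU(16)(torch.randn(1, 16, 8, 8)).shape)  # torch.Size([1, 16, 8, 8])
```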

@Guemann-ui

Hi @glenn-jocher, when I tried to change the Conv class there were some issues because of differences between the files. What I understood is that I have to change the contents of some files to match https://github.com/ultralytics/yolov5/tree/0824388b9e1afb5a888ce4c302acfe2ad3da8101, but is there any other way to use FReLU directly, without having to change files like general.py, utils.py, etc.?

Thanks

@glenn-jocher
Member

@besmaGuesmi the only file you need to update is common.py; you just import and use FReLU as in #3013 (comment).
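Concretely, a minimal sketch of that edit in models/common.py (assuming FReLU exists in utils/activations.py in your checkout; nn and autopad are already available in that file):

```python
from utils.activations import FReLU  # add near the other imports at the top of models/common.py

class Conv(nn.Module):
    # Standard convolution
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        # only this line changes: FReLU needs the output channel count c2
        self.act = FReLU(c2) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
```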

@Guemann-ui

Guemann-ui commented Nov 9, 2021

@glenn-jocher yes, I did exactly that, but I ran into an error during training. Do I have to remove the forward_fuse function from Conv?
[screenshot]

@glenn-jocher
Member

@besmaGuesmi your Python indentation is incorrect. This is unrelated to YOLOv5. You may want to take a beginner's Python course first to learn the basics.

@Guemann-ui

Sorry, I uploaded the wrong screenshot; I was talking about this one. What I understand is that when I cloned yolov5 the version is not up to date, so we have to change other Python files (activations.py, general.py, ...) as well. In addition, when I checked common.py it was also missing the activation function imports (FReLU, FReLU_noBN_biasFalse, FReLU_noBN_biasTrue, ...). The common.py, activations.py, etc. that I obtained after cloning are not the same as here: https://github.com/ultralytics/yolov5/tree/0824388b9e1afb5a888ce4c302acfe2ad3da8101/models. Do you understand what I mean by my first question?
[screenshot]

@Guemann-ui

Solved by removing the other activation function, thanks

@ppogg

ppogg commented Dec 4, 2021

Sir, I will experiment with the new activation function, as well as a lighter backbone, etc. If there is progress, I will let you know~

@fanghua2021

Hello, I used SiLU, h-swish, and leaky ReLU respectively in YOLOv5s for experiments. The results show: (1) mAP: h-swish > swish > leaky ReLU; (2) FPS: SiLU = leaky ReLU > h-swish.

I have a question: since h-swish does not use an exponential function, shouldn't it be faster than SiLU? In addition, I see that FReLU's mAP is the best, so is its speed also the best in YOLOv5s?

[results plot]

@glenn-jocher
Member

@fanghua2021 architecture updates are typically informed by empirical results of experiments and studies we run. You can see our Activations Study with YOLOv5s on COCO for 300 epochs at https://wandb.ai/glenn-jocher/activations, and a discussion at #2891

Results may vary by dataset and model.

@hellodennis4

Hi @glenn-jocher, what do you think about DY-ReLU? It might work better.

@glenn-jocher
Member

@hellodennis4 you can see our Activations Study with YOLOv5s on COCO for 300 epochs at https://wandb.ai/glenn-jocher/activations, and a discussion at #2891

Results may vary by dataset and model.

@XhHello

XhHello commented Dec 14, 2021

> Hello, I used SiLU, h-swish, and leaky ReLU respectively in YOLOv5s for experiments. The results show: (1) mAP: h-swish > swish > leaky ReLU; (2) FPS: SiLU = leaky ReLU > h-swish.
>
> I have a question: since h-swish does not use an exponential function, shouldn't it be faster than SiLU? In addition, I see that FReLU's mAP is the best, so is its speed also the best in YOLOv5s? [results plot]

Friend, how did you draw this curve?

@marziyemahmoudifar

> @ilem777 see Conv() module in activations study branch for example implementations of alternative activation functions: [quoting the Conv() snippet from the earlier comment above]

How do I add a new ELU activation function to yolov5 and use it?

@glenn-jocher
Member

@marziyemahmoudifar you can simply replace the default nn.SiLU() activation here on models/common.py L44 with another one of your design. This will affect all activations in the whole YOLOv5 model:

yolov5/models/common.py

Lines 38 to 45 in 2e57b84

class Conv(nn.Module):
    # Standard convolution
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
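As an illustrative sketch (not an official snippet), the one-line swap for ELU would look like this; nn.ELU is built into PyTorch, so unlike FReLU/AconC-style activations it needs no extra import or channel argument:

```python
import torch
import torch.nn as nn

# Sketch: in Conv.__init__ (models/common.py), the activation line would become
#   self.act = nn.ELU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
# Quick standalone sanity check that nn.ELU is a channel-agnostic drop-in like nn.SiLU:
act = nn.ELU()
x = torch.randn(1, 8, 4, 4)
print(act(x).shape)  # torch.Size([1, 8, 4, 4])
```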

@bzha5848

Hi, I think it still doesn't work for FReLU. When I follow the comments and modify it, Colab still says there is no module called FReLU.

@bzha5848

[screenshot]

@glenn-jocher
Member

@bzha5848 you can import FReLU from utils.activations

@ZhixiongSun

@glenn-jocher Hi, I have seen the results you mentioned here: https://wandb.ai/glenn-jocher/activations. For FReLU-noBN-biasTrue and FReLU-noBN-biasFalse, it seems like you stopped them early. Could you please tell me why you did this, or whether these two activation variants (FReLU-noBN-biasTrue and FReLU-noBN-biasFalse) are better than the original FReLU?

@glenn-jocher
Member

@ZhixiongSun I don't remember exactly, but typical early-stopping reasons are excess resource usage, i.e. CUDA memory, or slow training speed.

You should be able to reproduce the runs yourself using the commands shown in the wandb logs.

@passerbythesun
Contributor

A data point about FReLU:
I've tested FReLU (blue line) on our custom dataset, set up just as described above, and the result is not good. Training stopped at epoch 145 because of "no improvement observed in last 100 epochs".
[results plot]

@glenn-jocher
Member

@passerbythesun please keep in mind that the performance of activation functions can vary depending on the dataset and model architecture. While the FReLU activation function may yield promising results in some scenarios, it appears to have limited efficacy in your custom dataset, as indicated by the training procedure stopping at epoch 145 due to no observed improvement in the last 100 epochs. It is recommended to explore other activation functions or further tune the model parameters to achieve better performance on your dataset.

@IlamSaran

Can LeakyReLU be used in the YOLOv5x version, and how much will it help improve the model's (v5x) performance?

@glenn-jocher
Member

@IlamSaran LeakyReLU can be used in YOLOv5x and has been shown to improve performance in certain cases, particularly in addressing the vanishing gradient problem. However, the extent of performance improvement may vary depending on the specific dataset and model architecture. It is recommended to experiment with different activation functions and evaluate their impact on model performance to determine the most effective configuration for your use case.
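As an illustrative sketch (not an official snippet), the same one-line swap applies, and the existing `isinstance(act, nn.Module)` branch also allows trying it on an individual layer without changing the global default:

```python
import torch
import torch.nn as nn

# Global swap (sketch): in Conv.__init__ (models/common.py) the default line would become
#   self.act = nn.LeakyReLU(0.1) if act is True else (act if isinstance(act, nn.Module) else nn.Identity())
# 0.1 matches the negative slope in the commented-out alternative in the activations study branch.

# Per-layer alternative (hypothetical usage): the isinstance(act, nn.Module) branch means a
# single Conv can be given its own activation without changing the default, e.g.
#   conv = Conv(64, 128, 3, 2, act=nn.LeakyReLU(0.1))

# LeakyReLU is channel-agnostic, so it is a drop-in for nn.SiLU():
x = torch.randn(1, 8, 4, 4)
print(nn.LeakyReLU(0.1)(x).shape)  # torch.Size([1, 8, 4, 4])
```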

@IlamSaran

Hi!
Could you please clarify the difference between the validation set and the test set when training a deep learning model (train, valid, test)? And what does it mean to test a custom-trained model on an unlabeled / unseen set of data containing similar classes?

@glenn-jocher
Member

@IlamSaran The validation set is used during training for model selection and hyperparameter tuning, while the test set is reserved for final evaluation of the trained model's performance. Testing on an unlabeled set of data, containing similar classes to those in the training set, helps assess the model's generalization capabilities and its ability to make accurate predictions on previously unseen examples.

@IlamSaran

My object detection model achieves mAP@0.5 = 90% and mAP@0.5:0.95 = 78% on my custom dataset.
But the same model achieves 96% mAP@0.5 and only 55% mAP@0.5:0.95 on a public benchmark dataset (both datasets contain the same classes). Although mAP@0.5 is higher on the public dataset, mAP@0.5:0.95 is much lower there than on the custom dataset. Please clarify: can I conclude that my model works better on my custom dataset?

@glenn-jocher
Member

@IlamSaran Your model's performance as measured by mAP@0.5 is indeed higher on the public benchmark dataset, indicating good detection at a specific IoU threshold of 0.5. However, the lower mAP@0.5:0.95 score suggests that the model's performance across a range of IoU thresholds from 0.5 to 0.95 is not as robust on the public dataset compared to your custom dataset.

The mAP@0.5:0.95 metric provides a more comprehensive assessment of detection performance across various IoU levels, reflecting both localization and detection accuracy. The comparative decrease in this metric for the public dataset suggests that while the model detects objects well at a lower IoU threshold, it struggles with precise localization at higher IoU thresholds.
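For reference, the COCO-style metric being compared here averages AP over ten IoU thresholds, which is why imprecise localization drags it down much more than it affects mAP@0.5:

$$\text{mAP}@0.5{:}0.95 = \frac{1}{10}\sum_{t \in \{0.50,\,0.55,\,\dots,\,0.95\}} \text{mAP}@t$$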

In conclusion, your model does appear to work better on your custom dataset when considering the overall detection and localization performance (mAP@0.5:0.95). However, it's also important to consider dataset size, diversity, and difficulty when comparing these metrics.
