Negative Loss, Transfer Learning/Fine-Tuning Question #6

Closed

rsomani95 opened this issue Jun 22, 2020 · 13 comments

@rsomani95

Hi! Thanks for sharing this repo -- really clean and easy to use.

When training with the PyTorch Lightning script from the repo, my loss is negative (and gets more negative over time). Is this expected?
(screenshot: training loss curve going increasingly negative, Jun 22, 2020)


I'm curious to know whether you've fine-tuned a pretrained model with this BYOL implementation, as the README example suggests. If yes, how were the results? Any intuition regarding how many epochs to fine-tune for?
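
For reference, my training setup roughly follows the README example (I'm writing this from memory, so treat the exact parameter names as approximate):

import torch
from torchvision import models
from byol_pytorch import BYOL

resnet = models.resnet50(pretrained=True)

learner = BYOL(
    resnet,
    image_size = 256,
    hidden_layer = 'avgpool'   # layer whose output is used as the representation
)

opt = torch.optim.Adam(learner.parameters(), lr=3e-4)

for _ in range(100):
    images = torch.randn(20, 3, 256, 256)   # stand-in for a batch of unlabelled images
    loss = learner(images)                   # forward pass returns the BYOL loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    learner.update_moving_average()          # update the EMA weights of the target encoder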

Thanks!

@nihal1294

nihal1294 commented Jun 22, 2020

Check this function:

import torch.nn.functional as F

def loss_fn(x, y):
    # L2-normalize, then take the negative (scaled) cosine similarity
    x = F.normalize(x, dim=-1, p=2)
    y = F.normalize(y, dim=-1, p=2)
    return -2 * (x * y).sum(dim=-1)

I believe it is not the same as in the paper: https://arxiv.org/pdf/2006.07733.pdf

Check page 28, section G.3 (Loss function). It's in JAX and not very clear to me, so I'm not sure whether it's the same. I could be wrong; if someone could shed some light on this, I'd really appreciate it!

Thanks.

@lucidrains
Owner

@Nihal94 it should be the same; the loss function is nothing more than the negative of the cosine similarity (negative because we are trying to maximize the similarity):

import torch.nn.functional as F
import torch
from torch import nn

def loss_fn(x, y):
    x = F.normalize(x, dim=-1, p=2)
    y = F.normalize(y, dim=-1, p=2)
    return -2 * (x * y).sum(dim=-1)

def loss_fn_2(x, y):
    # same value, computed without normalizing first
    return -2 * ((x * y).sum(dim=-1) / (x.norm(dim=-1) * y.norm(dim=-1)))

x = torch.randn(2, 4)
y = torch.randn(2, 4)

l1 = loss_fn(x, y)
l2 = loss_fn_2(x, y)
l3 = -2 * nn.CosineSimilarity(dim=-1)(x, y)

print(l1, l2, l3)
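
All three print the same values. And to connect it to section G.3 of the paper: the paper's regression loss is the squared L2 distance between the L2-normalized vectors, which is just the expression above shifted by a constant of 2 per sample, so the gradients are identical. A quick sanity check (my own sketch, not code from the repo):

import torch
import torch.nn.functional as F

x = torch.randn(2, 4)
y = torch.randn(2, 4)

x_hat = F.normalize(x, dim=-1, p=2)
y_hat = F.normalize(y, dim=-1, p=2)

paper_loss = (x_hat - y_hat).pow(2).sum(dim=-1)   # ||x_hat - y_hat||^2 = 2 - 2 * <x_hat, y_hat>
repo_loss = -2 * (x_hat * y_hat).sum(dim=-1)      # loss_fn above

print(torch.allclose(paper_loss, repo_loss + 2))  # True (up to floating point error)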

@lucidrains
Owner

@rsomani95 I haven't yet, I was hoping someone would! Currently doing some transformers work, but I'll get back to this later this week!

@nihal1294

@lucidrains thanks for clarifying, it makes sense now. Also, see the screenshot below: that's the loss at the 4th epoch.

(screenshot: loss value at epoch 4)

@lucidrains
Owner

lucidrains commented Jun 22, 2020

@tiredrandomuser yea, maybe I should normalize the loss so it lies between 0 and -1, just so it doesn't scare people into thinking it is broken

@lucidrains
Owner

@tiredrandomuser it shouldn't make a difference to training anyhow

@lucidrains
Owner

ok done! As of 1cce49d, losses should fall around [-4, 0] now
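
For anyone wondering where that range comes from: each per-view term is -2 times a cosine similarity, so it lies in [-2, 2], and BYOL symmetrizes the loss over the two augmented views, so the summed loss starts near 0 (random features are roughly orthogonal) and is driven toward -4 as the online predictions align with the target projections. A small illustration of the bound (my own sketch, not the commit itself):

import torch
import torch.nn.functional as F

def per_view_loss(pred, target):
    pred = F.normalize(pred, dim=-1, p=2)
    target = F.normalize(target, dim=-1, p=2)
    return -2 * (pred * target).sum(dim=-1)   # in [-2, 2]

# perfectly aligned predictions hit the per-view minimum of -2,
# so the symmetrized (summed) loss bottoms out at -4
v = torch.randn(2, 4)
print(per_view_loss(v, v) + per_view_loss(v, v))  # approximately tensor([-4., -4.])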

@nihal1294

@lucidrains yes. I haven't checked the model yet; will update here soon if possible. And thanks for the quick and clean implementation!

@rsomani95
Author

> @rsomani95 I haven't yet, I was hoping someone would! Currently doing some transformers work, but I'll get back to this later this week!

Awesome. I should have some more time on my hands in a couple of weeks to play around with this too. The biggest hurdle is compute time: a dataset of 120,000 images takes about 1.5 hrs per epoch on a K40 on Colab.

@lucidrains
Owner

lucidrains commented Jun 22, 2020

@rsomani95 let us know! finally it seems self-supervised learning could be accessible to us mere peasants lol

@rsomani95
Author

Hahahaha that's precisely how I feel

lucidrains added a commit that referenced this issue Jun 24, 2020
@lucidrains
Owner

Solved with #7!

@AderonHuang

good
