
why loss_clf = F.cross_entropy(logits[:, self._known_classes:], fake_targets)? #7

Closed
chester-w-xie opened this issue May 26, 2022 · 3 comments

Comments

@chester-w-xie

Thank you very much for your excellent work.

In models/finetune.py, line 117:

fake_targets = targets - self._known_classes
loss_clf = F.cross_entropy(logits[:, self._known_classes:], fake_targets)
loss = loss_clf

why not:

loss_clf = F.cross_entropy(logits, targets)
loss = loss_clf
@G-U-N
Owner

G-U-N commented May 26, 2022

Good question. Just as we stated in the README file:

By default, weights corresponding to the outputs of previous classes are not updated.

This implementation avoids updating the prototypes of old categories. Therefore, training on the new classes will only corrupt the feature extractor, while the old prototypes are preserved.
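
For illustration, here is a minimal sketch (with made-up layer sizes and class counts, not taken from the repo) of why slicing the logits keeps the old-class rows of the classifier untouched by the gradient:

import torch
import torch.nn as nn
import torch.nn.functional as F

known_classes, total_classes = 5, 8              # hypothetical class counts
fc = nn.Linear(16, total_classes)                # joint classifier over old + new classes
features = torch.randn(4, 16)                    # fake batch of extracted features
targets = torch.randint(known_classes, total_classes, (4,))

logits = fc(features)
fake_targets = targets - known_classes           # shift labels into [0, num_new_classes)
loss = F.cross_entropy(logits[:, known_classes:], fake_targets)
loss.backward()

# The sliced-out columns never enter the loss, so the old-class rows of the
# weight matrix (and the old bias entries) receive exactly zero gradient.
print(fc.weight.grad[:known_classes].abs().max())   # tensor(0.)
print(fc.weight.grad[known_classes:].abs().max())   # non-zero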

@G-U-N
Owner

G-U-N commented May 26, 2022

But there is indeed a bug here, caused by weight decay, that I overlooked before. Even though the old prototypes are not used to calculate the final loss, they are still slightly changed by the weight decay. To understand this problem, run the following Python script. I will correct the bug later. And if you have any insights into this bug, you are welcome to contribute to our repo.

import torch
import torch.nn as nn

a = nn.Linear(10, 2)
optimizer = torch.optim.SGD(a.parameters(), lr=0.1, weight_decay=5e-4)
for i in range(1000):
    optimizer.zero_grad()
    output = a(torch.randn(10))
    output[0].backward()   # only the first output contributes to the "loss"
    optimizer.step()
    # Both bias entries shrink, even though output[1] never enters the loss:
    # weight decay alone keeps updating the unused parameters.
    print(a.bias.data)
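
One possible workaround, sketched here as an assumption rather than the fix that was eventually committed, is to snapshot the old-class rows of the classifier and restore them after each optimizer step, so weight decay cannot drift them (names and sizes below are made up):

import torch
import torch.nn as nn
import torch.nn.functional as F

known_classes, total_classes = 5, 8
fc = nn.Linear(16, total_classes)
optimizer = torch.optim.SGD(fc.parameters(), lr=0.1, weight_decay=5e-4)

# Snapshot the old prototypes before training on the new classes.
old_weight = fc.weight.data[:known_classes].clone()
old_bias = fc.bias.data[:known_classes].clone()

for _ in range(100):
    optimizer.zero_grad()
    logits = fc(torch.randn(4, 16))
    targets = torch.randint(known_classes, total_classes, (4,))
    loss = F.cross_entropy(logits[:, known_classes:], targets - known_classes)
    loss.backward()
    optimizer.step()
    # Undo the weight-decay drift on the old-class rows.
    fc.weight.data[:known_classes] = old_weight
    fc.bias.data[:known_classes] = old_bias

An alternative is to put the classifier into a separate parameter group with weight_decay=0, at the cost of also removing decay from the new-class weights.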

@zhoudw-zdw
Collaborator

The typical finetuning loss is the latter one you mentioned. We made some modifications to improve its performance.

But I think this does not influence the performance much, since finetuning is always the weakest baseline in incremental learning. Just replace it and give it a try.
