Adding DINO to lightly #698

Closed
Atharva-Phatak opened this issue Feb 22, 2022 · 11 comments

@Atharva-Phatak
Contributor

First of all, kudos on creating this amazing library ❤️.

I think all of us in the self-supervised learning community have heard of DINO. For the past couple of weeks, I have been trying to port Facebook's DINO implementation to PyTorch Lightning. I have implemented it, at least for my use case, which is a Kaggle competition. I initially looked at lightly for an implementation but did not find one, so I borrowed and adapted code from the original implementation and converted it to Lightning.

Honestly, it was a tedious task. I was wondering if you would be interested in adding DINO to lightly.

Here's how I think we should structure the implementation:

  1. Augmentations: DINO relies heavily on the multi-crop strategy. Since lightly already uses a collate_fn for implementing augmentations, the DINO augmentations can be implemented the same way.
  2. Model forward pass and model heads: The forward pass is unusual since we have to deal with both local (multi-crop) and global crops, so this needs to be implemented as an nn.Module like the other heads in lightly (a rough sketch follows this list).
  3. Loss function: I am not sure how this should be implemented in lightly, although FB has a custom class for it.
  4. Utility functions: FB uses some tricks for stable training and results, so these need to be included as well.
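To make point 2 concrete, here is a minimal sketch of how the multi-crop forward pass could look. The class and attribute names are placeholders, not lightly's API:

import torch
from torch import nn

class DINOWrapper(nn.Module):
    # Hypothetical wrapper around a backbone and a projection head.
    def __init__(self, backbone: nn.Module, head: nn.Module):
        super().__init__()
        self.backbone = backbone
        self.head = head

    def forward(self, crops):
        # crops: a list of image batches (2 global crops + several local crops).
        # The student sees all crops, the teacher only the global ones, so the
        # caller decides which sublist of crops to pass in.
        outputs = [self.head(self.backbone(crop).flatten(start_dim=1)) for crop in crops]
        return torch.cat(outputs)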

I have used Lightning to implement all of this and, due to my hardware constraints, so far I have been able to train vit-base-16 for my use case.

It goes without saying that I would personally like to work on the PR ❤️

Please let me know.

@guarin
Contributor

guarin commented Feb 22, 2022

Hi @Atharva-Phatak! Thanks for reaching out!

We recently implemented DINO but have not announced it yet. All the code should be available in the latest release (1.2.7), and you can find the docs on how to use it here: https://docs.lightly.ai/examples/dino.html

We implemented the following parts: the multi-crop augmentations (DINOCollateFunction), the projection head (DINOProjectionHead), and the loss (DINOLoss), plus the momentum update utilities in lightly.models.utils.
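Condensed from the linked example, usage looks roughly like this (module names are taken from the docs; the constructor arguments shown here are illustrative and may differ between versions):

import copy
import torch
import torchvision
from lightly.data import DINOCollateFunction
from lightly.loss import DINOLoss
from lightly.models.modules import DINOProjectionHead
from lightly.models.utils import deactivate_requires_grad

# ResNet-18 backbone with the classification layer removed (512-dim features).
resnet = torchvision.models.resnet18()
student_backbone = torch.nn.Sequential(*list(resnet.children())[:-1])
student_head = DINOProjectionHead(512, 512, 64, 2048, freeze_last_layer=1)

# The teacher is a momentum copy of the student and receives no gradients.
teacher_backbone = copy.deepcopy(student_backbone)
teacher_head = DINOProjectionHead(512, 512, 64, 2048)
deactivate_requires_grad(teacher_backbone)
deactivate_requires_grad(teacher_head)

criterion = DINOLoss(output_dim=2048)
collate_fn = DINOCollateFunction()  # multi-crop augmentations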

I hope this helps you! Let us know if you have any questions or feedback :)

@Atharva-Phatak
Contributor Author

So cool, amazing work! The code looks so clean.

I was thinking maybe we could add utilities to visualize attention maps for ViT models trained using SSL, just like DINO did with the cool video they released.

Adding the visualization would be the cherry on the cake, in my opinion.

What do you think? I am very motivated to contribute to lightly!

@guarin
Contributor

guarin commented Feb 23, 2022

We thought about making a tutorial on how to use DINO with a transformer and visualize the results. Let us know if you would be interested in it. Contributions are of course always very welcome!

@Atharva-Phatak
Contributor Author

Yes, I am interested. I can make a tutorial which includes the visualization; that will be fun to do. Could you please give me some instructions on how to proceed?

@guarin
Contributor

guarin commented Feb 24, 2022

Hi, that sounds great!

As an outline I would propose the following two steps:

  1. Show how to train DINO with a transformer (you can take the code from the examples)
  2. Visualize the transformer self-attention for some images

For the visualization, I would maybe refactor the original code into one or two easy-to-use functions. You could of course also use another tool for the visualization.
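Such a helper could look roughly like the sketch below; get_last_selfattention comes from the DINO hub model, while the function name and plotting details are just an illustration:

import torch
import matplotlib.pyplot as plt

def show_self_attention(model, image, patch_size=16, head=0):
    # image: a (1, 3, H, W) tensor with H and W divisible by patch_size.
    with torch.no_grad():
        attn = model.get_last_selfattention(image)  # (1, num_heads, N + 1, N + 1)
    h = image.shape[2] // patch_size
    w = image.shape[3] // patch_size
    # Attention of the CLS token to every image patch, for one attention head.
    cls_attn = attn[0, head, 0, 1:].reshape(h, w)
    plt.imshow(cls_attn.cpu(), cmap="inferno")
    plt.axis("off")
    plt.show()

# Hypothetical usage with the DINO hub model:
# model = torch.hub.load('facebookresearch/dino:main', 'dino_vits16')
# show_self_attention(model, image_tensor)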

I guess a Jupyter notebook or Google Colab would be the easiest way to share the code, so you don't have to build our docs from scratch. Let me know what you think :)

@Atharva-Phatak
Contributor Author

Atharva-Phatak commented Feb 24, 2022

That sounds good, I should have that up and running quickly. Any datasets you would recommend? Maybe CIFAR-10?

Also, a quick question: in the Lightning implementation of DINO you seem to update the momentum in the training step, but shouldn't it be done in the on_train_batch_end hook in Lightning? Even in the original implementation they do it after the optimizer step. Also, they have a training strategy where they cancel the gradients of the last layer; any plans on implementing that?

Please let me know.

@guarin
Contributor

guarin commented Feb 25, 2022

Any datasets you would recommend? Maybe CIFAR-10?

CIFAR-10 looks good. We can always change the dataset if it does not work well. Imagenette would be another option, as it has larger images but fewer images than ImageNet.

Edit: I am actually not sure if CIFAR-10 works with the ViT backbone, as it expects 224x224 images and 16x16 patches. So we would probably have to go for Imagenette.

Edit 2: Never mind, you can set the image size to 32 when loading the backbone: torch.hub.load('facebookresearch/dino:main', 'dino_vits16', pretrained=False, image_size=[32])

Also, a quick question: in the Lightning implementation of DINO you seem to update the momentum in the training step, but shouldn't it be done in the on_train_batch_end hook in Lightning? Even in the original implementation they do it after the optimizer step.

This should not change anything, as we still run the update between two optimizer steps. Whether it happens right before a step or right after a step does not matter.
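Schematically (update_momentum is from lightly.models.utils; the attribute names below are placeholders):

import pytorch_lightning as pl
from lightly.models.utils import update_momentum

class DINO(pl.LightningModule):
    def training_step(self, batch, batch_idx):
        # Variant used in the example: update the teacher at the start of the
        # training step, i.e. right before the upcoming optimizer step.
        update_momentum(self.student_backbone, self.teacher_backbone, m=0.99)
        update_momentum(self.student_head, self.teacher_head, m=0.99)
        # ... compute and return the DINO loss ...

    # Moving the two update_momentum calls into on_train_batch_end (i.e. right
    # after the optimizer step) gives the same schedule: exactly one momentum
    # update between consecutive optimizer steps.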

Also, they have a training strategy where they cancel the gradients of the last layer; any plans on implementing that?

Adding it makes training a bit more stable. For simplicity we did not add it, but it might make sense to add it as an extra method on the DINOHead 🤔:

class DINOHead(nn.Module):
    ...
    def cancel_gradient_last_layer(self):
        # Drop the gradients of the final layer so the optimizer skips it.
        for param in self.last_layer.parameters():
            param.grad = None

and then we could simply call it in the training loop during the first epoch.
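One possible place in a LightningModule is the on_after_backward hook, which runs after the backward pass and before the optimizer step (the attribute name student_head below is a placeholder):

    def on_after_backward(self):
        # Gradients exist at this point, so during the first epoch we can drop
        # the gradients of the head's last layer before the optimizer step.
        if self.current_epoch == 0:
            self.student_head.cancel_gradient_last_layer()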

@Atharva-Phatak
Contributor Author

@guarin Maybe I can create a small PR ❤️ to add cancel_gradient_last_layer to the DINOHead? Then I will create an example which uses everything.

@guarin
Contributor

guarin commented Feb 28, 2022

Yes that would be great!

@Atharva-Phatak
Contributor Author

Awesome, I will create a PR :)

@philippmwirth
Contributor

I think we can close this one 🙂
