
Gradient analysis #3

Closed
Lilyo opened this issue Mar 31, 2020 · 4 comments

Comments


Lilyo commented Mar 31, 2020

Hi, @tztztztztz!
How do I collect the average L2 norm of the gradient of the weights? Could you provide the detailed steps for generating Fig. 1 in the paper? (It would be even better if there were source code!)

Thanks a lot!

tztztztztz (Owner) commented Apr 2, 2020

You can use an extra weight term to filter out the gradients you don't want to collect.

For example, suppose the number of samples is 5 and the number of classes is 4, and the ground-truth labels are [0, 1, 2, 3, 3]. If you want to collect gradients from only the positive samples, the weight would be:

1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
0 0 0 1

Then you can collect the L2 norm of the weight gradients after each backward pass using a hook.
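
A minimal PyTorch sketch of both pieces, assuming the last classifier layer is an `nn.Linear` whose weight has shape `(num_classes, feat_dim)`; the names `gt_label`, `classifier`, `grad_norms`, and `record_grad` are illustrative placeholders, not from this repository:

```python
import torch
import torch.nn.functional as F

# The example above: 5 samples, 4 classes, gt_label = [0, 1, 2, 3, 3].
gt_label = torch.tensor([0, 1, 2, 3, 3])
num_classes = 4

# Positive-sample weight: a one-hot mask, 1 at each sample's ground-truth class.
pos_weight = F.one_hot(gt_label, num_classes).float()

# Record the per-class L2 norm of the classifier weight's gradient after
# every backward pass. `classifier` stands in for the model's last nn.Linear.
grad_norms = []

def record_grad(grad):
    # grad has the same shape as classifier.weight, i.e. (num_classes, feat_dim);
    # take the L2 norm of each class's row.
    grad_norms.append(grad.norm(dim=1).detach().cpu())

handle = classifier.weight.register_hook(record_grad)
```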

Similarly, if you want to collect gradients from only the negative samples, the weight would be:

0 1 1 1
1 0 1 1
1 1 0 1
1 1 1 0
1 1 1 0
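
Continuing the earlier sketch, this negative mask is just the complement of `pos_weight`:

```python
# Complement of the positive mask: 0 at each sample's ground-truth class, 1 elsewhere.
neg_weight = 1.0 - pos_weight
```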

So, the detailed steps are:

  1. Determine what type of gradient you want to collect and build the corresponding weight.
  2. Resume the model from a checkpoint.
  3. Run the model for several epochs and collect the L2 norm after each backward pass (you may want to avoid updating the model by setting the learning rate to 0); a sketch follows below.
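
A hypothetical driver for steps 2 and 3, continuing the earlier sketch; `model`, `data_loader`, `num_classes`, the checkpoint path, and `grad_norms` are placeholder names, not from this repository:

```python
import torch
import torch.nn.functional as F

# Step 2: resume from a checkpoint (path and format are placeholders).
model.load_state_dict(torch.load('checkpoint.pth'))
model.train()

# lr=0 means optimizer.step(), if called at all, leaves the weights unchanged.
optimizer = torch.optim.SGD(model.parameters(), lr=0.0)

# Step 3: run the data through the fixed model; the hook registered earlier
# records the per-class gradient L2 norms on every backward pass.
for images, gt_label in data_loader:
    optimizer.zero_grad()
    logits = model(images)
    targets = F.one_hot(gt_label, num_classes).float()
    weight = targets  # positive mask; use 1 - targets for negative samples
    loss = F.binary_cross_entropy_with_logits(logits, targets, reduction='none')
    (loss * weight).sum().backward()

# Average the recorded per-class norms over all backward passes (as in Fig. 1).
avg_norms = torch.stack(grad_norms).mean(dim=0)  # shape: (num_classes,)
```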

Lilyo (Author) commented Apr 2, 2020

It's clear now! But I'm not sure if my understanding is correct. Based on your description, I feed data into a model with frozen weights, then collect the gradients of the (positive/negative) samples at the last classifier layer by using the labels as class-wise weights. Is that right?
Thank you for your reply!

tztztztztz (Owner) commented:

I'm not sure what "using the labels as class-wise weights" means.
You should first calculate the binary cross-entropy loss (or the EQL loss) and multiply it by the weight term I mentioned above. Then you backpropagate the loss and collect the L2 norm of the gradients for each class.
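
A hedged sketch of that computation, assuming sigmoid-based per-class logits; `weighted_loss` and its argument names are illustrative, not the repository's API:

```python
import torch.nn.functional as F

def weighted_loss(logits, gt_label, weight):
    # One-hot targets for per-class binary cross-entropy (sigmoid-based).
    targets = F.one_hot(gt_label, logits.size(1)).float()
    # Element-wise BCE with no reduction, shape (num_samples, num_classes).
    loss = F.binary_cross_entropy_with_logits(logits, targets, reduction='none')
    # Multiply by the positive/negative mask, then reduce; backward as usual.
    return (loss * weight).sum()
```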

Lilyo (Author) commented Apr 3, 2020

I will try to reproduce the results, thanks.
