What does 'importance-weighted' mean in the paper? #4
Comments
Thank you for the message :) What you are referring to is the "data-weighting" vector, which weights the contribution of each data point in the loss function. This is important for class-balancing in some methods, where examples in the coreset are weighted with higher importance. The "importance-weighted" distillation in our method is done with a frozen copy of the linear layer from the most recent task model. This weights the contribution of each feature in the embedding space. This is done in lines 340 and 341 in datafree.py. Does this answer your question?
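A minimal sketch of what per-sample "data-weighting" might look like (all names here are illustrative, not the repo's actual code): each example's loss term is scaled by a weight looked up from a per-class vector, so an all-ones vector leaves the loss unchanged.

```python
import torch
import torch.nn.functional as F

def weighted_ce(logits, targets, dw_k):
    # Per-sample cross-entropy, each term scaled by a class-indexed weight.
    per_sample = F.cross_entropy(logits, targets, reduction='none')
    weights = dw_k[targets]          # weight for each data point, by its class
    return (per_sample * weights).mean()

torch.manual_seed(0)
logits = torch.randn(4, 10)
targets = torch.tensor([0, 1, 2, 3])
dw_k = torch.ones(10)                # all-ones vector: no re-weighting

loss_weighted = weighted_ce(logits, targets, dw_k)
loss_plain = F.cross_entropy(logits, targets)
```

With the all-ones vector, `loss_weighted` and `loss_plain` coincide, which is why the default vector is effectively a no-op kept for implementation consistency.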
Thanks for the reply! For the CIFAR-100 dataset, the "data-weighting" vector is [1,1,1,...,1] (in default.py). Does that mean the data points are not weighted? And for the "importance-weighted" part in lines 340 and 341 of datafree.py, are the weights logits_pen[kd_index] and input[kd_index]? Thanks again
Yes, the default [1,1,1,...,1] simply means no class-balancing weighting. It is there for implementation consistency with the methods that have non-ones in this vector. The comment about 340 and 341 is nearly correct. logits_KD_past is the old task's features passed through the old task's linear classification head. logits_KD is the new task's features passed through the old task's linear classification head. The intuition here is that we focus on feature distillation, but we want to penalize changes to the features which were most important to the previous task. We can weight the importance of these features by using the linear head! self.previous_linear is the weighting vector. logits_pen is the penultimate feature representation from the current task's model. self.previous_teacher.generate_scores_pen(x) generates the penultimate feature representation from the previous task's model.
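A sketch of the importance-weighted feature distillation described above (dimensions and variable names are illustrative, not copied from datafree.py): both the current and the previous model's penultimate features are passed through a frozen copy of the old task's linear head, and the distillation loss is taken between the two projections.

```python
import torch
import torch.nn as nn

feat_dim, n_old_classes, batch = 64, 10, 8

# Frozen copy of the previous task's linear classification head.
previous_linear = nn.Linear(feat_dim, n_old_classes)
for p in previous_linear.parameters():
    p.requires_grad = False

# Stand-ins for the two models' penultimate features.
feats_new = torch.randn(batch, feat_dim, requires_grad=True)  # current model
feats_old = torch.randn(batch, feat_dim)                      # previous model

logits_KD = previous_linear(feats_new)       # new features through old head
logits_KD_past = previous_linear(feats_old)  # old features through old head

# Penalize feature drift, weighted by how much each feature direction
# mattered to the old task (encoded in the old head's weights).
kd_criterion = nn.MSELoss(reduction='none')
loss_kd = kd_criterion(logits_KD, logits_KD_past).sum(dim=1).mean()
```

The key design point is that the head is frozen: it acts purely as a fixed importance weighting over feature dimensions, not as a trainable layer.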
Great, nice idea! In lines 338 and 340 of datafree.py, I don't understand what logits_pen[kd_index] means? Thanks again.
Thanks! logits_pen is the penultimate distribution because the pen=True flag returns the penultimate features. kd_index selects the data which we are using for distillation. For our method, we use all data; for other methods (such as DGR), we would only perform distillation over the synthetic/generated data. This is simply for syntax consistency with other implemented methods in our framework :) If you only wanted to use real data for distillation, you would set kd_index = np.arange(self.batch_size). If you only wanted to use the synthetic data, you would set kd_index = np.arange(self.batch_size, 2*self.batch_size).
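A quick sketch of the index arithmetic described above, assuming the combined batch stacks `batch_size` real examples first and `batch_size` synthetic examples after them (the layout is taken from the comment; the variable names are illustrative):

```python
import numpy as np

batch_size = 32

kd_index_real = np.arange(batch_size)                   # rows 0..31: real data only
kd_index_synth = np.arange(batch_size, 2 * batch_size)  # rows 32..63: synthetic only
kd_index_all = np.arange(2 * batch_size)                # rows 0..63: both (as in this method)
```

Indexing `logits_pen[kd_index]` then simply selects the subset of rows (examples) over which the distillation loss is computed.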
Thanks for your reply! Regarding logits_pen = self.model.forward(x=inputs, pen=True): I noticed the pen=True parameter before. In my opinion, the model is composed of a feature extractor and a classifier, by the way.
Thanks! :) You can find the pen=True part of the code in the model file: look in models/resnet.py. I am not sure what your last comment is asking. The importance-weighting part is in a different part of the code, where the penultimate features are passed through self.previous_linear.
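To illustrate the pen=True idea in isolation (this is a toy sketch, not the ResNet from models/resnet.py): the model is a feature extractor followed by a linear classifier, and the pen flag decides whether forward returns the penultimate features or the class logits.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self, in_dim=32, feat_dim=16, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.last = nn.Linear(feat_dim, n_classes)

    def forward(self, x, pen=False):
        feats = self.features(x)
        if pen:
            return feats        # penultimate representation (before classifier)
        return self.last(feats) # class logits

net = TinyNet()
x = torch.randn(4, 32)
feats = net(x, pen=True)   # shape (4, 16): embedding-space features
logits = net(x)            # shape (4, 10): classifier output
```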
Thanks! :)
Nice work, but I have some small questions.
What does 'importance-weighted' mean in the paper?
In the code,
### def data_weighting() in default.py,
I find self.dw_k is [1,1,1,1,...,1]
### class AlwaysBeDreaming(DeepInversionGenBN) in datafree.py
the loss_kd = self.kd_criterion(logits_KD, logits_KD_past).sum(dim=1) * dw_KD
dw_KD is also [1,1,1,...,1]
It seems not useful, and I don't understand where 'importance-weighted' or 'data weighting' happens?