How to use the code for ADI? #27

Open
theAverageArchit opened this issue Apr 5, 2023 · 2 comments

Comments

@theAverageArchit

I am having trouble figuring out how to use the code to perform ADI. Which settings need to be set to enable it?

@tallesbrito

Hi @theAverageArchit ,

I found that in the ImageNet implementation in this repository, you can set --adi_scale to a non-zero value (e.g. --adi_scale=0.2). This option loads a pre-trained ResNet18 as the student and enables the ADI loss computation. In this case, the ResNet18 weights are fixed (no gradient is backpropagated into the ResNet18).
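For readers unfamiliar with the ADI term itself, here is a minimal sketch of the competition loss as I understand it from the paper: one minus the Jensen-Shannon divergence between the teacher and student predictions, so that minimizing it pushes the synthesized image toward regions where the two models disagree. This is not the repository's code. The function names and the `temperature` parameter are my own, and using the unnormalized natural-log JS divergence is an assumption.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in nats, bounded by ln(2)."""
    mix = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mix) + 0.5 * kl_divergence(q, mix)


def adi_compete_loss(teacher_logits, student_logits, temperature=1.0):
    """Hypothetical helper: 1 - JS(teacher, student).

    The value is 1.0 when both models agree exactly and gets smaller
    as their predictions diverge.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return 1.0 - js_divergence(p_teacher, p_student)
```

Presumably this term is multiplied by the --adi_scale value and added to the other image-synthesis losses, which would explain why setting the scale to zero disables it. With a frozen student, only the input image receives gradients from this term.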

However, it seems to me that this is different from what is described in the paper: in the paper, the student model is initialized from scratch and is modified by gradients.

Can anyone explain this?

After all, is there a full implementation of ADI in this repository? Wouldn't I need to use knowledge distillation from teacher to student in order to enable ADI, as described in the paper?
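To make the difference concrete, here is a toy sketch of what a student update would change. It is only an illustration under strong simplifying assumptions: the "student" is just a vector of logits for a single input rather than a network, and `distill_step` is a hypothetical helper, not something from this repository. It takes gradient steps on KL(teacher || student), whose gradient with respect to the student logits is `p_student - p_teacher`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def distill_step(student_logits, teacher_logits, lr=0.5):
    """One gradient-descent step on KL(teacher || student).

    Toy assumption: the student's logits are themselves the trainable
    parameters, so the gradient is simply p_student - p_teacher.
    """
    p_teacher = softmax(teacher_logits)
    p_student = softmax(student_logits)
    return [
        z - lr * (ps - pt)
        for z, ps, pt in zip(student_logits, p_student, p_teacher)
    ]


def disagreement(student_logits, teacher_logits):
    """How far the student's prediction is from the teacher's."""
    return kl_divergence(softmax(teacher_logits), softmax(student_logits))


teacher = [3.0, 0.0, -1.0]
frozen_student = [0.0, 2.0, 0.0]   # never updated, as with fixed weights
trained_student = list(frozen_student)
for _ in range(20):
    trained_student = distill_step(trained_student, teacher)

print("frozen :", disagreement(frozen_student, teacher))
print("trained:", disagreement(trained_student, teacher))
```

In this toy setting the frozen student disagrees with the teacher on this input exactly as much as before, while the distilled student has closed most of the gap. If the student were updated this way between synthesis rounds, the image optimizer would have to look for new inputs on which the two models still disagree.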

@unistdJRZ


I fully agree with what @tallesbrito is saying: there is no gradient update for the student model, which means there is no way to progressively generate new samples. That differs from the paper.
