What's the main purpose of training stage_1.py? #22
Comments
@jchhuang Modulated attention depends largely on a good initial feature map. If we use attention with randomly initialized weights, the results will be much lower because the attention won't work properly. The same holds for the meta embedding, which relies on a well-initialized memory. If the memory is randomly initialized, the whole training process will break. The cosine classifier is embedded in the meta-embedding classifier, so it isn't used in stage 1.
@zhmiao Thanks for your reply. As I understand it, the memory isn't trained in stage 1; it is initialized from the stage 1 results at the beginning of stage 2 and finally updated by vmeta. Is that right?
@jchhuang Yes, the memory is not trained in stage 1. However, without any trained weights, there is no proper feature space that can generate a good memory, which is constructed from the feature centroids of each class. Once stage 1 is finished, we can use the pretrained weights to initialize the memory, because the feature space is no longer random.
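For readers following along, here is a minimal sketch of what "memory as feature centroids of each class" could look like. It is not the repository's code: the function name `class_centroids` and the toy feature values are hypothetical, and plain Python lists stand in for the features a stage 1 backbone would produce.

```python
from collections import defaultdict


def class_centroids(features, labels):
    """Return {label: mean feature vector} over the given samples.

    Hypothetical helper for illustration; not taken from the repository.
    """
    sums = {}
    counts = defaultdict(int)
    for feat, label in zip(features, labels):
        # Start a zero accumulator the first time a class is seen.
        if label not in sums:
            sums[label] = [0.0] * len(feat)
        for i, value in enumerate(feat):
            sums[label][i] += value
        counts[label] += 1
    # Divide each per-class sum by the number of samples in that class.
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


# Toy values standing in for features extracted with stage 1 weights.
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
labels = [0, 0, 1]
memory = class_centroids(features, labels)
```

With a randomly initialized backbone, the features of one class would not cluster, so these per-class means would carry little information. That is the reason given above for running stage 1 first.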
Thanks for the good explanation!
In my understanding, modulated attention, dynamic meta-embedding, and the cosine classifier are all unused in stage_1, so my question is: what is the main purpose of training stage_1.py? Is it just to fine-tune the ResNet152?