Train from scratch #7

Closed
stomachacheGE opened this issue Dec 19, 2020 · 1 comment

Comments

@stomachacheGE

Thanks for your work.

I have a question about training from scratch. I checked the source code, and it seems that position embedding is not implemented on its own: it can only be loaded from the pretrained models. If I want to train from scratch, should I implement position embedding myself, or did I overlook something? Is there anything else I should be careful about when training from scratch?

@jeonsworld
Owner

jeonsworld commented Dec 20, 2020

When training from scratch, the position embedding is the same as the one in the source code; the only difference is that it is not initialized from pretrained weights.
Following the author's code, the weights are instead initialized from a normal distribution (std=0.02).

The PyTorch implementation is as follows:

import torch
from torch import nn
position_embeddings = nn.Parameter(torch.empty(1, n_patches+1, hidden_size), requires_grad=True)
nn.init.normal_(position_embeddings, std=0.02)
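
To show where this parameter could sit when nothing is loaded from a checkpoint, here is a minimal sketch of a ViT-style embedding layer. The class name `PatchEmbeddings`, the `Conv2d` patch projection, the zero-initialized class token, and the default sizes are assumptions made for illustration and are not taken from the repository; only the shape `(1, n_patches+1, hidden_size)` and the `std=0.02` normal initialization come from the reply above.

```python
import torch
from torch import nn


class PatchEmbeddings(nn.Module):
    """Hypothetical minimal embedding layer, not the repository's own class."""

    def __init__(self, img_size=224, patch_size=16, in_channels=3, hidden_size=768):
        super().__init__()
        # Number of non-overlapping square patches in a square image.
        n_patches = (img_size // patch_size) ** 2
        # Assumption: patches are projected with a strided convolution.
        self.patch_embeddings = nn.Conv2d(
            in_channels, hidden_size, kernel_size=patch_size, stride=patch_size
        )
        # Assumption: the class token starts at zero.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden_size))
        # Learnable position embedding, one row per patch plus one for the class token,
        # initialized from a normal distribution with std=0.02 as described above.
        self.position_embeddings = nn.Parameter(
            torch.empty(1, n_patches + 1, hidden_size)
        )
        nn.init.normal_(self.position_embeddings, std=0.02)

    def forward(self, x):
        batch_size = x.shape[0]
        x = self.patch_embeddings(x)          # (B, hidden, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)      # (B, n_patches, hidden)
        cls = self.cls_token.expand(batch_size, -1, -1)
        x = torch.cat((cls, x), dim=1)        # (B, n_patches + 1, hidden)
        # The position embedding broadcasts over the batch dimension.
        return x + self.position_embeddings


# Usage with small illustrative sizes: 32x32 images, 8x8 patches -> 16 patches.
emb = PatchEmbeddings(img_size=32, patch_size=8, hidden_size=64)
out = emb(torch.randn(2, 3, 32, 32))
print(out.shape)  # torch.Size([2, 17, 64])
```

Because `position_embeddings` is an `nn.Parameter` registered on the module, it is picked up by `emb.parameters()` and updated by the optimizer like any other weight, so no extra handling should be needed when training without pretrained weights.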
