some questions about premodel #4

Closed
ld0714 opened this issue Feb 15, 2019 · 2 comments

ld0714 commented Feb 15, 2019

Thanks for your model. I'd like to ask about the specific details of how the premodel was trained.
Q1: What is the actual size of the dataset used to train the premodel?
Q2: How many GPUs were used for training, and how long did training take?
Q3: How many steps did pre-training run for?

Owner

Shun14 commented Feb 22, 2019

Sorry for the late answer.

  1. I used synth80k to train the pre-model.
  2. 1 GPU with batch size 16 (384 scale). I don't remember the precise time, maybe 24 hours.
  3. I chose this model at around 12w (120,000) steps.
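
For a rough sense of scale, the figures in the answer above can be turned into a total sample count and an average throughput. This is only a back-of-the-envelope sketch: the 24-hour duration is the author's own estimate, and the function name `training_summary` is made up for illustration, not part of the repository.

```python
# Figures quoted in the answer above. TRAIN_HOURS is an approximation
# ("maybe 24 hours"), so the derived throughput is approximate too.
STEPS = 12 * 10_000   # "12w": "w" is shorthand for 10,000, so 120,000 steps
BATCH_SIZE = 16
TRAIN_HOURS = 24


def training_summary(steps, batch_size, hours):
    """Derive total samples seen and average speed (hypothetical helper)."""
    seconds = hours * 3600
    return {
        "samples_seen": steps * batch_size,
        "steps_per_second": steps / seconds,
        "samples_per_second": steps * batch_size / seconds,
    }


summary = training_summary(STEPS, BATCH_SIZE, TRAIN_HOURS)
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

With these inputs the model has seen 1,920,000 training samples (counting repeats) at roughly 1.4 steps per second on the single GPU.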

Shun14 closed this as completed Feb 22, 2019

@novioleo

> Sorry for the late answer.
>
>   1. I used synth80k to train the pre-model.
>   2. 1 GPU with batch size 16 (384 scale). I don't remember the precise time, maybe 24 hours.
>   3. I chose this model at around 12w steps.

Foreigners won't understand "12w" ("w" is Chinese shorthand for 万, ten thousand)... lol.
