Hello,
It would be great to have a training mode or argument that enables FP16 (mixed-precision) mode, for better performance when training on CUDA.
I use this training code for regular Hugging Face training:
```python
scaler = torch.cuda.amp.GradScaler()

def train_fp16(epoch):
    model.train()
    for _, data in enumerate(training_loader, 0):
        ids = data['ids'].to(device, dtype=torch.long)
        mask = data['mask'].to(device, dtype=torch.long)
        token_type_ids = data['token_type_ids'].to(device, dtype=torch.long)
        targets = data['targets'].to(device, dtype=torch.float)

        optimizer.zero_grad()
        # Run the forward pass and loss in mixed precision.
        with torch.cuda.amp.autocast():
            outputs = model(ids, mask, token_type_ids)
            loss = loss_fn(outputs, targets)
        # if _ % 1000 == 0:
        #     print(f'Epoch: {epoch}, Loss: {loss.item()}')
        # Scale the loss, backprop, then unscale and step the optimizer.
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()

def train_fp32(epoch):
    model.train()
    for _, data in enumerate(training_loader, 0):
        ids = data['ids'].to(device, dtype=torch.long)
        mask = data['mask'].to(device, dtype=torch.long)
        token_type_ids = data['token_type_ids'].to(device, dtype=torch.long)
        targets = data['targets'].to(device, dtype=torch.float)

        optimizer.zero_grad()
        outputs = model(ids, mask, token_type_ids)
        loss = loss_fn(outputs, targets)
        if _ % 1000 == 0:
            print(f'Epoch: {epoch}, Loss: {loss.item()}')
        loss.backward()
        optimizer.step()
```
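For reference, here is a minimal, self-contained sketch of the same AMP pattern that runs end to end. The tiny linear model, random data, and hyperparameters are placeholders, not part of the actual training setup above; `enabled=use_amp` makes `autocast`/`GradScaler` degrade to no-ops on CPU so the same loop works with or without CUDA:

```python
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
use_amp = torch.cuda.is_available()  # AMP only pays off on CUDA

# Placeholder model, optimizer, loss, and data for illustration only.
model = nn.Linear(8, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

x = torch.randn(16, 8, device=device)
y = torch.randn(16, 1, device=device)

for step in range(3):
    optimizer.zero_grad()
    # Forward pass and loss under autocast (fp16 on CUDA, fp32 otherwise).
    with torch.cuda.amp.autocast(enabled=use_amp):
        loss = loss_fn(model(x), y)
    # Scaled backward pass; scaling is a pass-through when AMP is disabled.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```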
Thanks!
Thanks for the suggestion! I'll look into it. I agree with adding more features to reduce memory consumption.
Once again, thanks for this suggestion. I just published a new version that allows you to enable fp16.