
Can I use bfloat16 when training? #114

Open

yangyuya opened this issue Jul 1, 2024 · 1 comment


yangyuya commented Jul 1, 2024

I see the code uses torch.float by default:

    model = VideoChatGPTLlamaForCausalLM.from_pretrained(
        model_args.model_name_or_path,
        cache_dir=training_args.cache_dir,
        # torch_dtype=torch.bfloat16 if training_args.bf16 else torch.float,
    )

Can I use bfloat16 when training? I find that with bfloat16 I can train on 24 GB GPUs, but I'm not sure how much this affects model performance. Can you give me some advice?

mmaaz60 (Member) commented Jul 8, 2024

Hi @yangyuya,

Thank you for your interest in our work. I have not tried it myself, though training in bf16 mode should work and give similar results.
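
For reference, a minimal sketch of the change, assuming the training script parses model_args and training_args with a HuggingFace-style argument parser so that training_args.bf16 is populated by a --bf16 flag (the change is essentially uncommenting the torch_dtype line from the snippet above):

    import torch

    # Load the model weights in bfloat16 when --bf16 is passed,
    # otherwise fall back to full float32.
    # model_args and training_args are assumed to come from the
    # script's parsed command-line arguments, as in the snippet above.
    model = VideoChatGPTLlamaForCausalLM.from_pretrained(
        model_args.model_name_or_path,
        cache_dir=training_args.cache_dir,
        torch_dtype=torch.bfloat16 if training_args.bf16 else torch.float,
    )

As a side note, bfloat16 keeps float32's exponent range but has fewer mantissa bits, so it rarely needs loss scaling and typically tracks full-precision results closely, which is consistent with the expectation above.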
