bfloat16 support for quickgelugrad #18336

Merged
prathikr merged 3 commits into main from prathikrao/quickgelugrad-bfloat16
Nov 8, 2023
Conversation

@prathikr
Contributor

@prathikr prathikr commented Nov 7, 2023

Description

Registers the BFloat16 datatype as a valid input type for the CUDA QuickGeluGrad kernel.

Motivation and Context

Enables `meta-llama/Llama-2-70b` to be fine-tuned with ONNX Runtime training.
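For context, QuickGelu approximates GELU as x·σ(αx) with α = 1.702, and QuickGeluGrad computes its derivative for the backward pass. A minimal NumPy sketch of the math the kernel implements (float32 stands in for bfloat16, since NumPy has no native bfloat16 dtype; function names here are illustrative, not the ORT API):

```python
import numpy as np

ALPHA = 1.702  # scaling constant used by the QuickGelu approximation


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def quick_gelu(x):
    # Forward: QuickGelu(x) = x * sigmoid(alpha * x)
    return x * sigmoid(ALPHA * x)


def quick_gelu_grad(dy, x):
    # Backward: d/dx [x * s(ax)] = s(ax) + x * a * s(ax) * (1 - s(ax)),
    # chained with the incoming gradient dy.
    s = sigmoid(ALPHA * x)
    return dy * (s + x * ALPHA * s * (1.0 - s))


x = np.array([-1.0, 0.0, 1.0], dtype=np.float32)
print(quick_gelu_grad(np.ones_like(x), x))
```

The PR itself only extends the kernel's type registration so that tensors of this dtype are accepted; the gradient formula above is unchanged across float16, float32, and bfloat16.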

@prathikr prathikr requested a review from hanbitmyths November 8, 2023 05:04
@prathikr prathikr merged commit 34f77ea into main Nov 8, 2023
@prathikr prathikr deleted the prathikrao/quickgelugrad-bfloat16 branch November 8, 2023 16:40
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024