added mlp and attn bias option to flash and paged llama models #85

JRosenkranz · 2024-05-06T15:35:59Z

Motivation

The Calico models set the MLP and attention bias to true, but the flash and paged Llama implementations hard-coded both to false. This change reads the config parameters set in huggingface/transformers#30031 and uses them to set those values properly.

Modifications

- Added attention_bias and mlp_bias to the config for the Flash and Paged Llama implementations (both default to False)
- Set the bias in the attention and MLP layers from the config values
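
As a rough illustration of the two modifications, the sketch below models a config that carries the two flags with a default of False, and lists the parameter names one decoder layer would be expected to hold for a given setting. `LlamaBiasConfig` and `layer_param_names` are hypothetical names invented for this example and are not taken from the PR; the projection names assume the unfused Hugging Face Llama checkpoint layout, which the flash and paged implementations may not mirror exactly (for instance if they fuse projections).

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, List


@dataclass
class LlamaBiasConfig:
    """Hypothetical stand-in for the model config; holds only the two flags."""

    attention_bias: bool = False  # False matches the previously hard-coded value
    mlp_bias: bool = False

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "LlamaBiasConfig":
        # Ignore unrelated keys; flags missing from the dict keep their defaults.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})


# Assumption: unfused projection names as used in Hugging Face Llama checkpoints.
ATTN_PROJS = ("q_proj", "k_proj", "v_proj", "o_proj")
MLP_PROJS = ("gate_proj", "up_proj", "down_proj")


def layer_param_names(config: LlamaBiasConfig, layer: int) -> List[str]:
    """List the tensors one decoder layer's projections would own."""
    prefix = f"model.layers.{layer}"
    names: List[str] = []
    for proj in ATTN_PROJS:
        names.append(f"{prefix}.self_attn.{proj}.weight")
        if config.attention_bias:
            names.append(f"{prefix}.self_attn.{proj}.bias")
    for proj in MLP_PROJS:
        names.append(f"{prefix}.mlp.{proj}.weight")
        if config.mlp_bias:
            names.append(f"{prefix}.mlp.{proj}.bias")
    return names
```

With this sketch, a config dictionary that omits both flags behaves as the implementations did before the change (weights only), while a config with both flags set to true adds one bias tensor per projection.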

Result

Models whose checkpoints contain attention and MLP bias should now load properly.
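
The PR does not describe a test procedure. One possible sanity check, sketched here under assumptions, is to compare the tensor names found in a checkpoint against the two config flags before loading. `bias_flag_mismatches` is a hypothetical helper written for this example, and the name matching assumes Hugging Face style tensor names such as `model.layers.0.self_attn.q_proj.bias`.

```python
from typing import Dict, Iterable, List


def bias_flag_mismatches(tensor_names: Iterable[str], config: Dict[str, bool]) -> List[str]:
    """Report disagreements between checkpoint bias tensors and config flags.

    Hypothetical helper: a flag missing from the config is treated as False,
    matching the default described in this PR.
    """
    names = list(tensor_names)
    has_attn_bias = any(".self_attn." in n and n.endswith(".bias") for n in names)
    has_mlp_bias = any(".mlp." in n and n.endswith(".bias") for n in names)

    problems: List[str] = []
    if has_attn_bias != bool(config.get("attention_bias", False)):
        problems.append("attention_bias does not match checkpoint contents")
    if has_mlp_bias != bool(config.get("mlp_bias", False)):
        problems.append("mlp_bias does not match checkpoint contents")
    return problems


# Usage: a checkpoint with bias tensors but a config that omits both flags.
checkpoint_names = [
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.self_attn.q_proj.bias",
    "model.layers.0.mlp.up_proj.weight",
    "model.layers.0.mlp.up_proj.bias",
]
for problem in bias_flag_mismatches(checkpoint_names, {}):
    print(problem)
```

An empty result means the flags and the checkpoint agree; it says nothing about whether the loaded model produces correct outputs.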

Related Issues

NA

Signed-off-by: Joshua Rosenkranz <jmrosenk@us.ibm.com>

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

njhill

Thanks @JRosenkranz @joerunde!

…atahub-io#85)

#### Motivation

The `Calico` models currently set the mlp and attention bias to true, which was hard-coded to false in flash and paged llama implementations. This will use the config params set in huggingface/transformers#30031 to set those values properly.

#### Modifications

- added attention_bias, mlp_bias to config for Flash and Paged Llama implementations (default is False)
- set bias in attention and mlp to the config value

#### Result

Models should be able to load properly if containing attention and mlp bias

---------

Signed-off-by: Joshua Rosenkranz <jmrosenk@us.ibm.com>
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Co-authored-by: Joe Runde <Joseph.Runde@ibm.com>

added mlp and attn bias option to flash and paged llama models (opendatahub-io#85)

added mlp and attn bias to flash and paged llama models

7d57879

Signed-off-by: Joshua Rosenkranz <jmrosenk@us.ibm.com>

JRosenkranz requested review from tdoublep and njhill May 6, 2024 15:36

JRosenkranz self-assigned this May 6, 2024

🐛 add lm_head.weight alias

b9aa214

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

njhill approved these changes May 6, 2024

njhill merged commit deb99f6 into main May 6, 2024
7 checks passed

njhill deleted the attn_mlp_bias branch May 6, 2024 21:18

openshift-merge-bot bot referenced this pull request in red-hat-data-services/text-generation-inference May 7, 2024

Merge pull request #32 from heyselbi/rhoai-2-8-3

3853574

added mlp and attn bias option to flash and paged llama models (opendatahub-io#85)
