
If Alibi is on, we should turn learned_pos_emb to False #489

Merged: 4 commits into main on Jul 26, 2023

Conversation

@bcui19 (Contributor) commented Jul 25, 2023

If we use alibi, we need to set learned_pos_emb to False. Otherwise we still go down the learned_pos_emb codepath.
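
For context, a simplified sketch of the codepath being described (illustrative only, not the actual MPT forward code; wpe is a stand-in name for the learned position table):

    import torch

    def add_position_info(tok_emb: torch.Tensor, wpe: torch.nn.Embedding,
                          pos: torch.Tensor, learned_pos_emb: bool) -> torch.Tensor:
        # With learned_pos_emb=True the learned position table is added to the
        # token embeddings even if ALiBi is also enabled (ALiBi biases the
        # attention scores separately), which is the accidental mixing this
        # PR prevents.
        if learned_pos_emb:
            return tok_emb + wpe(pos)
        # ALiBi (or NoPE): no absolute position embedding is added.
        return tok_emb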

@bcui19 requested a review from vchiley on July 25, 2023 at 20:40
@vchiley (Contributor) commented Jul 25, 2023

Similarly, the argument could be made that if alibi == False we should set learned_pos_emb = True, BUT I figured alibi = False with learned_pos_emb = False is NoPE (no positional embeddings at all), so I left the option open (the combinations are sketched below).
I also wanted to give people the ability to use both alibi and learned_pos_emb.

maybe add a warning or something???
maybe just ignore me... 😅
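
For reference, a sketch of the four alibi/learned_pos_emb combinations under discussion (the NoPE reading follows the comment above):

    # alibi=False, learned_pos_emb=True  -> standard learned positional embeddings
    # alibi=True,  learned_pos_emb=False -> ALiBi only (the usual intent)
    # alibi=False, learned_pos_emb=False -> NoPE: no positional information at all
    # alibi=True,  learned_pos_emb=True  -> both at once; the case this PR guards against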

@bcui19 (Contributor, Author) commented Jul 25, 2023

Hm... Okay, so one thing is: if we train models w/ ALiBi (without learned_pos_emb) and then load that model with learned_pos_emb=True, it could really affect model performance (since now you're adding a random, untrained position embedding...).

I'm also not 100% sure what the right thing is. The "safest" option might be to add a warning that we're changing it, since when a user specifies ALiBi they usually mean the case where you don't add in the positional embedding.

@vchiley (Contributor) commented Jul 25, 2023

if we load a trained model, should the model config be part of the ckpt?

In general, the change you made + a warning sounds good.
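
For what it's worth, the Hugging Face save path does bundle the config: save_pretrained writes a config.json next to the weights, so from_pretrained restores the same learned_pos_emb setting. A generic sketch using the standard transformers API (not llm-foundry code):

    from transformers import AutoConfig, AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained('gpt2')
    model.save_pretrained('ckpt/')   # writes config.json alongside the weights
    config = AutoConfig.from_pretrained('ckpt/')  # the saved config round-trips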

@samhavens (Contributor) commented Jul 25, 2023

Would a good middle ground be changing the default value so that learned_pos_emb: bool = False? Alternatively, we could default learned_pos_emb to None, and then later resolve the None to learned_pos_emb = not alibi. Users could still explicitly set both to be on, but we shouldn't run into accidentally having both.
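
A minimal sketch of that tri-state default, assuming the resolution happens in the config's __init__ once attn_config is known (class and argument names are illustrative; bcui19's snippet below implements the same resolution):

    from typing import Optional

    class ConfigSketch:
        def __init__(self, alibi: bool = False,
                     learned_pos_emb: Optional[bool] = None):
            self.attn_config = {'alibi': alibi}
            if learned_pos_emb is None:
                # The default follows alibi: learned embeddings only when alibi is off.
                learned_pos_emb = not self.attn_config.get('alibi', False)
            self.learned_pos_emb = learned_pos_emb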

@bcui19 (Contributor, Author) commented Jul 25, 2023

@vchiley It might be good to save it independently from Composer, since the Composer checkpoint is unwieldy.

@samhavens I agree, so I did something like this:

        if self.learned_pos_emb is None:
            self.learned_pos_emb = not self.attn_config.get('alibi', False)
            warnings.warn(f'`learned_pos_emb` has not been set, setting it to {self.learned_pos_emb}')

And it leads to this super weird warning... ):

[screenshot: the warning fires twice, first setting learned_pos_emb to False, then to True]

Let me know if I'm doing something obviously wrong (I think there's something weird in transformers where the class gets initialized twice):
https://github.com/huggingface/transformers/blob/main/src/transformers/configuration_utils.py#L816

We also can't set this up in _validate_config, since we need it when creating a model using from_pretrained.

So unfortunately, I think I'm just going to add an additional warning, unless someone else has a way around this.
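
If so, the shape of the change would presumably be an unconditional override plus a warning whenever ALiBi is on; a sketch under that assumption, not the exact merged diff:

    import warnings

    def resolve_pos_emb(attn_config: dict, learned_pos_emb: bool) -> bool:
        # If ALiBi is enabled, force learned_pos_emb off and tell the user,
        # rather than silently mixing two positional schemes.
        if attn_config.get('alibi', False) and learned_pos_emb:
            warnings.warn('alibi is turned on, setting `learned_pos_emb` to `False`.')
            return False
        return learned_pos_emb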

@vchiley (Contributor) commented Jul 25, 2023

@bcui19 it sets it to false, then sets it to true?

@bcui19 (Contributor, Author) commented Jul 25, 2023

I think... when you create an MPTConfig the way we do it now, HF does something weird and it ends up creating two MPTConfigs...

@samhavens (Contributor) commented

If HF is going to make us choose between extremely cryptic warnings like that and simply "set pos_emb to false if alibi is true", then I guess we should revert to what you had before and just let the user know that they specified alibi, so we are setting pos emb to false.

Unless setting the default differently solves this? learned_pos_emb: bool = False?

@bcui19 (Contributor, Author) commented Jul 26, 2023

Running w/ the current set of changes:

[screenshot: warning output with the current set of changes]

Going to merge. This is partly me assuming that very few people would want to mix positional embeddings w/ ALiBi.

@bcui19 merged commit 634597d into main on Jul 26, 2023
10 checks passed
@dakinggg deleted the fix-pos-embed branch on October 11, 2023 at 21:30