Bugfix/attention mask and implementation #49

Alvant · 2023-12-27T08:00:06Z

Issue: #46

I had to change how the attention implementation is initialized. Unexpectedly (at least for me 😅), it turned out that the attention implementation cannot be specified in the model's config.json file; it can only be passed as an argument when the config object is created (attn_implementation is taken from "kwargs", not "config_dict", see: https://github.com/huggingface/transformers/blob/v4.36.1/src/transformers/configuration_utils.py#L772). So I added an attention argument to args and pass it to the config object when it is created.

I hope that this change is OK (new argument to parser + modified AutoConfig.from_pretrained call).
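
A minimal sketch of the two pieces mentioned above (a new parser argument and the modified AutoConfig.from_pretrained call), not the exact diff of this PR. The argument names `--model` and `--attn_implementation`, the `choices` list, and the helper names `build_parser` and `load_config` are assumptions for illustration; the `"eager"` default follows the commit message below.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the attention implementation as a CLI argument."""
    parser = argparse.ArgumentParser()
    # Hypothetical argument name for the model path or hub id.
    parser.add_argument("--model", type=str, required=True)
    # "eager" as the default mirrors the commit "set attn eager by default".
    # The choices list is an assumption based on the values transformers v4.36 accepts.
    parser.add_argument(
        "--attn_implementation",
        type=str,
        default="eager",
        choices=["eager", "sdpa", "flash_attention_2"],
    )
    return parser


def load_config(args: argparse.Namespace):
    """Create the config, passing attn_implementation as a keyword argument.

    In transformers v4.36 the value is read from kwargs rather than from
    config.json, so it has to be supplied here at creation time.
    """
    # Imported lazily so the parser can be used without transformers installed.
    from transformers import AutoConfig

    return AutoConfig.from_pretrained(
        args.model,
        attn_implementation=args.attn_implementation,
    )
```

With this sketch, `load_config(build_parser().parse_args())` would pick up the value from the command line, and omitting `--attn_implementation` falls back to eager attention.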

ChenMnZ · 2023-12-27T08:06:42Z

Good job! Thanks for your time again.

Alvant added 3 commits December 27, 2023 01:28

fix case when attn mask None, set attn eager by default

23ecf30

use attn implementation from args

9eac03b

Merge branch 'main' into bugfix/attention-mask-and-implementation

f46edc4

Alvant mentioned this pull request Dec 27, 2023

attention_mask may appear None for newer versions of LLaMA? #46

Closed

ChenMnZ merged commit 9790164 into OpenGVLab:main Dec 27, 2023
