Fix issue with GQA initialization for Qwen2 #514

Merged
arnavgarg1 merged 1 commit into main from qwen2_fixes
Jun 13, 2024

arnavgarg1 (Contributor) commented Jun 13, 2024

Fixes the following error during GQA initialization for Qwen2:

in load_attention
lorax     base_layer = load_attention_multi(config, prefix, weights)
lorax 
lorax   File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/custom_modeling/flash_qwen2_modeling.py", line 109, in load_attention_multi
lorax     return _load_gqa(config, prefix, weights)
lorax 
lorax   File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/custom_modeling/flash_qwen2_modeling.py", line 141, in _load_gqa
lorax     return TensorParallelColumnLinear(get_linear(weight, bias=True, quantize=config.quantize))
lorax 
lorax   File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/layers.py", line 326, in get_linear
lorax     linear = FastLinear(weight, bias)
lorax 
lorax   File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/layers.py", line 108, in __init__
lorax     self.bias = nn.Parameter(bias)
lorax 
lorax   File "/opt/conda/lib/python3.10/site-packages/torch/nn/parameter.py", line 43, in __new__
lorax     t = data.detach().requires_grad_(requires_grad)
lorax 
lorax AttributeError: 'bool' object has no attribute 'detach'
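
The traceback shows `bias=True` being passed to `get_linear`, which forwards it unchanged to `FastLinear`, where it is wrapped as a parameter and `.detach()` is called on a bool. The diff itself is not shown in this conversation, so the following is only a minimal sketch of the failure and of one plausible fix (passing the loaded bias tensor, or `None`, instead of a flag). `FakeTensor`, `make_parameter`, and `FastLinearSketch` are hypothetical stand-ins so the example runs without PyTorch; they are not the lorax or torch implementations.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor; only models the .detach() call."""

    def __init__(self, values):
        self.values = list(values)

    def detach(self):
        return self


def make_parameter(data):
    # Mirrors the failing line in the traceback: data.detach() is called
    # on whatever object was passed in.
    return data.detach()


class FastLinearSketch:
    """Hypothetical layer: bias must be a tensor-like object or None."""

    def __init__(self, weight, bias):
        self.weight = make_parameter(weight)
        # Assumption: a missing bias is represented by None, never by a flag.
        self.bias = make_parameter(bias) if bias is not None else None


weight = FakeTensor([1.0, 2.0])

# Call shape from the traceback: a boolean where a tensor is expected.
try:
    FastLinearSketch(weight, bias=True)
except AttributeError as err:
    error_message = str(err)

# Assumed shape of the fix: pass the bias tensor itself.
layer = FastLinearSketch(weight, bias=FakeTensor([0.5]))
```

Here `error_message` holds the same text as the last line of the traceback, while the second call constructs the layer normally.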

With the fix, the model loads and generates as expected. Here's a sample input and output:

Input:

curl 127.0.0.1:8080/generate \
    -X POST \
    -d '{
        "inputs": "[INST] Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? [/INST]",
        "parameters": {
            "max_new_tokens": 256
        }
    }' \
    -H 'Content-Type: application/json'

Output:

{"generated_text":"\nTo find the total number of clips Natalia sold in April and May, we first need to determine how many clips she sold in May.\n\nNatalia sold clips to 48 of her friends in April. In May, she sold half as many clips as she did in April. So, in May, she sold:\n\n\\[ \\frac{48}{2} = 24 \\text{ clips} \\]\n\nTo find the total number of clips sold in April and May, we add the number of clips sold in April to the number of clips sold in May:\n\n\\[ 48 \\text{ clips (April)} + 24 \\text{ clips (May)} = 72 \\text{ clips} \\]\n\nTherefore, Natalia sold a total of 72 clips altogether in April and May."}
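
The same request can also be issued from Python. This is a minimal sketch using only the standard library, assuming a server is listening at `127.0.0.1:8080` as in the curl command above. The helper names `build_generate_request` and `generate` are hypothetical and not part of lorax.

```python
import json
import urllib.request


def build_generate_request(prompt, max_new_tokens=256,
                           base_url="http://127.0.0.1:8080"):
    """Build (but do not send) a POST request to the /generate endpoint."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the "generated_text" field of the reply."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["generated_text"]
```

Calling `generate("[INST] ... [/INST]")` requires a running server; `build_generate_request` can be inspected without one.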

arnavgarg1 marked this pull request as ready for review June 13, 2024 02:44
arnavgarg1 merged commit 9bed4da into main Jun 13, 2024
arnavgarg1 deleted the qwen2_fixes branch June 13, 2024 19:37