Traceback during inference. #6

Open
Hello1024 opened this issue Mar 22, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@Hello1024

Colab gives the following error during inference:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/routes.py", line 394, in run_predict
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1075, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 884, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.9/dist-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.9/dist-packages/gradio/helpers.py", line 587, in tracked_fn
    response = fn(*args)
  File "/content/simple-llama-finetuner/main.py", line 121, in generate_text
    generation_output = model.generate(
  File "/usr/local/lib/python3.9/dist-packages/peft/peft_model.py", line 581, in generate
    outputs = self.base_model.generate(**kwargs)
  File "/usr/local/lib/python3.9/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py", line 1451, in generate
    return self.sample(
  File "/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py", line 2467, in sample
    outputs = self(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/llama/modeling_llama.py", line 765, in forward
    outputs = self.model(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/llama/modeling_llama.py", line 614, in forward
    layer_outputs = decoder_layer(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/llama/modeling_llama.py", line 309, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/llama/modeling_llama.py", line 209, in forward
    query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/peft/tuners/lora.py", line 522, in forward
    result = super().forward(x)
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/nn/modules.py", line 242, in forward
    out = bnb.matmul(x, self.weight, bias=self.bias, state=self.state)
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/autograd/_functions.py", line 488, in matmul
    return MatMul8bitLt.apply(A, B, out, bias, state)
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/autograd/_functions.py", line 317, in forward
    state.CxB, state.SB = F.transform(state.CB, to_order=formatB)
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/functional.py", line 1698, in transform
    prev_device = pre_call(A.device)
AttributeError: 'NoneType' object has no attribute 'device'
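
For context on where this fails: bitsandbytes' MatMul8bitLt pulls the cached 8-bit weights from the layer's quantization state, and this AttributeError means both state.CB and state.CxB are None, so F.transform has no tensor to read a .device from. A minimal diagnostic sketch, assuming a model loaded with load_in_8bit=True (the module path below is illustrative and may need adjusting):

# Inspect the Int8 quantization state of one attention projection.
# With bitsandbytes ~0.37, state.CB alone being None can be normal after the
# first forward pass (it gets folded into state.CxB); both being None means
# the quantization state was lost, and the next matmul raises this error.
layer = model.base_model.model.model.layers[0].self_attn.q_proj  # PEFT-wrapped LLaMA; drop .base_model.model if no LoRA
print(layer.state.CB is None, layer.state.CxB is None)  # (True, True) reproduces the crash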
@lxe (Owner) commented Mar 22, 2023

Thanks for the report. Looks like my janky model loading/reloading isn't working. Fixing.
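
Until then, a common workaround (a sketch under assumptions, not the repo's actual fix; checkpoint and adapter paths are placeholders) is to rebuild the 8-bit base model from scratch for inference instead of reusing an instance whose quantization state may already have been freed:

import torch
from transformers import LlamaForCausalLM
from peft import PeftModel

# Load a fresh 8-bit base model; bitsandbytes creates the quantization state
# (state.CB / state.CxB on each Linear8bitLt) during this load.
base = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",  # placeholder base checkpoint
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the trained LoRA adapter to the fresh base. Avoid calling .to(...)
# or .half() on an already-quantized model, and avoid wrapping the same base
# twice: either can leave the Int8 state as None and trigger this traceback.
model = PeftModel.from_pretrained(base, "lora-output-dir")  # placeholder adapter path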

@lxe (Owner) commented Mar 22, 2023

Should be fixed in latest master!

@lxe added the bug label Mar 22, 2023
@calz1 commented Mar 23, 2023

@lxe I just did a git clone to try out your fix and I still get this on Colab :(

@lxe (Owner) commented Mar 23, 2023

I just tried on a fresh Tesla T4 Colab and was unable to reproduce, using the vanilla model with no LoRA.

@lxe (Owner) commented Mar 23, 2023

Just reproduced! Will fix!

@lxe (Owner) commented Mar 23, 2023

Related upstream issue: tloen/alpaca-lora#21

@lxe (Owner) commented Mar 25, 2023

I would try again on Colab.

@calz1 commented Mar 28, 2023

@lxe I just deleted my runtime to get a fresh environment, saw git clone pull a new copy of the code, and I still get the error on a GPU environment on Colab.
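
For anyone else reproducing, those steps amount to the following in a fresh Colab GPU runtime (repository URL inferred from the traceback paths; the requirements file name is an assumption):

# Run in fresh Colab notebook cells:
!git clone https://github.com/lxe/simple-llama-finetuner
%cd simple-llama-finetuner
!pip install -r requirements.txt
!python main.py  # then run inference from the Gradio UI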


Error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/routes.py", line 394, in run_predict
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1075, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 884, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.9/dist-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.9/dist-packages/gradio/helpers.py", line 587, in tracked_fn
    response = fn(*args)
  File "/content/simple-llama-finetuner/main.py", line 104, in generate_text
    output = model.generate(  # type: ignore
  File "/usr/local/lib/python3.9/dist-packages/peft/peft_model.py", line 581, in generate
    outputs = self.base_model.generate(**kwargs)
  File "/usr/local/lib/python3.9/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py", line 1462, in generate
    return self.sample(
  File "/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py", line 2478, in sample
    outputs = self(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/llama/modeling_llama.py", line 705, in forward
    outputs = self.model(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/llama/modeling_llama.py", line 593, in forward
    layer_outputs = decoder_layer(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/llama/modeling_llama.py", line 311, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/llama/modeling_llama.py", line 212, in forward
    query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/peft/tuners/lora.py", line 522, in forward
    result = super().forward(x)
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/nn/modules.py", line 242, in forward
    out = bnb.matmul(x, self.weight, bias=self.bias, state=self.state)
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/autograd/_functions.py", line 488, in matmul
    return MatMul8bitLt.apply(A, B, out, bias, state)
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/autograd/_functions.py", line 317, in forward
    state.CxB, state.SB = F.transform(state.CB, to_order=formatB)
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/functional.py", line 1698, in transform
    prev_device = pre_call(A.device)
AttributeError: 'NoneType' object has no attribute 'device'
