
RuntimeError: The size of tensor a (2048) must match the size of tensor b (2049) at non-singleton dimension 3 #67

Closed
pseudotensor opened this issue Apr 20, 2023 · 2 comments


@pseudotensor
Collaborator

Happens when max new tokens is >~ 2048 (the model's context length).

See bottom of: #66
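The mask tensor in the error is fixed at the model's 2048-token context window, while the attention scores grow with prompt length plus generated tokens. One generic mitigation (a minimal sketch with illustrative names, not the repo's actual API) is to clamp `max_new_tokens` so the total never exceeds the window:

```python
def clamp_max_new_tokens(input_len: int, requested_new_tokens: int,
                         context_len: int = 2048) -> int:
    """Return a max_new_tokens value that keeps
    input_len + new tokens <= context_len (never negative)."""
    return max(0, min(requested_new_tokens, context_len - input_len))

# e.g. a 1000-token prompt against a 2048-token window leaves
# at most 1048 tokens of generation budget.
```

A return value of 0 signals the prompt alone already fills the window, in which case the prompt itself must be truncated instead.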

@pseudotensor
Collaborator Author

  File "/home/jon/h2o-llm/h2oai_pipeline.py", line 54, in _forward
    return super()._forward(model_inputs, **generate_kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/pipelines/text_generation.py", line 251, in _forward
    generated_sequence = self.model.generate(input_ids=input_ids, attention_mask=attention_mask, **generate_kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/generation/utils.py", line 1437, in generate
    return self.greedy_search(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/generation/utils.py", line 2248, in greedy_search
    outputs = self(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 662, in forward
    outputs = self.gpt_neox(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 553, in forward
    outputs = layer(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 320, in forward
    attention_layer_outputs = self.attention(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 152, in forward
    attn_output, attn_weights = self._attn(query, key, value, attention_mask, head_mask)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 219, in _attn
    attn_scores = torch.where(causal_mask, attn_scores, mask_value)
RuntimeError: The size of tensor a (2048) must match the size of tensor b (2713) at non-singleton dimension 3

@pseudotensor
Collaborator Author

pseudotensor commented May 27, 2023

Distance: min: 1.2423040866851807 max: 1.3549838066101074 mean: 1.300344705581665 median: 1.302045464515686
thread exception: (<class 'RuntimeError'>, RuntimeError('The size of tensor a (2048) must match the size of tensor b (3718) at non-singleton dimension 3'), <traceback object at 0x7fef33c2d080>)
make stop: (<class 'RuntimeError'>, RuntimeError('The size of tensor a (2048) must match the size of tensor b (3718) at non-singleton dimension 3'), <traceback object at 0x7fef33c2d080>)
hit stop
Traceback (most recent call last):
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/gradio/routes.py", line 414, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/gradio/blocks.py", line 1323, in process_api
    result = await self.call_function(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/gradio/blocks.py", line 1067, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/gradio/utils.py", line 339, in async_iteration
    return await iterator.__anext__()
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/gradio/utils.py", line 332, in __anext__
    return await anyio.to_thread.run_sync(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/gradio/utils.py", line 315, in run_sync_iterator_async
    return next(iterator)
  File "/home/jon/h2o-llm/gradio_runner.py", line 930, in bot
    for output_fun in fun1(*tuple(args_list)):
  File "/home/jon/h2o-llm/generate.py", line 915, in evaluate
    for r in run_qa_db(query=query,
  File "/home/jon/h2o-llm/gpt_langchain.py", line 944, in _run_qa_db
    raise thread.exc
  File "/home/jon/h2o-llm/utils.py", line 312, in run
    self._return = self._target(*self._args, **self._kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/chains/base.py", line 140, in __call__
    raise e
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/chains/base.py", line 134, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/chains/combine_documents/base.py", line 84, in _call
    output, extra_return_dict = self.combine_docs(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 87, in combine_docs
    return self.llm_chain.predict(callbacks=callbacks, **inputs), {}
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/chains/llm.py", line 213, in predict
    return self(kwargs, callbacks=callbacks)[self.output_key]
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/chains/base.py", line 140, in __call__
    raise e
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/chains/base.py", line 134, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/chains/llm.py", line 69, in _call
    response = self.generate([inputs], run_manager=run_manager)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/chains/llm.py", line 79, in generate
    return self.llm.generate_prompt(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/llms/base.py", line 134, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/llms/base.py", line 191, in generate
    raise e
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/llms/base.py", line 185, in generate
    self._generate(prompts, stop=stop, run_manager=run_manager)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/llms/base.py", line 411, in _generate
    self._call(prompt, stop=stop, run_manager=run_manager)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/langchain/llms/huggingface_pipeline.py", line 159, in _call
    response = self.pipeline(prompt)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/pipelines/text_generation.py", line 201, in __call__
    return super().__call__(text_inputs, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1119, in __call__
    return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1126, in run_single
    model_outputs = self.forward(model_inputs, **forward_params)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1025, in forward
    model_outputs = self._forward(model_inputs, **forward_params)
  File "/home/jon/h2o-llm/h2oai_pipeline.py", line 55, in _forward
    return self.__forward(model_inputs, **generate_kwargs)
  File "/home/jon/h2o-llm/h2oai_pipeline.py", line 91, in __forward
    generated_sequence = self.model.generate(input_ids=input_ids, attention_mask=attention_mask, **generate_kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/generation/utils.py", line 1515, in generate
    return self.greedy_search(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/generation/utils.py", line 2332, in greedy_search
    outputs = self(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 672, in forward
    outputs = self.gpt_neox(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 563, in forward
    outputs = layer(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 330, in forward
    attention_layer_outputs = self.attention(
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 162, in forward
    attn_output, attn_weights = self._attn(query, key, value, attention_mask, head_mask)
  File "/home/jon/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/models/gpt_neox/modeling_gpt_neox.py", line 229, in _attn
    attn_scores = torch.where(causal_mask, attn_scores, mask_value)
RuntimeError: The size of tensor a (2048) must match the size of tensor b (3718) at non-singleton dimension 3
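This second traceback hits the same mismatch from the LangChain path: the "stuff" chain concatenates retrieved documents into one prompt that reaches 3718 tokens against the 2048-token causal mask. A prompt-side fix is to drop the oldest tokens so prompt plus generation budget fits the window. A minimal sketch on plain token-ID lists (illustrative names, not the real tokenizer API):

```python
def fit_prompt(prompt_ids: list, max_new_tokens: int,
               context_len: int = 2048) -> list:
    """Keep only the most recent prompt tokens so that
    len(prompt) + max_new_tokens <= context_len."""
    budget = max(0, context_len - max_new_tokens)
    # Slicing from the end preserves the most recent context;
    # an empty budget means no prompt tokens can be kept.
    return prompt_ids[-budget:] if budget else []
```

In practice the equivalent is done with the tokenizer (encode, slice, decode) before the pipeline call, or by reducing the number or size of retrieved document chunks the chain stuffs into the prompt.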

