
[Good First Issue]: Verify mpt-7b-chat with GenAI text_generation #267

Open · p-wysocki opened this issue Mar 1, 2024 · 17 comments

Labels: good first issue (Good for newcomers)

@p-wysocki (Collaborator)

Context

This task involves enabling tests for mpt-7b-chat. You can find more details in the openvino_notebooks LLM chatbot README.md.

Please ask general questions in the main issue at #259.

What needs to be done?

Described in the main Discussion issue at: #259

Example Pull Requests

Described in the main Discussion issue at: #259

Resources

Contact points

Described in the main Discussion issue at: #259

Ticket

No response

@qxprakash (Contributor)

.take

github-actions bot commented Mar 5, 2024

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

@Utkarsh-2002

.take

github-actions bot commented Mar 5, 2024

Thanks for being interested in this issue. It looks like this ticket is already assigned to a contributor. Please communicate with the assigned contributor to confirm the status of the issue.

@p-wysocki (Collaborator, Author)

Hello @qxprakash, are you still working on this? Is there anything we could help you with?

@qxprakash (Contributor)

Hello @p-wysocki, yes, I am working on it. I faced an error while trying to convert mpt-7b-chat:

```
python3 ../../../llm_bench/python/convert.py --model_id mosaicml/mpt-7b-chat --output_dir ./MPT_CHAT --precision FP16
```

```
[ INFO ] Removing bias from module=LPLayerNorm((4096,), eps=1e-05, elementwise_affine=True).
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [01:02<00:00, 31.27s/it]
generation_config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 121/121 [00:00<00:00, 781kB/s]
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:87: UserWarning: Propagating key_padding_mask to the attention module and applying it within the attention module can cause unnecessary computation/memory usage. Consider integrating into attn_bias once and passing that to each attention module instead.
  warnings.warn('Propagating key_padding_mask to the attention module ' + 'and applying it within the attention module can cause ' + 'unnecessary computation/memory usage. Consider integrating ' + 'into attn_bias once and passing that to each attention ' + 'module instead.')
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/modeling_mpt.py:311: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert S <= self.config.max_seq_len, f'Cannot forward input with seq_len={S}, this model only supports seq_len<={self.config.max_seq_len}'
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/modeling_mpt.py:253: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  _s_k = max(0, attn_bias.size(-1) - s_k)
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:78: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  _s_q = max(0, attn_bias.size(2) - s_q)
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:79: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  _s_k = max(0, attn_bias.size(3) - s_k)
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:81: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_bias.size(-1) != 1 and attn_bias.size(-1) != s_k or (attn_bias.size(-2) != 1 and attn_bias.size(-2) != s_q):
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:89: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if is_causal and (not q.size(2) == 1):
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:90: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  s = max(s_q, s_k)
[ WARNING ] Failed to send event with the following error: <urlopen error EOF occurred in violation of protocol (_ssl.c:2426)>
[ WARNING ] Failed to send event with the following error: <urlopen error EOF occurred in violation of protocol (_ssl.c:2426)>
```

@p-wysocki (Collaborator, Author)

cc @pavel-esir

@qxprakash (Contributor)

@pavel-esir I hope this is not a memory issue?

@pavel-esir (Contributor)

@qxprakash thanks for your update. I see only connection warnings in your logs. Did you get the converted IR? If so, did you run it with the C++ sample?
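
For reference, a minimal Python-side smoke test along these lines can confirm the converted IR is usable (a sketch only; the model directory below is an assumption based on convert.py's --output_dir layout, so adjust it to your actual output path, and the canonical check remains the C++ sample):

```python
# Hedged smoke test for the converted OpenVINO model.
# Assumption: convert.py wrote the IR plus tokenizer files to this directory.
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_dir = "./MPT_CHAT/pytorch/dldt/FP16"  # assumed path, adjust as needed
model = OVModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)

# Generate a short continuation to verify the model loads and runs.
inputs = tokenizer("Why is the Sun yellow?", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```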

@tushar-thoriya

@qxprakash what is the progress?

@p-wysocki (Collaborator, Author)

Hello @qxprakash, are you still working on that issue? Do you need any help?

@qxprakash (Contributor)

Hi @p-wysocki, I was stuck; I'm currently not working on it.

@rk119 commented Jul 18, 2024

.take

github-actions bot

Thanks for being interested in this issue. It looks like this ticket is already assigned to a contributor. Please communicate with the assigned contributor to confirm the status of the issue.

@rk119 commented Jul 18, 2024

@p-wysocki I'd like to take on this issue as well.

@Wovchena Wovchena assigned rk119 and unassigned qxprakash Jul 19, 2024
@Wovchena (Collaborator)

I reassigned this to @rk119 because I'm not sure if @Utkarsh-2002 is still here. If you still want to work on it, let us know. If @rk119 confirms that they hadn't started working on it by the time @Utkarsh-2002 replies, @Utkarsh-2002 can take the task.

@rk119 commented Jul 23, 2024

Hi @Wovchena,

Before raising a PR, I want to note that for the prompt_lookup_decoding_lm and speculative_decoding_lm samples I am getting the following error:

```
Exception from src/inference/src/cpp/infer_request.cpp:193:
Check '::getPort(port, name, {_impl->get_inputs(), _impl->get_outputs()})' failed at src/inference/src/cpp/infer_request.cpp:193:
Port for tensor name position_ids was not found.
```
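
That exception means the sample requests an input tensor named position_ids that the exported IR does not expose. A minimal sketch for checking which input names the IR actually provides, assuming the same converted-model path used earlier in this thread:

```python
# Hedged diagnostic: list the IR's input tensor names to see whether
# "position_ids" (which these samples look up by name) is present.
# The model path is an assumption; point it at your converted IR.
import openvino as ov

model = ov.Core().read_model("./MPT_CHAT/pytorch/dldt/FP16/openvino_model.xml")
names = sorted({name for port in model.inputs for name in port.get_names()})
print(names)
print("position_ids present:", "position_ids" in names)
```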

Projects: Status: Assigned
Development: No branches or pull requests
7 participants