
[BUG] Mistral feature extraction export to ONNX is broken #1732

Closed
2 of 4 tasks
michaelroyzen opened this issue Feb 27, 2024 · 0 comments
Labels
bug Something isn't working

System Info

Optimum 1.17.1
Transformers 4.38.1
PyTorch 2.1.0

Who can help?

@fx

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

ONNX export for Mistral models is currently broken. Running `optimum-cli export onnx --model Salesforce/SFR-Embedding-Mistral onnx/sfr-embedding-mistral` with the latest versions of Optimum and Transformers errors out on the Trilu operator:

Framework not specified. Using pt to export the model.
Loading checkpoint shards: 100%|██████████████████| 3/3 [00:02<00:00,  1.25it/s]
Automatic task detection to feature-extraction (possible synonyms are: default, mask-generation, sentence-similarity).
Using the export variant default. Available variants are:
    - default: The default ONNX variant.
Using framework PyTorch: 2.1.0+cu121
Overriding 1 configuration item(s)
	- use_cache -> False
/home/ubuntu/.local/lib/python3.8/site-packages/transformers/modeling_attn_mask_utils.py:114: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if (input_shape[-1] > 1 or self.sliding_window is not None) and self.is_causal:
/home/ubuntu/.local/lib/python3.8/site-packages/transformers/modeling_attn_mask_utils.py:162: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if past_key_values_length > 0:
/home/ubuntu/.local/lib/python3.8/site-packages/transformers/models/mistral/modeling_mistral.py:120: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if seq_len > self.max_seq_len_cached:
/home/ubuntu/.local/lib/python3.8/site-packages/transformers/models/mistral/modeling_mistral.py:287: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz, self.num_heads, q_len, kv_seq_len):
/home/ubuntu/.local/lib/python3.8/site-packages/transformers/models/mistral/modeling_mistral.py:294: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attention_mask.size() != (bsz, 1, q_len, kv_seq_len):
/home/ubuntu/.local/lib/python3.8/site-packages/transformers/models/mistral/modeling_mistral.py:306: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz, self.num_heads, q_len, self.head_dim):
/home/ubuntu/.local/lib/python3.8/site-packages/sentence_transformers/models/Pooling.py:178: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert gather_indices.shape == (bs, 1, hidden_dim)
Saving external data to one file...
Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/optimum-cli", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/.local/lib/python3.8/site-packages/optimum/commands/optimum_cli.py", line 163, in main
    service.run()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/optimum/commands/export/onnx.py", line 261, in run
    main_export(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/optimum/exporters/onnx/__main__.py", line 351, in main_export
    onnx_export_from_model(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/optimum/exporters/onnx/convert.py", line 1152, in onnx_export_from_model
    _, onnx_outputs = export_models(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/optimum/exporters/onnx/convert.py", line 763, in export_models
    export(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/optimum/exporters/onnx/convert.py", line 897, in export
    config.fix_dynamic_axes(output, device=device, input_shapes=input_shapes, dtype=dtype)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/optimum/exporters/onnx/base.py", line 306, in fix_dynamic_axes
    session = InferenceSession(model_path.as_posix(), providers=providers, sess_options=session_options)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 483, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for Trilu(14) node with name '/0/auto_model/Trilu'
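For context, `Trilu` is the ONNX operator that `torch.triu`/`torch.tril` lower to, and the causal attention mask that `transformers` builds during tracing is a likely source of the node in this graph. A minimal numpy sketch of the op's semantics (the `trilu` helper name here is illustrative, not part of any of the libraries above):

```python
import numpy as np

def trilu(x: np.ndarray, k: int = 0, upper: bool = True) -> np.ndarray:
    """Mimic ONNX Trilu: keep the upper (upper=True) or lower (upper=False)
    triangle of the last two dimensions, offset by diagonal k, zeroing the rest."""
    return np.triu(x, k) if upper else np.tril(x, k)

# A strictly-upper-triangular mask like the one used for causal attention:
mask = trilu(np.ones((4, 4), dtype=np.int64), k=1, upper=True)
# mask[i, j] == 1 only where j > i
```

The error reported by onnxruntime means its session initialization could not find a registered kernel for this `Trilu(14)` node, not that the exported graph itself is malformed.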

Expected behavior

I'd expect the export to finish successfully. Interestingly, it works with Optimum 1.14.1 and Transformers 4.35.2, so something has changed since then. It seems the model patcher in Optimum, which previously fixed a similar issue for Falcon, is now gone: #1391
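Until this is resolved, a workaround is to pin the version combination reported to work above; a requirements fragment based only on the versions mentioned in this issue:

```
optimum==1.14.1
transformers==4.35.2
```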

@michaelroyzen michaelroyzen added the bug Something isn't working label Feb 27, 2024