issues Search Results · repo:microsoft/onnxruntime-genai language:C++

412 results

I have found that OpenVINO-related code has been merged into the main branch in the latest commit history. I wonder how I can run inference with the OpenVINO EP. If you can provide the openvino detail self build and infer ...
ep:DML
  • ZhangWei125521
  • Opened 
    2 days ago
  • #1501

Hello, I am trying to build onnxruntime-genai for iOS using this tutorial: https://github.com/Azure-Samples/Phi-3MiniSamples/blob/main/ios/README.md. But I think it is now out of date, as there is no branch ...
platform:mobile
  • sahmed53
  • Opened 
    9 days ago
  • #1484

Since the GeneratorParams.input_ids attribute has been decommissioned in the latest version of OGA, what is the alternative? input_tokens = test_enc[(i * seqlen) : ((i + 1) * seqlen - 1)] params.input_ids ...
  • satreysa
  • 2
  • Opened 
    16 days ago
  • #1458

Describe the bug Phi-4-multimodal-onnx audio task prompt |user| |audio_1| Transcribe the audio clip into text. |end| |assistant| responds with this instead of the text transcription: I'm sorry, but I ...
  • Savvkin
  • Opened 
    17 days ago
  • #1455

I'm experiencing problems using the Phi-4-mini-instruct model where it will generate responses that begin to repeat text until max_length is reached. To Reproduce I see this problem with my application, ...
  • f2bo
  • 9
  • Opened 
    18 days ago
  • #1450

Is there any plan to enable Qwen3-30B-A3B, whose architecture is Qwen3MoeForCausalLM?
  • ZhangWei125521
  • 1
  • Opened 
    24 days ago
  • #1433

Describe the bug I have encountered the following OGA error message: Completion failure max_length (2347) cannot be greater than model context_length (2176) I am mainly trying to understand OGA's definition ...
  • jeremyfowers
  • 2
  • Opened 
    29 days ago
  • #1425

Hi all, I’ve optimized the finetuned Phi-4 MM Instruct vision model by converting it to ONNX and applying quantization — inference time dropped from 26s ➝ 7s. :tada: I have a few quick questions: Audio ...
  • MeemankGupta
  • Opened 
    on Apr 24
  • #1423

Describe the bug Windows on ARM users commonly use AMD64 Python to execute models using ONNX Runtime. This is needed because several Python packages (e.g., Torch, h5py, etc.) do not yet ship ARM64 for Windows ...
platform:windows
  • kory
  • 14
  • Opened 
    on Apr 23
  • #1417

The model builder currently uses opset 14 and IR version 7 for built models. I recommend adopting a later opset (18+) and IR version (10) for the models to leverage the latest ONNX features and help the ecosystem ...
  • justinchuby
  • Opened 
    on Apr 23
  • #1414