Guide: Using Local ONNX Models

ThioJoe edited this page Jun 2, 2026 · 14 revisions

Thio's Universal Agent supports running fully local AI models, including ONNX Runtime GenAI models. It loads ONNX models directly, with no other apps required.

ONNX models are structured slightly differently from what you might be used to (like a single .gguf file), so this guide shows you how to find and use them.

📝 Key Point: ONNX GenAI Models Come in a Folder, Not a File

  • You might have seen that other local LLM tools (like Ollama or llama.cpp) typically use models packaged as a single file.
  • However, ONNX Runtime GenAI models are distributed as a complete folder of several files. In the config, you'll therefore enter a directory path, not a path to a specific file.
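
To make the folder-versus-file distinction concrete, here is a minimal Python sketch that checks a path before you enter it in the config. It assumes a usable folder contains genai_config.json and at least one .onnx file, as described in the download steps later in this guide. The function name check_model_folder is made up for illustration and is not part of the agent.

```python
from pathlib import Path


def check_model_folder(path_str: str) -> list[str]:
    """Return a list of problems found; an empty list means the folder looks usable."""
    path = Path(path_str)
    if path.is_file():
        # The config expects the directory, not one of the files inside it.
        return [f"{path} is a file; enter the folder that contains it instead."]
    if not path.is_dir():
        return [f"{path} does not exist or is not a folder."]

    problems = []
    # Assumption: these two checks mirror the files this guide says to look for.
    if not (path / "genai_config.json").is_file():
        problems.append("genai_config.json is missing.")
    if not any(path.glob("*.onnx")):
        problems.append("No .onnx files found in the folder.")
    return problems
```

Calling check_model_folder(r"C:\AI\models\phi3-vision-dml") returns an empty list when the folder looks complete, and a list of readable messages otherwise. It only checks that the expected files are present, not that the model itself is valid.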

Choosing a Model

Vision-Language Models (VLM) Only

Because the agent relies strictly on visual perception to operate, text-only ONNX models (such as standard Phi-3-mini or Llama-3) will not work. You must use a multimodal vision-language model (VLM). These typically have -Vision- or VL in the name; if not, vision support may be mentioned on the "model card" (basically the readme page of the model).

Example Vision Model Families:

  • Phi-3 / Phi-3.5 / Phi-4 Vision (e.g., Phi-3-vision-128k-instruct, Phi-4-multimodal-instruct)
  • Gemma-3 Vision (e.g., gemma-3-4b-it)
  • Qwen-2.5-VL (e.g., Qwen2.5-VL-7B-Instruct)

IMPORTANT: Not every model has an ONNX version available.

Finding Pre-Converted Models on Hugging Face

Many popular models are pre-converted and optimized specifically for ONNX Runtime GenAI.

  1. Search Hugging Face for models containing -onnx or "ONNX Runtime GenAI".
  2. Example repositories:
  3. If there are multiple folders with names like cpu and gpu, open the one that matches your hardware (for example, whether or not you have a dedicated GPU).
    • Note: If there's a nested folder inside, keep going until you reach the one containing genai_config.json, the .onnx files, and the other model files.
  4. Download the entire folder contents into a folder on your machine.
    • You don't need to download the entire repository, just the single folder with the model files (the folder containing genai_config.json).
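
If you prefer a script over the browser, the single-folder download can be sketched with the huggingface_hub Python package (an assumption; the agent itself does not require it). The helper names below are made up for illustration, and the repository ID and subfolder in the usage note are placeholders, not real repositories.

```python
def folder_patterns(subfolder: str) -> list[str]:
    """Build an allow-list pattern matching every file under one repo subfolder."""
    cleaned = subfolder.strip("/")
    # An empty subfolder means "everything in the repository".
    return [f"{cleaned}/*"] if cleaned else ["*"]


def download_model_folder(repo_id: str, subfolder: str, local_dir: str) -> str:
    """Download only `subfolder` of a Hugging Face repo into `local_dir`."""
    # Imported here so folder_patterns() works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=folder_patterns(subfolder),
        local_dir=local_dir,
    )
```

A call would look like download_model_folder("some-org/some-model-onnx", "gpu/some-variant", r"C:\AI\models\my-model"), with both the repository ID and the subfolder replaced by real values from the repo page. The repository's folder structure is preserved under local_dir, so the path to enter in the agent is the innermost downloaded folder that contains genai_config.json, not local_dir itself.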

🧐 Tip: You can use this browser tool I made to automatically download all the files in a Hugging Face folder


Repo Page Screenshots:

Image showing list of files at the top level of the 'Phi-3-vision' repository, with arrows pointing to folders called 'cpu_and_mobile' and 'gpu'

Folder deeper in the repository within the 'gpu' directory with all the files highlighted and text that says 'You need all the files in the folder'


Configuring the Agent

Once your model folder is ready, configure it in Thio's Universal Agent web UI:

Navigate to the Config menu. Under Provider Settings, select Local ONNX.

Configure the settings:

  • Click "Detect Capabilities" to load the available settings, such as the Execution Provider options, for your system.
  • Model Folder Path: Enter the absolute path to your model directory (e.g., C:\AI\models\phi3-vision-dml). Do not append a file name to this path.
  • Execution Provider (EP):
    • DML (DirectML): Recommended for Windows users. It runs on most hardware (AMD, Intel, NVIDIA).
    • CUDA: If this option is available, select it when you have an NVIDIA GPU and the CUDA Toolkit installed.
    • CPU: A compatible fallback, but slow for processing vision/screenshots.
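
The choice between these providers can be sketched as a small preference lookup. This is not how "Detect Capabilities" works internally. It is an illustration that assumes the separate onnxruntime Python package is installed and uses its provider names. The function names and the default order (DML, then CUDA, then CPU) are assumptions; reorder the preference to suit your hardware.

```python
# Maps ONNX Runtime provider names to the labels used in this guide.
ORT_TO_LABEL = {
    "DmlExecutionProvider": "DML",
    "CUDAExecutionProvider": "CUDA",
    "CPUExecutionProvider": "CPU",
}


def pick_execution_provider(available, preference=("DML", "CUDA", "CPU")):
    """Return the first preferred label that is available, falling back to CPU."""
    labels = {ORT_TO_LABEL[name] for name in available if name in ORT_TO_LABEL}
    for label in preference:
        if label in labels:
            return label
    return "CPU"


def detect_available():
    """List the providers reported by the installed onnxruntime package."""
    # Imported here so pick_execution_provider() works without onnxruntime.
    import onnxruntime

    return onnxruntime.get_available_providers()
```

For example, pick_execution_provider(detect_available()) returns "CUDA" on a machine where only the CUDA and CPU providers are reported, and "CPU" when nothing else is found.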