
Setup Guide

OnlyMighty edited this page May 10, 2026 · 11 revisions


Installation

How do I install ONCard?

  1. Download the latest ONCard release.
  2. Locate the downloaded file in your local storage and run the installer.

Note: The app does not currently include a license file.

If Windows Defender shows a warning during installation, you may need to allow the app manually ("More info" -> "Run anyway").


Set Up Your Account

How do I set up my first account?

  • When you open the app for the first time, enter your Name, Profile Name, Age, Grade, Hobbies, and Gender.

(Screenshot: name, age, and profile details form)

If your gender is not "Male" or "Female", enter it in the custom gender box using the following format:

Example:

Gay | they / them

(Screenshot: Wikipedia Search UI)

  • Set the time you would spend on one question with the slider shown below.

(Screenshot: Wikipedia Search shortcut)


Model Installation

How do I install the AI models?

  1. If you already have Ollama, continue to Step 2. Otherwise:

    • Press the "Open Ollama website" button
    • This will redirect you to Ollama's download page
    • After downloading, install it
    • Continue with Step 2
  2. Press the button "Install AI models".

Note: The installer downloads the standard-sized models Gemma3:4b and Nomic-embed-text-v2-moe. The total size is ~5 GB; download time depends on your internet speed.
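If you prefer to script the download instead of using the in-app button, the same models can be pulled through Ollama's local REST API. This is a minimal sketch, not the app's own installer: it assumes a default Ollama install listening on localhost:11434, and uses the model names from the note above (Ollama must have them under those exact tags).

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/pull"  # default Ollama endpoint
MODELS = ["gemma3:4b", "nomic-embed-text-v2-moe"]  # models the app installs

def build_pull_request(model: str) -> urllib.request.Request:
    """Build a POST request asking the local Ollama server to pull one model."""
    payload = json.dumps({"model": model, "stream": False}).encode()
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    # Requires a running Ollama server; each pull may take a while (~5 GB total).
    for model in MODELS:
        with urllib.request.urlopen(build_pull_request(model)) as resp:
            print(model, json.load(resp).get("status"))
```

Equivalently, `ollama pull gemma3:4b` on the command line does the same thing for a single model.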


Performance Testing

How will ONCard perform on my device?

  1. Press the "Test performance" button.

This starts the test by loading 4 synthetic questions. ONCard then considers the time it took to generate each response and how many tokens it could generate within that time.

Here is a breakdown of each result tier:

| Tier | TPS Range | Performance | Experience | What to Do |
|------|-----------|-------------|------------|------------|
| 🟢 Best Tier | 81+ TPS | Extremely fast | Instant responses, smooth even with complex tasks | • Use any model (large or small)<br>• Increase context length freely<br>• Use all features without limits |
| 🟡 Smooth — No Lag | 38–80 TPS | Fast & stable | Quick responses, slight delay on long outputs | • Use recommended models (gemma3:4b, nomic-embed-text-v2-moe)<br>• Use all core features<br>• Keep context length moderate |
| 🟠 Normal | 26–37 TPS | Moderate | Noticeable delay, still usable for study | • Use lighter models only (Ministral-3:3b)<br>• Keep questions short<br>• Consider Ollama Cloud mode<br>• Avoid bulk card generation |
| 🔴 Poor | 10–25 TPS | Slow | Laggy responses, long wait times | • Strongly consider Ollama Cloud mode<br>• Use smallest model only<br>• Keep interactions short<br>• Avoid heavy tasks (large files, bulk cards) |

Note: Below 10 TPS, the test itself may fail, and the local model might not be usable at all. In that case, cloud mode is the only practical option.
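The arithmetic behind the test can be sketched in a few lines. Ollama's generate API reports `eval_count` (tokens generated) and `eval_duration` (time in nanoseconds) in its response, from which tokens-per-second follows directly; the tier lookup is then a simple range check using the thresholds from the table above. This is an illustrative sketch, not ONCard's actual test code:

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """TPS = tokens generated / generation time in seconds.

    Field names follow Ollama's generate-API response, where
    eval_duration is reported in nanoseconds.
    """
    return eval_count / (eval_duration_ns / 1e9)

def tier(tps: float) -> str:
    """Map a TPS value to the tiers from the table above."""
    if tps >= 81:
        return "🟢 Best Tier"
    if tps >= 38:
        return "🟡 Smooth"
    if tps >= 26:
        return "🟠 Normal"
    if tps >= 10:
        return "🔴 Poor"
    return "Below 10 TPS: local model likely unusable, use cloud mode"

# Example: 450 tokens generated in 5 seconds -> 90 TPS -> Best Tier
print(tier(tokens_per_second(450, 5_000_000_000)))
```

The app averages this over its 4 synthetic questions; a single short generation can over- or under-report your real throughput.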

The table above estimates your in-app experience based on your GPU. The figures are synthetic, produced by giving GPT5.4 math tools, for these tested cards: RTX 3060, RTX 5070, and RTX 2060. The estimates assume you run Ollama with the default context lengths (or those recommended by the app) on Windows 11, without user modifications that could significantly affect performance.

Please refer to the Performance Page to view the benchmarks.
