Setup Guide
Refer to the sections below:
How can I install ONCard?
- Download the latest ONCard release.
- Locate the downloaded file in your local storage and install the app.
Note: The app does not currently have a code license file.
If Windows Defender shows a warning during installation, you may need to allow the app manually ("More info" -> "Run anyway").
How can I set up my first account?
- After opening the app for the first time, do the following:
- Enter your Name, Profile Name, Age, Grade, Hobbies, and Gender.
If you are not "Male" or "Female", enter your gender into the custom gender box using the following format:
Example:
Gay | they / them
- Set the time you would spend on one question using the slider below.
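As a rough illustration of the custom gender format above, here is a minimal sketch (hypothetical helper, not ONCard's actual code) that splits an entry of the form "identity | pronouns":

```python
# Hypothetical sketch (not ONCard's actual code): split a custom gender
# entry of the form "identity | pronouns", as in the example above.
def parse_custom_gender(entry: str) -> tuple[str, str]:
    """Split 'Gay | they / them' into ('Gay', 'they / them')."""
    identity, sep, pronouns = entry.partition("|")
    if not sep:
        raise ValueError("expected 'identity | pronouns' format")
    return identity.strip(), pronouns.strip()

print(parse_custom_gender("Gay | they / them"))  # -> ('Gay', 'they / them')
```

The function name and behavior are assumptions for illustration; only the "identity | pronouns" format comes from the guide.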
How can I install the AI models?
Step 1:
If you already have Ollama, continue to Step 2. Otherwise:
- Press the "Open Ollama website" button. This will redirect you to Ollama's download page.
- After downloading, install it.
- Continue with Step 2.
Step 2:
Press the "Install AI models" button.
Note: This installs the standard-sized models gemma3:4b and nomic-embed-text-v2-moe. The total size is ~5 GB, and download time depends on your internet speed.
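The two steps above can be sketched from the command line as well. This is a minimal, hedged sketch (not ONCard's actual installer): it checks whether the `ollama` CLI is on PATH (Step 1) and builds the standard `ollama pull` commands for the models named in this guide (Step 2) without executing them:

```python
import shutil

# Models named in this guide; `ollama pull <model>` is Ollama's standard
# command-line way to download a model.
MODELS = ["gemma3:4b", "nomic-embed-text-v2-moe"]

def ollama_on_path() -> bool:
    """Rough check for Step 1: is the `ollama` CLI installed and on PATH?"""
    return shutil.which("ollama") is not None

def pull_commands(models: list[str]) -> list[list[str]]:
    """Build the `ollama pull` command for each model (dry run: nothing is executed)."""
    return [["ollama", "pull", m] for m in models]

for cmd in pull_commands(MODELS):
    print(" ".join(cmd))
```

Running the printed commands yourself (or pressing the in-app button) triggers the actual ~5 GB download.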
How will ONCard perform on my device?
- Press the "Test performance" button.
This starts a performance test by loading 4 synthetic questions. ONCard then considers the time it took to generate each response and how many tokens it could generate within that time.
Here is a breakdown of each result tier:
| Tier | TPS Range | Performance | Experience | What to Do |
|---|---|---|---|---|
| 🟢 Best Tier | 81+ TPS | Extremely fast | Instant responses, smooth even with complex tasks | • Use any model (large or small) • Increase context length freely • Use all features without limits |
| 🟡 Smooth — No Lag | 38–80 TPS | Fast & stable | Quick responses, slight delay on long outputs | • Use recommended models (gemma3:4b, nomic-embed-text-v2-moe) • Use all core features • Keep context length moderate |
| 🟠 Normal | 26–37 TPS | Moderate | Noticeable delay, still usable for study | • Use lighter models only (Ministral-3:3b) • Keep questions short • Consider Ollama Cloud mode • Avoid bulk card generation |
| 🔴 Poor | 10–25 TPS | Slow | Laggy responses, long wait times | • Strongly use Ollama Cloud mode • Use smallest model only • Keep interactions short • Avoid heavy tasks (large files, bulk cards) |
Note: Below 10 TPS, the test itself may fail, and the local model might not be usable at all. In that case, cloud mode is the only practical option.
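The metric behind the table is tokens per second (tokens generated divided by generation time). Here is a minimal sketch of that calculation and the tier mapping above; the exact boundary handling between adjacent ranges (e.g. 37 vs 38 TPS) is an assumption, since the table lists ranges only:

```python
# Sketch of the tier mapping from the table above: tokens per second
# (tokens generated / seconds elapsed) mapped to a tier name.
def tokens_per_second(tokens: int, seconds: float) -> float:
    return tokens / seconds

def tier(tps: float) -> str:
    # Thresholds taken from the tier table; boundaries between ranges
    # are an assumption (>= lower bound of each range).
    if tps >= 81:
        return "Best Tier"
    if tps >= 38:
        return "Smooth — No Lag"
    if tps >= 26:
        return "Normal"
    if tps >= 10:
        return "Poor"
    return "Below 10 TPS: local model likely unusable; use cloud mode"

print(tier(tokens_per_second(120, 2.0)))  # 60 TPS -> "Smooth — No Lag"
```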
This chart approximates your in-app experience based on your GPU. The data is synthetic, produced by giving GPT5.4 math tools, for the tested cards: RTX 3060, RTX 5070, and RTX 2060. It assumes you run Ollama with the default context lengths and the context lengths recommended by the app, on Windows 11 without user modifications, which may otherwise significantly affect performance.
Please refer to the Performance Page to view the benchmarks.