# Performance
You may refer to:
- About the Chart
- Discrete GPU Master Chart
- Integrated Graphics / APU Master Chart
- Intel Integrated Graphics Coverage Notes
- AMD Integrated Graphics Coverage Notes
- Practical Interpretation for ONCard
- Important Notes
## About the Chart

This chart visualizes your expected in-app experience based on your GPU. The chart contains synthetic data produced by giving GPT5.4 math tools. Tested cards: RTX 3060 and RTX 5070. The data assumes you run Ollama with the default context lengths and the context lengths recommended by the app, on Windows 11 without any user modifications; changing any of these may significantly affect performance.
Credits to rizzwixk for benchmarking the RTX 3060.
How to read this:
- Desktop GPUs are the baseline.
- Laptop GPUs usually land 0.5 to 1.5 tiers lower unless they are high-TGP models with strong cooling.
- Integrated graphics depend heavily on dual-channel memory, RAM speed, power limits, cooling, and driver maturity.
- The charts give you a general idea of how your system will perform; they are not guarantees.
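As a rough illustration of the laptop adjustment above, the tier ladder can be treated as an ordered list and shifted down a step for typical laptop implementations. This is only a sketch with tier labels matching the chart below; the function name and the one-step default are my own assumptions, and real placement depends on TGP and cooling.

```python
# Ordered tier ladder from the discrete GPU chart, best to worst.
TIERS = ["S+", "S", "A+", "A", "A-", "B+", "B", "C+", "C", "D", "E", "F"]

def laptop_tier(desktop_tier: str, steps_down: int = 1) -> str:
    """Shift a desktop tier down by a number of steps.

    Laptop GPUs usually land 0.5 to 1.5 tiers lower; one full step is a
    middle-of-the-road guess. Clamps at the bottom of the ladder.
    """
    i = TIERS.index(desktop_tier)
    return TIERS[min(i + steps_down, len(TIERS) - 1)]
```

For example, `laptop_tier("A")` returns `"A-"`, while a high-TGP model with strong cooling may stay in its desktop tier.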
## Discrete GPU Master Chart

| Tier | Expected NVIDIA Hardware | Expected AMD Hardware | Expected VRAM | Expected Performance | Expected TPS | Experience |
|---|---|---|---|---|---|---|
| S+ Flagship | RTX 5090, RTX 6000 Ada, RTX 5090 Laptop (very high power only) | Radeon PRO W7900, RX 7900 XTX (best ROCm setups), Radeon AI PRO R9700 | 24-48 GB | Extreme local inference headroom | 550+ TPS | Instant response generation. Massive overhead for long chats, grading, and future larger local models. |
| S Tier Enthusiast | RTX 5080, RTX 4090, RTX 4080 SUPER, RTX 3090 Ti, RTX A5500, RTX A5000, RTX 5000 Ada Laptop (high power) | RX 9070 XT, RX 7900 XT, PRO W7800, RX 7900M (best laptop implementations) | 16-24 GB | Outstanding | 275-400 TPS | Extremely fast. ONCard should feel effortless, even under heavier use. |
| A+ Best Tier | RTX 5070 Ti, RTX 4070 Ti SUPER, RTX 4070 Ti, RTX 3090, RTX 3080 Ti 12GB, RTX A4500, RTX 4080 Laptop (high power) | RX 9070, RX 7900 GRE, RX 6950 XT, RX 6900 XT, RX 6800 XT (best configurations) | 12-24 GB | Very high-end | ~200+ TPS | Best-tier experience. Fast grading, responsive follow-up chat, strong batch work. |
| A Best Tier | RTX 5070, RTX 4070 SUPER, RTX 4070, RTX 3080 12GB, RTX 3080 10GB, RTX A4000 Ada, RTX 4070 Laptop (high TGP), RTX 3080 Laptop 16GB | RX 7800 XT, RX 6800 XT, RX 6800, RX 7900 GRE laptop-class equivalents where applicable | 10-16 GB | High-end | 140+ TPS | Best-tier in real use. This is the class my personal RTX 5070 falls into for ONCard. |
| A- Lower Best / Upper Smooth | RTX 5060 Ti 16GB, RTX 3070 Ti, RTX 3070, RTX A4000, RTX 3060 Ti, RTX 2000 Ada, RTX 3080 Laptop 8GB, RTX 3070 Laptop (high TGP) | RX 7700 XT, RX 6750 XT, RX 6700 XT, PRO W7700, RX 6800M, RX 7800M | 8-16 GB | Strong upper-midrange | ~95-130 TPS | Very smooth. Core ONCard features feel fast. Some heavier tasks may show only slight delay. |
| B+ Smooth | RTX 3060 12GB, RTX 2080 Ti, Titan Xp, RTX 2070 SUPER, Quadro RTX 5000, RTX 3060 Laptop (high TGP) | RX 7600 XT, RX 6700, RX 6700M, RX 6800S, RX 7700S, RX 7600M XT | 8-12 GB | Strong value / older enthusiast | 80-110 TPS | Smooth and responsive. This is where strong “good enough for everything” local ONCard use begins. Tested RTX 3060 12GB sits at the top edge of this band and can touch Best Tier. |
| B Smooth / Normal Crossover | GTX 1080 Ti, RTX 4060 Ti 16GB, RTX 4060 Ti 8GB, RTX 2080 SUPER, RTX 2080, RTX 2060 12GB, Quadro RTX 4000, RTX 3070 Laptop (lower TGP), RTX 4060 Laptop | RX 7600, RX 6650 XT, RX 6600 XT, RX 5700 XT, RX 7600S, RX 7600M | 8-16 GB | Good but memory / bandwidth / power-limit sensitive | 30-45 TPS | Fully usable. Slight delays appear on longer answers, grading, or bigger batches, but the app still feels solid. |
| C+ Normal | RTX 4060, RTX 2070, RTX 2060 SUPER, RTX 4050 Laptop, RTX 3050 8GB desktop (best case), Tesla P100, Tesla P40 | RX 6600, RX 5700, RX 5600 XT, RX 6600M, RX 6700S, RX 6650M | 6-16 GB | Midrange / older pro / bandwidth-sensitive | 55-95 TPS | Noticeable but acceptable wait time. Fine for everyday study sessions if expectations are realistic. |
| C Normal | RTX 3050 8GB, RTX 3050 6GB, GTX 1660 Ti, GTX 1660 SUPER, GTX 1070 Ti, RTX 2050 Laptop, RTX 3050 Laptop, Tesla T4 | RX 6500 XT, RX 5500 XT, RX 5300M, RX 5500M, RX 5600M | 4-8 GB | Entry-level discrete | ~30 TPS | Usable, but slower. Keep context moderate and avoid very heavy local workloads. |
| D Entry / Borderline | GTX 1660, GTX 1060 6GB, GTX 1650 GDDR6, GTX 980 Ti, MX570, MX550, Tesla P4, Quadro P4000 | RX 580 8GB, RX 590, Vega 56, Vega 64, RX 6400, RX 5500 (OEM) | 4-8 GB | Limited | 10-28 TPS | Borderline comfortable. Core use is possible, but long responses and grading may feel sluggish. |
| E Poor | GTX 1650 4GB, GTX 1050 Ti, GTX 970, MX450, MX350 | RX 570 4GB, RX 560, RX 6400 in weak systems, older mobile Polaris parts | 4 GB or less | Very limited | 6-12 TPS | Local use is possible only for short interactions. Cloud mode is often the better choice. |
| F Very Poor / Not Recommended | GT 1030, GTX 960 2GB, UHD-only laptops paired with weak dGPU, very low-power MX parts | RX 550, legacy mobile Radeon parts with very low bandwidth | 2-4 GB | Severely constrained | Below ~10 TPS | ONCard local AI may become frustrating or fail outright. Cloud mode is the practical option. |
## Integrated Graphics / APU Master Chart

| Tier | Expected Intel Integrated Graphics / CPU Families | Expected AMD Integrated Graphics / APU Families | Effective Shared Memory | Expected Performance | Expected TPS | Experience |
|---|---|---|---|---|---|---|
| S iGPU Halo | No mainstream Intel iGPU currently sits here consistently for ONCard local inference | Ryzen AI Max+ 395 / PRO 395 with Radeon 8060S, Ryzen AI Max 390 / 385 with Radeon 8050S, select Strix Halo systems with fast LPDDR5X | 16-96 GB shared, depending on BIOS / UMA allocation | APU-class monster territory | 55-85 TPS | This is the absurd integrated tier. It behaves more like a lower discrete GPU class than a normal iGPU. |
| A+ Premium iGPU | Core Ultra 9 288V, Core Ultra 7 268V / 266V / 258V, Core Ultra 5 238V / 228V with Arc 140V / 130V, strong cooling, fast LPDDR5X | Ryzen AI 9 HX 370 / PRO 370 with Radeon 890M, Ryzen AI 9 365 / AI 7 PRO 360 with Radeon 880M, top Strix Point systems | 8-32 GB shared | Best current mainstream iGPU class | 28-130 TPS | Very strong for integrated graphics. Good enough for serious local ONCard use on the right laptop. |
| A Strong iGPU | Core Ultra 9 / 7 / 5 H-series Meteor Lake or newer with Intel Arc Graphics enabled on qualifying systems, e.g. Ultra 7 155H / 165H / 265H-class laptops with dual-channel RAM and good cooling | Ryzen 7 8700G, Ryzen 7 8845HS, Ryzen 7 7840HS/U, Ryzen Z1 Extreme, Ryzen 7 6800H/HS/U with 780M / 680M, Ryzen AI 7 350 / AI 7 450 with 860M | 6-24 GB shared | High-end iGPU / APU | 18-80 TPS | Smooth to Normal. Strong enough for practical local ONCard use if the system memory and thermals are good. |
| B+ Upper Mid iGPU | 11th gen Tiger Lake i7/i5 with Iris Xe 96 EU / 80 EU, strong 12th/13th/14th gen mobile i7/i5 with Iris Xe, some Core Ultra U / UL systems with weaker configurations | Ryzen 5 8600G with 760M, Ryzen 5 7640HS/U with 760M, Ryzen 5 6600H/U with 660M, Ryzen 5 8540U-class systems, Ryzen 7 7735HS/U with 680M | 4-16 GB shared | Good iGPU class | 12-75 TPS | Clearly usable. Short to medium ONCard interactions feel fine, though long grading or chat chains slow down. |
| B Mid iGPU | 10th gen Ice Lake G7 / G4 Iris Plus, 12th/13th/14th gen desktop i9/i7/i5 with UHD 770 / 730 but fast system RAM, 11th gen i3 / weaker Iris Xe variants | Ryzen 5 8500G / Ryzen 3 8300G with 740M, Ryzen 5 5625U / 5700U / 5500U Vega 7 / 8, Ryzen 7 5800U Vega 8, Ryzen 7 4800U / 4700U Vega 7 / 8 | 4-12 GB shared | Midrange integrated | 8-16 TPS | Borderline comfortable. Good for light local use, but patience is required. |
| C Lower Mid iGPU | 12th/13th/14th gen i3 desktop with UHD 730 / 710, 10th gen Comet Lake UHD, 8th/9th gen i7/i5 desktop with UHD 630, many U-series laptops with older UHD | Ryzen 3 5300U Vega 6, Ryzen 3 4300U Vega 5, Ryzen 5 3400G Vega 11, Ryzen 5 2400G Vega 11, Ryzen 3 3200G Vega 8 | 2-8 GB shared | Limited but workable | 6-12 TPS | Basic local inference only. Cloud mode starts making a lot of sense here. |
| D Weak Legacy iGPU | 8th gen i3/i5/i7 mobile with UHD 620, 8th/9th gen desktop UHD 610 / 630, older Iris Plus 645 / 655 in thinner laptops, Pentium / Celeron UHD variants | Older Ryzen 2000 / 3000 mobile Vega 3 / 6 / 8, Athlon with Vega graphics, lower-end Ryzen 3 mobile APUs | 1-6 GB shared | Weak | 4-8 TPS | Local ONCard AI is possible only in very short bursts. Not ideal for a smooth experience. |
| E Very Weak / Not Recommended | Intel HD 620-class or below in poorly configured machines, single-channel RAM, thermal-throttled thin laptops | Old Vega 3 entry systems, single-channel low-speed memory APUs, low-power mini PCs with tiny memory bandwidth | 1-4 GB shared | Severely constrained | Below ~13 TPS | Technically possible in some cases, practically unpleasant. Cloud mode is strongly recommended. |
## Intel Integrated Graphics Coverage Notes

| CPU Era / Family | Typical iGPU Families Covered in the Table | Where They Usually Land |
|---|---|---|
| 8th Gen Core i3 / i5 / i7 (U/H/Desktop) | UHD 620, UHD 630, Iris Plus 645 / 655 | D to C |
| 9th Gen Core i3 / i5 / i7 / i9 | UHD 630 | D to C |
| 10th Gen Core i3 / i5 / i7 / i9 | UHD Graphics (Comet Lake), Iris Plus G1 / G4 / G7 (Ice Lake) | C to B |
| 11th Gen Core i3 / i5 / i7 / i9 | UHD, Iris Xe 48 / 80 / 96 EU | C to B+ |
| 12th Gen Core i3 / i5 / i7 / i9 | UHD 710 / 730 / 770, Iris Xe mobile variants | C to B |
| 13th Gen Core i3 / i5 / i7 / i9 | UHD 710 / 730 / 770, Iris Xe mobile variants | C to B |
| 14th Gen Core i3 / i5 / i7 / i9 | UHD 710 / 730 / 770, Iris Xe mobile variants | C to B |
| Core Ultra Series 1 (Meteor Lake) Ultra 5 / 7 / 9 | Intel Graphics or Intel Arc Graphics depending on SKU / memory / OEM config | B to A |
| Core Ultra Series 2 (Lunar Lake / newer V-series) Ultra 5 / 7 / 9 | Arc 130V / 140V | A to A+ |
| Core Ultra Series 2 H / U / HL / UL families | Intel Graphics or Arc Graphics depending on SKU and OEM enablement | B to A |
## AMD Integrated Graphics Coverage Notes

| CPU Era / Family | Typical iGPU Families Covered in the Table | Where They Usually Land |
|---|---|---|
| Ryzen 2000G / 3000G desktop APUs | Vega 8 / Vega 11 | D to C |
| Ryzen 2000 / 3000 mobile | Vega 3 / 6 / 8 / 10 | D to C |
| Ryzen 4000 mobile | Vega 5 / 6 / 7 / 8 | C to B |
| Ryzen 5000 mobile | Vega 6 / 7 / 8 | C to B |
| Ryzen 6000 mobile | 660M / 680M | B to A |
| Ryzen 7000 mobile mixed stack | 660M / 680M / 740M / 760M / 780M depending on SKU | B to A |
| Ryzen 8000G desktop APUs | 740M / 760M / 780M | B to A |
| Ryzen 8000 mobile refresh | 760M / 780M | B+ to A |
| Ryzen AI 300 | 860M / 880M / 890M | A to A+ |
| Ryzen AI 400 | 860M and related newer RDNA 3.5-class iGPU variants | A |
| Ryzen AI Max / Max PRO | 8040S / 8050S / 8060S | S to A+ |
## Practical Interpretation for ONCard

| TPS Band | Practical Meaning |
|---|---|
| 81+ TPS | Best Tier. The system has abundant overhead for ONCard. |
| 38-80 TPS | Smooth. Fast and responsive with only minor delay on longer outputs. |
| 26-37 TPS | Normal. Usable, but response time is noticeable. |
| 10-25 TPS | Poor to borderline. Local inference works, but the study flow slows down significantly. |
| Below 10 TPS | Cloud mode is usually the practical recommendation. |
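To see where your own system lands, `ollama run <model> --verbose` prints an eval rate in tokens per second after each response. The bands above can then be looked up with a small sketch; the boundaries are taken directly from the table, while the function name is my own.

```python
def tps_band(tps: float) -> str:
    """Map a measured tokens-per-second value to the practical bands above."""
    if tps >= 81:
        return "Best Tier"
    if tps >= 38:
        return "Smooth"
    if tps >= 26:
        return "Normal"
    if tps >= 10:
        return "Poor to borderline"
    return "Cloud mode recommended"
```

For example, `tps_band(90)` returns `"Best Tier"`, and `tps_band(30)` returns `"Normal"`.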
## Important Notes

- VRAM capacity and bandwidth both matter. A card with more bandwidth and a wider bus can outperform a newer but narrower card for local inference workloads.
- Laptop GPUs are not the same as desktop GPUs. A high-power laptop implementation can be excellent; a low-power thin-and-light implementation can land much lower.
- Integrated graphics depend on memory more than you expect. Fast dual-channel DDR5 / LPDDR5X can materially change the experience.
- Driver maturity matters, especially on AMD and Intel integrated parts. Backend support changes can move a device up or down in real use.
- For ONCard specifically, once you are above roughly the Smooth band, the app usually feels good. The higher tiers mainly buy you larger headroom, better multitasking, and more comfort with longer contexts.
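To see why dual-channel memory matters so much for integrated graphics, theoretical peak memory bandwidth can be estimated from transfer rate, bus width, and channel count. This is a back-of-the-envelope sketch (real sustained bandwidth is lower), and the function name is my own.

```python
def ddr_bandwidth_gbs(mt_per_s: int, channels: int = 2, bus_bits: int = 64) -> float:
    """Theoretical peak bandwidth in GB/s:
    transfer rate (MT/s) x bus width in bytes x channel count."""
    return mt_per_s * (bus_bits / 8) * channels / 1000

# DDR5-5600 dual channel: 5600 * 8 * 2 / 1000 = 89.6 GB/s.
# Dropping to single channel halves that to 44.8 GB/s, which is why
# single-channel configurations land tiers lower in the iGPU charts.
```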