First public benchmark of llama.cpp speculative decoding on Qwen3.6-35B-A3B with a single RTX 3090 (post PR #19493 merge, 2026-04-19). 19 configurations covering ngram-cache, ngram-mod, and classic draft decoding with the vocab-matched Qwen3.5-0.8B. Finding: no variant achieves a net speedup on Ampere + A3B MoE. Includes raw JSON, plots, and full reproducibility.
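Whether speculative decoding yields a net speedup hinges on the acceptance rate: the target model verifies the draft's tokens in one batch and keeps only the longest agreeing prefix, so a low acceptance rate means the draft model's overhead is paid without saving target-model steps. A minimal sketch of the greedy acceptance rule (assumption: greedy verification, i.e. a draft token is accepted only when it matches the target model's argmax; the function name and toy token IDs are illustrative, not from llama.cpp):

```python
def accepted_prefix(draft_tokens: list[int], target_tokens: list[int]) -> int:
    """Count the leading draft tokens that the target model confirms.

    draft_tokens:  tokens proposed by the small draft model
    target_tokens: argmax tokens the target model produces at the
                   same positions during batched verification
    """
    n = 0
    for d, t in zip(draft_tokens, target_tokens):
        if d != t:
            break  # first mismatch: everything after it is discarded
        n += 1
    return n

# A 5-token draft where the target disagrees at position 3 keeps
# only 3 tokens; the rest of the draft work is wasted.
print(accepted_prefix([1, 2, 3, 9, 9], [1, 2, 3, 4, 5]))  # → 3
```

This is why the benchmark can show "no net speedup" even when drafting works: per decoded token, the draft model's cost plus the batched verification must come in under one plain target-model step, and on a fast-decoding MoE like A3B that margin is thin.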
100% local voice assistant with tool calling, neural TTS, and streaming responses. Runs on an RTX 3090 with Ollama, Kokoro TTS, and FastAPI. Privacy-first AI.