Declan142/ghatak-gpu

ghatak-gpu

GPU-level judgment daemon for NVIDIA consumer cards — sibling to ghatak-ryzen.

Current target: GTX 1660 (Turing TU116, sm_75, 6 GB, 120 W locked).

Philosophy matches ghatak-ryzen: a cheap local loop handles fast execution (fan PWM, clock lock), while Opus-class judgment lives one layer up and switches profiles based on workload classification. No tensor-core fantasies, no power-limit tuning (consumer Turing locks it), no MIG. Only wins that actually exist on this silicon.

What ships today

| Piece | What it does | State |
| --- | --- | --- |
| scripts/governor.sh | Unified fan curve + clock-lock governor. Classifies workload (idle / browse / record / compute) via NVENC session + compute process + util, picks clock cap, picks fan % from temp. | Live |
| scripts/fan-curve.sh | Fan-only variant (no clock lock). Fallback if clock-lock causes trouble. | Live |
| systemd/ghatak-gpu-fan.service | User-level systemd unit, auto-starts on boot via linger. | Live |
| systemd/nvidia-persistence-on.service | System-level oneshot: runs nvidia-smi -pm 1 on every boot. | Live |
| config/xorg/20-nvidia-coolbits.conf | Coolbits=28 → unlocks manual fan + clock offset via nvidia-settings. Requires X restart. | Live |
| config/sudoers/ghatak-gpu | Passwordless nvidia-smi --lock-gpu-clocks / --reset-gpu-clocks for the governor. | Live |
| config/chrome-policy/ghatak-extensions.json | Force-installs enhanced-h264ify, so YouTube skips AV1 (no HW decode on Turing) and falls back to VP9/H.264 on NVDEC. | Live |
| config/obs-profile/Ghatak_NVENC/ | OBS 30+ tuned profile: HEVC NVENC, CQP 20, B-frames 3, look-ahead, psycho-visual AQ, preset p6, tune HQ. | Live |

Governor workload classes

| Class | Clock cap | Trigger |
| --- | --- | --- |
| idle | 300-1200 MHz | Nothing else matches |
| browse | 300-1400 MHz | Chrome/Firefox GPU process present, no NVENC, util < 40% |
| record | Unlocked (300-2115) | encoder.stats.sessionCount > 0 (OBS / any NVENC user) |
| compute | Unlocked | Non-browser CUDA process present, or util > 40% (games) |
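
The class table can be sketched as two small shell functions. This is a minimal illustration, not the shipped governor.sh. The function names, the argument order, and the precedence (record, then compute, then browse, then idle) are assumptions. The table does not say what happens at exactly 40% utilization, so the sketch treats it as compute.

```shell
# Hypothetical sketch of the workload classes; not the shipped governor.sh.

# classify_workload NVENC_SESSIONS HAS_COMPUTE HAS_BROWSER UTIL_PCT
#   NVENC_SESSIONS: value of encoder.stats.sessionCount
#   HAS_COMPUTE:    1 if a non-browser CUDA process is present, else 0
#   HAS_BROWSER:    1 if a Chrome/Firefox GPU process is present, else 0
#   UTIL_PCT:       GPU utilization in percent (integer)
classify_workload() {
  local nvenc="$1" compute="$2" browser="$3" util="$4"
  if [ "$nvenc" -gt 0 ]; then
    echo record
  elif [ "$compute" -eq 1 ] || [ "$util" -ge 40 ]; then
    echo compute      # assumption: exactly 40% counts as compute
  elif [ "$browser" -eq 1 ]; then
    echo browse
  else
    echo idle
  fi
}

# clock_args CLASS: print the nvidia-smi argument for that class's clock cap.
clock_args() {
  case "$1" in
    idle)   echo "--lock-gpu-clocks=300,1200" ;;
    browse) echo "--lock-gpu-clocks=300,1400" ;;
    *)      echo "--reset-gpu-clocks" ;;   # record and compute run unlocked
  esac
}
```

A caller might combine the two as `sudo nvidia-smi $(clock_args "$(classify_workload 0 0 1 12)")`. The session count and utilization could be read with `nvidia-smi --query-gpu=encoder.stats.sessionCount,utilization.gpu --format=csv,noheader,nounits`; how the real script detects browser and CUDA processes is not shown here.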

Fan curve (temp → %):

<45°C → 0   45-50 → 25   50-60 → 40   60-70 → 60   70-78 → 80   78+ → 100
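
The curve above maps directly onto a threshold ladder. The sketch below is an illustration, not the shipped script. It assumes each band includes its lower bound (so 50 °C gives 40 %), and the `[gpu:0]` / `[fan:0]` indices in `set_fan` are assumptions for a single-GPU, single-fan-controller machine.

```shell
# fan_pct TEMP_C: print the fan duty in percent for an integer temperature.
# Assumption: each band includes its lower bound.
fan_pct() {
  local t="$1"
  if   [ "$t" -lt 45 ]; then echo 0
  elif [ "$t" -lt 50 ]; then echo 25
  elif [ "$t" -lt 60 ]; then echo 40
  elif [ "$t" -lt 70 ]; then echo 60
  elif [ "$t" -lt 78 ]; then echo 80
  else                       echo 100
  fi
}

# set_fan PCT: apply the duty via nvidia-settings.
# Needs Coolbits enabled and a running X session; indices are assumptions.
set_fan() {
  nvidia-settings -a "[gpu:0]/GPUFanControlState=1" \
                  -a "[fan:0]/GPUTargetFanSpeed=$1"
}
```

For example, `set_fan "$(fan_pct 63)"` would request 60 %.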

Install (on a fresh machine)

git clone https://github.com/Declan142/ghatak-gpu ~/ghatak-gpu
cd ~/ghatak-gpu

# Persistence mode + boot service
sudo cp systemd/nvidia-persistence-on.service /etc/systemd/system/
sudo systemctl enable --now nvidia-persistence-on

# Coolbits (requires reboot or display-manager restart)
sudo cp config/xorg/20-nvidia-coolbits.conf /etc/X11/xorg.conf.d/
# ⚠ Edit the BusID inside to match `nvidia-xconfig --query-gpu-info`

# Passwordless clock ops for governor
sudo cp config/sudoers/ghatak-gpu /etc/sudoers.d/
sudo chmod 440 /etc/sudoers.d/ghatak-gpu

# Governor daemon (user scope)
mkdir -p ~/tools/ghatak-gpu
cp scripts/governor.sh ~/tools/ghatak-gpu/
chmod +x ~/tools/ghatak-gpu/governor.sh
mkdir -p ~/.config/systemd/user
cp systemd/ghatak-gpu-fan.service ~/.config/systemd/user/
sudo loginctl enable-linger "$USER"
systemctl --user daemon-reload
systemctl --user enable ghatak-gpu-fan   # will start after reboot once Coolbits is active

# Chrome extension force-install (enhanced-h264ify)
sudo mkdir -p /etc/opt/chrome/policies/managed
sudo cp config/chrome-policy/ghatak-extensions.json /etc/opt/chrome/policies/managed/

# OBS tuned profile
cp -r config/obs-profile/Ghatak_NVENC ~/.config/obs-studio/basic/profiles/
# Then: OBS → Profile → Ghatak_NVENC
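
Before running the steps above, it can help to confirm that the tools they call are on PATH. The helper below is a hypothetical convenience and is not part of the repository.

```shell
# missing_tools CMD...: print each command that is not on PATH, one per line.
# Prints nothing when every command is available.
missing_tools() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}
```

Usage: `missing_tools git nvidia-smi nvidia-settings nvidia-xconfig systemctl loginctl`. After copying the sudoers file, `sudo visudo -cf /etc/sudoers.d/ghatak-gpu` checks its syntax before the next sudo call depends on it.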

Verify (after reboot)

nvidia-smi --query-gpu=persistence_mode,fan.speed,temperature.gpu,clocks.gr --format=csv
systemctl --user is-active ghatak-gpu-fan
journalctl --user -u ghatak-gpu-fan -f

Expected idle values, in query order: persistence mode Enabled, fan 25 %, ~47 C, ~600 MHz graphics clock.
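
The idle check can also be scripted. The function below is a sketch, not part of the repository. It assumes the query is run with `--format=csv,noheader,nounits` so each field is a bare value, and it only tests the two conditions that the tables pin down: persistence mode enabled and the graphics clock within the 1200 MHz idle cap.

```shell
# check_idle LINE: LINE is one row from
#   nvidia-smi --query-gpu=persistence_mode,fan.speed,temperature.gpu,clocks.gr \
#              --format=csv,noheader,nounits
# for example "Enabled, 25, 47, 600".
check_idle() {
  local pm fan temp clk
  # Split on commas and surrounding spaces.
  IFS=', ' read -r pm fan temp clk <<EOF
$1
EOF
  if [ "$pm" != "Enabled" ]; then
    echo "persistence mode is '$pm', expected Enabled"
    return 1
  fi
  if [ "$clk" -gt 1200 ]; then
    echo "graphics clock $clk MHz exceeds the 1200 MHz idle cap"
    return 1
  fi
  echo "ok: fan $fan %, $temp C, $clk MHz"
}
```

Usage: `check_idle "$(nvidia-smi --query-gpu=persistence_mode,fan.speed,temperature.gpu,clocks.gr --format=csv,noheader,nounits)"`.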

Rollback

# Kill governor
systemctl --user disable --now ghatak-gpu-fan

# Revert Coolbits
sudo rm /etc/X11/xorg.conf.d/20-nvidia-coolbits.conf

# Everything else is a plain file copy: remove it with rm, using the paths from the Install section
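
A fuller rollback might look like the sketch below. It is a hypothetical helper, not a script shipped in the repository. The paths are taken from the Install section, and the order (resetting clocks before removing the sudoers rule) is a design choice of this sketch. By default it only prints the commands; pass `--apply` to run them.

```shell
# rollback [--apply]
# Prints the undo commands by default; --apply actually runs them.
rollback() {
  local run=echo
  if [ "${1:-}" = "--apply" ]; then run=; fi

  # User-scope governor
  $run systemctl --user disable --now ghatak-gpu-fan
  $run rm -f "$HOME/.config/systemd/user/ghatak-gpu-fan.service"
  $run rm -rf "$HOME/tools/ghatak-gpu"

  # Release any clock lock while the passwordless sudoers rule still exists
  $run sudo nvidia-smi --reset-gpu-clocks
  $run sudo rm -f /etc/sudoers.d/ghatak-gpu

  # System-level pieces
  $run sudo rm -f /etc/X11/xorg.conf.d/20-nvidia-coolbits.conf
  $run sudo systemctl disable --now nvidia-persistence-on
  $run sudo rm -f /etc/systemd/system/nvidia-persistence-on.service
  $run sudo rm -f /etc/opt/chrome/policies/managed/ghatak-extensions.json

  # OBS profile
  $run rm -rf "$HOME/.config/obs-studio/basic/profiles/Ghatak_NVENC"
}
```

Running `rollback` first and reading the output before `rollback --apply` keeps the destructive step explicit. Removing the Coolbits file still needs an X restart to take effect.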

What's NOT here (and why)

  • Power-limit tuning: consumer Turing has power.max_limit locked at TDP. nvidia-smi -pl rejects writes.
  • MIG partitioning: datacenter-only (A100/H100).
  • Tensor-core / FP8 / FlashAttention paths: TU116 has no tensor cores.
  • AV1 encode (OBS): Turing NVENC is 7th-gen and encodes H.264 and HEVC only. AV1 encode arrives with Ada Lovelace (RTX 40-series); Ampere adds AV1 decode only.
  • DLSS: no tensor cores. FSR is the vendor-agnostic substitute: FSR 1 upscaling via Gamescope, or in-game FSR where a title supports it (not packaged here yet).
  • Training loops: 6 GB VRAM. Inference of 7B-Q4 is the ceiling.

Roadmap

See ROADMAP.md. Phases:

  1. ✅ Telemetry + baseline profiles (this release)
  2. ✅ Fan curve + clock-lock governor
  3. ◻ VRAM arbiter (Ollama / ComfyUI coexistence)
  4. ◻ Per-model inference tuner (n_gpu_layers / KV quant sweep + freeze)
  5. ◻ Kernel/framework advisor (xformers vs sdp, torch.compile modes)
  6. ◻ Thermal/OC explore (Coolbits clock offset, stability harness)

License

MIT. Not affiliated with NVIDIA.
