Deploying a Gonka validator today requires executing 50+ manual steps across GPU drivers, Docker configuration, key management, compose file editing, port security, and on-chain registration. Analysis of ~9,700 messages from the Gonka validator DevOps chat reveals that operators face compounding complexity at every stage:
Day-1 (Setup):
Install and validate NVIDIA drivers, Container Toolkit, and Fabric Manager across different Linux distros
Detect GPU architecture (sm_80–sm_120) and choose the correct MLNode image (standard vs Blackwell)
Calculate optimal tensor-parallel size based on GPU count, VRAM, and model requirements
Set gpu-memory-utilization correctly (0.99 causes OOM under load; 30+ chat messages tie an elevated miss rate to this setting)
Fill 15+ environment variables in config.env, edit node-config.json, configure docker-compose.yml
Bind internal ports to 127.0.0.1 (Docker bypasses UFW, so an exposed port 5050 means a hijacked node)
Configure DDoS protection, pruning, persistent peers, state sync
Manage three separate keys (account, consensus, ML operational)
Download 200+ GB model weights
Register on-chain and grant ML permissions
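The GPU-related decisions in the Day-1 list can be illustrated with a short sketch. It is not NOP's actual logic: the function names are hypothetical, the sketch assumes that Blackwell GPUs are the ones reporting compute capability sm_100 or higher, and the sizing rule (smallest power-of-two GPU count whose usable VRAM covers a caller-supplied memory requirement) is a simplification that ignores constraints such as the model's attention-head count.

```python
def mlnode_image_variant(sm: int) -> str:
    """Pick an image variant from a compute capability such as 90 for sm_90.

    Assumption: sm_100 and newer are Blackwell; sm_80 to sm_90 use the
    standard image. The variant labels are illustrative, not real image tags.
    """
    if sm < 80:
        raise ValueError(f"sm_{sm} is below the sm_80 minimum")
    return "blackwell" if sm >= 100 else "standard"


def tensor_parallel_size(gpu_count: int, vram_gb_per_gpu: float,
                         model_gb: float, utilization: float = 0.90) -> int:
    """Return the smallest power-of-two GPU count that fits the model.

    model_gb is the total memory the model needs (weights plus runtime
    overhead) and must be supplied by the caller. utilization plays the
    role of gpu-memory-utilization; 0.90 is an assumed default.
    """
    tp = 1
    while tp <= gpu_count:
        if tp * vram_gb_per_gpu * utilization >= model_gb:
            return tp
        tp *= 2
    raise ValueError("model does not fit on the available GPUs")
```

With eight 80 GB GPUs and a 250 GB memory requirement, this rule picks a tensor-parallel size of 4.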
Day-2 (Operations — 90% of operator time):
Monitor miss rate, sync lag, epoch participation, PoC weight
Fix stuck nodes after missed chain upgrades (download correct binaries, place in Cosmovisor dirs)
Manage ML nodes via Admin API (curl commands with JSON payloads)
Recover disk space from Cosmovisor backups
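As an illustration of the disk-space item above, the following sketch only reports how much space backup directories occupy and leaves any deletion to the operator. It assumes that Cosmovisor writes its pre-upgrade backups as directories named `data-backup-<date>` inside the node home; the function names are hypothetical and the location should be checked against the actual Cosmovisor configuration.

```python
from pathlib import Path


def dir_size_bytes(path: Path) -> int:
    """Sum the sizes of all regular files below path."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def find_cosmovisor_backups(home: Path) -> list[tuple[Path, int]]:
    """List backup directories under home with their sizes, largest first.

    Assumption: backups are directories whose names start with
    "data-backup-". Nothing is deleted here.
    """
    backups = [
        (p, dir_size_bytes(p))
        for p in home.glob("data-backup-*")
        if p.is_dir()
    ]
    return sorted(backups, key=lambda item: item[1], reverse=True)
```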
Each step has documented failure modes. Validators regularly break their nodes by setting gpu-memory-utilization too high, exposing internal ports, using wrong MLNode images for their GPU architecture, or botching the 6-step update process.
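Two of these failure modes lend themselves to a pre-flight check. The sketch below is a hypothetical validator, not NOP's implementation: it assumes port mappings use the Docker Compose short syntax (`[HOST_IP:]HOST_PORT:CONTAINER_PORT[/PROTOCOL]`), that the caller passes only the mappings meant to stay internal, and that 0.95 is an acceptable ceiling for gpu-memory-utilization (the source only states that 0.99 is too high).

```python
import ipaddress


def binds_loopback_only(port_mapping: str) -> bool:
    """Return True if a Compose short-syntax mapping publishes on loopback."""
    spec = port_mapping.split("/", 1)[0]      # drop a "/tcp" or "/udp" suffix
    parts = spec.rsplit(":", 2)               # [host_ip, host_port, container_port]
    if len(parts) < 3:
        return False                          # no host IP: published on all interfaces
    host_ip = parts[0].strip("[]")            # IPv6 addresses are written in brackets
    try:
        return ipaddress.ip_address(host_ip).is_loopback
    except ValueError:
        return False


def find_problems(gpu_memory_utilization: float,
                  internal_port_mappings: list[str],
                  max_utilization: float = 0.95) -> list[str]:
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = []
    if not 0.0 < gpu_memory_utilization <= max_utilization:
        problems.append(
            f"gpu-memory-utilization {gpu_memory_utilization} is outside (0, {max_utilization}]"
        )
    for mapping in internal_port_mappings:
        if not binds_loopback_only(mapping):
            problems.append(f"port mapping {mapping!r} is not bound to loopback")
    return problems
```

For example, `find_problems(0.99, ["5050:5050"])` reports both the utilization value and the unbound port, while `find_problems(0.90, ["127.0.0.1:5050:5050"])` returns an empty list.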
Solution: Gonka NOP
Gonka NOP is an open-source Go CLI that automates the entire validator lifecycle — from bare metal to producing PoC proofs — in a single command.
Consumed. NOP generates the same compose files and configs, but with automated GPU detection, security hardening, and validated parameters. NOP fetches the latest image versions from this directory at setup time.
Complementary. NOP focuses on deployment and operations; the exporter focuses on monitoring. NOP's centralized monitoring stack (Ansible) integrates with the exporter.
Overlapping scope, different approach. Node Manager proposes a web-based control plane. NOP is a CLI tool following the Kubernetes GPU Operator philosophy: detect hardware, configure runtime, deploy workloads. Both can coexist (NOP for deployment, Node Manager for web-based fleet management).
Gonka NOP: One-Command Validator Deployment CLI
Category: Proposals
Labels: enhancement, tooling, infrastructure, devops
Demo
Repository: github.com/inc4/gonka-nop
What It Automates
gonka-nop status (unified dashboard)
gonka-nop update
gonka-nop repair
Architecture
Gonka NOP is a single static binary (Go, zero runtime dependencies) with a phased setup wizard:
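A phased wizard of this kind can be pictured as an ordered list of steps that share state and stop at the first failure. The sketch below illustrates that pattern only; the phase names, the placeholder GPU count, and the `Phase` and `run_phases` helpers are hypothetical and do not reflect NOP's real phases or code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Phase:
    name: str
    run: Callable[[dict], None]  # reads and updates shared state, raises on failure


def run_phases(phases: list[Phase], state: dict) -> list[str]:
    """Run phases in order and return the names of the completed ones.

    The first failing phase aborts the run, so later phases never see
    incomplete state.
    """
    completed = []
    for phase in phases:
        try:
            phase.run(state)
        except Exception as exc:
            raise RuntimeError(f"phase {phase.name!r} failed: {exc}") from exc
        completed.append(phase.name)
    return completed


def detect_gpus(state: dict) -> None:
    state["gpu_count"] = 8  # placeholder value standing in for real detection


def generate_config(state: dict) -> None:
    # Later phases consume what earlier phases recorded.
    state["config"] = {"gpu_count": state["gpu_count"]}
```

Calling `run_phases([Phase("detect_gpus", detect_gpus), Phase("generate_config", generate_config)], {})` runs both steps in order; swapping them makes the second one fail with a `RuntimeError` that names the phase.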
Day-2 Operations
Non-Interactive Mode
For datacenter operators managing fleets via Ansible/Terraform:
    gonka-nop setup --yes \
      --network mainnet \
      --key-workflow quick \
      --key-name my-key \
      --keyring-password "$PASS" \
      --public-ip 1.2.3.4 \
      --hf-home /data/hf

All interactive prompts have CLI flag overrides. Compatible with SSH automation, CI/CD pipelines, and batch provisioning scripts.
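For batch provisioning, the same invocation can be generated per host. The helper below is a hypothetical sketch that uses only the flags shown in the command above; it returns a shell command string and leaves `"$PASS"` unexpanded so that the password is read from the environment of the shell that eventually runs it rather than being embedded in the string.

```python
import ipaddress
import shlex


def build_setup_command(key_name: str, public_ip: str, hf_home: str,
                        network: str = "mainnet") -> str:
    """Build a non-interactive gonka-nop setup command for one host."""
    ipaddress.ip_address(public_ip)  # raises ValueError on a malformed address
    args = [
        "gonka-nop", "setup", "--yes",
        "--network", network,
        "--key-workflow", "quick",
        "--key-name", key_name,
        "--public-ip", public_ip,
        "--hf-home", hf_home,
    ]
    # shlex.join quotes each argument; the password reference is appended
    # verbatim so the executing shell expands $PASS itself.
    return shlex.join(args) + ' --keyring-password "$PASS"'
```

Quoting through `shlex.join` keeps values such as key names with spaces from being split by the shell.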
Security Defaults
Gonka NOP applies security hardening by default, based on real-world attack patterns documented in the validator chat:
Internal ports bound to 127.0.0.1
GONKA_API_BLOCKED_ROUTES=poc-batches training, chain API/RPC/GRPC disabled
Current Status
Gonka NOP is production-ready and actively used on mainnet.
Test coverage: 190+ test functions across 11 test files.
Relationship to Existing Work
Roadmap
Completed (v0.1.8)
Next (v0.2.0)
(--type network / --type mlnode for separate servers)
gonka-nop unjail command
Future
Who We Are
inc4 team — operating validators on Gonka mainnet and testnet. We analyzed ~9,700 messages from the validator DevOps chat to understand operational pain points and built Gonka NOP to address them systematically. The tool is open-source (MIT) and designed to lower the barrier for new validators while reducing operational burden for existing ones.
Gonka NOP is independently developed and maintained. It consumes the official Gonka deployment configs and does not require any protocol changes. Feedback and contributions welcome.