AINode

Turn any NVIDIA GPU into your local AI platform. Chat, API, and fine-tuning in your browser. One command to start. Automatic multi-node clustering. Apache 2.0.




Install in 30 seconds

curl -fsSL https://ainode.dev/install | bash

That's it. The script pulls the unified container image, registers a systemd service, and opens the chat UI at http://localhost:3000.

Distributed (multi-node) install:

curl -fsSL https://ainode.dev/install | AINODE_PEERS="10.0.0.2,10.0.0.3" bash
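(Note that the environment variable has to be set on the `bash` side of the pipe, not on `curl`, or the installer never sees it.) A minimal sketch of how a comma-separated AINODE_PEERS value can be parsed — the variable name comes from the command above, but the parsing logic here is illustrative, not AINode's actual code:

```python
import os

def parse_peers(env_value: str) -> list[str]:
    """Split a comma-separated peer list, dropping blanks and stray whitespace."""
    return [p.strip() for p in env_value.split(",") if p.strip()]

# Example matching the install command above
os.environ["AINODE_PEERS"] = "10.0.0.2,10.0.0.3"
peers = parse_peers(os.environ.get("AINODE_PEERS", ""))
print(peers)  # ['10.0.0.2', '10.0.0.3']
```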

What it looks like

Distributed inference across two DGX Sparks

AINode cluster view — two DGX Sparks with tensor-parallel inference

Two DGX Sparks, one sharded model, 244 GB aggregated VRAM. NCCL over RoCE at 200 Gbps on ConnectX-7.
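The arithmetic behind that setup, as a sketch. The 122 GB-per-node figure is derived from the 244 GB aggregate; the 140 GB model is a hypothetical chosen to show why sharding matters:

```python
# Assumption (for illustration): each DGX Spark contributes 122 GB of VRAM.
vram_per_node_gb = 122
nodes = 2
tensor_parallel = 2  # TP=2: each rank holds half of every weight matrix

aggregate_vram_gb = vram_per_node_gb * nodes  # 244 GB, matching the text above

# A hypothetical 140 GB model: too big for either Spark alone,
# comfortable once sharded across both.
model_weights_gb = 140
weights_per_rank_gb = model_weights_gb / tensor_parallel   # 70 GB per node
headroom_per_rank_gb = vram_per_node_gb - weights_per_rank_gb  # left for KV cache

print(aggregate_vram_gb, weights_per_rank_gb, headroom_per_rank_gb)
```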

Chat — streaming, OpenAI-compatible

AINode chat UI
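Because the API is OpenAI-compatible, any OpenAI-style client can talk to it. A stdlib-only sketch of the request you would POST — the port comes from the install section above, but the /v1/chat/completions path and the model id follow the OpenAI convention and are assumptions, not verified against AINode:

```python
import json
import urllib.request

# Assumed endpoint: the install section says the UI is on :3000; the API path
# follows the OpenAI chat-completions convention. Both are assumptions.
url = "http://localhost:3000/v1/chat/completions"

payload = {
    "model": "local-model",  # placeholder model id
    "stream": True,          # token-by-token streaming, as in the chat UI
    "messages": [
        {"role": "user", "content": "Hello from AINode!"},
    ],
}

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would then stream server-sent events;
# not executed here, since it needs a running AINode instance.
```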

API console — LM Studio style, live request tap

AINode server API console

50+ models — Hugging Face catalog, one click

AINode downloads

Fine-tune from the browser

AINode fine-tuning dashboard

Cluster config — set TP / PP, pick the fabric, go

AINode cluster config


Projects in this org

Repo         What it is
ainode       Product source — CLI, API, web UI, engines, training
ainode.dev   Marketing site at ainode.dev

Container images

Public on both GHCR (canonical) and Docker Hub (mirror):

# GHCR — used by the installer, no rate limits
docker pull ghcr.io/getainode/ainode:latest

# Docker Hub — mirror, for discoverability
docker pull argentaios/ainode:latest



Why this exists

Most "local AI" tooling bails out as soon as your model is bigger than one GPU. The moment you need two, you're writing Ray configs, debugging NCCL, patching vLLM, and wiring SSH bootstrap — by hand, at 2 AM.

AINode bundles that entire stack into a single container and turns multi-node inference into a UI checkbox:

  • Auto-discovery over UDP on your cluster subnet
  • Tensor-parallel sharding across every GPU the cluster sees
  • Ray head + worker formation via eugr's launcher
  • NCCL over RoCE when ConnectX-7 + RDMA are present
  • Graceful fallback to single-node when the cluster shrinks

Whether you have one GB10 or ten of them, you run the same install command, and the cluster just adds up the VRAM.
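The auto-discovery flow in the list above can be sketched as a loopback UDP beacon. This is a toy illustration of the idea only — the message format, fields, and port handling are invented for the example; AINode's actual wire protocol may differ:

```python
import json
import socket

# Listener: stands in for a cluster peer waiting for beacons on the subnet.
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.bind(("127.0.0.1", 0))   # ephemeral port on loopback for the demo
listener.settimeout(2)
port = listener.getsockname()[1]

# Beacon: each node announces itself. Field names are illustrative.
beacon = {"node": "spark-1", "vram_gb": 122}
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(json.dumps(beacon).encode(), ("127.0.0.1", port))

data, _ = listener.recvfrom(4096)
seen = [json.loads(data)]

# The cluster's usable VRAM is just the sum over discovered peers.
total_vram_gb = sum(n["vram_gb"] for n in seen)
print(total_vram_gb)  # 122 with a single discovered node

sender.close()
listener.close()
```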


Built on giants

AINode doesn't reinvent inference — it composes proven OSS runtimes and makes them boring to operate: vLLM for serving, Ray for cluster formation, NCCL for multi-GPU communication, and Docker for packaging.


Status

v0.4.0 shipped (April 2026). Distributed TP=2 verified on real GB10 hardware. See the full state-of-play in the product README — including what works, what doesn't, and the lessons learned getting to "it just runs."


Powered by argentos.ai · Apache 2.0 · Made with NVIDIA GB10

If this saved you a weekend, consider sponsoring the work.
