Usage-based billing for AI Clouds running vCluster or vMetal
Auto-discovers Tenant Clusters, meters node capacity and GPU SKUs, and streams usage events to your billing adapter.
Built for AI Clouds and platform teams running Kubernetes with vCluster or vMetal. vBilling is the pipe, not the billing engine. You keep your billing backend (Lago today; Metronome, Stripe Meters, OpenMeter, or a custom adapter coming next).
┌─────────────────────────────────────────────────────────────────────┐
│ Control Plane Cluster │
│ │
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │
│ │ Tenant │ │ Tenant │ │ Tenant │ │
│ │ Cluster │ │ Cluster │ │ Cluster │ │
│ │ team-alpha │ │ team-beta │ │ team-gpu │ │
│ │ │ │ │ │ │ │
│ │ private · │ │ private · │ │ private · │ │
│ │ 4× A100 │ │ 2× L40S │ │ 8× H100 │ │
│ └───────┬───────┘ └───────┬───────┘ └───────┬───────┘ │
│ │ │ │ │
│ └──────────────────┼──────────────────┘ │
│ │ │
│ ┌───────────▼────────────┐ │
│ │ vBilling │ │
│ │ Controller │ │
│ │ │ │
│ │ • Auto-discovers │ │
│ │ • Meters node capacity│ │
│ │ • Streams events │ │
│ └───────────┬────────────┘ │
└─────────────────────────────┼───────────────────────────────────────┘
│ Usage events (HTTP)
┌────────────▼────────────┐
│ Billing Adapter │
│ │
│ Lago · Stripe · │
│ Metronome · Custom │
│ │
│ Plans · Subscriptions │ ← Provider configures
│ Invoices · Wallets │
└─────────────────────────┘
vBilling handles metrics collection and event delivery. Your billing adapter handles pricing, plans, and invoicing. Providers configure pricing in the adapter. vBilling never decides what to charge.
Each tenant gets dedicated bare-metal nodes (GPUs, high-memory, etc.). Billing is based on full node allocation. The entire node is theirs.
team-gpu's dedicated nodes
├── node-1: 8× H100, 96 CPU, 1TB RAM → metered at full node capacity
├── node-2: 8× H100, 96 CPU, 1TB RAM → metered at full node capacity
└── node-3: 8× H100, 96 CPU, 1TB RAM → metered at full node capacity
Pod-level metering for shared-node platforms exists in the code but is not the documented path. The primary focus is dedicated-node tenants running on AI Clouds.
| Metric | Source | Granularity |
|---|---|---|
| Node hours | Node watch | Per dedicated node |
| CPU core-hours | Node capacity | Full node capacity |
| Memory GB-hours | Node capacity | Full node capacity |
| GPU hours (by SKU) | Node labels | Per GPU SKU |
| GPU utilization | DCGM via Prometheus | Per GPU % |
| Storage GB-hours | PVC sizes | Per PVC |
| Network egress GB | CNI / Prometheus | Per tenant |
| LoadBalancer hours | Service count | Per LB service |
| Control plane hours | Tenant Cluster watch | 1 per cluster |
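Each metric above is streamed as a discrete usage event. As a rough illustration of what one event might carry, here is a sketch in Go — the struct and field names are hypothetical, not vBilling's actual schema, which is ultimately defined by the adapter's event API:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// UsageEvent is a hypothetical shape for one metered sample.
// Field names are illustrative; the real schema is set by the
// billing adapter (e.g. Lago's event ingestion API).
type UsageEvent struct {
	TenantCluster string            `json:"tenant_cluster"` // e.g. "team-gpu"
	MetricCode    string            `json:"metric_code"`    // e.g. "gpu_hours"
	Value         float64           `json:"value"`          // amount in billing units
	Properties    map[string]string `json:"properties"`     // e.g. {"gpu_sku": "NVIDIA-H100"}
	Timestamp     time.Time         `json:"timestamp"`
}

func main() {
	ev := UsageEvent{
		TenantCluster: "team-gpu",
		MetricCode:    "gpu_hours",
		Value:         8, // 8 GPUs × 1 hour
		Properties:    map[string]string{"gpu_sku": "NVIDIA-H100"},
		Timestamp:     time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC),
	}
	b, _ := json.Marshal(ev)
	fmt.Println(string(b))
}
```

Tagging the SKU in event properties (rather than baking it into the metric code) is what lets the adapter price each GPU type separately, as described below.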
GPU SKU detection reads from node labels:
- `nvidia.com/gpu.product` (NVIDIA GPU Operator)
- `cloud.google.com/gke-accelerator` (GKE)
- `k8s.amazonaws.com/accelerator` (EKS)
An H100 hour and a T4 hour are emitted as separate events so providers can price each SKU differently in their billing adapter.
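The detection can be sketched as a priority-ordered label lookup. This is my own illustration of the logic, not the project's code (which lives in internal/metrics/collector.go and may differ):

```go
package main

import "fmt"

// gpuSKULabels lists the node labels checked for a GPU product name,
// in priority order: NVIDIA GPU Operator, then GKE, then EKS.
var gpuSKULabels = []string{
	"nvidia.com/gpu.product",
	"cloud.google.com/gke-accelerator",
	"k8s.amazonaws.com/accelerator",
}

// detectGPUSKU returns the first non-empty SKU label value found on
// the node, or "unknown" if none of the known labels are present.
func detectGPUSKU(nodeLabels map[string]string) string {
	for _, key := range gpuSKULabels {
		if sku, ok := nodeLabels[key]; ok && sku != "" {
			return sku
		}
	}
	return "unknown"
}

func main() {
	labels := map[string]string{"nvidia.com/gpu.product": "NVIDIA-H100-80GB-HBM3"}
	fmt.Println(detectGPUSKU(labels)) // the GPU Operator label wins when present
}
```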
vBilling ships with a Lago adapter today. Metronome, Stripe Meters, OpenMeter, and custom adapters are on the roadmap. The install pattern is the same regardless of adapter. The walkthrough below uses Lago.
- Kubernetes cluster with Tenant Clusters running (vCluster)
- metrics-server installed
- A billing adapter: Lago instance (see Deploying Lago)
- Optional: Prometheus with DCGM Exporter for GPU utilization
# Clone the vBilling repo (includes Lago docker-compose)
git clone https://github.com/vClusterLabs-Experiments/vbilling.git
cd vbilling/deploy/lago
# Generate RSA key for Lago (required for JWT signing)
openssl genrsa 2048 > lago_rsa.key
openssl rsa -in lago_rsa.key -out lago_rsa.key -traditional 2>/dev/null
# Create .env with Base64-encoded RSA key (Lago expects LAGO_RSA_PRIVATE_KEY)
echo "LAGO_RSA_PRIVATE_KEY=$(base64 -i lago_rsa.key | tr -d '\n')" > .env
# Start Lago
docker compose --env-file .env up -d
# Wait for API to be ready (~30s for database migrations)
# UI: http://localhost:8080 | API: http://localhost:3000

Create an organization in Lago (first-time only):
curl -s -X POST http://localhost:3000/graphql \
-H "Content-Type: application/json" \
-d '{"query":"mutation { registerUser(input: { email: \"admin@example.com\", password: \"yourpassword\", organizationName: \"My Org\" }) { token } }"}'
# Get your API key
docker exec lago-db-1 psql -U lago -d lago -t -c "SELECT value FROM api_keys LIMIT 1;"

Option A: Helm (production)
# Build and push the image first
docker buildx build --platform linux/amd64,linux/arm64 \
-t <your-registry>/vbilling:v0.1.0 --push .
# Install via Helm, point vBilling at your adapter
helm upgrade --install vbilling deploy/helm/vbilling \
--namespace vbilling-system --create-namespace \
--set image.repository=<your-registry>/vbilling \
--set image.tag=v0.1.0 \
--set adapter=lago \
--set lago.apiURL=http://lago-api.lago-system:3000 \
--set lago.apiKey=YOUR_LAGO_API_KEY

Option B: Run locally (development/testing)
make build
LAGO_API_KEY=<key> LAGO_API_URL=http://localhost:3000 ./bin/vbilling

vBilling creates billable metrics and a skeleton plan with $0 pricing. You set your own prices in the adapter's UI or API:
1. Open Lago UI → Plans → vCluster Standard
2. Edit each charge with your pricing:
   - CPU Core-Hours: `$0.065` (your cost + margin)
   - Memory GB-Hours: `$0.009`
   - GPU Hours (H100): `$4.50`, or use Lago's graduated pricing for volume discounts
   - Storage GB-Hours: `$0.0002`
   - Network Egress GB: `$0.09`
   - Node Hours: `$25.00` (for dedicated node billing)
3. Save. Pricing takes effect immediately for all tenants.
You can also create multiple plans (e.g., "GPU Premium", "Dev Tier") and assign different plans to different customers via the adapter's API.
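Assigning a customer to a plan goes through the adapter's subscription API. As a hedged sketch of how that call could be constructed against Lago — the payload shape follows Lago's public `POST /api/v1/subscriptions` endpoint, but verify it against your Lago version; the helper name and example IDs are my own:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// assignPlanRequest builds an HTTP request that subscribes a customer
// to a plan in Lago. Illustrative only: the body shape is based on
// Lago's public API and should be checked against your version.
func assignPlanRequest(apiURL, apiKey, customerID, planCode string) (*http.Request, error) {
	body := map[string]any{
		"subscription": map[string]string{
			"external_customer_id": customerID, // e.g. the Tenant Cluster name
			"plan_code":            planCode,   // e.g. "gpu-premium"
		},
	}
	b, err := json.Marshal(body)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, apiURL+"/api/v1/subscriptions", bytes.NewReader(b))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, _ := assignPlanRequest("http://localhost:3000", "secret", "team-gpu", "gpu-premium")
	fmt.Println(req.Method, req.URL.String())
}
```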
vBilling will:
- Auto-discover all Tenant Clusters (via StatefulSet labels or Platform API)
- Create a billing customer in your adapter for each Tenant Cluster
- Stream usage events every 60 seconds
- Hand off to your adapter, which generates invoices at the end of each billing period
vBilling finds Tenant Clusters using two methods:
- Label scanning (works with OSS vCluster): Watches StatefulSets and Deployments with the `app=vcluster` label
- Platform API (works with vCluster Platform): Lists `VirtualClusterInstance` resources via the management API
Every collection interval (default 60s):
For each discovered Tenant Cluster:
1. Read dedicated node capacity (labels: vcluster.loft.sh/managed-by=<name>)
→ Read full node capacity: CPU, memory, GPUs, storage
2. Collect storage from PVCs in the namespace
3. Collect GPU allocation from pod nvidia.com/gpu requests
→ Detect GPU SKU from the node's nvidia.com/gpu.product label
4. Count LoadBalancer services
5. Check spot vs on-demand node status for cost attribution
6. (If Prometheus configured) Query DCGM for GPU utilization
7. (If Prometheus configured) Query network egress bytes
8. Convert all metrics to billing units:
CPU: cores × interval_hours = core-hours
Memory: GB × interval_hours = GB-hours
GPU: count × interval_hours = GPU-hours (tagged with GPU SKU)
9. Stream all events to the configured billing adapter in batch
Tenant Cluster created → Customer auto-created in adapter
→ Subscription started (plan: vcluster-standard)
→ Usage events every 60s
→ Adapter aggregates over billing period
→ Invoice generated (monthly)
→ Webhook to payment provider (optional)
Tenant Cluster deleted → Subscription terminated
→ Final prorated invoice
| Variable | Default | Description |
|---|---|---|
| `ADAPTER` | `lago` | Billing adapter to use (lago today; more adapters coming) |
| `LAGO_API_URL` | `http://localhost:3000` | Lago API endpoint (when `ADAPTER=lago`) |
| `LAGO_API_KEY` | (required for Lago) | Lago API key |
| `COLLECTION_INTERVAL` | `60s` | How often to scrape metrics |
| `RECONCILE_INTERVAL` | `30s` | How often to discover Tenant Clusters |
| `DEFAULT_PLAN_CODE` | `vcluster-standard` | Default plan code in the adapter |
| `BILLING_CURRENCY` | `USD` | Currency for billing |
| `PROMETHEUS_URL` | (empty) | Prometheus URL for DCGM/network |
| `SPOT_DISCOUNT_PERCENT` | `60` | Discount for pods on spot nodes |
Note: Pricing is NOT configured via environment variables. Configure pricing in your billing adapter's UI or API.
adapter: lago # billing adapter (lago today)
lago:
apiURL: "http://lago-api:3000"
apiKey: ""
existingSecret: "lago-credentials" # or use existing K8s secret
billing:
collectionInterval: "60s"
reconcileInterval: "30s"
prometheus:
  url: "http://prometheus.monitoring:9090"  # optional

Each customer gets a Tenant Cluster with dedicated bare-metal GPU nodes. vBilling meters the full node allocation by GPU SKU and streams events into whichever billing adapter you run.
Customer signs up
→ Platform provisions Tenant Cluster + dedicated nodes (8× H100)
→ vBilling discovers Tenant Cluster, detects the dedicated nodes
→ Streams events: 8 GPU-hours (H100) + 96 CPU-hours + 1 TB memory-hours per hour
→ Adapter (Lago) invoices monthly at the provider's rates
→ Customer pays via the adapter's payments integration (e.g., Stripe webhooks)
A lightweight billing dashboard is included at dashboard/index.html. It queries the Lago API directly and shows per-tenant usage breakdowns. (Adapter-specific dashboards for Metronome and Stripe will ship alongside those adapters.)
# Serve the dashboard
cd dashboard
python3 -m http.server 9090
# Open http://localhost:9090

Features:
- Per-tenant usage cards with metric breakdown
- Total spend across all tenants
- Auto-refresh every 30 seconds
- No framework dependencies
cmd/vbilling/main.go Entry point
internal/
config/config.go Configuration from env vars
lago/
client.go Lago HTTP API client (current adapter)
bootstrap.go Auto-creates metrics + skeleton plan
discovery/discovery.go Tenant Cluster discovery (labels + Platform API)
metrics/collector.go All metrics: CPU, memory, GPU, storage,
network, DCGM, dedicated nodes, spot/on-demand
controller/controller.go Main reconciliation + event-streaming loop
deploy/
helm/vbilling/ Helm chart with RBAC
lago/ Docker Compose for Lago
dashboard/index.html Billing dashboard
scripts/demo.sh End-to-end demo using vind
Dockerfile Multi-stage distroless build
Makefile Build targets
Multi-adapter refactor (Source/Destination plugin pattern) is planned. Today Lago is wired directly; future adapters will live under `internal/destinations/<name>/`.
make build # Build binary (local OS/arch)
make docker-build # Build Docker image (local arch)
make test # Run tests
make helm-install # Install via Helm
make tidy # go mod tidy

For production K8s clusters (linux/amd64) and Apple Silicon (linux/arm64):
# Build and push multi-arch image
docker buildx create --use --name vbilling-builder 2>/dev/null || true
docker buildx build \
--platform linux/amd64,linux/arm64 \
-t <your-registry>/vbilling:v0.1.0 \
--push .

The Dockerfile uses a multi-stage build with `gcr.io/distroless/static:nonroot` as the final image (~10MB).
cd deploy/lago
docker compose --env-file .env up -d

Deploy Lago as Kubernetes workloads. Key components: PostgreSQL, Redis, API (Rails), Sidekiq worker, Clock, Frontend. See Lago docs for production guidance.
- Source / Destination plugin refactor (adapter pattern)
- Metronome adapter
- Stripe Meters adapter
- OpenMeter adapter (native CloudEvents)
- Custom adapter developer guide
- MIG (Multi-Instance GPU) partition tracking
- Grafana dashboard integration
- Budget alerts per Tenant Cluster
- Reserved capacity / commitment pricing
- Auto Nodes billing (dynamic node provisioning events)
- Netris network isolation billing integration
Apache 2.0