Enterprise TCO Calculator for GPU Cloud Decisions
Compare DGX Cloud, On-Premises, and Hyperscaler GPU options through workload-driven analysis.
Uncover hidden costs. Make data-driven infrastructure decisions.
Live Demo • Features • Quick Start • TCO Model • Docs
The Problem: CTOs and infrastructure leaders face a critical decision when scaling AI workloads. The choice between DGX Cloud, on-premises hardware, or hyperscaler GPU instances involves hidden costs that typical calculators miss:
- GPU Utilization Waste — On-prem GPUs often run at 40% utilization, but you pay for 100%
- Engineer Opportunity Cost — ML engineers spending 50%+ of time on infrastructure instead of models
- Time-to-Production Gap — The 8-month delay from on-prem deployment translates to millions in delayed value
The Solution: AI Infrastructure Advisor provides a workload-first approach that surfaces these "aha moments" alongside traditional TCO metrics, enabling truly informed decisions.
Start with your AI workloads, not infrastructure specs. The calculator understands:
- LLM Fine-tuning — Training data, epochs, model sizes
- RAG & Retrieval — Vector DB sizing, query patterns
- Inference at Scale — Throughput requirements, latency SLAs
- Agent Workloads — Multi-model orchestration needs
Pre-configured scenarios for:
- 🏥 Healthcare — HIPAA compliance, medical imaging
- 🏦 Financial Services — Risk modeling, fraud detection
- 🏛️ Public Sector — FedRAMP requirements, sovereign data
- 🏭 Manufacturing — Edge inference, predictive maintenance
Comprehensive cost calculation across:
| Tier | Components | Weight |
|---|---|---|
| Infrastructure | Compute, Storage, Networking | 40-60% |
| Platform | Software, Support, Security | 20-30% |
| Operations | Labor, Training, Opportunity Cost | 20-35% |
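The three-tier split above can be sketched as a small TypeScript model. This is an illustrative sketch only; the type and function names are hypothetical, not the calculator's actual API in `tco-engine.ts`.

```typescript
// Hypothetical three-tier TCO breakdown; field names are illustrative.
interface TcoBreakdown {
  infrastructure: number; // compute + storage + networking
  platform: number;       // software + support + security
  operations: number;     // labor + training + opportunity cost
}

function totalTco(b: TcoBreakdown): number {
  return b.infrastructure + b.platform + b.operations;
}

// Share of each tier, to compare against the 40-60% / 20-30% / 20-35% weights.
function tierShares(b: TcoBreakdown): Record<keyof TcoBreakdown, number> {
  const total = totalTco(b);
  return {
    infrastructure: b.infrastructure / total,
    platform: b.platform / total,
    operations: b.operations / total,
  };
}

// Example figures (assumed): $500K infra, $200K platform, $300K ops.
const example: TcoBreakdown = {
  infrastructure: 500_000,
  platform: 200_000,
  operations: 300_000,
};
console.log(totalTco(example));                    // 1000000
console.log(tierShares(example).infrastructure);   // 0.5
```

With these assumed figures, infrastructure lands at 50% of total TCO, inside the 40-60% band from the table.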
Surface hidden costs that change the decision:
💡 GPU Idle Time Waste
$847K/year
On-premises GPUs typically run at 40% utilization.
You're paying for 100% but using less than half.
👨‍💻 Engineer Time on Infrastructure

$375K/year
Your 3 ML engineers spend ~50% of their time on
infrastructure, not building models.
⏱️ Time-to-Production Delay Cost
$1.2M
On-prem deployment takes 12 months vs 4 months for
DGX Cloud. That 8-month delay costs $1.2M in delayed value.
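The three "aha moment" figures above follow from simple arithmetic. A minimal sketch, with hypothetical function names and the document's own example inputs:

```typescript
// Idle waste: you pay for full capacity but use only a fraction of it.
function gpuIdleWaste(annualClusterCost: number, utilization: number): number {
  return annualClusterCost * (1 - utilization);
}

// Opportunity cost of engineers doing infrastructure work instead of models.
function engineerOpportunityCost(
  engineers: number,
  fullyLoadedCost: number,
  infraTimeShare: number
): number {
  return engineers * fullyLoadedCost * infraTimeShare;
}

// Value lost while waiting for on-prem deployment vs a managed option.
function delayCost(delayMonths: number, monthlyValue: number): number {
  return delayMonths * monthlyValue;
}

// 3 ML engineers at $250K fully loaded, spending 50% of time on infra:
console.log(engineerOpportunityCost(3, 250_000, 0.5)); // 375000
// 8-month delay, assuming ~$150K/month of delayed value:
console.log(delayCost(8, 150_000)); // 1200000
```

These reproduce the $375K/year and $1.2M figures shown above; the $150K/month value rate is an assumption chosen to match the example.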
Side-by-side analysis of four deployment options:
- NVIDIA DGX Cloud — Managed, turnkey solution
- On-Premises (DGX) — Maximum control, self-managed
- Hyperscaler — AWS/Azure/GCP GPU instances
- Current State — Your existing infrastructure baseline
Try it now: ai-infra-advisor.qbitloop.com
- Node.js 18+
- npm or pnpm
```bash
# Clone the repository
git clone https://github.com/QbitLoop/ai-infra-advisor.git
cd ai-infra-advisor

# Install dependencies
npm install

# Start development server
npm run dev
```

Open http://localhost:3000 to see the application.
```bash
npm run build
npm run start
```

| Component | DGX Cloud | On-Premises | Hyperscaler |
|---|---|---|---|
| Compute | $236K/node/yr | $150K/node/yr* | $523K/node/yr |
| Storage | $0.10/GB/mo | $0.05/GB/mo | $0.12/GB/mo |
| Networking | $0.05/GB egress | $0 (internal) | $0.09/GB egress |
*Amortized over 4 years + data center costs
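The amortization footnote can be made concrete. A sketch of the arithmetic, where the $400K hardware price and $50K/year data-center overhead are assumed example values chosen to reproduce the $150K/node/yr figure:

```typescript
// On-prem annual node cost: hardware amortized over N years plus
// annual data-center overhead (power, space, cooling).
function onPremAnnualNodeCost(
  hardwarePrice: number,
  amortYears: number,
  annualDcOverhead: number
): number {
  return hardwarePrice / amortYears + annualDcOverhead;
}

// e.g. a $400K node over 4 years with $50K/yr of overhead (assumed split):
console.log(onPremAnnualNodeCost(400_000, 4, 50_000)); // 150000
```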
| Component | DGX Cloud | On-Premises | Hyperscaler |
|---|---|---|---|
| Software | Included | $4,500/GPU/yr + MLOps | $30K/yr |
| Support | $2K/GPU | $3K/GPU | $25K/yr |
| Compliance | $25K | $75K | $50K |
| Role | Fully Loaded Cost |
|---|---|
| ML Engineer | $250,000/yr |
| MLOps Engineer | $220,000/yr |
| DevOps Engineer | $180,000/yr |
Infrastructure Time Allocation:
- DGX Cloud: 20% (managed)
- On-Premises: 55% (heavy burden)
- Hyperscaler: 40% (medium)
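Combining the fully loaded costs with the time-allocation percentages gives the operations labor line. A hedged sketch; the team composition is an assumed example, not data from the calculator:

```typescript
// Fully loaded annual cost per role (from the table above).
const loadedCost = { ml: 250_000, mlops: 220_000, devops: 180_000 };

// Fraction of time spent on infrastructure per deployment option.
const infraShare = { dgxCloud: 0.2, onPrem: 0.55, hyperscaler: 0.4 };

// Annual payroll spent on infrastructure work for a given team and option.
function annualInfraLabor(
  team: { ml: number; mlops: number; devops: number },
  share: number
): number {
  const payroll =
    team.ml * loadedCost.ml +
    team.mlops * loadedCost.mlops +
    team.devops * loadedCost.devops;
  return payroll * share;
}

// Assumed team: 3 ML, 1 MLOps, 1 DevOps engineer ($1.15M payroll).
const team = { ml: 3, mlops: 1, devops: 1 };
console.log(annualInfraLabor(team, infraShare.onPrem));   // 632500
console.log(annualInfraLabor(team, infraShare.dgxCloud)); // 230000
```

For this assumed team, moving from on-premises to a managed option would free roughly $400K/year of engineering time for model work.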
ai-infra-advisor/
├── src/
│ ├── app/ # Next.js App Router
│ │ ├── page.tsx # Home (wizard)
│ │ ├── results/page.tsx # Results dashboard
│ │ └── layout.tsx # Root layout
│ │
│ ├── components/
│ │ ├── ui/ # shadcn/ui components
│ │ └── wizard/ # Workload wizard steps
│ │ ├── WorkloadStep.tsx
│ │ ├── ScaleStep.tsx
│ │ ├── ConstraintsStep.tsx
│ │ └── PreviewStep.tsx
│ │
│ └── lib/
│ ├── calculations/ # TCO engine
│ │ ├── types.ts # Type definitions
│ │ └── tco-engine.ts # Core calculations
│ └── workloads/ # Workload definitions
│ └── types.ts
│
├── docs/
│ ├── COST-MODEL.md # Detailed methodology
│ └── guides/
│ └── aiops-101.md # Educational content
│
└── public/ # Static assets
| Document | Description |
|---|---|
| Cost Model Deep Dive | Detailed methodology and data sources |
| AIOps 101 Guide | Educational primer on AI infrastructure |
| Architecture Overview | Technical architecture decisions |
Built following Anthropic Brand Guidelines:
| Token | Hex | Usage |
|---|---|---|
| --foreground | #141413 | Primary text |
| --background | #faf9f5 | Light backgrounds |
| --primary | #d97757 | CTAs, highlights |
| --accent | #6a9bcc | Links, info states |
| --success | #788c5d | Positive indicators |
- Headings: Poppins (24pt+)
- Body: System fonts with Lora fallback
| Layer | Technology |
|---|---|
| Framework | Next.js 14 (App Router) |
| Language | TypeScript 5 |
| Styling | Tailwind CSS 4 |
| Components | shadcn/ui |
| Deployment | GitHub Pages / Vercel |
Contributions are welcome! Please read our contributing guidelines before submitting PRs.
```bash
# Fork the repo, then:
git checkout -b feature/your-feature
npm run lint
npm run build
git commit -m "feat: add your feature"
git push origin feature/your-feature
```

MIT License - see LICENSE for details.
Designed for NVIDIA GSI Developer Relations use cases
⭐ Star this repo if you find it useful!