The Provider-Agnostic LLM Runtime for Node.js.
NodeLLM is a backend orchestration layer designed for building reliable, testable, and provider-agnostic AI systems.
Integrating multiple LLM providers often means juggling different SDKs, API styles, and update cycles. NodeLLM gives you a single, unified API for 540+ models across multiple providers (OpenAI, Gemini, Anthropic, DeepSeek, OpenRouter, xAI, Ollama, etc.) that stays consistent even when providers change.
```bash
npm install @node-llm/core
```

To understand NodeLLM, you must understand what it is NOT.
NodeLLM is NOT:
- ❌ A thin wrapper around vendor SDKs (like `openai` or `@anthropic-ai/sdk`)
- ❌ A UI streaming library (like Vercel AI SDK)
- ❌ A prompt-only framework
NodeLLM IS:
- ✅ A Backend Runtime: Designed for workers, cron jobs, agents, and API servers.
- ✅ Provider Agnostic: Switches providers via config, not code rewrites.
- ✅ Contract Driven: Guarantees identical behavior for Tools and Streaming across all models.
- ✅ Infrastructure First: Built for evals, telemetry, retries, and circuit breaking.
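To make "circuit breaking" concrete, here is the general pattern as a self-contained sketch. This illustrates the technique, not NodeLLM's internal implementation or configuration API:

```ts
// Generic circuit breaker: after `threshold` consecutive failures, calls are
// rejected immediately until `cooldownMs` has passed, then one trial is allowed.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;

  constructor(private threshold = 5, private cooldownMs = 30_000) {}

  async run<T>(fn: () => Promise<T>): Promise<T> {
    const open = this.failures >= this.threshold;
    if (open && Date.now() - this.openedAt < this.cooldownMs) {
      throw new Error("Circuit open: provider temporarily disabled");
    }
    try {
      const result = await fn();
      this.failures = 0; // success closes the circuit
      return result;
    } catch (err) {
      this.failures += 1;
      this.openedAt = Date.now();
      throw err;
    }
  }
}

// Usage: const res = await breaker.run(() => chat.ask("..."));
```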
Most AI SDKs (like the Vercel AI SDK) are heavily optimized for getting a response to the user fast: frontend streaming, Next.js, React Server Components. NodeLLM optimizes for system reliability on the backend.
- Reasoning Models: Native support for OpenAI o1/o3 and Anthropic Thinking models with first-class token tracking.
- Middleware: Intercept and modify requests/responses for auditing, cost tracking, and PII redaction (the pattern is sketched after this list).
- ORM & Persistence: Save entire conversation threads, tool calls, and latency metrics to your database automatically.
- Deterministic Testing: Record and replay LLM interactions with VCR-style testing (see the sketch after the next list).
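The middleware idea, sketched as plain functions. NodeLLM's actual registration API is not shown in this README, so the shapes below (`LLMRequest`, `Next`, `Middleware`) are illustrative stand-ins for the intercept pattern:

```ts
// Hypothetical shapes -- the real NodeLLM types may differ
type LLMRequest = { model: string; messages: { role: string; content: string }[] };
type LLMResponse = { content: string };
type Next = (req: LLMRequest) => Promise<LLMResponse>;
type Middleware = (req: LLMRequest, next: Next) => Promise<LLMResponse>;

// PII redaction: mask email addresses before the request leaves your system
const redactEmails: Middleware = async (req, next) => {
  const masked = req.messages.map((m) => ({
    ...m,
    content: m.content.replace(/[\w.+-]+@[\w-]+\.[\w.-]+/g, "[REDACTED]"),
  }));
  return next({ ...req, messages: masked });
};
```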
- Strict Process Protection: Prevents hung requests from stalling the event loop.
- Decoupling: Isolate your business logic from the rapid churn of AI model versions.
- Production Safety: Native support for circuit breaking, redaction, and audit logging.
- Predictability: A unified Mental Model for streaming, structured outputs, and vision.
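Deterministic testing follows the classic VCR idea: record a real interaction once, then replay the stored response on every subsequent run. A minimal generic wrapper shows the mechanic (NodeLLM ships its own recorder; `withCassette` here is just an illustration):

```ts
import fs from "node:fs";

// Record on the first run, replay from disk afterwards
async function withCassette<T>(path: string, fn: () => Promise<T>): Promise<T> {
  if (fs.existsSync(path)) {
    return JSON.parse(fs.readFileSync(path, "utf8")) as T; // replay
  }
  const result = await fn(); // record
  fs.writeFileSync(path, JSON.stringify(result));
  return result;
}
```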
```ts
import { NodeLLM } from "@node-llm/core";

// 1. Zero-Config (NodeLLM automatically reads NODELLM_PROVIDER and API keys)
const chat = NodeLLM.chat("gpt-4o");

// 2. Chat (high-level request/response)
const response = await chat.ask("Explain event-driven architecture");
console.log(response.content);

// 3. Streaming (standard AsyncIterator)
for await (const chunk of chat.stream("Explain event-driven architecture")) {
  process.stdout.write(chunk.content);
}
```

Built with NodeLLM: multi-provider AI analysis, tool calling, and structured outputs working together.
A production-grade Next.js application demonstrating @node-llm/orm, real-time streaming, and RAG architectures.
NodeLLM provides a flexible, lazy-initialized configuration system designed for enterprise usage. It is ESM-safe and resolved only when your first request is made, which eliminates the common dotenv race condition.
```ts
import { createLLM } from "@node-llm/core";

// Recommended for multi-provider pipelines
const llm = createLLM({
  openaiApiKey: process.env.OPENAI_API_KEY,
  anthropicApiKey: process.env.ANTHROPIC_API_KEY,
  ollamaApiBase: process.env.OLLAMA_API_BASE
});

// Support for custom endpoints (e.g., Azure or LocalAI)
const azureLLM = createLLM({
  openaiApiKey: process.env.AZURE_KEY,
  openaiApiBase: "https://your-resource.openai.azure.com/openai/deployments/..."
});
```
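Assuming the instance returned by `createLLM` exposes the same `chat()` surface as the global `NodeLLM` (an assumption; only the global API is demonstrated in this README), a configured instance would be used like this:

```ts
// Assumption: createLLM() instances mirror the NodeLLM.chat() surface
const res = await llm.chat("claude-3-5-sonnet").ask("Summarize this changelog");
console.log(res.content);
```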
Stop rewriting code for every provider. NodeLLM normalizes inputs and outputs into a single, predictable mental model.

```ts
import { NodeLLM } from "@node-llm/core";

// Uses NODELLM_PROVIDER from the environment (defaults to GPT-4o)
const chat = NodeLLM.chat();
await chat.ask("Hello world");
```
await chat.ask("Analyze this interface", {
files: ["./screenshot.png", "https://example.com/spec.pdf"]
});Define tools once;NodeLLM manages the recursive execution loop for you, keeping your controller logic clean. Works seamlessly with both regular chat and streaming!
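Since files are accepted by `stream()` as well as `ask()` (per the note above; the options-object shape for `stream()` is assumed to match `ask()`), the streaming variant follows directly:

```ts
// Same files option, streamed response
for await (const chunk of chat.stream("Summarize this spec", {
  files: ["https://example.com/spec.pdf"]
})) {
  process.stdout.write(chunk.content);
}
```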
Define tools once; NodeLLM manages the recursive execution loop for you, keeping your controller logic clean. It works seamlessly with both regular chat and streaming (see the streaming example below the code).

```ts
import { Tool, z } from "@node-llm/core";

// Class-based DSL
class WeatherTool extends Tool {
  name = "get_weather";
  description = "Get current weather";
  schema = z.object({ location: z.string() });

  async execute({ location }: { location: string }) {
    return `Sunny in ${location}`;
  }
}

// Now the model can use it automatically
await chat.withTool(WeatherTool).ask("What's the weather in Tokyo?");

// Lifecycle hooks for error & flow control
chat.onToolCallError((call, err) => "STOP");
```
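Because the tool loop also runs during streaming (see the comparison table below), the same tool can back a streamed answer; the `.withTool(...).stream(...)` chaining is inferred from the chat example above:

```ts
// Tool calls are resolved automatically mid-stream; you iterate only final tokens
for await (const chunk of chat.withTool(WeatherTool).stream("What's the weather in Tokyo?")) {
  process.stdout.write(chunk.content);
}
```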
```ts
// Set environment variable
process.env.NODELLM_DEBUG = "true";

// Now see detailed logs for every API call:
// [NodeLLM] [OpenAI] Request: POST https://api.openai.com/v1/chat/completions
// { "model": "gpt-4o", "messages": [...] }
// [NodeLLM] [OpenAI] Response: 200 OK
// { "id": "chatcmpl-123", ... }
```

Covers Chat, Streaming, Images, Embeddings, Transcription, and Moderation, across all providers!
Get type-safe, validated JSON back using Zod schemas.
```ts
import { z } from "@node-llm/core";

const Product = z.object({ name: z.string(), price: z.number() });
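
// Standard Zod (not NodeLLM-specific): the same schema also yields a static type
type ProductType = z.infer<typeof Product>;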
const res = await chat.withSchema(Product).ask("Generate a gadget");
console.log(res.parsed.name); // Full type safety
```

Generate images:

```ts
await NodeLLM.paint("A cyberpunk city in rain");
```

Transcribe audio:

```ts
await NodeLLM.transcribe("meeting-recording.wav");
```
Automatically track chat history, tool executions, and API metrics with `@node-llm/orm`.

```ts
import { createChat } from "@node-llm/orm/prisma";

// Chat state is automatically saved to your database (Postgres/MySQL/SQLite)
const chat = await createChat(prisma, llm, { model: "gpt-4o" });
await chat.ask("Hello");
// -> Saves user message
// -> Saves assistant response
// -> Tracks token usage & cost
// -> Logs tool calls & results
```
Run multiple providers in parallel, without global configuration side effects, using isolated contexts.

```ts
const prompt = "Compare event sourcing with CRUD";

const [gpt, claude] = await Promise.all([
  // Each call branches off into its own isolated context
  NodeLLM.withProvider("openai").chat("gpt-4o").ask(prompt),
  NodeLLM.withProvider("anthropic").chat("claude-3-5-sonnet").ask(prompt)
]);
```
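If one provider going down should not sink the other request, plain `Promise.allSettled` works unchanged; nothing NodeLLM-specific is needed:

```ts
// Tolerate a single-provider outage without losing the other answer
const results = await Promise.allSettled([
  NodeLLM.withProvider("openai").chat("gpt-4o").ask(prompt),
  NodeLLM.withProvider("anthropic").chat("claude-3-5-sonnet").ask(prompt)
]);
for (const r of results) {
  if (r.status === "fulfilled") console.log(r.value.content);
}
```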
Get direct access to the thought process of models like DeepSeek R1 or OpenAI o1/o3 via the `.reasoning` field.

```ts
const res = await NodeLLM.chat("deepseek-reasoner").ask("Solve this logical puzzle");
console.log(res.reasoning); // Chain-of-thought
```

Security is not an afterthought. NodeLLM includes a native "Invisible Perimeter" to protect your infrastructure:
- Redaction: Automatically masks API keys in logs.
- Guardrails: Integrated support for Bedrock/Azure safety filters.
- Auditing: Full prompt/response tracing via `@node-llm/orm`.
| Feature | NodeLLM | Official SDKs | Architectural Impact |
|---|---|---|---|
| Provider Logic | Transparently Handled | Exposed to your code | Low Coupling |
| Streaming | Standard AsyncIterator | Vendor-specific events | Predictable Data Flow |
| Streaming + Tools | Automated Execution | Manual implementation | Seamless UX |
| Tool Loops | Automated Recursion | Manual implementation | Reduced Boilerplate |
| Files/Vision | Intelligent Path/URL handling | Base64/Buffer management | Cleaner Service Layer |
| Configuration | Centralized & Global | Per-instance initialization | Easier Lifecycle Mgmt |
```bash
npm install @node-llm/core
```

Want to see it in action? Run this in your terminal:
```bash
git clone https://github.com/node-llm/node-llm.git
cd node-llm
npm install
npm run demo
```

We welcome contributions! Please see our Contributing Guide for details on how to get started.
Heavily inspired by the elegant design of RubyLLM.
MIT © NodeLLM contributors