XR AI v0.1.0 Beta Release
Pre-release
XR AI — v0.1.0 Public Beta
First public release. This is a beta — APIs, configs, and behavior will change.
Report issues at https://github.com/NVIDIA/xr-ai/issues.
XR AI is a developer stack for building multimodal, real-time AI agents for XR.
Agents see and hear the user's environment, reason over live spatial context, call
tools via MCP, and respond with audio or data — across web, iOS/visionOS, Android,
AR glasses, and XR headset clients.
What's included:
- XR Media Hub — real-time media routing with per-participant isolation
- Agent SDK (`xr-ai-agent`, `xr-ai-models`, `xr-ai-pipecat`)
- GPU-backed AI services: VLM, LLM (×3), STT, TTS
- Six MCP servers: scene rendering, OpenXR spatial, vector math, vision, video, transcript
- Two end-to-end samples: `simple-vlm-example` and `xr-render-demo`
- Client samples for web, web-XR, Android, iOS/visionOS, and native C++
- CloudXR integration layer
Full documentation: https://nvidia.github.io/xr-ai
Known limitations
- Android XR passthrough camera and immersive session APIs are not yet wired up
- CloudXR requires a separate license; not included
- Breaking changes are expected throughout the beta