Stars
NVIDIA Isaac GR00T N1 is the world's first open foundation model for generalized humanoid robot reasoning and skills.
[CVPR 2025] Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering
Generate a video script, voice and a talking face completely with AI
The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…
CUDA accelerated rasterization of gaussian splatting
Qwen2.5 is the large language model series developed by the Qwen team, Alibaba Cloud.
An open-source library for GPU-accelerated robot learning and sim-to-real transfer.
Investigating CoT Reasoning in Autoregressive Image Generation
Sky-T1: Train your own O1 preview model within $450
The official repo of MiniMax-Text-01 and MiniMax-VL-01, a large language model and a vision-language model based on Linear Attention
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective…
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25).
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
HunyuanVideo: A Systematic Framework For Large Video Generation Model
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
[ICLR'25] SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
Letta (formerly MemGPT) is the stateful agents framework with memory, reasoning, and context management.
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
This repository demonstrates an implementation of immersive 360 depth (image and video) in WebXR, powered by A-Frame, Three.js and Depth Anywhere.
A high-throughput and memory-efficient inference and serving engine for LLMs