Stars
The official Python SDK for Model Context Protocol servers and clients
The official PyTorch implementation of our CVPR-2024 paper "MMA: Multi-Modal Adapter for Vision-Language Models".
A latent text-to-image diffusion model
High-Resolution Image Synthesis with Latent Diffusion Models
[AAAI 2024] Follow-Your-Pose: This repo is the official implementation of "Follow-Your-Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos"
DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data (NeurIPS 2023 Spotlight) / When Does Perceptual Alignment Benefit Vision Representations? (NeurIPS 2024)
[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)
An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.
Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks including OpenAI Agents SDK, CrewAI, Langchain, Autogen, AG2, and CamelAI
[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
GPT4V-level open-source multi-modal model based on Llama3-8B
【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
[ICLR 2025] Official implementation of MotionClone: Training-Free Motion Cloning for Controllable Video Generation
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
This repository contains implementations and illustrative code to accompany DeepMind publications
Google Research
[ICLR'25] MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences
[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
An open source implementation of CLIP.
No fortress, purely open ground. OpenManus is Coming.
[CVPR2024] MotionEditor is the first diffusion-based model capable of video motion editing.
Official Implementation of "Learn from the Learnt (ECCV2024)"
This is the official implementation for DragVideo
A Large-Scale Multimodal Car Dataset with Computational Fluid Dynamics Simulations and Deep Learning Benchmarks
Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models