Starred repositories
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Enjoy the magic of Diffusion models!
HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
ControlNet++: All-in-one ControlNet for image generation and editing!
Wan: Open and Advanced Large-Scale Video Generative Models
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
MindSpore notebooks on OrangePi AiPro
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
DeepEP: an efficient expert-parallel communication library
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
SkyReels V1: The first and most advanced open-source human-centric video foundation model
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
Video Generation Foundation Models: https://saiyan-world.github.io/goku/
A set of nodes to edit videos using the Hunyuan Video model
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Official repository of the paper “IML-ViT: Benchmarking Image Manipulation Localization by Vision Transformer”
A comprehensive benchmark of deepfake detection
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also …
A ComfyUI custom node designed for advanced image background removal and object, face, clothes, and fashion segmentation, utilizing multiple models including RMBG-2.0, INSPYRENET, BEN, BEN2, BiRefN…
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra-lightweight OCR system, supports recognition of 80+ languages, provides data annotation and synthesis tools, supports training and…
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
.NET Decompiler with support for PDB generation, ReadyToRun, Metadata (&more) - cross-platform!
Standalone tool for extracting and creating Godot .pck files