Stars
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
A fork to add multimodal model training to open-r1
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
There can be more than Notion and Miro. AFFiNE (pronounced [ə'fain]) is a next-gen knowledge base that brings planning, sorting and creating all together. Privacy first, open-source, customizable an…
🎓 Update HumanAIGC related papers from ArXiv daily
A Conversational Speech Generation Model
SesameAILabs / whisperX
Forked from m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Official implementation of "Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters" (CVPR 2025)
[CVPR 2025] MINIMA: Modality Invariant Image Matching
[CVPR 2025] EgoLife: Towards Egocentric Life Assistant
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
Compare two versions of an arXiv preprint with a single command.
Mobius: Text to Seamless Looping Video Generation via Latent Shift
Realtime Video and Audio Streaming with WebRTC and Gradio
💬 An extensive collection of exceptional resources dedicated to the captivating world of talking face synthesis! ⭐ If you find this repo useful, please give it a star! 🤩
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Using AI models to automatically provide commentary and edit videos with a single click.
🧸 Lobe Vidol - Making Virtual Idols Accessible for EveryOne
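Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other projects; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…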
Vector (and Scalar) Quantization, in Pytorch
Create Live Photos from a photo+video pair compatible with Apple Photos
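OpenClash subscription-conversion templates with thorough traffic-splitting rules, paired with a step-by-step OpenClash setup tutorial. Delivers clean traffic splitting, DNS without pollution or leaks, and fast access to both domestic and international sites, without chaining additional plugins.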
Inference code for "Adversarially-Guided Portrait Matting"
Convert VMD motion data to a readable text file.