
- Ant Group
- Hangzhou, China
- https://scholar.google.com/citations?user=xRKTHmwAAAAJ&hl=zh-CN
Stars
One minute of voice data can be enough to train a good TTS model (few-shot voice cloning).
DeepFaceLab is the leading software for creating deepfakes.
Clone a voice in 5 seconds to generate arbitrary speech in real-time
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
VITS2 backbone with multilingual BERT
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Instant voice cloning by MIT and MyShell. Audio foundation model.
SoftVC VITS Singing Voice Conversion
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Chat-Haruhi-Suzumiya (Chat凉宫春日), an open-source role-playing chatbot by Cheng Li, Ziang Leng, and others.
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
BARS: Towards Open Benchmarking for Recommender Systems https://openbenchmark.github.io/BARS
Download and preprocess popular sequential recommendation datasets
Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch
PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Train a quantum computer the same way as a neural network.
🚀 AI voice mimicry (AI拟声): clone your voice within 5 seconds and generate arbitrary speech in real time
Benchmarks of approximate nearest neighbor libraries in Python
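Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center at the IDEA Research Institute, intended to serve as infrastructure for Chinese AIGC and cognitive intelligence.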
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
EMNLP'2021: Simple Entity-centric Questions Challenge Dense Retrievers https://arxiv.org/abs/2109.08535