Yuan-ManX/audio-ai-agent

Audio AI Agent

This list tracks the latest audio AI agents, covering speech, music, sound effects, and related areas.

2023

| Date | Source | Description | Paper | Code | Trained Model |
| --- | --- | --- | --- | --- | --- |
| 06.12 | JAMMIN-GPT | JAMMIN-GPT: Text-based Improvisation using LLMs in Ableton Live | arXiv | GitHub | - |
| 19.11 | M2UGen | M2UGen: Multi-modal Music Understanding and Generation with the Power of Large Language Models | arXiv | - | - |
| 14.11 | Qwen-Audio | Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models | arXiv | GitHub | - |
| 02.11 | FLAP | FLAP: Fast Language-Audio Pre-training | arXiv | - | - |
| 29.10 | JEN-1 Composer | JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation | arXiv | - | - |
| 20.10 | SALMONN | SALMONN: Towards Generic Hearing Abilities for Large Language Models | arXiv | GitHub | Hugging Face |
| 19.10 | Loop Copilot | Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing | arXiv | - | - |
| 18.10 | MusicAgent | MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models | arXiv | GitHub | - |
| 11.10 | LLark | LLark: A Multimodal Foundation Model for Music | arXiv | GitHub | - |
| 01.10 | UniAudio | UniAudio: An Audio Foundation Model Toward Universal Audio Generation | arXiv | GitHub | - |
| 18.09 | Dynamic-SUPERB | Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech | arXiv | GitHub | - |
