DavesEmployee/RiffusionXL

RiffusionXL

Music generation combines artistic expression with music theory. Many existing methods are confined to fixed genres, rules, and templates, which limits compositional diversity. This project aims to remove those constraints by building a system that turns varied text inputs, from simple keywords to detailed phrases, into musical compositions. We use Stable Diffusion XL (SDXL), a diffusion model known for converting text into images, to generate detailed spectrogram images. The model will be fine-tuned on the LAION-Audio-630K dataset, which covers a broad range of audio-text pairs. The pipeline synthesizes spectrogram images from text prompts with the fine-tuned SDXL and then converts them into audio for playback. To assess the quality and originality of the generated music, we will use cyanite.ai for music tagging and similarity analysis, and the Audio Quality Platform for detailed auditory evaluations. Beyond music generation itself, the project explores a space where text prompts drive unique musical creations.
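As a rough illustration of the step that prepares a generated spectrogram image for audio playback, the sketch below maps 8-bit grayscale pixels to linear magnitudes and back. The README does not specify the image encoding, so the decibel range (`MIN_DB`, `MAX_DB`) and the function names here are assumptions made for illustration, not the project's actual implementation. Recovering a waveform would additionally need phase reconstruction (for example Griffin-Lim) followed by an inverse STFT, which is not shown.

```python
import math
from typing import List

# Assumed convention (not from the README): 8-bit pixels map linearly onto a
# decibel range of [-80 dB, 0 dB] relative to a reference magnitude.
MIN_DB = -80.0
MAX_DB = 0.0


def magnitude_to_pixel(magnitude: float, reference: float = 1.0) -> int:
    """Encode a linear magnitude as an 8-bit pixel value (0-255)."""
    # Floor the ratio at the bottom of the dB range to avoid log10(0).
    floor = 10.0 ** (MIN_DB / 20.0)
    ratio = max(magnitude / reference, floor)
    db = min(20.0 * math.log10(ratio), MAX_DB)
    scaled = (db - MIN_DB) / (MAX_DB - MIN_DB)
    return round(scaled * 255)


def pixel_to_magnitude(pixel: int, reference: float = 1.0) -> float:
    """Decode an 8-bit pixel value back to a linear magnitude."""
    db = MIN_DB + (pixel / 255.0) * (MAX_DB - MIN_DB)
    return reference * 10.0 ** (db / 20.0)


def decode_image(pixels: List[List[int]]) -> List[List[float]]:
    """Convert a grayscale image into a magnitude spectrogram.

    Rows are treated as frequency bins and columns as time frames.
    """
    return [[pixel_to_magnitude(p) for p in row] for row in pixels]
```

Under this assumed mapping, a white pixel (255) decodes to the reference magnitude and a black pixel (0) to a magnitude 80 dB below it. A wider or narrower dB range trades dynamic range against quantization error per pixel step.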

About

Riffusion using Stable Diffusion XL
