Music generation merges artistic expression with complex musical theory, making it a captivating endeavor. While many methods exist, most are confined to fixed genres, rules, and templates, which limits compositional diversity. Our project aims to break these molds with a system that turns varied text inputs, from simple keywords to detailed phrases, into musical compositions. We use Stable Diffusion XL (SDXL), a diffusion model known for text-to-image generation, to produce detailed spectrogram images, fine-tuning it on the LAION-Audio-630K dataset, which spans a broad range of audio-text pairs. Our pipeline synthesizes spectrogram images from text prompts with the fine-tuned SDXL and then converts them to audio for playback. To assess the quality and originality of the generated music, we use cyanite.ai for music tagging and similarity analysis, and the Audio Quality Platform for detailed auditory evaluations. This is not just a music-generation exploration; it is a journey into a space where text prompts fuel unique musical creations.
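The final step above, turning a generated spectrogram back into playable audio, is typically done with Griffin-Lim phase reconstruction, as in the original Riffusion. The following is a minimal NumPy sketch of that idea; the STFT parameters, function names, and iteration count are illustrative assumptions, not the project's actual settings, and a real pipeline would use a tuned library implementation (e.g. `librosa` or `torchaudio`).

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Short-time Fourier transform: (freq_bins, frames) complex array."""
    window = np.hanning(n_fft)
    frames = [np.fft.rfft(window * x[s:s + n_fft])
              for s in range(0, len(x) - n_fft + 1, hop)]
    return np.array(frames).T

def istft(S, n_fft=256, hop=64):
    """Inverse STFT via weighted overlap-add."""
    window = np.hanning(n_fft)
    n_frames = S.shape[1]
    out = np.zeros(n_fft + hop * (n_frames - 1))
    win_sum = np.zeros_like(out)
    for i in range(n_frames):
        frame = np.fft.irfft(S[:, i], n=n_fft)
        out[i * hop:i * hop + n_fft] += window * frame
        win_sum[i * hop:i * hop + n_fft] += window ** 2
    return out / np.maximum(win_sum, 1e-8)

def griffin_lim(mag, n_iter=32, n_fft=256, hop=64, seed=0):
    """Recover a waveform from a magnitude spectrogram.

    Starts from random phase, then alternates between the time domain
    and the STFT domain, re-imposing the target magnitude each pass.
    """
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))
    for _ in range(n_iter):
        audio = istft(mag * phase, n_fft, hop)
        phase = np.exp(1j * np.angle(stft(audio, n_fft, hop)))
    return istft(mag * phase, n_fft, hop)
```

In the full pipeline, `mag` would come from decoding the SDXL-generated spectrogram image (pixel intensities mapped back from dB to linear magnitude) rather than from an STFT of real audio; Griffin-Lim then supplies the phase information the image does not carry.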
DavesEmployee/RiffusionXL
About
Riffusion using Stable Diffusion XL