AI-based-AudioVis

Humans create and enjoy audio visualization in many forms. Some examples:

Lighting at a concert
Creating levels in any rhythm-based game (osu!, Beat Saber, etc.)
Syncing LED lights to music

The process has three components:
X: Some music is observed (heard)
P: The human brain creates a compact internal representation of the observed music
Y: That representation is converted into an artistic representation
X -> P -> Y
X and Y can each be implemented in various ways. P, the most difficult component, can be emulated with a representation learning model, which would need (X, Y) example pairs for training. This repo explores various ways to implement X (audio), P (model), and Y (visualization).
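To make the X -> P -> Y pipeline concrete, here is a minimal sketch using only the Python standard library. It is not the code in this repo: the function names (`make_tone`, `encode`, `render`) are hypothetical, the audio is a synthetic tone, and P is a hand-written loudness feature standing in for the learned model a real implementation would use.

```python
import math

SAMPLE_RATE = 8000  # Hz; an arbitrary choice for this sketch


def make_tone(freq_hz, seconds, amplitude):
    """X: synthesize a sine tone as a list of samples in [-1, 1]."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


def encode(samples, frame_size=400):
    """P (stand-in): one RMS loudness value per non-overlapping frame.

    A learned encoder would replace this hand-written feature.
    """
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples) - frame_size + 1, frame_size)]
    return [math.sqrt(sum(s * s for s in f) / len(f)) for f in frames]


def render(features):
    """Y: map each feature to an LED brightness in 0..255."""
    peak = max(features) or 1.0  # avoid dividing by zero on pure silence
    return [round(255 * f / peak) for f in features]


# A quiet tone followed by a loud one.
audio = make_tone(440, 0.5, 0.2) + make_tone(440, 0.5, 0.8)
brightness = render(encode(audio))
```

Here `brightness` holds one value per 50 ms frame, and the frames from the quiet half come out dimmer than those from the loud half. Swapping `encode` for a trained model changes P without touching X or Y, which is the separation the diagram describes.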

Requirements

For the VAE layers, I use a CUDA kernel that NVIDIA wrote for a signal-processing function in StyleGAN2. Download a zip of the relevant code here: link

Current progress:

I have a functional VAE implementation. It currently doesn't work with residual blocks, for reasons I haven't identified yet; some change is probably needed. The VAE also doesn't perform well on spectrograms.
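For reference, the pieces of a standard VAE objective can be sketched in plain Python as below. This is the textbook formulation with a diagonal Gaussian posterior, not necessarily what this repo implements; the function names and the `beta` weight are assumptions for illustration.

```python
import math
import random


def reparameterize(mu, logvar, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), element-wise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) in closed form."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))


def vae_loss(x, x_recon, mu, logvar, beta=1.0):
    """Squared-error reconstruction term plus a beta-weighted KL term."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + beta * kl_to_standard_normal(mu, logvar)
```

Because the reconstruction term is a sum over every input value, its size relative to the KL term depends on how the spectrogram is scaled and how many bins it has. That balance may be worth checking when results on spectrograms are poor, though nothing here confirms it is the cause.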

shahbuland/AI-based-AudioVis
