Skip to content

A torch-trained network that leverages self-attention and LSTMs to generate piano notes from midi files.

Notifications You must be signed in to change notification settings

Ellon-M/AmbientGAN

Repository files navigation

AmbientGAN

Overview

A Generative Model that parses piano notes from MIDI files and uses them as input to generate new note sequences. Most of the input MIDI files are sourced from the ADL Piano MIDI Dataset.

The adversarial network here follows the default architecture of simultaneously training a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

Additionally, a dot-product self-attention layer is employed and applied to the last layers of both the generator and the discriminator.

Req: Python 3, Pytorch 1.10, CUDA 10.2.

Training Loss

Training time on GPU: 8-10 Minutes for 2000 Epochs.

alt text

Results

MIDI tempo used for all outputs ranged between 250000 and 500000. Time sig, key sig and any necessary meta messages or lyrics found in MIDI files are pre-defined, and are not part of the model output.

Samples outputs are here.

TODO

  • Implement S.A Layer.
  • Fully Unsupervised Setting.

About

A torch-trained network that leverages self-attention and LSTMs to generate piano notes from midi files.

Topics

Resources

Stars

Watchers

Forks