Skip to content

Latest commit

 

History

History
23 lines (15 loc) · 1.08 KB

README.md

File metadata and controls

23 lines (15 loc) · 1.08 KB

Building your Dataset

After installing Magenta, you can build your first MIDI dataset. We do this by creating a directory of MIDI files and converting them into NoteSequences. If you don't have any MIDIs handy, you can use the Lakh MIDI Dataset or find some at MidiWorld.

Warnings may be printed by the MIDI parser if it encounters a malformed MIDI file but these can be safely ignored. MIDI files that cannot be parsed will be skipped.

You can also convert MusicXML files to NoteSequences.

INPUT_DIRECTORY=<folder containing MIDI and/or MusicXML files. can have child folders.>

# TFRecord file that will contain NoteSequence protocol buffers.
SEQUENCES_TFRECORD=/tmp/notesequences.tfrecord

convert_dir_to_note_sequences \
  --input_dir=$INPUT_DIRECTORY \
  --output_file=$SEQUENCES_TFRECORD \
  --recursive

Data processing APIs

If you are interested in adding your own model, please take a look at how we create our datasets under the hood: Data processing in Magenta