Macchiato is a MIDI file filter

Introduction

Macchiato is a MIDI file filter.

A filter is a tool that takes an input, and produces an output of the same kind. The UNIX/Linux command line uses a lot of filters: grep, sed, awk, perl… Those tools work on text files: they take a text file, apply some rules to transform it, and output the resulting text file.

Macchiato works in the same way: it takes a MIDI file, transforms it using some rules, and outputs a MIDI file.

Macchiato rules

The Macchiato rules resemble the awk rules: blocks that are run when a condition is met.

Whereas the awk conditions are run for each text line, Macchiato rules are run for each MIDI event, and at some special points: beginning of sequence, beginning of each track, end of each track, end of sequence.

Example: transform NOTE_ON events with a velocity equal to 0 into NOTE_OFF events with the same velocity as the corresponding NOTE_ON event.

// This simple example is basically a tutorial, with a lot of comments.

// Called at each beginning of track. Other special markers are BEGIN
// SEQUENCE, END SEQUENCE, and END TRACK.
BEGIN TRACK {
    // reset global variables at track beginning

    // velocity records the velocity of playing notes, per channel, per pitch
    velocity = [];

    // fill default values
    channel = 0;
    while (channel < 16) {
        velocity[channel] = [];
        channel = channel + 1;
    }
}

(event.type == NOTE_ON) and (event.velocity > 0) {
    velocity[event.channel][event.pitch] = event.velocity;
    emit;
    next;
}

(event.type == NOTE_ON) and (event.velocity == 0) {
    // emits a new event (instead of the analyzed one)
    emit NOTE_OFF(event.channel, event.pitch, velocity[event.channel][event.pitch]);
    next;
}

// the current event is emitted if no "next" was triggered before
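
For readers who prefer a general-purpose language, the same logic can be sketched in Python. This only illustrates what the ruleset above does, not how Macchiato is implemented. The `(type, channel, pitch, velocity)` tuple layout and the name `fix_note_offs` are assumptions made for this sketch, as is the fallback to velocity 0 when no matching NOTE_ON was seen.

```python
def fix_note_offs(track):
    """Rewrite zero-velocity NOTE_ON events as NOTE_OFF events.

    `track` is a list of (type, channel, pitch, velocity) tuples,
    a hypothetical layout chosen for this sketch.
    """
    # velocity[channel][pitch] remembers the velocity of sounding notes.
    # It is rebuilt for each track, like the BEGIN TRACK block does.
    velocity = [dict() for _ in range(16)]
    output = []
    for kind, channel, pitch, vel in track:
        if kind == "NOTE_ON" and vel > 0:
            velocity[channel][pitch] = vel
            output.append((kind, channel, pitch, vel))
        elif kind == "NOTE_ON" and vel == 0:
            # Assumption: fall back to 0 if no matching NOTE_ON was seen.
            remembered = velocity[channel].get(pitch, 0)
            output.append(("NOTE_OFF", channel, pitch, remembered))
        else:
            # No rule matched: the event passes through unchanged.
            output.append((kind, channel, pitch, vel))
    return output
```

Calling `fix_note_offs` on a track that contains a NOTE_ON at velocity 90 followed by a NOTE_ON at velocity 0 for the same channel and pitch yields a NOTE_ON followed by a NOTE_OFF at velocity 90.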

Language grammar

The language has many influences: C, Python, Go, Eiffel… There are quite strict conventions (notably, blocks are mandatory in some cases).

Ruleset ::= (Import | Def | Filter)*

Import ::= "import" Identifier ManifestString ";"

Def ::= "def" Identifier FormalArgs Block

Identifier ::= /[A-Za-z_][0-9A-Za-z_]*/

FormalArgs ::= "(" (Identifier ("," Identifier)*)? ")"

Filter ::= Condition Block

Condition ::= "BEGIN" "SEQUENCE"
           |  "BEGIN" "TRACK"
           |  "END" "SEQUENCE"
           |  "END" "TRACK"
           |  Expression

Instruction ::= Assignment
             |  Block
             |  Call
             |  Emit
             |  For
             |  If
             |  Local
             |  Next
             |  While

Block ::= "{" (Instruction)* "}"

Assignment ::= (Identifier | "result") IdentifierSuffix "=" Expression ";"

Call ::= CallName "(" (Expression ("," Expression)*)? ")" ";"

CallName ::= Identifier ("." Identifier)*

If ::= "if" Expression Block ("else" (If | Block))?

While ::= "while" Expression Block ("else" Block)?

For ::= "for" Identifier ("," Identifier)? "in" Expression Block

Local ::= "local" Identifier ("=" Expression) ";"

Emit ::= "emit" (Expression ("at" Expression)?)? ";"

Next ::= "next" ";"

Expression ::= OrLeft (OrRight)*

OrLeft ::= AndLeft (AndRight)*

OrRight ::= ("or" | "xor") OrLeft

AndLeft ::= ComparatorLeft (ComparatorRight)*

AndRight ::= "and" AndLeft

ComparatorLeft ::= AdditionLeft (AdditionRight)*

ComparatorRight ::= ("~" | "=" | "!=" | "<" | "<=" | ">" | ">=") ComparatorLeft

AdditionLeft ::= MultiplicationLeft (MultiplicationRight)*

AdditionRight ::= ("+" | "-") AdditionLeft

MultiplicationLeft ::= PowerLeft (PowerRight)*

MultiplicationRight ::= ("*" | "/" | "\") MultiplicationLeft

PowerLeft ::= Unary

PowerRight ::= "^" PowerLeft

Unary ::= ("not" | "+" | "-") Unary
       | AtomicExpressionWithSuffix

AtomicExpressionWithSuffix ::= AtomicExpression IdentifierSuffix

AtomicExpression ::= ManifestString
                  |  ManifestRegex
                  |  ManifestArray
                  |  ManifestDictionary
                  |  ManifestNumber
                  |  Call
                  |  Identifier
                  |  "result"
                  |  "false"
                  |  "true"
                  |  "(" Expression ")"

IdentifierSuffix ::= ("[" Expression "]" | "." Identifier)*

ManifestString ::= /"([^"]|\\.)*"/

ManifestRegex ::= /\/([^/]|\\.)*\//

ManifestArray ::= "[" (Expression ("," Expression)*)? "]"

ManifestDictionary ::= "{" (Expression ":" Expression ("," Expression ":" Expression)*)? "}"

ManifestNumber ::= /[0-9]+/

Notes:

  • Arrays are sparse: elements can be added at any index (even a negative one) without filling the intervening indices. Manifest arrays do not specify indices: their elements are created starting from index 0, with incrementing indices. When iterating over a sparse array, the values are given in ascending index order.
  • The result reserved identifier is used to assign a value that will be returned from a def function.
  • The import clauses help modularize complex rulesets. They import sub-rulesets within a specific scope. The identifier names the scope within the importing ruleset, and must be unique. The string is the path of the ruleset to import. The import clauses cannot appear after filters (but they can be mixed with def clauses). The functions defined in the scope are callable via their name prefixed with the scope identifier and a dot. The imported filters are run in import order, and before those of the importing ruleset. Imports can be nested; the same rules apply recursively.
  • local variables are meaningful only in def functions, not in filters.
  • There is no null. By design.
  • Comments are either bash-style (lines starting with a hash sign, #) or C-style (// and /* */).
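
The sparse-array behaviour described in the first note can be modelled in a few lines of Python. This is a sketch of the semantics only; the class name `SparseArray` and its methods are hypothetical and say nothing about how the interpreter stores arrays.

```python
class SparseArray:
    """Integer-indexed container with gaps, iterated in ascending index order."""

    def __init__(self, values=()):
        # A manifest array such as ["a", "b"] gets indices 0, 1, ...
        self._items = dict(enumerate(values))

    def __setitem__(self, index, value):
        # Any integer index is accepted, including negative ones.
        self._items[index] = value

    def __getitem__(self, index):
        return self._items[index]

    def items(self):
        # Ascending index order, whatever the insertion order was.
        return sorted(self._items.items())
```

After creating `SparseArray(["a", "b"])` and assigning to indices 10 and -3, `items()` lists index -3 first, then 0, 1 and 10, with no entries for the indices in between.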

Native values and functions

Some functions and values are provided natively by the interpreter.

Event value

The event variable describes the event currently being processed. It has the following fields:

| Identifier | Content |
| --- | --- |
| event.tick | The MIDI tick at which the event happens |
| event.type | The type of the event, either a meta message event: SEQUENCE_NUMBER, TEXT, COPYRIGHT, TRACK_NAME, INSTRUMENT_NAME, LYRICS, MARKER_TEXT, CUE_POINT, CHANNEL_PREFIX, END_OF_TRACK, TEMPO, TIME_SIGNATURE, KEY_SIGNATURE; or a short message event: NOTE_OFF, NOTE_ON, POLY_PRESSURE, CONTROL_CHANGE, PROGRAM_CHANGE, CHANNEL_PRESSURE, PITCH_BEND |
| event.sequence | Only for SEQUENCE_NUMBER events: the sequence number |
| event.text | Only for text meta events (i.e. TEXT, COPYRIGHT, TRACK_NAME, INSTRUMENT_NAME, LYRICS, MARKER_TEXT, CUE_POINT): the text string |
| event.bpm | Only for TEMPO events: the number of beats per minute |
| event.numerator | Only for TIME_SIGNATURE events: the signature numerator |
| event.denominator | Only for TIME_SIGNATURE events: the signature denominator |
| event.metronome | Only for TIME_SIGNATURE events: the signature metronome, i.e. the number of MIDI ticks per metronome beat (usually 24) |
| event.ticks | Only for TIME_SIGNATURE events: the signature ticks, i.e. the number of 32nd notes per quarter note (usually 8) |
| event.keysig | Only for KEY_SIGNATURE events: the key signature, between -7 and 7 (Cb to C# major, or Ab to A# minor) |
| event.mode | Only for KEY_SIGNATURE events: the key mode, 0 for major and 1 for minor |
| event.channel | Only for short message events (i.e. NOTE_OFF, NOTE_ON, POLY_PRESSURE, CONTROL_CHANGE, PROGRAM_CHANGE, CHANNEL_PRESSURE, PITCH_BEND): the channel to which the event applies |
| event.velocity | Only for NOTE_ON and NOTE_OFF events: the velocity of the note being pressed or released, respectively |
| event.pitch | Only for NOTE_ON and NOTE_OFF events: the pitch of the note being pressed or released, respectively |
| event.pressure | Only for POLY_PRESSURE and CHANNEL_PRESSURE events: the pressure to apply |
| event.mpc | Only for CONTROL_CHANGE events: the type of Multi-Point Controller: either sliding controllers: BANK, MODULATION_WHEEL, BREATH, FOOT, PORTAMENTO_TIME, CHANNEL_VOLUME, BALANCE, PAN, EXPRESSION, EFFECT_1, EFFECT_2, GENERAL_PURPOSE_1, GENERAL_PURPOSE_2, GENERAL_PURPOSE_3, GENERAL_PURPOSE_4, FINE_BANK, FINE_MODULATION_WHEEL, FINE_BREATH, FINE_FOOT, FINE_PORTAMENTO_TIME, FINE_CHANNEL_VOLUME, FINE_BALANCE, FINE_PAN, FINE_EXPRESSION, FINE_EFFECT_1, FINE_EFFECT_2, FINE_GENERAL_PURPOSE_1, FINE_GENERAL_PURPOSE_2, FINE_GENERAL_PURPOSE_3, FINE_GENERAL_PURPOSE_4; or switch controllers: DAMPER_PEDAL, PORTAMENTO, SOSTENUTO, SOFT_PEDAL, LEGATO_PEDAL, NRPN_MSB, NRPN_LSB, RPN_MSB, RPN_LSB, PARAMETER_VALUE, FINE_PARAMETER_VALUE |
| event.value | For CONTROL_CHANGE events: the value of the Multi-Point Controller (either a number between 0 and 127 for sliding controllers, or a boolean for switch controllers); for PITCH_BEND events: the value of the pitch bend (between -8192 and 8191); for MODULATION events: the value of the modulation wheel (between 0 and 127) |
| event.patch | Only for PROGRAM_CHANGE events: the patch number |
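
To make the keysig and mode ranges concrete, the following Python sketch maps a pair of values to a key name by walking the circle of fifths. The helper `key_name` is hypothetical and is not part of Macchiato; it only illustrates how the range -7 to 7 covers Cb to C# major and Ab to A# minor.

```python
# Keys in circle-of-fifths order, from 7 flats (index 0) to 7 sharps (index 14).
MAJOR_KEYS = ["Cb", "Gb", "Db", "Ab", "Eb", "Bb", "F", "C",
              "G", "D", "A", "E", "B", "F#", "C#"]
MINOR_KEYS = ["Ab", "Eb", "Bb", "F", "C", "G", "D", "A",
              "E", "B", "F#", "C#", "G#", "D#", "A#"]


def key_name(keysig, mode):
    """Return a readable key name for a keysig in -7..7 and a mode of 0 or 1."""
    if not -7 <= keysig <= 7:
        raise ValueError("keysig must be between -7 and 7")
    if mode not in (0, 1):
        raise ValueError("mode must be 0 (major) or 1 (minor)")
    names = MINOR_KEYS if mode == 1 else MAJOR_KEYS
    suffix = " minor" if mode == 1 else " major"
    # Negative values count flats, positive values count sharps.
    return names[keysig + 7] + suffix
```

For instance, a keysig of 0 with mode 0 is C major, and a keysig of -1 with mode 1 is D minor.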

Event functions

New events can be emitted. Those events are built using the following functions:

| Function name | Arguments |
| --- | --- |
| CHANNEL_PREFIX | not yet implemented |
| CHANNEL_PRESSURE | channel: the event channel; pressure: the pressure between 0 and 127 |
| CONTROL_CHANGE | channel: the event channel; mpc: the Multi-Point Controller (either sliding controllers: BANK, MODULATION_WHEEL, BREATH, FOOT, PORTAMENTO_TIME, CHANNEL_VOLUME, BALANCE, PAN, EXPRESSION, EFFECT_1, EFFECT_2, GENERAL_PURPOSE_1, GENERAL_PURPOSE_2, GENERAL_PURPOSE_3, GENERAL_PURPOSE_4, FINE_BANK, FINE_MODULATION_WHEEL, FINE_BREATH, FINE_FOOT, FINE_PORTAMENTO_TIME, FINE_CHANNEL_VOLUME, FINE_BALANCE, FINE_PAN, FINE_EXPRESSION, FINE_EFFECT_1, FINE_EFFECT_2, FINE_GENERAL_PURPOSE_1, FINE_GENERAL_PURPOSE_2, FINE_GENERAL_PURPOSE_3, FINE_GENERAL_PURPOSE_4, or switch controllers: DAMPER_PEDAL, PORTAMENTO, SOSTENUTO, SOFT_PEDAL, LEGATO_PEDAL, or MIDI parameters: NRPN_MSB, NRPN_LSB, RPN_MSB, RPN_LSB, PARAMETER_VALUE, FINE_PARAMETER_VALUE); value: either a number between 0 and 127 (for sliding controllers and MIDI parameters) or a boolean (for switch controllers) |
| COPYRIGHT | text: the text string |
| CUE_POINT | text: the text string |
| END_OF_TRACK | no arguments |
| INSTRUMENT_NAME | text: the text string |
| KEY_SIGNATURE | keysig: the key signature, between -7 and 7 (Cb to C# major, or Ab to A# minor); mode: 0 for major or 1 for minor |
| LYRICS | text: the text string |
| MARKER_TEXT | text: the text string |
| MODULATION | value: the modulation wheel value between 0 and 127 |
| NOTE_OFF | channel: the event channel; velocity: the velocity between 0 and 127; pitch: the note pitch between 0 and 127 |
| NOTE_ON | channel: the event channel; velocity: the velocity between 0 and 127; pitch: the note pitch between 0 and 127 |
| PITCH_BEND | channel: the event channel; value: the pitch bend value between -8192 and 8191 |
| POLY_PRESSURE | channel: the event channel; pressure: the pressure between 0 and 127 |
| PROGRAM_CHANGE | channel: the event channel; patch: the patch number between 0 and 127 |
| SEQUENCE_NUMBER | sequence: number between 0 and 127 |
| TEMPO | bpm: number of beats per minute |
| TEXT | text: the text string |
| TIME_SIGNATURE | numerator: the signature numerator; denominator: the signature denominator; metronome: the number of MIDI ticks per metronome beat; ticks: the number of 32nd notes per quarter note |
| TRACK_NAME | text: the text string |
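
Two of the ranges above come straight from how MIDI encodes data, which the following Python sketch illustrates. A pitch bend is a 14-bit value centred on 8192 and split into two 7-bit data bytes, and a tempo is stored in the file as microseconds per quarter note. Both helper names are hypothetical, and whether Macchiato performs these conversions in exactly this way (in particular, that a "beat" means a quarter note) is an assumption.

```python
def pitch_bend_data_bytes(value):
    """Split a signed pitch bend (-8192..8191) into (LSB, MSB) data bytes."""
    if not -8192 <= value <= 8191:
        raise ValueError("pitch bend must be between -8192 and 8191")
    raw = value + 8192          # shift to the unsigned 14-bit range 0..16383
    return raw & 0x7F, raw >> 7  # low 7 bits, then high 7 bits


def bpm_to_microseconds_per_quarter(bpm):
    """Convert beats per minute to the value stored in a MIDI tempo event.

    Assumption: one beat is one quarter note.
    """
    return round(60_000_000 / bpm)
```

A pitch bend of 0 (no bend) becomes the data bytes (0, 64), and 120 beats per minute becomes 500000 microseconds per quarter note.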

Other functions

Other functions are natively defined by the interpreter:

| Function name | Arguments and returned value |
| --- | --- |
| random | max: the exclusive upper bound of the random number to draw. The returned value is between 0 and max-1 inclusive. |
| read | file: the file name. The returned value is the content of the file, which must be a valid object. |
| write | file: the file name; value: the object to write. Nothing is returned. |
| toString | value: the value to convert. The returned value is a string that contains the serialized object. |
| fromString | data: the serialized value to convert. The returned value is the deserialized object. |

Notes:

  • read and write store and load complete objects in files. The notation used is similar to (yet different from) JSON: the array indices are also recorded (because arrays are sparse), and there is still no null.
  • toString and fromString work in a similar way; but the data is kept in a string instead of being read from / written to a file.
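
The need to record indices can be seen with a small Python sketch. It does not reproduce Macchiato's notation, which is not specified here. It uses plain JSON with explicit `[index, value]` pairs as a stand-in, and the names `dump_sparse` and `load_sparse` are hypothetical.

```python
import json


def dump_sparse(sparse):
    """Serialize a {index: value} mapping as a JSON list of [index, value] pairs."""
    # Sorting keeps the output stable and in ascending index order.
    return json.dumps(sorted(sparse.items()))


def load_sparse(data):
    """Rebuild the {index: value} mapping written by dump_sparse."""
    return {index: value for index, value in json.loads(data)}
```

A plain JSON array could not represent an element at index -2 or skip from index 0 to index 5 without padding. With the indices written out, `load_sparse(dump_sparse(x))` returns a mapping equal to `x`.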

Technical design

Macchiato is split into two parts:

  1. The parser
  2. The interpreter

The parser

It is a simple top-down (recursive descent) parser. The grammar has one ambiguity: an identifier followed by dots can start either an assignment or a call.

Its design blends the "tokenizer" and the "analyzer": the analyzer drives the tokenizer, which makes it more flexible than a traditional tokenizer that works without knowing what the analyzer expects.
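
The idea of an analyzer-driven tokenizer can be sketched in Python as follows. The sketch is not taken from the Macchiato sources: the class `Scanner`, its methods, and the choice of the division operator versus regex literal clash as the example are all assumptions made for illustration.

```python
import re


class Scanner:
    """Tokenizer that only looks for what the analyzer asks for."""

    NUMBER = re.compile(r"[0-9]+")
    REGEX = re.compile(r"/((?:[^/\\]|\\.)*)/")

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def _skip_blanks(self):
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1

    def accept(self, pattern):
        """Return the matched text if `pattern` matches here, else None."""
        self._skip_blanks()
        match = pattern.match(self.text, self.pos)
        if match is None:
            return None
        self.pos = match.end()
        return match.group(0)

    def accept_symbol(self, symbol):
        """Consume `symbol` if it is next in the input and report success."""
        self._skip_blanks()
        if self.text.startswith(symbol, self.pos):
            self.pos += len(symbol)
            return True
        return False


def parse_operand(scanner):
    # Operand position: a "/" here can only start a regex literal.
    regex = scanner.accept(Scanner.REGEX)
    if regex is not None:
        return ("regex", regex)
    number = scanner.accept(Scanner.NUMBER)
    if number is not None:
        return ("number", int(number))
    raise SyntaxError("operand expected at position %d" % scanner.pos)


def parse_division(scanner):
    left = parse_operand(scanner)
    # Operator position: a "/" here is the division operator.
    while scanner.accept_symbol("/"):
        left = ("div", left, parse_operand(scanner))
    return left
```

A tokenizer working on its own could read the `/ 2 /` in `8 / 2 / 2` as a regex literal. Here the analyzer knows it expects an operator after `8`, so it asks for the `/` symbol and never tries the regex pattern at that position.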

The interpreter

The interpreter runs in two stages:

  1. Simplify
  2. Run

Simplify

This stage simplifies the AST as much as possible, by finding constant expressions and inlining their values.

Much more could be done in this stage (loop unrolling, for example).

For the moment the work is minimal: removing dead branches wherever possible, and executing operations on known static data (such as concatenating constant strings, or doing arithmetic operations on numeric constants).
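
A minimal Python sketch of this kind of simplification is shown below. The tuple-based AST and the node names ("num", "str", "bool", "var", "add", "if") are invented for the example and do not reflect Macchiato's internal representation. Using a single "add" node for both numeric addition and string concatenation is also an assumption.

```python
def simplify(node):
    """Return a simplified copy of a tuple-based AST node."""
    kind = node[0]
    if kind == "add":
        left, right = simplify(node[1]), simplify(node[2])
        # Constant folding: both operands are constants of the same kind.
        if left[0] == right[0] and left[0] in ("num", "str"):
            return (left[0], left[1] + right[1])
        return ("add", left, right)
    if kind == "if":
        cond = simplify(node[1])
        if cond[0] == "bool":
            # Dead-branch removal: keep only the branch that can run.
            return simplify(node[2] if cond[1] else node[3])
        return ("if", cond, simplify(node[2]), simplify(node[3]))
    # Constants and variables are already as simple as they can be.
    return node
```

With this sketch, `("add", ("num", 1), ("add", ("num", 2), ("num", 3)))` folds to `("num", 6)`, while an addition involving a variable keeps its "add" node and only has its constant operand folded.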

Run

In this stage the MIDI file is opened and the filters are run on each MIDI event.
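
The overall flow of this stage can be sketched in Python as follows. It is a simplified model, not the actual interpreter. The function `run`, the representation of filters as `(condition, action)` pairs, the `(emitted_events, stop)` return convention standing in for emit and next, and the exact order in which the special markers fire are all assumptions.

```python
def _noop():
    pass


def run(sequence, markers, filters):
    """Apply filters to a sequence, given as a list of tracks (lists of events).

    `markers` maps "BEGIN SEQUENCE", "BEGIN TRACK", "END TRACK" and
    "END SEQUENCE" to callables taking no arguments. Each filter action
    returns (emitted_events, stop), where stop=True plays the role of `next`.
    """
    result = []
    markers.get("BEGIN SEQUENCE", _noop)()
    for track in sequence:
        markers.get("BEGIN TRACK", _noop)()
        out = []
        for event in track:
            stopped = False
            for condition, action in filters:
                if condition(event):
                    emitted, stopped = action(event)
                    out.extend(emitted)
                    if stopped:
                        break
            if not stopped:
                # No filter triggered `next`: emit the event unchanged.
                out.append(event)
        markers.get("END TRACK", _noop)()
        result.append(out)
    markers.get("END SEQUENCE", _noop)()
    return result
```

In this model, a filter that returns `([], True)` drops the event, and a filter that returns `([event], False)` duplicates it, because the default emission still happens afterwards.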