OSC is a versatile protocol that can be used however you like. It is far more flexible than MIDI. But one of the reasons OSC has not replaced MIDI yet is that there is no connect-and-play (or plug-and-play, if you prefer this worn-out phrase) with OSC. There are several reasons for this:
- There is no standard namespace in OSC for interfacing e.g. a synth;
- Connected devices (via Ethernet, WLAN, Bluetooth etc) do not know of each other (and each other's capabilities);
- A file format similar to the Standard MIDI File, containing note data etc. for sharing between different applications, is missing
This is a proposal for a standardized namespace within OSC.
It should have all the possibilities of MIDI (except MIDI show control and similar exotic functions - unless necessary) and beyond. It is not an "OSC-wrapped MIDI protocol" but a completely new system which has no concern for 7bit MIDI messages.
As many things from MIDI are specified in a compatible way, it would also be possible to build a lightweight "MIDI to SynOSCopy"-adapter for existing MIDI synthesizers and keyboards. That is one of the reasons some things here were initially conceived like they were. You are invited to discuss those facts to decide if this is desirable or not.
If you have no idea what OSC is, or you are not sure about vocabulary such as OSC Address Pattern or OSC Type Tag String, then head over to http://www.cnmat.berkeley.edu/OpenSoundControl/OSC-spec.html to read the OSC specification.
What is the purpose of this protocol:
- Offer ways to do everything that is possible with MIDI and most commonly used
- Note On / Off messages
- More than one note per frequency possible
- Easy retuning (dynamic base frequency + relative note frequencies)
Be sure to also check out http://openmediacontrol.wetpaint.com/ which is not (yet) affiliated with SynOSCopy, but also tries to offer a standardized namespace.
This proposal is very tightly bound to old MIDI thinking and makes only small leaps (synths may be more complicated). As of Fall 2008, a general OSC "Hello" protocol was being discussed on the OSC mailing list. This would enable connected devices to learn each other's capabilities. For synths, a standard mapping will then have to be conceived. Of interest in this matter could be http://idmil.org/software/mappingtools and http://www.gdif.org/.
You are hereby officially allowed to cannibalize ideas of basic needs from these thoughts here.
On top of the address pattern is the /SYN part. /SYN itself has not got any arguments.
Under /SYN is the /IDx part. You can think of it as the big brother of the MIDI channel. Every synth has its own number: it starts with /SYN/ID1, then /SYN/ID2 and so on. It is also used to access synth sub-patches as with MIDI multimode. In this case, the different patches are accessed by /SYN/ID1.1, /SYN/ID1.2 and so on. Remember: when you want to set arguments for several IDs at once, you can use OSC's wildcard system to do so. E.g. /SYN/ID* sets arguments for all IDs and /SYN/ID1.* for all synth sub-patches within ID1. The IDx system is good for cases where the sending OSC client does not know whom it is speaking with (see "OSC-Hello" in chapter "Some Words"). For other cases the differing IDs can also have an additional plain name which is in the same hierarchy: /SYN/name/. If one argues that the ID part is unnecessary because messages could be sent directly to the synth in question, note that the recipient could be a host hosting several synthesizers, or a synth with several channels, each a separate synthesizer (as with many MIDI-enabled synths which can hold 16 different instruments).
Under /IDx is the /Vx part located. V stands for "voice". In the SYN proposal, you do not turn on notes, you turn on voices which have got a note or frequency argument. The big advantage of this system is that you can e.g. have as many "c3" notes playing as you like, each one with different filter cutoff values set or whatever else you can think of.
In this proposal, mostly OSC's (signed) int32 data type is used (and True, False and Nil for flags). It is a 32-bit big-endian two's complement integer. When sending relative numbers, the whole 32 bits are used, so it's possible to send value changes of up to +2147483647 or -2147483648. For sending absolute values it was decided to also use the signed integer format, but to use just the positive part, using "only" 31 bits. This was considered more useful because, starting at any of the possible absolute values, you can switch to any other value with a single relative value change. Negative values equal 0.
If bigger data types are needed in the future, the int32 data type "i" can easily be replaced by the int64 data type "h". But for now 2147483648 values should be enough.
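The absolute/relative convention above could be sketched as follows. The function name `apply_argument` is hypothetical, and clamping at the upper end is an assumption: the text only states that negative absolute values equal 0.

```python
INT32_MAX = 2**31 - 1   # +2147483647, largest absolute value
INT32_MIN = -2**31      # -2147483648, largest negative relative change


def apply_argument(current: int, arg: int, relative: bool) -> int:
    """Return the new absolute value after receiving the int32 `arg`."""
    if not INT32_MIN <= arg <= INT32_MAX:
        raise ValueError("argument does not fit into an OSC int32")
    new_value = current + arg if relative else arg
    # Absolute values only use the positive half: negatives equal 0.
    # Clamping at INT32_MAX is an assumption, not part of the proposal.
    return max(0, min(INT32_MAX, new_value))
```

With this layout a single relative message can move a parameter from the maximum all the way down to 0, or from 0 all the way up to the maximum.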
General arguments such as ID and voice parameters, pedals, volume, velocity, program change and marker numbers simply go from minimum 0 to maximum +2147483647. Volume is sent linearly (linear fader position) but should of course be interpreted logarithmically by the recipient. (So the transmitted volume value is a bit like a percentage of loudness.) Why? See http://www.dr-lex.34sp.com/info-stuff/volumecontrols.html.
Panning's middle position is 1073741823 (0x 3F FF FF FF). Completely to the left/back/bottom (X-Y-Z) is 0 and completely to the right/front/top is 2147483646 (The maximum 2147483647 is left out and equals 2147483646). (See above why the quirk) It is up to the recipient if it uses circular panning law but it is recommended. See http://www.harmony-central.com/articles/tips/panning_laws/.
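A recipient might decode a pan value roughly like this. The helper names are hypothetical, and the gain formula is one common reading of a circular (constant-power) panning law, not something the proposal prescribes.

```python
import math

PAN_MAX = 2147483646     # completely right/front/top
PAN_CENTER = 1073741823  # 0x3FFFFFFF, middle position


def pan_position(value: int) -> float:
    """Map a pan argument to -1.0 (left) .. 0.0 (middle) .. +1.0 (right)."""
    value = max(0, min(PAN_MAX, value))  # 2147483647 equals 2147483646
    return (value - PAN_CENTER) / PAN_CENTER


def constant_power_gains(value: int) -> tuple:
    """Return (left_gain, right_gain); assumed constant-power law."""
    angle = (pan_position(value) + 1.0) * math.pi / 4.0  # 0 .. pi/2
    return math.cos(angle), math.sin(angle)
```

Leaving out 2147483647 is what makes the middle position sit exactly halfway between both ends.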
Pitch and frequency can be passed either in cents or in Hz. When passed in Hz, the first 16 bits (of which the sign bit is unused for absolute values) represent the whole Hz value and the following 16 bits the fractional part (so one bit represents 1/65536 of a Hz). Just divide the int32 value by 2^16 to get the Hz value. So the maximum representable frequency is just under 2^15=32768Hz. When pitch or frequency is passed in cents, the first 18 bits (of which the sign bit is unused for absolute values) represent whole cents and the last 14 bits represent the fractional part. Divide the int32 value by 2^14 to get the cent value. The resulting 17 bits for the integral part were chosen because human beings hear a bit less than 11 octaves, so about 12*1200 cents should be addressable. With 17 bits you have got that. (quirky) BTW, you can hardly hear an interval of one cent: http://en.wikipedia.org/wiki/Image:One_Cent_Interval.ogg
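The fixed-point layout described above can be sketched in a few lines. The names `encode_fixed` and `decode_fixed` are hypothetical helpers, and rounding to the nearest step is an assumption.

```python
HZ_FRACTION_BITS = 16    # Hz (and BPM): divide the int32 by 2^16
CENT_FRACTION_BITS = 14  # cents: divide the int32 by 2^14


def encode_fixed(value: float, fraction_bits: int) -> int:
    """Convert a real number to the int32 fixed-point representation."""
    raw = round(value * (1 << fraction_bits))
    if not -2**31 <= raw <= 2**31 - 1:
        raise ValueError("value does not fit into an int32")
    return raw


def decode_fixed(raw: int, fraction_bits: int) -> float:
    """Convert an int32 fixed-point argument back to a real number."""
    return raw / (1 << fraction_bits)
```

For example, 440 Hz would travel as `encode_fixed(440.0, HZ_FRACTION_BITS)` and a 100.5 cent offset as `encode_fixed(100.5, CENT_FRACTION_BITS)`; 32768 Hz no longer fits.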
Tempo is encoded like frequency in Hz above. The first 16 bits (of which the sign bit is unused for absolute values) represent whole BPMs and the following 16 bits the fractions of a BPM. Again, just divide the int32 value by 2^16 to get the BPM value.
Time (and Song Position) can be passed either in milliseconds or in beats. When passed in milliseconds, the first 28 bits represent the integral milliseconds value (roughly 1.5 days maximum). The last 4 bits describe the fractional part. So divide the int32 value by 2^4 to get milliseconds. Beats are represented the following way: the first 20 bits (of which the sign bit is unused for absolute values) represent musical quarters. The last 12 bits represent the parts per quarter (PPQ). A PPQ timebase of 3360 is used, so when the 12 bits show more than 3359, they equal 3359. Every quarter has been split into 3360 parts to be compatible with MIDI's timebase of 24 and the very often used timebase of 96, and to support accurate polyrhythms with not just triplets but also quintuplets and septuplets (3360 is divisible by 2, 3, 5 and 7).
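A minimal sketch of both time encodings, restricted to absolute (non-negative) values. All function names are hypothetical; how negative relative beat values are split into quarters and parts is not defined in the text and is left out here.

```python
PPQ = 3360  # parts per quarter, divisible by 24 and 96


def encode_ms(milliseconds: float) -> int:
    """28 bits of whole milliseconds followed by 4 fractional bits."""
    return round(milliseconds * 16)


def decode_ms(raw: int) -> float:
    return raw / 16


def encode_beats(quarters: int, parts: int) -> int:
    """20 bits of quarters followed by 12 bits of parts (0..3359)."""
    if not 0 <= parts < PPQ:
        raise ValueError("parts must be in the range 0..3359")
    return (quarters << 12) | parts


def decode_beats(raw: int) -> tuple:
    quarters = raw >> 12
    parts = min(raw & 0xFFF, PPQ - 1)  # more than 3359 equals 3359
    return quarters, parts
```

An eighth-note position after four quarters would be `encode_beats(4, PPQ // 2)`.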
Notes are coded in the following way: Middle C (= C4 = MIDI note number 60) equals the int32 value 1073741823 (0x 3F FF FF FF). Each note up/down is +1/-1. So when translating MIDI note numbers to OSC notes, just add 1073741763 (0x 3F FF FF C3) to the wanted MIDI notes (also see http://www.harmony-central.com/MIDI/Doc/table2.html). This was considered bad, please discuss. The standard base frequency is 261.6256Hz for the middle C. Commonly the equal temperament is used, where each further note has a frequency ratio of 2^(1/12) to its preceding note. Of course different tunings can be used which either specify which note has which frequency (or ratio/cent offset to the reference frequency), or add a cent offset to each note, or set several notes of one octave relative to a base frequency as in the Scala scale file format (http://www.huygens-fokker.org/scala/). The frequency ratio represented by one cent is the 1200th root of 2, BTW. So when a certain number of cents has to be added to a frequency, the new frequency is oldfrequency*((2^(1/1200))^cents).
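The note arithmetic above, written out as a small sketch. The function names are hypothetical; the constants come straight from the text.

```python
MIDDLE_C = 0x3FFFFFFF        # 1073741823, MIDI note number 60
MIDI_OFFSET = MIDDLE_C - 60  # 1073741763, 0x3FFFFFC3
BASE_HZ = 261.6256           # standard frequency of middle C


def midi_to_syn(midi_note: int) -> int:
    """Translate a MIDI note number to a SynOSCopy note number."""
    return midi_note + MIDI_OFFSET


def equal_tempered_hz(note: int, base_hz: float = BASE_HZ) -> float:
    """Frequency of a note in 12-tone equal temperament."""
    return base_hz * 2.0 ** ((note - MIDDLE_C) / 12.0)


def shift_by_cents(hz: float, cents: float) -> float:
    """Same as hz * ((2^(1/1200))^cents)."""
    return hz * 2.0 ** (cents / 1200.0)
```

`equal_tempered_hz(midi_to_syn(69))` lands very close to 440 Hz, and shifting any frequency by 1200 cents doubles it.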
For the Tuning Format, see "/SYN/IDx/TUNING" in chapter "Methods" -> "Per-ID Methods" -> "General Helpers".
So there is just one note-tuning system? Wrong. http://www.huygens-fokker.org/scala/
To stay flexible and not get stuck in one view, I would recommend 4 ways to represent note pitches:
- cent offset + a general base frequency
- ratio + a general base frequency
- note number (retuneable to some value of a, b or c)
The tuning methods should accept .scl files (see the Scala tuning standard: http://www.huygens-fokker.org/scala/scl_format.html). In the Scala tuning standard, there is NO base frequency specified, for good reasons. So the synth must have some method to adjust that (preferably dynamically at playtime, with well-defined semantics for already playing notes).
Here that is /SYN/IDx/BASE and /SYN/IDx/RELBASE. RELBASE is for dynamically shifting the whole synth by a ratio/cent offset. This is an essential thought for dynamic tone systems.
There are several Voice-Level and ID-Level Methods.
Sequencer control has been incorporated into IDs to make advanced setups possible where multiple sequencing occurs for different IDs.
Per voice the following methods can be invoked:
/SYN/IDx/Vx/ON,TFTi(Fi) Turns on a defined voice. The fourth argument passes either the note number or the frequency. It is also possible to set velocity with arguments five and six. If no velocity value is submitted, the recipient should either use a standard velocity value or reuse the one from the last received voice-on message. The first argument tells the receiver either to cut a still-sounding voice with the given voice number ("T") or to let the old voice release and play in parallel with the new voice this message creates ("F"). This has a big impact on the receiver's voice management! In the second case, the receiver has to create a second sounding voice with a different internal voice number. The old voice is then no longer tweakable from SynOSCopy. This was introduced because a SynOSCopy sender does not know when a voice has finished its release phase after a voice-off message has been sent, so a following voice-on with the same voice number could cut a voice that is still releasing. This is not so bad if the same note number / frequency is played (as in MIDI), but with different frequencies it is easily noticed. When the second argument is "T", the fourth argument is relative to the frequency/note number of the last voice-on message. The third argument tells whether a note number ("T") or an absolute frequency ("F") is passed with the fourth argument. It is important that both possibilities exist for two contrary kinds of synthesizers: 1) drum-machine-like ones which trigger a predefined sample mapped to a specific key and 2) other synthesizers which play a specific frequency; also think of breath controllers here (or Theremin-like stuff). (One of the problems with integrating SynOSCopy with OMC was that OMC favoured the key-based approach.) If the fifth argument is "T", the sixth argument (velocity) is relative to the last submitted velocity value.
It has a big advantage when notes can be / usually are turned on by frequency and not note number: when multiple controllers control one synth, the synth does not have to keep track which controller's note number means which frequency (dynamic retuning plays a big role in this proposal).
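On the wire, a voice-on message could be assembled as sketched here. This is a hand-rolled encoder following the OSC 1.0 rules (strings are null-terminated and padded to a multiple of 4 bytes, int32 is big-endian, and the "T" and "F" type tags carry no argument bytes); the helper names `osc_string`, `osc_message` and `voice_on` are hypothetical, and the optional velocity arguments are left out.

```python
import struct


def osc_string(text: str) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address: str, typetags: str, *ints: int) -> bytes:
    """Build a message whose only data-carrying arguments are int32s."""
    if typetags.count("i") != len(ints):
        raise ValueError("one int32 value is needed per 'i' type tag")
    payload = b"".join(struct.pack(">i", value) for value in ints)
    return osc_string(address) + osc_string("," + typetags) + payload


def voice_on(synth_id: int, voice: int, note: int, cut_old: bool = True) -> bytes:
    """Voice-on with an absolute note number: flags are cut, absolute, note."""
    address = f"/SYN/ID{synth_id}/V{voice}/ON"
    return osc_message(address, ("T" if cut_old else "F") + "FTi", note)
```

`voice_on(1, 3, 0x3FFFFFFF)` would turn on voice 3 of synth 1 at middle C, cutting any voice 3 that is still sounding.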
/SYN/IDx/Vx/OFF,(Fi) Turns off a defined voice. It is also possible to submit velocity with this function (like you know from note off velocity in MIDI). If the first parameter is "T", the second argument (velocity) is relative to the voice on velocity value of this voice.
/SYN/IDx/Vx/VOL,Fi Sets the volume of the defined voice. When the first argument is "T", the second argument is relative.
/SYN/IDx/Vx/PITCH,FFi Pitches the specified voice. When the first argument is "T", the third argument is relative. When the second argument is "F", the third argument passes cents (then the first argument is ignored and set to relative as cents are always relative), when it is "T" it passes Hz.
/SYN/IDx/Vx/PAN,Fi(ii) Pans the specified voice. When the first argument is "T", the following arguments are relative. If just arguments one and two are passed, it is a simple left-right-pan. When a third argument is passed too, the panning becomes an X-Y panning. And when a fourth argument is passed, panning is 3-dimensional X-Y-Z (open for suggestions).
/SYN/IDx/Vx/P1,Fi Sets parameter 1 of the voice. This could e.g. be the note aftertouch. When the first argument is "T", the second argument is relative. Parameter 2 is method /SYN/IDx/Vx/P2, then comes P3 etc.
Every ID has got the following methods which can be invoked. IDs can also have Sub-IDs.
/SYN/IDx/VOL,Fi Sets the volume of the synth. When the first argument is "T", the second argument is relative.
/SYN/IDx/PITCH,FFi Pitches the whole synth. Most probably pitches every playing and newly turned on voice by set cent or Hz. It is of course up to the synth what it does with the pitching but in most cases it will be used for pitching all voices. When the first argument is "T", the third argument is relative. When the second argument is "F", the third argument passes cents (then the first argument is ignored and set to relative as cents are always relative), when it is "T" it passes Hz.
/SYN/IDx/BASE,i Sets the base frequency of the synth. It is very important that already playing notes are NOT affected, but just new ones. Argument is always absolute in Hz. Standard is 440Hz (for the middle a).
/SYN/IDx/RELBASE,Fi(i) This method is for the realization of a dynamic tone system. It takes a multiplicative offset (either in Cent or in a Ratio) to the base frequency. When the first argument is "F", the second argument passes cents, when it is "T" it passes a whole number ratio (then the second argument is the numerator and the third the denominator). Also here it is important that already playing notes are remaining unaltered.
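The "already playing notes remain unaltered" rule can be illustrated with a toy receiver. The class `ToySynth` and its methods are purely hypothetical, and treating successive RELBASE messages as cumulative is an assumption the text does not settle.

```python
class ToySynth:
    """Toy model: each voice keeps the frequency it was turned on with."""

    def __init__(self, base_hz: float = 440.0):
        self.base_hz = base_hz
        self.voices = {}  # voice number -> frequency fixed at voice-on time

    def relbase_ratio(self, numerator: int, denominator: int) -> None:
        # Whole-number ratio form (first argument "T").
        self.base_hz *= numerator / denominator

    def relbase_cents(self, cents: float) -> None:
        # Cent form (first argument "F"), already decoded from fixed point.
        self.base_hz *= 2.0 ** (cents / 1200.0)

    def voice_on(self, voice: int, ratio_to_base: float = 1.0) -> None:
        self.voices[voice] = self.base_hz * ratio_to_base
```

After `relbase_ratio(3, 2)`, a voice that was started earlier keeps sounding at its old frequency, while the next voice-on uses the shifted base.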
/SYN/IDx/PAN,Fi(ii) Pans the whole synth. Most probably pans every playing and newly turned on voice. It is of course up to the synth what it does with the panning but in most cases it will be used for panning all notes. When the first argument is "T", the following arguments are relative. If just arguments one and two are passed, it is a simple left-right-pan. When a third argument is passed too, the panning becomes an X-Y panning. And when a fourth argument is passed, panning is 3-dimensional X-Y-Z (open for suggestions).
/SYN/IDx/P1,Fi Sets parameter 1 of the synth. This could e.g. be the modwheel or channel aftertouch or just any knob/slider. When the first argument is "T", the second argument is relative. Parameter 2 is method P2, then comes P3 etc.
/SYN/IDx/PROGCH,Fi This is a program change. When passed to a multimode synth, it either changes all synth sub-patches (using /SYN/IDx/PROGCH) or just a single one (/SYN/IDx.x/PROGCH). When the first argument is "T", the second argument is relative.
/SYN/IDx/SUSTAIN,Fi The sustain pedal. Voice-off messages are completely ignored when the value is set to maximum. When the synth supports it, values below maximum cause the synth to fade the voice out as it likes (the sustain pedal of a piano is no on/off switch either). When the first argument is "T", the second argument is relative.
/SYN/IDx/SOSTENUTO,Fi Like the sostenuto pedal on pianos. Ignores voice-off messages for voices which are on at the moment the sostenuto message is received (with a value greater than 0). Following voice-on/voice-off message pairs trigger normally, except for the first-mentioned voices, which are not turned off until a sostenuto message with value 0 is received. Learn about the sostenuto pedal on pianos to understand. When the synth supports it, values below the maximum cause the synth to fade the voice out as it likes. When the first argument is "T", the second argument is relative.
/SYN/IDx/SOFT,Fi Like the soft pedal on pianos. Plays the voice softer. The value sets the degree of softness. 0 is normal and 2147483647 is fluffy-soft. When the first argument is "T", the second argument is relative.
/SYN/IDx/LEGATO,T(FFi) The legato pedal. When first argument is set to "T", voices are automatically kept on until a new voice is turned on. This works also with chords. Within a selected timeframe turned on voices play together. The timeframe is set with the fourth parameter. When the second argument is "T", the fourth argument is relative. When the third argument is "T", the fourth argument is in beats otherwise it is in milliseconds.
/SYN/IDx/PORTAMENTO,T(FFi) The portamento pedal. When the first argument is set to "T" and a new voice is played, the old playing note slides to the new one. Send "F" to turn off. The synth now works in monophonic mode. If the synth supports it, parameters will slide too. The portamento time can be passed too with the fourth argument, either in beats or in milliseconds. When the second argument is "T", the fourth argument is relative. When the third argument is "T", the fourth argument is in beats otherwise it is in milliseconds.
/SYN/IDx/SNAP,Fi The snap pedal. When the second argument is set to anything higher than 0, voice-on messages with a set frequency (and not a note number) snap to the nearest note of the used note map (see /SYN/IDx/TUNING) when they are closer than the second argument's value in cents. When the first argument is "T", the second argument is relative. You might head over to http://www.hakenaudio.com/Continuum/html/operation/Pitch.html to get a clue what is meant.
/SYN/IDx/ALLVOFF, Turns off all voices of the selected ID. Like a panic function.
/SYN/IDx/AUTOVOFF,i Turns off all voices which have not been turned off after a selected time. Time is passed in milliseconds. Set to 0 to turn off. (Also see /SYN/ID*/ACTIVESENS).
/SYN/IDx/ACTIVESENS,i Sets up active sensing as known from MIDI. The argument sets the maximum time interval in milliseconds in which "/SYN/ID*/ACTIVESENS,T" messages have to follow. Otherwise all voices are turned off by the synth. Set to 0 to turn off active sensing.
/SYN/IDx/RESETALLP, Resets all parameters to 0 or meaningful values respectively (panning to the middle, volume to the maximum etc.).
/SYN/IDx/LOCALCONTR,T Turns on/off the local control as known from MIDI. When set to "F", a hardware synth cannot be played by its own keyboard (but it sends out the messages), it only plays when it gets OSC messages from an external controller.
/SYN/IDx/SENDCONTR,T Turns on/off sending out OSC data to external devices from a synth. When set to "F", a hardware synth can still be played by both incoming OSC messages and its own keyboard, but it does not send out the messages.
/SYN/IDx/SYSTEMRESET, Resets the synth.
/SYN/IDx/TUNING,FFi (or just s) This is a more complex method. With it, either the base frequency of the used tone system can be changed, or a microtonal map for several note numbers can be set (as with the .tun file format), or an octave can be specified by a base frequency and cent offsets / ratios to the base frequency for several notes (as with the Scala scale format .scl). To just change the base frequency, use the type tag string ",FFi". When the first argument is "T", the third argument is relative. When the second argument is "F", the third argument passes cents (then the first argument is ignored and set to relative as cents are always relative), when it is "T" it passes Hz. To set a microtonal map for each note (or some notes), use a type tag string of just ",s". In the OSC string, different note mappings are separated by a ";". It is in the format "tun;Comment;int32 note number x=Hz;int32 note number y=Hz;...". So e.g. "tun;Some note scale;1073741823=261.625627;1073741824=277.182696" etc. The frequency can have up to 6 decimal places. When no frequency is set for a note (which is very likely with 2^31 notes), it remains at its old frequency. When a synth has received such a microtonal map, the base frequency cannot be changed. But all notes can still be offset by cents (using "/SYN/IDx/TUNING,TFi"). To send a Scala-like scale, also use a type tag string of just ",s". In the OSC string the first cell (as above, cells are separated by a ";") is "scl", the second a name for the scale or just a comment, the third cell tells how many notes are in the scale and the following cells describe those notes either as ratios to the base 1/1 or as cent offsets to the base frequency. The base frequency (1/1) maps to note number 1073741823, and the notes specified in the string map to the following note numbers in order. After that, the pattern is repeated one octave higher (and, below the base, one octave lower) and so on.
So when the OSC string "scl;3-Tone-Scale;2;3/2;203.91" is received, this maps note 1073741823 to the base frequency (let us say it is 200Hz), note 1073741824 to 300Hz, note 1073741825 to 225Hz. Following notes are mapped to octaves of these notes respectively, so note 1073741826 maps to 400Hz, note 1073741827 to 600Hz and so on. Same for notes below, note 1073741822 maps to 112.5Hz, note 1073741821 to 150Hz and note 1073741820 to 100Hz. You see that the notes are not sorted by frequency, just by the order in the string. When the base frequency is changed with any "/SYN/IDx/TUNING,FFi" message, all note frequencies should get updated at the recipient. Scala scale format definition: http://www.xs4all.nl/~huygensf/scala/scl_format.html
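Parsing the two string formats could look roughly like this. All function names are hypothetical. Telling ratios and cent values apart by the presence of a "." is an assumption borrowed from the Scala .scl convention; the proposal itself does not say how the recipient distinguishes them.

```python
MIDDLE_C = 0x3FFFFFFF  # note number the base frequency (1/1) maps to


def parse_step(cell: str) -> float:
    """Turn one scale cell into a frequency ratio (assumed Scala rule)."""
    if "." in cell:  # contains a period: cent offset
        return 2.0 ** (float(cell) / 1200.0)
    numerator, _, denominator = cell.partition("/")
    return int(numerator) / int(denominator or 1)


def parse_scl(text: str) -> list:
    """Return the ratios of one octave, starting with 1/1."""
    kind, _name, count, *cells = text.split(";")
    if kind != "scl" or len(cells) != int(count):
        raise ValueError("not a well-formed scl string")
    return [1.0] + [parse_step(cell) for cell in cells]


def parse_tun(text: str) -> dict:
    """Return a {note number: Hz} map from a tun string."""
    kind, _comment, *cells = text.split(";")
    if kind != "tun":
        raise ValueError("not a tun string")
    mapping = {}
    for cell in cells:
        note, _, hz = cell.partition("=")
        mapping[int(note)] = float(hz)
    return mapping


def scl_frequency(note: int, base_hz: float, ratios: list) -> float:
    """Repeat the scale pattern octave by octave around MIDDLE_C."""
    octave, degree = divmod(note - MIDDLE_C, len(ratios))
    return base_hz * ratios[degree] * 2.0 ** octave
```

With `ratios = parse_scl("scl;3-Tone-Scale;2;3/2;203.91")` and a base of 200 Hz, `scl_frequency` reproduces the mapping described above (300 Hz, roughly 225 Hz, 400 Hz going up, and roughly 112.5 Hz, 150 Hz, 100 Hz going down).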
Also single IDs can be controlled for more complex setups. But in most cases, /SYN/ID*/ will be used.
/SYN/ID/START,(FFi)* Starts playing, either at 0 or at a starting position which can be passed too. When the first argument is "T", the third argument is relative. When the second argument is "T", the third argument is in beats otherwise it is in milliseconds.
/SYN/ID/STOP,* Stops playback.
/SYN/ID/CONTINUE,* Continues playback.
/SYN/ID/SETPOS,FFi* (alternatively for markers NFs, NFi or NTi) Sets the position of the playback cursor. When the first argument is "T", the third argument is relative. When it is "F", the third is absolute. When the second argument then is "T", the third argument is in beats; when it is "F" it is in milliseconds. The first argument can also be "N"; then no absolute or relative time is passed but a marker. This is a bit more complicated. When the second argument is "T", a relative marker can be jumped to with an int32 value as third argument (an integer value of +1 means next marker, -1 means previous marker and so on). When the second argument is "F", the third argument can either be an int32 specifying an absolute marker or an OSC string containing the marker name. (stupid system?) There cannot be more than one marker with the same name.
/SYN/ID/LOOP,T* (De)Activates the loop (= cycle mode) in the sequencer.
/SYN/ID/LOOPL,FFi* Sets the left loop marker. When the first argument is "T", the third argument is relative. When the second argument is "T", the third argument is in beats otherwise it is in milliseconds.
/SYN/ID/LOOPR,FFi* Sets the right loop marker. When the first argument is "T", the third argument is relative. When the second argument is "T", the third argument is in beats otherwise it is in milliseconds.
/SYN/ID/RECORD,T* Starts recording when the first argument is "T". Otherwise it stops recording (but not playback).
/SYN/ID/MARKER,T(sFFi)* Sets or deletes a marker. When the first argument is "T" this sets a marker either at the actual position or at the optionally passed one. When the first argument is "F" this deletes a marker with the given name or at the passed position. The optional second argument sets the name for the new marker / marker to delete. When the third argument is "T", the fifth argument is relative. When the fourth argument is "T", the fifth argument is in beats otherwise it is in milliseconds. For adding or deleting just a marker on the set position, the string can be skipped by replacing its "s" by an "N" in the OSC type tag string. When creating a new named marker and another marker with the same name already exists, a suffix number should be added by the recipient.
/SYN/ID/PATTERN,T* Switches to pattern mode when argument is "T", to song mode when "F".
/SYN/ID/TEMPO,Fi* Sets the tempo in BPM. When the first argument is "T", the second argument is relative.
/SYN/ID/RAW,b* Used for bulk transfer data as sending samples, firmware-updates etc.
There is already one implementation of the ideas for the Processing language from Alvaro: http://code.google.com/p/synoscp5/
JGGlatt took some of the ideas of SynOSCopy and built his own solution: http://home.roadrunner.com/~jgglatt/OSC_Synth/osc_synth.htm
add other implementations / similar stuff here
Connect a Controller to a specific Synth
Sort of a Todo.
- A controller can connect to a) a network address + port + protocol and b) one (or more if supported) of the IDs.
- It should be easy & plug&playable => The maximum work should be that the controller asks which synth it should control.
- The parameters of the controller should be easily remappable
- and maybe mappable to multiple destinations
One idea would be the use of something like this mapping tool: http://idmil.org/software/mappingtools but with an additional automatic mappingt
An expansion of this idea would change SynOSCopy completely. The basic (incomplete) thought would be: a synth has an XML file which contains every parameter (like in the mappingtools) plus a classification for every parameter. A controller also has such an XML file (or has it built in) with its output parameters plus a standard mapping (the destination is a classification) that happens automagically. (What is the thing with the IDs)
Really use OSC bundles! If you have not got a quartz clock on your external gear, you can set the OSC time tag to the special "immediately" value (63 zero bits followed by a one), which makes the receiver invoke the contained methods immediately.
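A bundle that should be executed on arrival could be packed as in this sketch. It follows the OSC 1.0 bundle layout ("#bundle" string, 64-bit time tag, then size-prefixed elements), where the time tag value 1 (63 zero bits followed by a one) is reserved for "immediately". The function name `osc_bundle` is hypothetical, and the elements are assumed to be already encoded OSC messages.

```python
import struct

IMMEDIATELY = 1  # 63 zero bits followed by a one


def osc_bundle(elements: list, timetag: int = IMMEDIATELY) -> bytes:
    """Wrap already encoded OSC messages (bytes) into one bundle."""
    out = b"#bundle\x00" + struct.pack(">Q", timetag)
    for element in elements:
        # Each element is preceded by its size as a big-endian int32.
        out += struct.pack(">i", len(element)) + element
    return out
```

Gear with a reliable clock would pass a real NTP-style time tag instead of `IMMEDIATELY`.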
OSC is far from plug-and-play usable; some sort of "OSC-Hello" protocol is missing which lets connected devices find each other right after connecting, without having to set up an OSC server port for a TCP or UDP connection. I think there is something like "OSC-Bonjour" now, but I have not checked it out yet.
An OSC file format could be just a huge OSC bundle which contains many smaller timetagged (or beat-tagged?) OSC bundles which contain the control messages listed in this document. Additionally song information as tempo, beat, etc. should be included too. Or maybe in XML format (as MIDI XML)?
Do a better readable version of this text and explain each method's arguments in an easier understandable table.
- TUNING: support for .tun and .scl formats
- Integrating MTC to time arguments (!)
- Something like MIDI Clock?
- More sequencer control methods?
- Time signature support?
- Polyphonic Portamento as in FL Studio?
- A GETMARKERS method to let devices share marker positions
- SETPOS: an argument to jump immediately or in tempo context?
18.8.2010 imported from wetpaint
20.9.2007 v0.9 Explained all data types, introduced markers and record method, started with tuning format explanation. Removed additional optional arguments from voice on and off as it was getting too complicated.
18.9.2007 v0.9 Finally got my head around and refined this proposal to a readable form. Still much to do about defining passed data types and so on.