Ninjam Protocol

wahjam edited this page Nov 10, 2011 · 6 revisions

This document describes the Ninjam network protocol used for online music collaboration. The protocol is not standardized; this document was written by studying the source code of the Ninjam curses client 0.01a and server 0.06.

Overview

A Ninjam session is hosted by a server which must be reachable by all clients who wish to join. Clients authenticate, exchange control messages, and transfer audio with the server. The server is used for all communication and clients never connect to each other directly. This makes the protocol friendly to firewalls and Network Address Translation (NAT) found in broadband routers.

Upon connecting to the server a client must authenticate. Accounts can be password-protected, but a server may also permit anonymous access without credentials.

An authenticated client can share a number of audio channels with other participants. Each channel typically represents a digital audio source such as a recording interface or a software instrument.

Time is divided into intervals which allow each client to upload audio that other clients will receive and play when the next interval starts. Therefore clients do not play audio together in real-time; audio from remote clients is always delayed by one interval. This design recognizes the fact that latency on the internet is too high for real-time audio collaboration.

The tempo is defined by the Beats Per Minute (bpm) and Beats Per Interval (bpi) session settings. Clients typically offer metronome functionality based on the bpm setting to enable musicians to play in time.
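
The relationship between the two settings can be sketched in Python. This assumes an interval lasts exactly bpi beats at the session bpm; the function name is hypothetical and does not come from the Ninjam source.

```python
def interval_seconds(bpm: int, bpi: int) -> float:
    """Length of one interval in seconds, assuming it spans exactly bpi beats."""
    # One beat lasts 60 / bpm seconds, and an interval holds bpi beats.
    return bpi * 60.0 / bpm
```

For example, a session at 120 bpm with 16 bpi would give 8-second intervals, so remote audio would be heard 8 seconds after it was played.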

Protocol

The protocol uses a TCP network connection. All multi-byte integer values are little-endian: the least significant byte comes first.

Message Header

Client and server messages start with the following header:

Offset Type     Field
0x0    uint8_t  Type
0x1    uint32_t Length (bytes, not including header)
0x5    ...      Payload

Each supported command in the protocol has a Type value. All message types, their payload layout, and their semantics are described below.
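
A minimal Python sketch of framing and unframing messages with this header. The function names are hypothetical; only the 5-byte layout comes from the table above.

```python
import struct

# Header layout: uint8 type followed by uint32 payload length, little-endian.
HEADER = struct.Struct("<BI")

def pack_message(msg_type: int, payload: bytes = b"") -> bytes:
    """Prefix a payload with the 5-byte message header."""
    return HEADER.pack(msg_type, len(payload)) + payload

def unpack_message(buf: bytes):
    """Return (msg_type, payload, remaining_bytes), or None if buf is incomplete."""
    if len(buf) < HEADER.size:
        return None
    msg_type, length = HEADER.unpack_from(buf, 0)
    end = HEADER.size + length
    if len(buf) < end:
        return None
    return msg_type, buf[HEADER.size:end], buf[end:]
```

Because TCP is a byte stream, a receiver would typically append incoming data to a buffer and call unpack_message repeatedly until it returns None.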

Message Types

Server Auth Challenge (0x00)

The first message the server sends when it accepts a new client connection. The payload layout is:

Offset Type       Field
0x0    uint8_t[8] Challenge
0x8    uint32_t   Server Capabilities
0xc    uint32_t   Protocol Version
0x10   ...        License Agreement (NUL-terminated)

The Protocol Version field should contain 0x00020000.

If the Server Capabilities field has bit 0 set then the License Agreement is present.

Bits 8-15 of the Server Capabilities field contain the client keepalive interval in seconds. The client sends a Client Keepalive message if it has sent no other message for that interval.

The client responds with the Client Auth User message.
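
A Python sketch of decoding this payload. The function name and the returned dictionary keys are hypothetical, and decoding the license text as UTF-8 is an assumption, since the text encoding is not documented here.

```python
import struct

def parse_auth_challenge(payload: bytes) -> dict:
    """Decode a Server Auth Challenge payload into its fields."""
    challenge = payload[:8]
    capabilities, version = struct.unpack_from("<II", payload, 8)
    license_text = None
    if capabilities & 1:
        # Bit 0 set: a NUL-terminated license agreement follows at offset 0x10.
        end = payload.index(b"\0", 16)
        # Assumption: the text is UTF-8 (not stated in this document).
        license_text = payload[16:end].decode("utf-8", errors="replace")
    return {
        "challenge": challenge,
        "capabilities": capabilities,
        "keepalive_seconds": (capabilities >> 8) & 0xFF,  # bits 8-15
        "protocol_version": version,
        "license": license_text,
    }
```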

Server Auth Reply (0x01)

Server reply to Client Auth User. The payload layout is:

Offset Type    Field
0x0    uint8_t Flag
0x1    ...     Error Message (NUL-terminated)
x+0x0  uint8_t Max Channels

The Error Message field is optional. If it is omitted then the Max Channels field is also missing.

If authentication succeeded then Flag field bit 0 is set. On success the Error Message field may be present and contains an updated username.

If authentication failed then Error Message may be present and contains a human-readable string describing the error.

The Max Channels field is the maximum number of channels per user.

The client responds with Client Set Channel Info.
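
A sketch of parsing the reply in Python, treating both trailing fields as optional as described above. The function name and result keys are hypothetical, and UTF-8 decoding is an assumption.

```python
def parse_auth_reply(payload: bytes) -> dict:
    """Decode a Server Auth Reply payload; trailing fields may be absent."""
    flag = payload[0]
    message = None
    max_channels = None
    end = payload.find(b"\0", 1)
    if end != -1:
        # On success this holds the updated username, on failure the error text.
        message = payload[1:end].decode("utf-8", errors="replace")
        if end + 1 < len(payload):
            max_channels = payload[end + 1]
    return {
        "success": bool(flag & 1),
        "message": message,
        "max_channels": max_channels,
    }
```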

Server Config Change Notify (0x02)

Informs clients that bpm/bpi have changed. The payload layout is:

Offset Type     Field
0x0    uint16_t Beats Per Minute
0x2    uint16_t Beats Per Interval

Changing bpm/bpi takes effect on the next interval.
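
Decoding this payload takes a single unpack call. A Python sketch, with a hypothetical function name:

```python
import struct

def parse_config_change(payload: bytes):
    """Return (bpm, bpi) from a Server Config Change Notify payload."""
    bpm, bpi = struct.unpack_from("<HH", payload, 0)
    return bpm, bpi
```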

Server Userinfo Change Notify (0x03)

Informs clients that a new user has joined or channels have changed. The payload layout is:

Offset Type    Field
0x0    uint8_t Active
0x1    uint8_t Channel Index
0x2    int16_t Volume (dB gain, 0=0dB, 10=1dB, -30=-3dB, etc)
0x4    int8_t  Pan [-128, 127]
0x5    uint8_t Flags
0x6    ...     Username (NUL-terminated)
x+0x0  ...     Channel name (NUL-terminated)

There may be one or more channel changes in a single message.

When a user logs out all their channels will be deactivated.
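
Since one message may carry several channel changes, a parser would loop until the payload is exhausted. The Python sketch below assumes entries are simply concatenated with no separator; the names are hypothetical and UTF-8 decoding is an assumption.

```python
import struct

def read_cstring(buf: bytes, offset: int):
    """Read a NUL-terminated string; return (text, offset after the NUL)."""
    end = buf.index(b"\0", offset)
    return buf[offset:end].decode("utf-8", errors="replace"), end + 1

def parse_userinfo_change(payload: bytes) -> list:
    """Decode every channel change entry in a Server Userinfo Change Notify."""
    entries = []
    offset = 0
    while offset < len(payload):
        # Fixed part: active, channel index, volume (int16), pan (int8), flags.
        active, channel, volume, pan, flags = struct.unpack_from("<BBhbB", payload, offset)
        offset += 6
        username, offset = read_cstring(payload, offset)
        channel_name, offset = read_cstring(payload, offset)
        entries.append({
            "active": bool(active),
            "channel_index": channel,
            "volume_db": volume / 10.0,  # wire value is in tenths of a dB
            "pan": pan,
            "flags": flags,
            "username": username,
            "channel_name": channel_name,
        })
    return entries
```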

Server Download Interval Begin (0x04)

Informs clients of a new interval. The payload layout is:

Offset Type        Field
0x0    uint8_t[16] GUID (binary)
0x10   uint32_t    Estimated Size
0x14   uint8_t[4]  FourCC
0x18   uint8_t     Channel Index
0x19   ...         Username (NUL-terminated)

If the GUID field is zero then the download should be stopped.

If the FourCC field is zero then the download is complete.

If the FourCC field contains "OGGv" then this is a valid Ogg Vorbis encoded download.
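
The three rules above can be sketched as a small classifier in Python. The function name and the "action" labels are hypothetical, the "unknown" case for any other FourCC is an addition for illustration, and the sketch assumes the FourCC bytes appear on the wire in the order "OGGv".

```python
import struct

def parse_download_begin(payload: bytes) -> dict:
    """Decode a Server Download Interval Begin payload and classify it."""
    guid = payload[0:16]
    (estimated_size,) = struct.unpack_from("<I", payload, 16)
    fourcc = payload[20:24]
    channel_index = payload[24]
    end = payload.find(b"\0", 25)
    if end == -1:
        end = len(payload)
    username = payload[25:end].decode("utf-8", errors="replace")

    if guid == bytes(16):
        action = "stop"          # zero GUID: stop the download
    elif fourcc == bytes(4):
        action = "complete"      # zero FourCC: download is complete
    elif fourcc == b"OGGv":
        action = "ogg_vorbis"    # a valid Ogg Vorbis download starts
    else:
        action = "unknown"       # not covered by this document
    return {
        "guid": guid,
        "estimated_size": estimated_size,
        "fourcc": fourcc,
        "channel_index": channel_index,
        "username": username,
        "action": action,
    }
```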

Server Download Interval Write (0x05)

Transfers audio data to clients. The payload layout is:

Offset Type        Field
0x0    uint8_t[16] GUID (binary)
0x10   uint8_t     Flags
0x11   ...         Audio Data

If the Flags field has bit 0 set then this download should be aborted.
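
A Python sketch of splitting this payload into its parts, following the reading of bit 0 given above. The function name and result keys are hypothetical.

```python
def parse_download_write(payload: bytes) -> dict:
    """Decode a Server Download Interval Write payload."""
    return {
        "guid": payload[:16],
        "abort": bool(payload[16] & 1),  # Flags bit 0, as described above
        "audio": payload[17:],           # everything after the flags byte
    }
```

A client would typically use the GUID to match the data against an earlier Server Download Interval Begin message.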

Client Auth User (0x80)

Client reply to Server Auth Challenge message. The payload layout is:

Offset Type        Field
0x0    uint8_t[20] Password Hash (binary hash value)
0x14   ...         Username (NUL-terminated)
x+0x0  uint32_t    Client Capabilities
x+0x4  uint32_t    Client Version

The Password Hash field is calculated as SHA1(SHA1(username + ":" + password) + challenge).

If the user acknowledged a license agreement from the server then Client Capabilities bit 0 is set.

The server responds with Server Auth Reply.
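
A Python sketch of the hash and the payload, using the standard hashlib module. It assumes the inner SHA1 result is used as its raw 20-byte digest (not a hex string) and that strings are encoded as UTF-8; neither detail is stated in this document. The function names are hypothetical.

```python
import hashlib
import struct

def password_hash(username: str, password: str, challenge: bytes) -> bytes:
    """SHA1(SHA1(username + ":" + password) + challenge) as 20 raw bytes."""
    # Assumption: the inner hash is the raw digest, not its hex representation.
    inner = hashlib.sha1((username + ":" + password).encode("utf-8")).digest()
    return hashlib.sha1(inner + challenge).digest()

def build_auth_user(username: str, password: str, challenge: bytes,
                    capabilities: int, client_version: int) -> bytes:
    """Build a Client Auth User payload."""
    return (password_hash(username, password, challenge)
            + username.encode("utf-8") + b"\0"
            + struct.pack("<II", capabilities, client_version))
```

The challenge argument is the 8-byte Challenge field from Server Auth Challenge, and capabilities would have bit 0 set once the user has accepted the license agreement.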

Client Set Usermask (0x81)

Enables or disables receiving from a channel. The payload layout is:

Offset Type     Field
0x0    ...      Username (NUL-terminated)
x+0x0  uint32_t Channel Flags

There may be one or more usermasks in a message.

The Channel Flags field is a bitmask where a set bit indicates the client wishes to receive from a channel.
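
A Python sketch of building this payload for several users at once. The function name is hypothetical, and the mapping of bit n to channel index n is an assumption that this document does not confirm.

```python
import struct

def build_set_usermask(masks: dict) -> bytes:
    """Build a Client Set Usermask payload from {username: channel_flags}."""
    out = bytearray()
    for username, channel_flags in masks.items():
        out += username.encode("utf-8") + b"\0"
        # Assumption: bit n of the mask selects channel index n.
        out += struct.pack("<I", channel_flags)
    return bytes(out)
```

For example, build_set_usermask({"alice": 0b101}) would ask to receive two of alice's channels and mute the rest.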

Client Set Channel Info (0x82)

Channel parameters from a client. The payload layout is:

Offset Type     Field
0x0    uint16_t Channel Parameter Size (bytes)
0x2    ...      Channel Name (NUL-terminated)
x+0x0  int16_t  Volume (dB gain, 0=0dB, 10=1dB, -30=-3dB, etc)
x+0x2  int8_t   Pan [-128, 127]
x+0x3  uint8_t  Flags
x+0x4  ...      Zero Padding (up to Channel Parameter Size)

Each channel has its parameters added to the message so there may be zero or more channels described in one message. If there are no channels then the payload is empty and zero length.

TODO document Flags field
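
A Python sketch of building this payload. It assumes that Channel Parameter Size counts the fixed-size fields that follow each channel name (volume, pan, and flags, 4 bytes in total) and that no zero padding is needed at that size; the table above leaves this open to interpretation. The names are hypothetical.

```python
import struct

# Assumed: volume (2 bytes) + pan (1 byte) + flags (1 byte), no padding.
CHANNEL_PARAM_SIZE = 4

def build_set_channel_info(channels: list) -> bytes:
    """Build a Client Set Channel Info payload.

    channels is a list of (name, volume_tenths_db, pan, flags) tuples.
    """
    if not channels:
        return b""  # no channels: empty, zero-length payload
    out = bytearray(struct.pack("<H", CHANNEL_PARAM_SIZE))
    for name, volume, pan, flags in channels:
        out += name.encode("utf-8") + b"\0"
        out += struct.pack("<hbB", volume, pan, flags)
    return bytes(out)
```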

Client Upload Interval Begin (0x83)

Informs the server of a new interval. The payload layout is:

Offset Type        Field
0x0    uint8_t[16] GUID (binary)
0x10   uint32_t    Estimated Size
0x14   uint8_t[4]  FourCC
0x18   uint8_t     Channel Index

Client Upload Interval Write (0x84)

Transfers audio data to the server. The payload layout is:

Offset Type        Field
0x0    uint8_t[16] GUID (binary)
0x10   uint8_t     Flags
0x11   ...         Audio Data

If bit 0 of the Flags field is set then the upload is complete.
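
A Python sketch of the two upload payloads. The function names are hypothetical. Using "OGGv" as the upload FourCC is an assumption carried over from the download description, and this document does not say how a client should generate GUIDs, so the caller supplies one.

```python
import struct

def build_upload_begin(guid: bytes, estimated_size: int, channel_index: int,
                       fourcc: bytes = b"OGGv") -> bytes:
    """Build a Client Upload Interval Begin payload (25 bytes)."""
    if len(guid) != 16 or len(fourcc) != 4:
        raise ValueError("guid must be 16 bytes and fourcc 4 bytes")
    return guid + struct.pack("<I", estimated_size) + fourcc + bytes([channel_index])

def build_upload_write(guid: bytes, audio: bytes, last: bool = False) -> bytes:
    """Build a Client Upload Interval Write payload."""
    flags = 1 if last else 0  # bit 0 marks the upload as complete
    return guid + bytes([flags]) + audio
```

A client would send one Begin message per channel per interval, followed by one or more Write messages carrying the same GUID, setting last=True on the final one.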

Chat Message (0xc0)

A chat message. The payload layout is:

Offset Type Field
0x0    ...  Command (NUL-terminated)
a+0x0  ...  Argument 1 (NUL-terminated)
b+0x0  ...  Argument 2 (NUL-terminated)
c+0x0  ...  Argument 3 (NUL-terminated)
d+0x0  ...  Argument 4 (NUL-terminated)

The client-to-server commands are:

  • MSG <text> -- broadcasts a message
  • PRIVMSG <username> <text> -- sends a message to a user
  • TOPIC <text> -- sets the server topic (requires permissions)
  • ADMIN topic|kick|bpm|bpi <value> -- administrator commands

The text of a client-to-server MSG may be !vote bpm|bpi <value>. This lets users vote to change the bpm or bpi setting.

The server-to-client commands are:

  • MSG <username> <text> -- a broadcast message
  • PRIVMSG <username> <text> -- a private message
  • TOPIC <username> <text> -- server topic change
  • JOIN <username> -- user enters server
  • PART <username> -- user leaves server
  • USERCOUNT <users> <maxusers> -- server status
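
A Python sketch of encoding and decoding chat payloads as a sequence of NUL-terminated strings. The function names are hypothetical and UTF-8 is an assumed encoding. The sketch sends only the fields supplied; whether unused trailing arguments must still be sent as empty strings is not stated in this document.

```python
def build_chat(*parts: str) -> bytes:
    """Build a Chat Message payload from a command and its arguments."""
    return b"".join(p.encode("utf-8") + b"\0" for p in parts)

def parse_chat(payload: bytes) -> list:
    """Split a Chat Message payload into [command, arg1, ...]."""
    fields = payload.split(b"\0")
    # The final NUL terminator leaves one empty trailing element; drop it.
    if fields and fields[-1] == b"":
        fields.pop()
    return [f.decode("utf-8", errors="replace") for f in fields]
```

For example, build_chat("MSG", "!vote bpm 120") would produce a vote request, and parse_chat on a server-to-client PRIVMSG payload would return the command, the username, and the text.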

Client Keepalive (0xfd)

Sent by the client when it has sent no other message for the keepalive interval given in Server Auth Challenge. It has no payload.
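
A Python sketch of the complete keepalive message and of the idle check. The names are hypothetical, and the sketch assumes the client records the time of its last send using a monotonic clock.

```python
import time

# Type 0xfd followed by a zero uint32 length; there is no payload.
KEEPALIVE_MESSAGE = bytes([0xFD, 0x00, 0x00, 0x00, 0x00])

def keepalive_due(last_send: float, interval_seconds: int, now: float = None) -> bool:
    """True if nothing has been sent for at least interval_seconds."""
    if now is None:
        now = time.monotonic()
    return now - last_send >= interval_seconds
```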