Table of Contents
1. Overview
2. Frame header
2.1. version
2.2. flags
2.3. stream
2.4. opcode
2.5. length
3. Notations
4. Messages
4.1. Requests
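4.1.1. STARTUP
4.1.2. CREDENTIALS
4.1.3. OPTIONS
4.1.4. QUERY
4.1.5. PREPARE
4.1.6. EXECUTE
4.1.7. REGISTER
4.2. Responses
4.2.1. ERROR
4.2.2. READY
4.2.3. AUTHENTICATE
4.2.4. SUPPORTED
4.2.5. RESULT
4.2.5.1. Void
4.2.5.2. Rows
4.2.5.3. Set_keyspace
4.2.5.4. Prepared
4.2.6. EVENT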
5. Compression
6. Error codes
1. Overview
The CQL binary protocol is a frame based protocol. Frames are defined as:
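    0         8        16        24        32
    +---------+---------+---------+---------+
    | version |  flags  | stream  | opcode  |
    +---------+---------+---------+---------+
    |                length                 |
    +---------+---------+---------+---------+
    |                                       |
    .            ...  body ...              .
    .                                       .
    .                                       .
    +---------------------------------------+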
The protocol is big-endian (network byte order).
Each frame contains a fixed size header (8 bytes) followed by a variable size
body. The header is described in Section 2. The content of the body depends
on the header opcode value (the body can in particular be empty for some
opcode values). The list of allowed opcodes is defined in Section 2.4 and the
details of each corresponding message are described in Section 4.
The protocol distinguishes 2 types of frames: requests and responses. Requests
are the frames sent by clients to the server; responses are the ones sent
by the server. Note however that while communication is initiated by the
client, with the server responding to requests, the protocol is likely to add
server pushes in the future, so a response does not necessarily come right
after a client request.
Note to client implementors: client libraries should always assume that the
body of a given frame may contain more data than what is described in this
document. It will however always be safe to ignore the remainder of the frame
body in such cases. This makes it possible to extend the protocol with
optional features without needing to change the protocol version.
2. Frame header
2.1. version
The version is a single byte that indicates both the direction of the message
(request or response) and the version of the protocol in use. The most
significant bit of version defines the direction of the message: 0 indicates a
request, 1 indicates a response. This can be useful for protocol analyzers to
distinguish the nature of the packet from the direction in which it is moving.
The rest of that byte is the protocol version (1 for the protocol defined in
this document). In other words, for this version of the protocol, version will
have one of the following values:
0x01 Request frame for this protocol version
0x81 Response frame for this protocol version
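As an illustration of the bit layout described above, the following Python
sketch splits a version byte into its direction and protocol version, and
builds one back. The function and constant names are hypothetical and not part
of the protocol.

```python
REQUEST = "request"
RESPONSE = "response"
DIRECTION_MASK = 0x80  # most significant bit: 0 = request, 1 = response
VERSION_MASK = 0x7F    # remaining seven bits: protocol version


def decode_version(byte):
    """Split a version byte into (direction, protocol_version)."""
    direction = RESPONSE if byte & DIRECTION_MASK else REQUEST
    return direction, byte & VERSION_MASK


def encode_version(direction, protocol_version=1):
    """Build the version byte for the given direction and protocol version."""
    if not 0 <= protocol_version <= VERSION_MASK:
        raise ValueError("protocol version must fit in 7 bits")
    flag = DIRECTION_MASK if direction == RESPONSE else 0
    return flag | protocol_version
```

With these helpers, decode_version(0x81) yields the response direction and
protocol version 1, matching the second value listed above.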
2.2. flags
Flags applying to this frame. Currently only one bit (the least significant
one, masked by 0x01) has a meaning: it indicates whether the frame body is
compressed. The compression algorithm to use must have been set up beforehand
through the STARTUP message (which thus cannot be compressed; Section 4.1.1).
The remaining flag bits are reserved for future use.
2.3. stream
A frame has a stream id (one signed byte). When sending request messages, this
stream id must be set by the client to a non-negative value (negative stream
ids are reserved for streams initiated by the server; currently all EVENT
messages (Section 4.2.6) have a stream id of -1). If a client sends a request
message with the stream id X, it is guaranteed that the stream id of the
response to that message will be X.
This makes it possible to deal with the asynchronous nature of the protocol.
If a client sends multiple messages simultaneously (without waiting for
responses), there is no guarantee on the order of the responses. For instance,
if the client writes REQ_1, REQ_2, REQ_3 on the wire (in that order), the
server might respond to REQ_3 (or REQ_2) first. Assigning different stream ids
to these 3 requests allows the client to tell which request a received
response corresponds to. As there can only be 128 different simultaneous
streams, it is up to the client to reuse stream ids.
Note that clients are free to use the protocol synchronously (i.e. wait for
the response to REQ_N before sending REQ_N+1). In that case, the stream id
can be safely set to 0.
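One possible way for a client to manage stream ids is a small pool that hands
out a free id per in-flight request and takes it back when the response
arrives. The sketch below assumes ids 0 to 127 are usable (128 simultaneous
streams); the class name and its methods are hypothetical.

```python
class StreamIdPool:
    """Hands out stream ids 0..127 and recycles them once a response arrives."""

    def __init__(self):
        # Stored in reverse so that pop() returns the lowest free id first.
        self._free = list(range(127, -1, -1))
        self._pending = {}

    def acquire(self, request):
        """Reserve a stream id for a request that is about to be sent."""
        if not self._free:
            raise RuntimeError("all 128 stream ids are in flight")
        stream_id = self._free.pop()
        self._pending[stream_id] = request
        return stream_id

    def release(self, stream_id):
        """Match a response to its request and free the stream id for reuse."""
        request = self._pending.pop(stream_id)
        self._free.append(stream_id)
        return request
```

A response carrying stream id X is matched with release(X), whatever order
the server answers in.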
2.4. opcode
An integer byte that distinguishes the actual message:
0x00 ERROR
0x02 READY
0x07 QUERY
Messages are described in Section 4.
2.5. length
A 4 byte integer representing the length of the body of the frame (note:
currently a frame is limited to 256MB in length).
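The five header fields of Section 2 can be packed and unpacked with Python's
struct module. The following is a minimal sketch: it assumes the length field
is read as a signed big-endian 4-byte integer, and the helper names are
hypothetical.

```python
import struct

# version (unsigned byte), flags (unsigned byte), stream (signed byte),
# opcode (unsigned byte), length (4-byte integer), all big-endian.
HEADER = struct.Struct(">BBbBi")


def pack_header(version, flags, stream, opcode, length):
    """Serialize the fixed 8-byte frame header."""
    return HEADER.pack(version, flags, stream, opcode, length)


def split_frame(data):
    """Return ((version, flags, stream, opcode), body, remaining_bytes)."""
    version, flags, stream, opcode, length = HEADER.unpack_from(data)
    end = HEADER.size + length
    if len(data) < end:
        raise ValueError("incomplete frame: body is shorter than length")
    return (version, flags, stream, opcode), data[HEADER.size:end], data[end:]
```

Bytes following the body are returned separately, since they belong to the
next frame on the connection.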
3. Notations
To describe the layout of the frame body for the messages in Section 4, we
define the following:
    [int]             A 4-byte integer.
    [short]           A 2-byte unsigned integer.
    [string]          A [short] n, followed by n bytes representing a UTF-8
                      string.
    [long string]     An [int] n, followed by n bytes representing a UTF-8
                      string.
    [string list]     A [short] n, followed by n [string].
    [bytes]           An [int] n, followed by n bytes if n >= 0. If n < 0,
                      no byte should follow and the value represented is
                      `null`.
    [option]          A pair of <id><value> where <id> is a [short]
                      representing the option id and <value> depends on that
                      option (and can be of size 0). The supported ids (and
                      the corresponding <value>) are described where the
                      option is used.
    [option list]     A [short] n, followed by n [option].
    [inet]            An address (IP and port) of a node. It consists of one
                      [byte] n, that represents the address size, followed by
                      n [byte] representing the IP address (in practice n can
                      only be either 4 (IPv4) or 16 (IPv6)), followed by one
                      [int] representing the port.
    [string map]      A [short] n, followed by n pairs <k><v> where <k> and
                      <v> are [string].
    [string multimap] A [short] n, followed by n pairs <k><v> where <k> is a
                      [string] and <v> is a [string list].
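A few of these notations can be sketched in Python as follows. This is only an
illustration: the helper names are hypothetical, and each reader returns the
decoded value together with the offset of the next unread byte.

```python
import struct


def pack_string(text):
    """[string]: a [short] length followed by that many UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack(">H", len(data)) + data


def pack_long_string(text):
    """[long string]: an [int] length followed by that many UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack(">i", len(data)) + data


def pack_bytes(value):
    """[bytes]: an [int] length and the bytes, or a negative length for null."""
    if value is None:
        return struct.pack(">i", -1)
    return struct.pack(">i", len(value)) + value


def pack_string_map(mapping):
    """[string map]: a [short] count followed by <k><v> [string] pairs."""
    out = struct.pack(">H", len(mapping))
    for key, value in mapping.items():
        out += pack_string(key) + pack_string(value)
    return out


def unpack_string(buf, offset=0):
    """Read a [string]; returns (text, next_offset)."""
    (n,) = struct.unpack_from(">H", buf, offset)
    start = offset + 2
    return buf[start:start + n].decode("utf-8"), start + n


def unpack_bytes(buf, offset=0):
    """Read a [bytes]; returns (value_or_None, next_offset)."""
    (n,) = struct.unpack_from(">i", buf, offset)
    start = offset + 4
    if n < 0:
        return None, start
    return buf[start:start + n], start + n
```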
4. Messages
4.1. Requests
Note that outside of their normal responses (described below), all requests
can get an ERROR message (Section 4.2.1) as response.
4.1.1. STARTUP
Initialize the connection. The server will respond by either a READY message
(in which case the connection is ready for queries) or an AUTHENTICATE message
(in which case credentials will need to be provided using CREDENTIALS).
This must be the first message of the connection, except for OPTIONS, which
can be sent before it to find out the options supported by the server. Once
the connection has been initialized, a client should not send any more STARTUP
messages.
The body is a [string map] of options. Possible options are:
- "CQL_VERSION": the version of CQL to use. This option is mandatory and
  currently, the only version supported is "3.0.0". Note that this is
  different from the protocol version.
- "COMPRESSION": the compression algorithm to use for frames (see Section 5).
  This is optional; if not specified, no compression will be used.
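For illustration, a STARTUP body could be built as below. This is a sketch
under the [string map] layout of Section 3; the function name is hypothetical
and the value passed for "COMPRESSION" is whatever algorithm name client and
server agree on.

```python
import struct


def _string(text):
    """Encode a [string]: a [short] length followed by UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack(">H", len(data)) + data


def startup_body(cql_version="3.0.0", compression=None):
    """Build the [string map] body of a STARTUP message."""
    options = {"CQL_VERSION": cql_version}  # mandatory option
    if compression is not None:
        options["COMPRESSION"] = compression  # optional option
    body = struct.pack(">H", len(options))
    for key, value in options.items():
        body += _string(key) + _string(value)
    return body
```

The resulting bytes form the frame body only; the 8-byte header of Section 2
still has to be prepended, with length set to len(body).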
4.1.2. CREDENTIALS
Provides credentials information for the purpose of identification. This
message comes as a response to an AUTHENTICATE message from the server, but
can be used later in the communication to change the authentication
information.
The body is a list of key/value pairs. It is a [short] n, followed by n
pairs of [string]. These key/value pairs are passed as is to the Cassandra
IAuthenticator, and thus the details of which information is needed depend on
that authenticator.
The response to a CREDENTIALS is a READY message (or an ERROR message).
4.1.3. OPTIONS
Asks the server to return which STARTUP options are supported. The body of an
OPTIONS message should be empty and the server will respond with a SUPPORTED
message.
4.1.4. QUERY
Performs a CQL query. The body of the message consists of a CQL query as a
[long string].
The server will respond to a QUERY message with a RESULT message, the content
of which depends on the query.
4.1.5. PREPARE
Prepare a query for later execution (through EXECUTE). The body consists of
the CQL query to prepare as a [long string].
The server will respond with a RESULT message with a `prepared` kind (0x0004,
see Section 4.2.5).
4.1.6. EXECUTE
Executes a prepared query. The body of the message must be:
    <id><n><value_1>...<value_n>
where:
- <id> is the prepared query ID. It's an [int] returned as a response to a
  PREPARE message.
- <n> is a [short] indicating the number of following values.
- <value_1>...<value_n> are the [bytes] to use for bound variables in the
  prepared query.
The response from the server will be a RESULT message.
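The EXECUTE body described above can be assembled as in the following sketch.
The function name is hypothetical, and each bound value is assumed to be
already serialized to raw bytes (or None for a null value).

```python
import struct


def execute_body(prepared_id, values):
    """Build an EXECUTE body: an [int] id, a [short] n, then n [bytes]."""
    body = struct.pack(">iH", prepared_id, len(values))
    for value in values:
        if value is None:
            body += struct.pack(">i", -1)  # negative length encodes null
        else:
            body += struct.pack(">i", len(value)) + value
    return body
```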
4.1.7. REGISTER
Registers this connection to receive some types of events. The body of the
message is a [string list] representing the event types to register for. See
Section 4.2.6 for the list of valid event types.
The response to a REGISTER message will be a READY message.
Please note that if a client driver maintains multiple connections to a
Cassandra node and/or connections to multiple nodes, it is advised to
dedicate a handful of connections to receive events, but to *not* register
for events on all connections, as this would only result in receiving
the same event messages multiple times, wasting bandwidth.
4.2. Responses
This section describes the content of the frame body for the different
responses. Please note that to make room for future evolution, clients should
tolerate extra information at the end of the frame body, beyond what is
described in this document, and should simply discard it.
4.2.1. ERROR
Indicates an error processing a request. The body of the message will be an
error code ([int]) followed by a [string] error message. Then, depending on
the exception, more content may follow. The error codes are defined in
Section 6, along with their additional content if any.
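As a sketch of the layout just described, the common part of an ERROR body
(code and message) can be read as follows, leaving any code-specific content
untouched for later decoding. The function name is hypothetical.

```python
import struct


def parse_error(body):
    """Return (code, message, rest) from an ERROR message body."""
    (code,) = struct.unpack_from(">i", body, 0)   # [int] error code
    (n,) = struct.unpack_from(">H", body, 4)      # [short] message length
    message = body[6:6 + n].decode("utf-8")
    return code, message, body[6 + n:]            # rest depends on the code
```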
4.2.2. READY
Indicates that the server is ready to process queries. This message will be
sent by the server either after a STARTUP message if no authentication is
required, or after a successful CREDENTIALS message.
The body of a READY message is empty.
4.2.3. AUTHENTICATE
Indicates that the server requires authentication. This will be sent following
a STARTUP message and must be answered by a CREDENTIALS message from the
client to provide authentication information.
The body consists of a single [string] indicating the full class name of the
IAuthenticator in use.
4.2.4. SUPPORTED
Indicates which STARTUP options are supported by the server. This message
comes as a response to an OPTIONS message.
The body of a SUPPORTED message is a [string multimap]. This multimap gives,
for each of the supported STARTUP options, the list of supported values.
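Decoding such a [string multimap] might look like the sketch below, which
returns a dictionary from option name to list of supported values. The helper
names are hypothetical.

```python
import struct


def _short(buf, pos):
    """Read a [short]; returns (value, next_offset)."""
    return struct.unpack_from(">H", buf, pos)[0], pos + 2


def _string(buf, pos):
    """Read a [string]; returns (text, next_offset)."""
    n, pos = _short(buf, pos)
    return buf[pos:pos + n].decode("utf-8"), pos + n


def parse_supported(body):
    """Decode a SUPPORTED body into {option_name: [supported values]}."""
    options = {}
    count, pos = _short(body, 0)
    for _ in range(count):
        key, pos = _string(body, pos)
        n_values, pos = _short(body, pos)  # start of the [string list]
        values = []
        for _ in range(n_values):
            value, pos = _string(body, pos)
            values.append(value)
        options[key] = values
    return options
```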
4.2.5. RESULT
The result to a query (QUERY, PREPARE or EXECUTE messages).
The first element of the body of a RESULT message is an [int] representing the
`kind` of result. The rest of the body depends on the kind. The kind can be
one of:
0x0001 Void: for results carrying no information.
0x0002 Rows: for results to select queries, returning a set of rows.
0x0003 Set_keyspace: the result to a `use` query.
0x0004 Prepared: result to a PREPARE message
The body for each kind (after the [int] kind) is defined below.
4.2.5.1. Void
The rest of the body for a Void result is empty. It indicates that a query was
successful without providing more information.
4.2.5.2. Rows
Indicates a set of rows. The rest of the body of a Rows result is:
    <metadata><rows_count><rows_content>
where:
- <metadata> is composed of:
    <flags><columns_count>[<global_table_spec>]<col_spec_1>...<col_spec_n>
  where:
  - <flags> is an [int]. The bits of <flags> provide information on the
    formatting of the remaining information. A flag is set if the bit
    corresponding to its `mask` is set. Supported flags are, given their
    mask:
      0x0001 Global_tables_spec: if set, only one table spec (keyspace
             and table name) is provided as <global_table_spec>. If not
             set, <global_table_spec> is not present.
  - <columns_count> is an [int] representing the number of columns selected
    by the query this result is for. It defines the number of <col_spec_i>
    elements and the number of elements for each row in <rows_content>.
  - <global_table_spec> is present if the Global_tables_spec flag is set in
    <flags>. If present, it is composed of two [string] representing the
    (unique) keyspace name and table name that the returned columns belong to.
  - <col_spec_i> specifies the columns returned in the query. There are
    <columns_count> such column specifications, each composed of:
      [<ksname><tablename>]<column_name><type>
    The initial <ksname> and <tablename> are two [string] that are only
    present if the Global_tables_spec flag is not set. The <column_name> is a
    [string] and <type> is an [option]; they correspond to the column name
    and type. The option for <type> is either a native type (see below),
    in which case the option has no value, or a 'custom' type, in which
    case the value is a [string] representing the fully qualified class
    name of the type represented. Valid option ids are:
0x0000 Custom: the value is a [string], see above.
0x0001 Ascii
0x0002 Bigint
0x0003 Blob
0x0004 Boolean
0x0005 Counter
0x0006 Decimal
0x0007 Double
0x0008 Float
0x0009 Int
0x000A Text
0x000B Timestamp
0x000C Uuid
0x000D Varchar
0x000E Varint
0x000F Timeuuid
0x0010 Inet
0x0020 List: the value is an [option], representing the type
of the elements of the list.
0x0021 Map: the value is two [option], representing the types of the
keys and values of the map
0x0022 Set: the value is an [option], representing the type
of the elements of the set
- <rows_count> is an [int] representing the number of rows present in this
result. Those rows are serialized in the <rows_content> part.
- <rows_content> is composed of <row_1>...<row_m> where m is <rows_count>.
Each <row_i> is composed of <value_1>...<value_n> where n is
<columns_count> and where <value_j> is a [bytes] representing the value
returned for the jth column of the ith row. In other words, <rows_content>
is composed of (<rows_count> * <columns_count>) [bytes].
4.2.5.3. Set_keyspace
The result to a `use` query. The body (after the kind [int]) is a single
[string] indicating the name of the keyspace that has been set.
4.2.5.4. Prepared
The result to a PREPARE message. The rest of the body of a Prepared result is:
    <id><metadata>
where:
- <id> is an [int] representing the prepared query ID.
- <metadata> is defined exactly as for a Rows RESULT (see the Rows kind
  above).
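To make the Rows layout in this section more concrete, here is a sketch of a
decoder for a RESULT body of kind Rows. It is only an illustration under the
layout described above: the helper names are hypothetical, column types are
returned as nested tuples of option ids, and cell values are left as raw
[bytes] without interpreting them according to their type.

```python
import struct


def _int(buf, pos):
    return struct.unpack_from(">i", buf, pos)[0], pos + 4


def _short(buf, pos):
    return struct.unpack_from(">H", buf, pos)[0], pos + 2


def _string(buf, pos):
    n, pos = _short(buf, pos)
    return buf[pos:pos + n].decode("utf-8"), pos + n


def _bytes(buf, pos):
    n, pos = _int(buf, pos)
    if n < 0:
        return None, pos  # negative length encodes null
    return buf[pos:pos + n], pos + n


def _type(buf, pos):
    """Read a <type> [option] as a nested tuple, e.g. (0x0020, (0x0009,))."""
    type_id, pos = _short(buf, pos)
    if type_id == 0x0000:  # Custom: a [string] class name follows
        name, pos = _string(buf, pos)
        return (type_id, name), pos
    if type_id in (0x0020, 0x0022):  # List or Set: one element type follows
        elem, pos = _type(buf, pos)
        return (type_id, elem), pos
    if type_id == 0x0021:  # Map: key type and value type follow
        key, pos = _type(buf, pos)
        val, pos = _type(buf, pos)
        return (type_id, key, val), pos
    return (type_id,), pos  # native type: no value


def parse_rows(body):
    """Decode a Rows RESULT body into (columns, rows)."""
    kind, pos = _int(body, 0)
    if kind != 0x0002:
        raise ValueError("not a Rows result")
    flags, pos = _int(body, pos)
    n_cols, pos = _int(body, pos)
    global_spec = None
    if flags & 0x0001:  # Global_tables_spec
        ks, pos = _string(body, pos)
        table, pos = _string(body, pos)
        global_spec = (ks, table)
    columns = []
    for _ in range(n_cols):
        if global_spec is None:
            ks, pos = _string(body, pos)
            table, pos = _string(body, pos)
        else:
            ks, table = global_spec
        name, pos = _string(body, pos)
        col_type, pos = _type(body, pos)
        columns.append((ks, table, name, col_type))
    n_rows, pos = _int(body, pos)
    rows = []
    for _ in range(n_rows):
        row = []
        for _ in range(n_cols):
            value, pos = _bytes(body, pos)
            row.append(value)
        rows.append(row)
    return columns, rows
```

Any bytes left after the last row are ignored, in line with the note that
clients should discard extra information at the end of a frame body.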
4.2.6. EVENT
An event pushed by the server. A client will only receive events for the
types it has registered for with REGISTER. The body of an EVENT message
starts with a [string] representing the event type. The rest of the message
depends on the event type. The valid event types are:
- "TOPOLOGY_CHANGE": events related to change in the cluster topology.
Currently, events are sent when new nodes are added to the cluster, and
when nodes are removed. The body of the message (after the event type)
consists of a [string] and an [inet], corresponding respectively to the
type of change ("NEW_NODE" or "REMOVED_NODE") followed by the address of
the new/removed node.
- "STATUS_CHANGE": events related to change of node status. Currently,
up/down events are sent. The body of the message (after the event type)
consists of a [string] and an [inet], corresponding respectively to the
type of status change ("UP" or "DOWN") followed by the address of the
concerned node.
All EVENT messages have a stream id of -1 (Section 2.3).
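Both event types above share the same body shape (event type, change type,
then an [inet]), so a single decoder can handle them. The sketch below uses
hypothetical helper names and Python's ipaddress module to render the address;
the port used in any example is arbitrary.

```python
import ipaddress
import struct


def _string(buf, pos):
    """Read a [string]; returns (text, next_offset)."""
    (n,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    return buf[pos:pos + n].decode("utf-8"), pos + n


def parse_event(body):
    """Return (event_type, change, ip_address_text, port) from an EVENT body."""
    event_type, pos = _string(body, 0)
    change, pos = _string(body, pos)
    size = body[pos]  # one byte: 4 for IPv4, 16 for IPv6
    pos += 1
    address = ipaddress.ip_address(body[pos:pos + size])
    pos += size
    (port,) = struct.unpack_from(">i", body, pos)
    return event_type, change, str(address), port
```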
5. Compression
Frame compression is supported by the protocol, but only the frame body
is compressed (the frame header should never be compressed).
Before compression can be used, client and server must agree on a compression
algorithm, which is done in the STARTUP message. As a consequence, a STARTUP
message must never be compressed. However, once the STARTUP frame has been
received by the server, any subsequent frame can be compressed (including the
response to the STARTUP request). Frames do not have to be compressed,
however, even if compression has been agreed upon (a server may choose to
compress only frames above a certain size, at its discretion). A frame body
should be compressed if and only if the compressed flag (see Section 2.2) is
set.
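The rule that the flag, not the negotiation alone, decides whether a body is
compressed can be sketched as follows. The function names and the min_size
threshold are hypothetical, and the compression functions are passed in as
parameters because this section does not name an algorithm; zlib could serve
as a stand-in when experimenting, purely because it ships with Python.

```python
COMPRESSED_FLAG = 0x01  # the flag bit described in Section 2.2


def encode_body(body, compress=None, min_size=0):
    """Return (flags, wire_body), compressing only when it is worthwhile.

    `compress` is None when no algorithm was agreed upon in STARTUP.
    """
    if compress is not None and len(body) >= min_size:
        return COMPRESSED_FLAG, compress(body)
    return 0x00, body


def decode_body(flags, wire_body, decompress=None):
    """Return the plain body, decompressing only if the flag is set."""
    if flags & COMPRESSED_FLAG:
        if decompress is None:
            raise ValueError("compressed frame but no algorithm was agreed")
        return decompress(wire_body)
    return wire_body
```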
6. Error codes
The supported error codes are described below:
0x0000 Server error: something unexpected happened. This indicates a
server-side bug.
0x000A Protocol error: some client message triggered a protocol
violation (for instance a QUERY message is sent before a STARTUP
one has been sent)
0x1000 Unavailable exception. The rest of the ERROR message body will be
    <cl><required><alive>
where:
    <cl> is a [string] representing the consistency level of the
         query having triggered the exception.
    <required> is an [int] representing the number of nodes that
               should be alive to respect <cl>.
    <alive> is an [int] representing the number of replicas that
            were known to be alive when the request was
            processed (since an unavailable exception has been
            triggered, <alive> < <required>).
0x1001 Overloaded: the request cannot be processed because the
coordinator node is overloaded
0x1002 Is_bootstrapping: the request was a read request but the
coordinator node is bootstrapping
0x1003 Truncate_error: error during a truncation.
0x1100 Write_timeout: Timeout exception during a write request. The rest
of the ERROR message body will be
    <cl><received><blockfor>
where:
    <cl> is a [string] representing the consistency level of the
         query having triggered the exception.
    <received> is an [int] representing the number of nodes having
               acknowledged the request.
    <blockfor> is the number of replicas whose acknowledgement is
               required to achieve <cl>.
0x1200 Read_timeout: Timeout exception during a read request. The rest
of the ERROR message body will be
    <cl><received><blockfor><data_present>
where:
    <cl> is a [string] representing the consistency level of the
         query having triggered the exception.
    <received> is an [int] representing the number of nodes having
               answered the request.
    <blockfor> is the number of replicas whose response is
               required to achieve <cl>. Please note that it is
               possible to have <received> >= <blockfor> if
               <data_present> is false, and also in the (unlikely)
               case where <cl> is achieved but the coordinator node
               times out while waiting for read-repair.
    <data_present> is a single byte. If its value is 0, it means
                   the replica that was asked for data has not
                   responded. Otherwise, the value is non-zero.
0x2000 Syntax_error: The submitted query has a syntax error.
0x2100 Unauthorized: The logged-in user doesn't have the right to perform
the query.
0x2200 Invalid: The query is syntactically correct but invalid.
0x2300 Config_error: The query is invalid because of some configuration issue
0x2400 Already_exists: The query attempted to create a keyspace or a
table that was already existing. The rest of the ERROR message
body will be <ks><table> where:
<ks> is a [string] representing either the keyspace that
already exists, or the keyspace in which the table that
already exists is.
<table> is a [string] representing the name of the table that
already exists. If the query was attempting to create a
keyspace, <table> will be present but will be the empty string.
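The code-specific content listed in this section could be decoded as in the
sketch below, which takes the error code and the bytes that follow the error
message. The function name is hypothetical, <blockfor> is assumed to be an
[int] (the text above does not state its type), and the consistency level
names used when trying it out are only illustrative.

```python
import struct


def _string(buf, pos):
    """Read a [string]; returns (text, next_offset)."""
    (n,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    return buf[pos:pos + n].decode("utf-8"), pos + n


def parse_error_details(code, rest):
    """Decode the additional ERROR content for the codes that define some."""
    if code == 0x1000:  # Unavailable: <cl><required><alive>
        cl, pos = _string(rest, 0)
        required, alive = struct.unpack_from(">ii", rest, pos)
        return {"cl": cl, "required": required, "alive": alive}
    if code in (0x1100, 0x1200):  # Write_timeout / Read_timeout
        cl, pos = _string(rest, 0)
        # Assumption: <blockfor> is an [int], like <received>.
        received, blockfor = struct.unpack_from(">ii", rest, pos)
        details = {"cl": cl, "received": received, "blockfor": blockfor}
        if code == 0x1200:  # Read_timeout adds the <data_present> byte
            details["data_present"] = rest[pos + 8] != 0
        return details
    if code == 0x2400:  # Already_exists: <ks><table>
        ks, pos = _string(rest, 0)
        table, pos = _string(rest, pos)
        return {"ks": ks, "table": table}
    return {}  # other codes carry no additional content
```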