RDB dump data format

Baoyi Chen edited this page Jun 13, 2018 · 34 revisions

The E-BNF

RDB        =    'REDIS', $version, [AUX], [MODULE_AUX], {DBSELECT, [DBRESIZE], {RECORD}}, '0xFF', [$checksum];
RECORD     =    [EXPIRED], [IDLE | FREQ], KEY, VALUE;
DBSELECT   =    '0xFE', $length;
AUX        =    '0xFA', $string, $string;              (*Introduced in rdb version 7*)
MODULE_AUX =    '0xF7', $module2;                      (*Introduced in rdb version 9*)
DBRESIZE   =    '0xFB', $length, $length;              (*Introduced in rdb version 7*)
EXPIRED    =    ('0xFD', $second) | ('0xFC', $millisecond);
IDLE       =    '0xF8', $length;                       (*Introduced in rdb version 9*)
FREQ       =    '0xF9', $byte;                         (*1 byte integer, introduced in rdb version 9*)
KEY        =    $string;
VALUE      =    $value-type, ( $string
                             | $list
                             | $set
                             | $zset
                             | $hash
                             | $zset2                  (*Introduced in rdb version 8*)
                             | $module                 (*Introduced in rdb version 8*)
                             | $module2                (*Introduced in rdb version 8*)
                             | $hashzipmap
                             | $listziplist
                             | $setintset
                             | $zsetziplist
                             | $hashziplist
                             | $listquicklist          (*Introduced in rdb version 7*)
                             | $streamlistpacks);      (*Introduced in rdb version 9*)

RECORD

If EXPIRED is absent, the key value pair has no expiry.

DBSELECT

The $length represents the selected db number.

AUX

Represents auxiliary (aux) fields, stored as key value pairs.

  1. The first $string is the aux key.
  2. The second $string is the aux value.

For example:

auxKey='redis-ver', auxValue='4.0.2'
auxKey='redis-bits', auxValue='64'
auxKey='ctime', auxValue='1486560833'
auxKey='used-mem', auxValue='568376'
auxKey='aof-preamble', auxValue='0'
auxKey='repl-id', auxValue='ff8b0d184a8f5d5a849fa23fb0f832f13ecfcdb8'
auxKey='repl-offset', auxValue='0'
auxKey='repl-stream-db', auxValue='0'

DBRESIZE

  1. The first $length is the number of keys in the selected db.
  2. The second $length is the number of keys in the selected db that have an expiry set.

EXPIRED

The expiry is stored either in seconds ('0xFD', $second) or in milliseconds ('0xFC', $millisecond).

MODULE_AUX

Checks whether the specified module exists. Only the module name is compared; the module version is ignored.
It serves no practical purpose in rdb version 9. For more details, refer to $module2.

FREQ

A 1 byte integer that represents how frequently the key is accessed.

IDLE

The $length represents the idle time of the key in seconds.

$version

The 4 bytes store the version number of the rdb format. They are interpreted as ASCII characters and then converted to an integer using string to integer conversion.

30 30 30 33 # ASCII "0003", Version = 3
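As a minimal sketch of the header check, the snippet below validates the 'REDIS' magic string and converts the 4 ASCII version bytes to an integer. The function name parse_header is hypothetical and exists only for illustration.

```python
def parse_header(data: bytes) -> int:
    """Return the rdb version stored in the first 9 bytes of a dump."""
    if data[:5] != b"REDIS":
        raise ValueError("not an RDB dump: missing 'REDIS' magic string")
    # The version is 4 ASCII digits, for example b"0003" for version 3.
    return int(data[5:9].decode("ascii"))


print(parse_header(b"REDIS0003"))  # prints 3
```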

$millisecond

The 8 bytes represent the expiry of this key as a Unix timestamp with millisecond precision.

$second

The 4 bytes represent the expiry of this key as a Unix timestamp with second precision.
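The two expiry forms can be read with the struct module. This is only a sketch: it assumes both timestamps are signed little endian integers, which this page does not state, and read_expiry_ms is a hypothetical helper name. It normalises both forms to milliseconds.

```python
import io
import struct


def read_expiry_ms(opcode: int, stream) -> int:
    """Read the expiry that follows opcode 0xFD or 0xFC, in milliseconds."""
    if opcode == 0xFD:
        # $second: 4 bytes (assumed signed, little endian)
        (seconds,) = struct.unpack("<i", stream.read(4))
        return seconds * 1000
    if opcode == 0xFC:
        # $millisecond: 8 bytes (assumed signed, little endian)
        (millis,) = struct.unpack("<q", stream.read(8))
        return millis
    raise ValueError("not an EXPIRED opcode: 0x%02x" % opcode)


raw = struct.pack("<i", 1486560833)
print(read_expiry_ms(0xFD, io.BytesIO(raw)))  # prints 1486560833000
```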

$value-type

A 1 byte flag that indicates the encoding used to save the value.

  • 0 = $string
  • 1 = $list
  • 2 = $set
  • 3 = $zset
  • 4 = $hash
  • 5 = $zset2
  • 6 = $module
  • 7 = $module2
  • 9 = $hashzipmap
  • 10 = $listziplist
  • 11 = $setintset
  • 12 = $zsetziplist
  • 13 = $hashziplist
  • 14 = $listquicklist
  • 15 = $streamlistpacks

$length

$length is used to store the length of the next object in the stream. $length is a variable byte encoding designed to use as few bytes as possible.

This is how $length works:

  1. 1 byte is read from the stream, and its 2 most significant bits are examined.
  2. If the starting bits are 00, the next 6 bits represent the length.
  3. If the starting bits are 01, an additional byte is read from the stream. The combined 14 bits represent the length.
  4. If the starting bits are 11, the next object is encoded in a special format. The remaining 6 bits indicate the format. This encoding is generally used to store numbers as strings, or to store encoded strings. See $string.
  5. If the byte read is 0x80 (starting bits 10), the next 4 bytes represent the length.
  6. If the byte read is 0x81 (starting bits 10), the next 8 bytes represent the length.

As a result of this encoding:

  1. Numbers up to and including 63 can be stored in 1 byte
  2. Numbers up to and including 16383 can be stored in 2 bytes
  3. Numbers up to 2^32 - 1 can be stored in 5 bytes
  4. Numbers up to 2^64 - 1 can be stored in 9 bytes
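The decoding rules for $length can be sketched as follows. The 4 byte and 8 byte forms are assumed to be big endian (network byte order), which the text above does not spell out, and read_length is a hypothetical helper name. The function returns a pair: the decoded value and a flag telling whether the value names a special format rather than a length.

```python
import io
import struct


def read_length(stream):
    """Decode one $length. Returns (value, is_special_format)."""
    first = stream.read(1)[0]
    kind = first >> 6                  # the 2 most significant bits
    if kind == 0:                      # 00: length is in the low 6 bits
        return first & 0x3F, False
    if kind == 1:                      # 01: 6 bits + next byte = 14 bits
        second = stream.read(1)[0]
        return ((first & 0x3F) << 8) | second, False
    if kind == 3:                      # 11: special format, see $string
        return first & 0x3F, True
    if first == 0x80:                  # 10: 4 byte length (assumed big endian)
        return struct.unpack(">I", stream.read(4))[0], False
    if first == 0x81:                  # 10: 8 byte length (assumed big endian)
        return struct.unpack(">Q", stream.read(8))[0], False
    raise ValueError("unknown length prefix 0x%02x" % first)


print(read_length(io.BytesIO(b"\x41\x00")))  # prints (256, False)
```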

$string

Redis Strings are binary safe - which means you can store anything in them. They do not have any special end-of-string token. It is best to think of Redis Strings as a byte array.

There are three types of Strings in Redis -

  1. Length prefixed strings
  2. An 8, 16 or 32 bit integer
  3. A LZF compressed string

Length prefixed strings

Length prefixed strings are quite simple. The length of the string in bytes is first encoded using $length. After this, the raw bytes of the string are stored.

Integers as String

First read the section $length, specifically the part when the first two bits are 11. In this case, the remaining 6 bits are read. If the value of those 6 bits is -

  1. 0 indicates that an 8 bit integer follows
  2. 1 indicates that a 16 bit integer follows
  3. 2 indicates that a 32 bit integer follows
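A short sketch of reading an integer encoded string. It assumes the integers are signed and little endian, which this page does not state, and it returns the decimal text of the number as bytes so that callers can treat it like any other string. The name read_int_string is hypothetical.

```python
import io
import struct

# Special format value -> struct format (assumed signed, little endian)
INT_FORMATS = {0: "<b", 1: "<h", 2: "<i"}


def read_int_string(stream) -> bytes:
    """Read a string stored as an 8, 16 or 32 bit integer."""
    first = stream.read(1)[0]
    if first >> 6 != 3:
        raise ValueError("first two bits are not 11")
    fmt = INT_FORMATS.get(first & 0x3F)
    if fmt is None:
        raise ValueError("not an integer encoding")
    (value,) = struct.unpack(fmt, stream.read(struct.calcsize(fmt)))
    return str(value).encode("ascii")


print(read_int_string(io.BytesIO(b"\xc1\x39\x30")))  # prints b'12345'
```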

Compressed Strings

First read the section $length, specifically the part when the first two bits are 11. In this case, the remaining 6 bits are read. If the value of those 6 bits is 3, it indicates that a compressed string follows.

The compressed string is read as follows -

  1. The compressed length clen is read from the stream using $length
  2. The uncompressed length is read from the stream using $length
  3. The next clen bytes are read from the stream
  4. Finally, these bytes are decompressed using LZF algorithm
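The final decompression step can be sketched in pure Python. The LZF stream layout used here (literal runs for control bytes below 32, back references otherwise) follows liblzf and is not described on this page, so treat it as an assumption. lzf_decompress is a hypothetical name; its two arguments are the clen bytes read in step 3 and the uncompressed length read in step 2.

```python
def lzf_decompress(data: bytes, expected_len: int) -> bytes:
    """Decompress an LZF stream into expected_len bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        ctrl = data[i]
        i += 1
        if ctrl < 32:
            # Literal run: copy the next ctrl + 1 bytes unchanged.
            run = ctrl + 1
            out += data[i:i + run]
            i += run
        else:
            # Back reference: 3 bit length, 13 bit offset.
            length = ctrl >> 5
            if length == 7:            # long match: extra length byte
                length += data[i]
                i += 1
            offset = ((ctrl & 0x1F) << 8) | data[i]
            i += 1
            start = len(out) - offset - 1
            if start < 0:
                raise ValueError("back reference before start of output")
            # Copy byte by byte because source and target may overlap.
            for k in range(length + 2):
                out.append(out[start + k])
    if len(out) != expected_len:
        raise ValueError("unexpected uncompressed length")
    return bytes(out)


print(lzf_decompress(b"\x00a\xa0\x00", 8))  # prints b'aaaaaaaa'
```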

$list

A redis list is represented as a sequence of strings.

  1. The size of the list, size, is read from the stream using $length
  2. size strings are read from the stream using $string
  3. The list is then reconstructed from these strings

$set

Sets are encoded exactly like lists.

$zset

  1. First, the size of the sorted set, size, is read from the stream using $length
  2. Then, for each of the size elements, the member is read from the stream using $string
  3. Each member is followed by its score: read 1 byte as a length, read that many bytes as a string, then convert this string to a double
  4. The sorted set is then reconstructed from the members of step 2 and the scores of step 3
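Reading a single $zset score can be sketched like this. The three special length bytes (253 for NaN, 254 for positive infinity, 255 for negative infinity) are taken from how the Redis loader behaves, not from this page, so treat them as an assumption. read_score is a hypothetical name.

```python
import io
import math


def read_score(stream) -> float:
    """Read one $zset score stored as a 1 byte length plus ASCII digits."""
    n = stream.read(1)[0]
    if n == 255:                # assumption: negative infinity marker
        return -math.inf
    if n == 254:                # assumption: positive infinity marker
        return math.inf
    if n == 253:                # assumption: NaN marker
        return math.nan
    return float(stream.read(n).decode("ascii"))


print(read_score(io.BytesIO(b"\x043.14")))  # prints 3.14
```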

$hash

  1. First, the size of the hash, size, is read from the stream using $length
  2. Next, 2 * size strings are read from the stream using $string
  3. Alternate strings are keys and values
  4. For example, 2 us washington india delhi represents the map {"us" => "washington", "india" => "delhi"}
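The example above can be decoded with the sketch below. To stay short it only handles lengths below 64 (the 1 byte form of $length); the helper names are hypothetical.

```python
import io


def read_small_length(stream) -> int:
    """Read a $length that fits in 6 bits (starting bits 00)."""
    first = stream.read(1)[0]
    if first >> 6 != 0:
        raise NotImplementedError("this sketch only handles 1 byte lengths")
    return first


def read_small_string(stream) -> bytes:
    """Read a length prefixed $string shorter than 64 bytes."""
    return stream.read(read_small_length(stream))


def read_hash(stream) -> dict:
    """Read size, then size pairs of key and value strings."""
    size = read_small_length(stream)
    result = {}
    for _ in range(size):
        key = read_small_string(stream)
        result[key] = read_small_string(stream)
    return result


payload = b"\x02" b"\x02us" b"\x0awashington" b"\x05india" b"\x05delhi"
print(read_hash(io.BytesIO(payload)))
# prints {b'us': b'washington', b'india': b'delhi'}
```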

$zset2

  1. First, the size of the sorted set, size, is read from the stream using $length
  2. Then, for each of the size elements, the member is read from the stream using $string
  3. Each member is followed by its score: read 8 bytes as a double
  4. The sorted set is then reconstructed from the members of step 2 and the scores of step 3
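For $zset2 the score is a binary double, so reading it is a single unpack. The sketch assumes an IEEE 754 double in little endian byte order, which this page does not state; read_zset2_score is a hypothetical name.

```python
import io
import struct


def read_zset2_score(stream) -> float:
    """Read one $zset2 score: 8 bytes (assumed little endian double)."""
    (score,) = struct.unpack("<d", stream.read(8))
    return score


print(read_zset2_score(io.BytesIO(bytes.fromhex("000000000000f83f"))))  # prints 1.5
```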

$module

  1. Read a 64 bit module id from the stream using $length
  2. Extract the module name and the module version from the value read in step 1
  3. Read the module value in the format defined by that module name and module version

Extract example

  1. The 64 bit integer read in step 1 above is -8797388646930352128L
  2. The first 54 bits are 100001011110100101100101101000101101110010101001011110
  3. Grouped in 6 bits: 100001, 011110, 100101, 100101, 101000, 101101, 110010, 101001, 011110
  4. Each group converted to an int: 33, 30, 37, 37, 40, 45, 50, 41, 30
  5. Looking up each value of step 4 in the table [A-Za-z0-9-_] gives the module name hellotype
  6. The last 10 bits, 0000000000, masked with 1023 give the module version, here 0
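The extract example can be reproduced with a few lines of Python. The function name decode_module_id is hypothetical; the lookup table is the [A-Za-z0-9-_] table from step 5.

```python
CHARSET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_"
)


def decode_module_id(module_id: int):
    """Split a 64 bit module id into (module name, module version)."""
    module_id &= 0xFFFFFFFFFFFFFFFF      # accept the signed form as well
    version = module_id & 1023           # last 10 bits
    name_bits = module_id >> 10          # first 54 bits
    # Nine groups of 6 bits, most significant group first.
    name = "".join(
        CHARSET[(name_bits >> shift) & 63] for shift in range(48, -1, -6)
    )
    return name, version


print(decode_module_id(-8797388646930352128))  # prints ('hellotype', 0)
```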

$module2

  1. Read a 64 bit module id from the stream using $length
  2. Extract the module name and the module version from the value read in step 1
  3. Read the module value in the format defined by that module name and module version
  4. Read the EOF marker using $length, and make sure it equals 0

Extract example

  1. The 64 bit integer read in step 1 above is -8797388646930352128L
  2. The first 54 bits are 100001011110100101100101101000101101110010101001011110
  3. Grouped in 6 bits: 100001, 011110, 100101, 100101, 101000, 101101, 110010, 101001, 011110
  4. Each group converted to an int: 33, 30, 37, 37, 40, 45, 50, 41, 30
  5. Looking up each value of step 4 in the table [A-Za-z0-9-_] gives the module name hellotype
  6. The last 10 bits, 0000000000, masked with 1023 give the module version, here 0

$hashzipmap

NOTE : The $hashzipmap encoding is deprecated starting with Redis 2.6. Small hashmaps are now encoded using ziplists.

A $hashzipmap is a hashmap that has been serialized to a string. In essence, the key value pairs are stored sequentially. Looking up a key in this structure is O(N). This structure is used instead of a dictionary when the number of key value pairs is small.

To parse a zipmap, first a string is read from the stream using "$string". This string is the envelope of the zipmap. The contents of this string represent the zipmap.

The structure of a zipmap within this string is as follows - <zmlen><len>"foo"<len><free>"bar"<len>"hello"<len><free>"world"<zmend>

  1. zmlen : A 1 byte length that holds the number of entries in the zipmap. If it is greater than or equal to 254, the value is not used, and you have to iterate over the entire zipmap to find its size.
  2. len : The length of the following string, which can be either a key or a value. This length is stored in either 1 byte or 5 bytes (yes, it differs from $length described above). If the first byte is between 0 and 253, that byte is the length of the string. If the first byte is 254, the next 4 bytes, read as an unsigned integer, give the length of the string. 255 is not a valid length; it marks the end of the zipmap.
  3. free : Always 1 byte. It indicates the number of free bytes after the value. For example, if the value of a key is "America" and it gets updated to "USA", 4 free bytes will be available.
  4. zmend : Always 255. Indicates the end of the zipmap.

Worked Example 18 02 06 4d 4b 44 31 47 36 01 00 32 05 59 4e 4e 58 4b 04 00 46 37 54 49 ff ..

  1. Start by decoding this using $string. You will notice that 18 (hexadecimal) is the length of the string. Accordingly, we read the next 24 bytes, i.e. up to FF
  2. Now, we parse the string starting at 02 06... as a $hashzipmap
  3. 02 is the number of entries in the hashmap.
  4. 06 is the length of the next string. Since this is less than 254, we don't have to read any additional bytes
  5. We read the next 6 bytes i.e. 4d 4b 44 31 47 36 to get the key "MKD1G6"
  6. 01 is the length of the next string, which would be the value
  7. 00 is the number of free bytes
  8. We read the next 1 byte(s), which is 0x32. Thus, we get our value "2"
  9. In this case, the free bytes is 0, so we don't skip anything
  10. 05 is the length of the next string, in this case a key.
  11. We read the next 5 bytes 59 4e 4e 58 4b, to get the key "YNNXK"
  12. 04 is the length of the next string, which is a value
  13. 00 is the number of free bytes after the value
  14. We read the next 4 bytes i.e. 46 37 54 49 to get the value "F7TI"
  15. Finally, we encounter FF, which indicates the end of this zip map
  16. Thus, this zip map represents the hash {"MKD1G6" => "2", "YNNXK" => "F7TI"}
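The worked example can be checked with a small parser. This is a sketch: it treats a first length byte below 254 as the length itself, as step 4 of the worked example does, and it assumes the 4 byte form is little endian. parse_zipmap is a hypothetical name, and its input is the envelope content without the leading 18.

```python
import struct


def _read_len(buf: bytes, pos: int):
    """Return (length, position after the length field)."""
    first = buf[pos]
    if first < 254:
        return first, pos + 1
    if first == 254:                      # assumed little endian
        return struct.unpack_from("<I", buf, pos + 1)[0], pos + 5
    raise ValueError("255 is the end marker, not a length")


def parse_zipmap(buf: bytes) -> dict:
    pos = 1                               # skip zmlen
    result = {}
    while buf[pos] != 0xFF:               # zmend
        key_len, pos = _read_len(buf, pos)
        key = buf[pos:pos + key_len]
        pos += key_len
        val_len, pos = _read_len(buf, pos)
        free = buf[pos]                   # free bytes after the value
        pos += 1
        result[key] = buf[pos:pos + val_len]
        pos += val_len + free
    return result


data = bytes.fromhex(
    "02 06 4d 4b 44 31 47 36 01 00 32 05 59 4e 4e 58 4b 04 00 46 37 54 49 ff"
)
print(parse_zipmap(data))  # prints {b'MKD1G6': b'2', b'YNNXK': b'F7TI'}
```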

$listziplist

A $listziplist is a list that has been serialized to a string. In essence, the elements of the list are stored sequentially along with flags and offsets to allow efficient traversal of the list in both directions.

To parse a ziplist, first a string is read from the stream using $string. This string is the envelope of the ziplist. The contents of this string represent the ziplist.

The structure of a ziplist within this string is as follows - <zlbytes><zltail><zllen><entry><entry><zlend>

  1. zlbytes : This is a 4 byte unsigned integer representing the total size in bytes of the zip list. The 4 bytes are in little endian format - the least significant byte comes first.
  2. zltail : This is a 4 byte unsigned integer in little endian format. It represents the offset to the tail (i.e. last) entry in the zip list
  3. zllen : This is a 2 byte unsigned integer in little endian format. It represents the number of entries in this zip list
  4. entry : An entry represents an element in the zip list. Details below
  5. zlend : Is always equal to 255. It represents the end of the zip list.

Each entry in the zip list has the following format : <length-prev-entry><special-flag><raw-bytes-of-entry>

  1. length-prev-entry : This field stores the length of the previous entry, or 0 if this is the first entry. This allows easy traversal of the list in the reverse direction. This length is stored in either 1 byte or in 5 bytes. If the first byte is less than or equal to 253, it is considered as the length. If the first byte is 254, then the next 4 bytes are used to store the length. The 4 bytes are read as an unsigned integer.

  2. Special flag : This flag indicates whether the entry is a string or an integer. It also indicates the length of the string, or the size of the integer. The various encodings of this flag are shown below :

  • |00xxxxxx| - 1 byte : String value with length less than or equal to 63 bytes (6 bits).
  • |01xxxxxx|xxxxxxxx| - 2 bytes : String value with length less than or equal to 16383 bytes (14 bits).
  • |10______|xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx| - 5 bytes : String value with length greater than or equal to 16384 bytes.
  • |1100____| - Read next 2 bytes as a 16 bit signed integer
  • |1101____| - Read next 4 bytes as a 32 bit signed integer
  • |1110____| - Read next 8 bytes as a 64 bit signed integer
  • |11110000| - Read next 3 bytes as a 24 bit signed integer
  • |11111110| - Read next byte as an 8 bit signed integer
  • |1111xxxx| - (with xxxx between 0001 and 1101) immediate 4 bit integer. Unsigned integer from 0 to 12. The encoded value is actually from 1 to 13 because 0000 and 1111 can not be used, so 1 should be subtracted from the encoded 4 bit value to obtain the right value.
  3. Raw Bytes : After the special flag, the raw bytes of entry follow. The number of bytes was previously determined as part of the special flag.

Worked Example 1

23 23 00 00 00 1e 00 00 00 04 00 00 e0 ff ff ff ff ff ff ff 7f 0a d0 ff ff 00 00 06 c0 fc 3f 04 c0 3f 00 ff ... 
  |           |           |     |                             |                 |           |           |       

  1. Start by decoding this using $string. 23 (hexadecimal) is the length of the string, therefore we read the next 35 bytes, up to ff
  2. Now, we parse the string starting at 23 00 00 ... as a $listziplist
  3. The first 4 bytes 23 00 00 00 represent the total length in bytes of this ziplist. Notice that this is in little endian format
  4. The next 4 bytes 1e 00 00 00 represent the offset to the tail entry. 1e = 30, and this is a 0 based offset. 0th position = 23, 1st position = 00 and so on. It follows that the last entry starts at 04 c0 3f 00 ..
  5. The next 2 bytes 04 00 represent the number of entries in this list.
  6. From now on, we start reading the entries
  7. 00 represents the length of previous entry. 0 indicates this is the first entry.
  8. e0 is the special flag. Since it starts with the bit pattern 1110____, we read the next 8 bytes as an integer. This is the first entry of the list.
  9. We now start the second entry
  10. 0a is the length of the previous entry. 10 bytes = 1 byte for prev. length + 1 byte for special flag + 8 bytes for integer.
  11. d0 is the special flag. Since it starts with the bit pattern 1101____, we read the next 4 bytes as an integer. This is the second entry of the list
  12. We now start the third entry
  13. 06 is the length of previous entry. 6 bytes = 1 byte for prev. length + 1 byte for special flag + 4 bytes for integer
  14. c0 is the special flag. Since it starts with the bit pattern 1100____, we read the next 2 bytes as an integer. This is the third entry of the list
  15. We now start the last entry
  16. 04 is length of previous entry
  17. c0 indicates a 2 byte number
  18. We read the next 2 bytes, which gives us our fourth entry
  19. Finally, we encounter ff, which tells us we have consumed all elements in this ziplist.
  20. Thus, this ziplist stores the values [0x7fffffffffffffff, 65535, 16380, 63]
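The walk-through above can be reproduced with a compact parser. It is a sketch under two assumptions that this page does not state: integers are little endian, and the 4 byte string length of the |10______| form is big endian. parse_ziplist is a hypothetical name, and its input is the envelope content without the leading 23.

```python
import struct


def parse_ziplist(buf: bytes) -> list:
    zlbytes, zltail, zllen = struct.unpack_from("<IIH", buf, 0)
    pos, entries = 10, []
    while buf[pos] != 0xFF:                        # zlend
        pos += 5 if buf[pos] == 0xFE else 1        # skip length-prev-entry
        flag = buf[pos]
        pos += 1
        if flag >> 6 == 0:                         # |00xxxxxx| string
            size, value = flag & 0x3F, None
        elif flag >> 6 == 1:                       # |01xxxxxx|xxxxxxxx| string
            size, value = ((flag & 0x3F) << 8) | buf[pos], None
            pos += 1
        elif flag >> 6 == 2:                       # |10______| + 4 bytes string
            size, value = struct.unpack_from(">I", buf, pos)[0], None
            pos += 4
        elif 0xF1 <= flag <= 0xFD:                 # immediate 4 bit integer
            size, value = 0, (flag & 0x0F) - 1
        else:                                      # fixed width integers
            widths = {0xC: 2, 0xD: 4, 0xE: 8}
            if flag == 0xF0:
                size = 3
            elif flag == 0xFE:
                size = 1
            elif flag >> 4 in widths:
                size = widths[flag >> 4]
            else:
                raise ValueError("bad special flag 0x%02x" % flag)
            value = int.from_bytes(buf[pos:pos + size], "little", signed=True)
        if value is None:
            value = buf[pos:pos + size]
        pos += size
        entries.append(value)
    return entries


data = bytes.fromhex(
    "23 00 00 00 1e 00 00 00 04 00 00 e0 ff ff ff ff ff ff ff 7f"
    " 0a d0 ff ff 00 00 06 c0 fc 3f 04 c0 3f 00 ff"
)
print(parse_ziplist(data))  # prints [9223372036854775807, 65535, 16380, 63]
```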

$setintset

A $setintset is a sorted array of integers that is searched using binary search. An intset is used when all the elements of the set are integers. An intset supports integers of up to 64 bits. As an optimization, if the integers can be represented in fewer bytes, the array is constructed from 16 bit or 32 bit integers. When a new element is inserted, the implementation takes care to upgrade the encoding if necessary.

Since the array of an intset is kept sorted, the numbers in this set will always be sorted.

An Intset has an external interface of a Set.

To parse an Intset, first a string is read from the stream using $string. This string is the envelope of the Intset. The contents of this string represent the Intset.

Within this string, the Intset has a very simple layout : <encoding><length-of-contents><contents>

  1. encoding : A 32 bit unsigned integer. It has 3 possible values: 2, 4 or 8. It indicates the size in bytes of each integer stored in contents. And yes, this is wasteful - we could have stored the same information in 2 bits.
  2. length-of-contents : A 32 bit unsigned integer that indicates the number of integers in the contents array
  3. contents : An array of length-of-contents integers, each encoding bytes long. It contains the sorted integers

Example 14 04 00 00 00 03 00 00 00 fc ff 00 00 fd ff 00 00 fe ff 00 00 ...

  1. Start by decoding this using $string. 14 (hexadecimal) is the length of the string, therefore we read the next 20 bytes, up to the final 00
  2. Now, we start interpreting the string starting at 04 00 00 ...
  3. The first 4 bytes 04 00 00 00 is the encoding. Since this evaluates to 4, we know we are dealing with 32 bit integers
  4. The next 4 bytes 03 00 00 00 is the length of contents. So, we know we are dealing with 3 integers, each 4 byte long
  5. From now on, we read in groups of 4 bytes, and convert each group into a signed integer
  6. Thus, our intset looks like - 0x0000FFFC, 0x0000FFFD, 0x0000FFFE. Notice that the integers are in little endian format i.e. the least significant byte comes first.
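The example can be decoded in a few lines with the struct module. The sketch assumes the stored integers are signed and little endian; parse_intset is a hypothetical name, and its input is the envelope content without the leading 14.

```python
import struct

# encoding value (bytes per integer) -> struct type code
INT_CODES = {2: "h", 4: "i", 8: "q"}


def parse_intset(buf: bytes) -> list:
    """Decode <encoding><length-of-contents><contents>."""
    encoding, count = struct.unpack_from("<II", buf, 0)
    code = INT_CODES[encoding]
    return list(struct.unpack_from("<%d%s" % (count, code), buf, 8))


data = bytes.fromhex(
    "04 00 00 00 03 00 00 00 fc ff 00 00 fd ff 00 00 fe ff 00 00"
)
print(parse_intset(data))  # prints [65532, 65533, 65534]
```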

$zsetziplist

A $zsetziplist encoding is stored just like the Ziplist described above. Each element in the sorted set is followed by its score in the ziplist.

Example ['Manchester City', 1, 'Manchester United', 2, 'Tottenham', 3]

As you see, the scores follow each element.

$hashziplist

In this, key=value pairs of a hashmap are stored as successive entries in a ziplist.

Note : This was introduced in rdb version 4. This deprecates zipmap encoding that was used in earlier versions.

Example {"us" => "washington", "india" => "delhi"}

is stored in a ziplist as :

["us", "washington", "india", "delhi"]
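Once the ziplist entries have been decoded into a flat list, turning them back into a hash is a matter of pairing successive entries. A minimal sketch, with the hypothetical helper name pairs_to_dict; the same pairing applies to the element and score entries of a $zsetziplist.

```python
def pairs_to_dict(entries: list) -> dict:
    """Pair successive ziplist entries: [k1, v1, k2, v2] -> {k1: v1, k2: v2}."""
    if len(entries) % 2 != 0:
        raise ValueError("expected an even number of entries")
    it = iter(entries)
    # zip pulls two items per step from the same iterator.
    return dict(zip(it, it))


print(pairs_to_dict(["us", "washington", "india", "delhi"]))
# prints {'us': 'washington', 'india': 'delhi'}
```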

$listquicklist

  1. Read the number of ziplists using $length.
  2. For each of them, read a string using $string.
  3. Each string read in step 2 has the format <zlbytes><zltail><zllen><entry>...<entry><zlend>
  4. Reconstruct the list by concatenating the entries of all these ziplists in order.

The explanation of step 3:

  1. zlbytes : This is a 4 byte unsigned integer representing the total size in bytes of the zip list. The 4 bytes are in little endian format - the least significant byte comes first.
  2. zltail : This is a 4 byte unsigned integer in little endian format. It represents the offset to the tail (i.e. last) entry in the zip list
  3. zllen : This is a 2 byte unsigned integer in little endian format. It represents the number of entries in this zip list
  4. entry : An entry represents an element in the zip list. Details below
  5. zlend : Is always equal to 255. It represents the end of the zip list.

Each entry in the zip list has the following format : <length-prev-entry><special-flag><raw-bytes-of-entry>

  1. length-prev-entry : This field stores the length of the previous entry, or 0 if this is the first entry. This allows easy traversal of the list in the reverse direction. This length is stored in either 1 byte or in 5 bytes. If the first byte is less than or equal to 253, it is considered as the length. If the first byte is 254, then the next 4 bytes are used to store the length. The 4 bytes are read as an unsigned integer.

  2. Special flag : This flag indicates whether the entry is a string or an integer. It also indicates the length of the string, or the size of the integer. The various encodings of this flag are shown below :

  • |00xxxxxx| - 1 byte : String value with length less than or equal to 63 bytes (6 bits).
  • |01xxxxxx|xxxxxxxx| - 2 bytes : String value with length less than or equal to 16383 bytes (14 bits).
  • |10______|xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx| - 5 bytes : String value with length greater than or equal to 16384 bytes.
  • |1100____| - Read next 2 bytes as a 16 bit signed integer
  • |1101____| - Read next 4 bytes as a 32 bit signed integer
  • |1110____| - Read next 8 bytes as a 64 bit signed integer
  • |11110000| - Read next 3 bytes as a 24 bit signed integer
  • |11111110| - Read next byte as an 8 bit signed integer
  • |1111xxxx| - (with xxxx between 0001 and 1101) immediate 4 bit integer. Unsigned integer from 0 to 12. The encoded value is actually from 1 to 13 because 0000 and 1111 can not be used, so 1 should be subtracted from the encoded 4 bit value to obtain the right value.
  3. Raw Bytes : After the special flag, the raw bytes of entry follow. The number of bytes was previously determined as part of the special flag.

$streamlistpacks

$listpack

The listpack format is as follows:
<total-bytes><num-elements><listpack-entry>...<listpack-entry><lpend>
<total-bytes> : 4 byte unsigned integer.
<num-elements> : 2 byte unsigned integer.

<listpack-entry> is described as follows:
<encoding-type><element-data><element-tot-len>
The encoding type is basically useful to understand what kind of data follows since strings can be encoded as little endian integers, and strings can have multiple string length fields bits in order to reduce space usage. The element data is the data itself, like an integer or an array of bytes representing a string. Finally the element total length, is used in order to traverse the list backward from the end of the listpack to its head, and is needed since otherwise there is no unique way to parse the entry from right to left, so we need to be able to jump to the left of the specified amount of bytes.
<encoding-type> :
|0xxxxxxx| 7 bit unsigned integer
|10xxxxxx| 6 bit unsigned integer as string length, then read that many bytes as the string.
|110xxxxx|xxxxxxxx| 13 bit signed integer.
|1110xxxx|xxxxxxxx| string with length up to 4095.
|11110001|xxxxxxxx|xxxxxxxx| next 2 bytes as 16bit int.
|11110010|xxxxxxxx|xxxxxxxx|xxxxxxxx| next 3 bytes as 24bit int.
|11110011|xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx| next 4 bytes as 32bit int.
|11110100|xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx| next 8 bytes as 64bit long.
|11110000|xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx| next 4 bytes as string length, then read that many bytes as the string.

<element-tot-len> :
For example, to encode an entry length of 500, two bytes are required. The binary representation of 500 is the following:
111110100
We can split the representation in two 7-bit halves:
0000011 1110100
Note that, since we parse the entry length from right to left, the entry is stored in big endian (but it's not vanilla big endian since only 7 bits are used and the 8th bit is used to signal the more bytes condition).

However we need to also add the bit to signal if there are more bytes, so the final representation will be:

[0]0000011          [1]1110100
 |                   |
 `- no more bytes    `- more bytes to the left!

The actual encoding will be:
"\xf4\x03" -- 500 bytes entry length

<lpend> : 0xff
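The entry encodings above can be exercised with a small forward parser. This is a sketch under assumptions this page does not state: multi byte integers and the 4 byte string length are taken to be little endian, the 13 bit integer is two's complement, and the size of <element-tot-len> follows the thresholds believed to be used by the Redis listpack implementation. The name parse_listpack and the sample bytes are made up for illustration.

```python
import struct

INT_SIZES = {0xF1: 2, 0xF2: 3, 0xF3: 4, 0xF4: 8}


def _tot_len_size(entry_len: int) -> int:
    """Bytes used by <element-tot-len> (assumed thresholds)."""
    for size, limit in ((1, 128), (2, 16383), (3, 2097151), (4, 268435455)):
        if entry_len < limit:
            return size
    return 5


def parse_listpack(buf: bytes) -> list:
    total_bytes, num_elements = struct.unpack_from("<IH", buf, 0)
    pos, out = 6, []
    while buf[pos] != 0xFF:                          # <lpend>
        start, b = pos, buf[pos]
        if b & 0x80 == 0:                            # |0xxxxxxx|
            value, pos = b, pos + 1
        elif b & 0xC0 == 0x80:                       # |10xxxxxx| string
            n = b & 0x3F
            value, pos = buf[pos + 1:pos + 1 + n], pos + 1 + n
        elif b & 0xE0 == 0xC0:                       # |110xxxxx| 13 bit int
            raw = ((b & 0x1F) << 8) | buf[pos + 1]
            value, pos = (raw - 8192 if raw >= 4096 else raw), pos + 2
        elif b & 0xF0 == 0xE0:                       # |1110xxxx| string
            n = ((b & 0x0F) << 8) | buf[pos + 1]
            value, pos = buf[pos + 2:pos + 2 + n], pos + 2 + n
        elif b == 0xF0:                              # 4 byte string length
            n = struct.unpack_from("<I", buf, pos + 1)[0]
            value, pos = buf[pos + 5:pos + 5 + n], pos + 5 + n
        elif b in INT_SIZES:                         # 16/24/32/64 bit ints
            n = INT_SIZES[b]
            value = int.from_bytes(buf[pos + 1:pos + 1 + n], "little", signed=True)
            pos += 1 + n
        else:
            raise ValueError("unknown encoding type 0x%02x" % b)
        pos += _tot_len_size(pos - start)            # skip <element-tot-len>
        out.append(value)
    return out


# Hand-built sample: the integer 5 followed by the string "foo".
sample = bytes.fromhex("0e 00 00 00 02 00 05 01 83 66 6f 6f 04 ff")
print(parse_listpack(sample))  # prints [5, b'foo']
```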

$stream

<listpack-length><stream-id><listpack>...<stream-id><listpack><active-len><last-id><group-length><group>...<group>
Read $length as listpack-length.
Read 16 bytes as stream-id.
Read $string as listpack. The listpack format is described above in $listpack.
The first stream entry, named the Master Entry, has the following format:

 +-------+---------+------------+---------+--/--+---------+---------+-+-----+--------+-------+-/-+-------+--------+
 | count | deleted | num-fields | field_1 | field_2 | ... | field_N |0|flags|entry-id|value-1|...|value-N|lp-count|
 +-------+---------+------------+---------+--/--+---------+---------+-+-----+--------+-------+-/-+-------+--------+

Notice that every element is a <listpack-entry> as described in $listpack

The format of the other stream entries is as follows:

+----------+-------+-------+-/-+-------+-------+--------+
|num-fields|field-1|value-1|...|field-N|value-N|lp-count|
+----------+-------+-------+-/-+-------+-------+--------+

<active-len> : $length, the active length of stream entries.
<last-id> : 16 bytes as last entry's stream-id
<group-length> : $length

<group> is described as follows:
<group-name><group-last-id><gpel-length><gpel>...<gpel><consumer-length><consumer>...<consumer>
<group-name> : $string
<group-last-id> : 16 bytes as group last entry's stream-id
<gpel-length> : $length

<gpel> is described as follows:
<entry-id><delivery-time><delivery-count>
<entry-id> : 16 bytes as entry's stream-id
<delivery-time> : $millisecond
<delivery-count> : $length
<consumer-length> : $length

<consumer> is described as follows:
<consumer-name><seen-time><cpel-len><entry-id>...<entry-id>
<consumer-name> : $string
<seen-time> : $millisecond
<cpel-len> : $length
<entry-id> : 16 bytes as entry's stream-id

$checksum

Starting with RDB Version 5, an 8 byte CRC 64 checksum is added to the end of the file. It is possible to disable this checksum via a parameter in redis.conf. When checksum is disabled, this field will have zeroes.
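Verifying the checksum can be sketched as below. The page does not name the CRC-64 variant; the sketch assumes the one catalogued as CRC-64/REDIS (Jones polynomial 0xAD93D23594C935A9, bit reflected, initial value 0). It also assumes the checksum covers every byte before it and is stored little endian. The function names are hypothetical, and the bit by bit loop favours clarity over speed.

```python
# Bit reversed form of the polynomial 0xAD93D23594C935A9 (assumed variant)
POLY_REFLECTED = 0x95AC9329AC4BC9B5


def crc64(data: bytes, crc: int = 0) -> int:
    """Bitwise reflected CRC-64 with initial value 0 and no final xor."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ POLY_REFLECTED if crc & 1 else crc >> 1
    return crc


def checksum_ok(dump: bytes) -> bool:
    """Check the trailing 8 bytes; all zeroes means the checksum is disabled."""
    stored = int.from_bytes(dump[-8:], "little")
    return stored == 0 or crc64(dump[:-8]) == stored


print(hex(crc64(b"123456789")))  # prints 0xe9c6d914c4b8d9ca
```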
