# Brief description of VByte encoding

**VByte** encoding is perhaps best understood through examples. VByte represents an integer $x$ as a sequence of one or more bytes. If $x < 128$ then the VByte encoding for $x$ is simply a single byte containing the value $128+x$. For example, the VByte code for $x = 23$ (which is $< 128$) is $151$, or `10010111` in binary. 

Bigger numbers $x \geq 128$ are encoded using more than one byte. For example, for $x = 500$, which is `111110100` in binary, VByte would output the following bytes (shown in binary):

<pre>
<span style="color:red"><b>0</b></span><u>1110100</u>
<span style="color:red"><b>1</b></span>00000<u>11</u>
</pre>



Consider the two bytes above (in decimal they are $116$ and $131$, respectively). Looking at the underlined bits of each byte, we see precisely the bits that make up the original number $x = 500 = $`111110100`. The lower order $7$ bits of $x$ are underlined in the first byte, and the remaining bits are underlined in the second byte.

Now look at the leftmost bit of each byte (in <span style="color:red">**bold**</span>), which is called the stop bit. In the first byte we have a $0$ and in the second byte a $1$. The $0$ in the first byte indicates that the encoded number $x = 500$ is greater than $127$ and that some more of its bits are contained in the next byte. So we read the next byte. Its stop bit is $1$. This indicates that after this byte there are no more bits needed to reconstruct $x$.
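To make the bit layout concrete, here is a minimal Python sketch that rebuilds the two bytes for $x = 500$ with a mask and a shift. The variable names are illustrative only, and the snippet covers just this two-byte case, not the general algorithm.

```python
x = 500

low7 = x & 0b1111111   # the lower-order 7 bits of x
rest = x >> 7          # the remaining higher-order bits of x

first_byte = low7          # stop bit is 0: more bytes follow
second_byte = 128 + rest   # stop bit is 1: this is the last byte

print(format(first_byte, "08b"), first_byte)    # 01110100 116
print(format(second_byte, "08b"), second_byte)  # 10000011 131
```

Adding $128$ to the second byte is what sets its leftmost bit, since $128$ is `10000000` in binary.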

So we have found the bits for $x$, but how do we reconstruct $x$ itself? The following calculation does the job: $x = B_0 + (B_1 - 128) \times 128$, where $B_0$ is the first byte above, and $B_1$ is the second. Spend some time verifying this for yourself.
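Once you have worked through it by hand, a quick check in Python can confirm the arithmetic. This sketch simply plugs the two example bytes into the formula; the names `B0` and `B1` mirror the symbols used in the text.

```python
B0, B1 = 116, 131          # the two example bytes, in decimal

# Subtracting 128 removes the stop bit from B1; multiplying by 128
# moves its 7 bits above the 7 bits held in B0.
x = B0 + (B1 - 128) * 128
print(x)  # 500
```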

The scheme works for any size number. For example, the number $20,000,000$ (20 million), which is `1001100010010110100000000` in binary, would have the following VByte encoding:

```
00000000
01011010
01000100
10001001
```

So, after inspecting the $n$<sup>th</sup> byte of a VByte encoded number $x$ we have $7n$ bits of $x$ and know whether we need to inspect the next byte to decode some more of $x$ (stop bit = 0) or not (stop bit = 1). Spend some time writing down the equation to reconstruct $x$ from the four bytes above that encode $20,000,000$, as we did for the encoding of $500$.

Your first action for this task should be to write some pseudocode for VByte encoding and decoding. Once you think about it, you’ll see that we don’t need to look at bits directly at all: integer division and modulus (for encoding) and multiplication and addition (for decoding) are all that is required. To get you started, here is some pseudocode for encoding.

```
VByteEncodeInteger(i): (INPUT: one integer, OUTPUT: one or more bytes)
while true
    b ← i mod 128
    if i < 128 then
        OutputByte(b + 128) and BREAK
    OutputByte(b)
    i ← i div 128
```
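As one possible rendering of the encoding pseudocode, here is a short Python sketch. The function name `vbyte_encode`, the `bytes` return type, and the rejection of negative inputs are assumptions made for illustration; the pseudocode itself only specifies the loop.

```python
def vbyte_encode(i: int) -> bytes:
    """Encode a non-negative integer, lowest 7-bit group first."""
    if i < 0:
        # Assumption: the scheme described here covers non-negative integers only.
        raise ValueError("cannot VByte-encode a negative integer")
    out = bytearray()
    while True:
        b = i % 128
        if i < 128:
            out.append(b + 128)  # last byte: stop bit set to 1
            break
        out.append(b)            # more bytes follow: stop bit left at 0
        i //= 128
    return bytes(out)


print(list(vbyte_encode(23)))          # [151]
print(list(vbyte_encode(500)))         # [116, 131]
print(list(vbyte_encode(20_000_000)))  # [0, 90, 68, 137]
```

The three calls reproduce the worked examples from the text: $151$ for $x = 23$, the bytes $116$ and $131$ for $x = 500$, and the four bytes shown for $20,000,000$ (written in decimal here).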

If you get really stuck understanding VByte, contact one of the teachers, but do invest some time thinking about it before deciding whether you really need to.
