## Rice coding

### Introduction

As we have seen, Rice coding can be applied to reduce the number of bits required to represent a number. Rice coding is a specialised form of [Golomb coding](https://en.wikipedia.org/wiki/Golomb_coding).

Let's analyse Rice's coding algorithm:

### Rice's algorithm

**Encoding**
1. Fix an integer value K.
2. Compute the modulus M using the equation $ M = 2^K $
3. For S, the number to be encoded, find
    1. quotient = $ q = int(S/M) $
    2. remainder = $ r = S \bmod M $
4. Generate Codeword
    1. The Quotient_Code is q in unary format.
    2. The Remainder_Code is r in binary using K bits.
    3. The Codeword will have the format <Quotient_Code\> <Remainder_Code\>

**Decoding**
1. Determine q by counting the number of 1s before the first 0.
2. Determine r by reading the next K bits as a binary value.
3. Write out S, the encoded number, as q × M + r.
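
The encoding and decoding steps above can be sketched in Python as follows. This is a minimal illustration that works on bit strings (`'0'`/`'1'` characters) rather than packed bits, and it assumes S is a non-negative integer and K ≥ 1. The function names `rice_encode_value` and `rice_decode_value` are hypothetical, chosen only for this example.

```python
def rice_encode_value(s, k):
    """Return the Rice codeword of the non-negative integer s as a bit string."""
    if s < 0 or k < 1:
        raise ValueError("this sketch assumes s >= 0 and k >= 1")
    m = 1 << k                            # M = 2^K
    q, r = divmod(s, m)                   # quotient and remainder
    quotient_code = '1' * q + '0'         # q in unary: q ones, then a zero
    remainder_code = format(r, f'0{k}b')  # r in binary, padded to K bits
    return quotient_code + remainder_code


def rice_decode_value(codeword, k):
    """Recover the integer from a single Rice codeword given as a bit string."""
    q = codeword.index('0')                # number of 1s before the first 0
    r = int(codeword[q + 1:q + 1 + k], 2)  # next K bits as a binary value
    return q * (1 << k) + r


print(rice_encode_value(19, 4))        # 100011
print(rice_decode_value('100011', 4))  # 19
```

Working on strings keeps the sketch easy to compare with hand-computed codewords; a practical codec would pack the bits into bytes instead.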

### Examples

**Example 1. Encoding**

Let's encode the 8-bit value 19 (00010011).

1. Fix an integer value K.

Let's say K = 4. Why 4? For no particular reason. It could also be 2, 3, 5, etc. To find the optimal K, you should analyse the data to be encoded. Thus, many applications implement a two-pass approach: first, the block of data is analysed to determine the optimal K; second, the block is encoded using that K.

2. Compute the modulus M using the equation $ M = 2^K $

This is easy. $ M = 2^4 = 16 $ 

3. For S, the number to be encoded, find
    1. quotient = $ q = int(S/M) $
    2. remainder = $ r = S \bmod M $

This is also easy. $ q = int(19/16) = 1 $ and $ r = 19 \bmod 16 = 3 $

4. (1) The Quotient_Code is q in unary format.

In unary coding a value N may be represented by N 1s followed by a 0.

So, for example, 2 in unary may be represented by 110, 3 by 1110, etc. (Note: 0 in unary is 0).

So, $ q = int(19/16) = 1 = 10 $ (in unary)

4. (2) The Remainder_Code is r in binary using K bits.

r = 3 in binary using 4 bits is 0011

4. (3) The Codeword will have the format <Quotient_Code\> <Remainder_Code\>

Thus, 19 (00010011) can be written as 100011, saving 2 bits.

**Note**: as we have seen in the previous video, the remainder can be 'simplified' by removing the leading zeros. So r = 0011 = 11, and 19 can be encoded as 1011. This approach is correct if we encode a single number but, as we will see, it is not a useful approach if we want to encode a block of data (several numbers).
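
To see why the leading zeros matter for a block of data, the sketch below decodes a concatenated bit string into several values. It assumes every remainder occupies exactly K bits, which is what lets the decoder know where one codeword ends and the next begins. `rice_decode_block` is a hypothetical helper name, and the input is assumed to be a well-formed sequence of complete codewords.

```python
def rice_decode_block(bits, k):
    """Decode a string of concatenated Rice codewords into a list of integers."""
    values = []
    pos = 0
    while pos < len(bits):
        q = 0
        while bits[pos] == '1':        # count the 1s of the unary quotient
            q += 1
            pos += 1
        pos += 1                       # skip the terminating 0
        r = int(bits[pos:pos + k], 2)  # fixed-width remainder: exactly K bits
        pos += k
        values.append(q * (1 << k) + r)
    return values


# With K = 4: 19 -> 100011 and 3 -> 00011, written one after the other.
print(rice_decode_block('100011' + '00011', 4))  # [19, 3]
```

With the shortened remainders (1011 for 19 and 011 for 3), the concatenation 1011011 no longer tells the decoder how many bits belong to each remainder.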

**Example 2. Decoding**

Now, let's decode the encoded value 100011 when K = 4 (M = 16).

1. Determine q by counting the number of 1s before the first 0.

The number of 1s before the first 0 is '1'. So, $ q = 1 $ (in decimal)

2. Determine r by reading the next K bits as a binary value.

The next 4 bits are 0011, so $ r = 0011 = 3 $ (in decimal)

3. Write out S, the encoded number, as q × M + r.

S = 1 × 16 + 3 = 19

**Example 3. Encoding**

Let's encode the 8-bit value 33 (00100001), with $ K = 4 $.

$ q = 2 = 110 $ (in unary) and $ r = 1 = 0001 $ (in binary, using K bits)

So, 33 (00100001) can be written as 1100001, saving 1 bit.
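
The choice of K discussed in Example 1 can be explored with a small two-pass sketch: the first pass computes the total encoded length for several candidate values of K, and the second pass would encode the block with the best one. The helper names `rice_length` and `best_k`, and the candidate range 1 to 7, are assumptions made for illustration.

```python
def rice_length(s, k):
    """Number of bits in the Rice codeword of the non-negative integer s."""
    return (s >> k) + 1 + k  # q ones, one terminating zero, K remainder bits


def best_k(values, candidates=range(1, 8)):
    """Return the candidate K that minimises the total encoded length."""
    return min(candidates, key=lambda k: sum(rice_length(s, k) for s in values))


block = [19, 33]  # the values encoded in Examples 1 and 3
for k in range(1, 8):
    print(k, sum(rice_length(s, k) for s in block))
print("best K:", best_k(block))
```

When several values of K give the same total length, `min` returns the first one it encounters, so ties are resolved in favour of the smaller K.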

### Exercise 1. Rice coding 'by hand'

**Exercise 1.1.**

Encode the 8-bit value 23 (00010111), with $ K = 4 $.

**Exercise 1.2.**

Encode the 8-bit value 51 (00110011), with $ K = 5 $.

**Exercise 1.3.**

Decode the encoded value 1100011 when $ K = 4 $.

**Exercise 1.4.**

The list of 8-bit values 17, 25, 37 can be written as 00010001, 00011001, 00100101 or, without the commas, 000100010001100100100101.
Encode the 8-bit values 17, 25 and 37 with $ K = 4 $ and generate an 'encoded' data block. Which data block is shorter, the encoded or the non-encoded one?

**Exercise 1.5.**

The following data block 110001110000111100101 contains (in this order) the encoded numbers *a*, *b*, and *c*, with $ K = 4 $. What are the values of these numbers in decimal?

### Exercise 2. Rice coding implementation

**Exercise 2.1.**

Implement Rice's algorithm in Python and solve Exercise 1 using this application.

You can use as a reference the example of Rice encoding/decoding detailed in http://michael.dipperstein.com/rice/index.html.

In [1]:
import wave
import numpy

In [2]:
# Read file to get buffer                                                                                               
ifile = wave.open("Sound1.wav")
samples = ifile.getnframes()


In [3]:
samples

501022

In [4]:
audio = ifile.readframes(samples)

In [11]:
len(audio)

1002044

In [12]:
type(audio)

bytes

In [13]:
from  scipy.io import wavfile 

sampling_rate, data = wavfile.read('Sound1.wav')
# wavfile.write('sound1.wav', sampling_rate, data)

In [14]:
type(data)

numpy.ndarray

In [164]:
data[0:100]

array([-7, -7, -7, -7, -8, -7, -6, -7, -5, -5, -6, -5, -6, -7, -5, -5, -5,
       -5, -4, -5, -4, -5, -5, -5, -6, -4, -3, -4, -5, -4, -3, -5, -4, -4,
       -5, -5, -3, -3, -1, -4, -3, -3, -2, -2, -3, -4, -2, -2, -4, -2, -2,
       -1, -3, -1, -2, -1, -2, -2, -1, -1,  1, -1,  1,  1,  1,  1,  0,  1,
        2,  2,  1,  2,  1,  1,  2,  0,  0,  1,  0,  1,  1,  0,  2,  0,  0,
        1,  3,  3,  0, -1,  1,  2,  0,  2,  2,  2,  1,  1,  1,  2],
      dtype=int16)
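
The samples shown above are signed, whereas the algorithm as described assumes a non-negative S. One common workaround, which is an assumption here and not something the text above prescribes, is to fold signed integers onto non-negative ones before Rice coding (0, -1, 1, -2, 2, ... become 0, 1, 2, 3, 4, ...) and to undo the mapping after decoding. The function names `fold_signed` and `unfold_signed` are hypothetical.

```python
def fold_signed(s):
    """Map a signed integer to a non-negative one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return 2 * s if s >= 0 else -2 * s - 1


def unfold_signed(u):
    """Inverse of fold_signed: even values are non-negative, odd are negative."""
    return u // 2 if u % 2 == 0 else -(u + 1) // 2


samples = [-7, -8, 0, 1, 3]
folded = [fold_signed(s) for s in samples]
print(folded)                              # [13, 15, 0, 2, 6]
print([unfold_signed(u) for u in folded])  # [-7, -8, 0, 1, 3]
```

When applying this to NumPy `int16` samples, converting each value with `int()` first avoids overflow in the multiplication by 2.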

In [54]:
import numpy as np
import scipy.signal as signal
from pydub import AudioSegment
from pydub.utils import make_chunks
import math

audio_file = 'Sound1.wav'
audio = AudioSegment.from_file(audio_file)
audio_array = np.array(audio.get_array_of_samples())

In [95]:
test = audio_array[0]

In [79]:
s = 18

In [101]:
k = 4
m = 2 ** k 
m

16

In [98]:
q = math.floor(test/m)

In [104]:
q = test // m

In [107]:
q = 5

In [108]:
print('1' * q)

11111


In [97]:
r = test % m 
r

9

In [109]:
int(np.floor(np.log2(k)))

2

In [32]:
bin(q)[0:]  

'0b1'

In [91]:
def to_unary(num):
    """Return num in unary: num 1s followed by a terminating 0."""
    return '1' * num + '0'
    

In [41]:
to_unary(2)

'110'

In [88]:
def padded_binary(i, width):
    s = bin(i)
    return  s[2:].zfill(width)

In [89]:
temp = padded_binary(1,4)

In [90]:
type(temp)

str

In [72]:
x = format(1, f'#0{k+2}b')

In [73]:
type(x)

str

In [74]:
x

'0b0001'

In [51]:
int(x, 2)  # x is already a string, so parse it rather than calling bin()

1

In [185]:
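encoded_s = np.zeros(len(audio_array))
strings_list = []
k = 4
m = 2 ** k
for i, sample in enumerate(audio_array):
    q = sample // m  # floor division, so q is negative for negative samples
    r = sample % m   # always in the range 0..m-1
    # Caveat: to_unary(q) returns just '0' when q < 0, so the codewords of
    # negative samples cannot be decoded from the bit string alone.
    encoded_string = to_unary(q) + padded_binary(r, k)
    strings_list.append(encoded_string)
    # Store the codeword read as a binary integer, negated when q < 0.
    # Caveat: this drops leading zeros, and long codewords will not fit
    # in int16 when the array is converted below.
    num = int(encoded_string, 2)
    encoded_s[i] = -num if q < 0 else num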

In [181]:

encoded_int = np.array(encoded_s, dtype='int16')

In [182]:
encoded_int[4000:4050]

array([    5,    96,   228,  2017,  8162,  1000,   -11,   -13,    -5,
           0,   -13,    -5,     5,    11,    -3,   -10,   -10,   -13,
          -4,   -13,     7,  2022,  8168,  2029,   102,   -12,   -12,
         -11,   238,  1000,   110,   -13,   -13,    -7,   -11,    -3,
          43,  4076,  4074,    97,    -5,    -3,    -2,     0,    10,
        2026, 32744,  8173,   228,    -3], dtype=int16)

In [183]:
audio_array[4000:4050]

array([   5,   32,   52,   97,  130,   88,   -5,  -67,  -75,  -64,  -51,
        -27,    5,   11,  -13,  -54, -102, -147, -156,  -99,    7,  102,
        136,  109,   38,  -36,  -52,   -5,   62,   88,   46,  -51, -163,
       -233, -213, -109,   27,  124,  122,   33,  -75, -125, -110,  -64,
         10,  106,  168,  141,   52,  -29], dtype=int16)

In [93]:
encoded

'100010'

In [94]:
my_num = int(encoded, 2)
my_num

34

In [103]:
bin(18)

'0b10010'

In [110]:
def rice_encode(signal, k):
    """Encode a sequence of integers as one Rice-coded bit string (M = 2**k)."""
    m = 1 << k
    codewords = []
    for sample in signal:
        s = int(sample)
        # Assumption: the samples are signed, so fold them onto non-negative
        # integers first (0, -1, 1, -2, ... -> 0, 1, 2, 3, ...).
        u = 2 * s if s >= 0 else -2 * s - 1
        quotient, remainder = divmod(u, m)
        # Unary quotient (q ones and a terminating zero), then k remainder bits.
        codewords.append('1' * quotient + '0' + format(remainder, f'0{k}b'))
    return ''.join(codewords)

k = 4  # Set the value of k
encoded_signal = rice_encode(audio_array, k)


In [111]:
encoded_signal[0]

'0'

In [112]:
encoded_signal[0:100]

'0110101101011010110101111011010101101101010010100101011010010101101101010010100101001010010011101001'

In [124]:
encoded_s = np.array(len(audio_array))

In [125]:
encoded_s

array(501022)

In [127]:
encoded_s = np.empty(1)

In [128]:
encoded_s.shape

(1,)