# Video Codec

In this project, we're going to implement a simple codec for a video.

## Import Libraries

First of all, we need to import the necessary modules, namely OpenCV and NumPy.

In [1]:
import cv2
import numpy as np

## Capture video

Next, we use OpenCV to capture the video. We extract its frames one by one and store each in a list named gray_frames. Note that we work with grayscale frames, not color ones.

In [2]:
cap = cv2.VideoCapture('sample 30 frame 1 min.mp4')
if not cap.isOpened():
    print("Error opening video stream or file")

gray_frames = []    
while(cap.isOpened()):
  
    ret, frame = cap.read()
    if ret == True:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        gray_frames.append(gray)
        cv2.imshow('Frame', gray)

        if cv2.waitKey(25) & 0xFF == ord('q'):
            break
    else: 
        break

cap.release()
cv2.destroyAllWindows()

## Preprocessing(Padding)

Since we're going to work with 8 * 8 blocks in each frame, the frame dimensions must be multiples of 8. If they are not, we pad the frames until they are. The cell below adds 2 replicated rows at the top and 2 at the bottom of each frame.

In [94]:
for i in range(len(gray_frames)):
    current_gray_frame = gray_frames[i]
    gray_frames[i] = cv2.copyMakeBorder(current_gray_frame, 2, 2, 0, 0, cv2.BORDER_REPLICATE) 
    

In [95]:
print(gray_frames[0].shape)

(548, 960)
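The fixed 2-row padding above only suits one particular source height, and running the cell a second time pads the frames again. The printed height of 548 is not a multiple of 8, which may be the result of exactly that. A minimal sketch of a padding step that derives the amount from the frame shape, assuming a hypothetical helper name `pad_to_multiple` and NumPy edge replication in place of `cv2.copyMakeBorder`:

```python
import numpy as np

def pad_to_multiple(frame, block=8):
    """Pad a 2-D frame with replicated edges so both dimensions are multiples of `block`."""
    height, width = frame.shape
    pad_rows = (-height) % block          # rows still missing to reach the next multiple
    pad_cols = (-width) % block           # columns still missing
    top, left = pad_rows // 2, pad_cols // 2
    bottom, right = pad_rows - top, pad_cols - left
    # mode="edge" repeats the border pixels, like cv2.BORDER_REPLICATE
    return np.pad(frame, ((top, bottom), (left, right)), mode="edge")

frame = np.zeros((540, 960), dtype=np.uint8)
padded = pad_to_multiple(frame)
padded_again = pad_to_multiple(padded)    # already aligned, so nothing is added
print(padded.shape, padded_again.shape)
```

Because the padding is computed from the current shape, applying the helper to an already aligned frame leaves it unchanged.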


## Encoder

In [96]:
motion_compensated_frames = [None] * len(gray_frames)
for i in range(len(gray_frames)):
    if i % 6 == 0:
        motion_compensated_frames[i] = gray_frames[i]
    else:
        motion_compensated_frames[i] = gray_frames[i] - gray_frames[i-1]

In [97]:
motion_compensated_frames[1]

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=uint8)
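The difference above is computed directly on `uint8` arrays, so any negative difference wraps around modulo 256. A small sketch of the effect and of one possible remedy, assuming the residual is kept in a signed 16-bit array (the variable names here are illustrative and not part of the notebook):

```python
import numpy as np

previous = np.array([[100, 50]], dtype=np.uint8)
current = np.array([[98, 60]], dtype=np.uint8)

wrapped = current - previous                                       # uint8 arithmetic wraps modulo 256
residual = current.astype(np.int16) - previous.astype(np.int16)   # signed residual keeps -2 as -2

# The decoder side adds the residual back onto the previous frame.
reconstructed = (previous.astype(np.int16) + residual).astype(np.uint8)

print(wrapped)        # the first entry is 254 rather than -2
print(residual)
print(reconstructed)
```

A signed residual stays small in magnitude for small changes, which is what the later quantization and run-length stages benefit from.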

### DCT

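Now we're ready to apply the Discrete Cosine Transform (DCT). We iterate over all the frames and, in each frame, apply the DCT to every 8 * 8 block. We store the transformed frames in a list called transformed_frames. Note that the cell below transforms gray_frames directly; the line that would use motion_compensated_frames is commented out.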

In [99]:
block_width = 8
block_height = 8

num_frames = len(gray_frames)
transformed_frames = []

print('First gray frame: ')
print(gray_frames[0])
print()

# print('First motion compensated frame: ')
# print(motion_compensated_frames[1])
# print()


for f in range(num_frames):
    current_frame = gray_frames[f]
#     current_frame = motion_compensated_frames[f]
    current_frame_transformed = np.empty_like(current_frame, dtype=np.float32)
    frame_height, frame_width = current_frame.shape
    
    for i in range(0, frame_height, block_height):
        for j in range(0, frame_width, block_width):
            current_block = np.array(current_frame[i: i + block_height, j: j + block_width], dtype=np.float32)
            transformed_current_block = cv2.dct(current_block)
            current_frame_transformed[i: i + block_height, j: j + block_width] = transformed_current_block
    transformed_frames.append(current_frame_transformed)

print('First transformed frame: ')
print(transformed_frames[1])
print()

First gray frame: 
[[ 59  96 118 ... 141 108  65]
 [ 59  96 118 ... 141 108  65]
 [ 59  96 118 ... 141 108  65]
 ...
 [ 53  93 123 ... 124 102  65]
 [ 53  93 123 ... 124 102  65]
 [ 53  93 123 ... 124 102  65]]

First transformed frame: 
[[ 9.3487500e+02 -1.2147029e+02 -9.4812218e+01 ...  1.1281408e+01
   8.2490568e+00 -7.0935102e+00]
 [-7.2011879e+01  7.4312663e+00  7.0826888e+00 ... -1.0792222e+00
   3.2036445e-01  5.0717503e-01]
 [ 3.0062557e+01 -3.4434628e+00 -4.2829304e+00 ...  3.4486741e-02
  -3.0177662e-01  3.5485172e-01]
 ...
 [ 0.0000000e+00  0.0000000e+00  0.0000000e+00 ...  0.0000000e+00
   0.0000000e+00  0.0000000e+00]
 [ 0.0000000e+00  0.0000000e+00  0.0000000e+00 ...  0.0000000e+00
   0.0000000e+00  0.0000000e+00]
 [ 0.0000000e+00  0.0000000e+00  0.0000000e+00 ...  0.0000000e+00
   0.0000000e+00  0.0000000e+00]]
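To make the block transform less of a black box, the sketch below builds the 8-point DCT basis with NumPy alone. It assumes the orthonormal DCT-II convention, which is how the OpenCV documentation describes `cv2.dct`; under that assumption the results should agree with `cv2.dct` up to floating-point error. The helper name `dct_matrix` is hypothetical.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix C, so that a 1-D transform is C @ x."""
    k = np.arange(n).reshape(-1, 1)     # frequency index (rows)
    x = np.arange(n).reshape(1, -1)     # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)          # the DC row has its own scale factor
    return c

C = dct_matrix(8)
block = np.full((8, 8), 100.0)          # a flat block: all energy should land in the DC term
coeffs = C @ block @ C.T                # 2-D DCT: transform along both axes
restored = C.T @ coeffs @ C             # inverse transform, valid because C is orthogonal

print(round(coeffs[0, 0], 3))           # DC coefficient = 8 * mean of the block
```

For a flat block every coefficient except the top-left one is zero, which is why smooth image regions compress well after quantization.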



### Quantization

For the quantization step, we first record the sign of every entry of each frame in the 'is_negative' list for later use.

In [100]:
is_negative = [transformed_frames[i] < 0.0 for i in range(len(transformed_frames))]
is_negative[1]

array([[False,  True,  True, ..., False, False,  True],
       [ True, False, False, ...,  True, False, False],
       [False,  True,  True, ..., False,  True, False],
       ...,
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]])

In [101]:
transformed_frames[1]

array([[ 9.3487500e+02, -1.2147029e+02, -9.4812218e+01, ...,
         1.1281408e+01,  8.2490568e+00, -7.0935102e+00],
       [-7.2011879e+01,  7.4312663e+00,  7.0826888e+00, ...,
        -1.0792222e+00,  3.2036445e-01,  5.0717503e-01],
       [ 3.0062557e+01, -3.4434628e+00, -4.2829304e+00, ...,
         3.4486741e-02, -3.0177662e-01,  3.5485172e-01],
       ...,
       [ 0.0000000e+00,  0.0000000e+00,  0.0000000e+00, ...,
         0.0000000e+00,  0.0000000e+00,  0.0000000e+00],
       [ 0.0000000e+00,  0.0000000e+00,  0.0000000e+00, ...,
         0.0000000e+00,  0.0000000e+00,  0.0000000e+00],
       [ 0.0000000e+00,  0.0000000e+00,  0.0000000e+00, ...,
         0.0000000e+00,  0.0000000e+00,  0.0000000e+00]], dtype=float32)

As the output above shows, the DCT coefficients are signed floating-point values. For quantization, we need to convert them to unsigned integers, so we first take their absolute values.

In [102]:
abs_transformed_frames = [np.abs(transformed_frames[i]).astype(np.uint32) for i in range(len(transformed_frames))]
abs_transformed_frames[1]

array([[934, 121,  94, ...,  11,   8,   7],
       [ 72,   7,   7, ...,   1,   0,   0],
       [ 30,   3,   4, ...,   0,   0,   0],
       ...,
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0]], dtype=uint32)

Now, for quantization, we shift the values right by 4 bits, which discards the 4 least significant bits of every entry (an integer division by 16).

In [103]:
shift_transformed_frames = [abs_transformed_frames[i] >> 4 for i in range(len(transformed_frames))]
shift_transformed_frames[1]

array([[58,  7,  5, ...,  0,  0,  0],
       [ 4,  0,  0, ...,  0,  0,  0],
       [ 1,  0,  0, ...,  0,  0,  0],
       ...,
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0]], dtype=uint32)

After that, we convert the values to signed integers and, with the help of the 'is_negative' list, restore the sign of the values in each frame.

In [104]:
shift_quantized_frames = [shift_transformed_frames[i].astype(np.int32) for i in range(len(transformed_frames))]
shift_quantized_frames[1] # (544, 960)

array([[58,  7,  5, ...,  0,  0,  0],
       [ 4,  0,  0, ...,  0,  0,  0],
       [ 1,  0,  0, ...,  0,  0,  0],
       ...,
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0]])

In [105]:
shift_quantized_frames = [np.where(is_negative[i] == True, -shift_quantized_frames[i], shift_quantized_frames[i]) for i in range(len(transformed_frames))]
shift_quantized_frames[1] # (544, 960)

array([[58, -7, -5, ...,  0,  0,  0],
       [-4,  0,  0, ...,  0,  0,  0],
       [ 1,  0,  0, ...,  0,  0,  0],
       ...,
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0]])
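The sign, absolute-value, shift and sign-restore cells above can be folded into a single step. The sketch below uses a hypothetical helper named `quantize` and a few coefficients copied from the output shown earlier; it also checks that the whole procedure amounts to dividing by 16 and truncating toward zero.

```python
import numpy as np

def quantize(coeffs, shift=4):
    """Drop the `shift` lowest bits of the magnitude and keep the original sign."""
    magnitude = np.abs(coeffs).astype(np.int32) >> shift
    return np.where(coeffs < 0, -magnitude, magnitude)

coeffs = np.array([934.875, -121.47029, -94.812218, 11.281408, -7.0935102], dtype=np.float32)
quantized = quantize(coeffs)
truncated = np.trunc(coeffs / 16).astype(np.int32)   # same result via division

print(quantized)
```

Small coefficients such as 11.28 and -7.09 become zero, which is what produces the long zero runs exploited by the run-length stage.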

Congrats! We've finished the quantization stage. Now let's move on to the Zig-Zag and RLE scans!

### Zig-Zag Scan

In [21]:
def zigzag(input):
    #initializing the variables
    #----------------------------------
    h = 0
    v = 0

    vmin = 0
    hmin = 0

    vmax = input.shape[0]
    hmax = input.shape[1]
    
    #print(vmax ,hmax )

    i = 0

    output = np.zeros(( vmax * hmax), dtype=int)
    #----------------------------------

    while ((v < vmax) and (h < hmax)):
    
        if ((h + v) % 2) == 0:                 # going up
            
            if (v == vmin):
                #print(1)
                output[i] = input[v, h]        # if we got to the first line

                if (h == hmax - 1):            # top-right corner: step down instead of leaving the matrix
                    v = v + 1
                else:
                    h = h + 1                        

                i = i + 1

            elif ((h == hmax -1 ) and (v < vmax)):   # if we got to the last column
                #print(2)
                output[i] = input[v, h] 
                v = v + 1
                i = i + 1

            elif ((v > vmin) and (h < hmax -1 )):    # all other cases
                #print(3)
                output[i] = input[v, h] 
                v = v - 1
                h = h + 1
                i = i + 1

        
        else:                                    # going down

            if ((v == vmax -1) and (h <= hmax -1)):       # if we got to the last line
                #print(4)
                output[i] = input[v, h] 
                h = h + 1
                i = i + 1
        
            elif (h == hmin):                  # if we got to the first column
                #print(5)
                output[i] = input[v, h] 

                if (v == vmax -1):
                    h = h + 1
                else:
                    v = v + 1

                i = i + 1

            elif ((v < vmax -1) and (h > hmin)):     # all other cases
                #print(6)
                output[i] = input[v, h] 
                v = v + 1
                h = h - 1
                i = i + 1




        if ((v == vmax-1) and (h == hmax-1)):          # bottom right element
            #print(7)        	
            output[i] = input[v, h] 
            break

    #print ('v:',v,', h:',h,', i:',i)
    return output


We apply the zigzag() function above to each frame. After that, the frames are ready for the RLE scan.

In [106]:
def zigzag_encode_frames(shift_quantized_frames):
    zigzag_frames = []
    for f in range(len(shift_quantized_frames)):
        current_frame = shift_quantized_frames[f]
        zigzag_output = zigzag(current_frame)
        zigzag_frames.append(zigzag_output)
    return zigzag_frames

zigzag_frames = zigzag_encode_frames(shift_quantized_frames)
print(zigzag_frames[1])

[58 -7 -4 ...  0  0  0]


In [107]:
# ---- TODO: Accelerate Zig-Zag scan with numpy like below:
#     zigzag_output = np.concatenate([np.diagonal(current_frame[::-1,:], k)[::(2*(k % 2)-1)] for k in range(1-current_frame.shape[0], current_frame.shape[0])])
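The one-liner in the TODO derives its diagonal range from `shape[0]` only, so it appears to be written with square input in mind. As an alternative sketch for rectangular frames, the scan order can be obtained by sorting all positions by anti-diagonal and then by row. The helper names `zigzag_order`, `zigzag_fast` and `inverse_zigzag_fast` are hypothetical, and the order is intended to match zigzag(): even anti-diagonals run upward, odd ones downward.

```python
import numpy as np

def zigzag_order(rows, cols):
    """Row and column indices of a rows x cols matrix in zig-zag scan order."""
    v, h = np.indices((rows, cols))
    v, h = v.ravel(), h.ravel()
    diagonal = v + h
    # Even diagonals are walked bottom-left to top-right (row decreasing),
    # odd diagonals top-right to bottom-left (row increasing).
    within = np.where(diagonal % 2 == 0, -v, v)
    order = np.lexsort((within, diagonal))   # the last key is the primary sort key
    return v[order], h[order]

def zigzag_fast(frame):
    v, h = zigzag_order(*frame.shape)
    return frame[v, h]

def inverse_zigzag_fast(sequence, rows, cols):
    v, h = zigzag_order(rows, cols)
    output = np.zeros((rows, cols), dtype=np.asarray(sequence).dtype)
    output[v, h] = sequence
    return output

square = np.arange(9).reshape(3, 3)
print(zigzag_fast(square))
```

The same index arrays serve both directions, so the forward and inverse scans cannot drift apart.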

### Run-length Scan

In [23]:
def rle_encode(in_list):
    
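    # An empty input has no runs. len() is used because `not in_list`
    # is ambiguous for NumPy arrays.
    if len(in_list) == 0:
        return []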

    # Init output list so that first element reflect first input item.
    out_list = [(in_list[0], 1)]

    # Then process all other items in sequence.
    for item in in_list[1:]:
        
        # If same as last, up count, otherwise new element with count 1.
        if item == out_list[-1][0]:
            out_list[-1] = (item, out_list[-1][1] + 1)
        else:
            out_list.append((item, 1))

    return out_list

We apply the rle_encode() function above to each frame. After that, the frames are encoded!

In [107]:
def rle_encode_frames(zigzag_frames):
    rle_frames = []
    for f in range(len(zigzag_frames)):
        current_frame = zigzag_frames[f]
        output = rle_encode(current_frame)
        rle_frames.append(output)
    return rle_frames
rle_frames = rle_encode_frames(zigzag_frames)
print(rle_frames[1])

[(58, 1), (-7, 1), (-4, 1), (1, 1), (0, 1), (-5, 1), (-4, 1), (0, 3), (-1, 1), (0, 3), (-2, 1), (0, 21), (65, 1), (0, 7), (64, 1), (0, 1), (-4, 1), (0, 6), (-8, 1), (0, 3), (-6, 1), (0, 5), (1, 1), (0, 10), (-4, 1), (0, 7), (-2, 1), (0, 3), (-1, 1), (0, 26), (1, 1), (0, 13), (-1, 1), (0, 8), (65, 1), (0, 7), (72, 1), (0, 7), (65, 1), (0, 1), (-5, 1), (0, 14), (-8, 1), (0, 3), (-6, 1), (0, 13), (1, 1), (0, 5), (1, 1), (0, 12), (-4, 1), (0, 7), (-2, 1), (0, 11), (-1, 1), (0, 20), (-1, 1), (0, 35), (-1, 1), (0, 16), (65, 1), (0, 7), (72, 1), (0, 7), (72, 1), (0, 7), (65, 1), (0, 1), (-5, 1), (0, 22), (-8, 1), (0, 3), (-6, 1), (0, 21), (2, 1), (0, 26), (-4, 1), (0, 7), (-3, 1), (0, 19), (-1, 1), (0, 28), (-1, 1), (0, 29), (1, 1), (0, 13), (-1, 1), (0, 24), (66, 1), (0, 7), (72, 1), (0, 7), (72, 1), (0, 7), (72, 1), (0, 7), (65, 1), (0, 1), (-5, 1), (0, 30), (-9, 1), (0, 3), (-6, 1), (0, 29), (2, 1), (0, 34), (-4, 1), (0, 7), (-3, 1), (0, 27), (-1, 1), (0, 74), (1, 1), (0, 13), (-1, 1), (0,
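A quick way to sanity-check a run-length encoder is to round-trip a short list. The sketch below is an alternative formulation based on `itertools.groupby`, not the notebook's own function; the names `rle_encode_groupby` and `rle_expand` are hypothetical, and the sample values are the first entries of the zig-zag output above.

```python
from itertools import groupby

def rle_encode_groupby(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]

def rle_expand(pairs):
    """Inverse of the encoder: repeat each value run_length times."""
    return [value for value, length in pairs for _ in range(length)]

sample = [58, -7, -4, 1, 0, -5, -4, 0, 0, 0, -1]
pairs = rle_encode_groupby(sample)
print(pairs)
```

Expanding the pairs again should give back the original list, and an empty input simply yields an empty list of pairs.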

In [108]:
freq_dict = {'0': 0, '1': 0, '2': 0, '3': 0, '4': 0, '5': 0, '6': 0, '7': 0, '8': 0, '9': 0, ' ': 0, '-': 0, '*': 0, 'f': 0,}
def calc_char_freq(rle_frame):
    for run_length in rle_frame:
        run, length = run_length
        run_str = str(run)
        for char in run_str:
            freq_dict[char] += 1
        freq_dict[' '] += 1
        length_str = str(length)
        for char in length_str:
            freq_dict[char] += 1
        freq_dict['*'] += 1
    freq_dict['f'] += 1

In [109]:
for f in range(len(rle_frames)):
    current_rle_frame = rle_frames[f]
    calc_char_freq(current_rle_frame)

In [110]:
freq_dict

{'0': 1002249,
 '1': 1612174,
 '2': 248535,
 '3': 209039,
 '4': 160007,
 '5': 206129,
 '6': 328899,
 '7': 744309,
 '8': 114540,
 '9': 101160,
 ' ': 1969091,
 '-': 315859,
 '*': 1969091,
 'f': 50}

In [111]:
class node:
    def __init__(self, freq, symbol, left=None, right=None):
    
        self.freq = freq
        self.symbol = symbol
        self.left = left
        self.right = right
        self.huff = ''

In [135]:
def printNodes(node, val=''):
   
    newVal = val + str(node.huff)
 
    if(node.left):
        printNodes(node.left, newVal)
    if(node.right):
        printNodes(node.right, newVal)
 
    if(not node.left and not node.right):
        print(f"{node.symbol} -> {newVal}")

In [136]:
nodes = []
chars = list(freq_dict.keys())
freq = list(freq_dict.values()) 

for x in range(len(chars)):
    nodes.append(node(freq[x], chars[x]))
    
while len(nodes) > 1:
    
    nodes = sorted(nodes, key=lambda x: x.freq)
 
    
    left = nodes[0]
    right = nodes[1]
 
    left.huff = 0
    right.huff = 1
 
    newNode = node(left.freq+right.freq, left.symbol+right.symbol, left, right)
 
    nodes.remove(left)
    nodes.remove(right)
    nodes.append(newNode)
    
printNodes(nodes[0])

  -> 00
* -> 01
3 -> 10000
f -> 1000100
9 -> 1000101
8 -> 100011
2 -> 10010
- -> 10011
0 -> 101
6 -> 11000
4 -> 110010
5 -> 110011
7 -> 1101
1 -> 111
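The code table printed above is not applied to the data anywhere in this notebook. As a rough sketch of how such a table could be built and used, the example below constructs a Huffman code with `heapq` from the frequencies shown earlier and round-trips a short string. The function names are hypothetical, and the exact bit patterns may differ from the table above because ties and the 0/1 assignment are arbitrary.

```python
import heapq
from itertools import count

def huffman_codes(frequencies):
    """Return a {symbol: bit string} prefix code built from symbol frequencies."""
    tie_breaker = count()   # keeps heap entries comparable when frequencies are equal
    heap = [(freq, next(tie_breaker), {symbol: ""}) for symbol, freq in frequencies.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {symbol: "0" for symbol in frequencies}   # a lone symbol still needs one bit
    while len(heap) > 1:
        freq_a, _, codes_a = heapq.heappop(heap)          # two least frequent subtrees
        freq_b, _, codes_b = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes_a.items()}
        merged.update({s: "1" + c for s, c in codes_b.items()})
        heapq.heappush(heap, (freq_a + freq_b, next(tie_breaker), merged))
    return heap[0][2]

def huffman_encode(text, codes):
    return "".join(codes[ch] for ch in text)

def huffman_decode(bits, codes):
    inverse = {code: symbol for symbol, code in codes.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:          # prefix-free, so the first match is the symbol
            decoded.append(inverse[current])
            current = ""
    return "".join(decoded)

freq_dict = {'0': 1002249, '1': 1612174, '2': 248535, '3': 209039, '4': 160007,
             '5': 206129, '6': 328899, '7': 744309, '8': 114540, '9': 101160,
             ' ': 1969091, '-': 315859, '*': 1969091, 'f': 50}
codes = huffman_codes(freq_dict)
text = "58 1*-7 1*0 21*f"
bits = huffman_encode(text, codes)
print(len(bits), "bits for", len(text), "characters")
```

Frequent symbols such as the separators receive the shortest codes, while the rare frame marker 'f' receives one of the longest.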


In [126]:
# Walk down the left branch of the Huffman tree until a leaf is reached.
current = nodes[0]
while current.left:
    current = current.left
current

<__main__.node at 0x2780f8ce610>

Now we can store coded frames in a file for future needs.

In [93]:
# Open with "w" so that re-running the cell does not append a second copy of the frames.
with open("motion_compensated_frames.txt", "w") as out_file:
    for current_frame in rle_frames:
        for run, length in current_frame:
            out_file.write(f"{run} {length}*")   # one run: "<value> <length>*"
        out_file.write('f')                      # end-of-frame marker

In [81]:
# Open with "w" so that re-running the cell does not append a second copy of the frames.
with open("coded_frames.txt", "w") as coded_frames:
    for current_frame in rle_frames:
        for run, length in current_frame:
            coded_frames.write(f"{run} {length}*")   # one run: "<value> <length>*"
        coded_frames.write('f')                      # end-of-frame marker

## Decoder

In [73]:
with open("coded_frames.txt", "r") as coded_frames_file:
    content = coded_frames_file.read()
# Every frame ends with 'f', so the last element of the split is empty and is dropped.
raw_frames = content.split('f')[:-1]

In [77]:
def parse(raw_frame):
    rle_frame = []
    run_length_list = raw_frame.split('*')
    for i in range(len(run_length_list) - 1):
        run_length = run_length_list[i]
        run_length_pair = run_length.split(' ')
        run = run_length_pair[0]
        length =  run_length_pair[1]
        rle_frame.append((int(run), int(length)))

    return rle_frame

In [80]:
def parse_raw_frames(raw_frames):
    rle_frames = []
    for f in range(len(raw_frames)):
        current_raw_frame = raw_frames[f]
        rle_frame = parse(current_raw_frame)
        # rle_frame = np.array(rle_frame)
        rle_frames.append(rle_frame)
    return rle_frames

In [81]:
rle_frames = parse_raw_frames(raw_frames)
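The text format used for the file can be checked without touching the file system. The sketch below mirrors the writing and parsing cells with two hypothetical helpers, `serialize` and `deserialize`, operating on plain strings: each run is written as `value length*` and each frame is closed with `f`.

```python
def serialize(rle_frames):
    """Join all frames into one string: "<value> <length>*" per run, "f" per frame."""
    return "".join(
        "".join(f"{run} {length}*" for run, length in frame) + "f"
        for frame in rle_frames
    )

def deserialize(text):
    """Parse the string back into a list of frames of (value, length) pairs."""
    frames = []
    for raw_frame in text.split("f")[:-1]:        # the piece after the final 'f' is empty
        pairs = []
        for raw_pair in raw_frame.split("*")[:-1]:  # the piece after the final '*' is empty
            run, length = raw_pair.split(" ")
            pairs.append((int(run), int(length)))
        frames.append(pairs)
    return frames

frames = [[(58, 1), (-7, 1), (0, 3)], [(0, 5)]]
text = serialize(frames)
print(text)
```

Parsing the serialized string should reproduce the original list of frames exactly.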

For the decoding step, we apply the encoding steps in reverse, so we start with the inverse RLE scan.

### Inverse Run-length Scan

In [82]:
def rle_decode(in_list):
    out_list = []
    for i in range(len(in_list)):
        value, length = in_list[i]
        for j in range(length):
            out_list.append(value)
        
    return out_list

We apply the rle_decode() function above to each frame. After that, the frames are ready for the inverse Zig-Zag scan.

In [83]:
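inverse_zigzag_frames = []
for f in range(len(rle_frames)):
    current_frame = rle_frames[f]
    # Named zigzag_sequence so that the zigzag() function defined earlier is not overwritten.
    zigzag_sequence = rle_decode(current_frame)
    inverse_zigzag_frames.append(zigzag_sequence)
inverse_zigzag_frames[0]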

[61, -8, -4, -1, 0, -6, -4, 0, 0, 0, 1, 0, 0, 0, -2,
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 65, 0, 0, 0, 0, 0, 0, 0, 67, 0, -5,
 ...

### Inverse Zig-Zag Scan

In [84]:
def inverse_zigzag(input, vmax, hmax):

    #print input.shape

    # initializing the variables
    #----------------------------------
    h = 0
    v = 0

    vmin = 0
    hmin = 0

    output = np.zeros((vmax, hmax), dtype=int)

    i = 0
    #----------------------------------

    while ((v < vmax) and (h < hmax)): 
        #print ('v:',v,', h:',h,', i:',i)   	
        if ((h + v) % 2) == 0:                 # going up
            
            if (v == vmin):
                #print(1)

                output[v, h] = input[i]        # if we got to the first line

                if (h == hmax - 1):            # top-right corner: step down instead of leaving the matrix
                    v = v + 1
                else:
                    h = h + 1                        

                i = i + 1

            elif ((h == hmax -1 ) and (v < vmax)):   # if we got to the last column
                #print(2)
                output[v, h] = input[i] 
                v = v + 1
                i = i + 1

            elif ((v > vmin) and (h < hmax -1 )):    # all other cases
                #print(3)
                output[v, h] = input[i] 
                v = v - 1
                h = h + 1
                i = i + 1

        
        else:                                    # going down

            if ((v == vmax -1) and (h <= hmax -1)):       # if we got to the last line
                #print(4)
                output[v, h] = input[i] 
                h = h + 1
                i = i + 1
        
            elif (h == hmin):                  # if we got to the first column
                #print(5)
                output[v, h] = input[i] 
                if (v == vmax -1):
                    h = h + 1
                else:
                    v = v + 1
                i = i + 1

            elif((v < vmax -1) and (h > hmin)):     # all other cases
                output[v, h] = input[i] 
                v = v + 1
                h = h - 1
                i = i + 1




        if ((v == vmax-1) and (h == hmax-1)):          # bottom right element
            #print(7)        	
            output[v, h] = input[i] 
            break


    return output

We apply the inverse_zigzag() function above to each frame. After that, the frames are ready for inverse quantization.

In [85]:
inverse_quantized_frames = []
vmax = gray_frames[0].shape[0]
hmax = gray_frames[0].shape[1]
for f in range(len(inverse_zigzag_frames)):
    # Named zigzag_sequence so that the zigzag() function defined earlier is not overwritten.
    zigzag_sequence = inverse_zigzag_frames[f]
    inverse_quantized_frame = inverse_zigzag(zigzag_sequence, vmax, hmax)
    inverse_quantized_frames.append(inverse_quantized_frame)

In [86]:
inverse_quantized_frames[0]

array([[61, -8, -6, ...,  0,  0,  0],
       [-4,  0,  0, ...,  0,  0,  0],
       [-1,  0,  0, ...,  0,  0,  0],
       ...,
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0]])

### Inverse Quantization

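For inverse quantization, we first take the absolute values of each frame and convert them to unsigned integers. Then we shift the values left by 4 bits. Finally, we reuse the 'is_negative' list to restore the sign of the values. The bits discarded during quantization are not recovered, so every coefficient comes back as a multiple of 16.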

In [87]:
abs_quantized_frames = [np.abs(inverse_quantized_frames[i]) for i in range(len(inverse_quantized_frames))]
abs_quantized_frames[0]

array([[61,  8,  6, ...,  0,  0,  0],
       [ 4,  0,  0, ...,  0,  0,  0],
       [ 1,  0,  0, ...,  0,  0,  0],
       ...,
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0]])

In [88]:
abs_quantized_frames = [abs_quantized_frames[i].astype(np.uint32) for i in range(len(transformed_frames))]
abs_quantized_frames[0]

array([[61,  8,  6, ...,  0,  0,  0],
       [ 4,  0,  0, ...,  0,  0,  0],
       [ 1,  0,  0, ...,  0,  0,  0],
       ...,
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  0]], dtype=uint32)

In [89]:
shift_transformed_frames = [abs_quantized_frames[i] << 4 for i in range(len(transformed_frames))]
shift_transformed_frames[0]

array([[976, 128,  96, ...,   0,   0,   0],
       [ 64,   0,   0, ...,   0,   0,   0],
       [ 16,   0,   0, ...,   0,   0,   0],
       ...,
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0]], dtype=uint32)

In [90]:
transformed_frames = [shift_transformed_frames[i].astype(np.float32) for i in range(len(transformed_frames))]
transformed_frames[0]

array([[976., 128.,  96., ...,   0.,   0.,   0.],
       [ 64.,   0.,   0., ...,   0.,   0.,   0.],
       [ 16.,   0.,   0., ...,   0.,   0.,   0.],
       ...,
       [  0.,   0.,   0., ...,   0.,   0.,   0.],
       [  0.,   0.,   0., ...,   0.,   0.,   0.],
       [  0.,   0.,   0., ...,   0.,   0.,   0.]], dtype=float32)

In [91]:
transformed_frames = [np.where(is_negative[i] == True, -transformed_frames[i], transformed_frames[i]) for i in range(len(transformed_frames))]
transformed_frames[0]

array([[ 976., -128.,  -96., ...,    0.,    0.,   -0.],
       [ -64.,    0.,    0., ...,   -0.,    0.,   -0.],
       [ -16.,    0.,    0., ...,    0.,   -0.,    0.],
       ...,
       [  -0.,   -0.,   -0., ...,   -0.,   -0.,    0.],
       [  -0.,    0.,   -0., ...,   -0.,   -0.,    0.],
       [   0.,   -0.,   -0., ...,    0.,    0.,   -0.]], dtype=float32)
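The 'is_negative' list was computed on the encoder side, so a decoder that only has the coded file would not have it. The decoded coefficients already carry their sign, which suggests a self-contained variant. A minimal sketch, with a hypothetical helper named `dequantize` and values taken from the first decoded frame above:

```python
import numpy as np

def dequantize(quantized, shift=4):
    """Scale magnitudes back up by 2**shift, taking the sign from the data itself."""
    magnitude = np.abs(quantized).astype(np.int32) << shift
    return (np.sign(quantized) * magnitude).astype(np.float32)

quantized = np.array([61, -8, -6, 0, -4])
restored = dequantize(quantized)
print(restored)
```

Because the sign comes from the quantized value, coefficients that were quantized to zero stay at plain 0 instead of turning into -0.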

### IDCT

This step mirrors the DCT in the encoding step; the only difference is that we use the 'idct' function instead of 'dct'. The rest of the process is the same.

In [92]:
inverse_transformed_frames = []
print('First transformed frame: ')
print(transformed_frames[0])
print()

for f in range(num_frames):
    current_frame = transformed_frames[f]
    current_frame_inverse_transformed = np.empty_like(current_frame, dtype=np.float32)
    frame_height, frame_width = current_frame.shape
    
    for i in range(0, frame_height, block_height):
        for j in range(0, frame_width, block_width):
            current_block = np.array(current_frame[i: i + block_height, j: j + block_width], dtype=np.float32)
            inverse_transformed_current_block = cv2.idct(current_block)
            current_frame_inverse_transformed[i: i + block_height, j: j + block_width] = inverse_transformed_current_block
    inverse_transformed_frames.append(current_frame_inverse_transformed)

print('First inverse transformed frame: ')
print(inverse_transformed_frames[0])
print()

print('First gray frame: ')
print(gray_frames[0])
print()

First transformed frame: 
[[ 976. -128.  -96. ...    0.    0.   -0.]
 [ -64.    0.    0. ...   -0.    0.   -0.]
 [ -16.    0.    0. ...    0.   -0.    0.]
 ...
 [  -0.   -0.   -0. ...   -0.   -0.    0.]
 [  -0.    0.   -0. ...   -0.   -0.    0.]
 [   0.   -0.   -0. ...    0.    0.   -0.]]

First inverse transformed frame: 
[[ 59.012154  91.18938  119.3101   ... 135.20482  107.98327   70.591805]
 [ 58.23221   90.40943  118.53015  ... 138.0788   110.85723   73.46577 ]
 [ 63.518433  95.695656 123.81638  ... 141.60666  114.38511   76.993645]
 ...
 [ 58.905304  92.613266 122.89878  ... 124.550896 105.703835  74.46008 ]
 [ 53.619076  87.32704  117.61255  ... 119.264656 100.41761   69.17386 ]
 [ 54.399025  88.10699  118.392494 ... 120.04461  101.197556  69.953804]]

First gray frame: 
[[ 59  96 118 ... 141 108  65]
 [ 59  96 118 ... 141 108  65]
 [ 59  96 118 ... 141 108  65]
 ...
 [ 53  93 123 ... 124 102  65]
 [ 53  93 123 ... 124 102  65]
 [ 53  93 123 ... 124 102  65]]



In [93]:
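# Clip to the valid 8-bit range first; a bare cast to uint8 would wrap values outside [0, 255].
inverse_transformed_frames = [np.clip(inverse_transformed_frames[i], 0, 255).astype(np.uint8) for i in range(len(inverse_transformed_frames))]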
inverse_transformed_frames[0] # (544, 960)

array([[ 59,  91, 119, ..., 135, 107,  70],
       [ 58,  90, 118, ..., 138, 110,  73],
       [ 63,  95, 123, ..., 141, 114,  76],
       ...,
       [ 58,  92, 122, ..., 124, 105,  74],
       [ 53,  87, 117, ..., 119, 100,  69],
       [ 54,  88, 118, ..., 120, 101,  69]], dtype=uint8)

Congrats again! We completed Decoding successfully!

### Save Video

Now we have the decoded frames, and we can use them to make a video.

In [94]:
color_frames = []    
for f in range(num_frames):
    frame = inverse_transformed_frames[f]
    color = cv2.cvtColor(frame, cv2.COLOR_GRAY2BGR)
    color_frames.append(color)


In [97]:
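# Take the frame size from the data: VideoWriter expects (width, height), and it must match the frames passed to write().
frame_height, frame_width = inverse_transformed_frames[0].shape
out = cv2.VideoWriter('decoder_output.avi', cv2.VideoWriter_fourcc(*'DIVX'), 30, (frame_width, frame_height))
for i in range(len(color_frames)):
    out.write(color_frames[i])
out.release()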
